
I have multiple aprx's in a folder whose data sources I would like to update. I have written the code below in arcpy, but it gets stuck looping on the first aprx and doesn't move on to the next one.

import arcpy, os
ws = arcpy.env.workspace = r"C:\Data"  # folder where multiple aprx's are stored
# list the APRXs of the workspace folder
for root, dirs, files, in os.walk(ws):
    for f in files:
        if f.endswith(".aprx"):
            print("checking " + f)
            folder = f.replace(".aprx", "")
            item = ws + "\\" + folder + "\\" + f
            print(item)
            for path in root:
                aprx = arcpy.mp.ArcGISProject(item)
                aprx.updateConnectionProperties(r'C:\Users.sde',   # old data source
                                                r'C:\Users2.sde')  # new data source
                aprx.saveACopy(item)
                print("saved new version of " + f)
                del aprx
            print("done with " + f)

I assume I am doing something wrong that stops the code moving onto the next aprx, as what happens is it fixes the data source for the first aprx, saves it, then opens it again and fixes it again. Seems to be an infinite loop!

PolyGeo
asked Jun 11, 2024 at 18:58
    Your indentation is off in the code. Sometimes it's better to compile a list from the os.walk, then visit only the objects of interest in a second loop.
    Commented Jun 11, 2024 at 19:44
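
To illustrate the two-pass idea from the comment above, here is a minimal sketch that first collects every .aprx path and only then loops over the result. The helper name find_aprx_files and the throwaway demo folder are assumptions made for illustration; they are not from the original post, and the arcpy step is only shown as a comment.

```python
import os
import tempfile


def find_aprx_files(workspace):
    """Return a sorted list of full paths to every .aprx file under workspace.

    find_aprx_files is a hypothetical helper name used for this sketch.
    """
    matches = []
    for root, _dirs, files in os.walk(workspace):
        for name in files:
            if name.lower().endswith(".aprx"):
                # root is already the full path of the folder that holds name
                matches.append(os.path.join(root, name))
    return sorted(matches)


# Demo on a temporary folder tree (made-up file names)
with tempfile.TemporaryDirectory() as demo_ws:
    os.makedirs(os.path.join(demo_ws, "ProjectA"))
    for parts in (("ProjectA", "ProjectA.aprx"), ("Top.aprx",), ("notes.txt",)):
        open(os.path.join(demo_ws, *parts), "w").close()

    # Second pass: visit only the collected project files
    demo_found = [os.path.relpath(p, demo_ws) for p in find_aprx_files(demo_ws)]
    for rel_path in demo_found:
        print("would process", rel_path)
        # In the real script this is where arcpy.mp.ArcGISProject(...) would be opened
```

Because the list is built before any project is opened or saved, files written during the second loop cannot affect which projects get visited.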

2 Answers


You don't need the for path in root: loop. This is what causes your infinite loop.

You also don't create the directory that you try to save to, which I think would cause problems.

I created some code below that does what you need it to do - except for the updateConnectionProperties line which you can add back in (I don't have any .sde's made on this machine).

Your code is good; it just looks like you were having some trouble understanding the initial for loop where you walk the directories. That loop visits each subdirectory looking for aprx files, and on each pass root is the full path of the directory currently being visited, subdirectory included. For this reason, you don't need a for loop over each path in root.

import arcpy, os
ws = arcpy.env.workspace = r"C:\Data"  # folder where multiple aprx's are stored
# list the APRXs of the workspace folder
for root, dirs, files in os.walk(ws):
    for f in files:
        if f.endswith(".aprx"):
            print(os.path.join(root, f))
            folder = f.replace(".aprx", "")
            item = os.path.join(root, f)
            outItem = os.path.join(root, folder, f)
            # create the output folder; ignore the error if it is already there
            try:
                os.mkdir(os.path.join(root, folder))
            except FileExistsError:
                print('Folder exists')
            aprx = arcpy.mp.ArcGISProject(item)
            print(os.path.join(root, folder, f))
            aprx.saveACopy(outItem)
            print("saved new version of " + f)
answered Jun 11, 2024 at 19:47

As @jdavid05 said, the second loop nested inside the file loop is causing the problem. The root yielded by os.walk is a string; os.walk yields (root, dirs, files) tuples like this:

[ ('<root>', ['<dir>', ...], ['<file>', ...]), ... ]

with fake paths:

# root                           dirs                          files
(r'C:\User',                     ['MyPhotos', 'MyDocuments'],  ['invoice.txt', 'meme.jpg']),
(r'C:\User\MyPhotos',            ['Vacation'],                 ['profile.jpg']),
(r'C:\User\MyPhotos\Vacation',   [],                           ['arrival.jpg', 'departure.jpg']),
(r'C:\User\MyDocuments',         ['Memoir'],                   ['bill.pdf']),
(r'C:\User\MyDocuments\Memoir',  [],                           []),

The root of each subsequent tuple is an earlier root joined to one of the names in that root's dirs list. Because root is a string, looping over it visits its characters one at a time, so:

for root, dirs, files in os.walk(r'<path>'):
    for file in files:
        ...
        for path in root:
            print(path)

Would result in:

C
:
\
...

for every aprx file below the top of the walk.
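
A small sketch of why this looks like an endless loop: the body of for path in root runs once per character of the folder path, so the same project would be opened and saved that many times. The folder path below is made up for illustration.

```python
# Hypothetical folder path, shaped like a root value that os.walk would yield
root = r"C:\Data\ProjectA"

runs = 0
for path in root:  # iterates over the characters of the string, not over folders
    runs += 1      # in the question, the project is opened and saved here each time

print(f"loop body ran {runs} times for a path of {len(root)} characters")
```

With a long workspace path and a slow save, that many repeats per project can easily look like the script never moves on.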

The only thing you need to do with os.walk to get every file in every subdirectory of a root directory is:

for root, _, files in os.walk(r'<path>'):
    for file in files:
        if not file.endswith('.<extension>'): continue
        file_path = os.path.join(root, file)
        # do something with file path

The third line there saves an indent level: it advances the loop on an extension mismatch instead of wrapping the rest of the body in an if block.

So knowing how walk works, now we can implement some clean code that uses it properly:

import arcpy, os

def batch_update_connections(workspace: os.PathLike, old_path: os.PathLike, new_path: os.PathLike) -> None:
    for root, _, files in os.walk(workspace):
        for file in files:
            # Skip if not an ArcGIS project file
            if not file.endswith('.aprx'):
                continue

            # Move into the root folder so relative connection paths resolve from there
            os.chdir(root)
            old_path = os.path.abspath(old_path)
            new_path = os.path.abspath(new_path)

            # Create output folder <root>/<file>_updated
            # (splitext keeps file names that contain extra dots intact)
            out_folder = os.path.join(root, f"{os.path.splitext(file)[0]}_updated")
            out_file = os.path.join(out_folder, file)
            os.makedirs(out_folder, exist_ok=True)

            # Skip and warn if output file already exists
            if os.path.exists(out_file):
                print(f"{out_file} in {out_folder} already updated")
                continue

            # Update connections and save to output folder
            aprx = arcpy.mp.ArcGISProject(os.path.join(root, file))
            aprx.updateConnectionProperties(old_path, new_path)
            aprx.saveACopy(out_file)

            # Notify user of completion
            print(f'Updated connections in {file}, saved to {out_folder}.')

def main():
    workspace = r'\path\to\workspace'
    #workspace = os.getcwd() # Use current working directory
    old_path = r'\path\to\old\connection'
    new_path = r'\path\to\new\connection'
    batch_update_connections(workspace, old_path, new_path)

if __name__ == '__main__':
    main()
  1. Define a batch update function that takes a workspace and the connection paths
  2. Walk the workspace ignoring dirs (root, _, files)
  3. Immediately skip any file that does not end with .aprx
  4. Move into the current root to allow for relative paths to be used (../data)
  5. Convert the paths to absolute paths if relative paths were used
  6. Create the output folder path and the output file path
  7. Use os.makedirs to avoid having to write a try: except block
  8. Skip and warn the user if the project has already been updated
  9. Open the project, update the connection, and save it to the output folder
  10. Wrap all the functionality in a main() function and call it in an if __name__ == '__main__' block

The last step makes this a standalone script that can be run directly from the terminal. If a lot of people need to use this and all you want them to do is run it, you can uncomment the os.getcwd() line and have them run it from the root of their project directory. All you need to do is set up the relative or absolute paths for the old and new data sources.
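
Steps 4 and 5 above rely on os.path.abspath resolving a relative path against the current working directory. The sketch below shows that behaviour with the standard library only; the folder names and the old.sde file name are made up, and nothing here touches arcpy.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Hypothetical layout: <base>/projects holds the .aprx, <base>/data holds the .sde
    project_dir = os.path.join(base, "projects")
    os.makedirs(project_dir)

    original_cwd = os.getcwd()
    try:
        os.chdir(project_dir)  # step 4: move into the folder being processed
        # step 5: a relative path now resolves against project_dir
        resolved = os.path.abspath(os.path.join("..", "data", "old.sde"))
    finally:
        os.chdir(original_cwd)  # restore the working directory afterwards

    expected = os.path.join(os.path.realpath(base), "data", "old.sde")
    print(resolved)

# An absolute path passes through abspath unchanged, so only the first
# resolution matters when the same variable is reused across iterations.
still_same = os.path.abspath(resolved) == resolved
```

One consequence worth keeping in mind: once the paths have been made absolute for the first project, later calls leave them as they are, so a relative connection path is resolved from the first project's folder only.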

answered Jun 21, 2024 at 0:52
