I am using a Python script to search for files in a specific folder; the folder contains over 25K files. I have an Excel file with the filenames the script needs to search for.
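For context, a minimal sketch of how the filename list and the `Status` column might be set up before the loop. This assumes the first column of the Excel sheet holds the filenames; the in-memory DataFrame below stands in for a real `pd.read_excel("files.xlsx")` call (the filename is a placeholder):

```python
import pandas as pd

# Stand-in for df = pd.read_excel("files.xlsx") -- "files.xlsx" is a
# placeholder; the assumption is that column 0 holds the filenames.
df = pd.DataFrame({"Filename": ["a.pdf", "b.pdf"]})

# The loop below updates this column to 'Downloaded'/'Failed'/'Not Found'.
df["Status"] = ""

# First column, as strings, skipping blank cells.
file_names_to_download = df.iloc[:, 0].dropna().astype(str).tolist()
```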
Part of the script looks like this:
# Start searching and downloading
for file_name in file_names_to_download:
    print(f"Searching for {file_name} ...")
    search_results = client.search().query(query=file_name, limit=10, ancestor_folder_ids=[folder_id])
    exact_matches = [item for item in search_results if item.name == file_name]
    if not exact_matches:
        print(f"No exact match found for {file_name}.")
        df.loc[df.iloc[:, 0] == file_name, 'Status'] = 'Not Found'
        continue
    if len(exact_matches) > 1:
        print(f"More than one exact match found for {file_name}. Using the first match.")
    item_to_download = exact_matches[0]
    print(f"Found {file_name}. Downloading ...")
    found_files.append(item_to_download.name)
    item_download_path = os.path.join(download_path, item_to_download.name)
    try:
        with open(item_download_path, 'wb') as f:
            item_to_download.download_to(f)
        print(f"Download completed for {item_to_download.name}.")
        df.loc[df.iloc[:, 0] == file_name, 'Status'] = 'Downloaded'
    except Exception as e:
        print(f"Failed to download {item_to_download.name}: {e}")
        failed_downloads.append(item_to_download.name)
        df.loc[df.iloc[:, 0] == file_name, 'Status'] = 'Failed'
I have noticed that it takes a very long time for the API to return a result. Is there a way to do a faster search? Is something wrong with my code?
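One likely cause of the slowness is issuing one search API call per filename, which means thousands of round trips. A common alternative (assuming the Box Python SDK, where `client.folder(folder_id).get_items()` returns the folder's contents) is to list the folder once and match names locally. A minimal sketch; the `SimpleNamespace` items at the bottom are stand-ins for real API results:

```python
from types import SimpleNamespace


def build_name_index(items):
    """Map item name -> item for exact-name lookups in O(1) per name."""
    index = {}
    for item in items:
        # Keep the first item seen for a given name, mirroring the
        # "use the first match" behavior of the original loop.
        index.setdefault(item.name, item)
    return index


# In the real script, one paginated listing replaces per-file searches
# (limit=1000 reduces round trips; this call is an assumption based on
# the Box SDK's folder API):
#
#   items = client.folder(folder_id).get_items(limit=1000)
#   name_index = build_name_index(items)
#   for file_name in file_names_to_download:
#       item = name_index.get(file_name)
#       if item is None:
#           ...  # mark 'Not Found'

# Quick local check with stand-in items instead of API objects:
demo = build_name_index([
    SimpleNamespace(name="a.txt"),
    SimpleNamespace(name="a.txt"),  # duplicate name, first one wins
    SimpleNamespace(name="b.txt"),
])
```

The dictionary lookup also replaces the per-name list comprehension, so the whole run costs one folder listing plus 25K dictionary lookups instead of 25K search requests.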