Hi @josh.pinkerton
Welcome to the forum!
I’ve asked around to the engineering team and it seems to be a legacy bug.
I can’t say when they will get it fixed.
It does look like you’re using an SDK. Can you pinpoint exactly which one?
Hello @rbarbosa ,
Thank you for getting back to me. I am using the box-java-sdk (https://github.com/box/box-java-sdk), version 4.7.0, which was released in January 2024. The API call is BoxFolder.canUpload, which calls the Preflight Check endpoint (https://developer.box.com/reference/options-files-content/). As far as I know, that is a critical API call for determining whether a file exists, since you need to know whether to call the file re-upload APIs or the regular folder upload APIs.
Our use case is we have a batch job that uploads files to Box but will need to re-upload them again if the job is rerun. Is there a workaround your engineering team suggests for efficiently determining if a file exists already, and to retrieve its ID which is necessary for re-uploading the file?
Thanks!
Hi @josh.pinkerton ,
That is the recommended way. In fact, I use it all the time in the Python SDK to determine whether I need to upload or update a document.
For example:
def file_upload(client: Client, file_path: str, folder: Folder) -> File:
    """upload a file to box"""
    file_size = os.path.getsize(file_path)
    file_name = os.path.basename(file_path)
    file_id = None
    try:
        pre_flight_arg = PreflightFileUploadCheckParent(id=folder.id)
        client.uploads.preflight_file_upload_check(name=file_name, size=file_size, parent=pre_flight_arg)
    except BoxAPIError as err:
        if err.response_info.body.get("code", None) == "item_name_in_use":
            file_id = err.response_info.body["context_info"]["conflicts"]["id"]
        else:
            raise err
    upload_arg = UploadFileAttributes(file_name, UploadFileAttributesParentField(folder.id))
    if file_id is None:
        # upload new file
        files: Files = client.uploads.upload_file(upload_arg, file=open(file_path, "rb"))
        file = files.entries[0]
    else:
        # upload new version
        files: Files = client.uploads.upload_file_version(file_id, upload_arg, file=open(file_path, "rb"))
        file = files.entries[0]
    return file
Please note the above sample is using the new Next Gen SDK, which is not yet available for Java. These Next Gen SDKs are automatically generated from the OpenAPI spec and are much “closer” to the API than the classic ones, meaning less “intelligence” on the SDK side.
But I haven’t run into that particular issue, where the pre-flight check responds differently (string vs array).
I think the pre-flight check can return two types of errors: the file already exists, or you exceeded the storage limit. I haven’t encountered the latter, or both at the same time.
However, it is possible that the error from a pre-flight check differs from the error you get if you always try to upload the file first and then fall back to updating it.
For example, in this method to create a folder, I always try to create the folder first, and if I get an error I read the existing folder id from the conflicts. Notice that in this case conflicts is an array:
def create_box_folder(client: Client, folder_name: str, parent_folder: Folder) -> Folder:
    """create a folder in box"""
    try:
        folder = client.folders.create_folder(folder_name, CreateFolderParent(id=parent_folder.id))
    except BoxAPIError as err:
        if err.response_info.body.get("code", None) == "item_name_in_use":
            folder_id = err.response_info.body["context_info"]["conflicts"][0]["id"]
            folder = client.folders.get_folder_by_id(folder_id)
        else:
            raise err
    logging.info("Folder %s with id: %s", folder.name, folder.id)
    return folder
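Since the two samples above read conflicts in two different shapes (a single object in the file pre-flight case, an array in the folder case), one defensive option is to normalize the error body before reading the id. This is only a rough sketch: conflict_id is a hypothetical helper name of mine, not part of any Box SDK, and it assumes the error body is a plain dict shaped like the ones in the samples.

```python
from typing import Any, Optional


def conflict_id(error_body: dict) -> Optional[str]:
    """Return the id of the first conflicting item, or None if there is none.

    Hypothetical helper, not part of any Box SDK. It assumes the body looks
    like the error payloads used in the samples above.
    """
    if error_body.get("code") != "item_name_in_use":
        return None
    context_info: Any = error_body.get("context_info") or {}
    conflicts: Any = context_info.get("conflicts")
    # Accept both shapes: a single object or a list of objects.
    if isinstance(conflicts, dict):
        conflicts = [conflicts]
    if not isinstance(conflicts, list) or not conflicts:
        return None
    first = conflicts[0]
    return first.get("id") if isinstance(first, dict) else None
```

In either except block above you could then write file_id = conflict_id(err.response_info.body) and re-raise when it comes back as None.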
I’m not versed in Java but hopefully these examples help to illustrate my point.