Solved

Is there an API to find a file by leaf name given a folder ID?

  • September 1, 2023
  • 4 replies
  • 387 views

ncw
  • New Participant
  • 2 replies

I have a folder ID and I have a leaf name. What I want is the full file details.


At the moment I list the folder and look for the leaf name. This works but for large folders is very slow and requires lots of API calls.


Is there an API which takes as input a folder ID and a leaf name and returns a file info just for that file?


The best I’ve seen is get-search but that has several problems: it does a recursive search when given folder IDs, which I don’t want, and I want an exact match for the file name, not a fuzzy one (it’s still fuzzy if you use “double quotes”).


Any ideas?


BTW this is for rclone!


4 replies

smartoneinok Box
  • Senior Developer Advocate
  • 181 replies
  • September 5, 2023

Hello! 👋


I’m not familiar with rclone… but I’m assuming you have seen this?


Also, I’m not sure what you mean by leaf? Do you mean the file name?


Thanks,

Alex, Box Developer Advocate 🥑


ncw
  • Author
  • New Participant
  • 2 replies
  • September 6, 2023

Sorry I should have made myself clearer! I’m the lead developer of rclone and I wrote the box integration.


What I want is an API method to get a file by name given a folder ID. The current way to do this is to list the folder which takes a long time if the folder has 1,000s of entries.


rbarbosa Box
  • Developer Advocate
  • 553 replies
  • Answer
  • September 8, 2023

Hi @ncw


What a pleasure to get the chance to support rclone.


With our end of life for WebDAV in April 2023, and our Box Drive app only being available for Windows and Mac, rclone becomes the only option for Linux users.


To your point, it all depends on the use case, mainly why you want to perform that check and what you are going to do afterwards.


Nevertheless, here is an option that comes to mind; hopefully it will be applicable.


Assuming you have the folder ID, the file name, and optionally its size, you can perform an HTTP OPTIONS request to the /files/content endpoint.


We call this the pre-flight check.


In the body you pass:


{
  "name": "File.mp4",
  "size": 1024,
  "parent": {
    "id": "123"
  }
}


If this returns a 200, the file does not exist.

If you get a 409 Conflict, that can mean a couple of things, and the rest of the error message tells you which:


  • The file already exists

  • You exceeded your quota


So for example with a file that exists:


curl --location --request OPTIONS 'https://api.box.com/2.0/files/content' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer 1!M7xj...2QN' \
--data '{
  "name": "test_upload.txt",
  "size": 7409,
  "parent": {
    "id": "198948099055"
  }
}'


Results in:


{
    "type": "error",
    "status": 409,
    "code": "item_name_in_use",
    "context_info": {
        "conflicts": {
            "type": "file",
            "id": "1164548866790",
            "file_version": {
                "type": "file_version",
                "id": "1268299299590",
                "sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709"
            },
            "sequence_id": "0",
            "etag": "0",
            "sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
            "name": "test_upload.txt"
        }
    },
    "help_url": "http://developers.box.com/docs/#errors",
    "message": "Item with the same name already exists",
    "request_id": "8kp70chhplprfrjr"
}


From here you can decide to upload a new file version, or abort the operation entirely.
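
For a client that wants the file ID rather than an upload decision, the same check can be sketched in Python. This is a minimal sketch using only the standard library, not an official client: the function names are made up for illustration, the size field is left out because it is optional, and treating any other 409 (for example a quota error, or a conflict with a non-file item) as an error is an assumption.

```python
import json
import urllib.error
import urllib.request

PREFLIGHT_URL = "https://api.box.com/2.0/files/content"


def build_preflight_request(token, folder_id, name):
    """Build the OPTIONS pre-flight request (hypothetical helper; size omitted)."""
    body = json.dumps({"name": name, "parent": {"id": folder_id}}).encode("utf-8")
    return urllib.request.Request(
        PREFLIGHT_URL,
        data=body,
        method="OPTIONS",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def conflicting_file_id(status, payload):
    """Return the existing file's ID, or None if the name is free (200)."""
    if status == 200:
        return None
    if status == 409 and payload.get("code") == "item_name_in_use":
        conflict = payload.get("context_info", {}).get("conflicts", {})
        if conflict.get("type") == "file":
            return conflict.get("id")
    # Assumption: anything else (quota, non-file conflict, auth) is an error here.
    raise RuntimeError(f"unexpected pre-flight response: {status} {payload.get('code')}")


def find_file_id(token, folder_id, name):
    """Run the pre-flight check over the network; needs a valid access token."""
    request = build_preflight_request(token, folder_id, name)
    try:
        with urllib.request.urlopen(request) as response:
            # A success body is not needed to answer "does the file exist?".
            return conflicting_file_id(response.status, {})
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx; the error object still carries the JSON body.
        return conflicting_file_id(err.code, json.load(err))
```

Calling find_file_id(token, "198948099055", "test_upload.txt") against the example above should return "1164548866790", the id inside context_info.conflicts.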


You can learn more on the preflight check:



A quick note on the search option.

The major downside of the search is that it can take minutes to index a recently created file.

The “exact” search query is not really that exact but helps a lot with the fuzzy search.

However you can restrict the search a lot further:



  • Limit the search to a specific folder only [ancestor_folder_ids]

  • Limit the search to look only in file name [content_types]

    (by default it searches the name, description, tags, comments and the first 10 KB of the file)

  • Limit the object returned to a specific type like only files [type]

  • Limit by file extension [file_extensions]

  • Limit the amount of information returned [fields]

    (this should improve performance, e.g. id,type,name,size)


Always check if the returned file name is in fact the same.
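
As a rough illustration of combining these restrictions with the final name check, here is a minimal Python sketch. The helper names are hypothetical, and it assumes the search response carries its results in an "entries" list and that "parent" can be requested through fields; filtering on the parent ID is added here to discard hits from subfolders, since the folder restriction is recursive.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.box.com/2.0/search"


def build_search_url(name, folder_id):
    """Build a search URL restricted as tightly as the parameters allow."""
    params = {
        "query": f'"{name}"',               # quoted: closer to exact, still fuzzy
        "ancestor_folder_ids": folder_id,   # this folder (and its subfolders)
        "content_types": "name",            # match on the name only
        "type": "file",                     # files only
        "fields": "id,type,name,size,parent",
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def exact_matches(entries, name, folder_id):
    """Keep only entries whose name matches exactly and that sit directly in the folder."""
    return [
        entry
        for entry in entries
        if entry.get("name") == name
        and (entry.get("parent") or {}).get("id") == folder_id
    ]
```

The filter is the important half: the search only narrows the candidates, and the client still has to confirm the name itself.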


For more information check out:



Let us know if this helps with your use case.


Cheers


ncw
  • Author
  • New Participant
  • 2 replies
  • September 9, 2023

Thank you very much for your informative answer.


The preflight upload check is exactly what I need. And in fact rclone already uses it before uploads, but I hadn’t twigged that it is a useful method for finding out file IDs given the file names.


The preflight check seems to give back a File Mini. It would be perfect if it could give back the full File, which is what I need, but that is only one API call away, and I think looking up a File from an ID is probably very quick and easy for the API servers.
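
That second call could look roughly like the following Python sketch. It assumes the standard GET /files/{id} endpoint with its optional fields query parameter; the helper name and the example field list are illustrative only, not what rclone actually does.

```python
import urllib.request
from urllib.parse import urlencode


def build_file_info_request(token, file_id, fields=None):
    """Build a GET request for the full File object, given an ID from the pre-flight check."""
    url = f"https://api.box.com/2.0/files/{file_id}"
    if fields:
        # Optional: ask only for the fields the caller actually needs.
        url += "?" + urlencode({"fields": ",".join(fields)})
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the request to urllib.request.urlopen and decoding the JSON body would then give the full file details.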


I did look at the search API too, but I think the fact that it runs behind real time will make it unsuitable for rclone’s needs.


So I think this is going to work out, thank you 🙂 I’ve sent a copy to the users for testing so we can see how they get on. This discussion is on the rclone forum if you are interested.


I think it might be worth calling out more clearly in the docs that the upload preflight is useful for turning (fileName, directoryID) into fileID. Even though I knew about it and had used it, I still didn’t find it in the docs when I was looking 🙂


Thank you for your help.

