
I have a folder ID and I have a leaf name. What I want is the full file details.



At the moment I list the folder and look for the leaf name. This works but for large folders is very slow and requires lots of API calls.



Is there an API which takes as input a folder ID and a leaf name and returns a file info just for that file?



The best I’ve seen is get-search, but that has several problems: it does a recursive search when given folder IDs, which I don’t want, and I want an exact match for the file name, not a fuzzy one (it’s still fuzzy even if you use “double quotes”).



Any ideas?



BTW this is for rclone!

Hello! 👋



I’m not familiar with rclone… but I’m assuming you have seen this?



Also, I’m not sure what you mean by leaf? Do you mean the file name?



Thanks,


Alex, Box Developer Advocate 🥑


Sorry I should have made myself clearer! I’m the lead developer of rclone and I wrote the box integration.



What I want is an API method to get a file by name given a folder ID. The current way to do this is to list the folder which takes a long time if the folder has 1,000s of entries.


Hi @ncw



What a pleasure to get the chance to support rclone.



With our end of life for WebDAV in April 2023, and our Box Drive app only being available for Windows and Mac, rclone becomes the only option for Linux users.



To your point, it all depends on the use case, mainly why you want to perform that check and what you are going to do afterwards.



Nevertheless, here is an option that comes to mind; hopefully it will be applicable.



Assuming you have the folder ID, the file name, and optionally its size, you can perform an HTTP OPTIONS request to the /files/content endpoint.



We call this the pre-flight check.



In the body you pass:



{
  "name": "File.mp4",
  "size": 1024,
  "parent": {
    "id": "123"
  }
}



If this returns a 200, it means the file does not exist.


If you get a 409 conflict, that can mean a couple of things, which will be spelled out in the rest of the error message:





  • The file already exists


  • You exceeded your quota
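
As a rough sketch of the request side in Python (standard library only), the helper below builds the pre-flight OPTIONS request without sending it. The function name `build_preflight_request` and the `"TOKEN"` placeholder are made up for illustration; the URL and body fields are the ones described above.

```python
import json
import urllib.request

PREFLIGHT_URL = "https://api.box.com/2.0/files/content"


def build_preflight_request(token, folder_id, name, size=None):
    """Build (but do not send) the pre-flight OPTIONS request.

    Hypothetical helper for illustration; `token` is an access token
    obtained elsewhere.
    """
    body = {"name": name, "parent": {"id": str(folder_id)}}
    if size is not None:
        body["size"] = size  # size is optional in the pre-flight body
    return urllib.request.Request(
        PREFLIGHT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="OPTIONS",
    )


req = build_preflight_request("TOKEN", "123", "File.mp4", size=1024)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Note that if you send this with `urllib.request.urlopen(req)`, the 409 arrives as a `urllib.error.HTTPError`, so the caller has to catch that and read the error body from the exception.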




So for example with a file that exists:



curl --location --request OPTIONS 'https://api.box.com/2.0/files/content' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer 1!M7xj...2QN' \
  --data '{
    "name": "test_upload.txt",
    "size": 7409,
    "parent": {
      "id": "198948099055"
    }
  }'



Results in:



{
  "type": "error",
  "status": 409,
  "code": "item_name_in_use",
  "context_info": {
    "conflicts": {
      "type": "file",
      "id": "1164548866790",
      "file_version": {
        "type": "file_version",
        "id": "1268299299590",
        "sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709"
      },
      "sequence_id": "0",
      "etag": "0",
      "sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
      "name": "test_upload.txt"
    }
  },
  "help_url": "http://developers.box.com/docs/#errors",
  "message": "Item with the same name already exists",
  "request_id": "8kp70chhplprfrjr"
}



From here you can decide to upload a new file version, or abort the operation entirely.
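
If the goal is just to turn (folder ID, file name) into a file ID, the response handling could look something like the sketch below. It assumes the 409 body has the shape shown above; `file_id_from_preflight` is an illustrative name, not part of any SDK.

```python
import json


def file_id_from_preflight(status, body_text):
    """Return the existing file's ID, or None if the name is free.

    Hypothetical helper: `status` is the HTTP status code of the
    pre-flight response and `body_text` is its raw JSON body.
    Raises ValueError for any other outcome (quota, permissions, ...).
    """
    if status == 200:
        return None  # no file with that name in the folder
    if status == 409:
        err = json.loads(body_text)
        if err.get("code") == "item_name_in_use":
            return err["context_info"]["conflicts"]["id"]
    raise ValueError(f"unexpected pre-flight response: HTTP {status}")


# Trimmed copy of the error body from the example above.
sample = json.dumps({
    "type": "error",
    "status": 409,
    "code": "item_name_in_use",
    "context_info": {
        "conflicts": {
            "type": "file",
            "id": "1164548866790",
            "name": "test_upload.txt",
        }
    },
})
print(file_id_from_preflight(409, sample))
```

Any 409 whose `code` is not `item_name_in_use` falls through to the `ValueError`, so the caller can look at the rest of the error message.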



You can learn more on the preflight check:





A quick note on the search option.


The major downside of the search is that it can take minutes to index a recently created file.


The “exact” search query is not really that exact but helps a lot with the fuzzy search.


However you can restrict the search a lot further:





  • Limit the search to a specific folder only [ancestor_folder_ids]


  • Limit the search to look only in the file name [content_types]


    (by default it searches name, description, tags, comments and the first 10 KB of the file)


  • Limit the objects returned to a specific type, like only files [type]


  • Limit by file extension [file_extensions]


  • Limit the amount of information returned [fields]


    (this should improve performance, e.g. id,type,name,size)




Always check if the returned file name is in fact the same.
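
A minimal Python sketch of a restricted search plus the exact-name check might look like this. The helper names are made up, and it assumes the search response carries its results in an `entries` list and that `parent` is requested via `fields` so that matches from subfolders can be discarded.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.box.com/2.0/search"


def build_search_url(name, folder_id):
    """Build a search URL restricted with the parameters listed above."""
    params = {
        "query": f'"{name}"',              # quoted: closer to exact, still fuzzy
        "ancestor_folder_ids": folder_id,  # note: this is recursive
        "content_types": "name",           # match on file names only
        "type": "file",
        "fields": "id,type,name,size,parent",
    }
    return SEARCH_URL + "?" + urlencode(params)


def exact_matches(entries, name, folder_id):
    """Keep only entries whose name matches exactly and that sit
    directly in the given folder (not in a subfolder)."""
    return [
        e for e in entries
        if e.get("name") == name
        and (e.get("parent") or {}).get("id") == folder_id
    ]


# Made-up entries, shaped like the assumed `entries` list.
entries = [
    {"id": "1", "name": "test_upload.txt", "parent": {"id": "198948099055"}},
    {"id": "2", "name": "test_upload.txt.bak", "parent": {"id": "198948099055"}},
    {"id": "3", "name": "test_upload.txt", "parent": {"id": "555"}},
]
url = build_search_url("test_upload.txt", "198948099055")
print(url)
print(exact_matches(entries, "test_upload.txt", "198948099055"))
```

This still inherits the indexing delay mentioned above, so a freshly uploaded file may be missing from the results.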



For more information, check out:





Let us know if this helps with your use case.



Cheers


Thank you very much for your informative answer.



The preflight upload check is exactly what I need. And in fact rclone already uses it before uploads, but I hadn’t twigged that it is a useful method for finding out file IDs given the file names.



The preflight check seems to give back a File Mini. It would be perfect if it could give back the full File, which is what I need, but that is only one API call away, and I think looking up a File from an ID is probably very quick and easy for the API servers.
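
For anyone following along, that second call is a GET on /files/{id}. A rough sketch in Python (not rclone's actual Go code; the helper name and the `"TOKEN"` placeholder are made up) of building that request from the ID the preflight returned:

```python
import urllib.request


def build_file_info_request(token, file_id, fields=None):
    """Build (but do not send) a GET request for the full file object.

    Hypothetical helper; `fields` is an optional list of field names
    used to trim the response.
    """
    url = f"https://api.box.com/2.0/files/{file_id}"
    if fields:
        url += "?fields=" + ",".join(fields)
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


info_req = build_file_info_request("TOKEN", "1164548866790", ["name", "size", "sha1"])
print(info_req.get_method(), info_req.full_url)
```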



I did look at the search API too, but I think the fact that it runs behind real time means it won’t work for rclone’s needs.



So I think this is going to work out - thank you 🙂 I’ve sent a copy to the users for testing so we can see how they get on. This discussion is on the rclone forum if you are interested.



I think it might be worth calling out more clearly in the docs that the upload preflight is useful for turning (fileName, directoryID) into a fileID. Although I knew about it and had used it, I still didn’t find it in the docs when I was looking 🙂



Thank you for your help.

