Hello! 👋
I’m not familiar with rclone… but I’m assuming you have seen this?
Also, I’m not sure what you mean by leaf? Do you mean the file name?
Thanks,
Alex, Box Developer Advocate 🥑
Sorry, I should have made myself clearer! I’m the lead developer of rclone and I wrote the Box integration.
What I want is an API method to get a file by name given a folder ID. The current way to do this is to list the folder, which takes a long time if the folder has 1,000s of entries.
Hi @ncw
What a pleasure to get the chance to support rclone.
With our end of life for WebDAV in April 2023, and our Box Drive app only being available for Windows and Mac, rclone becomes the only option for Linux users.
To your point, it all depends on the use case, mainly why you want to perform that check and what you are going to do afterwards.
Nevertheless, here is an option that comes to mind; hopefully it will be applicable.
Assuming you have the folder ID, the file name, and optionally its size, you can perform an HTTP OPTIONS request to the /files/content endpoint.
We call this the pre-flight check.
In the body you pass:
{
"name": "File.mp4",
"size": 1024,
"parent": {
"id": "123"
}
}
If this returns a 200, it means the file does not exist.
If you get a 409 conflict, that can mean a couple of things, and the rest of the error message tells you which:
- The file already exists
- You exceeded your quota
So for example with a file that exists:
curl --location --request OPTIONS 'https://api.box.com/2.0/files/content' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer 1!M7xj...2QN' \
--data '{
"name": "test_upload.txt",
"size": 7409,
"parent": {
"id": "198948099055"
}
}'
Results in:
{
"type": "error",
"status": 409,
"code": "item_name_in_use",
"context_info": {
"conflicts": {
"type": "file",
"id": "1164548866790",
"file_version": {
"type": "file_version",
"id": "1268299299590",
"sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709"
},
"sequence_id": "0",
"etag": "0",
"sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
"name": "test_upload.txt"
}
},
"help_url": "http://developers.box.com/docs/#errors",
"message": "Item with the same name already exists",
"request_id": "8kp70chhplprfrjr"
}
From here you can decide to upload a new file version, or abort the operation entirely.
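As a rough illustration of the check described above, here is a minimal Python sketch that builds the preflight request and interprets the two outcomes shown in this thread. The function names are hypothetical, the size is treated as optional as stated above, and the only error code handled is the item_name_in_use one from the example response; anything else (such as a quota error) is simply raised.

```python
import json
import urllib.request

PREFLIGHT_URL = "https://api.box.com/2.0/files/content"


def build_preflight_request(token, folder_id, name, size=None):
    """Build (but do not send) the OPTIONS preflight request."""
    payload = {"name": name, "parent": {"id": folder_id}}
    if size is not None:
        payload["size"] = size  # optional, per the description above
    return urllib.request.Request(
        PREFLIGHT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="OPTIONS",
    )


def existing_file_id(status, body):
    """Return the ID of the conflicting file, or None if the name is free.

    Raises RuntimeError for any other outcome, e.g. a quota error.
    """
    if status == 200:
        return None
    if status == 409 and body.get("code") == "item_name_in_use":
        return body["context_info"]["conflicts"]["id"]
    raise RuntimeError(
        "unexpected preflight result: %s %s" % (status, body.get("code"))
    )
```

If the request is sent with urllib.request.urlopen, note that a 409 is raised as urllib.error.HTTPError, so the status and JSON body would have to be read from the exception before being passed to existing_file_id.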
You can learn more on the preflight check:
A quick note on the search option.
The major downside of the search is that it can take minutes to index a recently created file.
The “exact” search query is not really that exact but helps a lot with the fuzzy search.
However you can restrict the search a lot further:
- Limit the search to a specific folder only [ancestor_folder_ids]
- Limit the search to look only in the file name [content_types] (by default it searches name, description, tags, comments and the first 10kbytes of the file)
- Limit the objects returned to a specific type, such as only files [type]
- Limit by file extension [file_extensions]
- Limit the amount of information returned [fields] (this should improve performance, e.g. id,type,name,size)
Always check if the returned file name is in fact the same.
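To make the restrictions above concrete, here is a small Python sketch that builds a narrowed search URL and then filters the results by exact name, as recommended. The helper names are hypothetical; the /2.0/search path, the quoted query for "exact" matching, the type parameter, and the entries list in the response are stated here as assumptions about the search API rather than taken from this thread.

```python
from urllib.parse import urlencode

# Assumed search endpoint; only the parameter names come from the list above.
SEARCH_URL = "https://api.box.com/2.0/search"


def build_search_url(name, folder_id):
    """Build a search URL restricted to file names inside one folder tree."""
    params = {
        "query": '"%s"' % name,            # quotes request the "exact" mode
        "ancestor_folder_ids": folder_id,  # search only under this folder
        "content_types": "name",           # match on the file name only
        "type": "file",                    # return files only
        "fields": "id,type,name,size",     # trim the returned objects
    }
    return SEARCH_URL + "?" + urlencode(params)


def exact_matches(entries, name):
    """Keep only results whose name is really identical to the one wanted."""
    return [entry for entry in entries if entry.get("name") == name]
```

The final filter matters because the "exact" query still behaves like a fuzzy search, so near matches can come back alongside the file that was asked for.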
For more information, check out:
Let us know if this helps with your use case.
Cheers
Thank you very much for your informative answer.
The preflight upload check is exactly what I need. And in fact rclone already uses it before uploads, but I hadn’t twigged that it is a useful method for finding out file IDs given the file names.
The preflight check seems to give back a File Mini. It would be perfect if it could give back the full File, which is what I need, but that is only one API call away, and I think looking up a File from an ID is probably very quick and easy for the API servers.
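The two-call lookup described here could be sketched roughly as follows in Python. This is only an illustration, not rclone's actual code: find_file and the send callable are hypothetical, send is assumed to return a (status, parsed JSON body) pair, and fetching the full File with a GET on /files/{id} is an assumption about the follow-up call.

```python
API = "https://api.box.com/2.0"


def find_file(send, folder_id, name):
    """Resolve (name, folder_id) to a full File object, or None if absent.

    `send(method, url, body)` is a hypothetical transport supplied by the
    caller; it must return (status_code, parsed_json_body).
    """
    # Call 1: the upload preflight, used here purely as a name lookup.
    status, body = send(
        "OPTIONS",
        API + "/files/content",
        {"name": name, "parent": {"id": folder_id}},
    )
    if status == 200:
        return None  # the upload would be accepted, so no such file exists
    if status != 409 or body.get("code") != "item_name_in_use":
        raise RuntimeError("preflight failed: %s" % status)
    mini = body["context_info"]["conflicts"]  # the File Mini

    # Call 2: exchange the File Mini's ID for the full File.
    status, full = send("GET", API + "/files/" + mini["id"], None)
    if status != 200:
        raise RuntimeError("file lookup failed: %s" % status)
    return full
```

Passing the transport in as a parameter keeps the sketch independent of any particular HTTP client and makes the flow easy to exercise with a stub.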
I did look at the search API too, but I think the fact that it runs behind real time means it won’t work for rclone’s needs.
So I think this is going to work out - thank you 🙂 I’ve sent a copy to the users for testing so we can see how they get on. This discussion is on the rclone forum if you are interested.
I think it might be worth calling out more in the docs that the upload preflight is useful for turning (fileName, directoryID) into fileID. Even though I knew about it and had used it, I still didn’t find it in the docs when I was looking 🙂
Thank you for your help.