Question

Bulk download programmatically from public Box Enterprise folder

  • May 22, 2025
  • 19 replies
  • 108 views

Forum|alt.badge.img

Hi all,

 

I'd like to bulk download from a publicly shared Enterprise folder (https://nrcs.app.box.com/v/naip/). The size of the data is huge (~16TB), so I'd like to download it programmatically through either an API or a command line utility. I'm not sure how.

 

I'm using a Linux cluster, so the Box CLI is of no use to me. I also tried using the Box API, but it looks like I can't access another organization's Enterprise folder through the API (for example, I could only search within my own account).

 

Any suggestion will help, and thanks in advance!

19 replies

Forum|alt.badge.img

Perhaps FTP (https://community.box.com/t5/Upload-and-Download-Files-and/Using-Box-with-FTP-or-FTPS/ta-p/26050) would work?

 

The LFTP client on Linux can also make things a little easier/more reliable, BTW.

 

Hope that helps.


Forum|alt.badge.img

You can use the API to access publicly shared folders. You just need to pass the `BoxApi` header, with the shared link in it, along with every call so the API knows that you should have access to that folder.

 

The general flow is this:

  1. Call the `GET /shared_items` endpoint to resolve the shared link to a folder with the BoxApi header
  2. Make whatever calls against the folder using the ID you get back from Step 1 (in your case, lots of `GET /folders/ID/items` and subsequent `GET /files/ID/content` calls) with the BoxApi header

As an aside, my team will be releasing an updated version of the Box CLI with Linux support next month, which should make this a lot easier for you!
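
The two-step flow above can be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions, not an official client: the helper names (`box_headers`, `api_get`, `list_shared_folder`, `download_file`) are made up for illustration, the access token is assumed to come from your own Box app, and the paging assumes the `limit`/`offset` query parameters and the `entries` field of the folder-items response. As far as I know, `GET /files/ID/content` answers with a redirect to the actual download location, which `urllib` follows on its own.

```python
import json
import shutil
import urllib.parse
import urllib.request

API_ROOT = "https://api.box.com/2.0"


def box_headers(access_token, shared_link, password=None):
    """Headers sent with every call: the bearer token plus the BoxApi header."""
    box_api = "shared_link=" + shared_link
    if password:
        box_api += "&shared_link_password=" + password
    return {"Authorization": "Bearer " + access_token, "BoxApi": box_api}


def api_get(path, headers, params=None):
    """GET an API path and decode the JSON response body."""
    url = API_ROOT + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_shared_folder(access_token, shared_link, page_size=1000):
    """Step 1: resolve the shared link. Step 2: page through the folder's items."""
    headers = box_headers(access_token, shared_link)
    folder = api_get("/shared_items", headers)
    offset = 0
    while True:
        page = api_get(
            "/folders/{}/items".format(folder["id"]),
            headers,
            {"limit": page_size, "offset": offset},
        )
        entries = page["entries"]
        if not entries:
            break
        yield from entries
        offset += len(entries)


def download_file(file_id, headers, dest_path):
    """Stream GET /files/ID/content to disk (redirects are followed by urllib)."""
    request = urllib.request.Request(
        API_ROOT + "/files/{}/content".format(file_id), headers=headers
    )
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

A caller would loop over `list_shared_folder(token, link)`, call `download_file` for entries whose `type` is `file`, and repeat the folder-items call for entries whose `type` is `folder`. With curl, the equivalent download call would need `-L` to follow the redirect.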


Forum|alt.badge.img

Many thanks to both of you for your timely response!

I read in this post that FTP is not recommended as the primary access method, so I followed the suggestion of adding the extra BoxApi header, and it worked like a charm.

It's great to know that Box is working on a Linux CLI. I can imagine how helpful it's going to be for Linux cluster users like me.


Forum|alt.badge.img

Hi mwiller

 

I have the same issue. I want to download zipped images from the images folder under this public_dataset folder. I can download them locally by clicking the download button for each zipped file one by one, but I want to download them on a Linux server.

The data is public and consists of several zipped files. How should I download them from the Linux command line? I checked the documentation, but I don't know the shared link or the password.

curl "https://api.box.com/2.0/shared_items?fields=type,id" \
-H "Authorization: Bearer ACCESS_TOKEN" \
-H "BoxApi: shared_link=SHARED_LINK_URL&shared_link_password=PASSWORD"

 

Thank you

 


Forum|alt.badge.img

Can you give an example for this case?


Forum|alt.badge.img

 The following curl call worked for me:

 

curl https://api.box.com/2.0/shared_items \
-H "Authorization: Bearer ACCESS_TOKEN" \
-H "BoxApi: shared_link=https://nihcc.app.box.com/v/ChestXray-NIHCC"

That will give you the information about the shared folder; you can then make API calls like this to retrieve the folder contents:

 

curl https://api.box.com/2.0/folders/FOLDER_ID/items \
-H "Authorization: Bearer ACCESS_TOKEN" \
-H "BoxApi: shared_link=https://nihcc.app.box.com/v/ChestXray-NIHCC"

Forum|alt.badge.img

Hi

 

I got as far as getting the file IDs, but GET /files/ID/content does not download anything. How do I download the files?


Forum|alt.badge.img


GET /files/ID/content not able to download


Could you clarify what you meant here? What's your request and what error message did you get?


Forum|alt.badge.img

I tried

curl https://api.box.com/2.0/shared_items \
-H "Authorization: Bearer ACCESS_TOKEN" \
-H "BoxApi: shared_link=https://nihcc.app.box.com/v/ChestXray-NIHCC"

but nothing happened on the command line: no error and no output at all. I don't know what the access token is. How do I get one?


Forum|alt.badge.img

Hi,

I want to download the file, but GET /files/ID/content returns nothing. Do I need to use a scraper to download the files?



Forum|alt.badge.img

 An access token is required to authenticate with the Box API, even for public shared resources.  Please see the setup documentation for help getting started setting up an app and getting an access token.


Forum|alt.badge.img

I apologize if this question sounds silly, but is there a Pythonic way to access data using the shared link? All your answers seem to point towards the CLI solution, and I don't have access to Mac/Windows.

 

I created a client using JWT authentication by creating an enterprise developer account, and when I use the following code:

 
from boxsdk import JWTAuth
from boxsdk import Client

# Configure JWT auth object
sdk = JWTAuth.from_settings_file()

# Get auth client
client = Client(sdk)

SHARED_LINK_URL = 'https://nrcs.app.box.com/v/naip/folder/'
shared_item = client.get_shared_item(SHARED_LINK_URL)
print(shared_item.name)

I have also verified that the shared link points to a public box file.

 

Forum|alt.badge.img

You cannot append `/folder/XYZ` to the URL when using the API. Instead, you'll need to do something like this:

 

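# Create a client that sends the shared link with every request
shared_client = client.with_shared_link(SHARED_LINK_URL)
shared_folder = shared_client.get_shared_item(SHARED_LINK_URL)

# List the contents of the shared folder itself
folder_contents = shared_folder.get_items()
# OR fetch a subfolder by ID through the shared client
subfolder = shared_client.folder(FOLDERID).get()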
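
To go from listing to actually downloading, something along these lines might work. It is only a sketch: `download_shared_folder` is a hypothetical helper, and `shared_client` is assumed to be the object returned by `client.with_shared_link(...)` (depending on the SDK version, that call may also expect a password argument). The sketch relies only on the SDK's `folder(...).get_items()` and `file(...).download_to(...)` calls and on each item exposing `type`, `id`, and `name`. File requests go through `shared_client` as well, since the plain `client` would not send the shared link.

```python
import os


def download_shared_folder(shared_client, folder_id, dest_dir):
    """Recursively download every file under folder_id into dest_dir.

    shared_client is assumed to come from client.with_shared_link(...),
    so every request it makes carries the shared link.
    Returns the list of local paths that were written.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for item in shared_client.folder(folder_id).get_items():
        target = os.path.join(dest_dir, item.name)
        if item.type == "folder":
            # Descend into subfolders, mirroring the remote layout locally.
            saved.extend(download_shared_folder(shared_client, item.id, target))
        elif item.type == "file":
            with open(target, "wb") as handle:
                # Fetch through shared_client, not the plain client.
                shared_client.file(item.id).download_to(handle)
            saved.append(target)
    return saved
```

For a dataset in the terabyte range you would probably also want to skip files that already exist locally and retry failed downloads; both are left out here to keep the sketch short.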

Forum|alt.badge.img

Hi  ,

 

Thank you for all your help upthread. I'm endeavoring to follow all these instructions (with the NAIP shared image folder https://nrcs.app.box.com/v/naip, same as OP) to programmatically download from a public Box Enterprise folder, but I'm getting 404s on the second step.

 

`curl https://api.box.com/2.0/shared_items -H "Authorization: Bearer myToken" -H "BoxApi: shared_link=https://nrcs.app.box.com/v/naip" | jq .id` returns `"17936490251"` as expected.

 

However, all of the following fail with a 404 or another Not Found message.

`box folders:items 17936490251 --fields=shared_link`

`curl https://api.box.com/2.0/folders/17936490251/items -H "Authorization: Bearer myToken" -H "BoxApi: shared_link=https://nrcs.app.box.com/v/naip/"`

`box shared-links:get nrcs.app.box.com/v/naip/`

 

Can you shed any light on the best way to do this? I feel like I'm so close (thanks to your help), but not quite there.


Forum|alt.badge.img

Update: it looks like the trailing slash in the shared_link was the problem. 

 

`curl https://api.box.com/2.0/folders/17936490251/items -H "Authorization: Bearer myToken" -H "BoxApi: shared_link=https://nrcs.app.box.com/v/naip"` works, but add the trailing slash after "naip" and it doesn't. Hope this helps someone out. 🙂
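
Since both the trailing slash here and the `/folder/XYZ` suffix mentioned earlier in the thread break the lookup, it may be worth cleaning the URL before putting it in the header. The helper below is a hypothetical sketch, not part of any Box library. It assumes shared links always have the form `/v/<name>` or `/s/<token>`, as the ones in this thread do, and it drops anything after that, including query strings.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_shared_link(url):
    """Reduce a Box shared URL to the bare link used in the BoxApi header."""
    parts = urlsplit(url.strip())
    # Splitting on "/" and dropping empty pieces removes any trailing slash.
    segments = [s for s in parts.path.split("/") if s]
    # Keep only the "/v/<name>" or "/s/<token>" prefix; drop "/folder/<id>" etc.
    if len(segments) >= 2 and segments[0] in ("v", "s"):
        segments = segments[:2]
    return urlunsplit((parts.scheme, parts.netloc, "/" + "/".join(segments), "", ""))
```

For example, `normalize_shared_link("https://nrcs.app.box.com/v/naip/")` gives back the link without the trailing slash.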


Forum|alt.badge.img

Hi,

I have to perform a similar task, except I want to download all the files from https://uta.app.box.com/s/e7nsmloj8xmblosvfg98q42fgqnjy6dv.

I understand the procedure using curl.

But could you please help me generate the ACCESS_TOKEN?

I have searched online, but the methods I found only explain how to generate one for your own app.

Thank You


Forum|alt.badge.img

I am writing Python code to download a file (or several files, or a folder) from https://nrcs.app.box.com/v/soils.

It is public, and I can download it with a few clicks without any username or password.

I am using this code block (the code in this comment), and it asks for a password. Here is the error:

with_shared_link() missing 1 required positional argument: 'shared_link_password'

 

Is there any other way to download it?

 

 

 

code block:

shared_client = client.with_shared_link(SHARED_LINK_URL)
shared_folder = shared_client.get_shared_item(SHARED_LINK_URL)

folder_contents = shared_folder.get_items()
# OR
subfolder = shared_client.folder(FOLDERID).get()


Forum|alt.badge.img

Hi,

 

Were you able to figure out how to download publicly available data through a Python script?

 

I am getting the same error that 'shared_link_password' is missing.

 

I tried running the code with an empty string as the password, and then I got the following error:

 

 

boxsdk.exception.BoxAPIException: Message: Could not find the specified resource
Status: 404
Code: not_found
Request ID: thwf9wgdl2kk2d2l

 

 

 

My code:

from boxsdk import JWTAuth
from boxsdk import Client

# Configure JWT auth object
sdk = JWTAuth.from_settings_file('box_config.json')

# Get auth client
client = Client(sdk)
user = client.user().get()
print('The current user ID is {0}'.format(user.id))

SHARED_LINK_URL = 'https://stonybrookmedicine.app.box.com/v/cellreportspaper'


shared_client = client.with_shared_link(SHARED_LINK_URL,'')
shared_folder = shared_client.get_shared_item(SHARED_LINK_URL)

folder_contents = shared_folder.get_items()
subfolder = shared_client.folder(4***phone number removed for privacy***).get()

for item in subfolder.get_items(limit=1000):
    client.file(file_id=item.id).content()

Any help in figuring this out is much appreciated!!🙂