
Hi,


I need to download all the files in


https://app.box.com/s/rf6p81j3o507e8c5saywtlc1p91f8po9


I cannot download it through the browser because the files are > 150 GB. Thus, I want to create a script that downloads the files one by one.



To do so, I signed up for Box and created a Custom App. On the ‘purpose’ drop-down menu, I selected ‘other’. Next, I selected ‘server authentication (with JWT)’. I then navigated to the configuration tab and clicked ‘Generate a Public/Private Keypair’, which downloaded a file named 0_XXXXXenv_config.json, where the X’s are random digits/characters. I renamed this file to config.json and then tried to run:



from boxsdk import JWTAuth, Client

auth = JWTAuth.from_settings_file('config.json')
client = Client(auth)
auth.authenticate_instance()

shared_folder = client.get_shared_item("https://app.box.com/s/rf6p81j3o507e8c5saywtlc1p91f8po9")
for item in shared_folder.get_items(limit=1000):
    client.file(file_id=item.id).download_to(item.name)
    break



but I get:



boxsdk.exception.BoxOAuthException: 

Message: Please check the 'sub' claim. The 'sub' specified is invalid.

Status: 400

URL: https://api.box.com/oauth2/token

Method: POST

Headers: {'Date': 'Tue, 23 Apr 2024 08:49:18 GMT', 'Content-Type': 'application/json', 'Strict-Transport-Security': 'max-age=31536000', 'Set-Cookie': 'box_visitor_id=6627760e46a336.78777513; expires=Wed, 23-Apr-2025 08:49:18 GMT; Max-Age=31536000; path=/; domain=.box.com; secure; SameSite=None, bv=DSYS-1179; expires=Tue, 30-Apr-2024 08:49:18 GMT; Max-Age=604800; path=/; domain=.app.box.com; secure, cn=4; expires=Wed, 23-Apr-2025 08:49:18 GMT; Max-Age=31536000; path=/; domain=.app.box.com; secure, site_preference=desktop; path=/; domain=.box.com; secure', 'Cache-Control': 'no-store', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'Transfer-Encoding': 'chunked'}



I cannot list the files, let alone download them. What am I doing wrong?

Hi @TravisPetit , welcome to the forum!



Looks like something is off in your config.json file.



The sub claim represents the box_subject_id. It should be your enterprise ID if box_sub_type is enterprise, which it should be here. As a side note, you could also specify a sub type of user and put the user ID in the sub claim.
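To make the relationship between sub and box_sub_type concrete, here is a minimal sketch of the claims that go into the JWT assertion posted to https://api.box.com/oauth2/token. The SDK builds and signs this for you; the function name build_jwt_claims and the 45-second lifetime are illustrative choices of mine, and as far as I know Box expects the expiry to be no more than about 60 seconds after issue.

```python
import secrets
import time

BOX_TOKEN_URL = "https://api.box.com/oauth2/token"


def build_jwt_claims(client_id, subject_id, sub_type="enterprise", now=None):
    """Return the (unsigned) claims for a Box JWT assertion.

    Hypothetical helper for illustration only; the SDK does this internally.
    """
    if sub_type not in ("enterprise", "user"):
        raise ValueError("box_sub_type must be 'enterprise' or 'user'")
    issued_at = int(time.time()) if now is None else int(now)
    return {
        "iss": client_id,                # the app's client ID
        "sub": str(subject_id),          # enterprise ID or user ID
        "box_sub_type": sub_type,        # says which kind of ID 'sub' holds
        "aud": BOX_TOKEN_URL,            # the token endpoint
        "jti": secrets.token_hex(32),    # random unique identifier for this assertion
        "exp": issued_at + 45,           # short-lived on purpose (assumed limit: 60 s)
    }
```

If sub holds an ID that does not match box_sub_type, or an enterprise ID that does not exist, the token endpoint rejects the assertion with the 'sub' error shown above.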



First, I would check config.json. Here is a sample:



{
  "boxAppSettings": {
    "clientID": "fe...so",
    "clientSecret": "3N...xi",
    "appAuth": {
      "publicKeyID": "39749s1s",
      "privateKey": "-----BEGIN ENCRYPTED PRIVATE KEY-----\nMI...G\nlCE=\n-----END ENCRYPTED PRIVATE KEY-----\n",
      "passphrase": "fa...c3"
    }
  },
  "enterpriseID": "877840855",
  "webhooks": {
    "primaryKey": "zX...E0",
    "secondaryKey": "Ew...nf"
  }
}



Make sure your enterprise ID matches the one shown in the Admin Console (Admin console → Billing).
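If you want to sanity-check the file before calling the SDK, something along these lines could help. It is only a sketch: find_config_problems and load_and_check are hypothetical helper names, the keys are taken from the sample above, and treating an enterpriseID of "0" as a warning sign is my assumption, prompted by the downloaded file name starting with 0_.

```python
import json

APP_KEYS = ("clientID", "clientSecret")
APP_AUTH_KEYS = ("publicKeyID", "privateKey", "passphrase")


def find_config_problems(config):
    """Return a list of human-readable problems found in a JWT config dict."""
    problems = []
    app = config.get("boxAppSettings") or {}
    for key in APP_KEYS:
        if not app.get(key):
            problems.append(f"boxAppSettings.{key} is missing or empty")

    app_auth = app.get("appAuth") or {}
    for key in APP_AUTH_KEYS:
        if not app_auth.get(key):
            problems.append(f"boxAppSettings.appAuth.{key} is missing or empty")

    enterprise_id = str(config.get("enterpriseID") or "").strip()
    if not enterprise_id:
        problems.append("enterpriseID is missing or empty")
    elif enterprise_id == "0":
        # Assumption: "0" suggests the account is not part of an enterprise.
        problems.append("enterpriseID is 0, so the account may have no enterprise")
    return problems


def load_and_check(path="config.json"):
    """Read the JSON config from disk and return any problems found."""
    with open(path, encoding="utf-8") as handle:
        return find_config_problems(json.load(handle))
```

An empty list from load_and_check("config.json") only means the file has the expected shape; the enterprise ID still has to match your account, and the app still has to be authorized.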



Another common mistake I make is forgetting to re-authorize the JWT application every time I change its configuration.



Make sure you have submitted your application under Authorization in the developer console:





And then on the admin console → Apps, under the custom apps manager, authorize the app:





This seems to be quite a popular use case; several posts on the forum mention the need for a script to download data.

Just for fun, here is an example similar to yours that skips the .gz files:



"""demo to download files from a box web link"""



import os

from boxsdk import JWTAuth, Client


def main():
    auth = JWTAuth.from_settings_file(".jwt.config.json")
    auth.authenticate_instance()
    client = Client(auth)

    web_link_url = "https://app.box.com/s/rf6p81j3o507e8c5saywtlc1p91f8po9"

    user = client.user().get()
    print(f"User: {user.id}:{user.name}")

    shared_folder = client.get_shared_item(web_link_url, "")
    print(f"Shared Folder: {shared_folder.id}:{shared_folder.name}")
    print("#" * 80)

    print("Type\tID\t\tName")
    # create the target directory if it does not exist yet
    os.makedirs("downloads", exist_ok=True)
    os.chdir("downloads")
    items = shared_folder.get_items()
    download_items(items)
    os.chdir("..")


def download_items(items):
    for item in items:
        if item.type == "folder":
            # mirror the remote folder locally and step into it
            if not os.path.exists(item.name):
                os.mkdir(item.name)
            os.chdir(item.name)

            # print the folder name
            print("-" * 80)
            print(f"\n\n{item.type}\t{item.id}\t{item.name}")
            print("-" * 80)

            # recurse into the folder, then step back out
            download_items(item.get_items())
            os.chdir("..")

        if item.type == "file":
            print(f"{item.type}\t{item.id}\t{item.name}", end="")

            # skip any file whose name ends with .gz
            if item.name.endswith(".gz"):
                print("\t .gz skipped")
                continue

            with open(item.name, "wb") as download_file:
                item.download_to(download_file)
            print("\tdone")


if __name__ == "__main__":
    main()
    print("Done")





Resulting in:



User: 20344589936:UI-Elements-Sample

Shared Folder: 193110430595:INTERVAL_Metabolon_GWAS_summary_stats

################################################################################

Type ID Name

--------------------------------------------------------------------------------





folder 193712488944 M00053

--------------------------------------------------------------------------------

file 1134408152107 INTERVAL_M00053_formattedForMeta_sorted_chr_1.txt.gz .gz skipped

file 1134416904386 INTERVAL_M00053_formattedForMeta_sorted_chr_1.txt.gz.tbi done

file 1134408265055 INTERVAL_M00053_formattedForMeta_sorted_chr_10.txt.gz .gz skipped

file 1134423566812 INTERVAL_M00053_formattedForMeta_sorted_chr_10.txt.gz.tbi done

file 1134409796066 INTERVAL_M00053_formattedForMeta_sorted_chr_11.txt.gz .gz skipped

file 1134417464230 INTERVAL_M00053_formattedForMeta_sorted_chr_11.txt.gz.tbi done

file 1134414789727 INTERVAL_M00053_formattedForMeta_sorted_chr_12.txt.gz .gz skipped

file 1134410779924 INTERVAL_M00053_formattedForMeta_sorted_chr_12.txt.gz.tbi done

file 1134416678707 INTERVAL_M00053_formattedForMeta_sorted_chr_13.txt.gz .gz skipped

file 1134418213051 INTERVAL_M00053_formattedForMeta_sorted_chr_13.txt.gz.tbi done

file 1134417716501 INTERVAL_M00053_formattedForMeta_sorted_chr_14.txt.gz .gz skipped

file 1134412411878 INTERVAL_M00053_formattedForMeta_sorted_chr_14.txt.gz.tbi done

file 1134411158800 INTERVAL_M00053_formattedForMeta_sorted_chr_15.txt.gz .gz skipped

file 1134417029597 INTERVAL_M00053_formattedForMeta_sorted_chr_15.txt.gz.tbi done
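Since the shared folder holds more than 150 GB, a run may well be interrupted, and it can help to skip files that are already on disk when you restart. The helper below is a sketch of that idea and is not part of the script above; needs_download is a hypothetical name, and comparing against the remote size assumes you have that size available (for example by requesting the size field when listing items, which I have not verified for shared items).

```python
from pathlib import Path


def needs_download(local_path, expected_size=None):
    """Return True if the file at local_path should be (re)downloaded.

    A missing file always needs downloading. If expected_size is given,
    an existing file with a different size is treated as incomplete.
    """
    path = Path(local_path)
    if not path.is_file():
        return True
    if expected_size is None:
        # No size to compare against: trust the existing file.
        return False
    return path.stat().st_size != expected_size
```

In the loop above, you would call needs_download(item.name) just before opening the file and continue when it returns False.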



Let us know if this helps


Hi @rbarbosa, thank you so much for your reply.


I do not have an authorization tab. Instead I have a ‘General Settings’, ‘Configuration’ and ‘App Diagnostics’ tab, as shown in this screenshot.





Am I doing something wrong? I chose JWT as an authentication method.



Thanks again.



Best


Travis


Hi @TravisPetit



That explains it!



It seems that you have either a free account or a “Personal Pro” account, neither of which has the admin console or application approval.



These accounts do not support CCG or JWT applications, only OAuth 2.0.



We do have a free developer account that will enable you to do this.



You have a couple of options moving forward:





  • Create a new free developer account and discard the existing one. Check your current usage, files, shared links, etc. before doing so.

  • Use the current account with OAuth 2.0. Not ideal for scripting, but doable.
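For the OAuth 2.0 option, the reason it is less convenient for scripting is that a person has to open an authorization URL in a browser and approve the app before the script gets a code it can exchange for tokens. A rough sketch of that first step, using only the standard library, could look like this. The function name and the redirect URI are placeholders of mine; the redirect URI must match the one configured for the app.

```python
import secrets
from urllib.parse import urlencode

AUTHORIZE_URL = "https://account.box.com/api/oauth2/authorize"


def build_authorize_url(client_id, redirect_uri, state=None):
    """Return (url, state) for the browser step of the OAuth 2.0 code flow."""
    # The state value is echoed back on the redirect and guards against CSRF.
    state = state or secrets.token_urlsafe(16)
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "state": state,
        }
    )
    return f"{AUTHORIZE_URL}?{query}", state
```

After approval, Box redirects to the redirect URI with a code parameter, which the script then exchanges for an access token and a refresh token.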




Let us know which you would prefer.


Feel free to send me a private message with more details so I can identify your current account (is it the one associated with your forum user email?).


Hi @rbarbosa



I created a new developer account, and authorized a new JWT app. I can run your script now.


Thank you very much!



Best,


Travis


Perfect, happy to help!



