Can I get the content URL for items in a folder?

When using the endpoint to list items in a folder, is there any way to include representations in the nested items?


Hi, welcome to the forum!

Can you elaborate on what you mean by representations in the nested items?

Something like a path to the item?

Yep! The URL returns a collection of entries. If possible, I’d like to list the contents of a folder and, for any included sub-folders, include the item count and, for files, include the URL to retrieve the file.

Well, technically you can, of course.

However, please keep in mind that Box is not an operating system file system. Each folder can contain millions of files and folders, so, unlike a FAT32 file system, it doesn’t know how many items are inside a folder.

In fact, this concern shows up in the SDKs, where .get_items() returns an iterator, so you don’t get the length until you have iterated to the end. Resource-wise this is not efficient.

So here is an example using a Python script:

from boxsdk import Client
from boxsdk.object.folder import Folder

BASE_FOLDER_ID = "0"  # 0 is the root folder


def get_folder_by_id(client: Client, folder_id: str) -> Folder:
    """Get the details of a folder"""
    return client.folder(folder_id).get()


def print_folder_info(client: Client, folder_id: str, depth: int = 0):
    """Print a folder's name and its item count"""
    folder = get_folder_by_id(client, folder_id)
    # get_items() returns an iterator with no length, so count by iterating
    item_count = sum(1 for _ in folder.get_items())
    print(f"{' ' * depth * 2}{folder.name}/ ({item_count} items)")


def list_folder(client: Client, folder_id: str, depth: int = 0):
    """List the contents of a folder"""
    items = client.folder(folder_id).get_items()
    for item in items:
        if item.type == "folder":
            print_folder_info(client, item.id, depth)
            list_folder(client, item.id, depth + 1)  # recursion
        else:
            print(f"{' ' * depth * 2}{item.name}")


def main():
    """
    Simple script to demonstrate how to use the Box SDK
    with OAuth2 authentication
    """
    # get_client and conf are not shown in this post; they are assumed to
    # be defined elsewhere and to return an authenticated Client
    client = get_client(conf)

    print_folder_info(client, BASE_FOLDER_ID, 0)
    list_folder(client, BASE_FOLDER_ID, 1)


if __name__ == "__main__":
    main()

Note that I’ve made it worse by using recursion to list all sub-folders 🙂

The result is:

All Files/ (3 items)
  My Signed Documents/ (1 items)
    Box-Dive-Waiver.pdf 2023-08-16 12.08.09/ (2 items)
      Box-Dive-Waiver Signing Log.pdf
      Box-Dive-Waiver.pdf
  UIE Samples/ (9 items)
    Audio.mp3
    Document (PDF).pdf
    Document (Powerpoint).pptx
    HTML.html
    JS-Small.js
    JSON.json
    Preview SDK Sample Excel.xlsx
    Single Page.docx
    ZIP.zip
  Get Started with Box.pdf

Cheers

I might have been unclear. I do understand that I could iterate over a paged list of items to build this list myself.

The example response for the listing of folder contents does include a collection of items and the total count of items. So is that inaccurate or does Box in fact know how many files are in each folder?

Each of these items is represented like:

{
  "id": "12345",
  "etag": "1",
  "type": "file",
  "sequence_id": "3",
  "name": "Contract.pdf",
  "sha1": "85136C79CBF9FE36BB9D05D0639C70C265C18D37",
  "file_version": {
    "id": "12345",
    "type": "file_version",
    "sha1": "134b65991ed521fcfe4724b7d814ab8ded5185dc"
  }
}

In this case this is a “file”, and I would like it to include another property that gives the content URL (even if this were nested within that structure).

For a “folder” type I’d like to include that item count.

My elaboration probably just muddied this concept. I’m fine with iterating the entries in the root folder, for example, if I request the first 1000 (the maximum limit value) of a root folder with millions of files and folders within it. I just don’t want to have to iterate over those just to get IDs so I can then request their information from another endpoint and finally assemble the URL to retrieve the item.

Ah! The total_count of the API call, of course. I stand corrected.

Unfortunately the Python SDK does not expose that property and returns one item at a time using an iterator. I’m not sure about the other SDKs.

So for now I’ll assume you’re querying the API endpoints directly.
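As a minimal sketch of what reading total_count from a raw response could look like: the dictionary below is a hand-written stand-in for an offset-paginated response from GET /2.0/folders/{folder_id}/items, not real output, and summarize_listing is a hypothetical helper name.

```python
from collections import Counter
from typing import Any

# Hand-written stand-in for an offset-paginated listing response.
sample_response: dict[str, Any] = {
    "total_count": 3,
    "offset": 0,
    "limit": 1000,
    "entries": [
        {"id": "111", "type": "folder", "name": "My First Sub-folder"},
        {"id": "222", "type": "file", "name": "my_dog.png"},
        {"id": "333", "type": "file", "name": "my_cat.png"},
    ],
}


def summarize_listing(response: dict[str, Any]) -> dict[str, Any]:
    """Return the listing's total_count plus a per-type tally of this page."""
    entries = response.get("entries", [])
    return {
        # total_count may be missing (None here) when it is not returned
        "total_count": response.get("total_count"),
        "page_size": len(entries),
        "by_type": dict(Counter(entry["type"] for entry in entries)),
    }


summary = summarize_listing(sample_response)
print(summary)
```

Note that total_count describes the folder being listed, not the sub-folders that appear among its entries.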

If the total_count solves the first issue, let me try to tackle the URL to the item.

For files and folders there are different URLs you can consider, depending on the use case.

The Box web app direct link, which a user can click to open the item in the Box app.

The shared link (if one exists), typically used to share the file or folder with another user.
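To make these options concrete, here is a rough sketch that derives each kind of URL from an item dictionary. The app.box.com path pattern and the helper names are my assumptions for illustration. The shared link is only available when the item was fetched with its shared_link field and a link has actually been created. The content endpoint is included because the question was about a content URL, and it requires an authenticated request.

```python
from typing import Any, Optional

WEB_APP_BASE = "https://app.box.com"   # assumed web app host and path pattern
API_BASE = "https://api.box.com/2.0"   # Box API base URL


def web_app_url(item: dict[str, Any]) -> str:
    """Link that opens a file or folder in the Box web app."""
    if item["type"] not in ("file", "folder"):
        raise ValueError(f"unsupported item type: {item['type']}")
    return f"{WEB_APP_BASE}/{item['type']}/{item['id']}"


def shared_link_url(item: dict[str, Any]) -> Optional[str]:
    """Shared link URL, or None when the item has no shared link."""
    shared = item.get("shared_link") or {}
    return shared.get("url")


def content_url(file_id: str) -> str:
    """API endpoint that returns the file content (needs an access token)."""
    return f"{API_BASE}/files/{file_id}/content"


mini_file = {"id": "12345", "type": "file", "name": "Contract.pdf"}
print(web_app_url(mini_file))
print(shared_link_url(mini_file))
print(content_url(mini_file["id"]))
```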

Perhaps you can elaborate on your goal/use case, and also on what tools/stack you are using.

Cheers

Thanks for your patience, but I’m not sure I fully understand what you’re suggesting. We can put the Python SDK’s missing information aside for now as I’m not using it. I am in fact, for all our purposes, querying the API directly. As a side note here, is there something about the category or this post that led you to believe I was using Python?

The total_count does not solve the issue for a couple of reasons. First, as you mentioned, these folders can contain an extremely large number of items, even without recursing into sub-folders. The API will reject calls to this endpoint if the offset parameter is greater than 10000. It’s entirely possible that I will encounter folders with more items than this limit, and this appears to restrict me to using marker pagination, for which no total_count will be returned. Second, knowing the total count (when possible) does not provide me any additional information about the items themselves.
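For reference, the marker-based loop I’m describing looks roughly like the sketch below. fetch_page is a hypothetical callable standing in for an authenticated GET to /2.0/folders/{folder_id}/items. The usemarker, marker and limit parameters and the next_marker response field are how I understand the endpoint to work, so treat them as assumptions. The fake fetcher only exists so the loop can run without a network.

```python
from typing import Any, Callable, Iterator

# Hypothetical signature: (folder_id, query_params) -> parsed JSON page
FetchPage = Callable[[str, dict[str, Any]], dict[str, Any]]


def iter_folder_items(
    fetch_page: FetchPage, folder_id: str, limit: int = 1000
) -> Iterator[dict[str, Any]]:
    """Yield every entry in a folder using marker-based pagination."""
    marker = None
    while True:
        params: dict[str, Any] = {"usemarker": "true", "limit": limit}
        if marker:
            params["marker"] = marker
        page = fetch_page(folder_id, params)
        yield from page.get("entries", [])
        marker = page.get("next_marker")
        if not marker:  # no marker means this was the last page
            break


def fake_fetch_page(folder_id: str, params: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for the HTTP call: two canned pages keyed by marker."""
    pages = {
        None: {"entries": [{"id": "1"}, {"id": "2"}], "next_marker": "abc"},
        "abc": {"entries": [{"id": "3"}], "next_marker": ""},
    }
    return pages[params.get("marker")]


ids = [item["id"] for item in iter_folder_items(fake_fetch_page, "0")]
print(ids)
```

Nothing in this loop ever sees a total, which is the first of the two problems above.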

Let’s assume that you wish to display a graphical representation of the contents of a user’s Box account so that they might select which items they wish to import into your application. Let’s assume this user is dealing only with items for which they have full permissions; we are not concerned with any others. If you were to query the API for /folders/0/items, you would start at the root directory and you’d get a list of items. Maybe a handful are folders, and a handful are files. The fact that “weblinks” might exist, or the type of each file, is not important at this time, mostly to keep this simple. You have a list of items with various attributes, but most importantly, for each of these items you have a name, ID, and type.

Your graphical representation may start looking just like a list:

My First Sub-folder

Family Outings

my_dog.png

my_cat.png

But quickly you realize maybe your user wants to differentiate these things more quickly. So maybe you add an icon to this list based on the type (which you already know, conveniently!):

📂 My First Sub-folder

📂 Family Outings

🖼 my_dog.png

🖼 my_cat.png
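That step needs nothing beyond the mini representation. A minimal sketch, where the icon choices and the render_entries name are just illustrative and every file is treated as a picture to match the example:

```python
# Map the item type from the listing to an icon; unknown types get a fallback.
ICONS = {"folder": "📂", "file": "🖼", "web_link": "🔗"}


def render_entries(entries: list[dict]) -> list[str]:
    """Build one display line per entry from its type and name."""
    return [f"{ICONS.get(entry['type'], '?')} {entry['name']}" for entry in entries]


entries = [
    {"id": "1", "type": "folder", "name": "My First Sub-folder"},
    {"id": "2", "type": "folder", "name": "Family Outings"},
    {"id": "3", "type": "file", "name": "my_dog.png"},
    {"id": "4", "type": "file", "name": "my_cat.png"},
]
lines = render_entries(entries)
print("\n".join(lines))
```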

Things are going well! The client is very happy but then they ask, “Hey, could you tell me how many things are in those folders before I drill down? Could I see the actual thumbnail of pictures instead of an icon?” It’s at this point a small bead of sweat runs down your brow. You hesitate and answer, “Maybe?” They would expect something more like:

📂 My First Sub-folder (21 items)

📂 Family Outings (123 items)

🐩 my_dog.png

😸 my_cat.png

You’ve got an Items resource and it nests several other resources (specifically File and Folder), but they’re only the mini representation. More complete versions of the File resource would have information about the share link, or representations that you could use. More complete versions of the Folder resource have an item_collection property that includes item_collection.total_count.

Is there any way to include these fields for nested items? Because right now you have the IDs. You know you can query the API to get the full representations and determine these values. In this tiny example, though, that’s one request for the list, 4 requests for full representations of items (2 to the folder endpoint and 2 to the file endpoint), then 2 more requests for the actual images because they’re not public and require authenticated API requests to retrieve. That’s 7 requests, but you can see how this could quickly scale! There’s nothing we can do, or that I expect, to remove the need to make a request for each image thumbnail. I would have hoped that I could reduce this to:

1 request to get the list, including folder content counts and a thumbnail URL for files

2 requests to retrieve thumbnails.

Sure, this makes my life a little easier to not have to iterate over things and all, but it also reduces the traffic overall.
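The arithmetic above can be written out as a small sketch. The function names are made up, and the 400 folders / 600 images split for a full page of 1000 entries is an invented illustration, not a measurement.

```python
def requests_today(folders: int, files: int, thumbnails: int) -> int:
    """One listing call, one call per full representation, one per thumbnail."""
    return 1 + folders + files + thumbnails


def requests_hoped_for(thumbnails: int) -> int:
    """One listing call that already carries counts and URLs, plus thumbnails."""
    return 1 + thumbnails


# The tiny example from this post: 2 folders, 2 image files.
print(requests_today(folders=2, files=2, thumbnails=2))   # 7
print(requests_hoped_for(thumbnails=2))                    # 3

# A hypothetical full page of 1000 entries: 400 folders, 600 images.
print(requests_today(folders=400, files=600, thumbnails=600))  # 1601
print(requests_hoped_for(thumbnails=600))                       # 601
```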

Also, because you’re trying to be helpful, I’m using PHP to build this and the only real “tool” I can think of is the Laravel framework. I did look but I didn’t see any Box recommended PHP SDK or Laravel package.

Hi, thanks for the detailed contextualization.

Let’s start with the easy stuff.

Not really, this is on me.

Whenever anyone asks a question without giving enough context about tooling, I include a Python example in an effort to be proactive.

There are a couple of repos mentioned in the community project, but one is gone and the other hasn’t been updated in 4 years. For what it’s worth, look in here.

Not really, sorry.

On the /folders/:folder_id/items endpoint we are limited to the mini versions of files, folders and web links.

I don’t see a way around having to make extra requests to get the information you want.
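A rough sketch of what those extra requests could look like, assuming a hypothetical get_json helper that performs an authenticated GET and returns parsed JSON. Folders cost one extra call each to read item_collection.total_count from the full Folder resource. For files, the thumbnail URL below is built from the ID alone, on the assumption that the /files/{id}/thumbnail.png endpoint is enough for your case. Fetching the image itself is still one authenticated request per file.

```python
from typing import Any, Callable

API_BASE = "https://api.box.com/2.0"

# Hypothetical helper type: authenticated GET that returns parsed JSON.
GetJson = Callable[[str], dict[str, Any]]


def enrich_entries(
    entries: list[dict[str, Any]], get_json: GetJson
) -> list[dict[str, Any]]:
    """Add item_count to folders and thumbnail_url to files."""
    enriched = []
    for entry in entries:
        extra: dict[str, Any] = {}
        if entry["type"] == "folder":
            # One extra request per folder for the full representation
            full = get_json(f"{API_BASE}/folders/{entry['id']}")
            extra["item_count"] = full["item_collection"]["total_count"]
        elif entry["type"] == "file":
            extra["thumbnail_url"] = f"{API_BASE}/files/{entry['id']}/thumbnail.png"
        enriched.append({**entry, **extra})
    return enriched


def fake_get_json(url: str) -> dict[str, Any]:
    """Stand-in for the HTTP call so the sketch runs offline."""
    return {"item_collection": {"total_count": 21}}


result = enrich_entries(
    [
        {"id": "111", "type": "folder", "name": "My First Sub-folder"},
        {"id": "222", "type": "file", "name": "my_dog.png"},
    ],
    fake_get_json,
)
print(result)
```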

It’s the same with our JavaScript UI Elements: at best they show some static icons and no extra information.

Here is an example:

I’m mentioning the Box UI Elements because it seems to me that you are implementing some sort of Box content browser by hand.

Next your customer is probably going to ask for preview, uploads, and a file/folder picker. All of these exist in the Box UI Elements but are not customizable in the way your examples require.

But I’ve got to ask: in your view, is there real business value in your customer’s request (assuming the above is just an illustration), or is it a nice to have?

There are a couple of repos mentioned in the community project, but one is gone and the other hasn’t been updated in 4 years. For what it’s worth, look in …

Maybe eventually I can lend a hand there.

I don’t see a way around having to make extra requests to get the information you want.

It’s the same with our JavaScript UI Elements: at best they show some static icons and no extra information.

I wanted to confirm that with you. Yes, we’ve done similarly to your JS UI Elements in the past as well.

Next your customer is probably going to ask for preview, uploads, and a file/folder picker. All of these exist in the Box UI Elements but are not customizable in the way your examples require.

Fortunately I can be fairly certain I will not require an upload functionality.

But I’ve got to ask: in your view, is there real business value in your customer’s request (assuming the above is just an illustration), or is it a nice to have?

In my opinion a “nice to have” can be real business value. Sometimes it’s those quality of life improvements that lead to a customer choosing one product over another. That being said, if someone wanted to import images from Box into another application they would probably use some sort of Box content browser. While maybe their files are all appropriately named and well organized, it’s more likely they are not. In these cases having the preview when making selections is a real value. The same may be true if they can select entire folders from this list: knowing how many items they are including with that single selection may be helpful.
