We have been on Box for almost 8 years, have around 275 TB of data, and shuffle around 8.5 TB/month. Much of that is legacy data archived in service accounts, along with files that have many versions that are no longer relevant. Keynote collaboration, for example, saves hundreds if not thousands of versions of our files as we build decks, but we're not reverting to or looking at hundreds of versions of files that were presented 5 years ago.

I’ve written a script that prunes old versions, leaving behind the 3 most recent versions plus 7 additional versions spaced out over the file’s life. This way, I can find an archive folder with data many years old, hand it to my script, and let it search, sort, and remove the excess data that is no longer needed.
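
For readers who want to try a similar retention rule, here is a minimal sketch of the selection step only. It is not the script described above: the `Version` tuple, the function name `select_versions_to_keep`, and the choice to space the 7 extra versions evenly by creation time (rather than by version count) are all assumptions made for illustration. Listing and deleting versions through Box is deliberately left out.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Version(NamedTuple):
    # Hypothetical record; assumes version_id is unique per file.
    version_id: str
    created_at: datetime


def select_versions_to_keep(versions, recent=3, spaced=7):
    """Return the set of version_ids to keep under a '3 recent + 7 spaced' rule."""
    # Newest first, so the first `recent` entries are the most recent versions.
    ordered = sorted(versions, key=lambda v: v.created_at, reverse=True)
    keep = {v.version_id for v in ordered[:recent]}

    older = ordered[recent:]
    if len(older) <= spaced:
        # Not enough older versions to thin out: keep them all.
        keep.update(v.version_id for v in older)
        return keep

    # Assumption: "spaced out over the file's life" means evenly spaced in
    # time between the oldest and newest of the remaining versions.
    oldest, newest = older[-1].created_at, older[0].created_at
    step = (newest - oldest) / max(spaced - 1, 1)

    candidates = list(older)
    for i in range(spaced):
        target = oldest + step * i
        # Keep the not-yet-chosen version closest to each target time.
        best = min(candidates, key=lambda v: abs(v.created_at - target))
        keep.add(best.version_id)
        candidates.remove(best)
    return keep


if __name__ == "__main__":
    # Example: 100 versions, one per day.
    start = datetime(2020, 1, 1)
    history = [Version(str(i), start + timedelta(days=i)) for i in range(100)]
    keep = select_versions_to_keep(history)
    to_delete = sorted({v.version_id for v in history} - keep, key=int)
    print(f"keeping {len(keep)} versions, deleting {len(to_delete)}")
```

Spacing by time rather than by index is one possible design choice: collaborative editing tends to create bursts of versions, and picking every Nth version would over-sample those bursts.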

Other than reducing extra junk on the platform, will removing unneeded data (most of which is not visible to many users and is stored only in service accounts for archival purposes) improve performance on Box, specifically navigation and search? Or is it better to just leave this data alone, since we have unlimited storage? We’re on Enterprise but don’t have Box Archive; would that product help segment this data?

Hi @MONOGrant 👋 Welcome to the Box Community! I've submitted a support ticket on your behalf so our Product Support team can review and assist with improving the performance of Box.

Please keep an eye out for an email response to address your inquiries. Have a great day! 💙

