Question

Metadata extract using Box AI

  • August 14, 2025
  • 3 replies
  • 223 views

  • Participating Frequently
  • I have a lease contract in PDF that is about 10 pages long, and a metadata extraction template with 10 fields (execution date, lease square footage, etc.).
  • The extracted data comes from multiple pages (a few fields come from page 1, and others come from pages 3, 5, and 10).
     
  • What am I doing?
    I am using Box AI to extract the data. It all works well, but I cannot see exactly which text was extracted from which page.

What is missing (this is where I need your help):

  1. Ideally, I would like the extracted data highlighted in the PDF document, either in yellow or with a box around it, so it is easy to review. In effect, a citation for each extracted value I see.
     
  2. On a related question -- once I have reviewed the extracted data and approved that everything is OK, is there a way to lock it?

    Think of a scenario with 10 contracts, of which I have reviewed 6. For those 6, the extracted metadata either looked good or I corrected any extraction errors. Now I want to lock this metadata so that no junior person or AI agent can change it, since I have already reviewed it. How can I achieve this?
  3. To continue the above example, how can I pull the metadata from all the reviewed contracts into an Excel file? Since I have reviewed 6 contracts, I would like an Excel file with 6 rows of data.

    Thanks!

 

3 replies

  • Author
  • Participating Frequently
  • August 14, 2025

@Anna Spallino Box  -- please see above.  Thanks in advance!


thomasdeely Box
  • Sr. Community Manager
  • August 19, 2025

@sunil there are a few topics in your question, and some of them touch on our roadmap. Would you be willing to talk with our product team? cc @amandanielsen Box


  • Author
  • Participating Frequently
  • August 27, 2025

Sure ​@thomasdeely Box -- more than happy to talk.  

FYI -- I plan to add the extracted data to Salesforce, so any workaround that handles the approval process in Salesforce would also work.
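
For anyone following along, here is a minimal sketch of the export step from question 3 (one row per reviewed contract). It assumes the extracted metadata has already been pulled into plain Python dictionaries; no Box or Salesforce API calls are shown. The field names, the `reviewed` flag, and the sample data are all made up for illustration. It writes a CSV, which Excel opens directly.

```python
import csv
import io

# Hypothetical field names; a real metadata template defines its own keys.
FIELDS = ["file_name", "execution_date", "lease_square_footage"]


def reviewed_rows(contracts):
    """Return only the contracts a reviewer has marked as approved."""
    return [c for c in contracts if c.get("reviewed")]


def write_reviewed_csv(contracts, out):
    """Write one CSV row per reviewed contract and return the row count."""
    # extrasaction="ignore" drops keys (such as "reviewed") not listed in FIELDS.
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    rows = reviewed_rows(contracts)
    writer.writerows(rows)
    return len(rows)


# Made-up sample data: 10 contracts, the first 6 marked as reviewed.
contracts = [
    {
        "file_name": f"lease_{i}.pdf",
        "execution_date": "2025-01-01",
        "lease_square_footage": 1000 + i,
        "reviewed": i <= 6,
    }
    for i in range(1, 11)
]

buffer = io.StringIO()  # swap for open("reviewed.csv", "w", newline="") to write a file
count = write_reviewed_csv(contracts, buffer)
```

Filtering on an explicit review flag keeps unreviewed extractions out of the sheet; how that flag is stored and protected from later edits is the open question in item 2.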