Box uses a Retrieval Augmented Generation (RAG) process built on vector embeddings to provide the best AI-powered answers across Box content. This approach allows us to support user permissions, incorporate new information immediately, avoid training models, and provide AI capabilities across multiple files and large amounts of content.
These efforts build on our existing expertise in AI, search, metadata, and file storage to design solutions that are flexible and scalable. This not only powers the functionality you see in the product; customers can also build on top of our APIs rather than building this infrastructure themselves.
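As a concrete illustration of building on these APIs, here is a minimal sketch that asks Box AI a question about a file and then extracts information from it. The endpoint paths, request fields, and the developer token and file ID are assumptions to verify against the current Box developer documentation; this is a sketch, not a definitive integration.

```python
import requests

BOX_API = "https://api.box.com/2.0"
# Assumptions: a valid developer token and a file ID your user can access.
TOKEN = "YOUR_DEVELOPER_TOKEN"
FILE_ID = "1234567890"
headers = {"Authorization": f"Bearer {TOKEN}"}

# Ask Box AI a question grounded in a single file's content.
ask = requests.post(
    f"{BOX_API}/ai/ask",
    headers=headers,
    json={
        "mode": "single_item_qa",
        "prompt": "Summarize the key obligations in this contract.",
        "items": [{"type": "file", "id": FILE_ID}],
    },
)
ask.raise_for_status()
print(ask.json().get("answer"))

# Extract freeform metadata from the same file.
extract = requests.post(
    f"{BOX_API}/ai/extract",
    headers=headers,
    json={
        "prompt": "Extract the contract parties, effective date, and renewal terms.",
        "items": [{"type": "file", "id": FILE_ID}],
    },
)
extract.raise_for_status()
print(extract.json().get("answer"))
```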
To achieve this functionality, we employ various tools and technologies within our own environment, which is an important point: customers know their content is not being used externally.
1. AI models (LLMs): We utilize multiple AI models that meet our enterprise-grade requirements (see our AI Principles: https://blog.box.com/box-ai-principles), such as GPT-4 via Azure OpenAI and Google's Gemini.
2. Metadata Extraction API: We utilize AI models, along with advanced extraction techniques, to extract information from files.
3. Vector DB: Box uses a vector database, combined with our Search infrastructure, to index vector embeddings and power RAG.
4. Vector Embeddings: Box uses vector embeddings from our AI vendors; these are evolving based on quality and support for multi-modality. A simplified sketch of how these components fit together follows this list.
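To make the retrieval flow concrete, here is a self-contained sketch of the RAG pattern described above. Everything in it is illustrative: the toy embedding function, the in-memory index, the permission filter, and the placeholder LLM call are stand-ins for Box's production embedding models, vector database, Search infrastructure, and hosted AI models.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy embedding: hash words into a fixed-size vector, then normalize."""
    vec = np.zeros(64)
    for word in text.lower().split():
        vec[hash(word) % 64] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# A tiny "index": (chunk text, embedding, set of users allowed to see it).
corpus = [
    ("Q3 revenue grew 12% year over year.", {"alice", "bob"}),
    ("The launch date moved to November.", {"alice"}),
    ("Benefits enrollment closes Friday.", {"bob"}),
]
index = [(text, embed(text), allowed) for text, allowed in corpus]

def retrieve(query: str, user: str, k: int = 2) -> list[str]:
    """Rank only permission-visible chunks by cosine similarity to the query."""
    q = embed(query)
    visible = [(float(q @ emb), text)
               for text, emb, allowed in index if user in allowed]
    visible.sort(reverse=True)
    return [text for _, text in visible[:k]]

def call_llm(prompt: str) -> str:
    # Placeholder: a real system sends the prompt to a hosted model (e.g. GPT-4).
    return f"[LLM response to a {len(prompt)}-char grounded prompt]"

def answer(query: str, user: str) -> str:
    """Ground the prompt in retrieved chunks before calling the LLM."""
    context = "\n".join(retrieve(query, user))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)

print(answer("How did revenue do?", "alice"))
```

Note that the permission check happens at retrieval time, before anything reaches the model, which is how RAG can respect per-user access without retraining anything.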
Contributions by several Boxers, including Ben Kus and Tyan Hynes.