Video files aren’t supported yet. Photos should work as long as they’re in one of the supported formats: JPEG, JPG, TIF, TIFF, or PNG.
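If it helps to pre-screen a folder before uploading, here is a minimal sketch that checks file extensions against the list above. The names `SUPPORTED_PHOTO_EXTENSIONS` and `is_supported_photo` are made up for illustration and are not part of any Box API. It only looks at the extension, not the actual file contents.

```python
from pathlib import Path

# Extensions taken from the supported-format list above (assumed case-insensitive).
SUPPORTED_PHOTO_EXTENSIONS = {".jpeg", ".jpg", ".tif", ".tiff", ".png"}


def is_supported_photo(filename: str) -> bool:
    """Return True if the file's extension is in the supported photo list."""
    return Path(filename).suffix.lower() in SUPPORTED_PHOTO_EXTENSIONS


if __name__ == "__main__":
    # Hypothetical file names, used only to show the check.
    for name in ["beach.JPG", "scan.tiff", "clip.mp4"]:
        print(name, is_supported_photo(name))
```

A file like `clip.mp4` fails this check, which matches the note that video isn't supported yet.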
This may also be a case for EXIF data, which isn't really an AI feature but rather metadata embedded in the photo file that Box could scrape. See https://photographylife.com/what-is-exif-data
Box Consulting has a custom offering that covers both a one-time EXIF scrape of a set of existing files and applying EXIF data on a go-forward basis.
A workaround for video is to extract from the video transcript, although you won’t get the full context of the video.