I have an application that polls the Box Enterprise Events API to retrieve and monitor events for a customer. A request is made every 10 minutes; each request extracts the next_stream_position cursor from the response and passes it as the stream_position parameter of the next request. The goal is to retrieve events in 10-minute intervals.
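For reference, one polling cycle looks roughly like the sketch below. It is a simplified illustration, not our production code: `build_params`, `poll_cycle`, and the injected `fetch` callable are hypothetical names, and `stream_type=admin_logs` with `limit=500` are assumptions about how the enterprise event stream is being queried. `fetch` stands in for whatever performs the authenticated GET against the events endpoint and returns the decoded JSON body.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type: takes query parameters, returns the decoded JSON response body.
Fetch = Callable[[Dict[str, str]], Dict]


def build_params(stream_position: Optional[str]) -> Dict[str, str]:
    """Build the query parameters for one events request."""
    # Assumption: the enterprise stream is selected with stream_type=admin_logs.
    params = {"stream_type": "admin_logs", "limit": "500"}
    if stream_position is not None:
        # Resume from the cursor returned by the previous request.
        params["stream_position"] = stream_position
    return params


def poll_cycle(fetch: Fetch, stream_position: Optional[str]) -> Tuple[List[Dict], str]:
    """Make one request and return (events, cursor to use in the next cycle)."""
    body = fetch(build_params(stream_position))
    events = body.get("entries", [])
    # next_stream_position is stored as a string so it can be passed back unchanged.
    next_position = str(body["next_stream_position"])
    return events, next_position
```

A scheduler calls `poll_cycle` every 10 minutes and persists the returned cursor between calls.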
I noticed that the Box Enterprise Events API is unreliable when used this way: a number of events were missed, i.e. never returned by the API. We confirmed this as follows:
- Look at all of the events our primary request cycle retrieved that have a timestamp dated yesterday (say Monday)
- Start a secondary API request cycle, using yesterday's date (Monday) as the starting point
- Compare the events retrieved by the secondary cycle that have a timestamp of Monday to the events from the primary cycle
- Result: the secondary cycle retrieved events dated Monday that the primary cycle did not. It retrieved MORE events, meaning that some were missed by the primary cycle
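The comparison in the steps above can be sketched as follows. This is a minimal illustration under a few assumptions: each event is a dict with the `event_id` and `created_at` fields from the API response, `created_at` is an ISO 8601 string, and the day is matched by string prefix (so it is taken in whatever UTC offset the timestamp carries). The function names `events_on` and `missed_events` are hypothetical.

```python
from typing import Dict, Iterable, List


def events_on(events: Iterable[Dict], day: str) -> Dict[str, Dict]:
    """Index by event_id the events whose created_at falls on `day` (YYYY-MM-DD)."""
    # Assumption: a prefix match on the ISO 8601 timestamp is good enough here.
    return {e["event_id"]: e for e in events if e["created_at"].startswith(day)}


def missed_events(primary: Iterable[Dict], secondary: Iterable[Dict], day: str) -> List[Dict]:
    """Return events for `day` that the secondary cycle saw but the primary did not."""
    seen_by_primary = events_on(primary, day)
    seen_by_secondary = events_on(secondary, day)
    missing_ids = seen_by_secondary.keys() - seen_by_primary.keys()
    # Sort by event_id only to make the output order deterministic.
    return [seen_by_secondary[event_id] for event_id in sorted(missing_ids)]
```

Keying on `event_id` rather than on timestamps means duplicates or reordered events in either cycle do not show up as false differences.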
The main difference is that the secondary request cycle looked for events a day after they occurred, while the primary request cycle looks for events that occurred in the previous 10 minutes. Even so, the Box API uses stream_position, so I don't understand how events could be missed, even allowing for some latency between when an event occurs and when it becomes available in the API. Shouldn't stream_position guarantee that the returned events are complete, even if they arrive out of order?