Data Export API

The Data Export API provides a way of fetching raw data in the form of player events and sessions.

Using the API

Exporting data

The main way of using the API for data export is by providing either a customer or a customer/business unit and a time period. The API will then return a streamed response that can be consumed in any way the user sees fit.

The API provides five endpoints for this. All are accessed using a GET request; the limited sessions endpoint also accepts a POST request:

/v1/export/customer/{customer_name}/events for raw event data corresponding to an entire customer.
/v1/export/customer/{customer_name}/business_unit/{business_unit}/events for raw event data corresponding to a specific business unit.
/v1/export/customer/{customer_name}/sessions for session summaries corresponding to an entire customer.
/v1/export/customer/{customer_name}/business_unit/{business_unit}/sessions for session summaries corresponding to a specific business unit.
/v1/export/customer/{customer_name}/business_unit/{business_unit}/sessions/limited for limited session summaries, with some filter functionality.

The time period is selected with the start and end query parameters. Both are required in order to make a request and must be provided in ISO 8601 format.
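As an illustration of how the path and the start and end parameters fit together, the sketch below assembles an export URL. The host `export.example.com` and the helper name `build_export_url` are assumptions made for this example; they do not come from the API itself. Dates are formatted with `isoformat()`, which yields ISO 8601 strings.

```python
from datetime import date
from urllib.parse import quote, urlencode

# Hypothetical host: the real base URL is not stated in this document.
BASE_URL = "https://export.example.com"


def build_export_url(customer, kind, start, end, business_unit=None, base_url=BASE_URL):
    """Build an export URL; `kind` is "events" or "sessions" (hypothetical helper)."""
    if kind not in ("events", "sessions"):
        raise ValueError("kind must be 'events' or 'sessions'")
    path = f"/v1/export/customer/{quote(customer, safe='')}"
    if business_unit is not None:
        path += f"/business_unit/{quote(business_unit, safe='')}"
    path += f"/{kind}"
    # date.isoformat() and datetime.isoformat() both produce ISO 8601 strings.
    query = urlencode({"start": start.isoformat(), "end": end.isoformat()})
    return f"{base_url}{path}?{query}"


url = build_export_url(
    "FantasticTV",
    "sessions",
    date(2022, 1, 1),
    date(2022, 2, 1),
    business_unit="FantasticContent",
)
print(url)
```

Leaving out `business_unit` produces the customer-wide variant of the same endpoint.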

After a request to one of the above endpoints, the API returns the response as a data stream (the limited export also has an option for a non-streaming response). The client fetching the data should therefore consume it as a stream in order to avoid potential memory problems. The data returned by the API consists of JSON objects separated by newlines.

Note that the analytics pipeline at Red Bee is constantly evolving to suit new needs, so the information in events and session summaries might change over time. As a result, an exported record might contain several empty fields. This is usually an indication that these fields are no longer ingested into the analytics database but are kept for legacy reasons. For the same reason, we do not guarantee any sort of backwards compatibility in case the data structure changes.
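A minimal sketch of consuming the newline-separated JSON stream one record at a time, using only the Python standard library. The function names are hypothetical, and the optional `headers` argument is an assumption: this document does not describe how requests are authenticated.

```python
import io
import json
import urllib.request


def iter_records(stream):
    """Yield one parsed JSON object per non-empty line of a binary stream."""
    for raw_line in stream:
        line = raw_line.strip()
        if line:
            yield json.loads(line)


def export_records(url, headers=None):
    """Stream records from an export URL without loading the whole body."""
    # `headers` is an assumption; authentication is not covered here.
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request) as response:
        # The response is file-like, so iterating reads it line by line.
        yield from iter_records(response)


# Offline demonstration: an in-memory stream stands in for the HTTP response.
sample = io.BytesIO(b'{"SessionId": "a"}\n\n{"SessionId": "b"}\n')
records = list(iter_records(sample))
print(records)
```

Because fields can be empty or disappear over time, reading them with `record.get("FieldName")` rather than `record["FieldName"]` keeps a consumer from failing on a missing key.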

Limited session export

Sometimes you might not be interested in a complete data dump and instead want to fetch individual sessions for a specific user or asset. In that case you can use the limited sessions export, which comes with some basic filtering functionality. When sending filters, a POST request should be used instead of a regular GET.

The export takes the start and end query parameters in order to select a time period. In addition, limit can be provided (default is 10, maximum is 100) to tell the API how many records to return. Records are returned from newest to oldest. To filter for a specific user or asset id, one or more filters can be supplied as a request body.

By putting this together the following request can be made:

/v1/export/customer/FantasticTV/business_unit/FantasticContent/sessions/limited?start=2022-01-01&end=2022-02-01&limit=10
{
    "filters": {
        "user.meta.account_id" : ["user1"]
    }
}

This would fetch the last 10 sessions played by the account with the id user1.
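The request above could be prepared in Python roughly as follows. This is a sketch under assumptions: the host is hypothetical, `build_limited_request` is a made-up helper, and the `Content-Type` header is a guess at what the API expects for a JSON body. The function only builds the request; sending it is left to the caller.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Hypothetical host: the real base URL is not stated in this document.
BASE_URL = "https://export.example.com"


def build_limited_request(customer, business_unit, start, end,
                          filters=None, limit=10, base_url=BASE_URL):
    """Prepare a limited sessions request: POST with filters, GET without."""
    # The maximum of 100 is documented; the lower bound of 1 is an assumption.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    query = urlencode({"start": start, "end": end, "limit": limit})
    url = (f"{base_url}/v1/export/customer/{customer}"
           f"/business_unit/{business_unit}/sessions/limited?{query}")
    if not filters:
        return urllib.request.Request(url, method="GET")
    body = json.dumps({"filters": filters}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed header
    )


request = build_limited_request(
    "FantasticTV", "FantasticContent", "2022-01-01", "2022-02-01",
    filters={"user.meta.account_id": ["user1"]},
    limit=10,
)
print(request.get_method(), request.full_url)
```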

The following filters are available when using the endpoint:

Name Description
content.asset.id String uniquely identifying the content in our system
user.meta.id String uniquely identifying the user in our system
user.meta.account_id String uniquely identifying the account in our system
viewing.meta.started Flag to show only sessions that started playback, as in received a frame of video content

The response returned from the limited export is a JSON object with all the sessions in a list:

{
    "data" : [
        records-go-here
    ]
}
The API can also be set to return the data as a streaming response. This is controlled by the query parameter as_json (default is True). If it is set to false, the response is sent as a stream of records instead.
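A client that supports both response modes might normalise them into a single list of records, as sketched below. The helper name is hypothetical, and the sketch assumes that the streamed form uses the same newline-separated JSON objects as the other export endpoints.

```python
import json


def parse_limited_response(body, as_json=True):
    """Return the session records from a limited export response body (bytes)."""
    text = body.decode("utf-8")
    if as_json:
        # Default mode: one JSON object with the records under "data".
        return json.loads(text)["data"]
    # Assumed streamed mode: one JSON object per line.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


wrapped = b'{"data": [{"SessionId": "a"}, {"SessionId": "b"}]}'
streamed = b'{"SessionId": "a"}\n{"SessionId": "b"}\n'
print(parse_limited_response(wrapped) == parse_limited_response(streamed, as_json=False))
```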

GDPR data export

To make it possible to see which GDPR-sensitive data is stored for a customer, the Data Export API provides two endpoints:

/v1/export/customer/{customer_name}/business_unit/{business_unit}/account/{account_id}/sessions for fetching all sessions for a specific account and showing relevant GDPR fields.
/v1/export/customer/{customer_name}/business_unit/{business_unit}/account/{account_id}/ip for fetching all IPs stored for a specific account.

Account session data

When fetching account session data, the response is returned as a streaming response and should be consumed as such by the client making the request.

The data returned by the API consists of JSON objects separated by newlines and contains the following fields:

Field Description
time The time (in UTC) when the session was created
account_id The account id of the account playing the session
country The country in which playback was registered
city The city in which playback was registered
device_model The model of the device doing playback
device_os The OS of the device doing playback
device_os_version The OS version of the device doing playback

Some of the fields might be null if the player sending the analytics events didn't send the relevant information.
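One way to handle the nullable fields is to map each line of the stream onto a small record type in which every field is optional. The class below is an illustrative sketch, not part of the API; the field names come from the table above, and `time` is kept as a string because its exact format is not specified here.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AccountSession:
    """One line of the account sessions export; any field may be None."""
    time: Optional[str] = None
    account_id: Optional[str] = None
    country: Optional[str] = None
    city: Optional[str] = None
    device_model: Optional[str] = None
    device_os: Optional[str] = None
    device_os_version: Optional[str] = None

    @classmethod
    def from_line(cls, line):
        raw = json.loads(line)
        # Missing keys and explicit nulls both end up as None.
        return cls(**{f.name: raw.get(f.name) for f in fields(cls)})


session = AccountSession.from_line(
    '{"time": "2022-01-01T12:00:00", "account_id": "user1", "city": null}'
)
print(session.account_id, session.city, session.device_os)
```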

Account IP data

In addition to the sessions described above, we also store IP information in our raw events. This can be fetched using the /ip endpoint.

This endpoint returns an array with all IPs that we have stored for a specific account in our analytics data. The data is returned in increments of 100 (so the data is non-streaming, as opposed to the session data). Each response from this endpoint looks as follows:

{
    "status": "DONE",
    "key": null,
    "ips": [
        "x.x.x.x",
        "y.y.y.y"
    ]
}

There are three different statuses that can be returned: DONE, MORE_TO_FETCH and ERROR. DONE means just that: there is no more data to return and the fetch is complete.

MORE_TO_FETCH means that there are still IPs left to return. In this case the response also contains a key string. To fetch the next batch of IPs, this key needs to be sent as a query parameter, as follows:

/v1/export/customer/FantasticTV/business_unit/FantasticContent/account/FantasticCustomer/ip?key=KeyReturnedFromPreviousRequest
{
    "status": "DONE",
    "key": null,
    "ips": [
        "z.z.z.z"
    ]
}

ERROR means that something went wrong with the aggregation of IPs.
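The three statuses suggest a simple paging loop: keep requesting with the returned key until DONE arrives, and stop on ERROR. The sketch below keeps the HTTP call abstract. `fetch_page` is a hypothetical callable that takes the key (None for the first request) and returns the decoded response as a dict. Raising an exception on ERROR is a design choice of this example, not documented behaviour.

```python
def fetch_all_ips(fetch_page):
    """Collect every IP by following `key` until the status is DONE."""
    ips = []
    key = None  # the first request is sent without a key
    while True:
        page = fetch_page(key)
        status = page.get("status")
        if status == "ERROR":
            raise RuntimeError("IP aggregation failed")
        ips.extend(page.get("ips") or [])
        if status == "DONE":
            return ips
        if status != "MORE_TO_FETCH" or not page.get("key"):
            raise RuntimeError(f"unexpected response: {page!r}")
        key = page["key"]


# Offline demonstration: canned responses stand in for the /ip endpoint.
pages = {
    None: {"status": "MORE_TO_FETCH", "key": "k1", "ips": ["x.x.x.x", "y.y.y.y"]},
    "k1": {"status": "DONE", "key": None, "ips": ["z.z.z.z"]},
}
all_ips = fetch_all_ips(pages.get)
print(all_ips)
```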

Data dumps

Using the API you can retrieve two types of data dumps: raw player events and session summaries. Both are described below.

Player Events

These are the raw events sent in from our player SDK. Each event describes some sort of behaviour in the player during a playback session. More information about these events can be found here. In addition, the events contain some data added by the backend (usually denoted in lower case).

Session Summaries

Whenever a player starts playing some media (or attempts to start playing), a number of events are created to describe the player's behaviour during that playback; these are the events described above. Those events are then aggregated into a session summary, and the session summaries are used as the basis for most of our analytics reports. A single session summary describes a single playback session by a single user.

The session summaries contain the following fields:

Field Type Description Legacy/Not used
AccountId str The account id of the account doing playback no
ActivateAtPlay bool Indicator if the product offering connected to the asset was activated when playback started no
AnalyticsBucket int Backend bucketing of analytics event no
AnalyticsPostInterval int The time the player should buffer analytics events before sending them as a batch to the backend no
AnalyticsTag str Backend tagging of analytics event no
Anonymous bool Indicator if the playback was anonymous or not no
AppType str Description of what type of app was used for playing the session no
AssetId str Id of the asset used when playing the session no
AssetRuntime int The length of the asset as reported by the player if available yes
AssetSubtype str The subtype of the asset that was played yes
AssetTitle str The human readable name of the asset yes
AutoPlay bool Indicator if the session was started due to autoplay triggering the session start no
AverageBitrate double The average bitrate across the session. This is a weighted average so the longer a session stays in a specific bitrate the heavier that bitrate will impact the average no
Browser str The manufacturer/developer of the browser used when playing the session no
BrowserFamily str The browser family of browser where playback was started no
BufferingEvents int Number of events which indicated that buffering was started for the session no
BufferingRatio double The ratio between time spent buffering and the time not spent buffering. For example, a 10m session that spent 5m buffering and 5m playing would have a ratio of 1.0 (100%); a value over 100% would indicate that the user spent more time buffering than watching. no
BufferingRatioOfTotalDuration double The ratio between time spent buffering and the total session duration. For example, a 10m session that spent 5m buffering and 5m playing would have a ratio of 0.5 (50%) no
BusinessUnit str The RedBee business unit to which the session belongs no
CDNVendor str The CDN vendor used when streaming the session no
ChannelId str Id of the channel that was played no
ChannelName str The human readable name of the channel yes
City str The city where playback was initiated no
ConnectionType str The connection type used when playing the session yes
ContainedErrors bool Indicator if the session contained any known errors no
Country str The country where playback was initiated no
Customer str The RedBee customer name to which the session belongs no
DeviceId str The unique id of the device where playback occurred no
DeviceModel str The model name of the device used for playing the session no
DeviceType str The device type used when playing the session yes
DisplayTitle str The human-readable title of the asset yes
DrmCertificateRequests int Number of DRM certificate requests sent by the player during the session (usually one, but there can be multiple) no
DrmCertificateResponses str Number of DRM certificate responses received by the player during the session (usually one, but there can be multiple) no
DrmLicenceRequests int Number of DRM licence requests sent by the player during the session (usually one, but there can be multiple) no
DrmLicenceResponses int Number of DRM licence responses received by the player during the session (usually one, but there can be multiple) no
DRMType str The type of DRM used on the asset (Widevine, FairPlay etc) no
EntitlementId str The id assigned the session by the entitlement call no
EntitlementType str The type of the entitlement request made by the player yes
ErrorCode str The error code reported by the player in case of error no
ErrorMessage str The error message encountered by the player in case of error no
EventCount int Number of raw player events that were processed for the session no
ExternalProductId str The external product id of the product that the played asset belongs to yes
FirstEventTime int The epoch time of the first event for the entire session no
Format str The playback format used during the session yes
Height int The screen height used on the device playing the session no
IncompleteBufferingEvents int Number of buffering events that were incomplete (i.e. missing either buffering started or buffering ended) no
Intent str The intent (PLAY/DOWNLOAD) of the player when requesting playback no
LastEvent str The last event of the session no
LastEventTime int Epoch timestamp of the last event sent by the player no
LastPlayedOffset int The last played offset time (the position of the player in the stream) no
MandatoryValidationFailed bool Indicator if the session failed the mandatory analytics pipe validation checks no
Manufacturer str The company manufacturing the device used to play the session no
MaxBitrate double Maximum bitrate value achieved during the session no
MediaLocator str URL of the media locator used for playback no
MinBitrate double The minimum bitrate played during the session no
Model str The asset model used for the played asset yes
MostUsedBitrate double The bitrate that was used the most (i.e. the longest time) during the session no
Name str The name of the layout engine used if a browser was used for playback no
OptionalValidationFailed bool Indicator if the session failed the optional analytics pipe validation checks no
OS str The OS used by the device that played the session no
OSVersion str The version number of the OS used on the device where the session was played no
PackageId str Id of Red Bee program package used yes
PercentagePlayed double Percentage of the asset that was played if the length of the asset was known to the player. no
Player str The name of the player SDK used for playing the session no
PlayerVersion str The version number of the player SDK used for playing the session no
PlayMode str The type of asset that is being played (vod, live etc) yes
PlayType str The type of asset that is being played (vod, live etc) no
ProductId str The id of the product to which the asset that was played belongs no
ProgramId str The id of the last program that was played during the session no
ProgramName str The human-readable title of the program yes
PublicationId str Id of the publication to which the asset belongs no
Resolution str The screen resolution used on the device playing the session no
RootedOrJailbroken bool Indicator if the device was rooted or jailbroken no
SessionCompleted bool Indicator if the session ended with either a playback completed/aborted event or an error no
SessionId str The unique id of the session no
SessionStarted bool Indicator if playback was started (i.e. if the session received the proper started event) no
SessionToken str The session token reported by the player no
StartupDurationMs double Number of ms from session initialization to playback starting no
SupportedDevice bool Indicator if the device is supported by the player framework no
TimeSpentBufferingMs str The number of milliseconds the session spent buffering in total no
TotalAdCompleted int Number of events that indicated that an ad had completed playing during the session no
TotalAdFailed int Number of events that indicated that an ad had failed playing during the session no
TotalAdStarted str Number of events that indicated that an ad had started playing during the session no
TotalUsageMegabyte double Experimental calculation of how many MBs were consumed during session playback no
Type str The device type reported by the player yes
UserAgent str The user agent string as reported by the device playing the session no
UserId str The unique id for the user doing playback no
VideoLength int The length of the asset as reported by the backend if available no
ViewingDurationMs int The total number of milliseconds that the player was playing content (i.e. not in a paused state) no
Width int The screen width used on the device playing the session no
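To make the two buffering ratios and the weighted average bitrate from the table concrete, the sketch below reproduces the worked example of a 10 minute session with 5 minutes of buffering. These functions are illustrative only. They assume that session duration is buffering time plus playing time, and that the bitrate weights are the times spent at each bitrate; the analytics pipeline may compute the values differently.

```python
def buffering_ratio(buffering_ms, playing_ms):
    """Buffering time relative to time not spent buffering (cf. BufferingRatio)."""
    return buffering_ms / playing_ms if playing_ms else None


def buffering_ratio_of_total(buffering_ms, playing_ms):
    """Buffering time relative to the total duration (cf. BufferingRatioOfTotalDuration)."""
    total_ms = buffering_ms + playing_ms  # assumed definition of total duration
    return buffering_ms / total_ms if total_ms else None


def weighted_average_bitrate(segments):
    """Time-weighted mean over (bitrate, duration_ms) pairs (cf. AverageBitrate)."""
    total_ms = sum(duration for _, duration in segments)
    if not total_ms:
        return None
    return sum(bitrate * duration for bitrate, duration in segments) / total_ms


five_minutes_ms = 5 * 60 * 1000
print(buffering_ratio(five_minutes_ms, five_minutes_ms))           # 1.0
print(buffering_ratio_of_total(five_minutes_ms, five_minutes_ms))  # 0.5
print(weighted_average_bitrate([(1000, 1), (3000, 3)]))            # 2500.0
```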