Data Export API¶
The Data Export API provides a way to fetch raw data in the form of player events and sessions.
Using the API¶
Exporting data¶
The main way to use the API for data export is to provide either a customer or a customer and business unit, together with a time period. The API then returns a streamed response that can be consumed in whatever way suits the client.
The API provides five endpoints for this. All are accessed with a GET request, except the limited sessions endpoint, which accepts both GET and POST:
/v1/export/customer/{customer_name}/events
for raw event data corresponding to an entire customer.
/v1/export/customer/{customer_name}/business_unit/{business_unit}/events
for raw event data corresponding to a specific business unit.
/v1/export/customer/{customer_name}/sessions
for session summaries corresponding to an entire customer.
/v1/export/customer/{customer_name}/business_unit/{business_unit}/sessions
for session summaries corresponding to a specific business unit.
/v1/export/customer/{customer_name}/business_unit/{business_unit}/sessions/limited
for limited session summaries, with some filtering functionality.
To select the time period, use the start and end query parameters. Both are required in order to make a request and should be provided in ISO 8601 format.
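As an illustration of how the endpoint paths and the start and end parameters fit together, here is a minimal sketch that builds an export URL. The host name and the `build_export_url` helper are assumptions made for this example and are not part of the API; authentication is not covered by this document and is left out.

```python
from datetime import date
from urllib.parse import quote, urlencode

# Hypothetical host: the real base URL is not given in this document.
BASE_URL = "https://export.example.com"


def build_export_url(customer, kind, start, end, business_unit=None):
    """Build an events or sessions export URL with ISO 8601 date bounds."""
    if kind not in ("events", "sessions"):
        raise ValueError("kind must be 'events' or 'sessions'")
    path = f"/v1/export/customer/{quote(customer, safe='')}"
    if business_unit is not None:
        path += f"/business_unit/{quote(business_unit, safe='')}"
    path += f"/{kind}"
    # date.isoformat() produces ISO 8601 dates such as 2022-01-01.
    query = urlencode({"start": start.isoformat(), "end": end.isoformat()})
    return f"{BASE_URL}{path}?{query}"


url = build_export_url(
    "FantasticTV",
    "sessions",
    date(2022, 1, 1),
    date(2022, 2, 1),
    business_unit="FantasticContent",
)
print(url)
```

Leaving out `business_unit` yields the customer-wide variant of the same endpoint.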
After a request to one of the above endpoints, the API returns the response as a data stream (the limited export also offers a non-streaming response). The client fetching the data therefore needs to consume it as a stream to avoid potential memory problems. The data returned by the API consists of JSON objects separated by newlines. Note that the analytics pipeline at Red Bee is constantly evolving to suit new needs, so the information in events and session summaries might change over time. As a result, an exported record might contain several empty fields; this usually indicates that the fields are no longer ingested into the analytics database but are kept for legacy reasons. For the same reason, we do not guarantee backwards compatibility if the data structure changes.
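One way to consume newline-separated JSON without holding the whole export in memory is to decode it incrementally, one line at a time. The sketch below assumes the HTTP client hands over the body as an iterable of byte chunks; the `iter_records` helper and the sample records are made up for this example.

```python
import json


def iter_records(chunks):
    """Yield one decoded JSON object per line from an iterable of byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A chunk boundary can fall in the middle of a record, so only
        # complete lines are decoded; the remainder stays in the buffer.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                yield json.loads(line)
    if buffer.strip():
        # Final record without a trailing newline.
        yield json.loads(buffer)


# Simulated stream: two records split across arbitrary chunk boundaries.
chunks = [
    b'{"SessionId": "a", "Cou',
    b'ntry": "SE"}\n{"SessionId"',
    b': "b", "Country": null}\n',
]
records = list(iter_records(chunks))
print(records)
```

Because fields can be empty or disappear over time, reading them with `record.get("FieldName")` rather than `record["FieldName"]` keeps a client tolerant of such changes.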
Limited session export¶
Sometimes you might not be interested in a complete data dump and instead want to fetch individual sessions for a specific user or asset. In that case you can use the limited sessions export, which comes with some basic filtering functionality. When sending filters, use a POST request instead of a regular GET.
The export takes the start and end query parameters to select a time period. In addition, limit can be provided (default is 10, maximum is 100), which tells the API how many records to return. Records are returned from newest to oldest. To filter for a specific user or asset id, one or more filters can be supplied in the request body.
By putting this together the following request can be made:
/v1/export/customer/FantasticTV/business_unit/FantasticContent/sessions/limited?start=2022-01-01&end=2022-02-01&limit=10
{
"filters": {
"user.meta.account_id" : ["user1"]
}
}
This would fetch the last 10 sessions played by the user with the account id user1.
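The request above could be assembled as in the following sketch, which only builds the request object and does not send it. The host name, the `build_limited_request` helper, the `Content-Type` header and the lower bound of 1 on limit are assumptions for this example; only the path, the query parameters, the maximum of 100 and the body layout come from the text above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical host: the real base URL is not given in this document.
BASE_URL = "https://export.example.com"


def build_limited_request(customer, business_unit, start, end, filters=None, limit=10):
    """Build a limited sessions export request: POST with filters, GET without."""
    if not 1 <= limit <= 100:  # the lower bound is an assumption
        raise ValueError("limit must be between 1 and 100")
    query = urlencode({"start": start, "end": end, "limit": limit})
    url = (
        f"{BASE_URL}/v1/export/customer/{customer}"
        f"/business_unit/{business_unit}/sessions/limited?{query}"
    )
    if not filters:
        return Request(url, method="GET")
    body = json.dumps({"filters": filters}).encode("utf-8")
    # Sending the body as application/json is an assumption.
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


request = build_limited_request(
    "FantasticTV",
    "FantasticContent",
    "2022-01-01",
    "2022-02-01",
    filters={"user.meta.account_id": ["user1"]},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The resulting object can be passed to `urllib.request.urlopen` once any required authentication has been added.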
The following filters are available when using the endpoint:
Name | Description |
---|---|
content.asset.id | String uniquely identifying the content in our system |
user.meta.id | String uniquely identifying the user in our system |
user.meta.account_id | String uniquely identifying the account in our system |
viewing.meta.started | Flag to show only sessions that started playback, i.e. received a frame of video content |
The response returned from the limited export is a JSON object with all the sessions in a list:
{
"data" : [
records-go-here
]
}
This behaviour is controlled by the as_json parameter (default is True). If it is set to false, the response from the API is sent as a stream of records instead.
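A client that supports both response shapes needs to know which one it asked for. The following sketch handles either case for a body that has already been read into a string; the `parse_limited_response` helper and the sample records are made up for this example.

```python
import json


def parse_limited_response(body, as_json=True):
    """Return the session records from a limited export response body."""
    if as_json:
        # Single JSON document with the records under "data".
        return json.loads(body)["data"]
    # Streamed form: one JSON object per line.
    return [json.loads(line) for line in body.splitlines() if line.strip()]


json_body = '{"data": [{"SessionId": "a"}, {"SessionId": "b"}]}'
stream_body = '{"SessionId": "a"}\n{"SessionId": "b"}\n'

from_json = parse_limited_response(json_body)
from_stream = parse_limited_response(stream_body, as_json=False)
print(from_json == from_stream)
```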
Account session data¶
In order to fetch session data related to a specific account the following endpoint can be used:
/v1/export/customer/{customer_name}/business_unit/{business_unit}/account/{account_id}/sessions
The endpoint will return all sessions for a specific account and return user information for each session.
When fetching account session data, the response is returned as a stream and should be consumed as such by the client making the request.
The data returned by the API consists of JSON objects separated by newlines and contains the following fields.
Field | Description |
---|---|
time | The time (in UTC) when the session was created |
account_id | The account id of the account playing the session |
country | The country in which playback was registered |
city | The city in which playback was registered |
device_model | The model of the device doing playback |
device_os | The OS of the device doing playback |
device_os_version | The OS version of the device doing playback |
Some of the fields might be null if the player sending the analytics events didn't send the relevant information.
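Since any of these fields can be null, it can help to model a record with optional attributes. The sketch below maps one line of the response onto a dataclass; the `AccountSession` class, the sample line and the timestamp format shown in it are assumptions for this example.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AccountSession:
    # Every field is optional because the player may not have sent it.
    time: Optional[str]
    account_id: Optional[str]
    country: Optional[str]
    city: Optional[str]
    device_model: Optional[str]
    device_os: Optional[str]
    device_os_version: Optional[str]

    @classmethod
    def from_line(cls, line):
        """Build a record from one JSON line, ignoring unknown keys."""
        raw = json.loads(line)
        return cls(**{f.name: raw.get(f.name) for f in fields(cls)})


# Made-up sample line; the timestamp format is an assumption.
line = (
    '{"time": "2022-01-15T18:30:00Z", "account_id": "user1", "country": "SE", '
    '"city": null, "device_model": null, "device_os": "Android", '
    '"device_os_version": "12"}'
)
session = AccountSession.from_line(line)
print(session)
```

Missing keys are treated the same as explicit nulls, which keeps the parser working if a field is dropped from the response later.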
Data dumps¶
Using the API you can retrieve two types of data dumps: raw player events and session summaries. Both are described below.
Player Events¶
These are the raw events sent in from our player SDK. Each event describes some sort of behaviour in the player during a playback session. More information about these events can be found here. In addition, the events contain some data added by the backend (usually denoted in lower case).
Session Summaries¶
Whenever a player starts playing some media (or attempts to start playing), a number of events are created to describe the player's behaviour during that playback. These are the events described above. The events are then aggregated into a session summary, and these session summaries are used as the basis for most of our analytics reports. A single session summary describes a single playback session by a single user.
The session summaries contain the following fields:
Field | Type | Description | Legacy/Not used |
---|---|---|---|
AccountId | str | The account id of the account doing playback | no |
ActivateAtPlay | bool | Indicator if the product offering connected to the asset was activated when playback started | no |
AnalyticsBucket | int | Backend bucketing of analytics event | no |
AnalyticsPostInterval | int | The time the player should buffer analytics events before sending them as a batch to the backend | no |
AnalyticsTag | str | Backend tagging of analytics event | no |
Anonymous | bool | Indicator if the playback was anonymous or not | no |
AppType | str | Description of what type of app was used for playing the session | no |
AssetId | str | Id of the asset used when playing the session | no |
AssetRuntime | int | The length of the asset as reported by the player if available | yes |
AssetSubtype | str | The subtype of the asset that was played | yes |
AssetTitle | str | The human readable name of the asset | yes |
AutoPlay | bool | Indicator if the session was started due to autoplay triggering the session start | no |
AverageBitrate | double | The average bitrate across the session. This is a weighted average so the longer a session stays in a specific bitrate the heavier that bitrate will impact the average | no |
Browser | str | The manufacturer/developer of the browser used when playing the session | no |
BrowserFamily | str | The browser family of browser where playback was started | no |
BufferingEvents | int | Number of events which indicated that buffering was started for the session | no |
BufferingRatio | double | The ratio between time spent buffering and time not spent buffering. For example, a 10m session that spent 5m buffering and 5m playing would have a ratio of 1.0 (100%). A value over 100% indicates that the user spent more time buffering than watching. | no |
BufferingRatioOfTotalDuration | double | The ratio between time spent buffering and the total session duration. For example, a 10m session that spent 5m buffering and 5m playing would have a ratio of 0.5 (50%). | no |
BusinessUnit | str | The RedBee business unit to which the session belongs | no |
CDNVendor | str | The CDN vendor used when streaming the session | no |
ChannelId | str | Id of the channel that was played | no |
ChannelName | str | The human readable name of the channel | yes |
City | str | The city where playback was initiated | no |
ConnectionType | str | The connection type used when playing the session | yes |
ContainedErrors | bool | Indicator if the session contained any known errors | no |
Country | str | The country where playback was initiated | no |
Customer | str | The RedBee customer name to which the session belongs | no |
DeviceId | str | The unique id of the device where playback occurred | no |
DeviceModel | str | The model name of the device used for playing the session | no |
DeviceType | str | The device type used when playing the session | yes |
DisplayTitle | str | The human-readable title of the asset | yes |
DrmCertificateRequests | int | Number of DRM certificate requests sent by the player during the session (usually one, but there can be multiple) | no |
DrmCertificateResponses | str | Number of DRM certificate responses received by the player during the session (usually one, but there can be multiple) | no |
DrmLicenceRequests | int | Number of DRM licence requests sent by the player during the session (usually one, but there can be multiple) | no |
DrmLicenceResponses | int | Number of DRM licence responses received by the player during the session (usually one, but there can be multiple) | no |
DRMType | str | The type of DRM used on the asset (Widevine, FairPlay etc) | no |
EntitlementId | str | The id assigned the session by the entitlement call | no |
EntitlementType | str | The type of the entitlement request made by the player | yes |
ErrorCode | str | The error code reported by the player in case of error | no |
ErrorMessage | str | The error message encountered by the player in case of error | no |
EventCount | int | Number of raw player events that were processed for the session | no |
ExternalProductId | str | The external product id of the product that the played asset belongs to | yes |
FirstEventTime | int | The epoch time of the first event for the entire session | no |
Format | str | The playback format used during the session | yes |
Height | int | The screen height used on the device playing the session | no |
IncompleteBufferingEvents | int | Number of buffering events that were incomplete (i.e. either missing buffering started or buffering ended) | no |
Intent | str | The intent (PLAY/DOWNLOAD) of the player when requesting playback | no |
LastEvent | str | The last event of the session | no |
LastEventTime | int | Epoch timestamp of the last event sent by the player | no |
LastPlayedOffset | int | The last played offset time (the position of the player in the stream) | no |
MandatoryValidationFailed | bool | Indicator if the session failed the mandatory analytics pipe validation checks | no |
Manufacturer | str | The company manufacturing the device used to play the session | no |
MaxBitrate | double | Maximum bitrate value achieved during the session | no |
MediaLocator | str | URL of the media locator used for playback | no |
MinBitrate | double | The minimum bitrate played during the session | no |
Model | str | The asset model used for the played asset | yes |
MostUsedBitrate | double | The bitrate that was used the most (i.e. the longest time) during the session | no |
Name | str | The name of the layout engine used if a browser was used for playback | no |
OptionalValidationFailed | bool | Indicator if the session failed the optional analytics pipe validation checks | no |
OS | str | The OS used by the device that played the session | no |
OSVersion | str | The version number of the OS used on the device where the session was played | no |
PackageId | str | Id of Red Bee program package used | yes |
PercentagePlayed | double | Percentage of the asset that was played if the length of the asset was known to the player. | no |
Player | str | The name of the player SDK used for playing the session | no |
PlayerVersion | str | The version number of the player SDK used for playing the session | no |
PlayMode | str | The type of asset that is being played (vod, live etc) | yes |
PlayType | str | The type of asset that is being played (vod, live etc) | no |
ProductId | str | The id of the product to which the asset that was played belongs | no |
ProgramId | str | The id of the last program that was played during the session | no |
ProgramName | str | The human-readable title of the program | yes |
PublicationId | str | Id of the publication to which the asset belongs | no |
Resolution | str | The screen resolution used on the device playing the session | no |
RootedOrJailbroken | bool | Indicator if the device was rooted or jailbroken | no |
SessionCompleted | bool | Indicator if the session ended with either a playback completed/aborted event or an error | no |
SessionId | str | The unique id of the session | no |
SessionStarted | bool | Indicator if playback was started (i.e. if the session received the proper started event) | no |
SessionToken | str | The session token reported by the player | no |
StartupDurationMs | double | Number of ms from session initialization to playback starting | no |
SupportedDevice | bool | Indicator if the device is supported by the player framework | no |
TimeSpentBufferingMs | str | The number of milliseconds the session spent buffering in total | no |
TotalAdCompleted | int | Number of events that indicated that an ad had completed playing during the session | no |
TotalAdFailed | int | Number of events that indicated that an ad had failed playing during the session | no |
TotalAdStarted | str | Number of events that indicated that an ad had started playing during the session | no |
TotalUsageMegabyte | double | Experimental calculation of how many MBs were consumed during session playback | no |
Type | str | The device type reported by the player | yes |
UserAgent | str | The user agent string as reported by the device playing the session | no |
UserId | str | The unique id for the user doing playback | no |
VideoLength | int | The length of the asset as reported by the backend if available | no |
ViewingDurationMs | int | The total number of milliseconds that the player was playing content (i.e. not in a paused state) | no |
Width | int | The screen width used on the device playing the session | no |
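To make the descriptions of BufferingRatio, BufferingRatioOfTotalDuration and AverageBitrate concrete, the following sketch reproduces the arithmetic from the table. It follows the wording of the field descriptions only; the backend's actual calculation is not shown in this document, and the helper names, the bitrate values and the assumption that total duration equals buffering time plus playing time are made up for this example.

```python
def buffering_ratio(buffering_ms, playing_ms):
    """Time spent buffering relative to time not spent buffering."""
    return buffering_ms / playing_ms if playing_ms else None


def buffering_ratio_of_total(buffering_ms, playing_ms):
    """Time spent buffering relative to the whole session."""
    total_ms = buffering_ms + playing_ms
    return buffering_ms / total_ms if total_ms else None


def weighted_average_bitrate(segments):
    """Time-weighted mean of (bitrate, duration_ms) pairs."""
    total_ms = sum(duration for _, duration in segments)
    if not total_ms:
        return None
    return sum(bitrate * duration for bitrate, duration in segments) / total_ms


FIVE_MINUTES_MS = 5 * 60 * 1000

# The 10 minute session from the table: 5 minutes buffering, 5 minutes playing.
ratio = buffering_ratio(FIVE_MINUTES_MS, FIVE_MINUTES_MS)
ratio_of_total = buffering_ratio_of_total(FIVE_MINUTES_MS, FIVE_MINUTES_MS)

# One minute at bitrate 1000 followed by three minutes at bitrate 3000.
average = weighted_average_bitrate([(1000, 60_000), (3000, 180_000)])

print(ratio, ratio_of_total, average)  # 1.0 0.5 2500.0
```

The weighted average lands closer to 3000 than to 1000 because the session spent three times as long at the higher bitrate.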