VidiBlog

IBC 2024 'Migrate to Monetize' Demo

Written by Alex Jennings | Sep 29, 2024 10:00:00 PM
Vidispine were invited by AWS to join their partner demo showcase for “Media Archives: Migrate to Monetize”. This collaboration was designed so that information could move seamlessly between various applications all running within an environment provisioned specifically for the IBC show.
 
The overall goal was to provide a platform on which customers could discover and monetize content that they receive in an unstructured format through bulk acquisition. The various services are interconnected via the publishing and routing of messages through Amazon EventBridge. This post focuses specifically on the parts of the overall platform that connect to Vidispine's application; the AWS blog post gives a broader overview.

Titles are initially created by users in MetaVu and linked to various industry identifiers such as IMDb and EIDR. These titles may be a film, a TV episode, or a specific edit version with regional differences in language tracks or compliance cuts. MetaVu publishes a message with the IDs, stating that a new title has been added to the platform. The message conforms to a uniform schema for all platform events, and is routed via EventBridge to an SNS topic to which all applications have access.
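As a sketch, the uniform event envelope might look something like the following. The field names and values are illustrative assumptions, not the platform's actual schema; the EIDR and IMDb identifiers are placeholders:

```json
{
  "source": "metavu",
  "detail-type": "title.created",
  "detail": {
    "titleId": "mv-12345",
    "eidrId": "10.5240/0000-0000-0000-0000-0000-X",
    "imdbId": "tt0000000",
    "titleType": "episode"
  }
}
```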

Vidispine's project team created an SNS subscription to add all new title events to an SQS queue for processing. The SQS message is then routed via an EventBridge Pipe, configured with an input template that extracts the message body, to trigger a Lambda function.
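A Pipes input transformer along these lines would pass only the SQS message body through to the Lambda target; the key name is an assumption for this sketch:

```
{
  "message": <$.body>
}
```

With an SQS source, `$.body` refers to the body of the queued message, so the function receives the platform event without the SQS envelope.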

The Lambda function reads client credentials from a Secrets Manager secret and obtains a JWT from Vidispine's AuthService, which provides authentication and authorization across our applications. The body of the message is simply passed along as a JSON variable when creating a new VidiFlow workflow execution. The Lambda is written in JavaScript and uses the Vidispine Development Toolkit (VDT) libraries, which keep the code to a few lines without any business logic.
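The payload-building step can be sketched as a pure function. The workflow name and variable key below are hypothetical, and the VDT calls that authenticate and submit the execution are omitted, since their API is not shown in the post:

```javascript
// Build a VidiFlow workflow-start payload from the incoming message body.
// "NewTitleWorkflow" and the "message" variable name are illustrative
// assumptions; the real integration submits this via the VDT libraries.
function buildWorkflowStart(messageBody) {
  const event = JSON.parse(messageBody); // validate it is well-formed JSON
  return {
    workflowDefinition: "NewTitleWorkflow",
    variables: {
      // pass the whole event along unmodified, as described in the post
      message: JSON.stringify(event),
    },
  };
}
```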

The VidiFlow application is deployed alongside the other Vidispine software in an AWS EKS cluster, with workflow configuration and execution history stored in Amazon RDS for PostgreSQL, and an Amazon MQ RabbitMQ broker as an internal queue from which service agents retrieve task data.

The workflow triggered by the new title runs a simple scripted task to extract the IDs from the body of the message and store them as variables for other tasks. It was important to place this logic as part of the workflow, rather than upstream, to keep the initial message handling generic.
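The scripted task might look like the following. The field names mirror the hypothetical envelope assumed earlier, so they are assumptions rather than the actual schema:

```javascript
// Illustrative scripted task: pull the identifiers out of the raw message
// body and expose them as flat workflow variables for downstream tasks.
// Field names are assumptions based on the post, not the real schema.
function extractIds(messageJson) {
  const detail = JSON.parse(messageJson).detail || {};
  return {
    eidrId: detail.eidrId ?? null,
    imdbId: detail.imdbId ?? null,
    titleId: detail.titleId ?? null,
  };
}
```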

The EIDR ID is passed as task data to a new service agent developed for this project. The task information is placed onto a RabbitMQ queue, which is used as an event source to trigger a Lambda function. The function uses the EIDR API to look up the title information and the entity hierarchy. This information is then mapped to VidiCore metadata values, which are used to create item placeholders. These placeholders are assigned the various IDs from the upstream systems as external identifiers, ensuring a unique item for each record. The hierarchy of the parent film or episodic TV series is created as nested collections, or reuses existing collections from previous iterations. The parent collections are assigned metadata values that are inherited by the child placeholders, such as the episode title or a film's director. When the function completes, a message is published to the queue with the placeholders' VidiCore IDs, which VidiFlow receives and records in the workflow's execution history. Before the workflow ends, a service agent publishes a message to the platform's SNS topic containing the VidiCore IDs for other applications to consume.
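The mapping from an EIDR record to VidiCore-style metadata and a collection hierarchy can be sketched as follows. The EIDR field names and metadata keys are illustrative assumptions, simplified to a single parent level:

```javascript
// Sketch of mapping an EIDR lookup result onto VidiCore-style metadata and
// a collection path. Field names and keys are assumptions for illustration;
// the real mapping and collection handling are done against the EIDR and
// VidiCore APIs.
function mapEidrToMetadata(record) {
  const fields = {
    title: record.resourceName,
    eidrId: record.id,
    referentType: record.referentType, // e.g. "Movie" or "TV"
  };
  // A parent film or series becomes an enclosing collection; its metadata
  // is inherited by the child placeholders.
  const collectionPath = record.parent
    ? [record.parent.resourceName, record.resourceName]
    : [record.resourceName];
  return { fields, collectionPath };
}
```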

A second event is created when Cloudfirst.io migrates the content; it contains the path to the media file stored in the S3 bucket. Again, this triggers a VidiFlow workflow execution, which reads the S3 object and triggers the ingest into the placeholder item. This workflow also contains a decision-tree that uses the media-info results to determine whether transcoding should occur and which profile should be used. Media can be sent for transcoding using VidiCoder or AWS Elemental MediaConvert. The original file can then be moved by a lifecycle rule into S3 Glacier Deep Archive, while the smaller proxy transcodes remain in lower-cost S3 storage for playback and review.
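A decision-tree of this kind could be sketched as a small function over the media-info results. The thresholds and profile names here are assumptions for illustration, not the demo's actual configuration:

```javascript
// Illustrative transcode decision for the ingest workflow: decide whether
// to transcode based on media-info results, and pick a profile. The codec
// check, width threshold, and profile names are assumptions.
function chooseTranscodeProfile(mediaInfo) {
  const { codec, width } = mediaInfo;
  if (codec === "h264" && width <= 1920) {
    return { transcode: false, profile: null }; // already directly usable
  }
  if (width > 1920) {
    return { transcode: true, profile: "uhd-mezzanine" };
  }
  return { transcode: true, profile: "hd-proxy" };
}
```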

Finally, the metadata-enrichment event is received from Media2Cloud. This contains timed metadata for speech-to-text, object and face detection, segmentation, and chapter suggestions. The timespans are converted from SMPTE timecodes to VidiCore's time representation using the VDT library, and applied as metadata to the item. Additionally, the TwelveLabs index ID is added to the VidiCore item to allow a combined search across both applications.
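For non-drop-frame timecode, the conversion amounts to counting frames and pairing the count with a time base. This is only an illustrative equivalent of what the VDT library does (drop-frame handling is not shown), and the `value@timeBase` output format is a simplified assumption:

```javascript
// Convert a non-drop-frame SMPTE timecode (HH:MM:SS:FF) into a simple
// "frames@fps" string. Illustrative only: the project uses the VDT library
// for this conversion, including drop-frame cases not handled here.
function smpteToTimeBase(timecode, fps) {
  const [hh, mm, ss, ff] = timecode.split(":").map(Number);
  const frames = ((hh * 60 + mm) * 60 + ss) * fps + ff;
  return `${frames}@${fps}`;
}
```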

Users are able to navigate from the upstream systems using the common IDs to see the asset in MediaPortal. They can see the title metadata and hierarchy from EIDR, and the timed metadata from Media2Cloud. Additional timed events can be discovered using a natural-language query, which is sent to the TwelveLabs index and augmented with the VidiCore asset results. These timed events from various assets can then be cut together as a VidiEditor sequence, using the proxies initially and restoring the full resolution from the archive via a VidiFlow workflow if needed.
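The augmentation step is essentially a join between the TwelveLabs hits and the VidiCore items carrying the stored index ID. The result shapes below are assumptions for the sketch, not either product's actual API response:

```javascript
// Sketch of combining TwelveLabs search hits with VidiCore items by joining
// on the index ID stored on each item. All object shapes are assumptions.
function augmentSearchResults(twelveLabsHits, vidiCoreItems) {
  const byIndexId = new Map(
    vidiCoreItems.map((item) => [item.twelveLabsIndexId, item])
  );
  return twelveLabsHits
    .map((hit) => {
      const item = byIndexId.get(hit.indexId);
      // drop hits with no matching VidiCore item
      return item
        ? { itemId: item.itemId, start: hit.start, end: hit.end, score: hit.score }
        : null;
    })
    .filter(Boolean);
}
```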

Everything developed by Vidispine's project team specifically for the integration was written as infrastructure-as-code (IaC) using the AWS Serverless Application Model. This allowed the Lambda codebase and CloudFormation resources to be deployed via the CLI for rapid iteration.
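A minimal SAM template for this pattern might look like the following. It is a sketch with illustrative resource names, simplified to a direct SQS event source rather than the EventBridge Pipe used in the demo:

```yaml
# Minimal SAM sketch of an SQS-triggered Lambda deployed as IaC.
# Names and properties are illustrative, not the project's actual template.
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Resources:
  NewTitleQueue:
    Type: AWS::SQS::Queue
  NewTitleFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs18.x
      CodeUri: src/
      Events:
        Queue:
          Type: SQS
          Properties:
            Queue: !GetAtt NewTitleQueue.Arn
```

A template like this deploys with `sam build` and `sam deploy`, which is what enabled the rapid iteration described above.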

The combination of serverless services where possible (scaling to peak workloads while incurring no cost when idle), EventBridge to route events, and Lambda functions to parse messages allowed the project team to loosely couple the applications together without relying on vendors to develop specific integrations.