Machine Learning at Tubi: Powering Free Movies, TV and News for All

Published in PyTorch · Feb 25, 2021

Authors: John Trenkle, Jaya Kawale and the Tubi ML Team

This abstract work evokes complex networks, genomics and reels of film simultaneously (source: “Snacce” by jez.atkinson is licensed under CC BY 2.0)

In this blog series, our aim is to highlight the nuances of Machine Learning as practiced in Tubi’s Ad-based Video on Demand (AVOD) business. Machine Learning helps us solve myriad problems involving recommendations, content understanding and ads. We use PyTorch extensively for many of these use cases because it gives us the flexibility, computational speed and ease of implementation to train large-scale deep neural networks on GPUs.

Who is Tubi and what do we do?

With 33 million monthly active users and over 2.5 billion hours of content watched last year, Tubi is one of the leading platforms delivering free movies, TV series and live news to a world eager to consume high-quality shows. We have curated the largest catalog of premium content in the streaming industry, including popular titles, great horror and nostalgic favorites. To maintain and grow our enthusiastic audience and expanding catalog, we combine information from our platform with a selection of trusted publicly available sources to understand not only what our current audience wants to watch now, but also what our expanding audience wants to watch next. Viewers can watch Tubi on dozens of devices, sign in, and have a seamless viewing experience with relevant ads served at about half the ad load of cable.

Tubi can be watched on nearly any device you use. Someday, it may even be on your smart fridge!

Tubi is all-in for Machine Learning

To succeed, Tubi embraces a data-driven approach, but more importantly, we are on a constant mission to explore the inflating universe of Machine Learning, Deep Learning, Natural Language Processing (NLP) and Computer Vision (CV) (see this post for a discussion of our overarching philosophy). Our research, development and deployment happen on a flexible platform that relies heavily on Databricks as the primary computational component (in conjunction with other open-source resources) and on PyTorch and other advanced frameworks to tackle our most challenging problems.

What is Video on Demand (VOD)?

Most people now use one or more streaming services, and when you hear that phrase, the companies that likely come to mind follow the subscription-based Video on Demand (SVOD) business model. This means they make money by charging users a monthly fee to watch any of the content available on their platform for that month. Advertising-based Video on Demand (AVOD) looks just like those streaming services, with the major difference being that it’s free, just as television has been free for 80 years, because viewers see a small number of commercials amidst the quality shows they are watching. This is how an AVOD company generates revenue. We call this out because it makes a very big difference in the problems we have to tackle and the ways we leverage Machine Learning to help us.

The Three Pillars of AVOD steer ML applications

In the AVOD world, there are three groups — or pillars — that support the paradigm:

  1. Content: all the titles we maintain in our library
  2. Audience: everyone who watches titles on Tubi
  3. Advertising: ads shown to viewers on behalf of brands

To be successful, Tubi needs to maximize each group’s level of satisfaction, but they’re tightly interrelated so it’s a delicate balancing act. This figure illustrates the pillars and captures the interactions between them.

The Three Pillars of AVOD in which there are specific relationships between the entities that we attempt to leverage in a virtuous cycle to optimize the degree that each is satisfied

The Three Pillars model of AVOD highlights the relationships that we follow to maintain a virtuous cycle — that is, a chain of relationships and events that is reinforced through a feedback loop. We’ll jump into the cycle at Content. Acquiring strong titles helps keep our Audience watching shows they love. This entails leveraging rich representations of our existing catalog and finding similar titles in this space, which we’ll discuss in more detail later. Furthermore, having a Content Pyramid in which popular titles are supported by similar films that viewers can watch next is critical. When those great shows are streaming, we can inject relevant Ads at a pace that doesn’t disrupt viewers or drive them away (in the best case, they find the commercials useful or barely notice them). Those Ads do three things:

  1. Expose brands to audiences they desire and garner ROI
  2. Generate revenue for the Content Partners
  3. Earn Tubi the money it needs to grow and improve

In the feedback loop then, better Content can grow the Audience, and a bigger Audience means more eyeballs for Brands. More Brands will be attracted to Tubi. More Brands beget more Ads to drive profits for the Content Partners and the AVOD. More revenue, more budget to pursue better Content. Rinse. Repeat.

The pillars in Tubi’s virtuous cycle also correspond to three key areas for Machine Learning: Recommendation, AdTech and Content Understanding. In the next section, we’ll look at AVOD through the lens of ML.

How does ML fit into the AVOD ecosystem?

Every streaming service has the tendrils of recommendation systems penetrating every aspect of its business: what one should watch next, what genres a viewer might like, weekly emails with the latest and greatest relevant titles, and much more. It is pervasive. We will address it as well; however, given the current level of saturation on the topic, we’ll flip the discussion and address the ML pillars in this order:

  1. Content Understanding
  2. Advertising Technology
  3. Recommendation Systems

In today’s post, we’ll touch on each and summarize while subsequent posts will deal with each topic in more detail.

Content Understanding

Some of the goals of Content Understanding at Tubi are to develop lists of the most promising titles to pursue, aid in projecting price points for films and series, facilitate the smooth addition of newly released titles, and more. The use cases for our Content Understanding system, named Project Spock, will be addressed explicitly in future posts. ML for Content in the VOD arena is fueled by the already-existing body of rich metadata for media, but it also mines rich textual content using many recent developments in NLP and embedding technologies: from the now-venerable word2vec and doc2vec, through fastText and GloVe, on to modern transformer-based techniques such as ELMo and BERT and, soon, Big Bird.
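To make the embedding idea concrete, here is a toy sketch (not Tubi’s actual pipeline) of the oldest trick in that lineage: building a title-level vector by mean-pooling word vectors, word2vec/fastText style. The vocabulary and vectors below are random stand-ins.

```python
import numpy as np

# Toy illustration: embed a title description by averaging word vectors.
# The vocabulary and its vectors are random placeholders.
rng = np.random.default_rng(0)
DIM = 8
vocab = {w: rng.normal(size=DIM) for w in
         ["haunted", "house", "family", "road", "trip", "comedy"]}

def embed_title(description: str) -> np.ndarray:
    """Average the vectors of known words; zero vector if none match."""
    vecs = [vocab[w] for w in description.lower().split() if w in vocab]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

horror = embed_title("Haunted House")
comedy = embed_title("Family road trip comedy")
# Titles that share vocabulary land nearby in this space.
print(cosine(horror, embed_title("haunted family house")))
```

Transformer models such as BERT replace the static word vectors with contextual ones, but the downstream use (nearest-neighbor search and similarity in a shared vector space) is the same.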

Given a rich collection of 1st- and 3rd-party data, we produce embeddings that capture every aspect of a title and leverage those for modeling. We rely on PyTorch to create models that cover numerous use cases, such as cold starting new titles, predicting the value of non-Tubi titles and many others shown in the figure. In cold starting, for example, we use PyTorch to build a fully-connected network that maps from a high-dimensional embedding space capturing relationships from metadata and text narratives into the collaborative-filtering space of Tubi’s recommendation system. Simply stated, this allows us to estimate which viewers may be interested in a new title that has never played on Tubi before.
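A minimal sketch of that mapping idea follows. The dimensions, layer sizes and training data here are illustrative stand-ins, not Tubi’s production model: a small fully-connected network is fit on titles that already have both a metadata embedding and a collaborative-filtering embedding, then applied to a brand-new title that has only metadata.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumed, not Tubi's actual sizes).
META_DIM, CF_DIM = 128, 32

cold_start_net = nn.Sequential(
    nn.Linear(META_DIM, 64),
    nn.ReLU(),
    nn.Dropout(0.2),  # one of several regularization tricks for small data
    nn.Linear(64, CF_DIM),
)

# Training pairs come from titles that exist in both spaces;
# random tensors stand in for real embeddings here.
meta = torch.randn(256, META_DIM)
cf_target = torch.randn(256, CF_DIM)
opt = torch.optim.Adam(cold_start_net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for _ in range(5):  # a few steps, purely for illustration
    opt.zero_grad()
    loss = loss_fn(cold_start_net(meta), cf_target)
    loss.backward()
    opt.step()

# A never-before-seen title with only metadata can now be placed in CF space.
new_title_cf = cold_start_net(torch.randn(1, META_DIM))
print(new_title_cf.shape)  # torch.Size([1, 32])
```

In practice the regression target would be the CF vectors learned from watch behavior, and the DataLoader-based mini-batching and regularization mentioned below do the heavy lifting on such small datasets.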

We call this process “embending” from the universe to the tubiverse: a mash-up of embedding and bending, mapping from an embedding space with one perspective into another with a different one. PyTorch has been extremely valuable in helping us attack this challenging small-data problem with its flexible DataLoader utilities for constructing mini-batches that deploy several regularization tricks. These embended representations have been a game-changer for ramping up new inventory as it is added to our catalog.

Making Embending models using PyTorch to facilitate cold starting

It should be noted that whereas recommendation approaches focus on content playing in the platform — the tubiverse, ML Content tasks focus on all data in the universe. In the long term, we are progressing towards integrating all of our diverse sources of data and embeddings into Graph-Based modeling and Knowledge Graphs as a tangible way to relate all objects in our ecosystem into a single cohesive space. The ability to directly and numerically compare any two objects in our space with confidence leads to better recommendations, more relevant ads to users, a better understanding of our audience and a better overall experience.

The organization of Tubi’s Project Spock system which drives all content-oriented use cases for Tubi

AdTech

Advertising technology exists only in AVOD and covers all aspects of the service that have to do with the experience of how ads are presented to the viewers on the platform and the monetization of those ads. The core goal of ML in the ad space is to give the users a pleasant ad experience.

There are three key focus areas for AdTech:

  1. Targeting: leverage user behavior and demographic information to target specific audiences with relevant brand ads
  2. Ad Presentation: decide which ads are seen by a user, when and how many appear in a break, and where the insertion of ad pods would be least disruptive
  3. Revenue Optimization: dynamically modify price points for advertisers to collect the best value for each opportunity, among other techniques

ML also helps us reduce repetitive brand ads and helps our advertisers to effectively connect with our users. A key example of innovation in this space is our Advanced Frequency Management (AFM) solution, which relies heavily on PyTorch to develop and deploy models for logo detection and classification under the hood. AFM uses computer vision-based technology to cap the exposure of brand ads at a campaign level, regardless of the source of supply. We use a novel approach that scans every piece of creative content that comes from various demand sources and outputs a confidence score on the detected brand. We use this information on the brand and campaign to decide on the delivery.

As a result of this, our users do not receive more ad impressions of a campaign than intended. There are many other challenging subproblems in the ad space that we tackle on an ongoing basis.
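The capping logic itself can be sketched in a few lines. This is a hedged toy version: the real AFM system runs PyTorch computer-vision models for logo detection, whereas here a stub detector returns a (brand, confidence) pair, and the threshold, cap and creative IDs are all made-up illustrative values.

```python
from collections import defaultdict

CONFIDENCE_THRESHOLD = 0.8  # assumed value, for illustration only
CAMPAIGN_CAP = 3            # max impressions of one brand per user session

def detect_brand(creative_id: str):
    """Stand-in for a logo-detection/classification model."""
    fake_scores = {"ad_1": ("acme_cola", 0.95),
                   "ad_2": ("acme_cola", 0.91),
                   "ad_3": ("roadrunner_shoes", 0.55)}
    return fake_scores.get(creative_id, ("unknown", 0.0))

def should_serve(creative_id: str, impressions: dict) -> bool:
    brand, conf = detect_brand(creative_id)
    if conf < CONFIDENCE_THRESHOLD:
        return True  # brand not confidently attributed; cannot cap
    if impressions[brand] >= CAMPAIGN_CAP:
        return False  # user already saw this brand's campaign enough
    impressions[brand] += 1
    return True

seen = defaultdict(int)
decisions = [should_serve("ad_1", seen) for _ in range(4)]
print(decisions)  # first 3 served, 4th capped -> [True, True, True, False]
```

Because the brand is detected from the creative itself, the cap holds regardless of which demand source supplied the ad, which is the key property of AFM described above.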

Highlighting various use cases of Tubi’s AdTech

Recommendation

The primary goal of a recommendation system is to help viewers quickly find content they would like to watch. Recommender systems are ubiquitous on the Tubi homepage. They help surface the most relevant titles for viewers, find the most relevant rows or containers, power search, choose the most relevant image for a title, send push notifications and messages about relevant content, cold start new titles and users, and more. There are several challenges for recommendation at Tubi, mostly arising from the large scale of our user base, the short lifespan of content (especially news) and the ever-growing content library.

Typically, recommender systems rely upon collaborative filtering, which establishes relationships between titles and viewers and uses the “wisdom of the crowd” to surface relevant content. There are dozens of ways to implement collaborative filtering, including matrix factorization, context-based models, deep neural nets and more.
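As a concrete illustration of the matrix-factorization flavor, here is a toy sketch: a tiny user-by-title rating matrix is factored into user and title vectors via stochastic gradient descent, and the learned factors predict a score for a pairing that was never observed. All numbers are invented for the example.

```python
import numpy as np

# Toy interaction matrix: rows = users, columns = titles, 0 = unobserved.
rng = np.random.default_rng(42)
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
n_users, n_titles, k = R.shape[0], R.shape[1], 2
U = rng.normal(scale=0.1, size=(n_users, k))   # user factors
V = rng.normal(scale=0.1, size=(n_titles, k))  # title factors
lr, reg = 0.01, 0.02

for _ in range(2000):
    for u in range(n_users):
        for i in range(n_titles):
            if R[u, i] == 0:
                continue  # skip unobserved interactions
            err = R[u, i] - U[u] @ V[i]
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])

# Predicted score for user 0 on the title they never watched (column 2).
print(round(float(U[0] @ V[2]), 2))
```

Context-based and deep-learning variants replace the dot product with richer functions of user and title features, but the core idea of filling in unobserved entries from the crowd’s behavior is the same.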

Our systems are built on top of robust frameworks like Spark, MLeap and MLflow using Databricks, which enables us to experiment with the latest trends in ML including online feature stores, real-time inference, contextual bandits, deep learning and AutoML. Our end-to-end experimentation platform helps the team quickly translate the latest ideas into production. Our current research uses PyTorch to experiment with Neural Collaborative Filtering approaches that harness the power of Deep Learning and the rich data coming from our Content Understanding system to enable even more insightful recommendations for our viewers.
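A minimal sketch of a neural collaborative-filtering model, in the spirit of the NCF family, follows. The sizes, layer widths and tensors are illustrative assumptions, not a description of Tubi’s production model: learned user and title embeddings are concatenated and passed through an MLP that scores the pairing.

```python
import torch
import torch.nn as nn

class NeuralCF(nn.Module):
    """Toy NCF-style model: embed user and title, score the pair with an MLP."""
    def __init__(self, n_users: int, n_titles: int, dim: int = 16):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.title_emb = nn.Embedding(n_titles, dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, 32), nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, users: torch.Tensor, titles: torch.Tensor) -> torch.Tensor:
        x = torch.cat([self.user_emb(users), self.title_emb(titles)], dim=-1)
        return self.mlp(x).squeeze(-1)  # raw score; sigmoid gives a probability

model = NeuralCF(n_users=1000, n_titles=500)
users = torch.tensor([0, 1, 2])
titles = torch.tensor([10, 20, 30])
scores = model(users, titles)
print(scores.shape)  # torch.Size([3])
```

Content-side embeddings (from the Content Understanding work above) could replace or augment the learned title embedding, which is what makes this family attractive for cold-start-heavy catalogs.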

Recommendation is found everywhere on all Tubi platforms

To reiterate, there will be rich and detailed discussions of each of the Three Pillars in future posts!

What does Tubi’s tech stack look like?

At Tubi, we leverage the powerful, well-engineered packages coming from the tech giants. It wasn’t so long ago that all work in ML started with rolling your own variant of the algorithm that you thought was applicable, hoping you got it right when you translated from the paper to code and then trying to solve the actual problem at hand. Luckily, we are now in an age in which one can build on top of tons of highly-interoperable algorithms on data in a standard representation and solve interesting problems with much faster prototyping, the ability to rapidly compare many algorithms, optimize hyper-parameters and get first cuts into production. Ah, progress.

Another aspect of software engineering that has been highly enabling is the advent of the large cloud-based platforms such as Amazon Web Services (AWS) and Microsoft Azure on which one can easily plug into dozens of well-supported and integrated services and deploy solutions at scale. Furthermore, services such as Databricks that further integrate the power of Spark and Cloud architectures with the Notebook IDE paradigm have led to a quantum leap in the ability for small companies to be competitive.

At Tubi, we liberally use all of these resources to solve the many ML challenges we face daily, from problems that churn through hundreds of millions of records to real-time applications where low-latency, highly performant algorithms are necessary. The following figure illustrates some of our go-to packages.

Calling out the major packages that we frequently use at Tubi

So, what does the infrastructure that Tubi uses for ML R&D and deployment look like? The following figure attempts to capture the 30,000-foot view of Tubi’s architecture to highlight how we live on AWS and rely heavily on Databricks as the powerhouse of our system both to support the interactive development of algorithms as well as deploying them to our live platform. This is merely a caricature, but it points out a few substantial facts:

  • We rely on 1st- and 3rd-party data that can be seamlessly integrated using S3, Redshift and Delta Lake.
  • Some aspects of ML require low-latency, real-time interactions with viewers; examples include interacting with first-time users, dynamically delivering customized title sets and recommending live news
  • Other aspects are small-data, low-latency models, such as predicting the value of never-before-seen titles
  • The ability to plug-in and use the latest and greatest algorithms in Databricks using python or scala is a huge benefit when developing ML solutions
  • PyTorch, XGBoost and other packages have been well-engineered to play nicely with Spark and Databricks and to take advantage of cloud storage, large clusters and GPUs to enable the pursuit of algorithms that may not have been in consideration because of resource issues just a couple of years ago.
A useful trace of the Tubi platform from devices to streaming views and the data and machinery that drives ML

Stay tuned for more

In conclusion, we’ve told you all about streaming services, AVOD, Tubi and how we view machine learning. It was fun, but this was just the beginning. In the following posts, we’ll take a deeper look at ML Content and some of its use cases and follow with some depth-first traversal to reveal a bit more about the fascinating world of our Project Spock system for content understanding. Unfortunately, blogs are not on-demand, so be patient — it’ll be worth the wait.
