
TikTok recommender system

The TikTok recommender system is widely regarded as one of the best in the world at the scale it operates at. It recommends both videos and ads, and even the other big tech companies have struggled to match it. Recommendation on a platform like TikTok is hard because the training data is non-stationary: a user's interests can change within minutes, and the pool of users, videos, and ads keeps changing.

The predictive performance of a recommender model on a social media platform deteriorates within hours, so the model needs to be updated as often as possible. TikTok built a streaming engine so the model is trained continuously in an online manner. The model server generates features, the model recommends videos, and the user interacts with the recommended items. This feedback loop produces new training samples that are immediately sent to the training server. The training server holds a copy of the model, and the updated model parameters are stored in the parameter server. Every minute, the parameter server synchronizes these updates with the production model.
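The sketch below is a rough illustration of this feedback loop, not TikTok's actual code: feedback samples are consumed by a training loop as soon as they arrive, and a parameter server pushes the updated parameters to the serving model every minute. All class and function names here are assumptions made for the example.

```python
import time
from queue import Queue

class ParameterServer:
    """Holds the training copy of the model parameters."""
    def __init__(self):
        self.params = {}   # parameter name -> value
        self.dirty = set() # names of parameters changed since the last sync

    def apply_update(self, name, value):
        self.params[name] = value
        self.dirty.add(name)

    def sync_to_serving(self, serving_params):
        # Copy only the parameters touched since the last sync.
        for name in self.dirty:
            serving_params[name] = self.params[name]
        self.dirty.clear()


def compute_update(sample):
    # Placeholder for a real forward/backward pass over one feedback sample.
    features, label = sample
    return {f"w_{name}": value * 0.01 for name, value in features.items()}


def training_loop(sample_queue: Queue, ps: ParameterServer):
    """Consume feedback samples as soon as they arrive (online training)."""
    while True:
        sample = sample_queue.get()
        for name, value in compute_update(sample).items():
            ps.apply_update(name, value)


def minute_sync_loop(ps: ParameterServer, serving_params):
    """Every minute, synchronize updated parameters with the production model."""
    while True:
        time.sleep(60)
        ps.sync_to_serving(serving_params)
```

In a real deployment the training loop and the sync loop would run concurrently (e.g. in separate threads or processes), which is what makes the minute-level cadence possible.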

The recommendation model is several terabytes in size, so synchronizing the whole model across the network is very slow. That is why the model is only partially updated. The leading cause of non-stationarity (concept drift) is the sparse variables (users, videos, ads, etc.) that are represented by embedding tables. When a user interacts with a recommended item, only the vectors associated with that user and that item get updated, along with some of the weights of the dense network. Therefore, only the updated vectors are synchronized every minute, while the network weights are synchronized on a longer time frame.
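The sketch below illustrates this kind of partial synchronization under stated assumptions: only the embedding rows touched since the last sync cross the network every minute, while the dense weights are copied in full but on a longer cadence. Class names, dimensions, and intervals are illustrative, not the production design.

```python
import numpy as np

EMBED_DIM = 16

class SparseEmbeddings:
    """Embedding table for sparse variables (users, videos, ads, ...)."""
    def __init__(self):
        self.rows = {}        # id -> embedding vector
        self.touched = set()  # ids whose vectors changed since the last sync

    def apply_gradient(self, item_id, grad, lr=0.01):
        vec = self.rows.setdefault(item_id, np.zeros(EMBED_DIM))
        self.rows[item_id] = vec - lr * grad
        self.touched.add(item_id)

    def sync_touched(self, serving_rows):
        # Only the updated vectors are sent, not the multi-terabyte table.
        for item_id in self.touched:
            serving_rows[item_id] = self.rows[item_id].copy()
        self.touched.clear()


class DenseWeights:
    """Dense network weights, synchronized on a longer time frame."""
    def __init__(self, shape=(128, 64)):
        self.w = np.zeros(shape)

    def sync_all(self, serving):
        # Small relative to the embeddings, but changed on every step,
        # so they are pushed less often and in full.
        serving["dense_w"] = self.w.copy()
```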

Typical recommender systems use fixed-size embedding tables, and the categories of the sparse variables are assigned to vectors through a hash function. The hash size is usually smaller than the number of categories, so multiple categories get mapped to the same vector; for example, multiple users share the same vector. This handles the cold-start problem for new users and bounds the maximum memory the table can use, but it also tends to reduce model quality because the behaviors of different users get conflated. Instead, TikTok uses a dynamically sized embedding table with a collisionless hash function, so every new user is assigned its own vector. Since low-activity users contribute little to model performance, those low-occurrence IDs, as well as stale IDs, are dynamically removed. This keeps the embedding table small while preserving the quality of the model.
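A minimal sketch of what a collisionless, dynamically sized embedding table with eviction might look like is shown below. Each ID maps to its own vector, new IDs grow the table, and low-occurrence or stale IDs are periodically evicted. The thresholds, field names, and eviction policy are assumptions for illustration, not the production implementation.

```python
import time
import numpy as np

class CollisionlessEmbeddingTable:
    def __init__(self, dim=16, min_count=5, max_idle_s=7 * 24 * 3600):
        self.dim = dim
        self.min_count = min_count    # drop IDs seen fewer times than this
        self.max_idle_s = max_idle_s  # drop IDs not seen for this long
        self.vectors = {}             # id -> its own vector (no collisions)
        self.counts = {}              # id -> number of occurrences
        self.last_seen = {}           # id -> last access timestamp

    def lookup(self, item_id):
        # Each ID gets its own vector; new IDs grow the table dynamically.
        self.counts[item_id] = self.counts.get(item_id, 0) + 1
        self.last_seen[item_id] = time.time()
        if item_id not in self.vectors:
            self.vectors[item_id] = np.random.normal(0, 0.01, self.dim)
        return self.vectors[item_id]

    def evict(self):
        # Periodically remove low-occurrence and stale IDs to bound memory.
        now = time.time()
        for item_id in list(self.vectors):
            too_rare = self.counts[item_id] < self.min_count
            too_stale = now - self.last_seen[item_id] > self.max_idle_s
            if too_rare or too_stale:
                del self.vectors[item_id]
                del self.counts[item_id]
                del self.last_seen[item_id]
```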
