Deep Neural Networks for YouTube Recommendations

tion methods [19], there is relatively little work using deep neural networks for recommendation systems. Neural networks are used for recommending news in [17], citations in [8] and review ratings in [20]. Collaborative filtering is formulated as a deep neural network in [22] and autoencoders in [18]. Elkahky et al. used deep learning for cross domain user modeling [5]. In a content-based setting, Burges et al. used deep neural networks for music recommendation [21].

The paper is organized as follows: A brief system overview is presented in Section 2. Section 3 describes the candidate generation model in more detail, including how it is trained and used to serve recommendations. Experimental results will show how the model benefits from deep layers of hidden units and additional heterogeneous signals. Section 4 details the ranking model, including how classic logistic regression is modified to train a model predicting expected watch time (rather than click probability). Experimental results will show that hidden layer depth is helpful as well in this situation. Finally, Section 5 presents our conclusions and lessons learned.

2. SYSTEM OVERVIEW

The overall structure of our recommendation system is illustrated in Figure 2. The system is comprised of two neural networks: one for candidate generation and one for ranking.

The candidate generation network takes events from the user’s YouTube activity history as input and retrieves a small subset (hundreds) of videos from a large corpus. These candidates are intended to be generally relevant to the user with high precision. The candidate generation network only provides broad personalization via collaborative filtering. The similarity between users is expressed in terms of coarse features such as IDs of video watches, search query tokens and demographics.

Presenting a few “best” recommendations in a list requires a fine-level representation to distinguish relative importance among candidates with high recall.
The ranking network accomplishes this task by assigning a score to each video according to a desired objective function using a rich set of features describing the video and user. The highest scoring videos are presented to the user, ranked by their score.

The two-stage approach to recommendation allows us to make recommendations from a very large corpus (millions) of videos while still being certain that the small number of videos appearing on the device are personalized and engaging for the user. Furthermore, this design enables blending candidates generated by other sources, such as those described in an earlier work [3].

During development, we make extensive use of offline metrics (precision, recall, ranking loss, etc.) to guide iterative improvements to our system. However, for the final determination of the effectiveness of an algorithm or model, we rely on A/B testing via live experiments. In a live experiment, we can measure subtle changes in click-through rate, watch time, and many other metrics that measure user engagement. This is important because live A/B results are not always correlated with offline experiments.

[Figure 2: Recommendation system architecture demonstrating the “funnel” where candidate videos are retrieved and ranked before presenting only a few to the user. Labels in the figure: video corpus (millions), candidate generation (hundreds), ranking (dozens); inputs: user history and context, other candidate sources, video features.]

3. CANDIDATE GENERATION

During candidate generation, the enormous YouTube corpus is winnowed down to hundreds of videos that may be relevant to the user. The predecessor to the recommender described here was a matrix factorization approach trained under rank loss [23]. Early iterations of our neural network model mimicked this factorization behavior with shallow networks that only embedded the user’s previous watches. From this perspective, our approach can be viewed as a non-linear generalization of factorization techniques.
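
The two-stage funnel described above can be illustrated with a minimal sketch. It is not the system's implementation: the embeddings are random, the user vector is simply the average of the watched videos' embeddings (loosely mirroring the shallow, factorization-like early models mentioned above), and the ranking score is a stand-in that adds one hypothetical feature, `freshness`. All function and variable names are assumptions made for illustration.

```python
import random

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def average(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def generate_candidates(user_vec, video_vecs, k):
    """Stage 1: keep the k video IDs whose embeddings score highest against the user."""
    by_score = sorted(video_vecs, key=lambda vid: dot(user_vec, video_vecs[vid]), reverse=True)
    return by_score[:k]

def rank(candidates, score_fn, n):
    """Stage 2: re-score only the candidates and keep the top n."""
    return sorted(candidates, key=score_fn, reverse=True)[:n]

random.seed(0)
DIM = 8  # embedding size, chosen arbitrarily for this sketch

# Toy corpus: video ID -> random embedding (a real corpus would hold millions).
corpus = {vid: [random.gauss(0, 1) for _ in range(DIM)] for vid in range(10_000)}

# Assumed user representation: average embedding of previously watched videos.
watch_history = [3, 17, 256]
user_vec = average([corpus[vid] for vid in watch_history])

# Stage 1 narrows the corpus to a few hundred candidates.
candidates = generate_candidates(user_vec, corpus, k=200)

# Hypothetical extra feature that only the ranking stage uses.
freshness = {vid: random.random() for vid in corpus}

def ranking_score(vid):
    """Stand-in for the ranking network: embedding match plus a feature bonus."""
    return dot(user_vec, corpus[vid]) + 0.5 * freshness[vid]

# Stage 2 orders the candidates and keeps the few that are shown.
shown = rank(candidates, ranking_score, n=12)
```

The design point the sketch tries to capture is that the expensive, feature-rich scorer only ever sees the few hundred candidates, never the full corpus, and that candidates from other sources could be merged into `candidates` before ranking.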
3.1 Recommendation as Classification

We pose recommendation as extreme multiclass classification where the prediction problem becomes accurately classifying a specific video watch $w_t$ at time $t$ among millions of videos $i$ (classes) from a corpus $V$ based on a user $U$ and context $C$,

$$P(w_t = i \mid U, C) = \frac{e^{v_i u}}{\sum_{j \in V} e^{v_j u}}$$

where $u \in \mathbb{R}^N$ represents a high-dimensional “embedding” of the user, context pair and the $v_j \in \mathbb{R}^N$ represent embeddings of each candidate video. In this setting, an embedding is simply a mapping of sparse entities (individual videos, users etc.) into a dense vector in $\mathbb{R}^N$. The task of the deep neural network is to learn user embeddings $u$ as a function of the user’s history and context that are useful for discriminating among videos with a softmax classifier.

Although explicit feedback mechanisms exist on YouTube (thumbs up/down, in-product surveys, etc.) we use the implicit feedback [16] of watches to train the model, where a user completing a video is a positive example. This choice is based on the orders of magnitude more implicit user history available, allowing us to produce recommendations deep in the tail where explicit feedback is extremely sparse.

Efficient Extreme Multiclass

To efficiently train such a model with millions of classes, we rely on a technique to sample negative classes from the background distribution (“candidate sampling”) and then correct for this sampling via importance weighting [10]. For each example the cross-entropy loss is minimized for the true label and the sampled negative classes. In practice several thousand negatives are sampled, corresponding to more than 100 times speedup over traditional softmax. A popular alternative approach is hierarchical softmax [15], but we weren’t
