The Rakuten-Viki Global TV Recommender Challenge came to a successful close on 16 September 2015. Six teams (Team Merlion, Team GM, Team Haipt, Team Pritish, Team Gbenedek and Team Lenguyenthedat) were invited to present their work publicly before an audience and the judges.
The finalist teams have been announced! We would like you to join us for the final presentation event, where the shortlisted teams will present their algorithms and insights.
Build a model to recommend TV drama episodes to viewers.
A Data Challenge hosted by Rakuten Institute of Technology and Rakuten-Viki based on the online TV viewing data
Rakuten-Viki Global TV Recommender Challenge
Viki – a play on the words “video” and “wiki” – is a global TV site powered by fans who have been translating their favourite foreign videos, ranging from Korean and Turkish dramas to Japanese anime, into over 200 languages. Acquired in 2013 by Rakuten, Viki continues to break down language barriers to great entertainment, contributing to Rakuten’s borderless digital ecosystem. Can you recommend to our viewers videos that would catch their interest?
The Challenge requires participants to predict for each user a set of TV drama episodes that users would watch with interest. The winning models will be accurate, easy to implement, and innovative – scoring high on the Expected Weighted Average Precision (EWAP) metric, being well-documented, and featuring key drivers that are novel and insightful.
Are you able to build a personalized recommender system for Viki fans worldwide? Based on user and video attributes and historical viewing patterns, your task is to predict the 3 TV dramas each user will watch next with the highest engagement.
The evaluation metric is:
Expected Weighted Average Precision (EWAP).
Note: Submissions will be evaluated in real time against the evaluation datasets. Participants are limited to 10 submissions per day until the closing date.
Period 39 Days
Start Wednesday 22nd July 2015
End Monday 31st August 2015 23:59
Prize pool: SG$ 12,000
Team Merlion Announced as Winner of Rakuten-Viki Global TV Recommender Challenge
20 Sept, The Neo Dimension
Winner of Rakuten-Viki Global TV Recommender Challenge Announced
17 September 2015, The Tech Revolutionist
Rakuten Institute of Technology announces winner for Rakuten-Viki Global TV Recommender Challenge
17 Sept, Tech Trade Asia
Applied analytics takes many forms at Rakuten-Viki Global TV Recommender Challenge
16 Sept, Tech Trade Asia
Singapore focuses on refining data, the new ‘oil’
16 Sept, Tech Trade Asia
Rakuten Challenge Results
4th September 2015, DEXTRA Blog Post
Successful Rakuten Workshop
19th August 2015, DEXTRA Blog Post
Rakuten opens research centre in S’pore
1st August 2015, The Straits Times (Online)
Rakuten opens research centre in S’pore
1st August 2015, Asia One
Rakuten Institute of Technology Launches in Singapore to Drive Innovation, Empower Businesses and Consumers
1st August 2015, The Tech Revolutionist
Rakuten Establishes New Technology Research Centers in Singapore and Boston
29th July 2015, Singapore News Net
#RakutenInstituteofTechnology , Rakuten’s New Research Centre in #Singapore & #Boston
30th July 2015, The Neo Dimension
Expected Weighted Average Precision
20th July 2015, DEXTRA Blog Post
Rakuten-Viki Global TV Recommender Challenge
7th July 2015, DEXTRA Blog Post
about the host
Rakuten Inc is a Japanese Internet services company founded in 1997 and headquartered in Tokyo, Japan. In addition to its flagship online marketplaces, the Rakuten Group consists of 40+ online businesses, including travel booking, internet finance, e-reading, digital marketing, and professional sports. Rakuten integrates all of its services in Japan through a membership loyalty program, Rakuten Super Points, which is a foundation of the Rakuten Ecosystem and core to the firm’s global expansion strategy.
Rakuten Institute of Technology
Rakuten Institute of Technology is the strategic R&D arm of Rakuten, Inc. It is the in-house think-tank and accelerator on a mission to improve existing services and to find new solutions to business challenges. Rakuten Institute of Technology supports businesses with predictive analytics, reduces costs via automation, and introduces innovative technologies that contribute towards Rakuten’s broader corporate goal of empowering businesses and consumers.
Rakuten Institute of Technology has established research centres in Tokyo, Paris, and New York. In 2015, the Institute opened offices in Singapore and Boston.
The Singapore branch of the Rakuten Institute of Technology will lead the research and technology development to deepen Rakuten’s understanding of consumer behaviours in Asia and beyond, enabling innovation in growth markets.
Viki Inc is a global TV site with TV shows, movies and other premium content, translated into more than 200 languages by a community of avid fans. With 35 million viewers each month, 26 million mobile app installs and over 800 million words translated, Viki uniquely brings global prime-time entertainment to new audiences and unlocks new markets and revenue opportunities for content owners. Viki was acquired by Rakuten in 2013. Viki has offices in San Francisco, Singapore, Seoul and Tokyo.
In January 2014, Rakuten launched Internet shopping mall Rakuten.com.sg in Singapore, through its group company Rakuten Asia Pte. Ltd.
Data is provided for a sample of users and should not be considered complete. All data is anonymized to protect users’ privacy.
This is the users’ viewing behaviour data on Viki from 1 October 2014 to 31 January 2015, containing more than 4.9 million rows. Each record contains information about a particular user watching a particular video.
This is the user attributes data, specifying gender and country of origin for over 880,000 users.
This is the video attribute data, including video’s country of origin, language, genre, and more, for over 600 titles.
This is the data on nearly 2,000 actors and actresses featured in Viki TV dramas. The country of origin and gender of the cast members are provided without masking. Names are replaced with actor IDs.
This is the list of user IDs to be submitted with your recommended videos. The data has more than 1.8 million rows and covers user behaviour from February 2015 to March 2015. We will share details on the submission format soon.
This is the Report and Algorithm Summary template. Please complete your report according to the template, save it as a PDF, and submit it together with your prediction scores and your zipped, well-documented code via the “Upload” section of the submission page in your final submission.
This is the data schema for the datasets in CSV format. It includes explanations of the features and their values, as well as basic statistics for selected features. Please read it carefully so you do not miss any important information.
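As a starting point, the viewing-behaviour records can be joined with the user and video attribute tables before feature engineering. The sketch below uses toy in-memory records; all field names (`user_id`, `video_id`, `gender`, `country`, `genre`, etc.) are hypothetical placeholders — consult the provided data schema for the real column names.

```python
# Hypothetical toy records standing in for the real CSV datasets.
# Field names are illustrative only; see the official data schema.
behaviour = [
    {"user_id": "u1", "video_id": "v1", "watch_seconds": 1200},
    {"user_id": "u1", "video_id": "v2", "watch_seconds": 300},
    {"user_id": "u2", "video_id": "v1", "watch_seconds": 2400},
]
users = {"u1": {"gender": "f", "country": "SG"},
         "u2": {"gender": "m", "country": "JP"}}
videos = {"v1": {"genre": "drama", "origin": "KR"},
          "v2": {"genre": "anime", "origin": "JP"}}

def enrich(rows, users, videos):
    """Join each viewing record with its user and video attributes."""
    out = []
    for r in rows:
        merged = dict(r)
        merged.update(users.get(r["user_id"], {}))
        merged.update(videos.get(r["video_id"], {}))
        out.append(merged)
    return out
```

At the real data's scale (4.9M+ rows), a dataframe library would be the idiomatic choice, but the join logic is the same.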
We handpicked a selection of materials to help you quickly become familiar with the core principles of recommendation systems.
Introduction to recommendation systems:
More advanced materials:
What is the evaluation metric, and why did we choose it?
The evaluation metric is Expected Weighted Average Precision (EWAP), derived from the commonly used Mean Average Precision (MAP) metric and customized for this challenge by DEXTRA and Rakuten data scientists. The details, including an example of how the metric is calculated, can be found in the “Details” section of the challenge statement.
Typically, 3 types of metrics are used to evaluate an offline recommendation system: accuracy metrics (such as RMSE), decision-support metrics (such as recall and precision), and rank metrics (such as DCG). From a business point of view, decision-support and rank metrics are usually more important, because they are directly related to increasing sales and improving users’ experience. We therefore crafted a new metric incorporating characteristics of the latter two.
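The exact EWAP weighting is defined in the challenge statement and is not reproduced here; as an illustration of the metric family it comes from, a minimal Average Precision @ k (the building block of MAP) can be sketched as follows.

```python
def average_precision_at_k(recommended, relevant, k=3):
    """Average Precision @ k: mean of the precision values at each
    rank where a relevant item appears in the top-k list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    return score / min(len(relevant), k)

def mean_average_precision(rec_lists, rel_lists, k=3):
    """MAP@k: the average of AP@k across all users."""
    scores = [average_precision_at_k(r, t, k)
              for r, t in zip(rec_lists, rel_lists)]
    return sum(scores) / len(scores)
```

For example, recommending `["a", "b", "c"]` when all three are relevant scores 1.0, while a single relevant item found at rank 2 scores 0.5 — rank metrics reward placing good items early.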
Do you provide any resources for beginners in recommendation systems?
We strongly encourage everyone to participate in this challenge. We handpicked a selection of materials to help you quickly become familiar with the core principles of recommendation systems. Please click here to access them.
Why is the demographic data of users limited?
Viki collects limited information from registered users. If you have an idea for improving the recommendation system by collecting more demographic data, please share it with us in your report.
Can I submit more than one algorithm?
Yes, you can submit more than one algorithm; add a one-line summary of each algorithm in the comments section during each submission. There is no limit to the total number of models you can submit. However, please note that you may make only 10 submissions per day, and your final performance will be evaluated based on your latest submission.
What if I submit the same algorithm more than once (as I had to make some changes)?
We totally understand that participants will be testing multiple iterations of their algorithms before the deadline and prior to writing the final report. Feel free to make use of the “Comment” section to annotate your submission for your easy reference.
Which techniques should I employ?
You are encouraged to explore different techniques, or even a combination of multiple algorithms. It is a dual challenge for your creativity and knowledge — work hard but have fun! As long as the output is in CSV format, so that our system can evaluate your submissions, there is no limit on which approaches, software, or technology you use.
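The official submission format has not yet been published, so the sketch below is purely illustrative: it assumes one row per user with that user's three recommended video IDs. Follow the official specification once it is released.

```python
import csv
import io

def write_submission(recommendations, fh):
    """Write an assumed submission layout: user_id plus top-3 video IDs.
    The real format will be specified by the organisers."""
    writer = csv.writer(fh)
    writer.writerow(["user_id", "video_1", "video_2", "video_3"])
    for user_id, video_ids in recommendations.items():
        writer.writerow([user_id] + list(video_ids[:3]))

# Example: one user with three recommended videos.
buf = io.StringIO()
write_submission({"u1": ["v9", "v3", "v7"]}, buf)
```

Writing through the standard `csv` module (rather than manual string joins) keeps quoting and escaping correct regardless of what characters appear in the IDs.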
What is the difference between Public Evaluation Score and Private Evaluation Score?
The accuracy of your submission is determined by your recommendation sets’ performance on the provided test dataset. We have the answers for this dataset but are withholding them to compare against your predictions.
The Public Evaluation Score is what you receive upon each submission; it is calculated using the statistical evaluation metric, which is EWAP in this challenge. However, your Public Evaluation Score is determined by only a fraction of the test dataset. The scores shown on the leaderboard are Public Evaluation Scores, and they indicate relative performance during the challenge.
The Private Evaluation Score is computed by comparing your predictions against the rest of the test dataset. In this challenge, the public score is calculated from users’ activities in February 2015 (about 50% of the total test dataset), and the private score is calculated from users’ activities in March 2015. You will not see your private score until the challenge ends. Finalists are selected based on the Private Leaderboard.
Why do we need to split Public and Private Evaluation Scores?
The test dataset is separated into public and private portions to ensure that the winning model is both accurate and well generalized. If you tune your model solely on the data that gives you constant feedback, you risk building a model that overfits to the specific noise in that data. One of the hard problems in predictive analytics is avoiding overfitting by keeping your model flexible enough to handle out-of-sample data.
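One practical way to guard against this is to mimic the public/private split locally: hold out the most recent slice of the training timeline and validate on it, just as the challenge scores on later (February/March) activity. The events and cutoff date below are toy illustrations, not the challenge's actual data.

```python
from datetime import date

# Toy viewing events: (user_id, video_id, view_date).
events = [
    ("u1", "v1", date(2014, 10, 5)),
    ("u1", "v2", date(2014, 12, 20)),
    ("u2", "v3", date(2015, 1, 15)),
    ("u2", "v1", date(2015, 1, 30)),
]

def temporal_split(events, cutoff):
    """Train on events before the cutoff, validate on the rest —
    mirroring how the challenge scores predictions on later activity."""
    train = [e for e in events if e[2] < cutoff]
    valid = [e for e in events if e[2] >= cutoff]
    return train, valid

train, valid = temporal_split(events, date(2015, 1, 1))
```

A model whose local hold-out score tracks its leaderboard score is far less likely to collapse when the private portion is revealed.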