Connected Intelligence

Master internship – Data Mining on Big Data for Music Recommender Systems




In today’s world, many goods and services are provided to consumers through web applications. Given the plethora of information available in e-commerce, it is necessary to filter this information and keep only the items that might be relevant to the user. Recommender systems are such automated systems: they can be defined as software tools and techniques that provide suggestions for items that are most likely of interest to a particular user [1]. In the cultural field, for example in music recommendation [2], using those systems raises the question of diversity, novelty, and discovery [3]. People are fond of stability, but they are not averse to breaking their routine and exploring things outside their comfort zone. In this context, it is relevant to propose new items that are not too similar to items already used or bought by the users, in order to expand and enrich their cultural knowledge. This can be done using a dissimilarity measure computed between cultural items. Moreover, a few years ago, with the emergence of the word embedding paradigm [4] and the ability to run algorithms on big data, content-based approaches [5] saw a resurgence of interest in recommender systems [6, 7].
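As a purely illustrative sketch of the idea of recommending items that are "not too similar", the following Python code computes a cosine dissimilarity between embedding vectors and keeps the items whose dissimilarity to a user profile falls inside a band. The function names, the choice of cosine dissimilarity, the thresholds `low` and `high`, and the toy vectors are all assumptions made for this example; they are not taken from the project or from the 1D Lab platform.

```python
import numpy as np


def cosine_dissimilarity(a, b):
    """Return 1 - cosine similarity between two non-zero vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def discovery_candidates(profile, catalog, low=0.2, high=0.6):
    """Return the names of catalog items whose dissimilarity to the
    profile lies in [low, high], ordered from closest to farthest.

    `profile` is a hypothetical user embedding, `catalog` maps item
    names to hypothetical item embeddings, and the band [low, high]
    is an arbitrary choice for illustration.
    """
    scored = [(cosine_dissimilarity(profile, vec), name)
              for name, vec in catalog.items()]
    return [name for d, name in sorted(scored) if low <= d <= high]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings (assumed, for illustration only).
    catalog = {
        "same_style": [1.0, 0.0],
        "adjacent_style": [1.0, 1.0],
        "unrelated_style": [0.0, 1.0],
    }
    print(discovery_candidates([1.0, 0.0], catalog))
```

The lower bound excludes items that are nearly identical to what the user already knows, while the upper bound excludes items that are so distant that they are unlikely to be relevant; in practice both bounds would have to be tuned on real data.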
The objective of this master's thesis is to study how data mining techniques can help improve the quality of music recommendations on a streaming platform when the content associated with music artists is not well structured. This is the case for emerging music artists from independent record labels (“indie labels”): the artists are not publicly known, they have little or no publicity, most of them do not have a web site with useful structured content to exploit (e.g., a Wikipedia page), there is very little chance of finding articles on these artists in the specialized music press, etc.
This project will be carried out at the Hubert Curien Laboratory (Saint-Etienne, France) using the data of a real music streaming platform called “1D lab”, developed by the social start-up company 1D Lab (


Expected results