An Easy Introduction to Data Science Recommender Systems


How does Netflix or YouTube know what you would like to watch? How does Amazon suggest the very items you have been meaning to buy? How does Google know what you would like to read? Big Tech makes all of this possible with data science recommender systems.

Recommender systems are the techniques and algorithms that suggest to a user the content “relevant” to that user’s preferences.

Recommender systems help a digital ecosystem scale and customize at the same time.

This quality gives businesses better targeting capabilities and is a great value-add.

How do they do it? They rank items by relevance, which is quantified from what the user has recently viewed or purchased: news, e-commerce items, music, and so on. Based on this data, recommender system algorithms begin to show the user items with similar themes.

In this article, we discuss:

  • Types of Recommender Systems
  • Collaborative Systems and their types
  • Content-Based Systems
  • Hybrid Systems

Three Kinds of Recommender Systems

There are two main types of recommender systems, collaborative and content-based, plus hybrid systems that combine the two.

Collaborative systems base their recommendations on a user’s historical preferences, while content-based systems customize the recommendations further by also drawing on descriptive attributes such as age, sex, and other factors.

Collaborative Systems

Collaborative methods are based on a user’s past interactions with a website or system. These systems record which items the user clicks and interacts with and, based on that, devise new recommendations of products, videos, or other offerings.

These interactions are stored in a form known as the user-item interactions matrix. A few examples of what its rows, entries, and columns can represent:

Users       | User-Item Interactions Matrix              | Items
Subscribers | Rating given                               | Movies
Readers     | Time spent by a reader                     | Articles
Buyers      | Product clicked or not (during suggestion) | Products
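
As a rough illustration of the user-item interactions matrix, the sketch below builds one from a short list of ratings. The user names, item names, and the `build_interaction_matrix` helper are all hypothetical, and using NaN for "no interaction recorded" is one possible convention rather than a standard.

```python
import numpy as np

# Hypothetical interaction log: (user, item, rating on a 1-5 scale).
interactions = [
    ("ana", "movie_a", 5.0),
    ("ana", "movie_b", 3.0),
    ("ben", "movie_a", 4.0),
    ("ben", "movie_c", 1.0),
    ("cho", "movie_b", 2.0),
]

def build_interaction_matrix(records):
    """Return (matrix, users, items); rows are users, columns are items."""
    users = sorted({user for user, _, _ in records})
    items = sorted({item for _, item, _ in records})
    user_index = {user: k for k, user in enumerate(users)}
    item_index = {item: k for k, item in enumerate(items)}
    # NaN marks "no interaction recorded", which is not the same as a low rating.
    matrix = np.full((len(users), len(items)), np.nan)
    for user, item, value in records:
        matrix[user_index[user], item_index[item]] = value
    return matrix, users, items

matrix, users, items = build_interaction_matrix(interactions)
print(users, items)
print(matrix)
```

Most entries of a real matrix are empty, because each user interacts with only a small fraction of the catalogue. Filling those gaps is the job of the recommender.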

The core idea behind collaborative methods is that users’ past interactions are enough to detect similar users and similar items, and to predict from that proximity what else a user would like.

There are two sub-categories of collaborative recommender systems algorithms – memory-based and model-based.

  • Memory-based approaches assume no model and work directly with the recorded interactions. They are based on nearest-neighbor search, i.e. finding the past records of the closest users to predict what the user in question would be interested in. Of the items these “neighbors” suggest, the most popular are recommended.
  • Model-based approaches assume an underlying model, a generative model so to speak, that explains the interactions between users and items, and use it to make new predictions.
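
The memory-based idea can be sketched in a few lines. This is a minimal illustration, not a production method: the ratings are made up, 0 is assumed to mean "not rated", and cosine similarity with similarity-weighted voting is just one common choice for the neighbor search. The `recommend` function and its parameter `k` are hypothetical names.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],  # user 0
    [5.0, 5.0, 4.0, 1.0],  # user 1: tastes close to user 0
    [4.0, 5.0, 5.0, 0.0],  # user 2: tastes close to user 0
    [1.0, 0.0, 2.0, 5.0],  # user 3: rather different tastes
])

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (0.0 if either is all zeros)."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm else 0.0

def recommend(ratings, user, k=2):
    """Return the index of the unrated item favoured by the k nearest users, or None."""
    n_users = ratings.shape[0]
    # Similarity of every other user to the target; the target itself is excluded.
    sims = np.array([
        cosine_similarity(ratings[user], ratings[other]) if other != user else -1.0
        for other in range(n_users)
    ])
    neighbours = np.argsort(sims)[::-1][:min(k, n_users - 1)]
    # Each neighbour votes for items with a weight equal to its similarity.
    scores = sims[neighbours] @ ratings[neighbours]
    scores[ratings[user] > 0] = -np.inf  # never re-recommend items already rated
    best = int(np.argmax(scores))
    return best if np.isfinite(scores[best]) else None

print(recommend(ratings, user=0))
```

For user 0 the two nearest neighbors are users 1 and 2, and the only item user 0 has not rated is item 2, so that is what gets recommended. A model-based approach would instead fit parameters to the matrix (for example by factorizing it) and predict from those.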

Benefits and Limitations of Collaborative Systems

A major advantage of collaborative systems is that they require no information about the users or the items themselves, only the recorded interactions. This makes them suitable for a wide variety of situations. Additionally, the more a user interacts with the items, the better the recommendations become.

These systems, however, suffer from the “cold start problem.” That is to say, a collaborative system cannot recommend anything to a new user because it has nothing in its past records to take a cue from. Data scientists and machine learning professionals mainly address this problem with one of the following strategies:

  • Random Strategy – Recommending random items to new users or recommending new items to random users.
  • Maximum Expectation Strategy – Recommending popular items to new users or new items to the most active users.
  • Exploratory Strategy – Recommending a varied set of items to new users or a new item to a varied set of users.
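
The three strategies above could be sketched roughly as follows for the new-user case. The interaction log, the catalogue, the category grouping, and all function names are hypothetical, and "popular" is assumed here to mean "most interactions recorded".

```python
import random
from collections import Counter

# Hypothetical (user, item) interaction log from existing users, and a small catalogue.
log = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u1", "b"), ("u2", "b"), ("u3", "c")]
catalogue = ["a", "b", "c", "d"]
items_by_category = {"action": ["a", "b"], "fantasy": ["c", "d"]}

def random_strategy(catalogue, n, rng):
    """Random strategy: pick n distinct items uniformly at random."""
    return rng.sample(catalogue, n)

def maximum_expectation_strategy(log, n):
    """Maximum expectation strategy: pick the n items with the most interactions."""
    counts = Counter(item for _, item in log)
    return [item for item, _ in counts.most_common(n)]

def exploratory_strategy(items_by_category, rng):
    """Exploratory strategy: pick one item per category so the set is varied."""
    return [rng.choice(items) for _, items in sorted(items_by_category.items())]

rng = random.Random(0)  # seeded only to make the sketch repeatable
print(random_strategy(catalogue, 2, rng))
print(maximum_expectation_strategy(log, 2))
print(exploratory_strategy(items_by_category, rng))
```

Whatever the strategy, the new user's first clicks or ratings then go into the interactions matrix, after which the regular collaborative method can take over.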

Content-Based Systems

Content-based approaches use detailed information about a user and the items to profile each. They take into account factors such as age, sex, occupation, interests, hobbies, and past interactions.

For instance, a movie recommender system would use a user’s personal characteristics and preferences to learn which actors the user likes, how much time the user has to watch a movie (based on the duration of the movies the user usually prefers), and more, to finally recommend a movie (or other items).

User’s Features                           | Star Wars: The Rise of Skywalker | Avatar
Male, 22, Interests: Action, Crime        | Likes                            | Dislikes
Female, 26, Interests: Fantasy, Adventure | Dislikes                         | Likes
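
A very small content-based sketch is given below. It matches a user's stated interests against genre tags on items. The item names and their tags are invented for illustration, only interests are used (a fuller model would also use age, sex, and the other factors mentioned above), and cosine similarity over one-hot vectors is an assumed scoring choice.

```python
import numpy as np

GENRES = ["action", "crime", "fantasy", "adventure"]

def encode(tags):
    """One-hot encode a list of genre tags over GENRES."""
    return np.array([1.0 if genre in tags else 0.0 for genre in GENRES])

# Hypothetical items with hypothetical genre tags.
items = {
    "heist_film": encode(["action", "crime"]),
    "dragon_saga": encode(["fantasy", "adventure"]),
    "space_chase": encode(["action", "adventure"]),
}

def rank_items(user_interests, items):
    """Rank item names by cosine similarity between the user profile and item tags."""
    profile = encode(user_interests)

    def score(vector):
        norm = np.linalg.norm(profile) * np.linalg.norm(vector)
        return float(profile @ vector / norm) if norm else 0.0

    return sorted(items, key=lambda name: score(items[name]), reverse=True)

print(rank_items(["action", "crime"], items))
print(rank_items(["fantasy", "adventure"], items))
```

Because the ranking depends only on features, it works for a user with no interaction history at all, as long as the user's interests are known.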


Benefits and Limitations of Content-Based Systems

The core idea behind content-based recommender systems is to build a model based on the features of users and items, and then use it to explain what a user would like or dislike.

To give an example, to model young women, a content-based recommender system would try to model how young women tend to rate movies, and similarly for young men. A model makes new predictions more easily and accurately as more information is fed into it. These recommender systems learn from every input because they are built on the fundamental principles of machine learning. Unlike collaborative systems, content-based systems suffer far less from the cold-start problem, since a new user or item can still be described by its features.

Hybrid Systems

As the name suggests, hybrid systems combine collaborative and content-based systems. Hybrid approaches usually take one of two forms: either they train two models separately, one collaborative and one content-based, and combine their suggestions, or they build a single model (generally a neural network) that unifies both types of recommender systems.
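
The first form, two separate models whose outputs are combined, might look like the sketch below. The scores, the `hybrid_scores` helper, and the `weight` parameter are hypothetical. A weighted average is only one simple way to combine two models, and it assumes both produce scores on a comparable scale.

```python
def hybrid_scores(collab, content, weight=0.5):
    """Blend two {item: score} dicts; weight is the share given to collaborative scores."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    all_items = set(collab) | set(content)
    # An item missing from one model contributes 0.0 from that model.
    return {
        item: weight * collab.get(item, 0.0) + (1.0 - weight) * content.get(item, 0.0)
        for item in all_items
    }

def top_item(scores):
    """Return the item with the highest blended score."""
    return max(scores, key=scores.get)

# Hypothetical outputs of a collaborative model and a content-based model.
collab = {"a": 0.9, "b": 0.2}
content = {"a": 0.1, "b": 0.8, "c": 0.7}

print(top_item(hybrid_scores(collab, content, weight=0.8)))
print(top_item(hybrid_scores(collab, content, weight=0.2)))
```

Item "c" has no collaborative score at all, as a brand-new item would, yet it still receives a blended score from the content-based side. That is one practical reason to combine the two approaches.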

Data science recommender systems have become vital to all industries, not just big tech. This was an introduction to the much-in-vogue art of recommendation in the digital world. For more on them, follow DASCA insights.

