Recommendation Engine for WPP marketing agency

This is a quick summary of a machine learning project I did as a freelancer for a marketing agency between January and June 2017. The technology was later acquired by WPP.

The summary is mostly non-technical. It may be of interest if you want to find out more about recommendation engines or about the steps a data science project usually involves.

Context on recommendation systems

A recommendation algorithm is the engine that allows websites and social media platforms to present relevant ads and content to their users. These algorithms are key to boosting revenue for online retailers. According to a report by McKinsey & Company, approximately 35% of purchases on Amazon are driven by machine learning algorithms that analyze transaction data to recommend products.

Content providers and online movie streaming services also owe much of their success to these machine learning algorithms. Some reports show that:

  • Nearly 75% of viewing on Hulu is due to its recommendation system
  • Approximately 50% of movies watched on Netflix are recommended by the company’s recommendation algorithm.

Context on the project

In 2017, digital marketing company FusePump needed its own recommendation engine to satisfy a major consumer goods client, but lacked the internal capability to build machine learning algorithms that learn customer preferences and recommend relevant products. I was referred by an acquaintance, and FusePump asked me to build a product recommendation engine able to:

  • Analyse user preferences and recommend relevant products based on online activity.
  • Use anonymised data so as not to compromise customer privacy.
  • Continuously learn from new data and new customers.

FusePump also asked for the product recommendations to sit at the core of a bidding platform that would allow brands and other retailers to compete for ad space. Essentially, if a retailer receives multiple bids for the same customer segment, it needs a way to choose the winning bid. Instead of relying on a human to make the decision, FusePump wanted to automate the process with a custom machine learning algorithm.


Delivery

After understanding the initial scope of the problem, my first step was to propose a scope of work: a description of the machine learning approach I was suggesting and a development timeline, including assumptions, milestones, potential risks and decision points.
An important milestone was the early delivery of a proof of concept. That took me 2 weeks after gaining access to the data, and it consisted of a simple prototype with limited functionality. A proof of concept is a vital part of the development process, as it demonstrates that the algorithm works and helps non-technical employees understand how machine learning translates logic into code.
I then collaborated with FusePump’s internal development, marketing, design and consulting teams to ensure the product aligned with the client’s needs and could integrate well with FusePump’s existing infrastructure.
To facilitate the development process and allow for potential changes in requirements, the project was carried out through incremental improvements and iterations. Over the subsequent 4 months, I built two core algorithms:


Algorithm 1: Personalized Product Suggestions


The algorithm followed this workflow:

  1. The algorithm analyzes customers’ purchase histories and estimates how similar items are to each other. Similarity is scored based on how many customers purchased the items together. For example, diapers and milk were more similar than diapers and beer, as many people who buy diapers also buy milk, but very few people who purchase diapers also purchase beer.
  2. A customer’s historical purchases are matched with similar products and assembled into a ranked list of product recommendations.
  3. When a customer visits the retailer’s website, the top-ranking item is shown as an ad. The algorithm could be further improved by taking external factors into account, such as likelihood to buy based on the time of year (e.g. customers are more likely to buy blankets during the winter than during the summer).
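The workflow above can be sketched in a few lines of Python. This is a minimal illustration, not the production code: it assumes similarity is the raw count of customers who bought both items (no normalization), and the function names `co_purchase_counts` and `recommend` and the toy baskets are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_purchase_counts(baskets):
    """Count, for every pair of items, how many customers bought both."""
    pair_counts = Counter()
    for items in baskets.values():
        # Sort so each pair is always stored in the same (a, b) order.
        for a, b in combinations(sorted(set(items)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def recommend(customer, baskets, pair_counts, top_n=3):
    """Rank items the customer has not bought by co-purchase score."""
    owned = set(baskets[customer])
    scores = defaultdict(int)
    for (a, b), count in pair_counts.items():
        if a in owned and b not in owned:
            scores[b] += count
        elif b in owned and a not in owned:
            scores[a] += count
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Toy, made-up purchase histories keyed by anonymised customer id.
baskets = {
    "c1": ["diapers", "milk"],
    "c2": ["diapers", "milk", "wipes"],
    "c3": ["diapers", "milk"],
    "c4": ["beer", "chips"],
    "c5": ["diapers", "beer"],
    "c6": ["diapers"],
}

pair_counts = co_purchase_counts(baskets)
recommendations = recommend("c6", baskets, pair_counts)
print(recommendations)  # the first entry is the item shown as an ad
```

In this toy data, milk comes out on top for a customer who only bought diapers, because three customers bought the two together. A real system would likely normalize the counts (popular items otherwise dominate), but the source does not say which normalization, if any, was used.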

Algorithm 2: Bidding rules for ad space


This algorithm decided how to weight competing bids. Looking only at the dollar value of a bid was not an optimal approach, because I also wanted to optimize for the end customer’s experience and their lifetime value to the retailer. My solution was to create a score that incorporated multiple factors:

  • Likelihood of customers to purchase the product
  • Net margin of the retailer on the product
  • Size of the bid
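One way such a score could combine the three factors above is shown below. The exact formula used in the project is not given here, so this is only an assumed form: expected margin (purchase likelihood times net margin) plus the bid amount, each with a tunable weight. The `Bid` class, the weights and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    brand: str
    bid_amount: float            # what the brand offers for the ad slot
    purchase_probability: float  # model's estimate that the customer buys
    net_margin: float            # retailer's net margin on the product


def bid_score(bid, margin_weight=1.0, bid_weight=1.0):
    """Assumed scoring rule: weighted expected margin plus weighted bid."""
    expected_margin = bid.purchase_probability * bid.net_margin
    return margin_weight * expected_margin + bid_weight * bid.bid_amount


def pick_winner(bids):
    """Return the bid with the highest score."""
    return max(bids, key=bid_score)


# Made-up example: the lower bid wins because the customer is far more
# likely to buy that product.
bids = [
    Bid("brand_a", bid_amount=0.50, purchase_probability=0.02, net_margin=4.0),
    Bid("brand_b", bid_amount=0.40, purchase_probability=0.10, net_margin=3.0),
]
winner = pick_winner(bids)
print(winner.brand)
```

The point of the sketch is that the highest bid does not automatically win: a more relevant product can outrank it once purchase likelihood and margin are factored in.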


Results


The algorithms I created were incorporated into the platform by month 4 and deployed on AWS SageMaker as endpoints. We ran A/B tests for about 2 months to assess how they performed. The results showed:

  • Estimated 47% increase in conversion rates compared with static banner ads
  • Estimated 15% increase in time spent on the website, as shoppers have a more engaging experience due to personalized advertisements and products
  • Retailers can maximize the value of their ad space and generate new revenue (numerical results not available as the retailer did not share numbers)
  • Brands have a unique opportunity to influence shoppers with a high intent to buy. This allows brands to increase sales and reduce their advertising budget (numerical results not available as the brands did not share numbers)

The model predictions, along with the underlying traffic data, were also recorded in a MongoDB database, and I set up automated periodic retraining procedures to keep the model relevant (with fallbacks and checks in case model performance were to deteriorate).
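The fallback idea can be illustrated with a small guard that only promotes a retrained model if it does not perform noticeably worse than the one currently in use. This is a generic sketch, not the actual procedure: the `choose_model` function, the `evaluate` callback, the tolerance and the scores are all hypothetical.

```python
def choose_model(current_model, candidate_model, evaluate, tolerance=0.01):
    """Promote the retrained candidate unless it degrades the metric.

    `evaluate` is any callable returning a higher-is-better score for a
    model on held-out data (an assumption made for this sketch).
    """
    current_score = evaluate(current_model)
    candidate_score = evaluate(candidate_model)
    if candidate_score >= current_score - tolerance:
        return candidate_model
    # Fallback: keep serving the existing model.
    return current_model


# Stand-in scores; in practice these would come from a validation set.
scores = {"current": 0.80, "good_candidate": 0.82, "bad_candidate": 0.60}

promoted = choose_model("current", "good_candidate", scores.get)
kept = choose_model("current", "bad_candidate", scores.get)
print(promoted, kept)
```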


This simple infrastructure then allowed FusePump to build its internal data science capabilities and keep improving the algorithms. Eventually FusePump was acquired by advertising giant WPP, which took ownership of the platform.