The science of understanding intent
As the world becomes increasingly digital, the importance of brand recommendations has skyrocketed. The ability to provide personalized recommendations to customers can mean the difference between a one-time purchase and a lifelong relationship. To achieve this level of personalization, companies are turning to advanced natural language processing (NLP) techniques to gain a better understanding of user search intent.
Our brand recommendation models now use two of these techniques: word embeddings and principal component analysis (PCA). By analyzing the keywords associated with user searches, the models gain a better understanding of user intent and preferences. In this blog post, we’ll dive deeper into the techniques we have used and explore how they work together to deliver personalized brand recommendations.
A keyword is a word or phrase in web content that is related to a campaign or brand and that lets a user find a destination to navigate to via a search engine. The keyword gives insight into what the user is searching for, which helps our ML models learn patterns about the context of the user's click-in and better predict which brand the visitor will be most interested in. This results in a more personalized brand recommendation.
Before keywords can be used in our machine learning (ML) models, they need to be transformed into numbers.
Before that, however, we have to preprocess them.
Preprocessing the data is the first step in building any ML model. It involves:
- Word tokenization, which consists of splitting sentences into words or tokens that can be more easily assigned meaning.
- Word cleaning, which includes removing non-alphabetic tokens such as punctuation, as well as stopwords.
Stopwords are commonly used words in a language that do not add much meaning to a sentence. Examples of stopwords in English are “a,” “the,” “is,” and “are.” These words are usually removed so that the model can focus on the words that carry meaning.
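The two preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the function name `preprocess_keyword` and the tiny stopword list are assumptions made for the example, and real stopword lists are much longer.

```python
import re

# Tiny illustrative stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "is", "are", "for", "in", "of", "to", "and"}

def preprocess_keyword(keyword):
    """Tokenize a keyword, then drop non-alphabetic tokens and stopwords."""
    # Tokenization: runs of word characters become tokens, and each
    # punctuation mark becomes its own token.
    tokens = re.findall(r"\w+|[^\w\s]", keyword.lower())
    # Cleaning: keep purely alphabetic tokens that are not stopwords.
    return [tok for tok in tokens if tok.isalpha() and tok not in STOPWORDS]

cleaned = preprocess_keyword("The best running shoes for men, 2023!")
print(cleaned)
```

Here the punctuation, the number and the stopwords “the” and “for” are dropped, leaving only the tokens that describe what the user is looking for.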
With the keywords now tokenized and cleaned, the next step is to transform them into a numerical format that can be used by our ML models.
This is done using a technique called word embedding, which maps each word to a high-dimensional vector of numbers. The vectors are learned so that words with similar meanings have similar representations.
The embedding technique we use is word2vec, which trains a shallow neural network to predict the context in which each word appears in a given text.
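To make "predicting the context" concrete, here is a small sketch of how training pairs are formed in the skip-gram variant of word2vec, where the network learns to predict the neighbors of each word. The post does not say which word2vec variant or window size we use, so both the variant and the `window` parameter here are assumptions for illustration, and `skipgram_pairs` is a hypothetical helper name.

```python
def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs for words within `window` positions."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs

# With window=1, each word is paired only with its immediate neighbors.
for target, context in skipgram_pairs(["best", "running", "shoes", "men"], window=1):
    print(target, "->", context)
```

Words that keep appearing with the same neighbors end up with similar vectors, which is what makes the embeddings useful as a representation of meaning.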
The result of this training is a set of high-dimensional vectors, one per word. Each vector captures the semantic and syntactic properties of a word in relation to the words it appears with. To form a keyword vector, we concatenate the vectors of all the words in the keyword, resulting in a single vector that represents the entire keyword.
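The concatenation step can be sketched as follows. The embedding values below are made up for the example and stand in for trained word2vec output. Because concatenation gives a vector whose length depends on the number of words, this sketch pads short keywords with zeros and truncates long ones to a fixed `MAX_TOKENS`; that fixed-length handling, the zero vector for unknown words, and all names and sizes here are assumptions, not a description of our production pipeline.

```python
import numpy as np

EMBEDDING_DIM = 4  # hypothetical; real embeddings are much larger
MAX_TOKENS = 3     # hypothetical cap on words per keyword

# Made-up vectors standing in for trained word2vec embeddings.
embeddings = {
    "best": np.array([0.1, 0.3, -0.2, 0.5]),
    "running": np.array([0.2, -0.1, 0.7, 0.4]),
    "shoes": np.array([0.6, 0.0, 0.3, -0.3]),
}

def keyword_vector(tokens, embeddings, dim=EMBEDDING_DIM, max_tokens=MAX_TOKENS):
    """Concatenate word vectors into one fixed-length keyword vector."""
    # Unknown words map to a zero vector (an assumption of this sketch).
    vectors = [embeddings.get(tok, np.zeros(dim)) for tok in tokens[:max_tokens]]
    # Pad with zero vectors so every keyword has the same length.
    vectors += [np.zeros(dim)] * (max_tokens - len(vectors))
    return np.concatenate(vectors)

vec = keyword_vector(["running", "shoes"], embeddings)
print(vec.shape)
```

The output length is `EMBEDDING_DIM * MAX_TOKENS`, so it grows quickly with the embedding size, which is why a dimensionality reduction step follows.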
With the keywords now represented as high-dimensional vectors, the next challenge is to reduce their dimensionality so that the data can be processed more efficiently by our ML models. This is where principal component analysis (PCA) comes in.
PCA is a technique that can be used to transform high-dimensional data into a lower-dimensional representation while retaining as much of the original information as possible. In our models, PCA is used to reduce the initial dimensionality of the keyword vectors so that they can be processed more efficiently.
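As a rough illustration of what PCA does, the sketch below centers the data, finds the directions of greatest variance with a singular value decomposition, and projects the data onto the first few of them. The data is random, and the sizes and the helper name `pca_reduce` are assumptions for the example; in practice a library implementation such as scikit-learn's `PCA` would typically be used instead.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto their top principal components."""
    # Center each feature so the components describe variance around the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Share of the total variance retained by the kept components.
    explained = (S[:n_components] ** 2).sum() / (S ** 2).sum()
    return X_centered @ components.T, explained

# Random stand-in for 100 keyword vectors of length 12 (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 12))
reduced, explained = pca_reduce(X, n_components=3)
print(reduced.shape, round(float(explained), 3))
```

The retained-variance ratio is a useful guide when choosing the number of components: the closer it is to 1, the less information the reduction discards.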
With the keywords transformed into word embeddings and reduced to a lower-dimensional representation using PCA, the final step is to feed them into our AI recommendation models.
The models are then trained to recommend the most relevant brands to the user based on, among other signals, their search query.
Advanced NLP techniques play a central role in delivering personalized brand recommendations. By taking into account the “intent” of the user through the keyword associated with their search, we gain a better understanding of user preferences and can deliver recommendations that are tailored to each user’s needs.
Integrating the visitor’s search intent into our machine learning models has been a significant milestone for our brand recommendation system. The search keyword is a good proxy for a visitor’s intent, which allows us to customize our recommendations to the visitor’s needs.