Zappos lead data scientist on the challenges of using semantic search

Online retailer Zappos has never been shy about adopting new technologies to improve its business. And its search bar is no exception: Over the past two years, the company has been overhauling its search algorithms using machine learning.

At VentureBeat’s Transform 2019 conference last week in San Francisco, Zappos lead data scientist Ameen Kazerouni talked about how his team built semantic search into the website. Unlike traditional search, which matches results purely on the words you type, semantic search tries to understand the context and intent behind those terms.

The problem with traditional search is that it can return a pile of irrelevant items that the customer has to sift through (which might also convince them to just leave the site). Kazerouni brought up the phrase “classic short” as an example. Most search bars, he said, will just show you different pairs of shorts if you type that in. But classic short actually refers to a certain type of boot.
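To make the failure concrete, here is a minimal sketch of purely word-based matching. It is not Zappos' code: the four-item catalog, the crude plural stripping, and the function names are all assumptions made up for illustration. The matcher only counts shared words, so it has no way to tell the boot apart from the shorts or the tee.

```python
import re

# Hypothetical mini-catalog; product names and types are invented for this sketch.
CATALOG = [
    {"name": "Classic Short Boot", "type": "boots"},
    {"name": "Classic Fit Chino Shorts", "type": "shorts"},
    {"name": "Short Sleeve Classic Tee", "type": "shirts"},
    {"name": "Trail Running Shoe", "type": "shoes"},
]

def tokens(text):
    """Lowercase, split on non-letters, and strip a plural 's' (very crude stemming)."""
    words = re.findall(r"[a-z]+", text.lower())
    return {
        w[:-1] if w.endswith("s") and not w.endswith("ss") and len(w) > 3 else w
        for w in words
    }

def lexical_search(query, catalog=CATALOG):
    """Rank products by how many query words appear in the product name."""
    q = tokens(query)
    scored = [(len(q & tokens(item["name"])), item) for item in catalog]
    hits = [(score, item) for score, item in scored if score > 0]
    # sort() is stable, so ties keep their catalog order
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in hits]

results = lexical_search("classic short")
for item in results:
    print(item["type"], "-", item["name"])
```

Every product whose name contains both words ties for first place, so the shopper who wanted the boot still has to filter out the shorts and the shirt by hand.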

With semantic search, websites can determine what people are really looking for and avoid those misunderstandings.

“At Zappos, we’ve actually taken it a step further and made the decision that not only is there a contextual meaning behind a search term, [but] that the contextual meaning changes on a per customer basis, as well,” Kazerouni said. “So for the millions of unique search terms — in the millions of unique customers — we actually try our best to serve individually unique search results. And I stress the word individually because it’s been a nightmare engineering problem.

“But we are at a point where it’s not collaborative filtering, it’s not segmentation; it is a one-to-one understanding of the individual and the term that they applied.”

Working around the English language

It hasn’t always been this way, however. According to Kazerouni, Zappos didn’t start using semantic search until 2017. Before then, his data science team wasn’t working on search at all. That responsibility fell to the company’s search team, which maintains the database of words in the search index. But the old lexical algorithm kept giving customers too many poor results when they searched for specific items like classic short or dress shoes.

The search team created manual redirects for these terms as they popped up (like telling the system to point to boots instead of shorts when someone searched for classic short), but it quickly got out of hand.

Above: Zappos had to overhaul its searching algorithm.

Image Credit: Zappos

“I think the search team realized they were playing a game of whack-a-mole. Because when you said, ‘Fix classic short to go to these boots,’ what that would do is then change the search index to be weighted more on the product name,” said Kazerouni. “So hiking shorts would not go to shorts; it would go to something else instead. And when dress shirts would go to dresses, we’d be like, ‘Well, dress here is more of an occasion and not a product type. So please fix that.’

“And then someone would type in ‘evening dress’, and it would not go to dresses anymore and go to shirts instead because dress would now be more important than dress shirt. So they realized that they were fixing one problem [while] creating seven other problems.”

Between late 2016 and early 2017, the search team approached Kazerouni and asked his data science crew for help. Part of the issue had to do with the language itself.

“We realized that English is a very funny language in the sense that many, many words are heavily overloaded. They have many different meanings depending on the phrase that they occur in,” said Kazerouni. “So the first thing we set out to do was understanding search terms, taking in search terms, and looking at customer behavior and building machine learning models that could create what are known as word embeddings.”

Word embeddings are mathematical representations of words and phrases, something that the search engine could use to predict the meaning behind customers’ terms. The first tests of Zappos’ new semantic search algorithm were positive, resulting in a significant increase in click-through rates and engagement on the website.
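As a rough illustration of the idea, the sketch below represents each phrase and each product category as a short list of numbers and picks the category whose vector points in the most similar direction (cosine similarity). The session summary does not describe Zappos' actual models, so everything here is an assumption: the three-dimensional vectors are hand-written rather than learned from customer behavior, and the names are hypothetical.

```python
import math

# Made-up vectors for illustration only. Real embeddings are learned from data
# and have far more dimensions; here the axes loosely mean (boots, shorts, dresses).
PHRASE_VECTORS = {
    "classic short": [0.9, 0.2, 0.0],
    "hiking shorts": [0.1, 0.9, 0.0],
    "evening dress": [0.0, 0.1, 0.9],
}
CATEGORY_VECTORS = {
    "boots": [1.0, 0.0, 0.0],
    "shorts": [0.0, 1.0, 0.0],
    "dresses": [0.0, 0.0, 1.0],
}

def cosine(a, b):
    """Cosine similarity: near 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest_category(phrase):
    """Return the category whose vector is most similar to the phrase's vector."""
    vec = PHRASE_VECTORS[phrase]
    return max(CATEGORY_VECTORS, key=lambda name: cosine(vec, CATEGORY_VECTORS[name]))

for phrase in PHRASE_VECTORS:
    print(phrase, "->", nearest_category(phrase))
```

Because the whole phrase gets its own vector, "classic short" can land near boots while "hiking shorts" lands near shorts, even though both contain a form of the word "short."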

“We were showing ROIs as a machine learning team, which I hear is not very common,” said Kazerouni, laughing. “So it was also fun to prove to our business stakeholders that we weren’t just a research team or PR stunt. We were actually going to provide value to the core business.”

Kazerouni noted that Zappos has since “evolved past” word embeddings and built neural networks to enhance its semantic search engine. It’s been a huge success for Zappos so far, leading to more searches and an increase in revenue. And some tech-savvy consumers are helping the company improve it: They’ve been chiming in with their experiences in feedback surveys, with some people specifically mentioning the algorithm.

“But I just love that the consumer is expecting machine learning-driven solutions and demanding an experience that is more intelligent in a way. That’s surprising to me. And I love it,” said Kazerouni.
