Understanding Products and Customers with AI: Q&A with Lily AI’s Matthew Nokleby

We asked Matthew Nokleby, Lily AI’s Tech Lead Manager, Computer Vision, for his expertise on questions surrounding retail AI, building ML models, and any advice he’d give to those pursuing careers in data science.

When you look at the modern e-commerce landscape, what are the biggest gaps that retailers have in terms of areas in which AI-powered solutions could make a big difference?

I think most retailers have the basics down when it comes to e-commerce artificial intelligence/machine learning (AI/ML): use purchase, search, and clickstream data to power recommendation systems and (text-based) search, and then leverage user history to personalize those systems. The standard algorithms for these systems work pretty well and deliver a lot of value, even before you start looking at the recent AI advances in areas like computer vision and natural language processing (NLP).

Assuming you’ve handled the basics, the next logical step is to use AI to enrich your search and recommendation systems. You can use computer vision and NLP to extract information from unstructured product data. Most retailers have product images, text descriptions, product reviews, etc., that have tons of latent information that tends to go unused. Modern image and text algorithms can extract detailed information about a product’s style, how customers are using (or wearing) it, and what they like and dislike about it. That data is raw material for delivering more relevant and better personalized recommendations and search results.

Beyond that, AI can power entirely new retail experiences. The possibilities here are huge, even if you look only at what’s possible with computer vision: shoppable images, visual product search, personalized item recommendations, etc. Each of these offers a new way for customers to interact with brands and products.

What new advances in image recognition technology have fueled the “digital transformation” of retail in recent years?

I’m not sure if it counts as “new” anymore, but when talking about image recognition it’s impossible to overstate the impact of the reemergence of convolutional neural networks and the success of deep learning frameworks like PyTorch and TensorFlow. These frameworks aren’t even a decade old, and already they’re used in all sorts of industries and in organizations of all sizes. It’s now possible even for small teams of data scientists to build models for classifying images or detecting objects in an image.

Specifically for retail and for Lily AI, I’ll mention two more recent advances: deep metric learning and self-supervised learning. In deep metric learning, we train a neural network to map ordinary images to a representation space that captures visual similarity. If two images are visually similar—in a sense induced by the training data—they map to nearby points in the representation space. This is useful for retail applications like visual search: snap a photo of a blouse you like, and we’ll find similar blouses in a retailer’s catalog. It’s also useful for fine-grained classification—meaning image classification where the classes are quite similar—which we use to capture detailed style attributes from retail product images.
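To make the visual-search idea concrete, here is a minimal sketch of the retrieval step. The embeddings below are made up by hand; in practice they would come from a network trained with deep metric learning, but the retrieval logic (nearest neighbors under cosine similarity in the representation space) is the same.

```python
import numpy as np

def cosine_similarity(query, matrix):
    """Cosine similarity between a query vector and each row of a matrix."""
    query = query / np.linalg.norm(query)
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix @ query

# Hypothetical embeddings from a metric-learning model: visually similar
# items map to nearby points in the representation space.
catalog = {
    "floral blouse":  np.array([0.9, 0.1, 0.0]),
    "striped blouse": np.array([0.8, 0.3, 0.1]),
    "denim jacket":   np.array([0.0, 0.2, 0.9]),
}
query = np.array([0.85, 0.15, 0.05])  # embedding of the shopper's photo

names = list(catalog)
sims = cosine_similarity(query, np.stack([catalog[n] for n in names]))
ranked = [names[i] for i in np.argsort(-sims)]
# ranked[0] is the catalog item most visually similar to the query photo
```

The same ranking machinery supports fine-grained classification: because similar styles cluster together in the embedding space, a nearest-neighbor lookup against labeled exemplars can separate classes that look nearly identical.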

The second advance is self-supervised learning. Usually we train image classifiers by so-called “supervised” learning: label a bunch of images with the correct classes, then train a deep network to predict those classes. But supervised learning often requires a lot of labeled images to train a good classifier, and getting labels can be time-consuming, especially for a brand-new project. In self-supervised learning, you only need a bunch of unlabeled data—which is usually easier to get—and you play some tricks with the data to cook up a simple supervised learning problem. If you “pre-train” a deep network using that self-supervised problem, you can dramatically reduce the number of training samples you need to solve the original supervised learning problem. At Lily, this is really helpful when we’re prototyping models in a new domain and don’t yet have a lot of labeled data. 
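One classic example of the “tricks” mentioned above is rotation prediction: rotate each unlabeled image by a random multiple of 90 degrees and ask the network to predict which rotation was applied. The label comes for free from the data itself. The sketch below builds that pretext dataset (the network training itself is omitted); it is an illustration of the general idea, not Lily AI’s specific pipeline.

```python
import numpy as np

def make_rotation_task(images, rng):
    """Turn unlabeled images into a 4-way classification problem:
    rotate each image by 0/90/180/270 degrees and use the rotation
    index as a free label (the rotation-prediction pretext task)."""
    rotated, labels = [], []
    for img in images:
        k = int(rng.integers(4))      # pick one of the four rotations
        rotated.append(np.rot90(img, k))
        labels.append(k)              # the label is derived from the data
    return rotated, labels

rng = np.random.default_rng(0)
unlabeled = [rng.random((8, 8)) for _ in range(100)]  # stand-in for product photos
x, y = make_rotation_task(unlabeled, rng)
# A network pre-trained to predict y from x learns general visual features,
# which are then fine-tuned on the (much smaller) labeled set.
```

The pre-training step is what buys the sample efficiency: the features learned from the pretext task transfer to the real classification problem, so far fewer labeled examples are needed downstream.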

What are some of the ways that you and your team are using AI to learn about the customer, and how is that then applied?

The primary way we understand our customers is through detailed stylistic analysis of the products they are looking at. Lily AI has built a platform of image and text classification models that can identify hundreds of detailed style attributes in product images and descriptions: fit, color, pattern, fabric, etc. These models allow us to understand a retailer’s catalog in fine detail.

Once we understand the stylistic details of a retailer’s products, it becomes easier to understand a customer’s personal preferences. By looking at the style attributes of the products a customer has viewed in the current session, we can recommend products that are tailored to their intent.
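As a toy illustration of that idea, the snippet below scores catalog items by how strongly their style attributes overlap with the attributes of items viewed in the current session. The attribute tags and scoring rule here are invented for the example; a production recommender would be far more sophisticated, but the principle of matching attribute-level intent is the same.

```python
from collections import Counter

# Hypothetical attribute tags of the kind the classification models emit.
catalog = {
    "wrap dress":  {"floral", "midi", "v-neck"},
    "slip dress":  {"satin", "midi", "cowl-neck"},
    "cargo pants": {"khaki", "relaxed-fit"},
}
# Attributes of the items the customer has viewed this session.
session_views = [{"floral", "midi"}, {"midi", "v-neck"}]

# Count how often each attribute appears in the session: a simple
# stand-in for the customer's current intent.
intent = Counter(attr for attrs in session_views for attr in attrs)

# Score each catalog item by its overlap with that intent.
scores = {name: sum(intent[a] for a in attrs) for name, attrs in catalog.items()}
best = max(scores, key=scores.get)
# best is the item whose attributes best match what the shopper is browsing
```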

Can you give us some of the biggest challenges that the data science team routinely runs into as you build and train models, and how you overcome them?

The biggest is probably data heterogeneity. Every retailer’s product data has a different distribution—which just means that their product images look a little different and their text descriptions have a different style. Deep networks trained on one data distribution often perform poorly on a different distribution, so we need to design models that are both robust to distribution shifts and can adapt quickly to a new distribution.

We have a few different techniques to deal with this problem, including the self-supervised approaches mentioned earlier. When we get a new retailer’s image and text data, we can fine-tune our models via self-supervision to adapt to the new data distribution. This gives us good performance on unseen data, which improves even further once we curate a set of labeled training samples.

Since ML models tend to be domain-specific, are there ways to achieve efficiency gains in terms of building models for newer verticals/domains that we may expand into?

Self-supervision is valuable here, too, as it tends to be easier to get unlabeled data in a new vertical. But in this case fancy ML techniques can only get you so far, and at some point you need accurate labels for your data. That requires domain expertise.

Fortunately at Lily AI we have a talented Domain Experts team. Especially when we’re prototyping models in a new domain, they help us get accurate labels for our data and ensure that our model output matches their expert knowledge. It makes for a great partnership!

What advice would you give to someone wanting to pursue a career in data science, and what resources have helped you in your career?

My strongest advice is to get your hands dirty with real-world data and problems as soon as possible. Venues like Kaggle have their place for ML self-study, but it’s increasingly difficult to get even entry-level positions without a relevant college degree and/or real-world experience.

If you’re at a large company with a data science division, consider asking around about mentorship programs. We had such a program in one of my previous roles, and I mentored software engineers and business analysts who wanted to learn more about machine learning. We set up a self-study curriculum and carved out small—but real!—ML problems to work on. If there isn’t a formal mentorship program, you can still look for a mentor or, better yet, see if you can help set one up!

Think Like Your Customer

Would you like to talk with a Lily AI specialist about how your brand can dramatically improve site search, personalized product discovery, recommendations, and demand forecasting?