Dr. Ella Hilal, head of data science at Shopify, showcased how the e-commerce giant applies machine learning for anomaly detection and forecasting.
The e-commerce platform Shopify has successfully leveraged machine learning through a methodical, five-step strategy. Below we’ll examine each step and detail how it has helped the company thrive.
Shopify’s growth is unparalleled in the e-commerce industry. Taken together, the platform’s merchants would rank as the 7th largest company in the world by revenue. That puts them above giants like Apple and Volkswagen.
The company has been at the forefront of ML adoption for anomaly detection and forecasting. That’s particularly impressive when considering that up to 85% of machine learning projects ultimately fail to deliver on initial promises and goals.
At the Cognilytica “Data for the AI Community” seminar, Dr. Ella Hilal, head of data science at Shopify, spoke about how the e-commerce titan has scaled so effectively using machine learning (ML). Let’s take a look.
The fundamental step of leveraging ML is simple enough: Gain a top-level understanding of your forecasting needs.
Dr. Hilal suggests a few questions business leaders can ask to understand their forecasting needs:
Keep in mind that this holistic approach applies at every step of building a machine learning model, not just the first.
Dr. Hilal framed it clearly: “What data sources are available and what are their properties?”
Understanding the properties of your data sets is a key step to forecasting. Ask yourself foundational questions, such as whether data is univariate or multivariate. It’s also important to detect and eliminate non-stationary behavior in your time-series data.
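To make the point about non-stationary behavior concrete, here is a minimal sketch of first differencing, one common way to remove a trend from a time series. The `orders` data and the helper names are hypothetical, and the half-mean comparison is only a crude illustration. A formal check such as the augmented Dickey-Fuller test (available as `adfuller` in statsmodels) would normally be used instead. Nothing here is claimed to be Shopify's own preprocessing.

```python
from statistics import mean


def difference(series, lag=1):
    """Return the lag-differenced series: y[t] - y[t - lag]."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def half_mean_shift(series):
    """Crude drift check: second-half mean minus first-half mean."""
    mid = len(series) // 2
    return mean(series[mid:]) - mean(series[:mid])


# Hypothetical daily order counts with a steady upward trend.
orders = [100 + 5 * day for day in range(20)]

# The raw series drifts upward, so its two halves have very different means.
trend_shift = half_mean_shift(orders)

# After differencing, only the constant day-over-day change remains.
flat_shift = half_mean_shift(difference(orders))
```

For this synthetic series the raw level shifts noticeably between halves, while the differenced series shows no shift at all, which is the behavior a model that assumes stationarity needs.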
“Nobody ever regrets spending the first few cycles of their effort toward a big model that informs big decisions by double-clicking on what data they have.”
In other words: It’s in your best interest to stay curious. It’s just as important to know which data sources are not available as which ones are.
Understanding your data requires a top-level view of not only your operations but the market you operate in. This foundational knowledge is how data scientists can parse out anomalies and the specific reasons behind them.
This is where the actionable steps behind forecasting begin. Dr. Hilal suggested several key points that forecasting models must have:
Shopify used Facebook Prophet, an open-source additive regression model. They specifically chose this model due to its scalability and its ability to generate forecasts quickly over millions of data points.
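An additive regression model treats each observation as a sum of components, typically a trend plus one or more seasonal terms. The sketch below is a toy illustration of that idea in plain Python, not Prophet itself and not Shopify's pipeline. The `sales` series, the weekly pattern, and all function names are assumptions made for the example, and fitting trend and seasonality in two separate passes is a simplification.

```python
from statistics import mean


def fit_linear_trend(values):
    """Ordinary least squares fit of values against time index 0..n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = mean(values)
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    intercept = y_mean - slope * t_mean
    return intercept, slope


def fit_seasonality(residuals, period):
    """Average detrended value at each position in the cycle."""
    return [mean(residuals[pos::period]) for pos in range(period)]


def forecast(values, period, horizon):
    """Forecast = trend(t) + seasonal(t mod period), the additive form."""
    intercept, slope = fit_linear_trend(values)
    residuals = [y - (intercept + slope * t) for t, y in enumerate(values)]
    seasonal = fit_seasonality(residuals, period)
    n = len(values)
    return [intercept + slope * t + seasonal[t % period] for t in range(n, n + horizon)]


# Hypothetical sales: linear growth plus a bump on the first and last day of each week.
weekly_pattern = [10, 0, 0, 0, 0, 0, 10]
sales = [200 + 2 * t + weekly_pattern[t % 7] for t in range(28)]

next_week = forecast(sales, period=7, horizon=7)
```

On this synthetic series the forecast continues the upward trend and repeats the weekly bump. The real Prophet library follows a similar workflow at much larger scale: a `Prophet()` model is fit on a DataFrame with `ds` and `y` columns, and `make_future_dataframe` plus `predict` produce the forecast.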
Another key decision to make when selecting the right forecasting model is whether to take a top-down or bottom-up approach. Top-down approaches look at top-line metrics first and analyze the drivers behind them, while bottom-up approaches analyze and track drivers first. Other questions to consider when selecting the proper model include:
“Anomalies are not all bad,” said Dr. Hilal. In fact, if an anomaly can be effectively explained and perhaps even recurs, data scientists may want to amplify it in order to learn from it.
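One simple way to surface anomalies for that kind of review is to compare actual values against a forecast and flag unusually large residuals. The sketch below uses a robust z-score based on the median and the median absolute deviation (MAD). The 0.6745 scaling constant and the 3.5 cutoff are a common convention, not something stated in the talk, and the data and function name are hypothetical.

```python
from statistics import median


def flag_anomalies(actuals, forecasts, threshold=3.5):
    """Return (index, score) pairs where the forecast residual is unusually large.

    The score is a robust z-score: 0.6745 * (residual - median) / MAD.
    A positive score means the actual value came in above the forecast.
    """
    residuals = [a - f for a, f in zip(actuals, forecasts)]
    center = median(residuals)
    mad = median([abs(r - center) for r in residuals])
    if mad == 0:
        # No measurable spread, so this sketch cannot score the residuals.
        return []
    flagged = []
    for i, r in enumerate(residuals):
        score = 0.6745 * (r - center) / mad
        if abs(score) > threshold:
            flagged.append((i, score))
    return flagged


# Hypothetical daily sales against a flat forecast of 100, with one spike.
actuals = [102, 98, 101, 99, 100, 160, 97, 103]
forecasts = [100] * 8

flagged = flag_anomalies(actuals, forecasts)
```

Keeping the sign of the score matters here: a large positive residual may be a welcome spike worth understanding and repeating, while a large negative one may point to a problem.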
A large part of managing your anomaly bias is studying your data across time and analyzing which timeframes are indicative of future behavior. Shopify had a massive amount of data to select from, with very clear annual cycles in place. However, the team decided that the last three months were the most indicative of upcoming performance.
Dr. Hilal also mentioned that 2020 was clearly an anomalous year. However, businesses and data scientists should keep in mind that anomalies and unique trends will always exist. This is why Shopify took 2021’s unique trends into account as well.
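Restricting a model to a recent window, as described above, can be as simple as filtering observations by date before training. The sketch below assumes that "three months" means roughly 90 days and that data arrives as `(date, value)` pairs. Both are assumptions for illustration, as are the names used.

```python
from datetime import date, timedelta


def trailing_window(observations, as_of, days=90):
    """Keep (day, value) pairs from the `days` days up to and including as_of."""
    start = as_of - timedelta(days=days)
    return [(day, value) for day, value in observations if start < day <= as_of]


# Hypothetical year of daily observations.
first_day = date(2021, 1, 1)
observations = [(first_day + timedelta(days=i), 100 + i) for i in range(365)]

as_of = date(2021, 12, 31)
recent = trailing_window(observations, as_of)
```

Making the window length a parameter keeps the choice easy to revisit, since the most informative timeframe can change as market conditions do.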
Finally, just like any other ML model in production, forecasting is truly effective when automated.
This automation should address a few points:
Remember, you ultimately want your model to scale and to run with little intervention, and automation is what makes that kind of independent operation possible.
It’s clear that the process behind leveraging ML for forecasting and anomaly detection at scale starts with a top-level understanding of your operations, needs, and industry.
In each step, Dr. Hilal emphasized the importance of stepping back and understanding the “what,” “how,” and “when” behind each decision.