Examining User Heterogeneity in Digital Experiments
Author: Marshall V. King
Unlocking Insights through Digital Experiments
Three professors in the Mendoza College of Business at the University of Notre Dame are helping eBay better understand its customers.
Research by Sriram Somanchi, Ahmed Abbasi, and Ken Kelley explored how users of the ecommerce platform respond to what they see on a screen in front of them.
Digital experiments gather data on how customers or potential customers respond to changes, even slight ones such as font size. Historically, digital platforms conducted what was known as A/B testing: one randomly assigned set of customers was shown version A, such as a grid view of products, and another set was shown version B, such as a list. Business managers for the platform would then study the outcomes in terms of clicks and purchases and put whichever version did better into production.
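The A/B workflow described above can be sketched in a few lines of Python. This is a toy simulation with made-up conversion rates, not eBay's actual experiment pipeline: each user is randomly assigned to a group, converts or not, and the version with the higher average conversion rate "wins."

```python
import random

random.seed(42)

def run_ab_test(n_per_group, rate_a, rate_b):
    """Simulate a simple A/B test. Each user either converts (1) or
    doesn't (0); the underlying rates are hypothetical, for illustration.
    Returns the average conversion rate per group and the winner."""
    group_a = [1 if random.random() < rate_a else 0 for _ in range(n_per_group)]
    group_b = [1 if random.random() < rate_b else 0 for _ in range(n_per_group)]
    mean_a = sum(group_a) / n_per_group
    mean_b = sum(group_b) / n_per_group
    # The classic decision rule: ship whichever version averaged better.
    winner = "A (grid view)" if mean_a > mean_b else "B (list view)"
    return mean_a, mean_b, winner

mean_a, mean_b, winner = run_ab_test(10_000, 0.030, 0.033)
print(f"A: {mean_a:.4f}  B: {mean_b:.4f}  ship: {winner}")
```

Note that this rule looks only at the group averages; the limits of that approach are exactly what the researchers' work addresses below.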
Slight changes in colors, layout, and even fonts can meaningfully affect user engagement, and companies are constantly seeking guidance on how to generate sales. “There’s significant growth in the number of digital experiments at organizations,” said Sriram Somanchi, assistant professor of business analytics.
That’s where the trio of Notre Dame researchers, along with David Dobolyi of the University of Colorado, comes in. Together with Ted Tao Yuan from eBay, they published a paper in March 2023 titled “Examining User Heterogeneity in Digital Experiments,” exploring how digital testing can be improved with machine learning.
Enhancing Digital Experiments through Machine Learning
A/B testing outcomes are often averages: the test can tell you, for example, that group A purchased a higher amount or quantity, on average, than group B. That form of gathering data to drive decisions goes back to drug trials of the 1940s and 1950s that didn’t include large groups of people, said Ahmed Abbasi, the Joe and Jane Giovanini Professor of IT, Analytics, and Operations. He serves as Director of the Analytics Ph.D. program and Co-Director of the Human-centered Analytics Lab with Kelley, who is the Edward F. Sorin Society Professor of IT, Analytics, and Operations (ITAO) and the Senior Associate Dean for Faculty and Research in Mendoza. Kelley says that the importance of randomization cannot be overemphasized when one is interested in cause, as randomization ensures probabilistically that the A and B groups come from a single population and are not a priori different from one another. For example, randomization ensures that the expectation of the average effect is zero if there is, in fact, no effect.
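Kelley's point about randomization can be checked directly. The sketch below, using synthetic data rather than anything from the paper, estimates the Average Treatment Effect (ATE) as the difference in group means: when both groups are drawn from the same population with no true effect, the estimate lands near zero, as randomization guarantees in expectation.

```python
import math
import random

random.seed(0)

def average_treatment_effect(outcomes_a, outcomes_b):
    """Estimate the ATE as the difference in group means, along with
    a standard error so the estimate can be judged against chance."""
    n_a, n_b = len(outcomes_a), len(outcomes_b)
    mean_a = sum(outcomes_a) / n_a
    mean_b = sum(outcomes_b) / n_b
    var_a = sum((x - mean_a) ** 2 for x in outcomes_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in outcomes_b) / (n_b - 1)
    ate = mean_b - mean_a
    se = math.sqrt(var_a / n_a + var_b / n_b)
    return ate, se

# Both "treatment" and "control" drawn from the SAME population
# (hypothetical spend amounts), i.e., no true effect exists.
population = lambda: random.gauss(50.0, 10.0)
a = [population() for _ in range(5000)]
b = [population() for _ in range(5000)]
ate, se = average_treatment_effect(a, b)
print(f"ATE = {ate:.3f} +/- {1.96 * se:.3f}")
```

Because assignment is random, the estimated ATE stays within sampling noise of zero here, which is exactly the property that makes a nonzero ATE in a real experiment attributable to the treatment.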
Digital experiments at large e-commerce sites may involve up to half a billion users. Yet the dashboards used to monitor those experiments, while able to assess a broad range of behaviors, report only measures of the average.
“Instead of just an average measure, can that dashboard have an additional feature that shows the effects of users who have a lot more experience using the platform?” said Somanchi, who is also doing research looking at how multiple experiments affect each other.
Leveraging Machine Learning for Improved Insights
In 2019, researchers from Stanford University and top digital companies published a paper titled “Top Challenges for Online Experiments.” Two of its suggestions related to testing for user heterogeneity and highlighted the value of using state-of-the-art methods rather than relying on averages over random groupings.
The low-hanging fruit has been plucked from those thousands of experiments, and the remaining gains are much smaller, said Abbasi. A more nuanced understanding of customer behavior is needed.
Machine learning isn’t needed for a basic regression analysis of A/B groupings, said Somanchi. “Where machine learning comes in, and that’s where it’s popular now, is letting the machine tell us which subgroup has significant effect,” he said.
Their research explores how to progress from a single Average Treatment Effect (ATE) to Heterogeneous Treatment Effects (HTEs). Machines can identify patterns in ways humans can’t, and do it quickly, allowing a business manager or researcher to break the users in a digital experiment into subgroups.
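The contrast between an overall ATE and heterogeneous effects can be illustrated with a small synthetic example, echoing Somanchi's earlier question about experienced users. This is a toy model, not the paper's method or eBay's data: the treatment is given a hypothetical effect only for users with long tenure, and conditioning on that subgroup reveals an effect the overall average dilutes.

```python
import random

random.seed(1)

def simulate_user():
    """Hypothetical user: tenure in years, random 50/50 treatment
    assignment, and an outcome where the treatment effect (+5.0)
    exists only for experienced users (tenure > 5)."""
    tenure = random.uniform(0, 10)
    treated = random.random() < 0.5
    effect = 5.0 if (treated and tenure > 5) else 0.0
    outcome = 20.0 + effect + random.gauss(0, 3)
    return tenure, treated, outcome

users = [simulate_user() for _ in range(20_000)]

def conditional_ate(users, predicate):
    """ATE within the subgroup selected by predicate(tenure)."""
    treated = [y for t, d, y in users if predicate(t) and d]
    control = [y for t, d, y in users if predicate(t) and not d]
    return sum(treated) / len(treated) - sum(control) / len(control)

overall = conditional_ate(users, lambda t: True)
experienced = conditional_ate(users, lambda t: t > 5)
new_users = conditional_ate(users, lambda t: t <= 5)
print(f"overall ATE={overall:.2f}  tenure>5 ATE={experienced:.2f}  tenure<=5 ATE={new_users:.2f}")
```

Here the subgroup was chosen by hand; the machine learning methods the researchers describe automate this step, searching over user attributes to surface the subgroups whose effects genuinely differ.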
At this point, the methods are experimental and relatively new, but HTE detection methods are the future of digital experimentation, said Abbasi.