A/B Testing Archives - abtasty

A/B Testing: It’s Not Just About the Outcome

A/B testing is often seen as the magic bullet for improving e-commerce performance. Many believe that small tweaks—like changing the color of a “Buy Now” button—will significantly boost conversion rates. However, A/B testing is much more complex. 

Random changes without a well-thought-out plan often lead to neutral or even negative results, leaving you frustrated and wondering if your efforts were wasted. 

Success in A/B testing doesn’t have to be defined solely by immediate KPI improvements. Instead, by shifting your focus from short-term gains to long-term learnings, you can turn every test into a powerful tool for driving sustained business growth. 

This guest blog was written by Trevor Aneson, Vice President Customer Experience at 85Sixty.com, a leading digital agency specializing in data-driven marketing solutions, e-commerce optimization, and customer experience enhancement. In this blog, we’ll show you how to design A/B tests that consistently deliver value by uncovering the deeper insights that fuel continuous improvement. 

Rethinking A/B Testing: It’s Not Just About the Outcome 

Many people believe that an A/B test must directly improve core e-commerce KPIs like conversion rates, average order value (AOV), or revenue per visitor (RPV) to be considered successful. This is often due to a combination of several factors: 

1. Businesses face pressure to show immediate, tangible results, which shifts the focus toward quick wins rather than deeper learnings. 

2. Success is typically measured using straightforward metrics that are easy to quantify and communicate to stakeholders.

3. There is a widespread misunderstanding that A/B testing is a one-size-fits-all solution, which can lead to unrealistic expectations. 

However, this focus on short-term wins limits the potential of your A/B testing program. When a test fails to improve KPIs, you might be tempted to write it off as a failure and abandon further experimentation. However, this mindset can prevent you from discovering valuable insights about your users that could drive meaningful, long-term growth. 

A Shift in Perspective: Testing for Learnings, Not Just Outcomes 

To maximize the success and value of your A/B tests, it’s essential to shift from an outcome-focused approach to a learning-focused one. 

Think of A/B testing not just as a way to achieve immediate gains but as a tool for gathering insights that will fuel your business’s growth over the long term. 

The real power of A/B testing lies in the insights you gather about user behavior — insights that can inform decisions across your entire customer journey, from marketing campaigns to product design. When you test for learnings, every result — whether it moves your KPIs or not — provides you with actionable data to refine future strategies. 

Let’s take a closer look at how this shift can transform your testing approach. 

Outcome-Based Testing vs. Learning-Based Testing: A Practical Example 

Consider a simple A/B test aimed at increasing the click-through rate (CTR) of a red call-to-action (CTA) button on your website. Your analytics show that blue CTA buttons tend to perform better, so you decide to test a color change. 

Outcome-Based Approach 

Your hypothesis might look something like this: “If we change the CTA button color from red to blue, the CTR will increase because blue buttons typically receive more clicks.”

In this scenario, you’ll judge the success of the test based on two possible outcomes: 

1. Success: The blue button improves CTR, and you implement the change.

2. Failure: The blue button doesn’t improve CTR, and you abandon the test.

While this approach might give you a short-term boost in performance, it leaves you without any understanding of why the blue button worked (or didn’t). Was it really the color, or was it something else — like contrast with the background or user preferences — that drove the change? 

Learning-Based Approach 

Now let’s reframe this test with a focus on learnings. Instead of testing just two colors, you could test multiple button colors (e.g., red, blue, green, yellow) while also considering other factors like contrast with the page background. 

Your new hypothesis might be: “The visibility of the CTA button, influenced by its contrast with the background, affects the CTR. We hypothesize that buttons with higher contrast will perform better across the board.” 

By broadening the test, you’re not only testing for an immediate outcome but also gathering insights into how users respond to various visual elements. After running the test, you discover that buttons with higher contrast consistently perform better, regardless of color. 

This insight can then be applied to other areas of your site, such as text visibility, image placement, or product page design. 
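To make that kind of analysis concrete, here is a minimal sketch with hypothetical click data and contrast labels (none of these numbers are real results) showing how the learning-based view differs from the variant-by-variant view:

```python
# Hypothetical results from a multi-color CTA test.
# "contrast" is a label assigned when designing the variants.
variants = {
    "red":    {"contrast": "high", "visitors": 10_000, "clicks": 430},
    "blue":   {"contrast": "low",  "visitors": 10_000, "clicks": 310},
    "green":  {"contrast": "high", "visitors": 10_000, "clicks": 445},
    "yellow": {"contrast": "low",  "visitors": 10_000, "clicks": 298},
}

# CTR per individual variant (the outcome-based view).
for name, v in variants.items():
    print(f"{name:>6}: CTR = {v['clicks'] / v['visitors']:.2%}")

# CTR grouped by contrast level (the learning-based view).
for level in ("high", "low"):
    group = [v for v in variants.values() if v["contrast"] == level]
    ctr = sum(v["clicks"] for v in group) / sum(v["visitors"] for v in group)
    print(f"{level}-contrast buttons: CTR = {ctr:.2%}")
```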

Key Takeaway: 

A learning-focused approach reveals deeper insights that can be leveraged far beyond the original test scenario. This shift turns every test into a stepping stone for future improvements. 

How to Design Hypotheses That Deliver Valuable Learnings

Learning-focused A/B testing starts with designing better hypotheses. A well-crafted hypothesis doesn’t just predict an outcome—it seeks to understand the underlying reasons for user behavior and outlines how you’ll measure it. 

Here’s how to design hypotheses that lead to more valuable insights:

1. Set Clear, Learning-Focused Goals

Rather than aiming only for KPI improvements, set objectives that prioritize learning. For example, instead of merely trying to increase conversions, focus on understanding which elements of the checkout process create friction for users. 

By aligning your goals with broader business objectives, you ensure that every test contributes to long-term growth, not just immediate wins. 

2. Craft Hypotheses That Explore User Behavior 

A strong hypothesis is specific, measurable, and centered around understanding user behavior. Here’s a step-by-step guide to crafting one: 

Start with a Clear Objective: Define what you want to learn. For instance, “We want to understand which elements of the checkout process cause users to abandon their carts.” 

Identify the Variables: Determine the independent variable (what you change) and the dependent variable (what you measure). For example, the independent variable might be the number of form fields, while the dependent variable could be the checkout completion rate. 

Explain the Why: A learning-focused hypothesis should explore the “why” behind the user behavior. For example, “We hypothesize that removing fields with radio buttons will increase conversions because users find these fields confusing.” 
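One lightweight way to enforce that structure is to capture every hypothesis in the same record. The sketch below simply encodes the three elements above, with the checkout example filled in for illustration:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    objective: str             # what you want to learn
    independent_variable: str  # what you change
    dependent_variable: str    # what you measure
    rationale: str             # the "why" behind the expected behavior

checkout_test = Hypothesis(
    objective="Understand which checkout elements cause users to abandon their carts",
    independent_variable="Number of form fields in the checkout flow",
    dependent_variable="Checkout completion rate",
    rationale="Users find radio-button fields confusing, so removing them should reduce friction",
)
```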

3. Design Methodologies That Capture Deeper Insights 

A robust methodology is crucial for gathering reliable data and drawing meaningful conclusions. Here’s how to structure your tests:

Consider Multiple Variations: Testing multiple variations allows you to uncover broader insights. For instance, testing different combinations of form fields, layouts, or input types helps identify patterns in user behavior. 

Ensure Sufficient Sample Size & Duration: Use tools like an A/B test calculator to determine the sample size needed for statistical significance (see the sketch below for the underlying arithmetic). Run your test long enough to gather meaningful data but avoid cutting it short based on preliminary results. 

Track Secondary Metrics: Go beyond your primary KPIs. Measure secondary metrics, such as time on page, engagement, or bounce rates, to gain a fuller understanding of how users interact with your site. 
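As referenced above, the arithmetic behind sample-size calculators is standard. Below is a minimal sketch using the usual two-proportion formula; the baseline conversion rate, minimum detectable effect, significance level and power are assumed values for illustration, not numbers from this article:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variation(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Visitors needed per variation to detect a relative lift of `relative_mde`."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    pooled = (p1 + p2) / 2
    numerator = (z_alpha * (2 * pooled * (1 - pooled)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Example: 3% baseline conversion rate, aiming to detect a 10% relative lift.
print(sample_size_per_variation(0.03, 0.10))
```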

4. Apply Learnings Across the Customer Journey 

Once you’ve gathered insights from your tests, it’s time to apply them across your entire customer journey. This is where learning-focused testing truly shines: the insights you gain can inform decisions across all touchpoints, from marketing to product development. 

For example, if your tests reveal that users struggle with radio buttons during checkout, you can apply this insight to other forms across your site, such as email sign-ups, surveys, or account creation pages. By applying your learnings broadly, you unlock opportunities to optimize every aspect of the user experience. 

5. Establish a Feedback Loop 

Establish a feedback loop to ensure that these insights continuously inform your business strategy. Share your findings with cross-functional teams and regularly review how these insights can influence broader business objectives. This approach fosters a culture of experimentation and continuous improvement, where every department benefits from the insights gained through testing. 

Conclusion: Every Test is a Win 

When you shift your focus from short-term outcomes to long-term learnings, you transform your A/B testing program into a powerful engine for growth. Every test—whether it results in immediate KPI gains or not—offers valuable insights that drive future strategy and improvement. 

With AB Tasty’s platform, you can unlock the full potential of learning-focused testing. Our tools enable you to design tests that consistently deliver value, helping your business move toward sustainable, long-term success. 

Ready to get started? Explore how AB Tasty’s tools can help you unlock the full potential of your A/B testing efforts. Embrace the power of learning, and you’ll find that every test is a win for your business.

Hotel Chocolat at CX Circle: Sweetening Loyalty with Experimentation

Welcome to a world where chocolate isn’t just a treat but an experience—a world crafted by Hotel Chocolat, a British group with nearly 31 years of rich history. At the heart of their journey lies a realization: loyalty isn’t bought with discounts—it’s earned through authentic connections and shared values.

Recently, they shared this ethos at the CX Circle event by Contentsquare featuring Mel Parekh, Head of E-commerce at Hotel Chocolat. Mel took the stage to unravel the complexities of customer loyalty—a subject that has never been more critical in the fast-evolving world of eCommerce. The discussion centered around how Hotel Chocolat has navigated the challenges of a changing world while staying true to its brand values using the power of experimentation.

The Secret Ingredient: Authenticity and Quality

Hotel Chocolat stands out in the chocolate industry for its commitment to authenticity and quality. While most chocolate brands are content to source their cocoa from suppliers, Hotel Chocolat went all-in, growing their own on the lush Rabot Estate in Saint Lucia. This direct control over their supply chain ensures that they use only the highest quality ingredients while helping craft a brand that’s as genuine as the cocoa it cultivates. 

Hotel Chocolat has witnessed a constant change in the e-commerce landscape. They’ve learned to adapt to these changes while staying true to their brand identity. One of their key initiatives has been to clearly define who they are as a brand and to create compelling reasons for customers to return to their site time and time again.

A Changing Landscape

It’s no secret that the world of eCommerce is in constant flux. Prices are rising across the board—from raw materials to operating costs—and the competition for customers is fiercer than ever.  In this environment, retailers must do more with less, finding innovative ways to stand out. 

As customers increasingly engage with various digital platforms and experiences, the range of choices available to them has become almost overwhelming. In this crowded marketplace, standing out from the competition requires more than just eye-catching design elements.

Moreover, the explosion of data in recent years has made it possible for even smaller companies to leverage insights that were once only accessible to larger players. However, the real challenge lies in capturing this data, interpreting it effectively, and, most importantly, implementing it in ways that drive meaningful results. Hotel Chocolat has embraced this data-driven approach, using insights to refine their strategies and create a more personalized experience for their customers with both Contentsquare and AB Tasty.

Building Lasting Relationships with Customers in a Phygital World

Loyalty is the cornerstone of Hotel Chocolat’s strategy in this new era. As a premium brand, they understand that their customers aren’t just looking for a product; they’re looking for an experience that resonates with their values and desires. This understanding has led Hotel Chocolat to focus on building a brand that not only meets customer expectations but exceeds them by offering a unique, personalized experience.

One of the key strategies they’ve implemented is their “phygital” approach, which blends the digital and physical worlds to create a more personalized, engaging shopping experience. This approach is centered on three key principles:

  • Instant: Reducing delay or lag to ensure a smooth customer experience.
  • Connected: Creating a more personal connection with each customer.
  • Engaging: Giving customers a sense of control over their shopping journey.

 Make the Experience Personal

With over 120 different chocolate recipes, Hotel Chocolat faced this challenge: how do you help customers find the perfect product without overwhelming them? Their solution was gamification—a method that makes the shopping experience more fun and interactive. In Spring 2023, they launched the “Chocolate Love Match,” a quiz that matches customers to one of six flavor profiles. This not only narrows down the selection from 120 options to 20 or 30, making it easier to shop but also helps customers find the perfect gift for friends and family based on their flavor preferences.

The personalization doesn’t stop there. 

Hotel Chocolat also leverages machine learning and tools like AB Tasty to improve their customer experience further. For instance, they’ve been experimenting with “Add to Bag” personalized recommendations. This initiative is crucial, especially as acquisition costs rise, making it more important than ever to maximize the value of each customer interaction.

Using AB Tasty, they tested two variations: one that showed products frequently bought together and another that displayed recently viewed items for easy access. Both approaches tested positively, resulting in a 5.31% increase in average order value and a 2.87% boost in revenue. 

Embracing Data for Optimization

Hotel Chocolat has also focused on optimizing its digital presence, particularly their website. Working with AB Tasty, they undertook a redesign of their homepage, recognizing that the layout and user experience across devices play a critical role in customer engagement. The goal was to create a more visually appealing and intuitive experience that could better connect with customers online—especially when you can’t taste or smell the products.

The results speak volumes. By optimizing the homepage, they saw a 10% reduction in bounce rate, a 1.67% increase in visiting time, and significant improvements in conversion rates—up 0.54% overall and a substantial 7.24% on desktop. This uplift was largely due to better highlighting the most attractive elements on the homepage, such as category tiles that drive higher conversion and revenue.

Loyalty from a Brand Perspective

Mel Parekh left us with three takeaways for building a brand that stands the test of time:

  1. Embracing Change: It shows that your brand is up-to-date and ready to adapt. Staying agile ensures that your brand remains relevant and continues to serve your customers, no matter the circumstances.
  2. Listening and Understanding Customers: If loyal customers aren’t heard and understood, they’ll lose their preference for your brand and start considering others.
  3. Sticking to Your Values: Clearly reward loyal customers for their loyalty, and make sure to differentiate between who is loyal and who isn’t.

Conclusion

Loyalty isn’t just about offering a great product; it’s about creating connections that resonate. Hotel Chocolat has perfected this recipe by blending their commitment to quality with a data-centric culture. Experimentation and data from AB Tasty have allowed them to improve in all areas – whether that is personalization, gamification of their loyalty scheme, or the link between their online and physical shops. Experimentation has done more than improve their CRO; it has helped define who they are and what they stand for.

Find out more in Mel’s talk below:

Hotel Chocolat at CX Circle

From Clicks to Connections: How AI is Shaping the Future of Digital Optimization

Any marketer will tell you that Digital Optimization is crucial to ensure successful e-commerce operations and yield the best possible return on investment (ROI). This practice includes both A/B testing and website personalization: every website presents a unique set of features and designs, which must, in turn, be optimized through A/B testing. 

Building a great website is, unfortunately, not simply a matter of following best practices. Even within a single industry, users will hold varied expectations based on your brand, communication style, target audience, funnel, etc. And while browsing the same website, users’ expectations can vary, with some knowing exactly what they want and others needing to explore, check your returns policy, learn about your sustainability initiatives, and so on.

We have all heard the hype about how AI has been revolutionizing how marketers approach experimentation. Generative AI offers new opportunities for optimizing every aspect of the user journey, allowing marketers to:

  • streamline testing,
  • create new online experiences, 
  • and create new types of user segments for more precise personalized experiences that drive conversions.

This guest blog post was written by Rodolphe Dougoud, Project Lead at fifty-five—a leading data company that helps brands harness the potential of Generative AI and mitigate associated risks effectively with a comprehensive and pragmatic AI strategy, among other services. 

Below, we’ll explore these three perspectives in depth, with real-life examples gleaned from AB Tasty’s new algorithm, Emotions AI, and fifty-five’s work with its clients around GenAI. 

AI in Action for Experiences that Matter

  1. Streamline testing

When thinking about A/B testing, you might immediately picture creating an experiment and launching it live on a website. However, the most time-consuming phases of the A/B testing process generally come before and after that: finding new features to try out in order to create a testing roadmap and analyzing the results of these tests. Here, AI can increase test velocity by helping to reduce bottlenecks hindering both of the aforementioned stages.

Test ideation

Your roadmap must not only be top-down but also bottom-up: pay close attention to insights from your UX designers, based on benchmarks from your competitors and industry trends, and data-driven insights based on your own analytics data. Here, AI can facilitate the process by analyzing large datasets (e.g., on-site Analytics data) to find insights humans might have missed.

Result analysis

Similarly, it’s essential to analyze the results of your tests thoroughly. Looking at one KPI can sometimes be enough, but it often represents only one part of a bigger story. An aptly-calibrated AI model can find hidden insights within your testing results.

While we generally know what data we want to access, the actual querying of that data can be time-consuming. Applying a GenAI model to your dataset can also allow you to query your data in natural language, letting the model pull the data for you, run the query, and create instant visualizations for major time gains.

Content creation

While not necessary for most tests, creating new content to be included in the testing phase can take a long time and impact your roadmap. While GenAI cannot produce the same quality of content as your UX team, a UX designer equipped with a GenAI tool can create more content faster. The model used can be trained with your design chart to ensure it integrates with the rest of your content. Overall, adding a GenAI tool as a complement to your design arsenal can yield substantial gains in productivity and, therefore, reinforce your testing roadmap timeline.

  2. Create new online experiences

Marketers should not hesitate to experiment with AI to create unique and interactive experiences. Generative AI can create personalized content and recommendations that can engage users more effectively. 

Consider, for instance, fifty-five’s recent work with Chronodrive, a grocery shopping and delivery app. We used AI to address a common user challenge (and, frankly, near-universal issue): deciding what to make for dinner. 

With our innovative solution, taking a picture of the inside of your fridge will allow the app to create a recipe based on the ingredients it identifies, while a photo of a dish – taken at a restaurant or even downloaded from social media – will generate a recipe for said dish and its associated shopping list. 

Artificial Intelligence opens new creative options that weren’t available with previous LLMs. Chronodrive’s solution may not be applicable to most companies, but every business can think back on their typical user’s pain points and conceptualize how GenAI could help ease them.

  3. Create new types of user segments for more precise personalized experiences

When a customer enters a store, a salesperson can instantly personalize their experience by checking if they want to be helped or just want to browse, if they are discovering the brand or are already sold on it, if they require guidance or know precisely what they want… A website, on the other hand, necessitates extra effort to present the user with a similarly personalized experience. 

Online, segmentation thus becomes indispensable to deliver the most satisfying user experience possible. Even during testing phases, deploying A/B tests on user segments makes achieving significant results more likely, as increased precision helps mitigate the risk of obtaining neutral results.

AI can analyze a wide array of user interactions on a given website to determine which elements drive the most conversions, or how different users respond to specific stimuli. This analysis can allow brands to classify users into new segments that could not have been available otherwise. For instance, fifty-five applied AI to split Shiseido’s website users between low and high-lifetime value segments. This allowed Shiseido to run differentiated A/B tests and personalize their website depending on the expected lifetime value of the user, resulting in a 12.6% increase in conversions.

Going even further, what if AI could read your emotions? AB Tasty’s new AI algorithm, Emotions AI, can automatically segment your audience into 10 categories based on emotional needs. 

  • If a user needs to be reassured, the website can emphasize its free return policy
  • If they need clarity, the website can highlight factual information about your product
  • And if they need immediacy, the website can hide any unnecessary information to instead focus on its main CTAs

The model estimates the needs of the user by taking into consideration all of their interactions with the website: how long they wait before clicking, whether they scroll through an entire page, where their mouse hovers, how many times they click, etc. This enables stronger personalization, both during testing phases and when deploying online features, by letting you know exactly what your users need. 
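As a purely conceptual illustration of the personalization rules described above (this is not Emotions AI’s actual API, segment list or model), the branching logic on a site might look something like this:

```python
# Conceptual mapping from a detected emotional need to a page treatment.
TREATMENTS = {
    "reassurance": "Highlight the free return policy near the CTA",
    "clarity":     "Surface factual product information and specifications",
    "immediacy":   "Hide secondary content and focus on the main CTAs",
}

def personalize(detected_need: str) -> str:
    # Fall back to the default experience for segments without a rule.
    return TREATMENTS.get(detected_need, "Show the default experience")

print(personalize("reassurance"))
```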

Want to Learn More?

If you would like to dive deeper into current experimentation trends, watch our webinar replay here, where fifty-five and AB Tasty explored key CRO case studies and more. And if you have any questions or insights you’d like to share, please leave a comment – we would love to hear from you! 

Transaction Testing With AB Tasty’s Report Copilot

Transaction testing, which focuses on increasing the rate of purchases, is a crucial strategy for boosting your website’s revenue. 

To begin, it’s essential to differentiate between conversion rate (CR) and average order value (AOV), as they provide distinct insights into customer behavior. Understanding these metrics helps you implement meaningful changes to improve transactions.

In this article, we’ll delve into the complexities of transaction metrics analysis and introduce our new tool, the “Report Copilot,” designed to simplify report analysis. Read on to learn more.

Transaction Testing

To understand how test variations impact total revenue, focus on two key metrics:

  • Conversion Rate (CR): This metric indicates whether sales are increasing or decreasing. Tactics to improve CR include simplifying the buying process, adding a “one-click checkout” feature, using social proof, or creating urgency through limited inventory.
  • Average Order Value (AOV): This measures how much each customer is buying. Strategies to enhance AOV include cross-selling or promoting higher-priced products.

By analyzing CR and AOV separately, you can pinpoint which metrics your variations impact and make informed decisions before implementation. For example, creating urgency through low inventory may boost CR but could reduce AOV by limiting the time users spend browsing additional products. After analyzing these metrics individually, evaluate their combined effect on your overall revenue.

Revenue Calculation

The following formula illustrates how CR and AOV influence revenue:

Revenue = Number of Visitors × Conversion Rate × AOV

In the first part of the equation (Number of Visitors × Conversion Rate), you determine how many visitors become customers. The second part (× AOV) calculates the total revenue from these customers.

Consider these scenarios:

  • If both CR and AOV increase, revenue will rise.
  • If both CR and AOV decrease, revenue will fall.
  • If either CR or AOV increases while the other remains stable, revenue will increase.
  • If either CR or AOV decreases while the other remains stable, revenue will decrease.
  • Mixed changes in CR and AOV result in unpredictable revenue outcomes.

The last scenario, where CR and AOV move in opposite directions, is particularly complex due to the variability of AOV. Current statistical tools struggle to provide precise insights on AOV’s overall impact, as it can experience significant random fluctuations. For more on this, read our article “Beyond Conversion Rate.”
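To make the arithmetic concrete, here is a minimal sketch with hypothetical numbers showing how opposite movements in CR and AOV can push revenue in either direction:

```python
def revenue(visitors, conversion_rate, aov):
    return visitors * conversion_rate * aov

visitors = 100_000
baseline = revenue(visitors, 0.020, 80.0)       # CR 2.0%, AOV 80

# Mixed changes: CR and AOV moving in opposite directions can go either way.
scenario_a = revenue(visitors, 0.022, 76.0)     # CR +10%, AOV -5%  -> revenue up
scenario_b = revenue(visitors, 0.021, 72.0)     # CR +5%,  AOV -10% -> revenue down

for label, value in [("baseline", baseline), ("scenario A", scenario_a), ("scenario B", scenario_b)]:
    print(f"{label}: {value:,.0f} ({value / baseline - 1:+.1%} vs baseline)")
```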

While these concepts may seem intricate, our goal is to simplify them for you. Recognizing that this analysis can be challenging, we’ve created the “Report Copilot” to automatically gather and interpret data from variations, offering valuable insights.

Report Copilot

The “Report Copilot” from AB Tasty automates data processing, eliminating the need for manual calculations. This tool empowers you to decide which tests are most beneficial for increasing revenue.

Here are a few examples from real use cases.

Winning Variation:

The left screenshot provides a detailed analysis, helping users draw conclusions about their experiment results. Experienced users may prefer the summarized view on the right, also available through the Report Copilot.

Complex Use Case:


The screenshot above demonstrates a case where CR and AOV show opposite trends, which requires a deeper understanding of the context.

It’s important to note that the Report Copilot doesn’t make decisions for you; it highlights the most critical parts of your analysis, allowing you to make informed choices.

Conclusion

Transaction analysis is complex, requiring a breakdown of components like conversion rate and average order value to better understand their overall effect on revenue. 

We’ve developed the Report Copilot to assist AB Tasty users in this process. This feature leverages AB Tasty’s extensive experimentation dashboard to provide comprehensive, summarized analyses, simplifying decision-making and enhancing revenue strategies.

Mobile Optimization Guide: Tips for Smartphone Survival

Mobile devices are more than a modern accessory; they’ve become a necessity.

At the end of 2023, nearly 70% of the world’s population were smartphone users. And with an increasing number of people owning more than one phone, subscriptions have outpaced ownership at an estimated 7 billion this year (and are expected to reach 8 billion in 2028).

Smartphones have transformed how consumers interact with brands, creating an almost instant connection between inspiration and action.

Mobile website traffic peaked at 59% in 2022 before settling at 54% in 2023.

However, mobile conversion rates lag behind. Desktop and tablet conversions lead at 3.1% and 2.8%, while smartphones trail at 2.3%.

So, why the gap?

In our e-book, “Mobile Optimization Guide: Tips for Smartphone Survival,” we explore consumer behavior, industry trends, and mobile optimization tests from AB Tasty clients. Discover how smartphones impact the customer journey and learn strategies to optimize the user experience and boost conversions.

Topics discussed in this e-book include:

  • Smartphones and the customer journey
  • The importance of speed
  • The thumb zone
  • Maximizing available space
  • Optimizing the call-to-action
  • Streamlining processes

74% of Gen Z say they most often reach for mobile devices when shopping online.

Download your copy of the “Mobile Optimization Guide” today and start transforming your mobile strategy!

DirectAsia Simplifies Insurance Experiences with Empathy

Through the use of simplified and confidence-building journeys, DirectAsia is transforming the traditionally tedious task of buying vehicle and travel insurance into a seamless experience that reflects the overall convenience of their brand and services.

Part of this transformation lies in their strategic partnership with AB Tasty and the integration of EmotionsAI to better understand their customers and boost their experience optimization roadmap.

With insurance buyers increasingly seeking reassurance, trust, and intuitive experiences, DirectAsia recognized the need to evolve beyond traditional approaches to meet these demands. Like many financial services, insurance is inherently complex by nature.

DirectAsia ran tests focusing on an area on their website where they knew they had room for improvement. They compared the results of that test on segments selected by EmotionsAI versus a broader audience. Download the case study to find out how EmotionsAI expedited visitor journeys through to their quote page.

Multivariate Testing – All you need to know about MVT

Finding the perfect combination. That’s what multivariate testing is all about. A multivariate test is a test that simultaneously tests several combinations of several variables.

The idea is to modify several elements simultaneously on the same page and then define which one, among all of the possible combinations, has the most impact on the indicators being tracked.

Multivariate testing (MVT) helps test associations of variables, which is not the case with successive A/B (or A/B/C, etc.) tests. Unlike classic A/B testing, multivariate testing allows you to understand which combination of elements works the best for your visitors and their specific needs. Sounds appealing, doesn’t it? Learn all you need to know in this multivariate testing guide and try any combination of your ideas.

What is a multivariate test?

During an A/B test, you may not modify more than one element at a time (for example, the wording of a button) in order to be able to measure the impact. If you modify both the button’s wording and color (for example, a blue “Buy” button vs. a red “Make the most of it” button) and notice an improvement, how will you know if it was the change in wording or color that contributed to this performance? The impact of one change could be negligible or they each could have had an equal impact.

Multivariate testing looks to provide the solution. You can change a title and an image at the same time. With multivariate tests, you test a hypothesis for which several variables are modified and determine which combination from among all possible solutions performed the best. If you create 3 different versions of 2 specific variables, you then have nine combinations in total (number of variants of the first variable × number of variants of the second).
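Here is a quick sketch of that combinatorics with hypothetical variable values, including the even traffic split a full-factorial test would use across all combinations:

```python
from itertools import product

titles = ["Title A", "Title B", "Title C"]
images = ["Image 1", "Image 2", "Image 3"]

combinations = list(product(titles, images))   # 3 x 3 = 9 combinations
traffic_share = 1 / len(combinations)

for title, image in combinations:
    print(f"{title} + {image}: {traffic_share:.2%} of traffic")
```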


The history of multivariate testing

Testing methods like MVTs started back in the 1700s. Scurvy was a major problem back then. Without knowing it, a British Royal Navy ship surgeon created the very first multivariate test in history when he started giving sick crew members different solutions and treating them under different conditions: a high number of variables that, in the end, he could compare to see how they interacted with one another.

This multivariate testing led him to measure the effectiveness of each combination and find the perfect treatment for scurvy: citrus fruits, fresh air and lots of sleep.

What kind of websites are relevant for MVT?

Multivariate testing can benefit any website that has a purpose behind it, because the way of reaching a goal can always be improved, and so can any website. Some sites aim at lead generation, while e-commerce sites aim at selling. Media sites, for example, could benefit from multivariate tests by improving editorial features rather than the number of transactions.

Most websites do multivariate tests like:

  • Testing the different combinations of text and color of a call-to-action button.
  • Testing how text and visual elements on a webpage work together most effectively.

What types of multivariate tests are there?

There are 2 main methods for performing multivariate tests:

  • “Full Factorial”: This is the method generally referred to when we talk about multivariate testing. With this method, all of the possible combinations of variables are designed and tested over equal parts of traffic. If you test 2 variants of one element and 3 of another, each of the 6 combinations will therefore receive 16.66% of your traffic.
  • “Fractional Factorial”: as its name suggests, only a fraction of possible combinations is effectively tested on your traffic. The conversion rate of untested combinations is statistically deduced based on those actually tested. This method has the disadvantage of being less precise, but requires less traffic.

Why run multivariate tests?

There are three benefits to MVT:

  • Avoid performing successive A/B tests and save time, since multivariate testing can be seen as performing several A/B tests on the same page at the same time.
  • Determine the impact of each variable in measured gains.
  • Measure the impact of interactions between different elements presumed to be independent (for example, page title and illustration visual).

Limits of MVT

The first limit concerns the number of visitors needed for your multivariate test’s results to be significant. By multiplying the number of variables and versions tested in your multivariate test, you will quickly reach a large number of combinations. The sample assigned to each combination will be reduced proportionally.

Where, for a traditional A/B test, you would assign 50% of your traffic to the original version in the tool and the rest to the variant, you will only assign 5, 10, or 15% of your traffic to each combination in a multivariate test. In practice, this often translates into longer tests and an inability to reach the statistical significance needed to make a decision. This is especially true if you test pages deep within your site with low traffic, which is often the case in checkout funnels or landing pages for your traffic acquisition campaigns.

The second limit is linked to the way the multivariate test is defined. In some cases, it’s the result of an admission of weakness: the users don’t know exactly what to test and think that by testing several things at once in a multivariate test, they will eventually find a solution they can take advantage of. We then often find small changes at work in these multivariate tests. A/B testing, on the other hand, requires great rigour and helps better identify test hypotheses, which generally lead to more creative tests, backed up by data, with better results.

The third limit is related to complexity. Conducting an A/B test is often easier than a multivariate test, especially when analysing the results. You don’t have to do complex mental gymnastics to try to understand why a particular element interacts positively with another in one case but not in another. Keeping the process simple and quick to perform helps maintain confidence and rapidly reiterate on optimization ideas.

Multivariate test ideas and hypotheses

The key to a successful MVT approach is a strong hypothesis for every element tested. This hypothesis should later be implemented in the different test modules and combinations of your MVT.

In order to create a strong MVT hypothesis, you must:

  • Clearly identify the question you are interested in answering with your MVT
  • But note: A hypothesis is a statement, not a question. It is a very clear, testable prediction about what will happen if certain changes are made to a website.
  • Make it clear and link your prediction to a problem that has identifiable causes
  • Mention a possible solution


Testing sample size

In order to test an MVT hypothesis, you need a larger sample size. Think of your multivariate test as several parallel A/B tests and increase the number of tested visitors accordingly.

In short: A good multivariate test requires enough website traffic to test multiple variations simultaneously. Therefore, the required sample size should never exceed your level of website traffic, unless you want to wait forever for your test results to be valid.

Multivariate testing requires more traffic than A/B testing

What is the ideal length for my MVT?

There is no universal answer to this question, but to give you an idea, here’s a simple calculation. Let’s say your website has 30,000 visitors a day, about 5% of those visitors convert, and you want to test three variations: your test should run for 11 days. If your website has only 5,000 visitors a day and an average conversion rate of 2%, the required number of tested visitors per variation is 78,039, which will require your test to run for 468 days.
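For reference, once a calculator has given you the required sample size per variation, the duration arithmetic is straightforward. The inputs below are hypothetical (the figures above depend on assumptions about minimum detectable effect, significance level and power that aren’t spelled out here), but they show the shape of the calculation:

```python
from math import ceil

def test_duration_days(visitors_per_day, required_per_variation, n_variations):
    """Days needed so that every variation reaches its required sample size."""
    total_needed = required_per_variation * n_variations
    return ceil(total_needed / visitors_per_day)

# Hypothetical inputs: 30,000 visitors/day split evenly across 3 variations,
# each needing roughly 110,000 visitors (the exact requirement comes from a
# sample-size calculator and your own baseline rate and MDE).
print(test_duration_days(30_000, 110_000, 3))  # -> 11 days
```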

If your average number of visitors is very low, multivariate testing may be inappropriate for you. Check out A/B testing instead! Additionally, here are six techniques for getting started with testing with low traffic.

Want to know where we took those numbers from? Check out our free sample size calculator!

Our online sample size calculator helps you calculate the minimum sample size as well as the duration of your tests based on your audience, your conversions and other information such as the Minimum Detectable Effect. This helps you increase your confidence level before making any decisions to improve your conversion rate.

Tips and best practices

Here are some tips that will help you set up your first multivariate tests and avoid common mistakes.

1. Choose a strong testing tool

Multivariate testing is often assumed to be very technical, so we suggest you go for a testing tool which keeps it simple for you to use. AB Tasty, for example, makes it easy for any marketer to jump into multivariate testing and helps you gain valuable customer insights for you to make the right decisions.

2. Form a good testing team

In a strong CRO team, different tasks should be clearly defined and distributed. For example: a “conversion manager” could lead the team and be in charge of QA. Another team member could be responsible for a first analysis of your visitors‘ behaviour and the status quo. A designer could take care of aesthetic modifications on your website, and a technical profile (JS and CSS developer) should be responsible for the implementation of advanced tests. A data scientist could be in charge of evaluating your results in the end. One person can be in charge of several tasks at the same time if they have all the necessary skills and sufficient time to cover them all.

3. Have a plan and clarify a timeline

It’s all about a good structure. Before you start creating your multivariate tests, clearly define what elements should be tested, why they are being tested and define a time frame. Knowing when the MVT results are needed will help you work more efficiently.

4. Set targets and define success and failure

Set annual targets that can be adjusted each year, for example the number of campaigns launched. Quantitative measurement is easy and precise. For your multivariate testing campaigns, clearly define what makes a test successful. Note that even a failed test is worth something because it helps you understand what doesn’t work and needs to be changed in the future.

5. Create a knowledge database

Keep track of the most important things you learned, save testing knowledge in a database and avoid recurring failure in the future. Once knowledge is acquired, it should be made available to everyone in your team. It will also make the onboarding process for new team members more efficient.

6. Include all third parties necessary

Workplace loneliness is a real problem. Your CRO team should not be working in an isolated way. Let others know what your CRO department is working on and spread the word about testing results. Also, be open to new ideas and input from others outside your CRO team!

7. Identify and test different audience segments

In your multivariate testing campaigns, you may determine that returning visitors prefer a different website design than new visitors do. Innovative tools like AB Tasty will recognize this and automatically suggest visitor segmentation.


Examples

Looking for ideas for your very own multivariate tests? Below you’ll find some links to a few examples and testing inspiration:

Multivariate testing software

Make sure you use a tool that actually addresses the problems you need to solve. When it comes to improving your website’s conversion rates, a wide-ranging optimization process should involve much more than testing alone. Therefore, choose a tool that helps you fully understand user behavior. We recommend you use AB Tasty, as it offers numerous sources of information you can use to gain this fuller picture.

Other forms of testing

There’s much more than multivariate tests out there! Here’s a list of other testing scenarios:

  • A/B/n Testing: Build and compare two or more variations of the same element
  • Split testing: Redirect traffic to one or several URLs. A perfect fit for new pages hosted on your servers
  • Multi-Page Testing: Display changes consistently across multiple pages (Funnel Testing)

Refer to this article to learn how to choose between these different testing methods.

Website optimization is not limited to testing. You can use advanced audience segmentation and personalization to deliver tailored experiences across every customer touchpoint, and much more.

Mutually Exclusive Experiments: Preventing the Interaction Effect

What is the interaction effect?

If you’re running multiple experiments at the same time, you may find their interpretation to be more difficult because you’re not sure which variation caused the observed effect. Worse still, you may fear that the combination of multiple variations could lead to a bad user experience.

It’s easy to imagine a negative cumulative effect of two visual variations. For example, if one variation changes the background color, and another modifies the font color, it may lead to illegibility. While this result seems quite obvious, there may be other negative combinations that are harder to spot.

Imagine launching an experiment that offers a price reduction for loyal customers, whilst in parallel running another that aims to test a promotion on a given product. This may seem like a non-issue until you realize that there’s a general rule applied to all visitors, which prohibits cumulative price reductions – leading to a glitch in the purchase process. When the visitor expects two promotional offers but only receives one, they may feel frustrated, which could negatively impact their behavior.

What is the level of risk?

With the previous examples in mind, you may think that such issues could be easily avoided. But it’s not that simple. Building several experiments on the same page becomes trickier when you consider code interaction, as well as interactions across different pages. So, if you’re interested in running 10 experiments simultaneously, you may need to plan ahead.

A simple solution would be to run these tests one after the other. However, this strategy is very time-consuming, as your typical experiment requires two weeks to be performed properly in order to sample each day of the week twice.

It’s not uncommon for a large company to have 10 experiments in the pipeline and running them sequentially will take at least 20 weeks. A better solution would be to handle the traffic allocated to each test in a way that renders the experiments mutually exclusive.

This may sound similar to a multivariate test (MVT), except the goal of an MVT is almost the opposite: to find the best interaction between unitary variations.

Let’s say you want to explore the effect of two variation ideas: text and background color. The MVT will compose all combinations of the two and expose them simultaneously to isolated chunks of the traffic. The isolation part sounds promising, but the “all combinations” is exactly what we’re trying to avoid. Typically, the combination of the same background color and text will occur. So an MVT is not the solution here.

Instead, we need a specific feature: A Mutually Exclusive Experiment.

What is a Mutually Exclusive Experiment (M2E)?

AB Tasty’s Mutually Exclusive Experiment (M2E) feature enacts an allocation rule that blocks visitors from entering selected experiments depending on the previous experiments already displayed. The goal is to ensure that no interaction effect can occur when a risk is identified.
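Conceptually, mutual exclusion is a deterministic traffic split: each visitor is assigned to at most one experiment within a group, and the assignment stays stable across visits. The sketch below is a generic illustration of the idea (hash-based bucketing with made-up experiment names), not AB Tasty’s actual implementation:

```python
import hashlib

def assign_exclusive(visitor_id: str, group_name: str, experiments: list) -> str:
    """Deterministically place a visitor into exactly one experiment of a mutually exclusive group."""
    digest = hashlib.sha256(f"{group_name}:{visitor_id}".encode()).hexdigest()
    return experiments[int(digest, 16) % len(experiments)]

promo_group = ["loyalty_discount_test", "product_promotion_test"]
print(assign_exclusive("visitor-42", "promotions", promo_group))
print(assign_exclusive("visitor-42", "promotions", promo_group))  # same visitor, same experiment
```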

How and when should we use Mutually Exclusive Experiments?

We don’t recommend setting up all experiments to be mutually exclusive because it reduces the number of visitors for each experiment. This means it will take longer to achieve significant results and the detection power may be less effective.

The best process is to identify the different kinds of interactions you may have and compile them in a list. If we continue with the cumulative promotion example from earlier, we could create two M2E lists: one for user interface experiments and another for customer loyalty programs. This strategy will avoid negative interactions between experiments that are likely to overlap, but doesn’t waste traffic on hypothetical interactions that don’t actually exist between the two lists.

What about data quality?

With the help of an M2E, we have prevented any functional issues that may arise due to interactions, but you might still have concerns that the data could be compromised by subtle interactions between tests.

Would an upstream winning experiment induce false discovery on downstream experiments? Alternatively, would a bad upstream experiment make you miss an otherwise downstream winning experiment? Here are some points to keep in mind:

  • Remember that roughly eight tests out of 10 are neutral (show no effect), so most of the time you can’t expect an interaction effect – if no effect exists in the first place.
  • In the case where an upstream test has an effect, the affected visitors will still be randomly assigned to the downstream variations. This evens out the effect, allowing the downstream experiment to correctly measure its potential lift. It’s interesting to note that the average conversion rate following an impactful upstream test will be different, but this does not prevent the downstream experiment from correctly measuring its own impact.
  • Remember that the statistical test is here to take into account any drift of the random split process. The drift we’re referring to here is the fact that more impacted visitors of the upstream test could end up in a given variation creating the illusion of an effect on the downstream test. So the gain probability estimation and the confidence interval around the measured effect is informing you that there is some randomness in the process. In fact, the upstream test is just one example among a long list of possible interfering events – such as visitors using different computers, different connection quality, etc.

All of these theoretical explanations are supported by an empirical study from the Microsoft Experiment Platform team. This study reviewed hundreds of tests on millions of visitors and saw no significant difference between effects measured on visitors that saw just one test and visitors that saw an additional upstream test.
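A small simulation makes the second and third points tangible: give the upstream test a real effect, give the downstream test none, and the downstream variations still converge to the same conversion rate because impacted visitors are split evenly between them. All rates below are hypothetical:

```python
import random

random.seed(42)
visitors = 200_000
conversions = {"A": 0, "B": 0}
counts = {"A": 0, "B": 0}

for _ in range(visitors):
    upstream = random.choice(["A", "B"])        # upstream test: variation B genuinely lifts conversion
    rate = 0.06 if upstream == "B" else 0.05
    downstream = random.choice(["A", "B"])      # downstream test: no real effect
    counts[downstream] += 1
    conversions[downstream] += random.random() < rate

for variation in ("A", "B"):
    print(f"downstream {variation}: {conversions[variation] / counts[variation]:.4f}")
```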

Conclusion

While experiment interaction is possible in a specific context, there are preventative measures that you may take to avoid functional loss. The most efficient solution is the Mutually Exclusive Experiment, allowing you to eliminate the functional risks of simultaneous experiments, make the most of your traffic and expedite your experimentation process.

References:

https://www.microsoft.com/en-us/research/group/experimentation-platform-exp/articles/a-b-interactions-a-call-to-relax/

 

Frequentist vs Bayesian Methods in A/B Testing

In A/B testing, there are two main ways of interpreting test results: Frequentist vs Bayesian.

These terms refer to two different inferential statistical methods. Debates over which is ‘better’ are fierce – and at AB Tasty, we know which method we’ve come to prefer.

If you’re shopping for an A/B testing vendor, new to A/B testing or just trying to better interpret your experiment’s results, it’s important to understand the logic behind each method. This will help you make better business decisions and/or choose the best experimentation platform.

Bayesian vs frequentist methods in A/B testing

In this article, we discuss these two statistical methods under the inferential statistics umbrella, compare and contrast their strong points and explain our preferred method of measurement.

What is inferential statistics?

Both Frequentist and Bayesian methods are under the umbrella of inferential statistics.

As opposed to descriptive statistics (which describes purely past events), inferential statistics try to infer or forecast future events.

Would version A or version B have a better impact on X KPI?

Side note: If we want to geek out, technically inferential statistics isn’t really forecasting in a temporal sense, but extrapolating what will happen when we apply results to a larger pool of participants. 

What happens if we apply winning version B to my entire website audience? There’s a notion of ‘future’ events in that we need to actually implement version B tomorrow, but in the strictest sense, we’re not using statistics to ‘predict the future.’

For example, let’s say you were really into Olympic sports, and you wanted to learn more about the men’s swimming team. Specifically, how tall are they? Using descriptive statistics, you could determine some interesting facts about ‘the sample’ (aka the team):

  • The average height of the sample
  • The spread of the sample (variance)
  • How many people are below or above the average
  • Etc.

This might fit your immediate needs, but the scope is pretty limited.

What inferential statistics allows you to do is draw conclusions about populations that are too large to measure directly, based on smaller samples that you can study.

If you were interested in knowing the average height of all men on the planet, it wouldn’t be possible to go and collect all that data. Instead, you can use inferential statistics to infer that average from different, smaller samples.

Two ways of inferring this kind of information through statistical analysis are the Frequentist and Bayesian methods.

What is the Frequentist statistics method in A/B testing? 

The Frequentist approach is perhaps more familiar to you since it’s more frequently used by A/B testing software (pardon the pun). This method also makes an appearance in college-level stats classes.

This approach is designed to make a decision about a single experiment, analyzed once at its end.

With the Frequentist approach, you start with the hypothesis that there is no difference between test versions A and B. And at the end of your experiment, you’ll end up with something called a P-Value (probability value).

The P-Value is the probability of obtaining results at least as extreme as the observed results, assuming that there is no (real) difference between the variations.

In practice, the P-Value is often read as ‘the probability that there is no difference between your two versions’ – which is why it is often ‘inverted’ with the basic formula p = 1 - pValue, in order to express the probability that there is a difference. Strictly speaking, this reading is a convenient shorthand rather than the formal definition above.

The smaller the P-Value, the stronger the evidence that there is, in fact, a difference – and that the starting hypothesis of ‘no difference’ is wrong.
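For illustration, here is a hand-rolled sketch of a Frequentist two-proportion z-test in Python, with hypothetical conversion counts – roughly the kind of computation an A/B testing tool performs once, at the end of a fixed-horizon test:

```python
# Hedged sketch of a two-sided, two-proportion z-test with hypothetical numbers.
from math import sqrt
from scipy.stats import norm

conv_a, n_a = 500, 10_000   # original:  5.0% conversion
conv_b, n_b = 570, 10_000   # variation: 5.7% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pooled = (conv_a + conv_b) / (n_a + n_b)               # under "no difference"
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = 2 * norm.sf(abs(z))                             # two-sided p-value
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
# A small p-value means results this extreme would be unlikely if A and B
# really converted at the same rate.
```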

Frequentist pros:

  • Frequentist models are available in any statistics library for any programming language.
  • The computation of frequentist tests is blazing fast.

Frequentist cons:

  • You only estimate the P-Value at the end of a test, not during. ‘Data peeking’ before a test has ended generates misleading results because it actually becomes several experiments (one experiment each time you peek at the data), whereas the test is designed for one unique experiment.
  • You can’t know the actual gain interval of a winning variation – just that it won.

What is the Bayesian statistics method in A/B testing?

The Bayesian approach looks at things a little differently.

We can trace it back to a charming British mathematician, Thomas Bayes, and his eponymous Bayes’ Theorem.

[Image: Bayes’ Theorem]

The Bayesian approach allows for the inclusion of prior information (‘a prior’) into your current analysis. The method involves three overlapping concepts:

  • Prior – information you have before the experiment, for example from a previous one. At the very beginning, we use a ‘non-informative’ prior (think ‘empty’).
  • Evidence – the data of the current experiment.
  • Posterior – the updated information obtained by combining the prior with the evidence. This is what the Bayesian analysis produces.

By design, this kind of test can be used on an ongoing experiment. When you peek at the data, everything seen so far can be treated as the prior, and the future incoming data as new evidence, and so on.

This means ‘data peeking’ naturally fits into the test design: at each peek, the posterior computed by the Bayesian analysis is valid.

Crucially for A/B testing in a business setting, the Bayesian approach allows the CRO practitioner to estimate the gain of a winning variation – more on that later.
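As a rough illustration of prior, evidence and posterior (a simplified sketch, not AB Tasty’s exact implementation), a conversion rate can be modelled with a Beta distribution that is updated as data arrives – each ‘peek’ simply turns the data seen so far into the prior for the next batch:

```python
# Hedged Beta-Binomial sketch with hypothetical weekly batches of traffic.
# Starting from a non-informative prior Beta(1, 1), each new batch of
# (conversions, non-conversions) updates the posterior for a variation.
alpha, beta = 1, 1  # non-informative prior ("empty")

weekly_batches = [(40, 960), (55, 945), (48, 952)]  # hypothetical (successes, failures)
for successes, failures in weekly_batches:
    alpha += successes   # evidence: observed conversions
    beta += failures     # evidence: observed non-conversions
    mean_rate = alpha / (alpha + beta)
    print(f"Posterior mean conversion rate so far: {mean_rate:.4f}")
# The posterior after each batch becomes the prior for the next one,
# which is why peeking fits naturally into the Bayesian design.
```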

Bayesian pros:

  • Allows you to ‘peek’ at the data during a test, so you can either stop sending traffic if a variation is tanking or switch earlier to a clear winner.
  • Allows you to see the actual gain of a winning test.
  • By its nature, it makes you less likely to implement a false positive (more on this below).

Bayesian cons:

  • Needs a sampling loop, which takes a non-negligible CPU load.  This is not a concern at the user level, but could potentially gum things up at scale.

Bayesian vs Frequentist: which is better?

So, which method is the ‘better’ method?

Let’s start with the caveat that both are perfectly legitimate statistical methods. But at AB Tasty, our customer experience optimization and feature management software, we have a clear preference for the Bayesian A/B testing approach. Why?

Gain size

One very strong reason is that with Bayesian statistics, you can estimate a range for the actual gain of a winning variation, instead of only knowing that it won, full stop.

In a business setting, this distinction is crucial. When you’re running your A/B test, you’re really deciding whether to switch from variation A to variation B, not whether you choose A or B from a blank slate. You therefore need to consider:

  • The implementation cost of switching to variation B (time, resources, budget)
  • Additional associated costs of variation B (vendor costs, licenses…)

As an example, let’s say you’re a B2B software vendor, and you ran an A/B test on your pricing page. Variation B included a chatbot, whereas version A didn’t. Variation B outperformed variation A, but to implement variation B, you’ll need 2 weeks of developer time to integrate your chatbot into your lead workflow, plus allocate X dollars of marketing budget to pay for the monthly chatbot license.


You need to be sure the math adds up, and that it’s more cost-effective to switch to version B when these costs are weighed against the size of the test gain. A Bayesian A/B testing approach will let you do that.

Let’s take a look at an example from the AB Tasty reporting dashboard.

In this fictional test, we’re measuring three variations against an original, with ‘CTA clicks’ as our KPI.

[Image: AB Tasty reporting dashboard]

We can see that variation 2 looks like the clear winner, with a conversion rate of 34.5%, compared to the original of 25%. But by looking to the right, we also get the confidence interval of this gain. In other words, a best and worst-case scenario.

The median gain for variation 2 is 36.4%, with the lower bound of the gain at +2.25% and the upper bound at +48.40%.

These are the bounds between which the gain will fall in 95% of cases.

If we break it down even further (the sketch after this list shows how such figures can be derived from posterior samples):

  • There’s a 50% chance of the gain percentage lying above 36.4% (the median)
  • There’s a 50% chance of it lying below 36.4%.
  • In 95% of cases, the gain will lie between +2.25% and +48.40%.
  • There remains a 2.5% chance of the gain lying below 2.25% (our famous false positive) and a 2.5% chance of it lying above 48.40%.
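Here is a rough Monte Carlo sketch of how such a breakdown can be obtained from Beta posteriors – not AB Tasty’s exact computation, and with hypothetical conversion counts, so the numbers won’t match the dashboard above exactly:

```python
# Hedged sketch: gain probability and 95% credible interval from posterior samples.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical counts giving roughly 25% vs 34.5% conversion.
conv_o, n_o = 250, 1_000    # original
conv_v, n_v = 345, 1_000    # variation 2

# Beta posteriors with a non-informative Beta(1, 1) prior.
samples_o = rng.beta(1 + conv_o, 1 + n_o - conv_o, size=200_000)
samples_v = rng.beta(1 + conv_v, 1 + n_v - conv_v, size=200_000)

relative_gain = (samples_v - samples_o) / samples_o
gain_probability = (relative_gain > 0).mean()
low, median, high = np.percentile(relative_gain, [2.5, 50, 97.5])

print(f"Gain probability: {gain_probability:.1%}")
print(f"Median gain: {median:.1%}, 95% interval: [{low:.1%}, {high:.1%}]")
```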

This level of granularity can help you decide whether to roll out a winning test variation across your site.

Are both the lowest and highest ends of your gain markers positive? Great!

Is the interval small, i.e. you’re quite sure of this high positive gain? It’s probably the right decision to implement the winning version.

Is your interval wide but implementation costs are low? No harm in going ahead there, too.

However, if your interval is large and the cost of implementation is significant, it’s probably best to wait until you have more data to shrink that interval. At AB Tasty we generally recommend that you:

  • Wait until you have recorded at least 5,000 unique visitors per variation
  • Let the test run for at least 14 days (two business cycles)
  • Wait until you have reached 300 conversions on the main goal.

Data peeking

Another advantage of Bayesian statistics is that it’s ok for you to ‘peek’ at your data’s results during a test (but be sure not to overdo it…).

Let’s say you’re working for a giant e-commerce platform and you’re running an A/B test involving a new promotional offer. If you notice that version B is performing abysmally – losing you big money – you can stop it immediately!

Conversely, if your test is outperforming, you can switch all of your website traffic to the winning version earlier than if you were relying on the Frequentist method.

This is precisely the logic behind our Dynamic Traffic Allocation feature – and it wouldn’t be possible without Mr. Thomas Bayes.

Dynamic Traffic Allocation

If we pause quickly on the topic of Dynamic Traffic Allocation, we’ll see that it’s particularly useful in business settings or contexts that are volatile or time-limited.

[Image: Dynamic Traffic Allocation option in the AB Tasty interface]

Essentially, (automated) Dynamic Traffic Allocation strikes the balance between data exploitation and exploration.

The test data is ‘explored’ rigorously enough to be confident in the conclusion, and ‘exploited’ early enough so as to not lose out on conversions (or whatever your primary KPI is) unnecessarily. Note that this isn’t manual – a real live person is not interpreting these results and deciding to go or not to go.

Instead, an algorithm is going to make the choice for you, automatically.
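One common family of algorithms for this kind of exploration/exploitation trade-off – not necessarily the exact one AB Tasty uses – is Thompson sampling, where each new visitor is routed according to each variation’s current probability of being the best. A minimal sketch:

```python
# Hedged sketch of Thompson sampling for dynamic traffic allocation.
import numpy as np

rng = np.random.default_rng(7)

# Running tallies per variation: [conversions, non-conversions] (hypothetical).
tallies = {"original": [120, 1880], "variation_1": [150, 1850]}

def pick_variation():
    """Draw one sample from each variation's Beta posterior and route
    the next visitor to the variation with the highest draw."""
    draws = {name: rng.beta(1 + s, 1 + f) for name, (s, f) in tallies.items()}
    return max(draws, key=draws.get)

# Better-performing variations get picked more often, but weaker ones still
# receive some traffic (exploration) until the evidence is conclusive.
print([pick_variation() for _ in range(10)])
```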

In practice, for AB Tasty clients, this means checking the associated box and picking your primary KPI. The platform’s algorithm will then decide whether and when to send the majority of your traffic to a winning variation, once one is identified.

This kind of approach is particularly useful:

  • When optimizing micro-conversions over a short time period
  • When the time span of the test is short (for example, during a holiday sales promotion)
  • When your target page doesn’t get a lot of traffic
  • When you’re testing 6+ variations

Though you’ll want to pick and choose when to go for this option, it’s certainly a handy one to have in your back pocket.

Want to start A/B testing on your website with a platform that leverages the Bayesian method? AB Tasty is a great example of an A/B testing tool that allows you to quickly set up tests with low-code implementation of front-end or UX changes on your web pages, gather insights via an ROI dashboard, and determine which route will increase your revenue.

False Positives

In Bayesian statistics, like with Frequentist methods, there is a risk of what’s called a false positive.

A false positive, as you might guess, is when a test result indicates a variation shows an improvement when in reality it doesn’t.

With false positives, it’s often the case that version B actually performs the same as version A (rather than meaningfully worse than it).

While by no means innocuous, false positives certainly aren’t a reason to abandon A/B testing. Instead, you can adjust your decision threshold – the gain probability you require – to fit the risk associated with a potential false positive.

Gain probability using Bayesian statistics

You’ve probably heard of the 95% gain probability rule of thumb.

In other words, you consider that your test is statistically significant when you’ve reached a 95% certainty level. You’re 95% sure your version B is performing as indicated, but there’s still a 5% risk that it isn’t.

For many marketing campaigns, this 95% threshold is probably sufficient. But if you’re running a particularly important campaign with a lot at stake, you can raise your gain probability threshold – to 97%, 98% or even 99% – practically ruling out the potential for a false positive.

While this seems like a safe bet – and it is the right choice for high-stakes campaigns – it’s not something to apply across the board.

This is because:

  • In order to attain this higher threshold, you’ll have to wait longer for results, therefore leaving you less time to reap the rewards of a positive outcome.
  • You will implicitly only get a winner with a bigger gain (which is rarer), and you will let go of smaller improvements that still could be impactful.
  • If your web page gets relatively little traffic, reaching a higher threshold may take too long to be practical – you may want to consider a different approach.

Bayesian tests limit false positives

Another thing to keep in mind is that because the Bayesian approach provides a gain interval – and because false positives typically show only a slight apparent improvement over reality – you’re unlikely to implement a false positive in the first place.

A common scenario would be that you run an A/B test to test whether a new promotional banner design increases CTA click-through rates.

Your result says version B performs better with a 95% gain probability, but the gain is minuscule (a 1% median improvement). Were this a false positive, you’re unlikely to deploy the version B promotional banner across your website, since the resources needed to implement it wouldn’t be worth such a minimal gain.

But, since a Frequentist approach doesn’t provide the gain interval, you might be more tempted to put in place the false positive. While this wouldn’t be the end of the world – version B likely performs the same as version A – you would be spending time and energy on a modification that won’t bring you any added return.

Bottom line? If you play it too safe and wait for a confidence level that’s too high, you’ll miss out on a series of smaller gains, which is also a mistake.

Wrapping up: Frequentist vs Bayesian

So, which is better, Frequentist or Bayesian?

As we mentioned earlier, both approaches are perfectly sound statistical methods.

But at AB Tasty, we’ve opted for the Bayesian approach, since we think it helps our clients make even better business decisions on their web experiments.

It also allows for more flexibility and helps maximize returns (via Dynamic Traffic Allocation). As for false positives, these can occur whether you go with a Frequentist or Bayesian approach – though you’re less likely to fall for one with the Bayesian approach.

At the end of the day, if you’re shopping for an A/B testing platform, you’ll want to find one that gives you easily interpretable results that you can rely on.

The post Frequentist vs Bayesian Methods in A/B Testing appeared first on abtasty.

]]>
The Truth Behind the 14-Day A/B Test Period https://www.abtasty.com/blog/truth-behind-the-14-day-ab-test-period/ https://www.abtasty.com/blog/truth-behind-the-14-day-ab-test-period/#respond Tue, 14 May 2024 18:17:01 +0000 https://www.abtasty.com/?p=149309 The A/B testing method involves a simple process: create two variations, expose them to your customer, collect data, and analyze the results with a statistical formula.  But, how long should you wait before collecting data? With 14 days being standard […]

The post The Truth Behind the 14-Day A/B Test Period appeared first on abtasty.

]]>
The A/B testing method involves a simple process: create two variations, expose them to your customer, collect data, and analyze the results with a statistical formula. 

But, how long should you wait before collecting data? With 14 days being standard practice, let’s find out why as well as any exceptions to this rule.

Why 14 days?

To answer this question, we need to understand what we are fundamentally doing: collecting data within a short window in order to forecast what could happen over a more extended period in the future. To keep this article simple, we will only focus on the rules that relate to this principle. Other rules do exist, which mostly relate to the number of visitors, but these can be addressed in a future article.

The forecasting strategy relies on the collected data containing samples of all event types that may be encountered in the future. This is impossible to fulfill in practice, as periods like Christmas or Black Friday are exceptional events relative to the rest of the year. So let’s focus on the most common period and set aside these special events that merit their own testing strategies.

If the future we are considering relates to “normal” times, our constraint is to sample each day of the week uniformly, since people do not behave the same on different days. Simply look at how your mood and needs shift between weekdays and weekends. This is why a data sampling period must include entire weeks, to account for fluctuations between the days of the week. If you sample eight days, for example, one day of the week will have double the impact, which doesn’t realistically represent the future either.

This partially explains the two-week sampling rule, but why not a longer or shorter period? Since one week covers all the days of the week, why isn’t it enough? To understand, let’s dig a little deeper into the nature of conversion data, which has two dimensions: visits and conversions.

  • Visits: as soon as an experiment is live, every new visitor increments the number of visits.
  • Conversions: as soon as an experiment is live, every new conversion increments the number of conversions.

It sounds pretty straightforward, but there is a twist: statistical formulas work with the concept of success and failure. The definition is quite easy at first: 

  • Success: the number of visitors that did convert.
  • Failures: the number of visitors that didn’t convert.

At any given time a visitor may be counted as a failure, but this can change a few days later if they convert – or the visitor may remain a failure if the conversion never occurs.

So consider these two opposing scenarios: 

  • A visitor begins his buying journey before the experiment starts. During the first days of the experiment he comes back and converts. This would be counted as a “success”, but in fact he may not have had time to be impacted by the variation because the buying decision was made before he saw it. The problem is that we are potentially counting a false success: a conversion that could have happened without the variation.
  • A visitor begins his buying journey during the experiment, so he sees the variation from the beginning, but doesn’t make a final decision before the end of the experiment – finally converting after it finishes. We missed this conversion from a visitor who saw the variation and was potentially influenced by it.

These two scenarios may cancel each other out since they have opposite results, but that is only true if the sample period exceeds the usual buying journey time. Consider a naturally long conversion journey, like buying a house, measured within a very short experiment period of one week. Clearly, no visitors beginning their buying journey during the experiment period would have time to convert. The conversion rates of these visitors would be artificially close to zero – no proper measurement could be done in this context. In fact, the only conversions you would see are the ones from visitors that began their journey before the variation even existed. Therefore, the experiment would not be measuring the impact of the variation.

The delay between exposure to the variation and the eventual conversion distorts the measured conversion rate. In order to mitigate this problem, the experiment period has to be twice as long as the standard conversion journey. Doing so ensures that visitors entering the experiment during the first half will have time to convert. You can expect that people who began their journey before the experiment and people entering during the second half of the experiment period will cancel each other out: the first group will contain conversions that should not be counted, and some of the second group’s conversions will be missing. However, a majority of genuine conversions will be counted.

That’s why a typical buying journey of one week results in a two-week experiment, offering the right balance in terms of speed and accuracy of the measurements.
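A rough simulation sketch (with hypothetical numbers) makes the mechanism visible. For simplicity it ignores the visitors whose journey began before the test – the group that, as explained above, partially compensates – and only shows how conversions falling after the end of the window deflate the measured rate when the window is short relative to the journey:

```python
# Hedged simulation: conversions completed after the test window closes are
# missed, which deflates the measured conversion rate of a too-short test.
import numpy as np

rng = np.random.default_rng(3)

def observed_rate(test_days, true_rate=0.10, max_journey_days=7, n=200_000):
    """Visitors arrive uniformly during the test; converters convert after a
    random delay of up to max_journey_days. Only conversions that happen
    before the test ends are counted."""
    arrival = rng.uniform(0, test_days, n)
    converts = rng.random(n) < true_rate
    delay = rng.uniform(0, max_journey_days, n)
    counted = converts & (arrival + delay <= test_days)
    return counted.mean()

print(f" 7-day test: {observed_rate(7):.3f}")   # ~0.050: half of the true 10% is missed
print(f"14-day test: {observed_rate(14):.3f}")  # ~0.075: much closer to the true rate
print(f"28-day test: {observed_rate(28):.3f}")  # ~0.088: closer still
```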

Exceptions to this rule

A 14-day experiment period doesn’t apply to all cases. If the delay between exposure to the variation and the conversion is 1.5 weeks, for instance, then your experiment period should be three weeks, in order to cover the usual conversion delay twice.

On the other hand, if you know that the delay is close to zero – such as on a media website, where you are trying to optimize the placement of an advertisement frame on a page where visitors only stay a few minutes – you may think that one day would be enough based on this logic, but it’s not.

The reason is that you would not sample every day of the week, and we know from experience that people do not behave the same way throughout the week. So even in a zero-delay context, you still need to run the experiment for an entire week.
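These rules can be summed up in a small sketch (a simplification that ignores the visitor-volume rules mentioned earlier): run the test for at least twice the typical conversion journey, rounded up to a whole number of weeks, and never less than one full week.

```python
# Hedged sketch: recommended minimum test duration in whole weeks.
import math

def recommended_test_weeks(journey_days: float) -> int:
    """Cover twice the typical conversion journey, rounded up to whole
    weeks, with a floor of one full week."""
    return max(1, math.ceil(2 * journey_days / 7))

print(recommended_test_weeks(0))     # zero-delay media site   -> 1 week
print(recommended_test_weeks(7))     # one-week buying journey -> 2 weeks
print(recommended_test_weeks(10.5))  # 1.5-week journey        -> 3 weeks
```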

Takeaways: 

  1. Your test period should mirror the conditions of your expected implementation period.
  2. Sample each day of the week in the same way.
  3. Run the test for a whole number of weeks before closing it.

Respecting these rules will ensure that you have clean measurements. The accuracy of the measurement is determined by another parameter of the experiment: the total number of visitors. We’ll address this topic in another article – stay tuned.

The post The Truth Behind the 14-Day A/B Test Period appeared first on abtasty.

]]>
https://www.abtasty.com/blog/truth-behind-the-14-day-ab-test-period/feed/ 0