Do you remember the first A/B test you ever ran? I do. (Nerdy, I know.)
I felt simultaneously thrilled and terrified because I knew I'd have to actually use some of the things I learned in college to do my job.
There were a few aspects of A/B testing I still remembered — for instance, I knew you need a large enough sample size to run the test on, and you need to run the test long enough to get statistically significant results.
But… that's pretty much it. I wasn't sure how big was "big enough" for sample sizes and how long was "long enough" for test durations — and Googling this gave me a variety of answers my college statistics courses certainly didn't prepare me for.
Turns out I wasn't alone: Those are two of the most common A/B testing questions we get from customers. And the reason the typical answers from a Google search aren't that helpful is that they talk about A/B testing in an ideal, theoretical, non-marketing world.
So, I figured I'd do the research to help answer this question for you in a useful way. By the end of this post, you should know how to determine the right sample size and time frame for your next A/B test. Let's dive in.
A/B Testing Sample Size & Timeframe
In theory, to determine a winner between Variation A and Variation B, you need to wait until you have enough results to see if there is a statistically significant difference between the two.
Depending on your company, sample size, and how you execute the A/B test, getting statistically significant results could happen in hours or days or weeks — and you've just got to stick it out until you get those results. In theory, you shouldn't restrict the time frame in which you're gathering results.
For a lot of A/B tests, waiting is no problem. Testing headline copy on a landing page? It's fine to wait a month for results. Same goes for blog CTA creative — you'd be opting for the long-term lead generation play, anyway.
But certain aspects of marketing demand shorter timelines when it comes to A/B testing. Take email as an example. With email, waiting for an A/B test to conclude can be a problem, for several practical reasons:
1. Every email send has a finite audience.
Unlike a landing page (where you can continue to gather new audience members over time), once you send an email A/B test off, that's it — you can't "add" more people to that A/B test. So you've got to figure out how to squeeze the most juice out of your emails.
This will usually require you to send an A/B test to the smallest portion of your list needed to get statistically significant results, pick a winner, and then send the winning variation on to the rest of the list.
2. Running an email marketing program means you're juggling at least a few email sends each week. (In reality, probably far more than that.)
If you spend too much time collecting results, you could miss out on sending your next email — which could have worse effects than if you had sent a non-statistically-significant winning email on to one segment of your database.
3. Email sends are often designed to be timely.
Your marketing emails are optimized to deliver at a certain time of day, whether your emails are supporting the timing of a new campaign launch and/or landing in your recipients' inboxes at a time they'd like to receive them. So if you wait for your email results to be fully statistically significant, you might miss out on being timely and relevant — which could defeat the purpose of your email send in the first place.
That's why email A/B testing programs have a "timing" setting built in: At the end of that time frame, if neither result is statistically significant, one variation (which you choose ahead of time) will be sent to the rest of your list. That way, you can still run A/B tests in email, but you can also work around your email marketing scheduling demands and ensure people are always getting timely content.
So to run A/B tests in email while still optimizing your sends for the best results, you have to take both sample size and timing into account.
Next up — how to actually figure out your sample size and timing using data.
How to Determine Sample Size for an A/B Test
Now, let's dive into how to actually calculate the sample size and timing you need for your next A/B test.
For our purposes, we're going to use email as our example to demonstrate how you'll figure out sample size and timing for an A/B test. However, it's important to note — the steps in this list can be used for any A/B test, not just email.
Let’s jump in.
As mentioned above, every A/B test you send can only be sent to a finite audience — so you need to figure out how to maximize the results from that A/B test. To do that, you need to determine the smallest portion of your overall list needed to get statistically significant results. Here's how you calculate it.
1. Assess whether you have enough contacts in your list to A/B test a sample in the first place.
To A/B test a sample of your list, you need to have a decently large list size — at least 1,000 contacts. If you have fewer than that in your list, the proportion of the list that you need to A/B test to get statistically significant results gets larger and larger.
For example, to get statistically significant results from a small list, you may have to test 85% or 95% of your list. And the pool of people on your list who haven't been tested yet will be so small that you may as well have just sent half of your list one email version and the other half another, and then measured the difference.
Your results might not be statistically significant at the end of it all, but at least you're gathering learnings while you grow your list to more than 1,000 contacts. (If you want more tips on growing your email list so you can hit that 1,000-contact threshold, check out this blog post.)
Note for HubSpot customers: 1,000 contacts is also our benchmark for running A/B tests on samples of email sends — if you have fewer than 1,000 contacts in your selected list, the A version of your test will automatically be sent to half of your list and the B version to the other half.
2. Use a sample size calculator.
Next, you'll want to find a sample size calculator — SurveySystem.com offers a good, free one.
Here's what it looks like when you open it up:
3. Put your email's Confidence Level, Confidence Interval, and Population into the tool.
Yep, that's a lot of statistics jargon. Here's what these terms translate to in your email:
Population: Your sample represents a larger group of people. This larger group is called your population.
In email, your population is the typical number of people in your list who get emails delivered to them — not the number of people you sent emails to. To calculate population, I'd look at the past three to five emails you've sent to this list and average the total number of delivered emails. (Use the average when calculating sample size, as the total number of delivered emails will fluctuate.)
Confidence Interval: You might have heard this called "margin of error." Lots of surveys use this, including political polls. This is the range of results you can expect this A/B test to explain once it's been run with the full population.
For example, in your emails, if you have an interval of 5 and 60% of your sample opens your variation, you can be sure that between 55% (60 minus 5) and 65% (60 plus 5) of your population would have also opened that email. The bigger the interval you select, the more certain you can be that the population's true behavior is accounted for within that interval. At the same time, larger intervals will give you less definitive results. It's a trade-off you'll have to make in your emails.
For our purposes, it's not worth getting too caught up in confidence intervals. When you're just getting started with A/B tests, I'd suggest choosing a smaller interval (e.g., around 5).
Confidence Level: This tells you how sure you can be that your sample results lie within the above confidence interval. The lower the percentage, the less sure you can be about the results. The higher the percentage, the more people you'll need in your sample, too.
Note for HubSpot customers: The HubSpot Email A/B tool automatically uses an 85% confidence level to determine a winner. Since that option isn't available in this calculator, I'd suggest choosing 95%.
Email A/B Test Example:
Let's pretend we're sending our first A/B test. Our list has 1,000 people in it and a 95% deliverability rate. We want to be 95% confident that our winning email's metrics fall within a 5-point interval of our population's metrics.
Here’s what we’d put in the tool:
- Population: 950
- Confidence Level: 95%
- Confidence Interval: 5
4. Click "Calculate," and your sample size will be spit out.
Ta-da! The calculator will spit out your sample size.
In our example, our sample size is: 274.
This is the size one of your variations needs to be. So for your email send, if you have one control and one variation, you'll need to double this number. If you had a control and two variations, you'd triple it. (And so on.)
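If you're curious where that 274 comes from, the calculator's arithmetic can be reproduced by hand. Here's a minimal Python sketch of the standard sample-size formula for a proportion, with the finite-population correction applied — the function name and z-score lookup table are my own, not part of any particular calculator:

```python
import math

def sample_size(population, confidence=0.95, interval=5, p=0.5):
    """Sample size per variation: standard formula for a proportion,
    followed by the finite-population correction."""
    z = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}[confidence]  # z-score lookup
    e = interval / 100.0                     # confidence interval as a fraction
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)   # size for an unlimited population
    return math.ceil(n0 / (1 + (n0 - 1) / population))

print(sample_size(population=950))  # → 274, matching the example above
```

Using p = 0.5 is the conservative default most calculators assume, since it maximizes the required sample size.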
5. Depending on your email program, you may need to calculate the sample size's percentage of the whole email list.
HubSpot customers, I'm looking at you for this section. When you're running an email A/B test, you'll need to select the percentage of contacts to send the test to — not just the raw sample size.
To do that, you need to divide the number in your sample by the total number of contacts in your list. Here's what that math looks like, using the example numbers above:
274 / 1,000 = 27.4%
That means each sample (both your control AND your variation) needs to be sent to 27–28% of your audience — in other words, roughly 55% of your total list combined.
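That percentage math can be sketched in a few lines of Python (the function name is mine, and `n_variations` counts the control plus every variation):

```python
def send_shares(sample_size, list_size, n_variations=2):
    """Return the share of the list each variation gets, the total share
    in the test, and the remainder that receives the winning email."""
    per_variation = sample_size / list_size
    in_test = per_variation * n_variations
    return per_variation, in_test, 1 - in_test

per, in_test, rest = send_shares(274, 1_000)
print(f"{per:.1%} per variation, {in_test:.1%} in the test, {rest:.1%} get the winner")
```

With the example numbers, that works out to 27.4% per variation, 54.8% in the test, and the remaining 45.2% of the list receiving the winner.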
And that's it! You should be ready to select your sending time.
How to Choose the Right Timeframe for Your A/B Test
Again, for figuring out the appropriate timeframe for your A/B test, we'll use the example of email sends — but these details should still apply regardless of the type of A/B test you're conducting.
However, your timeframe will also vary depending on your business's goals. If you'd like to launch a new landing page by Q2 2021 and it's Q4 2020, you'll likely want to finish your A/B test by January or February so you can use those results to build the winning page.
But, for our purposes, let's return to the email send example: You have to figure out how long to run your email A/B test before sending the (winning) version on to the rest of your list.
Determining the timing aspect is less statistically driven, but you should definitely use past data to help you make better decisions. Here's how you can do that.
If you don't have timing restrictions on when to send the winning email to the rest of the list, head over to your analytics.
Figure out when your email opens/clicks (or whatever your success metric is) begin to drop off. Look at your past email sends to figure this out.
For example, what percent of total clicks did you get in your first day? If you found that you get 70% of your clicks in the first 24 hours, and then 5% each day after that, it'd make sense to cap your email A/B testing window at 24 hours because it wouldn't be worth delaying your results just to gather a little bit of extra data.
In this scenario, you'd most likely want to keep your testing window to 24 hours, and at the end of those 24 hours, your email program should let you know whether it can determine a statistically significant winner.
Then, it's up to you what to do next. If you have a large enough sample size and found a statistically significant winner at the end of the testing time frame, many email marketing programs will automatically and instantly send the winning variation.
If you have a large enough sample size and there's no statistically significant winner at the end of the testing time frame, email marketing tools may also allow you to automatically send a variation of your choice.
If you have a smaller sample size or are running a 50/50 A/B test, when to send the next email based on the initial email's results is entirely up to you.
If you do have time restrictions on when to send the winning email to the rest of the list, figure out how late you can send the winner without it being untimely or affecting other email sends.
For example, if you've sent an email out at 3 p.m. EST for a flash sale that ends at midnight EST, you wouldn't want to determine an A/B test winner at 11 p.m. Instead, you'd want to send the winning email closer to 6 or 7 p.m. — that'll give the people not involved in the A/B test enough time to act on your email.
And that's pretty much it, folks. After doing these calculations and examining your data, you should be in a much better position to conduct successful A/B tests — ones that are statistically valid and help you move the needle on your goals.