How we split-test using one line of code

We were convinced of the many, many benefits of split-testing. But in true lean startup fashion, we wanted to implement an MVP first to better understand what a more feature-rich system might look like.

While Eric Ries has also suggested simple ways to split test, we were able to do it using one line of code and a SQL query.

How we did it

Add one line of code

if user.id % 2 == 0:
    # [do something new]

Let the experiment run
Analyze the test and control groups on a key metric you track (e.g. % of users who click), counting only events since the split test started. This is easily done in MySQL:

SELECT MOD(user_id, 2), COUNT(*)
FROM event_event
WHERE action = 'click' AND date > '2012-07-10'
GROUP BY MOD(user_id, 2)

Compare the numbers for both the control and test group to see how you did!
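The query above counts click events per group. If the metric you care about is the number of distinct clickers, the same grouping can be sketched in Python. This is only an illustration: the `events` rows and the `clickers_by_bucket` helper are hypothetical stand-ins for the `event_event` table, not part of our codebase.

```python
from datetime import date

# Hypothetical rows mirroring event_event: (user_id, action, date).
events = [
    (1, "click", date(2012, 7, 11)),
    (2, "click", date(2012, 7, 11)),
    (2, "click", date(2012, 7, 12)),  # same user again, counted once
    (3, "open", date(2012, 7, 12)),   # not a click, ignored
    (4, "click", date(2012, 7, 9)),   # before the test started, ignored
    (6, "click", date(2012, 7, 13)),
]


def clickers_by_bucket(events, start, modulus=2):
    """Count distinct users who clicked after `start`, keyed by user_id % modulus."""
    buckets = {}
    for user_id, action, day in events:
        if action == "click" and day > start:
            buckets.setdefault(user_id % modulus, set()).add(user_id)
    return {bucket: len(users) for bucket, users in buckets.items()}


counts = clickers_by_bucket(events, date(2012, 7, 10))
print(counts)
```

Bucket 0 is the test group (even ids) and bucket 1 is the control group. To turn these counts into percentages you also need the number of users in each group, which this sketch does not compute.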

Why does this work?

Random: if you’re like most companies, there’s no fundamental difference between users whose ids are odd, even or divisible by 10, so splitting on user_id is unlikely to introduce a bias between the groups.
Consistent: the user_id isn’t going to change once a user is registered so we can ensure a consistent experience for tests that might require a couple days to show results.
Deterministic: When you have a lot of users, you don’t want to have to store which users are in which split test. Splitting users by modulus makes it easy to analyze the results through querying MySQL.
Controllable: By changing the modulus, you can set the percentage of users you want to be in the test group. Want to test a risky idea? Use user.id % 10 == 0 to only experiment on 10% of your users.
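The four properties above can be checked with a small sketch. The `in_test_group` helper and the sequential ids from 1 to 1000 are assumptions made for illustration; in the real code the check is just the one-line `if`.

```python
def in_test_group(user_id: int, modulus: int = 2) -> bool:
    """Return True when user_id lands in the test bucket (remainder 0)."""
    if modulus < 1:
        raise ValueError("modulus must be a positive integer")
    return user_id % modulus == 0


# Hypothetical user base with sequential ids 1..1000.
user_ids = range(1, 1001)

half = sum(in_test_group(uid) for uid in user_ids)
tenth = sum(in_test_group(uid, modulus=10) for uid in user_ids)

print(half)   # 500 users in a 50% test
print(tenth)  # 100 users in a 10% test
```

Because the assignment depends only on the id, calling the function again for the same user always gives the same answer, and nothing has to be stored.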

What about significance?

We use the normal approximation to Fisher’s exact test (in practice, a two-proportion z-test) to test for equality of two percentages. It’s easy!
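As a rough sketch of what that test can look like, here is a pooled two-proportion z-test using only the standard library. The function name, its arguments and the example counts are assumptions for illustration, not our production code or real results.

```python
import math


def two_proportion_z_test(successes_a, total_a, successes_b, total_b):
    """Return (z, two-sided p-value) for H0: both groups share one rate."""
    p_a = successes_a / total_a
    p_b = successes_b / total_b
    # Pooled rate under the null hypothesis that the groups are equal.
    pooled = (successes_a + successes_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0% or 100%")
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 200 of 1000 control users clicked, 250 of 1000 test users.
z, p_value = two_proportion_z_test(200, 1000, 250, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

Note that the totals are the number of users in each group, not the number of click events, so they have to come from a separate count of users per group. The normal approximation is only reasonable when each group has a fair number of clickers and non-clickers.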

Conclusion

It’s been 5 months and we’ve run over 50 split tests, killed some very expensive potential features, and seen a simple subject line optimization bump retention by 15%. Since then, we’ve built a more robust system, but that’s another post.

We’ve also stopped arguing for hours about features and now argue about which keyboard layout is better.

If you follow the steps above, you should be split-testing in less than a week. We were.

Yipit Django Blog
