Monday, 4 August 2014
50 African Leaders in the Nation's Capital
CRO Statistics: How to Avoid Reporting Bad Data
Posted: 03 Aug 2014 05:15 PM PDT
Posted by Craig Bradford

Without a basic understanding of statistics, you can often present misleading results to your clients or superiors. This can lead to underwhelming results when you roll out new versions of a page that on paper look like they should perform much better. In this post I want to cover the main aspects of planning, monitoring and interpreting CRO results so that when you do roll out new versions of pages, the results are much closer to what you would expect. I've also got a free tool to give away at the end, which does most of this for you.

Planning

A large part of running a successful conversion optimisation campaign starts before a single visitor reaches the site. Before starting a CRO test it's important to have:
Assuming you have a hypothesis, let's look at predicting how long a test should take.

How long will it take?

As a general rule, the less traffic your site gets and/or the lower the existing conversion rate, the longer it will take to get statistically significant results. There's a great tool by Evan Miller that I recommend using before starting any CRO project. By entering the baseline conversion rate and the minimum detectable effect (i.e. the minimum percentage change in conversion rate that you care about: 2%? 5%? 20%?), you can get an estimate of how much traffic you'll need to send to each version. Working backwards from the traffic your site normally gets, you can estimate how long your test is likely to take. When you arrive on the site, you'll see the following defaults. Notice the setting that allows you to swap between 'absolute' and 'relative'. Toggling between them will help you understand the difference, but as a general rule, people tend to speak about conversion rate increases in relative terms. For example, using a baseline conversion rate of 20%:
There's a huge difference in the sample size needed to detect each kind of change as well. In the absolute example above, 1,030 visits are needed for each branch. If you're running two test versions against the original, that looks like this:
Total: 3,090 visits needed. If you change that to relative, the numbers change drastically: 25,255 visits are needed for each version, a total of 75,765 visits. If your site only gets 1,000 visits per month and you have a baseline conversion rate of 20%, it's going to take you around 6 years to detect a significant relative increase in conversion rate of 5%, compared to only around 3 months for an absolute change of the same size. This is why the question of whether or not small sites can do CRO often comes up. The answer is yes, they can, but you'll want to aim higher than a 5% relative increase in conversions. For example, if you aim for a 35% relative increase (with a 20% baseline conversion rate), you'll only need 530 visits to each version. In summary, go big if you're a small site. Don't test small changes like button tweaks; test complete new landing pages, otherwise it's going to take you a very long time to get significantly better results.

Analytics

A critical part of understanding your test results is having appropriate tracking in place. At Distilled we use Optimizely, so that's what I'll cover today; fortunately, Optimizely makes testing and tracking really easy. All you need is a Google Analytics account that has a custom variable (custom dimension in Universal Analytics) slot free. For either Classic or Universal Analytics, begin by going to the Optimizely Editor, then clicking Options > Analytics Integration. Select enable and enter the custom variable slot that you want to use; that's it. For more details, see the help section on the Optimizely website. With Google Analytics tracking enabled, when you go to the appropriate custom variable slot in Google Analytics, you should see a custom variable named after the experiment. In the example below the client was using custom variable slot 5. This is a crucial step: while you can get by with just Optimizely goals (like setting a thank-you page as a conversion), they don't give you the full picture.
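Returning to the planning step above: the traffic estimates from Evan Miller's calculator can be sanity-checked with a standard two-proportion sample-size formula. This is a rough sketch in Python; it lands in the same ballpark as the tool (roughly a thousand visits per branch for the absolute example, and around 25x that for the relative one) but won't match it exactly, since the tool's derivation differs slightly.

```python
from statistics import NormalDist

def sample_size_per_branch(p_base, mde, relative=False, alpha=0.05, power=0.8):
    """Approximate visitors needed per branch of an A/B test.

    Standard normal-approximation formula for comparing two proportions;
    a ballpark figure, not an exact reproduction of any one calculator.
    """
    p1 = p_base
    p2 = p_base * (1 + mde) if relative else p_base + mde
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_b = NormalDist().inv_cdf(power)          # desired statistical power
    p_bar = (p1 + p2) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return num / (p1 - p2) ** 2

# 20% baseline, 5% absolute lift vs. 5% relative lift
print(round(sample_size_per_branch(0.20, 0.05)))
print(round(sample_size_per_branch(0.20, 0.05, relative=True)))
```

The relative case needs roughly 25 times as much traffic per branch, which is the whole reason the absolute/relative toggle matters so much when you plan a test.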
As well as measuring conversions, you'll also want to measure behavioral metrics. Using analytics allows you to measure not only conversions, but other metrics like average order value, bounce rate, time on site, secondary conversions, etc.

Measuring interaction

Another thing that's easy to measure with Optimizely is interaction on the page, such as clicking buttons. Even if you don't have event tracking set up in Google Analytics, you can still measure changes in how people interact with the site. It's not as simple as it looks, though. If you try to track an element in the new version of a page, you'll get an error message saying that no items are being tracked. See the example from Optimizely below. Ignore this message; as long as you've highlighted the correct button before selecting 'track clicks', the tracking should work just fine. See the help section on Optimizely for more details.

Interpreting results

Once you have a test up and running, you should start to see results in Google Analytics as well as Optimizely. At this point, there are a few things to understand before you get too disappointed or excited.

Understanding statistical significance

If you're using Google Analytics for conversion rates, you'll need something to tell you whether or not your results are statistically significant. I like this tool by KISSmetrics, which looks like this: It's easy to look at the above and celebrate your 18% increase in conversions; however, you'd be wrong to do so without checking significance. It's easier to explain what this means with an example. Let's imagine you have a pair of dice that we know are exactly the same. If you were to roll each die 100 times, you would expect to see each of the numbers 1 to 6 roughly the same number of times on both dice (which works out at around 17 times per side). Let's say on this occasion, though, we are trying to see how good each die is at rolling a 6. Look at the results below:
A simplistic way to think about statistical significance is that it's the chance that getting more 6s on the second die was just a fluke, rather than the die having been loaded in some way to roll 6s. This makes sense when we think about it. Given that out of 100 rolls we expect to roll a 6 around 17 times, if on the second die we rolled a 6 19 times out of 100, we could believe that we just got lucky. But if we rolled a 6 30 times out of 100 (76% more), we would find it hard to believe that we just got lucky and that the second die wasn't actually loaded. If you were to put these numbers into a statistical significance tool (a two-sided test), it would say that B performed better than A by 76% with 97% significance. In this usage, statistical significance is the complement of the P value. The P value in this case is 3%, and the complement is therefore 97% (100 - 3 = 97). This means there's a 3% chance that we'd see results this extreme if the dice were identical. Tools like Optimizely take the complement of the P value and display it as the 'chance to beat baseline'; in the example above, we would see a chance to beat baseline of 97%. Notice that I didn't say there's a 97% chance of B being 76% better; it's just that on this occasion the observed difference was 76%. If we were to throw each die 100 times again, we can be fairly confident we would see a real difference again, but it may or may not be as large as 76%. So, with that in mind, here is what we can accurately say about the dice experiment:
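The dice numbers above can be reproduced in a few lines. This is a sketch using a two-proportion z-test, a normal approximation; a dedicated significance tool may use a slightly different test, but it lands on the same ~3% P value:

```python
from statistics import NormalDist

def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = abs(p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(z))

# 17 sixes vs 30 sixes in 100 rolls of each die
p = two_proportion_p_value(17, 100, 30, 100)
print(round(p, 2))                     # ~0.03, i.e. ~97% significance
print(round((0.30 - 0.17) / 0.17, 2))  # observed relative uplift: 0.76
```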
Here's what we cannot say:
This still leaves us with the question of what we can expect to happen if we roll version B out. To answer it, we need confidence intervals.

Confidence intervals

Confidence intervals give us an estimate of how likely a change within a certain range is. To continue with the dice example, we saw an increase of 76%. Calculating confidence intervals allows us to say things like:
Note: these are relative ranges, i.e. 13% less than 17% at the low end and 166% greater than 17% at the high end. The three questions you might be asking at this point are:
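Those ranges can be computed directly. Here's a sketch of the confidence-interval calculation, using a normal approximation on the absolute difference divided by the baseline rate (it ignores uncertainty in the baseline itself, but reproduces the -13% to +166% range quoted above):

```python
from statistics import NormalDist

def relative_uplift_ci(p_a, p_b, n, confidence=0.99):
    """Confidence interval for the relative uplift of B over A,
    with n observations per branch. Normal approximation on the
    absolute difference, scaled by the baseline rate."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = (p_a * (1 - p_a) / n + p_b * (1 - p_b) / n) ** 0.5
    diff = p_b - p_a
    return (diff - z * se) / p_a, (diff + z * se) / p_a

# 17% vs 30% over 100 rolls per die, 99% confidence
lo, hi = relative_uplift_ci(0.17, 0.30, 100)
print(round(lo, 2), round(hi, 2))   # ~ -0.13 1.66
# with 10,000 rolls per die, the interval tightens dramatically
lo, hi = relative_uplift_ci(0.17, 0.30, 10_000)
print(round(lo, 2), round(hi, 2))
```

The widening and narrowing of this interval with sample size is exactly the effect discussed next.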
The only way we can reduce the range of the confidence intervals is to collect more data. To decrease the chance of the difference being less than 0 (we don't want to roll out a version that performs worse than the original), we need to roll the dice more times. Assuming the same rates for A (17%) and B (30%), look at the difference that increasing the sample size makes to the range of the confidence intervals. With a sample size of 100, we have a 99% confidence range of -13% to +166%. If we kept rolling the dice until we had a sample size of 10,000, the 99% confidence range looks much better: it's now between 67% better and 85% better. The point of showing this is that even if you have a statistically significant result, it's often wise to keep the test running until you have tighter confidence intervals. At the very least, I don't like to present results until the lower limit of the 90% interval is greater than or equal to 0.

Calculating average order value

Sometimes conversion rate on its own doesn't matter. If you make a change that makes 10% fewer people buy, but those that do buy spend 10x more money, then the net effect is still positive. To track this, we need to compare the average order value of the control to that of the test version. If you've set up the Google Analytics integration as I showed previously, this is very easy to do. If you go into Google Analytics, select the custom variable tab, then select the e-commerce view, you'll see something like:
It's great that people who saw version B appear to spend twice as much, but how do we know we didn't just get lucky? To find out, we need to do some more work. Luckily, there's a tool that makes this very easy, again made by Evan Miller: the two-sample t-test tool. To find out if the change in average order value is significant, we need a list of all the transaction amounts for version A and version B. The steps are:

1 - Create an advanced segment for version A and version B using the custom variable values.
2 - Apply the two segments you've just created individually, go to the transactions report under e-commerce, and download all transaction data to a CSV.
3 - Paste the data into the two-sample t-test tool.

The tool doesn't accept special characters like $ or £, so remember to remove those before pasting. As you can see in the image below, I have the version A data in the sample 1 area and the transaction values for version B in the sample 2 area. The output can be seen in the image below. Whether or not the difference is significant is shown below the graphs; in this case, the verdict was that the two samples were in fact significantly different. To find the size of the difference, look at the "d" value where it says "difference of means". In the example above, the transactions of people who saw the test version were on average $19 higher than those of people who saw the original.

A free tool for reading this far

If you run a lot of CRO tests, you'll find yourself using the above tools a lot. While they are all great tools, I like to have everything in one place. One of my colleagues, Tom Capper, built a spreadsheet which does all of the above very quickly. There are two sheets: conversion rate and average order value. The only data you need to enter in the conversion rate sheet is conversions and sessions; in the AOV sheet, just paste in the transaction values for both data sets. The conversion rate sheet calculates:
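For reference, the core of the two-sample comparison can also be sketched in Python. This is Welch's unequal-variance t-test with a normal approximation for the P value (adequate at the sample sizes a transactions export usually produces); the order values below are made up for illustration:

```python
from statistics import NormalDist, mean, variance

def welch_t_test(sample_1, sample_2):
    """Welch's two-sample t-test. Returns (difference of means,
    approximate two-sided p-value). The p-value uses a normal
    approximation to the t distribution, which is reasonable for
    the sample sizes a transactions export usually produces."""
    m1, m2 = mean(sample_1), mean(sample_2)
    se = (variance(sample_1) / len(sample_1)
          + variance(sample_2) / len(sample_2)) ** 0.5
    t = (m2 - m1) / se
    return m2 - m1, 2 * (1 - NormalDist().cdf(abs(t)))

# hypothetical transaction values for the control (A) and the test (B)
orders_a = [42, 55, 38, 61, 47, 50, 44, 58, 39, 52]
orders_b = [65, 72, 58, 80, 69, 74, 61, 77, 66, 70]
diff, p = welch_t_test(orders_a, orders_b)
print(round(diff, 2), round(p, 4))  # difference of means and p-value
```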
There's an extra field that I've found really helpful (working agency side) called "Chance of <=0 uplift". If, like the example above, you present results that have a potentially negative lower bound on a confidence interval:
The logical question a client is going to ask is: "What chance is there of the result being negative?" That's what this extra field calculates: the chance that, if we roll out the new version, the true difference is less than or equal to 0%. For the data above, the 99% confidence interval was -13% to +166%. The fact that the lower limit of the range is negative doesn't look great, but using this calculation, the chance of the difference being <=0% is only 1.41%. Given the potential upside, most clients would agree that this is a chance worth taking. You can download the spreadsheet here: Statistical Significance.xls. Feel free to say thanks to Tom on Twitter. This is an internal tool, so if it breaks, please don't send Tom (or me) requests to fix, upgrade or change it. If you want to speed this process up even more, I recommend transferring the spreadsheet into Google Docs and using the Google Analytics API to pull the data automatically. Here's a good post on how you can do that. I hope you've found this useful; if you have any questions or suggestions, please leave a comment. If you want to learn more about the numbers behind this spreadsheet and statistics in general, a blog post I'd recommend reading is Scientific method: Statistical errors.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don't have time to hunt down but want to read!
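That 1.41% figure can be reproduced with the same normal approximation the confidence intervals use: the chance of a <=0 uplift is the probability mass of the difference's sampling distribution that sits at or below zero. A minimal sketch:

```python
from statistics import NormalDist

def chance_of_no_uplift(p_a, p_b, n_a, n_b):
    """Probability that the true uplift of B over A is <= 0, assuming
    the observed difference in rates is normally distributed."""
    se = (p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b) ** 0.5
    return NormalDist().cdf(-(p_b - p_a) / se)

# the dice example: 17% vs 30% over 100 rolls of each die
print(round(chance_of_no_uplift(0.17, 0.30, 100, 100) * 100, 2))  # ~1.41
```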
You are subscribed to email updates from Moz Blog To stop receiving these emails, you may unsubscribe now. | Email delivery powered by Google |
Google Inc., 20 West Kinzie, Chicago IL USA 60610 |
Seth's Blog : Is authenticity authentic?
Is authenticity authentic?
Perhaps the only truly authentic version of you is just a few days old, lying in a crib, pooping in your pants.
Ever since then, there's been a cultural overlay, a series of choices, strategies from you and others about what it takes to succeed in this world (in your world).
And so it's all invented.
When you tell me that it would be authentic for you to do x, y or z, my first reaction is that nothing you do is truly authentic; it's all part of a long-term strategy for how you'll make an impact in the world.
I'll grant you that it's essential to be consistent, that people can tell when you shift your story and your work in response to whatever is happening around you, and particularly when you say whatever you need to say to get through the next cycle. But consistency is easier to talk about and measure than authenticity is.
The question, then, is what's the impact you seek to make, what are the changes you are working for? And how can you achieve that and still do work you're proud of?
More Recent Articles
- Short term, long term
- Pleasing a person who is not in the room
- A bigger logo?
- Trading favors
- This is ours
[You're getting this note because you subscribed to Seth Godin's blog.]
Don't want to get this email anymore? Click the link below to unsubscribe.
Email subscriptions powered by FeedBlitz, LLC, 365 Boston Post Rd, Suite 123, Sudbury, MA 01776, USA. |
Sunday, 3 August 2014
Mish's Global Economic Trend Analysis
- Top Gun Style Aerial Chicken With Russia Sends US Spy Plane Into Swedish Air Space Without Permission
- Steen Jakobsen Short Dax, Long Treasuries, Sees Major Buy Signal for Gold, Silver, Mining
- Florida Obamacare Blues
Posted: 03 Aug 2014 10:24 PM PDT

In an attempt to avoid Russian radar and a Russian fighter jet, a US Official Admits Spy Plane Flees Russian Jet, Radar; Ends Up Over Sweden. The Cold War aerial games of chicken portrayed in the movie "Top Gun" are happening in real life again, nearly 30 years later.

Questions of the Day

If you cannot wait for permission, are you where you are not supposed to be in the first place? Is this how stupid wars start? Or is that exactly what the US wants? How about both?

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com

Mike "Mish" Shedlock is a registered investment advisor representative for SitkaPacific Capital Management. Sitka Pacific is an asset management firm whose goal is strong performance and low volatility, regardless of market direction. Visit http://www.sitkapacific.com/account_management.html to learn more about wealth management and capital preservation strategies of Sitka Pacific.
Steen Jakobsen Short Dax, Long Treasuries, Sees Major Buy Signal for Gold, Silver, Mining

Posted: 03 Aug 2014 07:04 AM PDT

Steen Jakobsen, chief economist and CIO of Saxo Bank, is back from the Tour de France and summer holiday and says "it's time for status on macro view and a look into what rest of 2014 gives us". Steen shares his views in a Trading Floor post, Steen's Chronicle: Three things can't be hidden long: The sun, the moon and the truth (Buddha). This week saw US GDP rebound an impressive 4.0%, taking the run rate for GDP in 2014 to 2.3%, still shy of the ambitious 3.0% the consensus firmly believes in. Wall Street is busy selling strategies on how to hedge the coming hike in policy rates from the Fed, and we are, again, told how rates will explode. More in the report.

Brief synopsis of Steen's views: short the German DAX; long US Treasuries and German Bunds; a major buy signal coming up for gold and silver; the US dollar topping vs. the euro; energy firm. As a proxy for 10-year US Treasuries, Steen mentions IEF, the Barclays 7-10 year duration US Treasury ETF. The play appears to be intermediate-term, citing "inflation expectations" in the 4th quarter.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com
Posted: 02 Aug 2014 11:52 PM PDT

Florida has the Obamacare blues, as older citizens who previously had no health insurance demand more services than ever before after enrolling. As a result, Florida's largest health insurer, Florida Blue, is raising exchange rates by an average of 17.6 percent: "Florida Blue, the state's largest health insurer, is increasing premiums by an average of 17.6 percent for its Affordable Care Act exchange plans next year, company officials say."

Standard Practice

No need to worry; this is nothing new or unprecedented. Rates have been going up 10 percent a year as "standard practice". Let's do the compound math on a typical $300-$400 per month policy, assuming a midpoint of $350 per month and a "standard practice" hike of 10% a year. The chart appears shocking, but there is absolutely nothing to fear. As we all know, deficits don't matter, and besides, Obamacare will pick up the tab. If the tab grows unexpectedly, we can tax the rich and the poor, the young and the old (especially the young, who we already make overpay for insurance). And if that doesn't work, the Fed can simply print the money. The fallback options are so enormous, one can only wonder why healthcare is not free to everyone on the planet.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com
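The compound math in that paragraph (the chart from the original post) works out as a simple geometric growth table; a quick sketch:

```python
# A $350/month premium compounding at a 10% annual "standard practice" hike.
monthly = 350.0
for year in (0, 5, 10, 15, 20):
    print(year, round(monthly * 1.10 ** year, 2))
```

By year 10 the premium has already more than doubled, and by year 20 it is well over six times the starting amount.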
Seth's Blog : Short term, long term
Short term, long term
The best way to change long-term behavior is with short-term feedback.
The opposite is not true. We rarely change short-term behavior with long-term feedback.
That's why sanctions rarely work well in international politics, and why cigarette taxes are the best way to keep people from getting lung cancer.
Sure, intelligent adults should be smart enough to figure out the net present value of a lifetime of cigarette purchases, plus the long-term health costs. And some are. But not enough.
And students should be smart enough to realize that extra effort and expense in college might pay off in income or happiness in a few decades. And some are. But not enough.
If you want to reward (or punish) short-term behavior, don't do it down the road. Advances turn more heads than royalty streams do.
Saturday, 2 August 2014
Mish's Global Economic Trend Analysis
Posted: 02 Aug 2014 11:56 AM PDT

Thank President Bush, Vice President Dick Cheney, and President Obama for working together to create a how-to map of perfect foreign policy stupidity. The Telegraph has the details in Afghanistan has cost more to rebuild than Europe after Second World War.
What some see as "perfect stupidity" others (especially the military-industrial complex) see as a "job well done", complete with arms to the Taliban to ensure that the war on terrorism goes on and on.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com
It's Time for Congress to Help the Middle Class