The Bigfoot Update (AKA Dr. Pete Goes Crazy)
Posted: 11 Jun 2012 11:40 AM PDT
Posted by Dr. Pete

On June 5th of 2012, at around 9:00am Central Daylight Time, I spotted what appeared to be a major Google algorithm update in the wild. Unfortunately, I was alone… and the photos all turned out blurry… ok, and I had had a few beers. Still, that doesn’t mean it didn’t happen. This is the true story of an update that I honestly believe we missed, and why we’re just not as good at spotting them as we like to think.

The First Sighting

Let’s cut to the chase – this is an artist’s representation of what I saw that fateful morning (not a very talented artist, granted):
Please note that the Y-axis has been scaled to enhance differences. This is a graph of ten days of “Delta10” – I can’t fully explain what that is right now (come to MozCon to hear more), but the short version is that it’s a measure of 24-hour rankings fluctuations across a sample of top 10 Google results. The higher the Delta10, the more rankings changed over that 24 hours. Delta10 theoretically goes from 0-10, but the practical range is much smaller. For reference, the Delta10 on the morning of June 5th (which really indicates activity on June 4th) was 3.24. The 30-day average just prior to this was 2.29. Over 60 days, June 4th had the 2nd highest Delta10 on record – the record is currently held by the “Penguin” update (3.32). I spot-checked my data and confirmed it from a second tracking station – this wasn’t a fluke. So, I told the Twitterverse what I had seen…
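The exact Delta10 formula isn't spelled out in this post, so the following is only a hypothetical stand-in to make the idea of a 24-hour flux score concrete. It assumes a made-up scoring rule (count how many of the ten positions hold a different URL than the day before, then average across keywords). That rule happens to share Delta10's theoretical 0-10 range, but it is not the real metric. The function names and the dict-of-lists data layout are illustrative assumptions.

```python
from statistics import mean


def serp_flux(yesterday, today):
    """Hypothetical per-keyword flux: how many of the ten slots changed URL (0-10)."""
    # Pad short result lists with None so exactly ten slots are always compared.
    y = (list(yesterday) + [None] * 10)[:10]
    t = (list(today) + [None] * 10)[:10]
    return sum(1 for a, b in zip(y, t) if a != b)


def average_flux(snapshots_yesterday, snapshots_today):
    """Average the per-keyword flux over keywords tracked on both days.

    Each argument maps a keyword to its ordered list of top-10 URLs.
    """
    shared = snapshots_yesterday.keys() & snapshots_today.keys()
    if not shared:
        raise ValueError("no keywords in common between the two snapshots")
    return mean(
        serp_flux(snapshots_yesterday[k], snapshots_today[k]) for k in shared
    )


# Made-up data: one SERP swaps two results, the other is unchanged.
before = {"bjs menu": ["a", "b", "c"], "kohl store locator": ["x", "y", "z"]}
after = {"bjs menu": ["a", "c", "b"], "kohl store locator": ["x", "y", "z"]}
print(average_flux(before, after))
```

Under this assumed rule, a swap between two neighboring positions counts the same wherever in the top 10 it happens, and a move up counts the same as a move down.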
Replies ranged from “I didn’t see anything” to “Stop drinking, Dr. Pete!” to “Who are you?!” Clearly, the SEO community was unconvinced.

The Second Sighting

I was about to go back to the bottle, when a second sighting was confirmed by SERPmetrics:
While I don’t know the exact details of their tracking system (or how it compares to mine), it also measures ranking fluctuations. So, I asked the burning question: “How big was it?” and got back this:
If I was crazy, at least there were two of us. Was it an authentic Sasquatch, though, or just a hairy, naked dude taking a walk in the forest? It was time to go CSI on the data…

Clues in the SERPs

One of the plusses of my system is that it stores the top 10 URLs for the tracked keywords, so I can see how any given SERP changed. The tough part, as I’m learning, is that many SERPs change every day, so you have to learn to separate out “normal” volatility from unusual change. As I went through the SERPs that changed the most from 6/4 to 6/5, I came across one that seemed pretty quiet in the preceding week. This is the top 10 for “bjs menu” on the morning of 6/4:
I’ve color-coded the domains – for reference, there are nine root domains in this SERP. Here’s where it gets interesting – look at the data for the morning of 6/5:
The number of root domains in the top 10 dropped from nine to only five – BJsBrewhouse.com grew from two to three listings, and BJSRestaurants.com expanded from one to four listings. By itself, this could mean anything, but I started to see the same pattern repeated as I dug into more and more individual SERPs. Here’s another example – a search for “kohl store locator”. On 6/4, the top 10 included seven different domains:
On the morning of 6/5, only four domains were left standing:
Although this is only one result, there are a couple of interesting things to note here. First, this wasn’t simply a change in exact-match domain handling or brand power. Kohl’s sites didn’t expand, and power domains like Wikipedia lost ranking – meanwhile, WhitePages.com jumped from one listing to four. It’s also interesting to note that two previous, broad Kohls.com pages were replaced with specific landing pages. Of course, it’s possible that was just a change on the Kohl’s site and not a Google tweak.

Clues in Domain Diversity

Of course, these single SERPs are anecdotal at best. I needed a larger-scale metric, so I decided to run some numbers on domain diversity across the entire data set (1,000 SERPs = 10,000 URLs). Put simply, across the 10K URLs, how many domains were in play? To simplify the data processing, I treated each sub-domain as unique. Here’s what I saw over the ten days from 5/28 to 6/6 (in this case, 6/5 is the critical day):
Again, I’m cheating a little on the Y-axis here – for the record, domain diversity decreased 2.6% on June 5th, from 5,802 domains on 6/4 to 5,654 on 6/5. I included 6/6 to show that the change seems to have stuck, at least temporarily. While 2.6% isn’t a huge change, the numbers appear to have been very steady prior to 6/5, and this data does match the pattern shown in the example queries. It’s interesting to note that Google’s April Search Highlights included a change that was supposed to increase domain diversity in the SERPs: “More domain diversity. [launch codename "Horde", project codename "Domain Crowding"] Sometimes search returns too many results from the same domain. This change helps surface content from a more diverse set of domains.” So, I decided to run out the domain diversity calculation over the full data set (which goes back to 4/5). What I saw was the following…
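For anyone who wants to run the same kind of count on their own rank-tracking data, here is a minimal sketch of the domain diversity calculation as described: count the distinct hostnames across all tracked URLs, treating each sub-domain as unique. The function names and the sample URLs are illustrative assumptions, not the actual tooling or data behind the graphs. Only the 5,802 and 5,654 figures come from the post.

```python
from urllib.parse import urlsplit


def domain_diversity(urls):
    """Count distinct hostnames, treating every sub-domain as its own domain."""
    hosts = set()
    for url in urls:
        # .hostname is lowercased; it is None when the URL has no scheme/netloc.
        host = urlsplit(url).hostname
        if host:
            hosts.add(host)
    return len(hosts)


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Illustrative URLs only: two share a hostname, so three hosts are counted.
sample = [
    "http://www.kohls.com/a",
    "http://www.kohls.com/b",
    "http://m.kohls.com/",
    "https://www.whitepages.com/x",
]
print(domain_diversity(sample))

# The counts reported for 6/4 and 6/5.
print(round(percent_change(5802, 5654), 1))
```

Counting hostnames rather than root domains keeps the processing simple, at the cost of treating `www.` and `m.` versions of one site as separate entries.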
Keep in mind that more sub-domains across the 10K URLs equal more diversity. Not only can I find no clear evidence of Google’s “Horde” update in April, but the data suggests that domain diversity has steadily declined over the past two months. There are, in fact, two steep drop-offs. The second drop-off is the one being discussed in this post and shown in the previous graph. The first drop-off is the Penguin update.

Of course, it’s important to note that this is a hand-selected sample of 1,000 keywords and only measures the top 10 rankings. While the domain diversity patterns across the data set are interesting, they don’t necessarily reflect the entire population of Google’s rankings.

Entity Detection Changes

After my initial Tweet on 6/4, SEO patent guru Bill Slawski turned me on to a Google patent published on 5/31 (although it was filed back in February). Interpreting patents, let alone predicting if and when they enter the algorithm, is a tricky business, and I’m not 5% as adept at it as Bill, but the patent essentially covers how Google matches queries to entities. In particular, note Claim 28, which describes how a term could be matched to “a plurality of domains”. Or, as Bill noted:
This is highly speculative, and I don’t want to put words in Bill’s mouth or over-simplify a long conversation, but if this reflects a general change in capability on Google’s part, it does match the pattern somewhat. If Google could match an entity like Kohls to not only Kohls.com, but also its listings on WhitePages.com, the algorithm could give more weight to those non-brand domains, in theory.

Could I Be Wrong?

NO!! KHAN!!!! *shakes fist at sky*

Ok, yes, I could. At this point, I think the fluctuation data is reliable – I’ve confirmed it wasn’t a bug, and the SERPmetrics numbers back me up. Of course, fluctuations in the rankings are just one way of looking at things, and the tougher question is: What was the impact? If you look at the sample queries, you can see that many of the changes happened in the bottom 5 of the top 10. For my metric (Delta10), a change from #6 to #7 is the same as a change from #2 to #3, or, for that matter, a change from #7 to #6. Maybe fluctuations were high but occurred almost entirely in lower-impact positions.

There’s another possibility, though – maybe the fluctuations occurred in rankings that do matter (in the aggregate) but that most of us aren’t watching. How many of us take notice when a few long-tail keywords drop from #6 to #7? By themselves, they don’t mean much, but across hundreds of keywords, I suspect some sites experienced significant traffic changes.

Does Bigfoot Have a Brother?

Or possibly a sister – I’m not getting close enough to check. Just as I had almost finished this post, weekend monster sightings were off the charts. Although Google is officially confirming Panda 3.7 and an impact of <1% of queries, ranking fluctuations over the weekend were massive. Here’s an updated graph that includes June 4th:
The original “Bigfoot” (I owe Dave Snyder a hat tip for the name, even if he was kidding) was June 4th (Delta10 = 3.24), but that was followed up by an unusually active weekend, including a peak Saturday of Delta10 = 3.62. Keep in mind, Saturday not only topped the first Penguin update but also dwarfed Panda 3.5 and 3.6. My gut reaction is that something bigger happened here than just a Panda data refresh, but I honestly can’t prove that. Keep in mind that weekends are also normally pretty quiet, so relative to a typical Saturday/Sunday, these numbers are even more unusual. It’s possible that Panda 3.7 impacted more sites than 3.5 or 3.6, or that Google had to make adjustments on the fly, or that Panda 3.7 rolled out in addition to other updates.

Unfortunately, the timing of this post made a full analysis of Panda 3.7 tricky, and the pattern of change over the weekend isn’t clear, but I pulled a couple of numbers. First of all, the domain diversity drop I’ve documented leading up to June 4th has not reversed. June 8-10 was not simply a rollback of June 4, as far as I can tell. These were separate events. It is entirely likely that June 8-10 were related to each other (you can see a pretty clear ramp-up into the weekend).

It also appears that the weekend was not simply a matter of a big change that got reversed. Let’s say, for example, that every URL moved on Saturday and then moved back on Sunday to its original position. Each day would show high Delta10s, but the two-day change would be zero. Looking at Sunday vs. Friday, the two-day change here is 3.91 (compared to a 24-hour change of 3.44). Although multi-day changes can be very tough to interpret, the evidence suggests that the changes from this past week are here to stay, at least for a while.

What’s in a Name?

I’m almost sorry Panda 3.7 came along before this post went live, because it painfully illustrates a fundamental problem in SEO right now – we’re letting Google define what we pay attention to.
By my numbers, Penguin 1.0 was big, and Panda 3.7 was bigger, but many recent Panda updates have been barely blips on the radar (just above average), and I’ve tracked a half-dozen events in the past 60 days that are as big as or bigger than Panda 3.5 and 3.6. Google has stated publicly (under oath, in fact) that they made 516 updates in 2010. The numbers for 2011 and 2012 appear to be on par with that. On average, that’s 1.4 updates every day. We’re chasing two runaway animals while an entire zoo is stampeding toward Grandma’s house, and we’re too often doing bad SEO along the way.

I’m not asking you to chase the algorithm – my obsession shouldn’t become yours. I’m asking you to pay attention and stop waiting for official confirmation that something changed. Think long-term, pay attention to your traffic, and watch the numbers that matter to you. The picture of rapid change I’m painting doesn’t even count localization, personalization, rich results, vertical results, etc. You have to know your own niche, and if you want to succeed, you’d better watch it like a hawk. Don’t rely on Google to tell you which changes are important.