How to Build Your Own Mass Keyword Difficulty Tool
Posted: 02 Dec 2013 03:15 PM PST
Posted by MartinMacDonald
Despite keywords being slightly out of fashion, thanks to the whole (not provided) debacle, it remains the case that a large part of an SEO's work revolves around discovering opportunity and filling that same opportunity with content to rank. When you are focusing on smaller groups of terms, there are plenty of tools to help, the Moz Keyword Difficulty Tool being a great example.
These tools function by checking the top results for a given keyword and looking at various strength metrics to give you a snapshot of how tough they are to rank for. The problem, though, is that these tools operate on the fly, and generally only allow you to search for a small number of keywords at any one time. The Moz tool, for instance, limits you to 20 keywords.
But I need to check 100,000 keywords!
By the end of this tutorial you will be able to visualize keyword difficulty data in a couple of ways, either by keyword:
Or by keyword type:
Or by category of keyword, spliced by specific position in the results:
So what do we need to do?
All keyword difficulty tools work in the same way when you break them down. They look at ranking factors for each result in a keyword set, and sort them. It's that simple. The only thing we need to do is work out how to perform each step at scale:
Step 1: Get URLs
My preference for scraping Google is using Advanced Web Ranking to get the ranking results for large sets of keywords. Quite a few companies offer software for this service (including Moz), but the problem with this approach is that costs spiral out of control when you are looking at hundreds of thousands of keywords. Once you have added your keyword set, run a ranking report of the top 10 results for the search engine of your choice. Once it's complete you should see a screen something like this:
The next step is to get this data out of Advanced Web Ranking and into Excel, using a "Top Sites" report, in CSV format (The format is important! If you choose any other format it makes manipulating the data much tougher):
This presents us with a list of keywords, positions, and result URLs:
So now we can start harvesting some SEO data on each one of those results! My preference is to use the fantastic Niels Bosma Excel Plugin and the MajesticSEO API to access their Citation Flow metric. Equally, though, you could use the SEOgadget Excel tool alongside the Moz API. I haven't tested that thoroughly enough, but it should give you pretty similar results if you are more used to using them.
Step 2: Analyze results
Now that we have a nice result set of the top 10 results for your keyword list, it's time to start pulling in SEO metrics for each of those to build some actionable data! My preference is to use the Niels Bosma Excel Plugin, as it's super easy and quick to pull the data you need directly into Excel, where you can start analyzing the information and building charts. If you haven't already done so, you should start by downloading and installing the plugin available here (note: it's for Windows only, so if you are a Mac user like me, you'll need to use Parallels or another virtual machine).
In the column adjacent to your list of URLs you simply need to use the formula:
=MajesticSEOIndexItemInfo(C2,"CitationFlow","fresh",TRUE)
This formula gives you the CitationFlow number for the URL in cell C2. Obviously, if your sheet is formatted differently, then you'll need to update the cell reference. Once you see the CitationFlow appear in that cell, just copy it down to fill the entire list. If you have lots of keywords, right now would be a great time to go grab a coffee, as it can take some time depending on your connection and the number of results you want. Now you should be looking at a list something like this:
Which allows us to start doing some pretty incredible keyword research!
Step 3: Find opportunity
The first thing that you probably want to do is look at individual keywords and find the ranking opportunity in those. This is trivially easy to do as long as you are familiar with Excel pivot tables. For a simple look, just create a pivot of the average citation score of each keyword. The resulting table creator wizard will look something like this:
Of course, you can now visualize the data just by creating a simple chart. If we apply the above data to a standard bar chart, you will begin to see the kind of actionable data we can build:
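For anyone who would rather script the pivot than build it in Excel, here is a minimal sketch of the same per-keyword average in Python. It assumes the export has been saved with a Citation Flow column alongside keyword, position and URL; the column names and the sample numbers are invented purely for illustration and are not taken from Advanced Web Ranking or Majestic.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample in the shape described above: keyword, position, URL,
# plus the Citation Flow value pulled in for each URL. Column names and
# numbers are made up for illustration.
SAMPLE_CSV = """keyword,position,url,citation_flow
blue widgets,1,http://example.com/a,40
blue widgets,2,http://example.org/b,30
brown widgets,1,http://example.net/c,20
brown widgets,2,http://example.com/d,10
"""


def average_by(rows, key):
    """Group rows by the given column and average citation_flow per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(float(row["citation_flow"]))
    return {name: mean(values) for name, values in groups.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

by_keyword = average_by(rows, "keyword")
by_position = average_by(rows, "position")

# Lowest average first: the keywords whose current top results are weakest.
for keyword, score in sorted(by_keyword.items(), key=lambda kv: kv[1]):
    print(f"{keyword}: {score:.1f}")
```

The helper is deliberately generic: passing a different column name (position here, or a category column if you add one) gives the same kind of average for another grouping.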
This is just the beginning, though! If you create a pivot chart across a large dataset and look at the average citation score for each position, you can see interesting patterns develop. This example is looking at a dataset of 52,000 keywords, and taking the average score of each site appearing in each position in the top 10 results:
As you can see, across a large dataset there is a really nice degradation of strength in the top 10 results, a real vindication that the data we are looking at is rational and is a good indicator of how strong you need to be to rank a given page (providing the content is sufficient and focused enough). You really want to splice the data into categories at this stage, to identify the areas of quickest opportunity and focus on building content and links towards the areas where you are likely to earn traffic. The below chart represents a comparison of three categories of keywords, sorted by the average Citation of the results in each category:
From this we can see that of the three keyword categories, we are likely to rank higher up for keywords in the "brown widgets" category. Having said that, though, we are also able to rank lower down the page in the "blue widgets" category, so if that has significantly more traffic it might prove a better investment of your time and energy.
There you go!
We have created a homebrew keyword difficulty tool, capable of analyzing hundreds of thousands of URLs to mine for opportunity and guide your content and linkbuilding strategies! There is so much you can do with this data if you put your mind to it. True, scraping Google's results is, strictly speaking, against their Terms of Service, but they have a habit of using our data, so let's turn the tables on them for a change!
Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don't have time to hunt down but want to read!
Panic Stations! A Case Study on How to Handle an Important Page Disappearing from Google
Posted: 02 Dec 2013 02:33 AM PST
Posted by steviephil
This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author's views are entirely his or her own and may not reflect the views of Moz, Inc.
Picture the scene...
You wake up, grab a shower, throw on your Moz t-shirt (and other clothes, presumably...), boil the kettle, pour yourself a cup of coffee, switch on the ol' laptop, let your daily rank checks complete and then slowly run through them one by one... ...Yep... ...Yep... ...Ooo, that's nice... ...Uh-huh... ...Yes! Great jump there!... ...Yep... ...Ye- Wait, hold on... What? Lots of red, all across the board? Rankings have either dropped multiple pages or dropped out of the top 100 results entirely?! Uh-oh. It's gonna be a looong day.... This happened to me recently with one of my clients. Their homepage - their main page as far as rankings were concerned - had mysteriously vanished from Google's index overnight, taking with it a lot of page one rankings, as you can see from the world's saddest and perhaps most unnecessary GIF image below:
This was also the first time that it'd happened to me. Granted, I've consulted on this type of thing before, but usually when it's happened to someone and they approach me afterwards asking what's happened. However, this was the first instance where I was discovering it for myself and it was happening under my watch, affecting one of my clients. This post runs through the steps that I took to resolve the issue. I acted methodically yet swiftly, and in doing so managed to get the homepage back in Google's index (and - with it - its former rankings) in less than 12 hours.
I accept that this is one of those articles where you probably won't even need it until it happens to you. To be honest, I was in that exact situation - I pretty much knew what to do, but I was still a bit like "OMG OMG OMG, whowhatwherewhenwhy?!" in trying to find an article to just double-check that I was doing everything I could be doing and wasn't overlooking anything obvious. So... Are you ready? Here we go!
Check if it's just that page or all pages on the site
I primarily use Link Assistant's Rank Tracker (with Trusted Proxies) for my rank checking needs, with Moz PRO's rank checking as a backup and second opinion. Rank Tracker allows a 'URL Found' column, which revealed something to me instantly: other pages were still ranking, just not the homepage. Additionally, where a ranking had seen a drop of a few pages (but was still ranking within the top 10 pages/100 results), a different page was ranking instead - in my client's case, it was things like the Services, Testimonials and Contact pages. This suggested to me that it was just the homepage that was affected - but there was still a way that I could find out to be sure...
Use the 'site:' operator to check if it's still in Google's index
My next step was to use Google's 'site:' operator (see #1 here) on the domain, to see whether the homepage was still in Google's index. It wasn't - but all of the site's other pages were. Phew...
Well at least it wasn't site-wide! Even though I had a feeling that this would be the case based on what Rank Tracker was saying, it was still important to check, just in case the homepage was still ranking but had been devalued for whatever reason. Now that I knew for sure that the homepage was gone from Google, it was time to start investigating what the actual cause might be...
Check 1) Accidental noindexing via the meta noindex tag
In my experience, this is usually what's responsible when something like this happens... Given that the option to noindex a page is often a tick-box in most CMSs these days, it's easy enough to do. In fact, one of the times I looked into the issue for someone, this was the cause - I just told them to untick the box in WordPress. In order to check, bring up the page's source code and look for this line (or something similar):
<meta name="robots" content="noindex">
(Hit Ctrl + F and search for "noindex" if it's easier/quicker.) If you find this code in the source, then chances are that this is responsible. If it's not there, onto the next step...
Check 2) Accidental inclusion in the site's robots.txt file
It seems to be a somewhat common myth that robots.txt can noindex a page - it actually tells search engines not to crawl a page, so it'd only be true if the page had never actually appeared in Google's index in the first place (e.g. if it were a brand new site). Here's more info if you're interested. To be honest though, given what had happened, I didn't want to assume that this wasn't the cause and therefore I thought it would be best just to check anyway. But alas... The site's robots.txt file hadn't changed one iota. Onto step 3...
Check 3) Penalty checks
Given that this was my client, I was already familiar with its history, and I was already adamant that a penalty wasn't behind it. But again, I wanted to do my due diligence - and you know what they say when you assume...! I jumped into Google Webmaster Tools and looked at the recently added Manual Actions tab. Unsurprisingly: "No manual webspam actions found." Good good. However, let's not rule out algorithmic penalties, which Google doesn't tell you about (and oh lordy, that's caused some confusion). As far as Pandas were concerned, there was no evidence of accidental or deliberate duplicate content either on the site or elsewhere on the Web. As for those dastardly Penguins, given that I'm the first SEO ever to work on the site and I don't build keyword anchor text links for my clients, the site has never seen any keyword anchor text, let alone enough to set off alarm bells. Following these checks, I was confident that a penalty wasn't responsible.
Check 4) Remove URLs feature in Google Webmaster Tools
Another check while you're in your Webmaster Tools account: go to Google Index > Remove URLs and check that the page hasn't been added as a removal request (whether by accident or on purpose). You never know... It's always best to check. Nope... "No URL removal requests" in this case. It was at this point that I was starting to think: "what the hell else could it be?!"
Check 5) Accidental 404 code
On the day that this happened, I met up with my good friends and fellow SEOs Andrew Isidoro (@Andrew_Isidoro) and Ceri Harris of Willows Finance for a drink and a bite to eat down the pub. I ran this whole story by them along with what I'd done so far, and Andrew suggested something that I hadn't considered: although extremely unlikely, what if the homepage was now showing up as a 404 (Not Found) code instead of a 200 (OK) code? Even if the page is live and performing normally (to the visitor), a 404 code would tell Google that that page "don't live here no more" (to quote the mighty Hendrix) and Google would remove it accordingly. Again, it was worth checking, so I ran it past SEO Book's HTTP header checker tool. The verdict: 200 code. It was a-OK (pun fully intended - it's a good thing that I'm an SEO and not a comedian...) Ok, so now what?
Testing the page in Google Webmaster Tools
Now it was time to ask the big boss Googly McSearchengineface directly: what do you make of the page, oh mighty one? In order to do this, go to Google Webmaster Tools, click on the site in question and select Crawl > Fetch as Google from the side-menu. You should see a screen like this:
Simply put the affected page(s) into it (or leave it blank if it's the homepage) and see what Google makes of them. Of course, if it's "Failed," is there a reason why it's failed? It might also help to give you an idea about what could be wrong...
Asking Google to (re)index the page
Once you have done the above in GWT, you're given this option if Google can successfully fetch the page:
I decided to do just that: ask Google to (re)submit the page to its index. At this point I was confident that I had done pretty much everything in my power to investigate and subsequently rectify the situation. It was now time to break the news, by which I mean: tell the client...
Inform the client
I thought it best to tell the client after doing all of the above (except for the 404 check, which I actually did later on), even if it was possible that the page might recover almost immediately (which it did in the end, pretty much). Plus I wanted to be seen as proactive, not reactive - I wanted to be the one to tell him, not for him to be the one finding out for himself and asking me about it... Here's the email that I sent:
Hi [name removed],
I just wanted to bring your attention to something. I conduct daily rank checks just to see how your site is performing on Google on a day-to-day basis, and I've noticed that your homepage has disappeared from Google.
Usually this is the result of a) accidental de-indexation or b) a penalty, but I have checked the usual suspects/causes and I see no sign of either of those occurring. I have checked in your Webmaster Tools account and Google can successfully read/crawl the page, so no problems there. I have taken appropriate steps to ask Google to re-index the page.
I've done all that I can for now, but if we do not see everything back to normal in the next couple of days, I will continue to research the issue further. It's likely the case that it will recover of its own accord very soon. Like I say, I've checked the usual signs/causes of such an issue and it doesn't appear to be the result of any of those.
Just to check, have you or your web designer made any changes to the website in the last couple of days/weeks? If so, could you please let me know what you have done?
I know it's not an ideal situation, but I hope you can appreciate that I've spotted the issue almost immediately and have taken steps to sort out the issue. If you have any questions about it then please do let me know. In the meantime I will keep a close eye on it and keep you posted with any developments.
(Note: In this instance, my client prefers email contact. You may find that a phone call may be better suited, especially given the severity of the situation - I guess it will be a judgement call depending on the relationship that you have with your client and what they'd prefer, etc.) He took it well. He hadn't noticed the drop himself, but he appreciated me notifying him, filling him in on the situation and explaining what action I had taken to resolve the issue.
* Recovery! *
Later on the same day in the evening, I did another quick check. To my surprise, the homepage was not only back in Google, but the rankings were pretty much back to where they once were. PHEW! I say "surprised" not because of my ability to pull it off, but because of how quickly it'd happened - I expected that it might've taken a few days maybe, but not a mere few hours. Oh well, mustn't complain...!
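Checks 1, 2 and 5 above are mechanical enough to script, which helps when you are in "OMG OMG OMG" mode and don't trust yourself to eyeball page source. What follows is only a rough sketch using Python's standard library, not a replacement for the tools mentioned in this post: the function names are my own invention, and the robots.txt check assumes "Googlebot" as the user agent.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collects the content of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            self.directives.append(attr.get("content", "").lower())


def has_noindex(html):
    """Check 1: True if any robots meta tag carries a noindex directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return any("noindex" in directive for directive in finder.directives)


def blocked_by_robots(robots_txt, url, user_agent="Googlebot"):
    """Check 2: True if the given robots.txt text disallows crawling the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def fetch_status(url):
    """Check 5: return the HTTP status code for a URL.

    urlopen follows redirects, so this reports the final status only.
    """
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


# Offline demonstration with made-up inputs (no network access needed).
sample_html = '<html><head><meta name="robots" content="noindex"></head></html>'
sample_robots = "User-agent: *\nDisallow: /private/\n"
print(has_noindex(sample_html))
print(blocked_by_robots(sample_robots, "http://example.com/"))
```

To run it against a live site, you would pass the page source and robots.txt contents you have downloaded into the first two functions, and call fetch_status with the homepage URL. Anything other than 200 from the last one deserves a closer look.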
The real (possible) cause...
So what did cause the deindexation? Well, another suggestion that came from Andrew while we were down the pub that I'd stupidly overlooked: downtime! It could've been an unfortunate and unlucky coincidence that Google happened to re-crawl the page exactly when the site had gone down. I hadn't added the site to my Pingdom account before all of this had happened (something that I have since rectified), so I couldn't know for sure. However, the site went down again a day or so later, which made me wonder if downtime was responsible after all... Even so, I advised the client that if this was a common occurrence, he should maybe consider switching hosting providers to someone more reliable, in order to reduce the chance of this happening all over again...
Preparing yourself for when it happens to you or your clients
In order to make sure that you're fully on top of a situation like this, make sure that you're carrying out daily rank checks and that you're quickly checking those rank checks, even if it's a quick once-over just to make sure that nothing drastic has happened in the last 24 hours. It's fair to say that if I hadn't done so, I might not have realised what had happened for days and therefore might not have rectified the situation for days, either. Also, having a 'URL Found' column in addition to 'Ranking Position' in your rank checking tool of choice is an absolute must - that way you can see if it's a particular page that's affected if different pages are now the highest-ranking pages instead. Anyway, I hope that this case study/guide has been useful, whether you're reading it to brush up ready for when the worst happens, or whether the worst is happening to you right now (in which case I feel for you, my friend - be strong)...! Also, if you'd do anything differently to what I did or you think that I've missed a pivotal step or check, please let me know in the comments below!
Did you like the comic drawings?
If so, check out Age of Revolution, a new comic launched by Huw (@big_huw) & Hannah (@SpannerX23). Check them out on Facebook, Twitter and Ukondisplay.com (where you can pick up a copy of their first issue). Their main site - Cosmic Anvil - is coming soon... I'd like to say a massive thanks to them for providing the drawings for this post, which are simply and absolutely awesome, I'm sure you'll agree!