Quick Guide to Scaling Your Authorship Testing with Screaming Frog
Posted: 28 Oct 2013 04:12 PM PDT
Posted by kanejamison

Nearly all of us have used Screaming Frog to crawl websites. Many of you have probably also used Google's Structured Data Testing Tool (formerly known as the Rich Snippet Testing Tool) to test your authorship setup and other structured data. This is a quick tutorial on how to combine these two tools to check your entire website for structured data such as Google Authorship and Rel="Publisher", along with various types of Schema.org markup.

The concept:
Google's structured data tester puts the URL you're testing right in its own URL. Here's an example:
We can take advantage of that URL structure to create a list of URLs we want to test for structured data markup, and process that list through Screaming Frog.

Why this is better than simply crawling your site to detect markup:
You could certainly crawl your site and use Screaming Frog's custom filters to detect things like rel="author" and ?rel=author within your own code. And you should. The approach in this guide, however, will tell you what Google is actually recognizing, which can help you detect errors in your implementation of authorship and other markup.

Disclaimer: I've encountered a number of cases where the Structured Data Testing Tool reported a positive result for authorship implementation, but authorship snippets in search results were not functioning. Upon further review, changing the implementation method resolved the issue. Also, authorship may not be granted or present for a particular Google+ user. As a result, it's important to note that the Structured Data Tester isn't perfect and will produce false positives, but it suits our need in this case: quickly testing a large number of URLs all at once.

Getting started
You're going to need a couple things to get started:
The video option
This short video tutorial walks through all eight steps outlined below. If you choose to watch the video, you can skip straight to the section titled "Four ways to expand this concept."
Steps 1, 2, and 3: Gather your list of URLs into the Excel template
You can find the full instructions inside the Excel template, but here's the simple 1-2-3 version of how to use the Excel template (make sure URL Tools or SEO Tools is installed before you open this file or you'll have to fix the formula):
Step 4: Copy all of the URLs in Column B into a .txt file
Now that Column B of your spreadsheet is filled with URLs that we'll be crawling, copy and paste that column into a text file so that there is one URL per line. This is the .txt file that we'll use in Screaming Frog's list mode.
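If you would rather script Steps 1 through 4 than use the spreadsheet, a minimal Python sketch might look like the one below. The tester prefix is a hypothetical placeholder, not the tool's real address, so substitute the prefix from the tool's own URL. Percent-encoding the page URL is my assumption about what the template's formula does. The function names are made up for illustration.

```python
from urllib.parse import quote

# Hypothetical placeholder: replace with the testing tool's real URL prefix.
TESTER_PREFIX = "https://tester.example/check?url="


def build_tester_urls(page_urls, prefix=TESTER_PREFIX):
    """Return one tester URL per non-empty page URL, preserving order."""
    tester_urls = []
    for page_url in page_urls:
        page_url = page_url.strip()
        if not page_url:
            continue  # skip blank spreadsheet cells
        # Percent-encode the page URL so its own "?" and "&" stay inside
        # a single query parameter value (an assumption, see lead-in).
        tester_urls.append(prefix + quote(page_url, safe=""))
    return tester_urls


def write_list_file(tester_urls, path):
    """Write one URL per line, ready to upload in list mode."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(tester_urls) + "\n")
```

Calling write_list_file(build_tester_urls(my_pages), "tester_urls.txt") would produce the same kind of one-URL-per-line file that Step 4 describes.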
Step 5: Open up Screaming Frog, switch it to list mode, and upload your file
Step 6: Set up Screaming Frog custom filters
Before we go crawling all of these URLs, it's important that we set up custom filters to detect specific responses from the Structured Data Testing Tool.
Since we're testing authorship for this example, here are the exact pieces of text that I'm going to tell Screaming Frog to track:
Just to be clear, here's the explanation for each piece of text we're tracking:
Step 7: Let 'er rip
At this point we're ready to start crawling the URLs. Out of respect for Google's servers and to avoid them disabling our ability to crawl URLs in this manner, you might consider adjusting your crawl rate to a slower pace, especially on large sites. You can adjust this setting in Screaming Frog by going to Configuration > Speed, and decreasing your current settings.

Step 8: Export your results in the Custom tab
Once the crawl is finished, go to the Custom tab, select each filter that you tested, and export the results.
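Once the per-filter exports exist, they can be folded into a single lookup of URL to matched filters. The sketch below is one hedged way to do that with Python's standard csv module. It assumes the first row of each export is the header row and that the URL column is named "Address"; both are assumptions, so check your own files and pass a different url_column if needed. The filter labels and sample rows are invented for the demonstration.

```python
import csv
import io
from collections import defaultdict


def urls_by_filter(exports, url_column="Address"):
    """Map each crawled URL to the set of filter labels whose export lists it.

    exports: dict of label -> open text file (or any iterable of CSV lines).
    url_column: assumed header of the URL column; adjust to match your export.
    """
    matches = defaultdict(set)
    for label, handle in exports.items():
        for row in csv.DictReader(handle):
            url = (row.get(url_column) or "").strip()
            if url:
                matches[url].add(label)
    return dict(matches)


# Invented sample data standing in for two exported files.
sample_filter_1 = io.StringIO("Address,Status Code\nhttp://t.example/1,200\n")
sample_filter_3 = io.StringIO(
    "Address,Status Code\nhttp://t.example/1,200\nhttp://t.example/2,200\n"
)
combined = urls_by_filter({"filter1": sample_filter_1, "filter3": sample_filter_3})
```

With real exports you would pass open file handles instead of the StringIO stand-ins, for example open("filter3.csv", newline="", encoding="utf-8").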
Wrapping it up
That's the quick and dirty guide. Once you export each CSV, you'll want to save them according to the filters you put in place. For example, my filter 3 was testing for pages that contained the phrase "Page does not contain authorship markup." So, I know that anything that is exported under Filter 3 did not return an authorship result in the Structured Data Testing Tool.

Four ways to expand this concept:

1: Use a proper scraper to pull data on multiple authors
Screaming Frog is an easy tool to do quick checks like the one described in this tutorial, but unfortunately it can't handle true scraping tasks for us. If you want to use this method to also pull data such as which author is being verified for a given page, I'd recommend redesigning this concept to work in Outwit Hub. John-Henry Scherck from SEOGadget has a great tutorial on how to use Outwit for basic scraping tasks that you should read if you haven't used the software before. For the more technical among us, there are plenty of other scrapers that can handle a task like this - the important part is understanding the process so you can use it in your tool of choice.

2: Compare authorship tests against ranking results and estimated search volume to find opportunities
Imagine you're ranking 3rd for a high-volume search term, and you don't have authorship on the page. I'm willing to bet it would be worth your time to add authorship to that page. Use hlookups or vlookups in Excel to compare data from three tabs: rankings, estimated search volume, and whether or not authorship is present on the page. It will take some data manipulation, but in the end you should be able to create a Pivot Table that filters out pages that already have authorship, and sorts the remaining pages by estimated search volume and current ranking. Note: I'm not suggesting you add Authorship to everything. Not every page should be attributed to an author (e-commerce product pages, for example).
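The lookup-and-pivot idea from item 2 can also be sketched outside Excel. The Python below is an illustrative sketch only: the three input dictionaries stand in for the three tabs, and their shapes (one keyword per page, a boolean authorship flag) are simplifying assumptions rather than anything the template provides.

```python
def authorship_opportunities(rankings, search_volume, has_authorship):
    """List pages lacking authorship, highest search volume first, then best rank.

    rankings: dict of page URL -> (keyword, current rank)
    search_volume: dict of keyword -> estimated searches
    has_authorship: dict of page URL -> bool from the testing-tool crawl
    """
    rows = []
    for url, (keyword, rank) in rankings.items():
        if has_authorship.get(url, False):
            continue  # already has authorship, so not an opportunity
        rows.append((url, keyword, rank, search_volume.get(keyword, 0)))
    # Sort by volume descending, then by rank ascending (rank 1 is best).
    rows.sort(key=lambda row: (-row[3], row[2]))
    return rows
```

Pages that should never carry authorship, such as product pages, would still need to be filtered out by hand or with an extra exclusion list before acting on the result.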
3: Use this method to test for other structured markup besides authorship
The Structured Data Testing Tool goes far beyond just authorship. Here's a short list of other structured markup you can test:
4: Blend this idea with Screaming Frog's other capabilities
There are a ton of ways to use Screaming Frog. Aichlee Bushnell at SEER did a great job of cataloging 55+ Ways To Use Screaming Frog. Go check out that post and I'm sure you can come up with additional ways to spin this concept into something useful.

Not to end on a dull note, but a couple of comments on troubleshooting: