Wednesday, 1 July 2015

Damn Cool Pics


Revealing Snapshots That Show How Much Really Changes over Time

Posted: 01 Jul 2015 02:54 PM PDT

Lies All Web Designers Tell Their Clients

Posted: 01 Jul 2015 02:41 PM PDT

There's a big difference between what web designers tell you and what they actually mean.

Big Data, Big Problems: 4 Major Link Indexes Compared - Moz Blog

Big Data, Big Problems: 4 Major Link Indexes Compared

Posted by russangular

Given this blog's readership, chances are good you will spend some time this week looking at backlinks in one of the growing number of link data tools. We know backlinks continue to be one of the most important parts of Google's ranking algorithm, if not the most important. We tend to take these link data sets at face value, though, in part because they are all we have. But when your rankings are on the line, is there a better way to work out which data set is the best? How should we go about assessing link indexes like Moz, Majestic, Ahrefs and SEMrush for quality? Historically, there have been 4 common approaches to this question of index quality...

  • Breadth: We might choose to look at the number of linking root domains any given service reports. We know that the number of referring domains correlates strongly with search rankings, so it makes sense to judge a link index by how many unique domains it has discovered and indexed.
  • Depth: We also might choose to look at how deep the web has been crawled, looking more at the total number of URLs in the index, rather than the diversity of referring domains.
  • Link Overlap: A more sophisticated approach might count the number of links an index has in common with Google Webmaster Tools.
  • Freshness: Finally, we might choose to look at the freshness of the index. What percentage of links in the index are still live?
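
To make the four measures concrete, here is a minimal sketch of how each could be computed for one target site. The data layout is an assumption made for illustration: every link is a (linking root domain, linking URL) tuple, and the function name index_metrics is hypothetical rather than taken from any of the tools mentioned.

```python
def index_metrics(index_links, gwt_links, live_links):
    """Compute breadth, depth, overlap and freshness for one link index.

    Each argument is a set of (linking_root_domain, linking_url) tuples
    (an assumed layout, not any provider's real export format):
      index_links - links the index reports for the target site
      gwt_links   - links Google Webmaster Tools reports for the same site
      live_links  - links confirmed to still exist when re-checked
    """
    domains = {domain for domain, _ in index_links}
    urls = {url for _, url in index_links}
    if index_links:
        freshness = len(index_links & live_links) / len(index_links)
    else:
        freshness = 0.0
    return {
        "breadth": len(domains),                     # unique linking root domains
        "depth": len(urls),                          # unique linking URLs
        "overlap": len(index_links & gwt_links),     # links shared with GWT
        "freshness": freshness,                      # share of links still live
    }


index_links = {
    ("a.com", "a.com/1"),
    ("a.com", "a.com/2"),
    ("b.com", "b.com/1"),
    ("c.com", "c.com/x"),
}
gwt_links = {("a.com", "a.com/1"), ("b.com", "b.com/1"), ("d.com", "d.com/1")}
live_links = {("a.com", "a.com/1"), ("a.com", "a.com/2"), ("c.com", "c.com/x")}

print(index_metrics(index_links, gwt_links, live_links))
```

None of these four numbers says anything about whether the index weights sites the way Google does, which is the gap the rest of this post looks at.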

There are a number of really good studies (some newer than others) using these techniques that are worth checking out when you get a chance:

  • BuiltVisible analysis of Moz, Majestic, GWT, Ahrefs and Search Metrics
  • SEOBook comparison of Moz, Majestic, Ahrefs, and Ayima
  • MatthewWoodward study of Ahrefs, Majestic, Moz, Raven and SEO Spyglass
  • Marketing Signals analysis of Moz, Majestic, Ahrefs, and GWT
  • RankAbove comparison of Moz, Majestic, Ahrefs and Link Research Tools
  • StoneTemple study of Moz and Majestic

While these are all excellent at addressing the methodologies above, they all share one limitation. They miss one of the most important metrics we need to determine the value of a link index: proportional representation to Google's link graph. So here at Angular Marketing, we decided to take a closer look.

Proportional representation to Google Search Console data

So, why is it important to determine proportional representation? Many of the most important and valued metrics we use are built on proportional models. PageRank, MozRank, CitationFlow and Ahrefs Rank are proportional in nature. The score of any one URL in the data set is relative to the other URLs in the data set. If the data set is biased, the results are biased.

A Visualization

Link graphs are biased by their crawl prioritization. Because there is no full representation of the Internet, every link graph, even Google's, is a biased sample of the web. Imagine for a second that the picture below is of the web. Each dot represents a page on the Internet, and the dots surrounded by green represent a fictitious index by Google of certain sections of the web.

Of course, Google isn't the only organization that crawls the web. Other organizations like Moz, Majestic, Ahrefs, and SEMrush have their own crawl prioritizations which result in different link indexes.

In the example above, you can see different link providers trying to index the web like Google. Link data provider 1 (purple) does a good job of building a model that is similar to Google. It isn't very big, but it is proportional. Link data provider 2 (blue) has a much larger index, and likely has more links in common with Google than link data provider 1, but it is highly disproportional. So, how would we go about measuring this proportionality? And which data set is the most proportional to Google?
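
A toy calculation shows the difference between "more links in common" and "more proportional". This is only an illustration with made-up numbers. It uses total variation distance between link shares as a stand-in measure, which is not the method used in the study, and the section names and counts are hypothetical.

```python
def shares(counts):
    """Turn raw link counts into each section's share of the total.

    Assumes at least one positive count, so the total is not zero.
    """
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


def disproportion(reference, sample):
    """Total variation distance between two share distributions.

    0.0 means identical proportions; 1.0 means no overlap at all.
    """
    ref, smp = shares(reference), shares(sample)
    keys = set(ref) | set(smp)
    return 0.5 * sum(abs(ref.get(k, 0.0) - smp.get(k, 0.0)) for k in keys)


def links_in_common(reference, sample):
    """Upper bound on shared links: the smaller count in each section."""
    return sum(min(reference[k], sample.get(k, 0)) for k in reference)


# Hypothetical link counts per section of the web.
google = {"news": 40, "blogs": 40, "forums": 20}
provider_1 = {"news": 4, "blogs": 4, "forums": 2}       # small but proportional
provider_2 = {"news": 100, "blogs": 20, "forums": 80}   # large but skewed

for name, data in (("provider 1", provider_1), ("provider 2", provider_2)):
    print(name, links_in_common(google, data), round(disproportion(google, data), 3))
```

In this toy data, provider 2 shares far more links with Google, yet provider 1 matches Google's proportions exactly.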

Methodology

The first step is to determine a measurement of relativity for analysis. Google doesn't give us very much information about their link graph. All we have is what is in Google Search Console. The best source we can use is referring domain counts. In particular, we want to look at what we call referring domain link pairs. A referring domain link pair would be something like ask.com->mlb.com: 9,444 which means that ask.com links to mlb.com 9,444 times.
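
As a sketch of how such a pair might be represented in code, the snippet below parses the "source->target: count" notation used above into a small record. The LinkPair type and parse_link_pair function are hypothetical names chosen for illustration; Google Search Console does not export data in this exact form.

```python
import re
from typing import NamedTuple


class LinkPair(NamedTuple):
    """One referring domain link pair: source links to target `count` times."""
    source: str
    target: str
    count: int


# Matches text such as "ask.com->mlb.com: 9,444".
_PAIR_RE = re.compile(r"^\s*(\S+?)\s*->\s*(\S+?)\s*:\s*([\d,]+)\s*$")


def parse_link_pair(text):
    """Parse 'source->target: count' (thousands separators allowed)."""
    match = _PAIR_RE.match(text)
    if match is None:
        raise ValueError(f"not a referring domain link pair: {text!r}")
    source, target, count = match.groups()
    return LinkPair(source.lower(), target.lower(), int(count.replace(",", "")))


pair = parse_link_pair("ask.com->mlb.com: 9,444")
print(pair.source, pair.target, pair.count)
```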

Steps

  1. Determine the root linking domain pairs and values to 100+ sites in Google Search Console
  2. Determine the same for Ahrefs, Moz, Majestic Fresh, Majestic Historic, SEMrush
  3. Compare the referring domain link pairs of each data set to Google, assuming a Poisson Distribution
  4. Run simulations of each data set's performance against each other (e.g., Moz vs. Majestic, Ahrefs vs. SEMrush, Moz vs. SEMrush, etc.)
  5. Analyze the results
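
The steps above leave the statistical details open, so the following is only one possible reading of steps 3 and 4, not the analysis actually run for this study. It assumes that a provider's counts, rescaled to Google's total, act as Poisson rates for Google's observed counts, and that a head-to-head comparison is a simple per-site win count rather than a full simulation. The function names, the smoothing constant and the sample numbers are all hypothetical.

```python
import math


def poisson_log_pmf(k, rate):
    """Log probability of observing k events under a Poisson(rate) model."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def proportionality_score(google_counts, provider_counts, smoothing=0.5):
    """Log-likelihood of Google's counts given the provider's proportions.

    Both arguments map a referring domain to its link count for one site.
    Provider counts are smoothed (so missing domains get a non-zero rate)
    and rescaled to Google's total. Higher scores mean closer proportions.
    """
    total_google = sum(google_counts.values())
    if total_google == 0:
        return 0.0  # nothing to compare against
    smoothed = {d: provider_counts.get(d, 0) + smoothing for d in google_counts}
    scale = total_google / sum(smoothed.values())
    return sum(
        poisson_log_pmf(google_counts[d], smoothed[d] * scale)
        for d in google_counts
    )


def head_to_head(google_by_site, provider_a, provider_b):
    """Count the sites on which each provider is closer to Google."""
    wins_a = wins_b = 0
    for site, google_counts in google_by_site.items():
        score_a = proportionality_score(google_counts, provider_a.get(site, {}))
        score_b = proportionality_score(google_counts, provider_b.get(site, {}))
        if score_a > score_b:
            wins_a += 1
        elif score_b > score_a:
            wins_b += 1
    return wins_a, wins_b


# Hypothetical counts for a single site.
google_by_site = {"mlb.com": {"ask.com": 90, "cnn.com": 10}}
small_index = {"mlb.com": {"ask.com": 9, "cnn.com": 1}}
large_index = {"mlb.com": {"ask.com": 500, "cnn.com": 500}}

print(head_to_head(google_by_site, small_index, large_index))
```

Because the provider's counts are rescaled before scoring, the size of an index does not help it here; only the shape of its distribution matters.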

Results

When placed head-to-head, there seem to be some clear winners at first glance. In head-to-head, Moz edges out Ahrefs, but across the board, Moz and Ahrefs fare quite evenly. Moz, Ahrefs and SEMrush seem to be far better than Majestic Fresh and Majestic Historic. Is that really the case? And why?

It turns out there is an inverse relationship between index size and proportional relevancy. This might seem counterintuitive: shouldn't the bigger indexes be closer to Google? Not exactly.

What does this mean?

Each organization has to create a crawl prioritization strategy. When you discover millions of links, you have to prioritize which ones you might crawl next. Google has a crawl prioritization, and so do Moz, Majestic, Ahrefs and SEMrush. There are lots of different things you might choose to prioritize...

  • You might prioritize link discovery. If you want to build a very large index, you could prioritize crawling pages on sites that have historically provided new links.
  • You might prioritize content uniqueness. If you want to build a search engine, you might prioritize finding pages that are unlike any you have seen before. You could choose to crawl domains that historically provide unique data and little duplicate content.
  • You might prioritize content freshness. If you want to keep your search engine recent, you might prioritize crawling pages that change frequently.
  • You might prioritize content value, crawling the most important URLs first based on the number of inbound links to that page.
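
A small sketch can show how the choice of priority changes what gets crawled. It runs the same priority-queue crawler over a made-up six-page graph with two of the strategies above: content value (most inbound links first) and link discovery (most outbound links first). The graph, the crawl_order function and both priority functions are hypothetical; no real crawler is this simple.

```python
import heapq


def crawl_order(graph, seed, priority, limit):
    """Return the first `limit` pages crawled from `seed`.

    graph maps a page to the pages it links to. priority(page) returns a
    number; pages with lower numbers are crawled first.
    """
    frontier = [(priority(seed), seed)]
    seen = {seed}
    order = []
    while frontier and len(order) < limit:
        _, page = heapq.heappop(frontier)
        order.append(page)
        for linked in graph.get(page, []):
            if linked not in seen:
                seen.add(linked)
                heapq.heappush(frontier, (priority(linked), linked))
    return order


# Hypothetical miniature web.
graph = {
    "home": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d", "e"],
    "c": [],
    "d": ["c"],
    "e": [],
}

# Count inbound links for the content-value strategy.
inbound = {}
for source, targets in graph.items():
    for target in targets:
        inbound[target] = inbound.get(target, 0) + 1


def by_value(page):
    return -inbound.get(page, 0)        # most inbound links first


def by_discovery(page):
    return -len(graph.get(page, []))    # most outbound links first


print(crawl_order(graph, "home", by_value, 4))      # ['home', 'a', 'c', 'b']
print(crawl_order(graph, "home", by_discovery, 4))  # ['home', 'b', 'a', 'd']
```

Both crawls start from the same page, but with a budget of four pages they already cover different parts of the graph.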

Chances are, an organization's crawl priority will blend some of these features, but it's difficult to design one exactly like Google. Imagine for a moment that instead of crawling the web, you want to climb a tree. You have to come up with a tree climbing strategy.

  • You decide to climb the longest branch you see at each intersection.
  • One friend of yours decides to climb the first new branch he reaches, regardless of how long it is.
  • Your other friend decides to climb the first new branch she reaches only if she sees another branch coming off of it.

Despite having different climb strategies, everyone chooses the same first branch, and everyone chooses the same second branch. There are only so many different options early on.

But as the climbers go further and further along, their choices eventually produce differing results. This is exactly the same for web crawlers like Google, Moz, Majestic, Ahrefs and SEMrush. The bigger the crawl, the more the crawl prioritization will cause disparities. This is not a deficiency; this is just the nature of the beast. However, we aren't completely lost. Once we know how index size is related to disparity, we can make some inferences about how similar a crawl priority may be to Google.

Unfortunately, we have to be careful in our conclusions. We only have a few data points with which to work, so it is very difficult to be certain regarding this part of the analysis. In particular, it seems strange that Majestic would get better relative to its index size as it grows, unless Google holds on to old data (which might be an important discovery in and of itself). Most likely, we cannot yet draw a conclusion at this level of detail.

So what do we do?

Let's say you have a list of domains or URLs whose relative values you would like to know. Your process might look something like this...

  • Check Open Site Explorer to see if all URLs are in its index. If so, you are looking at the metrics most likely to be proportional to Google's link graph.
  • If any of the links do not occur in the index, move to Ahrefs and use their Ahrefs ranking if all you need is a single PageRank-like metric.
  • If any of the links are missing from Ahrefs's index, or you need something related to trust, move on to Majestic Fresh.
  • Finally, use Majestic Historic for (by leaps and bounds) the largest coverage available.
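
The process above is essentially a fallback chain, which can be sketched as follows. The coverage sets are invented for illustration and pick_index is a hypothetical helper; in practice you would have to query each tool to learn which URLs it knows about, and the "something related to trust" branch is left out.

```python
def pick_index(urls, indexes):
    """Return the name of the first index that contains every URL.

    indexes is an ordered list of (name, set_of_known_urls), from the most
    proportional index to the broadest. If no index covers everything,
    the last (broadest) one is returned as the fallback.
    """
    if not indexes:
        return None
    wanted = set(urls)
    for name, known_urls in indexes:
        if wanted <= known_urls:
            return name
    return indexes[-1][0]


# Hypothetical coverage, ordered as in the list above.
indexes = [
    ("Open Site Explorer", {"a.com", "b.com"}),
    ("Ahrefs", {"a.com", "b.com", "c.com"}),
    ("Majestic Fresh", {"a.com", "b.com", "c.com", "d.com"}),
    ("Majestic Historic", {"a.com", "b.com", "c.com", "d.com", "e.com"}),
]

print(pick_index(["a.com", "b.com"], indexes))
print(pick_index(["a.com", "d.com"], indexes))
```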

It is important to point out that the likelihood that all the URLs you want to check are in a single index increases as the accuracy of the metric decreases. Considering the size of Majestic's data, you can't ignore them because you are less likely to get null value answers from their data than the others. If anything rings true, it is that once again it makes sense to get data from as many sources as possible. You won't get the most proportional data without Moz, the broadest data without Majestic, or everything in-between without Ahrefs.

What about SEMrush? They are making progress, but they don't publish any relative statistics that would be useful in this particular case. Maybe we can hope to see more from them soon given their already promising index!

Recommendations for the link graphing industry

All we hear about these days is big data; we almost never hear about good data. I know that the teams at Moz, Majestic, Ahrefs, SEMrush and others are interested in mimicking Google, but I would love to see some organization stand up against the allure of more data in favor of better data—data more like Google's. It could begin with testing various crawl strategies to see if they produce a result more similar to that of data shared in Google Search Console. Having the most Google-like data is certainly a crown worth winning.

Credits

Thanks to Diana Carter at Angular for assistance with data acquisition and to Andrew Cron for assistance with statistical analysis. Thanks also to the representatives from Moz, Majestic, Ahrefs, and SEMrush for answering questions about their indices.



Seth's Blog : Announcing my candidacy

Announcing my candidacy

Today, with just 495 days before the election, I'm announcing my run for President of the United States.

I'm well aware that electoral politics have been transformed by the collision of semi-modern marketing techniques with the money necessary to implement them. The TV-Industrial complex demands ever more partisan politics, more tribal division, more vote-suppressing vitriol. As we've turned raising money into a game similar to box office returns (where quantity appears to equal quality), candidates have almost no choice but to sell themselves to the highest bidder of the moment, again and again and again.

Once you see this, it's hard to miss, even though candidates and the media work to conceal it with big promises and lots of apparently retail politics.

Is it any wonder that voters are cynical? Marketers and marketing made us that way.

My candidacy, on the other hand, will be marked by stunning transparency:

  • I'm not promising to get anything done, anything at all, so there is no chance you will be disappointed.
  • I'm selling slots in my campaign to the highest bidder, Google style. Digitally organized bidding makes it easy for any corporation or mogul to determine what something will cost, and real-time auctions will maximize the return.
  • I'll just keep the money, because TV ads merely coarsen our political discourse, almost never leading to a more informed electorate.
  • Most of all, once elected I'll stick to talk shows and other feel-good interactions, which is what the public wants most from its President.

Marketing has changed, but someone forgot to tell the inside-the-beltway power brokers. Brands aren't built the way they used to be, but politicians insist on the impatient churn-and-burn mass market awareness that even Procter & Gamble is choosing to leave behind.

Consider this: In the 2016 election, the candidates for President will together spend more money on advertising than any single US brand. That's never been true before--and it's because marketers today know something that impatient, self-centered politicians don't. Money isn't enough.

The brand of the future (the candidate of the future) is patient, consistent, connected, and trusted. The new brand is based on the truth that only comes from experiencing the product, not just yelling about it. Word of mouth is more important (by a factor of 20) than TV advertising, and the remarkability word of mouth demands comes from what we experience, not from spin or taglines or a campaign slogan.

Movements have leaders, but mostly, they have a place to lead to. And their leader can't stop, won't stop, has no choice but to stay connected, keep raising the bar, continue to cycle forward.

So no, of course I won't be running (but I was a candidate for six paragraphs).

If the history of politics catching up with commercial marketing is any guide, I think that we're about to see a fundamental shift in how we talk about our leaders (and they talk to us), and perhaps (we can hope), the media will respond in kind.

And in the meantime, your brand, your campaign, your project, will benefit from what's happening now, which is marketing, not advertising, which is connection, not interruption. We've moved past the long-lost Mad Men era. Don't do marketing the way they do.

       
