Friday, 3 April 2015

Understanding and Applying Moz's Spam Score Metric - Whiteboard Friday - Moz Blog


Posted on: Friday 03 April 2015 — 02:14

Posted by randfish

This week, Moz released a new feature that we call Spam Score, which helps you analyze your link profile and weed out the spam (check out the blog post for more info). There have been some fantastic conversations about how it works and how it should (and shouldn't) be used, and we wanted to clarify a few things to help you all make the best use of the tool.

In today's Whiteboard Friday, Rand offers more detail on how the score is calculated, just what those spam flags are, and how we hope you'll benefit from using it.


For reference, here's a still of this week's whiteboard. 

Understanding and Applying Moz's Spam Score Metric

Video transcription

Howdy Moz fans, and welcome to another edition of Whiteboard Friday. This week, we're going to chat a little bit about Moz's Spam Score. Now I don't typically like to do Whiteboard Fridays specifically about a Moz project, especially when it's something that's in our toolset. But I'm making an exception because there have been so many questions and so much discussion around Spam Score and because I hope the methodology, the way we calculate things, the look at correlation and causation, when it comes to web spam, can be useful for everyone in the Moz community and everyone in the SEO community in addition to being helpful for understanding this specific tool and metric.

The 17-flag scoring system

I want to start by describing the 17-flag system. As you might know, Spam Score is shown as a score from 0 to 17. You either fire a flag or you don't. You can see a list of those 17 flags in the blog post. Essentially, each count of flags corresponds to the percentage of sites we found with that count of flags (not those specific flags, just any count of those flags) that were penalized or banned by Google. I'll show you a little bit more in the methodology.

Basically, what this means is that for sites with 0 spam flags, where none of the 17 flags fired, 99.5% of those sites were not penalized or banned, on average, in our analysis, and 0.5% were. At 3 flags, 4.2% of those sites were penalized or banned. That's actually still a huge number. That's probably in the millions of domains or subdomains that Google has potentially still banned. All the way down here with 11 flags, it's 87.3% that we did find penalized or banned. That seems pretty risky. But 12.7% of those is still a very big number, again probably in the hundreds of thousands of unique websites that are not banned but still have these flags.

If you're looking at a specific subdomain and you're saying, "Hey, gosh, this only has 3 flags or 4 flags on it, but it's clearly been penalized by Google, Moz's score must be wrong," no, that's pretty comfortable. That should fit right into those kinds of numbers. Same thing down here. If you see a site that is not penalized but has a number of flags, that's potentially an indication that you're in that percentage of sites that we found not to be penalized.

So this is an indication of percentile risk, not a "this is absolutely spam" or "this is absolutely not spam." The only caveat is anything with, I think, more than 13 flags, we found 100% of those to have been penalized or banned. Maybe you'll find an odd outlier or two. Probably you won't.
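To make the "percentile risk" reading concrete, here is a minimal sketch that puts the figures quoted in this transcript into a lookup table. It is an illustration, not Moz's implementation: the names `PENALIZED_RATE` and `penalized_rate` are hypothetical, only the flag counts actually quoted are filled in (the 4-flag figure comes up later in the video), and the 100% value for more than 13 flags rests on Rand's "I think."

```python
from typing import Optional

# Share of sites (in percent) found penalized or banned, keyed by flag count.
# Only counts quoted in the transcript are listed; other counts are unknown here.
PENALIZED_RATE = {0: 0.5, 3: 4.2, 4: 7.5, 11: 87.3}


def penalized_rate(flags: int) -> Optional[float]:
    """Return the quoted penalized-or-banned percentage, or None if not quoted."""
    if not 0 <= flags <= 17:
        raise ValueError("Spam Score is a flag count from 0 to 17")
    if flags > 13:
        # "More than 13 flags": 100% in the analysis, per the transcript.
        return 100.0
    return PENALIZED_RATE.get(flags)
```

Calling `penalized_rate(3)` returns the 4.2 quoted above, while a count the transcript never mentions, such as 5, returns `None` instead of a guess.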

Correlation ≠ causation

Correlation is not causation. This is something we repeat all the time here at Moz and in the SEO community. We do a lot of correlation studies around these things. I think people understand those very well in the fields of social media and in marketing in general. Certainly in psychology and electoral voting and election polling results, people understand those correlations. But for some reason in SEO we sometimes get hung up on this.

I want to be clear. Spam flags and the count of spam flags correlate with sites we saw Google penalize. That doesn't mean that any of the flags or combinations of flags actually cause the penalty. It could be that the things that are flags are not actually connected to the reasons Google might penalize something at all. Those could be totally disconnected.

We are not trying to say with the 17 flags these are causes for concern or you need to fix these. We are merely saying this feature existed on this website when we crawled it, or it had this feature, maybe it still has this feature. Therefore, we saw this count of these features that correlates to this percentile number, so we're giving you that number. That's all that the score intends to say. That's all it's trying to show. It's trying to be very transparent about that. It's not trying to say you need to fix these.

A lot of flags and features that are measured are perfectly fine things to have on a website, like no social accounts or email links. That's a totally reasonable thing to have, but it is a flag because we saw it correlate. A number in your domain name, I think it's fine if you want to have a number in your domain name. There's plenty of good domains that have a numerical character in them. That's cool.

TLD extension that happens to be used by lots of spammers, like a .info or a .cc or a number of other ones, that's also totally reasonable. Just because lots of spammers happen to use those TLD extensions doesn't mean you are necessarily spam because you use one.

Or low link diversity. Maybe you're a relatively new site. Maybe your niche is very small, so the number of folks who point to your site tends to be small, and lots of the sites that organically naturally link to you editorially happen to link to you from many of their pages, and there's not a ton of them. That will lead to low link diversity, which is a flag, but it isn't always necessarily a bad thing. It might still nudge you to try and get some more links because that will probably help you, but that doesn't mean you are spammy. It just means you fired a flag that correlated with a spam percentile.

The methodology we use

The methodology that we use, for those who are curious -- and I do think this is a methodology that might be interesting to potentially apply in other places -- is we brainstormed a large list of potential flags, a huge number. We cut that down to the ones we could actually do, because there were some that were just unfeasible for our technology team, our engineering team to do.

Then, we got a huge list, many hundreds of thousands of sites that were penalized or banned. When we say banned or penalized, what we mean is they didn't rank on page one for either their own domain name or their own brand name, the thing between the www and the .com or .net or .info or whatever it was. If you didn't rank for either your full domain name, the www and the .com, or your brand name, like Moz, we said, "Hey, you're penalized or banned."

Now you might say, "Hey, Rand, there are probably some sites that don't rank on page one for their own brand name or their own domain name, but aren't actually penalized or banned." I agree. That's a very small number. Statistically speaking, it probably is not going to be impactful on this data set. Therefore, we didn't have to control for that. We ended up not controlling for that.
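The labeling rule described above could be sketched roughly as follows. This is only a guess at the shape of the check: the function names and the `page_one` input (a mapping from a query to the hosts that rank on page one for it) are hypothetical, and the sketch assumes a site counts as penalized only when it ranks for neither query.

```python
from typing import Dict, List


def brand_from_domain(domain: str) -> str:
    """Return the part between 'www.' and the TLD, e.g. 'www.moz.com' -> 'moz'."""
    host = domain.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host.split(".")[0]


def looks_penalized(domain: str, page_one: Dict[str, List[str]]) -> bool:
    """Flag a site that ranks on page one for neither its domain nor its brand.

    page_one is hypothetical input: query -> hosts ranking on page one.
    """
    host = domain.lower()
    queries = [host, brand_from_domain(host)]
    ranks_somewhere = any(host in page_one.get(q, []) for q in queries)
    return not ranks_somewhere
```

For `www.moz.com`, the two queries checked would be the full domain name and `moz`.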

Then we found which of the features that we ideated, brainstormed, actually correlated with the penalties and bans, and we created the 17 flags that you see in the product today. There are lots of things that I thought were going to correlate, for example spammy-looking anchor text or poison keywords on the page, like Viagra, Cialis, Texas Hold'em online, pornography. Not all of those things turned out to correlate well, and so they didn't make it into the 17 flags list. I hope over time we'll add more flags. That's how things worked out.
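For anyone who wants to try the same approach on their own data, the last step, turning labeled sites into a percentage per flag count, can be sketched like this. It assumes you already have, for each site, a flag count and a penalized-or-not label; the function name and input shape are hypothetical, not something Moz has published.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def penalized_share_by_flag_count(
    sites: Iterable[Tuple[int, bool]]
) -> Dict[int, float]:
    """Map each flag count to the percentage of sites with that count that were penalized."""
    totals = defaultdict(int)
    penalized = defaultdict(int)
    for flag_count, is_penalized in sites:
        totals[flag_count] += 1
        if is_penalized:
            penalized[flag_count] += 1
    return {n: 100.0 * penalized[n] / totals[n] for n in sorted(totals)}
```

Feeding it a list of `(flag_count, is_penalized)` pairs yields one percentage per observed flag count, which is the kind of number the score reports.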

How to apply the Spam Score metric

When you're applying Spam Score, I think there are a few important things to think about. Just like domain authority, or page authority, or a metric from Majestic, or a metric from Google, or any other kind of metric that you might come up with, you should add it to your toolbox and to your metrics where you find it useful. I think playing around with Spam Score, experimenting with it, is a great thing. If you don't find it useful, just ignore it. It doesn't actually hurt your website. It's not like this information goes to Google or anything like that. They have way more sophisticated stuff to figure out things on their end.

Do not just disavow everything with seven or more flags, or eight or more flags, or nine or more flags. I think that we use the color coding to indicate 0% to 10% of these flag counts were penalized or banned, 10% to 50% were penalized or banned, or 50% or above were penalized or banned. That's why you see the green, orange, red. But you should use the count and line that up with the percentile. We do show that inside the tool as well.
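The color coding described here comes down to three bands on the penalized percentage. A minimal sketch follows, with the caveat that the transcript does not say which band the exact boundaries (10% and 50%) fall into, so the choice below is an assumption, and `color_band` is a hypothetical name.

```python
def color_band(penalized_pct: float) -> str:
    """Return 'green', 'orange', or 'red' for a penalized-or-banned percentage."""
    if not 0.0 <= penalized_pct <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    if penalized_pct < 10.0:
        return "green"   # 0% to 10% of sites at this flag count were penalized
    if penalized_pct < 50.0:
        return "orange"  # 10% to 50% (boundary handling is an assumption)
    return "red"         # 50% or above
```

The band is only a summary; the underlying percentage for the flag count is the number to reason with.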

Don't just take everything and disavow it all. That can get you into serious trouble. Remember what happened with Cyrus. Cyrus Shepard, Moz's head of content and SEO, disavowed all the backlinks to his site. It took more than a year for him to rank for anything again. Google almost treated it like he was banned, not completely, but they seriously took away all of his link power and didn't let him back in, even though he changed the disavow file and all that.

Be very careful submitting disavow files. You can hurt yourself tremendously. The reason we offer it in disavow format is because many of the folks in our customer testing said that's how they wanted it so they could copy and paste, so they could easily review, so they could get it in that format and put it into their already existing disavow file. But you should not do that. You'll see a bunch of warnings if you try and generate a disavow file. You even have to edit your disavow file before you can submit it to Google, because we want to be that careful that you don't go and submit.

You should have the right expectations for Spam Score accuracy. If you're doing spam investigation, you're probably looking at spammier sites. If you're looking at a random hundred sites, you should expect that the flags would correlate with the percentages. If I look at a random hundred 4-flag Spam Score sites, 7.5% of those I would expect on average to be penalized or banned. If you are therefore seeing sites that don't fit those, they probably fit into the percentiles that were not penalized, or up here were penalized, down here weren't penalized, that kind of thing.
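To get a feel for how much a hundred-site sample can wander from that average, here is a small sketch. It treats each site as an independent draw with the quoted probability (a simple binomial model, which is an assumption and not something stated in the video); `expected_penalized` is a hypothetical helper.

```python
import math
from typing import Tuple


def expected_penalized(sample_size: int, penalized_pct: float) -> Tuple[float, float]:
    """Return (mean, standard deviation) of the penalized count in a random sample.

    Assumes independent draws, i.e. a binomial model.
    """
    p = penalized_pct / 100.0
    mean = sample_size * p
    std_dev = math.sqrt(sample_size * p * (1.0 - p))
    return mean, std_dev


mean, std_dev = expected_penalized(100, 7.5)
```

Under that assumption, a hundred 4-flag sites average 7.5 penalized sites with a standard deviation of roughly 2.6, so finding 5 or 10 in a given sample would not contradict the score.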

Hopefully, you find Spam Score useful and interesting and you add it to your toolbox. We would love to hear from you on iterations and ideas that you've got for what we can do in the future, where else you'd like to see it, and where you're finding it useful/not useful. That would be great.

Hopefully, you've enjoyed this edition of Whiteboard Friday and will join us again next week. Thanks so much. Take care.

Video transcription by Speechpad.com

ADDITION FROM RAND: I also urge folks to check out Marie Haynes' excellent Start-to-Finish Guide to Using Google's Disavow Tool. We're going to update the feature to link to that as well.



Seth's Blog : The hard part about surfing

The hard part about surfing

Surfing, the conceptual kind, is more essential than ever, it's not optional.

And the hardest part of surfing, by far, is paddling out, not surfing in.

Carrying the board, getting back into the water, paddling through the waves, waiting for the next set...it's exhausting, and surfers spend far more time doing this than they do on the other part.

Having the guts to surf is what change demands. And finding the stamina to paddle back out is a key part of surfing.

       


Thursday, 2 April 2015

Mish's Global Economic Trend Analysis


Thrown Under the Bus: Another Look at the Self-Serving Launch of Ben Bernanke's Blog and the Brookings Institute's Pandering Role

Posted: 02 Apr 2015 09:08 PM PDT

Within hours of Ben Bernanke launching his blog at the Brookings Institute, I commented in "Ben Bernanke, Confused as Ever, Starts His Own Blog to Prove It."

By that time, a few hundred comments to his blog had already been approved. I made two comments of my own.

I asked Bernanke about Fed-sponsored bubbles, inflation as measured by the CPI while ignoring assets, especially in housing (see charts in the above link), and whether or not the Fed had any culpability for what had happened.

I was 99% sure in advance my questions and comments would be deleted. They were, twice.

Instead of posting serious comments and questions, the Brookings Institute fawned all over Bernanke by posting numerous glowing appraisals, thanks, and other trivia.

Purpose of Bernanke's Blog

The sole purpose of Bernanke's blog, and the Brookings Institute is shamefully willing to go along with it, is to vindicate Ben Bernanke and the Fed from their role in the housing bubble.

Others have chimed in on Bernanke's post and have had their comments and questions deleted as well.

Since the Brookings Institute is willing to degrade itself to such a level, I thought I would post a reply to Bernanke's Blog that I found noteworthy.

Ben Bernanke's Apologia for the Fed

I invite you to read Ben Bernanke's Apologia for the Fed by Pater Tenebrarum at the Acting Man blog.

Instead of excerpting Pater's excellent post, I simply ask you to take a look for yourself. Pater makes a mockery of Bernanke's "natural interest rate thesis", of Bernanke's claim that the Fed does not distort markets, of the presumed role of the Fed as a benevolent institution, and of Bernanke's disingenuous comments about "throwing senior citizens under the bus".

I made many similar comments, but Pater also brought into play a critical discussion on time preference and why natural interest rates can never be less than zero.

Pater Tenebrarum was my primary teacher on Austrian economics and the Libertarian philosophy dating back to about 2001. His blog is one of few I bookmark and invariably read.

If you are genuinely interested in the Austrian economic point of view, you may wish to bookmark his blog as well. His feed is also on the right side of my blog.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com

Trade Deficit Shrinks; First Quarter GDP Estimate Ticks Up to 0.1%

Posted: 02 Apr 2015 10:18 AM PDT

Trade Deficit Shrinks

Inquiring minds are investigating the Commerce Department report on International Trade in Goods and Services for February 2015, for clues about first quarter GDP.

Highlights

  • Exports were $186.2 billion, down $3.0 billion from January.
  • Imports were $221.7 billion, down $10.2 billion from January.
  • Year-to-date, the goods and services deficit decreased $2.6 billion, or 3.2 percent, from the same period in 2014.
  • Year-to-date exports decreased $5.3 billion or 1.4 percent.
  • Year-to-date imports decreased $7.9 billion or 1.7 percent.

Balance of Trade



GDP Analysis

Recall that exports add to GDP and imports subtract from GDP. Thus my first reaction to the report was that GDP estimates would go up. They did, but very slightly.
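A quick check of the arithmetic behind that reaction, using only the February figures and month-over-month changes from the highlights above. The variable names are mine, and these are nominal monthly figures, whereas GDP uses real (inflation-adjusted) net exports, so this is a direction check and not a GDP calculation.

```python
def trade_balance(exports: float, imports: float) -> float:
    """Balance of trade in billions of dollars; negative means a deficit."""
    return exports - imports


# February figures from the report (billions of dollars).
feb_exports, feb_imports = 186.2, 221.7

# January levels implied by the reported declines of $3.0B and $10.2B.
jan_exports = feb_exports + 3.0
jan_imports = feb_imports + 10.2

feb_balance = trade_balance(feb_exports, feb_imports)  # about -35.5
jan_balance = trade_balance(jan_exports, jan_imports)  # about -42.7
improvement = feb_balance - jan_balance                # about +7.2
```

Imports fell by more than exports, so the deficit narrowed by about $7.2 billion even though both sides of the ledger shrank.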

Atlanta Fed GDPNow Model

Yesterday, following an Unexpected Decline in Construction activity, the Atlanta Fed GDPNow forecast dipped to 0.0%.

Today following the shrinkage in the trade deficit, the forecast is back in positive territory at 0.1%.

"The GDPNow model forecast for real GDP growth (seasonally adjusted annual rate) in the first quarter of 2015 was 0.1 percent on April 2, up from 0.0 percent on April 1. Following this morning's international trade release from the U.S. Census Bureau, the nowcast for the change in real net exports in 2009 dollars increased from -40 billion to -33 billion. The nowcast for real equipment investment growth declined from 7.5 percent to 6.1 percent following the international trade report and the Census Bureau's M3 manufacturing report."

GDPNow Estimate for 1st Quarter



Another Sign of Slowing Global Economy

The declining trade deficit is a good thing. However, the shrinking trade deficit is not as positive as it may look at first glance.

It would have been far better had the trade deficit shrinkage been on rising exports. Instead, imports and exports are both down. That is yet another sign of the slowing global economy.

Back in January, I forecast declining exports on the strength of the US dollar. Here we are. If oil ticks back up for any reason, so will imports.

There is not a lot to cheer about in today's reports (Also see Factory Orders Unexpectedly Rise Snapping String of 6 Straight Declines).

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com 

Factory Orders Unexpectedly Rise Snapping String of 6 Straight Declines

Posted: 02 Apr 2015 09:21 AM PDT

After six straight months of factory orders unexpectedly declining, economists apparently were finally convinced that bad weather would continue indefinitely.

Factory orders rose, albeit barely, and last month was revised way lower. Nonetheless, it was amusing to see economists' expectations were in the wrong direction for the seventh straight month.

This month, factory orders unexpectedly rose.

Consensus Estimates

The Bloomberg Consensus estimate was for no growth, while orders rose a modest 0.2%.

After 6 straight declines, factory orders finally moved to the plus column, up 0.2 percent in a February gain, however, that is tied largely to an upward price swing for petroleum and coal products. Another mitigating factor is a sharp downward revision to January orders, to minus 0.7 percent from minus 0.2 percent.

Durable goods show broad weakness with orders down 1.4 percent in data initially posted last week. Most readings show significant declines and underscore this morning's export dip in the international trade report and the Fed's concerns over weak export markets and the negative effects of the strong dollar. Core capital goods are down 1.1 percent in the month for a 6th straight decline in a reading that points to a lack of business confidence and business investment.

Total shipments bounced back 0.7 percent in February but follow a 2.3 percent plunge in January and which holds down factory contribution to first-quarter GDP. A clear negative is a 3rd straight decline for unfilled orders, down 0.5 percent for what is now the weakest string since way back in the recession days of late 2009. A lack of unfilled orders will not encourage manufacturers to add to their workforces. One positive is inventories which are less heavy, up only 0.1 percent and bringing down the inventory-to-shipments ratio to 1.35 from January's recovery high of 1.36.

The main positive in today's report is the non-durables component where a 1.8 percent gain ends 7 straight declines, declines all tied to oil-price effects. But the weakness in durables, tied to foreign demand, is becoming a significant negative for the economic outlook.

Orders and Shipments



As you can see, that is not much to write home about, especially given that last month was revised from -0.2% to -0.7%.
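To see why the revision matters, compound the two monthly changes. This sketch uses only the percentages quoted above; `compound` is a hypothetical helper, and the result is the change in the level of orders relative to December.

```python
def compound(*monthly_pct_changes: float) -> float:
    """Return the cumulative percent change after applying each monthly change in turn."""
    level = 1.0
    for pct in monthly_pct_changes:
        level *= 1.0 + pct / 100.0
    return (level - 1.0) * 100.0


# January as first reported, relative to December.
as_first_reported = compound(-0.2)

# Revised January (-0.7%) followed by February's +0.2% gain.
after_revision = compound(-0.7, 0.2)
```

After the revision, February orders sit roughly 0.5% below the December level, which is lower than the -0.2% first reported for January alone.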

Census Report

Diving into the Census Report, for February vs. January (seasonally adjusted) we find new orders look like this:

  • All Manufacturing: +0.2%
  • ....Excluding Transportation: +0.8%
  • ....Excluding Defense: +0.4%
  • ....With unfilled orders: -2.0%
  • Durable Goods: -1.4%
  • Nondurable Goods: +1.8%
  • Furniture: -2.4%
  • Motor Vehicles, Bodies, Parts: -1.2%

The string of declines ended. Hooray! Otherwise, this looks like another bad month.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com

Damn Cool Pics


The Best Memes From The Walking Dead Season 5

Posted: 02 Apr 2015 11:10 AM PDT

Season 5 of "The Walking Dead" may be over, but when it comes to memes, it's the gift that just keeps on giving.