Central Perk

Monday, August 1, 2011

SEOmoz Daily SEO Blog

Better Understanding Link-based Spam Analysis Techniques

Posted: 31 Jul 2011 12:33 PM PDT

One frustrating aspect of link building is not knowing the value of a link. Although experience, and some data, can make you better at link valuation, it is impossible to know to what degree a link may be helping you. It’s hard to know if a link is even helping at all. Search engines do not count all links; they reduce the value of many that they do count, and they use factors related to your links to further suppress the value that’s left over. This is all done to improve relevancy and spam detection.

Understanding the basics of link-based spam detection can improve your link valuation and show you how search engines approach the problem of spam, which can lead to better link building practices.

I’d like to talk about a few interesting link spam analysis concepts that search engines may use to evaluate your backlink profile.

Disclaimer:

I don’t work at a search engine, so I can make no concrete claims about how search engines evaluate links. Engines may use some, or none, of the techniques in this post. They also certainly use more (and more sophisticated) techniques than I can cover in this post. However, I spend a lot of time reading through papers and patents, so I thought it'd be worth sharing some of the interesting techniques.

#1 Truncated PageRank

The basics of Truncated PageRank are covered in the paper Link-Based Characterization and Detection of Web Spam. Truncated PageRank is a calculation that removes the direct “link juice” contribution provided by the first level(s) of links. A page boosted by naïve methods (such as article marketing) receives a large portion of its PageRank value directly from the first layer, whereas a page with links from well-linked pages also receives “link juice” contribution from additional levels. Spam pages will likely show a Truncated PageRank that is significantly less than their PageRank. The ratio of Truncated PageRank to PageRank can be a signal of the spamminess of a link profile.
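As a rough illustration of the ratio described above, here is a minimal Python sketch. It assumes PageRank is written as a sum over path lengths and that truncation simply drops the shortest paths; the toy graph, the function name pagerank_series, and the parameter truncate_at are all hypothetical, and nothing here claims to match what any search engine runs.

```python
def pagerank_series(out_links, alpha=0.85, truncate_at=None, steps=60):
    """Sum the PageRank power series, optionally skipping short paths.

    With truncate_at=T, contributions from paths of length <= T are dropped.
    Scores are left unnormalized, which is enough for comparing ratios.
    """
    nodes = sorted(out_links)
    # walk[v] = probability mass arriving at v after exactly t hops
    walk = {v: 1.0 / len(nodes) for v in nodes}
    score = dict.fromkeys(nodes, 0.0)
    for t in range(steps):
        if truncate_at is None or t > truncate_at:
            for v in nodes:
                score[v] += (1 - alpha) * alpha ** t * walk[v]
        nxt = dict.fromkeys(nodes, 0.0)
        for src in nodes:
            targets = out_links[src]
            # Pages with no outlinks pass nothing on in this sketch.
            for dst in targets:
                nxt[dst] += walk[src] / len(targets)
        walk = nxt
    return score


# Hypothetical toy graph: page -> pages it links to.
graph = {
    "spam": [], "good": [],
    "s1": ["spam"], "s2": ["spam"], "s3": ["spam"], "s4": ["spam"],
    "hub": ["good"], "a": ["hub"], "b": ["hub"], "c": ["a"],
}
full = pagerank_series(graph)
truncated = pagerank_series(graph, truncate_at=1)
ratio = {page: truncated[page] / full[page] for page in ("spam", "good")}
print(ratio)
```

In this toy graph the "spam" page is supported only by pages that nobody links to, so everything it has comes from the first layer and its ratio collapses, while "good" keeps roughly half of its score from deeper levels.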

#2 Owned / Accessible Contributions

Links can be sorted into three general buckets.

Links from owned content – Links from pages for which search engines have determined some level of shared ownership (well-connected co-citation, IP, whois, etc.)
Links from accessible content – Links from non-owned content that is easily accessible to add links (blogs, forums, article directories, guest books, etc.)
Links from inaccessible content – Links from independent sources.

A link from any one of these sources is neither inherently good nor bad. Links from owned content, via networks and relationships, are perfectly natural, and a link from inaccessible content could be a paid link, so that bucket isn’t inherently good either. Still, knowing the bucket a link falls into can change its valuation.

This type of analysis on two sites can show a distinct difference in a link profile, all other factors being equal. The first site is primarily supported on links from content it directly controls or can gain access to. However, the second site has earned links from a substantially larger percentage of unique, independent sources. All things being equal, the second site is less likely to be spam.

#3 Relative Mass

Relative Mass accounts for the percent distribution of certain types of links within a profile. The example of the pie charts above demonstrates the concept of relative mass.

Relative Mass is discussed more broadly in the paper Link Spam Detection Based on Mass Estimation. Relative Mass analysis can define a threshold at which a page is determined “spam”. In the image above, the red circles have been identified as spam. The target page now has a portion of value attributed to it via “spam” sites. If this value of contribution exceeds a potential threshold, this page could have its rankings suppressed or the value passed through these links minimized. The example above is fairly binary, but there is often a large gradient between not spam and spam.

This type of analysis can be applied to tactics as well, such as the distribution of links from comments, directories, articles, hijacked sources, owned pages, paid links, etc. The algorithm may provide a certain degree of “forgiveness” before a tactic’s relative mass contribution exceeds an acceptable level.
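To make the threshold idea concrete, here is a small Python sketch. The link types, the values, and the per-type limits are invented for illustration; real engines would estimate contributions very differently, and the names relative_mass and over_threshold are hypothetical.

```python
def relative_mass(contributions):
    """Return each link type's share of the total value flowing to a page."""
    total = sum(contributions.values())
    if total == 0:
        return {kind: 0.0 for kind in contributions}
    return {kind: value / total for kind, value in contributions.items()}


def over_threshold(contributions, limits):
    """List the link types whose share exceeds its (assumed) forgiveness limit."""
    shares = relative_mass(contributions)
    return sorted(kind for kind, share in shares.items()
                  if share > limits.get(kind, 1.0))


# Hypothetical value contributed by each tactic, in arbitrary units.
profile = {"editorial": 6.0, "comments": 1.0, "paid": 3.0}
# Hypothetical limits: the share tolerated before value is discounted.
limits = {"comments": 0.25, "paid": 0.20}

shares = relative_mass(profile)
flagged = over_threshold(profile, limits)
print(shares, flagged)
```

Here comment links stay inside their allowance, while paid links make up 30% of the profile and cross the assumed 20% limit.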

#4 Counting Supporters / Speeds to Nodes

Another method of valuing links is by counting supporters and the speed of discovery of those nodes (and the point at which this discovery peaks).

A histogram distribution of supporting nodes by hops can demonstrate the differences between spam and high quality sites.

Well-connected sites will grow in supporters more rapidly than spam sites, and spam sites are likely to peak earlier. Spam sites grow rapidly over the first few hops and then decay quickly as you move away from the target node. This distribution can help signify that a site is using spammy link building practices. Because spam networks have higher degrees of clustering, domains will repeat across hops, which makes spam profiles bottleneck faster than non-spam profiles.

Protip: I think this is one reason that domain diversity and unique linking root domains are well correlated with rankings. I don’t think the relationship is as naïve as counting linking domains, but an analysis like supporter counting, as well as Truncated PageRank, would make receiving links from a larger set of diverse domains better correlated with rankings.
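A histogram like the one described can be sketched with a breadth-first walk backwards along links. This is only an illustration under simple assumptions: the two tiny graphs are made up, supporters are counted exactly rather than estimated, and the function name supporters_by_hop is hypothetical.

```python
def supporters_by_hop(in_links, target, max_hops=3):
    """Count distinct pages first reached at each hop, following links backwards."""
    seen = {target}
    frontier = [target]
    histogram = []
    for _ in range(max_hops):
        nxt = []
        for page in frontier:
            for src in in_links.get(page, ()):
                if src not in seen:  # repeated pages add no new supporters
                    seen.add(src)
                    nxt.append(src)
        histogram.append(len(nxt))
        frontier = nxt
    return histogram


# Hypothetical in-link maps: page -> pages that link to it.
clustered = {  # a tight network whose members mostly link to each other
    "t": ["a", "b", "c", "d"],
    "a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["c"],
}
natural = {  # supporters keep fanning out as you move away from the target
    "t": ["a", "b"],
    "a": ["c", "d"], "b": ["e", "f"],
    "c": ["g", "h"], "e": ["i"],
}

print(supporters_by_hop(clustered, "t"))
print(supporters_by_hop(natural, "t"))
```

The clustered profile finds all of its supporters on the first hop and then bottlenecks, because every further hop lands on a page already counted.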

#5 TrustRank, Anti-TrustRank, SpamRank, etc.

The model of TrustRank has been written about several times before and is the basis of metrics like mozTrust. The basic premise is that seed nodes can have both Trust and Spam scores, which can be passed through links. The closer you are to the seed set, the higher the likelihood that you are what that seed set was defined as. Being close to spam makes you more likely to be spam; being close to trust makes you more likely to be trusted. These values can be judged inbound and outbound.

I won’t go into much more detail than that, because you can read about it in previous posts, but it comes down to four simple rules.

Get links from trusted content.
Don’t get links from spam content.
Link to trusted content.
Don’t link to spam content.
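The seed-propagation premise behind these rules can be sketched as a PageRank-style calculation in which the only source of score is a hand-picked seed set. The graph, the seed choice, and the function name trust_rank below are hypothetical; running the same routine on a reversed graph with a spam seed set would be one assumed way to get a spam-proximity score.

```python
def trust_rank(out_links, seeds, alpha=0.85, iterations=50):
    """Propagate trust from seed pages along outgoing links."""
    nodes = sorted(out_links)
    # Only seed pages receive the "teleport" share of trust.
    seed_share = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    trust = dict(seed_share)
    for _ in range(iterations):
        nxt = {v: (1 - alpha) * seed_share[v] for v in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = alpha * trust[src] / len(targets)
                for dst in targets:
                    nxt[dst] += share
        trust = nxt
    return trust


# Hypothetical graph: page -> pages it links to.
graph = {
    "seed": ["near"],   # manually vetted, trusted page
    "near": ["far"],
    "far": [],
    "farm": ["spam"],   # not reachable from the seed set
    "spam": [],
}
trust = trust_rank(graph, seeds={"seed"})
print(trust)
```

Trust decays with each hop away from the seed, and pages with no path from the seed set receive none at all.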

This type of analysis has also been used to turn SEO forums against spammers. A search engine can crawl links from top SEO forums to create a seed set of domains to analyze. Tinfoil hat time....

#6 Anchor Text vs. Time

Monitoring anchor text over time can give interesting insights that could detect potential manipulation. Let’s look at an example of how a preowned domain that was purchased for link value (and spam) might appear with this type of analysis.

This domain has a historical record of acquiring anchor text including both branded and non-branded targeted terms. Then that rate suddenly drops, and after some time a new influx of anchor text, never seen before, starts to come in. This type of anchor text analysis, in combination with orthogonal spam detection approaches, can help detect the point at which ownership changed. Links prior to this point can then be evaluated differently.

This type of analysis, plus some other very interesting stuff, is discussed in the Google paper Document Scoring Based on Link-Based Criteria.
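One simple way to picture anchor text analysis over time is to track, period by period, how much of the newly acquired anchor text has never been seen before. The sketch below is an assumption-laden toy: the anchor texts, the 0.8 cutoff, and the function name novelty_by_period are invented, and a real system would combine this with other signals.

```python
def novelty_by_period(periods):
    """For each period, return the fraction of new links with never-seen anchor text."""
    seen = set()
    fractions = []
    for anchors in periods:
        fresh = sum(1 for anchor in anchors if anchor not in seen)
        fractions.append(fresh / len(anchors) if anchors else 0.0)
        seen.update(anchors)
    return fractions


# Hypothetical anchor text of links acquired in consecutive periods.
history = [
    ["acme", "acme tools", "acme"],
    ["acme", "acme tools", "power drills"],
    ["acme"],
    [],                                             # acquisition rate drops off
    ["cheap pills", "cheap pills", "payday loans"],  # sudden unfamiliar influx
]
fractions = novelty_by_period(history)
# Skip the first period, where everything is new by definition.
suspect_periods = [i for i, f in enumerate(fractions) if i > 0 and f >= 0.8]
print(fractions, suspect_periods)
```

The last period stands out because every anchor in it is new, which is the kind of break that could mark a change of ownership.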

#7 Link Growth Thresholds

Sites with rapid link growth could have the impact dampened by applying a threshold of value that can be gained within a unit time. Corroborating signals can help determine if a spike is from a real event or viral content, as opposed to link manipulation.

This approach can discount the value of links gained beyond the assigned threshold. A more paced, natural growth profile is less likely to break a threshold. You can find more information about historical analysis in the paper Information Retrieval Based on Historical Data.
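A growth threshold of this kind can be pictured as a cap on how much link value counts in any one period. The following Python sketch uses made-up numbers and a hypothetical cap of 10 units per period; the function name capped_value is an assumption.

```python
def capped_value(values_by_period, cap):
    """Credit at most `cap` units of link value per period; the rest is discounted."""
    return [min(sum(values), cap) for values in values_by_period]


# Hypothetical value of the links gained in each period.
paced = [[1, 1], [1, 1, 1], [1, 1, 1, 1]]
spiky = [[1, 1], [1, 1, 1], [1] * 40]

print(capped_value(paced, cap=10))
print(capped_value(spiky, cap=10))
```

The paced profile never touches the cap, while the spike of 40 links in one period is credited as only 10 units unless corroborating signals justify it.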

#8 Robust PageRank

Robust PageRank works by calculating PageRank without the highest contributing nodes.

In the image above, the two strongest links were turned off, which effectively reduced the PageRank of a node. Strong sites often have robust profiles and do not heavily depend on a few strong sources (such as links from link farms) to maintain a high PageRank. Robust PageRank calculations are one way the impact of over-influential nodes can be reduced. You can read more about Robust PageRank in the paper Robust PageRank and Locally Computable Spam Detection Features.
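The idea of turning off the strongest links can be sketched by recomputing a page's score without its top contributors. This is a simplification of the description above, not the paper's exact definition, and the contribution values and the function name robust_score are hypothetical.

```python
def robust_score(contributions, drop=2):
    """Sum the contributions to a page after removing the `drop` largest ones."""
    kept = sorted(contributions.values(), reverse=True)[drop:]
    return sum(kept)


# Hypothetical PageRank contribution from each linking source.
farm_dependent = {"farm1": 4.0, "farm2": 4.0, "blog": 0.5, "forum": 0.5}
diverse = {f"site{i}": 0.9 for i in range(10)}

for name, profile in (("farm_dependent", farm_dependent), ("diverse", diverse)):
    full = sum(profile.values())
    print(name, round(robust_score(profile) / full, 2))
```

Both profiles have roughly the same total, but the farm-dependent one keeps only about a ninth of its score once its two strongest sources are removed, while the diverse one keeps about 80%.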

#9 PageRank Variance

The uniformity of PageRank contribution to a node can be used to evaluate spam. Natural link profiles are likely to have a higher variance in PageRank contribution. Spam profiles tend to be more uniform.

So if you use a tool, marketplace, or service to order 15 PR 4 links for a specific anchor text, that set of links will have a low variance in PR. This is an easy way to detect these sorts of practices.
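The variance check is easy to picture with Python's standard statistics module. The two lists of toolbar-style PR values below are invented for illustration.

```python
from statistics import pvariance

# Hypothetical PR values of the pages linking with one anchor text.
ordered_links = [4] * 15                        # 15 purchased PR 4 links
natural_links = [0, 1, 1, 2, 3, 3, 4, 5, 6, 7]  # a mixed, organic-looking set

print(pvariance(ordered_links))
print(pvariance(natural_links))
```

The purchased set has zero variance, while the mixed set is spread widely; a real detector would presumably compare against typical variance for similar pages rather than use a fixed cutoff.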

#10 Diminishing Returns

One way to minimize the value of a tactic is to create diminishing marginal returns on specific types of links. This is easiest to see in sitewide links, such as blogroll links or footer paid links. At one time, link popularity, in volume, was a strong factor, which led to sitewides carrying a disproportionate amount of value.

The first link from a domain carries the first vote, and getting additional links from that domain will continue to increase the total value it passes, but only to a point. Eventually inbound links from the same domain will experience diminishing returns. Going from 1 link to 3 links from a domain will have more of an effect than going from 101 links to 103 links.
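Any concave function produces this behavior. The sketch below uses a logarithm purely as an assumed example of such a curve; the source does not say which function, if any, search engines apply, and the name domain_value is hypothetical.

```python
import math


def domain_value(link_count):
    """Total value from one domain under an assumed logarithmic curve."""
    return math.log1p(link_count)


gain_early = domain_value(3) - domain_value(1)     # going from 1 to 3 links
gain_late = domain_value(103) - domain_value(101)  # going from 101 to 103 links
print(round(gain_early, 3), round(gain_late, 3))
```

Under this curve the same two extra links are worth dozens of times more at the start than after the first hundred.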

Protip: Although it’s easy to see this with sitewide links, I think of most link building tactics in this fashion. In addition to ideas like relative mass, where you don’t want one thing to dominate, I feel tactics lose traction over time. It is not likely you can earn strong rankings on a limited number of tactics, because many manual tactics tend to hit a point of diminishing returns (sometimes it may be algorithmic, other times it may be due to diminishing returns in the competitive advantage). It's best to avoid one-dimensional link building.

Link Spam Algorithms

All spam analysis algorithms have some percentage of accuracy and some level of false positives. Through the combination of these detection methods, search engines can maximize the accuracy and minimize the false positives.

Web spam analysis allows for more false positives than email spam detection, because there are often multiple alternatives to replace a pushed down result. It is not like email spam detection, which is binary in nature (inbox or spam box). In addition to this, search engines don’t have to create binary labels of “spam” or “not spam” to effectively improve search results. By using analysis, such as some of those discussed in this post, search engines can simply dampen rankings and minimize effects.

These analysis techniques are also designed to decrease the ROI of specific tactics, which makes spamming harder and more expensive. The goal of this post is not to make you stress about which links work and which don’t, because it’s hard to know. The goal is to demonstrate some of the problem-solving tactics used by search engines and how they impact your tactics.

Get the Facts on the Debt Deal

Monday, August 1, 2011

Yesterday evening the President spoke in support of a bipartisan deal to reduce the nation’s deficit and avoid default. This deal will extend the debt limit until 2013, ensuring stability and economic confidence, and it puts in place a framework for balanced long term fiscal discipline. The bipartisan compromise assures that the United States meets its obligations, providing monthly Social Security checks and veterans’ benefits, and fulfilling contracts with thousands of American businesses.

Watch the video of the President’s remarks, and read the fact sheet for a comprehensive breakdown of the new plan.

Have questions about the deal? White House advisors are taking time this week to answer your questions as part of White House Office Hours. Use the hashtag #whchat to ask your questions, and Brian Deese and Jason Furman of the National Economic Council will answer a selection of them directly. Check out this week's schedule below, and follow @WhiteHouse for more updates.

Monday, August 1
5:00 p.m. EDT: Office Hours with Brian Deese, Deputy Director of the National Economic Council
Tuesday, August 2
4:00 p.m. EDT: Office Hours with Jason Furman, Principal Deputy Director of the National Economic Council
Wednesday, August 3
4:00 p.m. EDT: Office Hours with Brian Deese, Deputy Director of the National Economic Council

We’ve already answered dozens of questions from fellow Americans, and we look forward to hearing from you. For more information about Office Hours, including a wrap-up from last week, click here.


The White House • 1600 Pennsylvania Ave NW • Washington, DC 20500 • 202-456-1111

Debt Deal Reached: Get the Facts

Your Daily Snapshot for
Monday, August 1, 2011

On Sunday night, President Obama informed the nation that an agreement had been reached between both parties to raise the debt ceiling, preventing our country from defaulting for the first time in its history. Read the factsheet for details of the deal and check out White House Office Hours on Twitter this week to get your questions answered.

In case you missed it: watch the video of the President outlining the deal or read his full remarks.

In Case You Missed It

Here are some of the top stories from the White House blog.

Office Hours 7/29/11 or "Time Flies When You're Tweeting": Brian Deese Answers Your Questions on Twitter
During White House Office Hours, Deputy Director of the National Economic Council Brian Deese answers your questions live via Twitter about the ongoing debt negotiations.

Gayle Smith on Aid & Development: Twitter Interview Follow-up
Gayle Smith, Special Assistant to the President and Senior Director for Development and Democracy, answers follow-up questions from her live Twitter interview.

Over 1.5 Million Records Released
A new release of White House visitor records brings the grand total of records that this White House has released to over 1.5 million records.

Today's Schedule

All times are Eastern Daylight Time (EDT).

9:30 AM: The President receives the Presidential Daily Briefing

10:00 AM: The President meets with Commerce Secretary Locke

10:30 AM: The President meets with Senior Advisors

11:00 AM: The Vice President meets with the Senate Democratic Caucus at the U.S. Capitol

11:45 AM: Press Briefing by Press Secretary Jay Carney

12:00 PM: The Vice President meets with the House Democratic Caucus at the U.S. Capitol

Seth's Blog : Responsibility and authority

Responsibility and authority

Achievers in traditional organizations often say, "I want more authority." They mean that they want the power to make things happen, the mantle of authority that will allow them to get things done.

This is an industrial-era mindset. Management by authority is top-down, risk-averse, measurable and perfect for the org chart. It's essential in organizations that are stable, asset-based and averse to risk.

There's a different approach, though, one that's based on responsibility instead of authority. "Anyone who takes responsibility for getting something done is welcome to ask for the authority to do it."

Ah, your bluff is called. And so is your boss's.

Sunday, July 31, 2011

Mish's Global Economic Trend Analysis

Futures Surge on "Pathetic" Debt Deal; Congress Should be Ashamed; US Deserves Debt Downgrade; Is Boehner Balking Over Cuts to Military Spending?

Posted: 31 Jul 2011 07:51 PM PDT

After spending a day in the garden weeding and transplanting I arrive at my computer to see S&P futures up 20 points, 1.5% on news a compromise was reached. Quite frankly this is ludicrous given that anyone not brain dead knew a deal would be reached.

Let's pick up the action starting with U.S. Stock Futures Advance as Obama, Lawmakers Agree to Raise Debt Limit

U.S. stock futures rose, indicating the Standard & Poor's 500 Index may rebound from its worst weekly loss in a year, as President Barack Obama announced an agreement to raise the federal debt limit and avoid a default.

Obama said in remarks at the White House that both parties in the U.S. House and Senate had reached an agreement to raise the nation's borrowing limit and cut the federal deficit.

"A lot of people were short the dollar and U.S. equities into the weekend, betting that we wouldn't have a deal," Frederic Dickson, who helps oversee $28 billion as chief market strategist at D.A. Davidson & Co. in Lake Oswego, Oregon, said in a telephone interview. "Now investors will be reversing those positions as things are looking better than they did on Friday, even though there are still some hurdles to climb in the next 48 hours."

Let's stop right there and point out genuine BullSweet starting with a US dollar intraday chart.

US$ 15-Minute Chart

Does anyone see a short covering rally in the dollar? I sure don't.

Had there been an agreement to reduce the deficit by $4 trillion we might have seen one, but this deal changes nothing. Bear in mind this is coming from someone who is currently bullish on the US dollar.

Let's ask another question: Who did not expect a deal?

I accuse Frederic Dickson of genuine BullSweet.

The article continues ...

The framework of the debt agreement would raise the $14.3 trillion debt ceiling through 2012, cut spending by about $1 trillion and call for enactment of a law shaving another $1.5 trillion from long-term debt by 2021 -- or institute punishing reductions across all government areas, including Medicare and defense programs, according to congressional officials.

Senate Majority Leader Harry Reid, a Democrat, endorsed the emerging accord among Republican leaders and the Obama administration even as negotiators were working out the final details. Senate Minority Leader Mitch McConnell told senators tonight that the U.S. will not default on its obligations.

Both S&P and Moody's Investors Service are weighing a reduction of the U.S. credit rating. The impasse boosted to 50 percent the chance S&P will cut the grade from AAA within three months, the ratings company said last month.

Pathetic Deal

This is a pathetic deal. It's no wonder futures are rallying. My dead grandmother could find more cuts than this. The S&P, Moody's, and Fitch should all downgrade US debt on this deal.

$1 trillion up front and promises to cut another $1.5 trillion is the wimpiest of wimpy deals. The deficit is $1.4 trillion. The immediate cut is a back-loaded $100 billion. Then there is a possibility of another $150 billion in back-loaded cuts.

Anyone voting for this monstrosity should be ashamed.

Is Boehner Balking?

Here is something I picked up from Zero Hedge.

The Wall Street Journal "Washington Wire" comments on The U.S. Debt Battle

5:24 pm: House Speaker John Boehner (R., Ohio) appears to be balking at the debt ceiling deal that Senate Democratic Majority Leader Harry Reid of Nevada has signed. Mr. Boehner is concerned about provisions in the deal that could lead to sharp cuts in military spending, say people familiar with the situation. House aides have warned that just because Mr. Reid has signed off on the deal doesn't mean the deal is done.

Ludicrous Deal Solves Nothing

This deal is ludicrous because it does not cut enough. Congress should be ashamed.

If Boehner is concerned about excessive cuts to military spending in this deal he has truly lost his marbles.

If it was up to me, I would pull all our troops out of Iraq, Afghanistan, Japan, Europe, and 140 countries where we have troops. If we did that, we could concentrate on protecting our borders instead of being the world's policeman. The savings would be enormous.

By the way, it would be fitting if this futures ramp was the mother of all gap-and-craps. This deal solves nothing.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com

Weekend Diversions: "Spectacular" Double Meteor Shower This Week; Space-Time Cloak Could Make Events Disappear; Invisibility Cloaks from Calcite

Posted: 31 Jul 2011 11:26 AM PDT

Had enough of the debt ceiling fiasco? If so here are a few interesting weekend diversions courtesy of National Geographic.

"Spectacular" Double Meteor Shower This Week

One of the best shooting star events of the year is the annual August Perseid meteor shower. However this year's peak, on August 12, happens to coincide with a bright full moon—drastically cutting down the number of meteors visible to the naked eye.

Yet while the main event might be blocked out by the blinding moonlight, the opening act promises to be much better.

This year the lesser known Delta Aquarid meteor shower is expected to peak on Friday night, when the Delta Aquarids' more productive Perseid cousin is just starting to ramp up.

Together the showers will produce anywhere between 15 and 30 shooting stars per hour under clear, dark skies.

On average, the Perseids begin falling at a rate of around five meteors per hour. They're visible for a couple of weeks before mid-August, when they peak at hourly rates of 60 to 120 meteors.

Most people around the world can see the showers, best seen with the naked eye in a dark, rural area away from city lights. Since meteors will be streaking across the overhead skies, lie down on a blanket or recline in a lawn chair and allow your eyes to become adapted to the darkness, Samra suggested.

"Meteor shower activity always increases as the night progresses towards dawn. If you are a night owl, then staying up to catch a more spectacular show might be worth it."

But all may not be lost with the Perseids—observing the sky show a few days before the August 12 peak may work too, noted astronomer Geza Gyuk of the Adler Planetarium in Chicago.

"For example, on the night of the ninth, morning of the tenth, there will be a couple hours after the moon has set [about 2 a.m. local time] and before the morning twilight begins when it's close enough to the peak that one might expect 15 per hour."

"They are also known for the occasional nice fireball with a long-lasting 'smoke trail,'" Gyuk said. "If we get more of these than usual, then even moonlight won't spoil the fun."

Perseid Pictures: Meteor Shower Dazzles Every August

One of many images in the link.

Space-Time Cloak Possible, Could Make Events Disappear?

It's no illusion: Science has found a way to make not just objects but entire events disappear, experts say.

According to new research by British physicists, it's theoretically possible to create a material that can hide an entire bank heist from human eyes and surveillance cameras.

"The concepts are basically quite simple," said Paul Kinsler, a physicist at Imperial College London, who created the idea with colleagues Martin McCall and Alberto Favaro.

Unlike invisibility cloaks—some of which have been made to work at very small scales—the event cloak would do more than bend light around an object.

(Also see "Acoustic 'Invisibility' Cloaks Possible, Study Says.")

Instead this cloak would use special materials filled with metallic arrays designed to adjust the speed of light passing through.

In theory, the cloak would slow down light coming into the robbery scene while the safecracker is at work. When the robbery is complete, the process would be reversed, with the slowed light now racing to catch back up.

If the "before" and "after" visions are seamlessly stitched together, there should be no visible trace that anything untoward has happened. One second there's a closed safe, and the next second the safe has been emptied.

Currently, nobody knows how to do that except in fiber optics, in which the speed of a signal can be varied by a few percent by changing the intensity of the light.

There are still a few hitches to address, though, before attempting such an experiment, according to the University of St. Andrews's Leonhardt.

For instance, being able to cloak an event lasting more than a few femtoseconds—one-millionth of a nanosecond—would require light from an immensely powerful laser, he said.

"The experiment is not entirely impossible, but it is at the limit of what one can do with present technology in an ordinary university laboratory," Leonhardt said.

New Invisibility Cloak Closer to Working "Magic"

A pink object seems to vanish behind a chunk of calcite, underwater and illuminated by green light.

Harry Potter and Bilbo Baggins, take note: Scientists are a step closer to conquering the "magic" of invisibility.

Many earlier cloaking systems turned objects "invisible" only under wavelengths of light that the human eye can't see. Others could conceal only microscopic objects. (See "Two New Cloaking Devices Close In on True Invisibility.")

But the new system, developed at Massachusetts Institute of Technology and the Singapore-MIT Alliance for Research and Technology (SMART) Centre, works in visible light and can hide objects big enough to see with the naked eye.

The "cloak" is made from two pieces of calcite crystal—a cheap, easily obtained mineral—stuck together in a certain configuration.

Calcite is highly anisotropic, which means that light coming from one side will exit at a different angle than light entering from another side. By using two different pieces of calcite, the researchers were able to bend light around a solid object placed between the crystals.

"Under the assembly there is a wedge-shaped gap," said MIT's George Barbastathis, who helped develop the new system. "The idea is that whatever you put under this gap, it looks from the outside like it is not there."

It is quite amazing the stuff scientists are working on and the images from National Geographic are spectacular. Inquiring minds will want to give some of those articles a closer look.

My weekend diversion is gardening and golf.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com

Seth's Blog : Difficult conversations at work

Difficult conversations at work

When the outcome of a conversation is in doubt, don't do it by email. And show up in person if you can.

The synchronicity of face to face conversation gives you the chance to change your tone in midstream. Ask questions. A great question is usually better than a good answer.

And don't forget--the value of a long pause is difficult to overstate.
