Central Perk

Wednesday, April 24, 2013

Machine Learning and Link Spam: My Brush With Insanity

Posted: 23 Apr 2013 07:33 PM PDT

This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author's views are entirely his or her own and may not reflect the views of SEOmoz, Inc.

Know someone who thinks they’re smart? Tell them to build a machine learning tool. If they majored in, say, History in college, within 30 minutes they’ll be curled up in a ball, rocking back and forth while humming the opening bars of “Oklahoma.”

Sometimes, though, the alternative is rooting through 250,000 web pages by hand, checking them for compliance with Google’s TOS. Doing that will take you right past the rocking-and-humming stage and launch you straight into the writing-with-crayons-between-your-toes phase.

Those were my two choices six months ago. Several companies came to Portent asking for help with Penguin/manual penalties. They all, for one reason or another, had dirty link profiles.

Link analysis, the hard way. Back when I was a kid...

I did the first link profile review by hand, like this:

Download a list of all external linking pages from SEOmoz, MajesticSEO, and Google Webmaster Tools.
Remove obviously bad links by analyzing URLs. Face it: if a linking page is on a domain like “FreeLinksDirectory.com” or “ArticleSuccess.com,” it’s gotta go.
Analyze the domain and page trustrank and trustflow. Throw out anything with a zero, unless it’s on a list of ‘whitelisted’ domains.
Grab thumbnails of each remaining linking page, using Python, Selenium, and PhantomJS. You don’t have to do this step, but it helps if you’re going to get help from other folks.
Get some poor bugger (er, a faithful Portent team member) to review the thumbnails, quickly checking off whether they’re forums, blatant link spam, or something else.
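
The URL and trust filters from the list above can be sketched in a few lines of Python. This is only an illustration, not the script used at Portent: the domain hints, the whitelist, and the `trust` field are all made-up placeholders standing in for whatever patterns and metrics you actually pull.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration; the post does not list its real
# URL patterns or its whitelist.
SPAMMY_DOMAIN_HINTS = ("freelinks", "linksdirectory", "articlesuccess")
WHITELISTED_DOMAINS = {"example.edu"}


def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def needs_manual_review(link):
    """Apply the two cheap filters: spammy-looking domain, then zero trust."""
    domain = domain_of(link["url"])
    if any(hint in domain for hint in SPAMMY_DOMAIN_HINTS):
        return False  # obviously bad: discard without review
    if link["trust"] == 0 and domain not in WHITELISTED_DOMAINS:
        return False  # zero trust and not whitelisted: discard
    return True


# Made-up sample rows; "trust" stands in for a metric such as TrustFlow.
links = [
    {"url": "http://www.FreeLinksDirectory.com/page1", "trust": 12},
    {"url": "http://example.edu/resources", "trust": 0},
    {"url": "http://some-blog.example.com/post", "trust": 0},
    {"url": "http://news.example.org/story", "trust": 31},
]
to_review = [link["url"] for link in links if needs_manual_review(link)]
print(to_review)
```

Only the whitelisted zero-trust page and the page with a non-zero trust score survive to the thumbnail stage.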

After all of that prep work, my final review still took 10+ hours of eye-rotting agony.

There had to be a better way. I knew just enough about machine learning to realize it had possibilities, so I dove in. After all, how hard can it be?

Machine learning: the basic concept

The concept of machine learning isn’t that hard to grasp:

Take a large dataset you need to classify. It could be book titles, people’s names, Facebook posts, or, for me, linking web pages.
Define the categories. In this case, I’m looking for ‘spam’ and ‘good.’
Get a collection of those items and classify them by hand. Or, if you’re really lucky, you find a collection that someone else classified for you. The Natural Language Toolkit, for example, has a movie reviews corpus you can use for sentiment analysis. This is your training set.
Pick the right machine learning tool (hah).
Configure it correctly (hahahahahahaha heee heeeeee sniff haa haaa… sorry, I’m ok… ha ha haaaaaaauuuugh).
Feed in your training set, with the features — the item attributes used for classification — pre-selected. The tool will find patterns, if it can (giggle).
Use the tool to compare each item in your dataset to the training set.
The tool returns a classification of each item, plus its confidence in the classification and, if it’s really cool, the features that were most critical in that classification.

If you ignore the hysterical laughter, the process seems pretty simple. Alas, the laughter is a dead giveaway: these eight steps are easy the same way “Fly to moon, land on moon, fly home” is three easy steps.

Note: At this point, you could go ahead and use a pre-built toolset like BigML, Datameer, or Google’s Prediction API. Or, you could decide to build it all by hand. Which is what I did. You know, because I have so much spare time. If you’re unsure, keep reading. If this story doesn’t make you run, screaming, to the pre-built tools, start coding. You have my blessings.

The ingredients: Python, NLTK, scikit-learn

I sketched out the process for IIS (Is It Spam, not Internet Information Server) like this:

Download a list of all external linking pages from SEOmoz, MajesticSEO, and Google Webmaster Tools.
Use a little Python script to scrape the content of those pages.
Get the SEOmoz and MajesticSEO metrics for each linking page.
Build any additional features I wanted to use. I needed to calculate the reading grade level and links per word, for example. I also needed to pull out all meaningful words, and a count of those words.
Finally, compare each result to my training set.
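
To make the feature-building step a little more concrete, here is a rough sketch of computing links per word, a count of meaningful words, and a Flesch-Kincaid grade level. It is not the Is It Spam code: the stop-word list is a tiny placeholder (NLTK ships a much fuller one), the syllable counter is a crude vowel-group heuristic, and `page_features` is a name invented for this example.

```python
import re

# A tiny illustrative stop-word list; NLTK provides a much fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (minimum one)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def page_features(text, link_count):
    """Compute a few numeric features from page text with HTML already stripped."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(1, len(words))          # avoid dividing by zero on empty pages
    n_sentences = max(1, len(sentences))
    syllables = sum(count_syllables(w) for w in words)
    meaningful = [w.lower() for w in words if w.lower() not in STOP_WORDS]
    return {
        "words": len(words),
        "links_per_word": link_count / n_words,
        # Standard Flesch-Kincaid grade level formula.
        "fk_grade": 0.39 * n_words / n_sentences + 11.8 * syllables / n_words - 15.59,
        "meaningful_words": len(meaningful),
    }


features = page_features("Buy cheap rings. Best cheap rings for sale.", link_count=4)
print(features)
```

A short, link-stuffed snippet like the one above ends up with a very high links-per-word ratio, which is exactly the kind of signal a classifier can use.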

To do all of this, I needed a programming language, some kind of natural language processing (to figure out meaningful words, clean up HTML, etc.) and a machine learning algorithm that I could connect to the programming language.

I’m already a bit of a Python hacker (not a programmer – my code makes programmers cry), so Python was the obvious choice of programming language.

I’d dabbled a little with the Natural Language Toolkit (NLTK). It’s built for Python, and would easily filter out stop words, clean up HTML, and do all the other stuff I needed.

For my machine learning toolset, I picked a Python library called scikit-learn, mostly because there were tutorials out there that I could actually read.

I smushed it all together using some really-not-pretty Python code, and connected it to a MongoDB database for storage.

A word about the training set

The training set makes or breaks the model. A good training set means your bouncing baby machine learning program has a good teacher. A bad training set means it’s got Edna Krabappel.

And accuracy alone isn’t enough. A training set also has to cover the full range of possible classification scenarios. One ‘good’ and one ‘spam’ page aren’t enough. You need hundreds or thousands to provide a nice range of possibilities. Otherwise, the machine learning program staggers around, unable to classify items outside the narrow training set.

Luckily, our initial hand-review reinclusion method gave us a set of carefully-selected spam and good pages. That was our initial training set. Later on, we dug deeper and grew the training set by running Is It Spam and hand-verifying good and bad page results.

That worked great on Is It Spam 2.0. It didn’t work so well on 1.0.

First attempt: fail

For my first version of the tool, I used a Bayesian Filter as my machine learning tool. I figured, hey, it works for e-mail spam, why not SEO spam?

Apparently, I was already delirious at that point. Bayesian filtering works for e-mail spam about as well as fishing with a baseball bat. It does occasionally catch spam. It also misses a lot of it, dumps legitimate e-mail into spam folders, and generally amuses serious spammers the world over.

But, in my madness, I forgot all about these little problems. Is It Spam 1.0 seemed pretty great at first. Initial tests showed 75% accuracy. That may not sound great, but with accurate confidence data, it could really streamline link profile reviews. I was the proud papa of a baby machine learning tool.

But Bayesian filters can be ‘poisoned.’ If you feed the filter a training set where 90% of the spam pages talk about weddings, it’s possible the tool will begin seeing all wedding-related content as spam. That’s exactly what happened in my case: I fed in 10,000 or so pages of spammy wedding links (we do a lot of work in the wedding industry). On the next test run, Is It Spam decided that anything matrimonial was spam. Accuracy fell to 50%.
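
The poisoning effect is easy to reproduce on a toy scale. The sketch below is a bare-bones multinomial naive Bayes classifier written from scratch, not the filter used in Is It Spam 1.0, and the training sentences are invented. Because nearly all of the spam examples mention weddings, a harmless wedding page gets flagged too.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "good": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def classify(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["good"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Start from the log prior, then add the log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


nb = TinyNaiveBayes()
# Made-up training data: 90% of the spam examples are about weddings.
for _ in range(9):
    nb.train("cheap wedding dresses wedding favors bridal deals", "spam")
nb.train("payday loans casino bonus", "spam")
for _ in range(10):
    nb.train("local news weather sports recipes travel", "good")

# A perfectly legitimate wedding page still gets flagged.
verdict = nb.classify("our wedding photos and bridal party")
print(verdict)  # spam
```

The only evidence the classifier has about the word "wedding" comes from spam, so the word alone is enough to tip the verdict.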

Since we tend to use the tool to evaluate sites in specific verticals, this would never work. Every test would likely poison the filter. We could build the training set to millions of pages, but my pointy little head couldn’t contemplate the infrastructure required to handle that.

The real problem with a pure Bayesian approach is that there’s really only one feature: The content of the page. It ignores things like links, page trust and authority.

Oops. Back to the drawing board. I sent my little AI in for counseling, and a new brain.

Note: I wouldn’t have figured this out without help from SEOmoz’s Dr. Pete and Matt Peters. A ‘hat tip’ doesn’t seem like enough, but for now, it’ll have to do.

Second attempt: a qualified success

My second test used logistic regression. This machine learning model uses numeric data, not text. So, I could feed it more features. After the first exercise, this actually wasn’t too horrific. A few hours of work got me a tool that evaluates:

Page TrustFlow and CitationFlow (from MajesticSEO – I’m adding SEOmoz metrics now)
Links per word
Page Flesch-Kincaid reading grade level
Page Flesch-Kincaid reading ease
Words per page
Syllables per page
Characters per page
A few other seemingly-random bits, like images per page, misspellings, and grammar errors
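
For a feel of what logistic regression does with numeric features like these, here is a minimal from-scratch sketch using plain gradient descent. The real tool used scikit-learn; this stand-in avoids any dependency, uses only two invented, pre-scaled features (a trust score and links per word), and the six training rows are made up.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=2000, lr=0.1):
    """Fit weights by plain batch gradient descent on the log loss."""
    weights = [0.0] * (len(rows[0]) + 1)  # last weight is the bias term
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for row, label in zip(rows, labels):
            x = list(row) + [1.0]
            error = sigmoid(sum(w * v for w, v in zip(weights, x))) - label
            for i, v in enumerate(x):
                grads[i] += error * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grads)]
    return weights


def spam_probability(weights, row):
    """Return the model's estimated probability that a page is spam."""
    x = list(row) + [1.0]
    return sigmoid(sum(w * v for w, v in zip(weights, x)))


# Made-up, pre-scaled features: (trust score / 100, links per word).
rows = [(0.05, 0.40), (0.10, 0.35), (0.02, 0.50),
        (0.60, 0.02), (0.75, 0.05), (0.55, 0.03)]
labels = [1, 1, 1, 0, 0, 0]  # 1 = spam, 0 = good

weights = train_logistic(rows, labels)
p_spammy = spam_probability(weights, (0.04, 0.45))
p_clean = spam_probability(weights, (0.70, 0.04))
print(round(p_spammy, 2), round(p_clean, 2))
```

The probability doubles as the confidence score: anything near 0.5 is a good candidate for a human look, while the extremes can be trusted more.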

This time, the tool worked a lot better. With vertical-specific training sets, it ran with 85%+ accuracy.

In case you're wondering, this is what victory looks like:

[Image: This is what victory looks like]

When I tried to use the tool for more general tests, though, my coded kid tripped over its big, adolescent feet. Some of the funnier results:

It saw itself as spam.
It thought Rand’s blog was a swirling black hole of spammy despair.

False positives remain a big problem if we try to build a training set outside a single vertical.

Disappointing. But the tool chugs along happily within verticals, so we continue using it for that. We build a custom training set for each client, then run the training set against the remaining links. The result is a relatively clear report:

[Image: Excel report]

Results and next steps

With little IIS learning to walk, we’ve cut the brute-force portion of large link profile evaluations from 30 hours to 3 hours. Not. Too. Shabby.

I tried to launch a public version of Is It Spam, but folks started using it to do real link profile evaluations, without checking their results. That scared the crap out of me, so I took the tool down until we cure the false positives problem.

I think we can address the false positives issue by adding a few features to the classification set:

Bayesian filtering: Instead of depending on a Bayesian classification as 100% of the formula, we’ll use the Bayesian score as one more feature.
Grammar scoring: Anyone know a decent grammar testing algorithm in Python? If so, let me know. I’d love to add grammar quality as a feature.
Anchor text matters a lot. The next generation of the tool needs to score the relevant link based on the anchor text. Is it a name (like in a byline)? Or is it a phrase (like in a keyword-stuffed link)?
Link position may matter, too. This is another great feature that could help with spam detection. It might lead to more false positives, though. If Is It Spam sees a large number of spammy links in press release body copy, it may start rating other links located in body copy as spam, too. We’ll test to see if the other features are enough to help with this.

If I'm lucky, one or more of these changes may yield a tool that can evaluate pages across different verticals. If I'm lucky.

Insights

This is by far the most challenging development project I've ever tried. I probably wore another 10 years' enamel off my teeth in just six weeks. But it's been productive:

When you start digging into automated page analysis and machine learning, you learn a lot about how computers evaluate language. That's awfully relevant if you're a 21st Century marketer.
I uncovered an interesting pattern in Google's Penguin implementation. This is based on my fumbling about with machine learning, so take it with a grain of salt, but have a look here.
We learned that there is no such thing as a spammy page. There are only spammy links. One link from a particular page may be totally fine: For example, a brand link from a press release page. Another link from that same page may be spam: For example, a keyword-stuffed link from the same press release.
We've reduced time required for an initial link profile evaluation by a factor of ten.

It's also been a great humility-building exercise.

Meet Mr. Charbonneau, Teacher of the Year

Your Daily Snapshot for Wednesday, April 24, 2013

Yesterday President Obama honored Jeff Charbonneau, a teacher from Washington state, as the 2013 National Teacher of the Year.

Educators like Jeff and everyone up here today, they represent the very best of America -- committed professionals who give themselves fully to the growth and development of our kids. And with them at the front of the classroom and leading our schools, I am absolutely confident that our children are going to be prepared to meet the tests of our time and the tests of the future.

President Barack Obama, with Education Secretary Arne Duncan, honors 2013 National Teacher of the Year Jeff Charbonneau, State Teachers of the Year, and Principals of the Year, in the Rose Garden of the White House, April 23, 2013. (Official White House Photo by Pete Souza)

Forget Google AdWords: Pay $3 Per Month NOT $3 Per Click

This is a SubmitStart Sponsor Update.

Dear Business Owner,

If you've tried Google Adwords and other expensive search engine marketing programs that cost more than they deliver, now might be time to try a flat fee system that puts your ad on 90+ search engines and web directories as well as 2 PPC Networks (Advertise.com and Affinity.com) for just $3 to $4 per month.

No Bidding - No PPC - No Hassle - No SEO

We've provided budget-minded business owners with an inexpensive ad delivery system for 8 years. Some of our advertisers have been with us that entire time.

To find out more, visit our order page or watch our video introduction.

We provide a proven search engine marketing program for budget-minded online businesses. It doesn't matter whether you receive 10 clicks or 1,000 clicks, you still pay the same flat-fee rate of $3 - $4 per month.

As a further incentive to try our program, we'll throw in 6 free bonus ebooks (on SEO, Social Media & Traffic Building) valued at $100 with your purchase.

SEO Secrets v1.4 (49 page ebook)
Article Directory Marketing & Syndication (37 page ebook)
Using Twitter Effectively (71 page ebook)
How to Optimize for Google (10 page whitepaper)
LinkedIn Profile Optimization (40 page ebook)
Traffic Heist (56 page ebook)

Visit ExactSeek today to place your order

SubmitStart · Trade Center · Kristian IV:s väg 3 · Halmstad 302 50

Seth's Blog : Your manifesto, your culture

It's so easy to string together a bunch of platitudes and call them a mission statement. But what happens if you actually have a specific mission, a culture in mind, a manifesto for your actions?

The essential choice is this: you have to describe (and live) the difficult choices. You have to figure out who you will disappoint or offend. Most of all, you have to be clear about what's important and what you won't or can't do.

Here's one that was published this week, by my friends at Acumen:

Acumen: It starts by standing with the poor, listening to voices unheard, and recognizing potential where others see despair.

It demands investing as a means, not an end, daring to go where markets have failed and aid has fallen short. It makes capital work for us, not control us.

It thrives on moral imagination: the humility to see the world as it is, and the audacity to imagine the world as it could be. It's having the ambition to learn at the edge, the wisdom to admit failure, and the courage to start again.

It requires patience and kindness, resilience and grit: a hard-edged hope. It's leadership that rejects complacency, breaks through bureaucracy, and challenges corruption. Doing what's right, not what's easy.

Acumen: it's the radical idea of creating hope in a cynical world. Changing the way the world tackles poverty and building a world based on dignity.

Starts, demands, thrives and requires. Four words that are not in the vocabulary of most organizations.

Starts, as in, "here's where we are, where few others are." Most politicians and corporate entities can't imagine standing with the poor. Apart from them, sure. But with them?

Demands? Demands mean making hard choices about who your competition will be and what standards you're willing to set and be held to.

Thrives, because your organization is only worth doing if it gets to the point where it will thrive, where you will be making a difference, not merely struggling or posturing.

And requires, because none of this comes easy.

David highlights a very different (but strikingly similar) document from HubSpot. The same dynamic is at work: no platitudes, merely a difficult to follow (but worth it) compass for how to move forward.

Both require the hubris of caring, of thinking big and being willing to fail if that's what it takes to attempt the right thing.

It's easy to write something like this (hey, even the TSA has one) but it's incredibly difficult to live one, because it requires difficult choices and the willingness to own the outcome of your actions. If you're going to permit loopholes, wiggle room and deniability, don't even bother.

Just for You from YouTube: Weekly Update - Apr 24, 2013

Mihai T, check out the latest videos from your channel subscriptions for Apr 24, 2013. Play all »

AutoPilot Beats, 2Face, and Crizmatik at Work

+ 1 video by AutoPilotBeats

FIFA 13 | Modo Carreira - My Player | O Empréstimo #1

by TheAndr3wGamer

INSANE TRICKSHOT ON STUDIO! (NEW MAP!)

by iStylzzHD

Устроили скачки на трассе

+ 4 videos by nakedassTv

Raising the Bar on Silly

+ 1 video by Just For Laughs Gags

Great Chaos Mobile Podcast #1: The Road to SMAtlanta

by Great Chaos Entertainment

Introducing Slovak | Arasaka | Let It Go

+ 2 videos by VCEditors

Glitch - Skillz

by Glitch Menace

Just your average QWAD!!

+ 12 videos by badSmarties

Man united vs Aston Villa 2013 Final Whistle Old Trafford United Ch...

by bozowantfood

Tuesday, April 23, 2013

Mish's Global Economic Trend Analysis

U.S. Mint Runs Out of Smallest American Eagle Gold Coin; Is There a Shortage of Physical Gold? Coordinated Smackdown by Naked Shorts?
Germany Private Sector Output Declines First Time Since November; Eurozone Activity Declines 19th Time in 20 Months
Wine Country Conference "Opening Remarks" Video Posted
Political Prediction: Merkel Loses Chancellorship in September as Support for AfD Soars

U.S. Mint Runs Out of Smallest American Eagle Gold Coin; Is There a Shortage of Physical Gold? Coordinated Smackdown by Naked Shorts?

Posted: 23 Apr 2013 11:21 PM PDT

Demand for gold coins has surged following the record price plunge in gold last week. Demand is so high that the U.S. Mint Runs Out of Smallest American Eagle Gold Coin.

The U.S. Mint ran out of its smallest American Eagle gold coin after demand surged following the biggest drop in futures prices in 33 years.

Sales of the coins weighing a 10th of an ounce were suspended after demand more than doubled in 2013 from a year earlier, the Mint said today in a statement. Total sales of American Eagles in April have almost tripled from a month earlier, according to Mint data on the website.

On April 15, gold futures in New York plunged 9.3 percent, the most since 1980. Retail sales and jewelry demand soared in India, the world's top buyer, and China, the second-biggest. Coin sales also surged in Australia.

The Mint also sells 22-karat American Eagle coins of 1 ounce, half an ounce and a quarter of an ounce.

The U.S. Mint suspended sales of silver coins in January for more than a week because of lack of inventory. Sales of the coins jumped to a record that month.

Bullish or Bearish?

It's possible to make a bullish or bearish argument out of this shortage. The bullish argument is simple: demand is strong. The bearish argument is small investors are a contrarian indicator just as they were with silver in January.

I am not taking a short-term stance one way or another, so don't ask. I do like my chances longer-term as I explained at the Wine Country Conference. See Mike "Mish" Shedlock: A Brief Lesson in History.

Shortage of Physical Gold?

Some writers have spun this story into the message there is a shortage of physical gold. No there isn't. There is a temporary shortage of certain coins, no more no less.

Divergence Between Physical Gold and Paper Gold?

Other writers have noticed the price premium on small denomination coins and concluded there is some sort of "divergence between physical gold and paper gold".

Once again, that's nonsense. A premium on small-denomination coins is not the same as a general premium on physical gold itself.

How do I know?

Easy: If I went to buy or sell at GoldMoney (and GoldMoney only deals in physical metals with allocated, audited storage), I would pay the same small markup as before, based on the current futures price.

Here is another way to tell. Go buy or sell a one ounce bar and see how much it costs or how much you can get. Here's a hint: your selling price will not fetch $1900 as it once did, nor would it cost you over $1900 to buy.

Smackdown by Naked Shorts?

Many claim blatant manipulation by naked shorts. Mercy! Under this theory, shorts piled on to the tune of 163,000 gold futures. Really?

Keith Weiner tackles that theory for the Acting Man Blog in The Last Contango. Here is the pertinent chart.

Weiner asks "If someone had sold 163,000 futures to cause the price to drop, then wouldn't the open interest [in futures] have risen? If Santa went down chimneys, wouldn't there be soot on his red and white uniform?"

The answer to both questions is of course "yes". Instead, the chart shows a 16,000 open interest drop in gold futures and a 12,000 drop in silver futures.

Ignore the Hype in Both Directions

Bulls blame every drop on manipulation and frequently tout preposterous price targets. Bears cite jewelry demand and other nonsense as if it's important (and it isn't).

It is best to ignore the hype and silliness on both sides.

Fundamentally, what has changed? I suggest nothing.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com

Germany Private Sector Output Declines First Time Since November; Eurozone Activity Declines 19th Time in 20 Months

Posted: 23 Apr 2013 10:18 PM PDT

As expected in this corner, the Markit Flash Germany PMI® shows German private sector output declines for first time since November 2012.

Key Points:

Flash Germany Composite Output Index(1) at 48.8 (50.6 in March), 6-month low.

Flash Germany Services Activity Index(2) at 49.2 (50.9 in March), 6-month low.

Flash Germany Manufacturing PMI(3) at 47.9 (49.0 in March), 4-month low.

Flash Germany Manufacturing Output Index(4) at 47.9 (50.0 in March), 4-month low.

PMI vs. GDP

Lower levels of private sector business activity reflected a decrease in new order volumes for the second successive month during April. The overall pace of contraction was the steepest since October 2012, largely driven by a marked decrease in new work received by service providers. Manufacturing new orders dropped at the fastest pace so far this year but, in contrast to the service sector, the rate of new business decline remained slower than on average in 2012. In the manufacturing sector, new export orders declined at the most marked pace so far in 2013, but the rate of contraction was slightly slower than seen for overall new work.

April data suggested a general lack of pressure on operating capacity in the German private sector, as backlogs of work decreased for the twenty-second month running. The current period of declining backlogs (work-in-hand but not yet completed) is the longest since this series began over 10 years ago.

Eurozone Activity Declines 19th Time in 20 Months

The Markit Flash Eurozone PMI® shows Eurozone suffers ongoing downturn in April.

Key Points:

Flash Eurozone PMI Composite Output Index at 46.5 (46.5 in March).

Flash Eurozone Services PMI Activity Index at 46.6 (46.4 in March). Two-month high.

Flash Eurozone Manufacturing PMI at 46.5 (46.8 in March). Four-month low.

Flash Eurozone Manufacturing PMI Output Index at 46.3 (46.7 in March). Four-month low.

GDP vs. PMI

Summary:

The Markit Eurozone PMI® Composite Output Index was unchanged on March's reading of 46.5 in April, according to the flash estimate. The sub-50 reading indicated a drop in activity for the nineteenth time in the past 20 months, the exception being a marginal increase in January 2012.

Activity fell sharply again in both manufacturing and services. While the former saw the steepest rate of decline for four months, the latter saw the downturn ease slightly compared with March.

New business fell for the twenty-first successive month, with the rate of deterioration accelerating for the third month in a row to signal the steepest decline since December. Marked falls were seen in both manufacturing and services.

The ongoing deterioration in the order book pipeline prompted firms to cut payroll numbers for the sixteenth month running. The rate of job losses accelerated slightly on March, reflecting stronger rates of job shedding in both manufacturing and services.

Wishin' and Hopin'

Those wishing and hoping that Germany was going to remain divergent from the rest of the eurozone can now safely toss that notion on the scrapheap of foolish ideas.

As I have been saying, at some point Germany will start a steep acceleration of the overall eurozone recession. That time may be at hand now.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com

Wine Country Conference "Opening Remarks" Video Posted

Posted: 23 Apr 2013 07:21 PM PDT

My opening remarks for the Wine Country Conference are available at Wine Country Conference Speaker Presentations.

Unlike the speaker presentations, my opening speech is a short 7 minutes or so.

We are making a couple of video presentations available each week for three weeks.

John Hussman's presentation "An Unstable Equilibrium" was posted last week. My presentation "A Brief Lesson in History" is also available.

Speaker presentation material, Yahoo! Finance media interviews, and associated articles on Advisor Perspectives are now available online at Wine Country Conference Speaker Slides.

If you enjoy the videos and slides please consider making a Donation to the Les Turner ALS Foundation. Specify "Mish Campaign" on the donation to earmark funds for research.

All told, we raised nearly $500,000 for ALS research, subject to final audit. $100,000 of that came from a very generous matching donation from the Hussman Foundation.

Thanks again for a successful 2013 WCC and we look forward to many more! The 2014 conference will raise money for autism research and programs.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com

Political Prediction: Merkel Loses Chancellorship in September as Support for AfD Soars

Posted: 23 Apr 2013 11:18 AM PDT

Now that we have an Understanding of German Political Parties, let's consider some scenarios from reader Bernd, who lives in Germany, about how the election on September 22 plays out.

In past elections, as many as 40% of the eligible population did not vote. This brings a very large pool of voters that may turn out for the anti-Euro AfD party.

Bernd comments "All indications show that AfD has a huge potential to convert nonvoters to voters. If election turnout is high, seats in Parliament may look drastically different than today."

For the past two weeks I have been watching the iPhone app Wahl-O-Meter and AfD has risen from 5% of the vote to 6.6% now.

Not only has AfD been rising, but today is the first day I have seen AfD above Die Linke. In the same timeframe, support for FDP has been sinking.

FDP is right on the bubble, polling 5% down from 5.4% or so. It takes 5% of the vote to stay in parliament.

This is significant because FDP currently has the third largest representation in parliament and is the junior partner in Merkel's existing CDU/CSU/FDP coalition. The very real danger to Merkel is FDP goes to zero percent representation.

In theory CDU, CSU, and SPD could easily form a coalition that totals over 50% of parliament. However, the SPD political party leadership has ruled out a coalition government that includes Merkel as chancellor.

AfD Support Rises to 19% in Handelsblatt Online Poll

Via Google translate, please note 19 percent would vote for the anti-euro party.

The new anti-euro party Alternative for Germany (AfD) has a good chance of entering the Bundestag in the autumn. That is the result of a representative survey by an online market research company on behalf of Handelsblatt Online. 19.2 percent of the 1,003 respondents answered yes to the question of whether they would give the party their vote in the general election (24.9 percent of men and 14.8 percent of women).

54.6 percent of respondents (56.7 percent of men, 53 percent of women) would not vote for the AfD. A further 26 percent of respondents said they had not yet decided (18.4 percent of men, 32.2 percent of women).

The party has its greatest voter potential among 46- to 65-year-olds: 23.1 percent of this age group would give the AfD their vote (among 31- to 45-year-olds: 19.3 percent; among 18- to 30-year-olds: 14.2 percent).

Online polls are notoriously inaccurate, so the key takeaway is continually rising interest.

A few weeks ago many news agencies were stating AfD would not reach the five percent threshold. Bernd and I think 12% or more is easily achievable.

With that backdrop out of the way, let's take a look at reader Bernd's speculative estimates for the September election.

Bernd's Speculative Estimates

CDU/CSU: 36%
SPD: 23%
Grünen: 13%
AfD: 12%
Die Linke: 06%
FDP: 04%
Piraten: 03%
Others: 03%

Should that scenario or any close approximation play out, it will be quite difficult for Merkel to stay in power.

A "natural coalition" between CDU/CSU and AfD could in theory work, and it might not take 50% of the popular vote to form such a coalition on account of the parties losing representation. Yet, even in such scenario, the price to pay for CDU/CSU would likely be Merkel's chancellorship.
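
The arithmetic behind that claim can be checked quickly with Bernd's own estimates. The sketch below makes a deliberately simplified assumption: seats are split in proportion to votes among the parties that clear the 5% threshold, and "Others" is treated as a single bloc that falls below it. The real German system has further wrinkles that this ignores.

```python
# Bernd's speculative vote shares from the post, in percent.
estimates = {
    "CDU/CSU": 36, "SPD": 23, "Grünen": 13, "AfD": 12,
    "Die Linke": 6, "FDP": 4, "Piraten": 3, "Others": 3,
}
THRESHOLD = 5  # percent of the vote needed to stay in parliament

# Simplifying assumption: only parties at or above the threshold get seats,
# and seats are proportional to their share of the represented vote.
represented = {p: v for p, v in estimates.items() if v >= THRESHOLD}
represented_total = sum(represented.values())
seat_share = {p: 100 * v / represented_total for p, v in represented.items()}

coalition_votes = estimates["CDU/CSU"] + estimates["AfD"]
coalition_seats = seat_share["CDU/CSU"] + seat_share["AfD"]
print(f"{coalition_votes}% of votes -> {coalition_seats:.1f}% of seats")
# 48% of votes -> 53.3% of seats
```

Under these assumptions, 10% of the vote goes to parties that win no seats, so 48% of the vote is enough for a majority in parliament.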

Bernd outlines four possibilities

Four Possibilities

SPD, CDU, CSU form a coalition in which Merkel steps aside or is forced out.
SPD forms a coalition with half term Merkel (CDU) and half term Steinbrück (SPD) as chancellor.
The politics splinters as happened in Italy with unknown effects for the coming government.
CDU/CSU and AfD form a coalition under a new and unknown chancellor. Together they reform EU politics.

Bernd states option four would be ideal but right now such a possibility is wishful thinking as opposed to a strong likelihood.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com
