Wednesday, 15 April 2015
Using Term Frequency Analysis to Measure Your Content Quality - Moz Blog
Posted on: Wednesday 15 April 2015, 02:14
Posted by EricEnge

It's time to look at your content differently, and to start understanding just how good it really is. I am not simply talking about titles, keyword usage, and meta descriptions. I am talking about the entire page experience. In today's post, I am going to introduce the general concept of content quality analysis, why it should matter to you, and how to use term frequency (TF) analysis to gather ideas on how to improve your content.
TF analysis is usually combined with inverse document frequency analysis (collectively, TF-IDF analysis). TF-IDF analysis has been a staple concept in information retrieval science for a long time. You can read more about TF-IDF and other search science concepts in Cyrus Shepard's excellent article here. For purposes of today's post, I am going to show you how you can use TF analysis to get clues as to what Google is valuing in the content of sites that currently outrank you. But first, let's get oriented.

Conceptualizing page quality

Start by asking yourself if your page provides a quality experience to people who visit it. For example, if a search engine sends 100 people to your page, how many of them will be happy? Seventy percent? Thirty percent? Less? What if your competitor's page gets a higher percentage of happy users than yours does? Does that feel like an "uh-oh"?

Let's think about this with a specific example in mind. What if you ran a golf club site, and 100 people came to your page after searching on a phrase like "golf clubs"? What are the kinds of things they may be looking for?
Here are some things they might want:
This is really only a partial list, and the specifics of your site can certainly vary from what I laid out above for any number of reasons. So how do you figure out what it is that people really want? You could pull in data from a number of sources. For example, data from your site search box can be invaluable. You can do user testing on your site. You can conduct surveys. These are all good sources of data.

You can also look at your analytics data to see what pages get visited the most. Just be careful how you use that data. For example, if most of your traffic is from search, this data will be biased by incoming search traffic, and hence by what Google chooses to rank. In addition, you may only have a small percentage of the visitors to your site going to your privacy policy, but chances are good that significantly more users than that notice whether or not you have a privacy policy. Many of these will be satisfied just to see that you have one and won't actually go check it out.

Whatever you do, it's worth using many of these methods to determine what users want from the pages of your site, and then using the resulting information to improve your overall site experience.

Is Google using this type of info as a ranking factor?

At some level, they clearly are. Google and Bing have evolved far beyond the initial TF-IDF concepts, but we can still use those concepts to better understand our own content.

The first major indication we had that Google was performing content quality analysis was the release of the Panda algorithm in February of 2011. More recently, we know that on April 21 Google will release an algorithm that makes the mobile friendliness of a web site a ranking factor. Pure and simple, this algo is about the user experience with a page.

Exactly how Google is performing these measurements is not known, but what we do know is their intent. They want to make their search engine look good, largely because it helps them make more money. Sending users to pages that make them happy will do that. Google has every incentive to improve the quality of their search results in as many ways as they can.

Ultimately, we don't actually know what Google is measuring and using. It may be that the only SEO impact of providing pages that satisfy a very high percentage of users is an indirect one. That is, so many people like your site that it gets written about more, linked to more, shared widely on social media, and earns great engagement, so Google sees other signals that it does use as ranking factors, and that is why your rankings improve. But do I care if the impact is a direct one or an indirect one? Well, NO.

Using TF analysis to evaluate your page

TF-IDF analysis is more about relevance than content quality, but we can still use various precepts from it to help us understand our own content quality. One way to do this is to compare the results of a TF analysis of all the keywords on your page with those of the pages that currently outrank you in the search results. In this section, I am going to outline the basic concepts for how you can do this. In the next section I will show you a process that you can use with publicly available tools and a spreadsheet.

The simplest form of TF analysis is to count the number of uses of each keyword on a page. However, the problem with that is that a page using a keyword 10 times will be seen as 10 times more valuable than a page that uses a keyword only once. For that reason, we dampen the calculations. I have seen two methods for doing this, as follows:
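As a minimal Python sketch of these two dampening methods, based on the descriptions that follow (divide by the count of the most frequent word, or take the base-10 log of the count): the function names, the simple tokenizer, and the sample text are assumptions made for illustration, not part of the original post or its spreadsheet.

```python
import math
import re
from collections import Counter


def keyword_counts(text):
    """Count lowercase word tokens in a page's visible text (naive tokenizer)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def tf_max_normalized(counts):
    """Method 1: each count divided by the count of the most frequent word."""
    if not counts:
        return {}
    top = max(counts.values())
    return {word: n / top for word, n in counts.items()}


def tf_log10(counts):
    """Method 2: log base 10 of the raw count (a single use scores 0.0)."""
    return {word: math.log10(n) for word, n in counts.items()}


# Hypothetical page text, invented for this example.
page = "golf clubs for sale: golf drivers, golf putters, and used clubs"
counts = keyword_counts(page)
m1 = tf_max_normalized(counts)
m2 = tf_log10(counts)

for word in ("golf", "clubs", "putters"):
    print(word, counts[word], round(m1[word], 3), round(m2[word], 3))
```

Taken literally, the log method gives a word used once a score of zero; a common variant in information retrieval texts adds 1 to the logarithm so that a single use still counts for something.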
The first method divides the number of repetitions of a keyword by the count for the most popular word on the entire page. This eliminates the inherent advantage that longer documents might otherwise have over shorter ones. The second method dampens the total impact in a different way, by taking the log base 10 of the actual keyword count. Both methods still give some value to incremental uses of a keyword, but dampen that value substantially. I prefer to use method 1, but you can use either method for our purposes here.

Once you have the TF calculated for every different keyword found on your page, you can then start to do the same analysis for pages that outrank you for a given search term. If you were to do this for five competing pages, the result might look something like this:
I will show you how to set up the spreadsheet later, but for now, let's do the fun part, which is to figure out how to analyze the results. Here are some of the things to look for:
You can then tag these words for further analysis. Once you are done, your spreadsheet may now look like this:
To make this fit into the screen shot above and keep it legible, I eliminated some columns you saw in my first spreadsheet. The sample analysis here is for the movie "Woman in Gold". You can see the full spreadsheet of calculations here. Note that we used an automated approach to marking some items as "Low Ratio," "High Ratio," or "All Competitors Have, Client Does Not."

None of these flags has meaning by itself, so you now need to put all of this into context. In our example, the following words probably have no significance at all: "get", "you", "top", "see", "we", "all", "but", and other words of this type. These are just very basic English language words. But we can see other things of note relating to the target page (a.k.a. the client page):
Note that the last item is only visible if you open the spreadsheet. The issues above could well be significant, as the lead actors, reviews, and similar terms are indications that a page has in-depth content. We see that competing pages that rank have details of the story, so that's an indication that this is what Google (and users) are looking for. The fact that the main key phrase and the word "billing" are used to a proportionally high degree also makes the page seem a bit spammy.

In fact, if you look at the information closely, you can see that the target page is quite thin in overall content. So much so that it almost looks like a doorway page. It looks like it was put together by the movie studio itself, just not very well, as it presents little in the way of a home page experience that would cause it to rank for the name of the movie!

In the many different times I have done an analysis using these methods, I've been able to make many different types of observations about pages. A few of the more interesting ones include:
These types of observations are interesting and valuable, but it's important to stress that you shouldn't be overly mechanical about this. The value in this type of analysis is that it gives you a technical way to compare the content on your page with that of your competitors. This type of analysis should be used in combination with other methods that you use for evaluating that same page. I'll address this some more in the summary section below.

How do you execute this for yourself?

The full spreadsheet contains all the formulas, so all you need to do is link in the keyword count data. I have tried this with two different keyword density tools, the one from Searchmetrics, and this one from motoricerca.info. I am not endorsing these tools, and I have no financial interest in either one; they just seemed to work fairly well for the process I outlined above. To provide the data in the right format, please do the following:
This may sound a bit tedious (and it is), but it has worked very well for us at STC.

Summary

You can also use usability groups and a number of other methods to figure out what users are really looking for on your site. However, what this does is give us a look at what Google has chosen to rank the highest in its search results. Don't treat this as some sort of magic formula where you mechanically tweak the content to get better metrics in this analysis. Instead, use this as a method for slicing into your content to better see it the way a machine might see it. It can yield some surprising (and wonderful) insights!
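For readers who would rather script the comparison than build it in a spreadsheet, the automated flags mentioned earlier ("Low Ratio," "High Ratio," and "All Competitors Have, Client Does Not") could be approximated along these lines. This is only a sketch: the thresholds, the use of the competitor average, the function name, and the sample TF values are all assumptions for illustration, not the formulas in the actual spreadsheet.

```python
def flag_terms(client_tf, competitor_tfs, low=0.5, high=2.0):
    """Label terms by comparing the client's TF with the competitor average.

    `low` and `high` are illustrative thresholds, not values from the article.
    """
    if not competitor_tfs:
        raise ValueError("at least one competitor page is required")

    # Collect every term seen on the client page or any competitor page.
    terms = set(client_tf)
    for tf in competitor_tfs:
        terms.update(tf)

    flags = {}
    for term in sorted(terms):
        client = client_tf.get(term, 0.0)
        competitor_values = [tf.get(term, 0.0) for tf in competitor_tfs]

        if client == 0.0:
            # Only flag a missing term when every competitor uses it.
            if all(value > 0.0 for value in competitor_values):
                flags[term] = "All Competitors Have, Client Does Not"
            continue

        average = sum(competitor_values) / len(competitor_values)
        if average == 0.0:
            flags[term] = "High Ratio"  # term appears on the client page only
            continue

        ratio = client / average
        if ratio <= low:
            flags[term] = "Low Ratio"
        elif ratio >= high:
            flags[term] = "High Ratio"
    return flags


# Made-up TF values (method 1 style, between 0 and 1) for a client page
# and two competing pages.
client = {"woman": 1.0, "gold": 1.0, "billing": 0.8, "tickets": 0.1}
competitors = [
    {"woman": 0.9, "gold": 1.0, "billing": 0.1, "tickets": 0.5, "review": 0.4},
    {"woman": 1.0, "gold": 0.8, "billing": 0.1, "tickets": 0.7, "review": 0.6},
]

result = flag_terms(client, competitors)
for term, flag in result.items():
    print(term, "->", flag)
```

As with the spreadsheet flags, the output is only a list of candidates to look at; common words will still show up and need to be set aside by hand.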
Seth's Blog: Are you feeling lucky?
Expected value is a powerful concept, easy to understand, often difficult to use in daily life.
It's the value of an outcome multiplied by the chances it will happen.
If there's a one in ten chance you'll get a $50 ticket for parking here, the expected value (the cost) of parking here is $5. Park here enough times, and that's what it's going to cost you.
If there's a one in five chance you'll win that lawsuit for a million dollars, the expected value of the suit is $200,000.
That's not a guess or a vague hunch, it's actually true. If the odds are described properly (and setting those odds is an entirely different discussion) then the value of the opportunity (or the cost of it) is clear.
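The arithmetic behind the two examples above can be checked in a few lines of Python. The function name is just a label chosen for this sketch, and exact fractions are used so the odds are not blurred by floating-point rounding.

```python
from fractions import Fraction


def expected_value(probability, value):
    """The value of an outcome multiplied by the chance it will happen."""
    return probability * value


# One in ten chance of a $50 parking ticket.
ticket_cost = expected_value(Fraction(1, 10), 50)

# One in five chance of winning a $1,000,000 lawsuit.
lawsuit_value = expected_value(Fraction(1, 5), 1_000_000)

print(ticket_cost)    # 5
print(lawsuit_value)  # 200000
```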
And yet...
And yet we anchor our risks, often overestimating just how much it's going to cost us to get a ticket.
And we anchor our possible gains, usually overestimating how much that opportunity is worth (which is why so few lawsuits that should settle, do).
Humans are quite bad at dealing with ambiguity, and even worse when there's money on the table. Ellsberg's paradox helps us understand some of the bugs in the system, and perhaps we can take better risks by using a pencil, not our gut, to decide what a chance is worth.
Tuesday, 14 April 2015
Mish's Global Economic Trend Analysis
- "Inconceivable" Negative Interest Rates on Mortgages in Portugal and Spain, with Italy On Deck
- France Considers Forcing Google to Disclose Search Algorithm; Too Much Satisfaction!
- Experts Confounded: Retail Sales Rise First Time in Four Months, But Weaker Than Expected
- Spotlight on China: Margin Debt, Trading Accounts, Construction Equipment
"Inconceivable" Negative Interest Rates on Mortgages in Portugal and Spain, with Italy On Deck
Posted: 14 Apr 2015 11:22 PM PDT

The vast majority of mortgages in Portugal, and a huge number in Italy and Spain, are tied to Euribor, the rate it costs European banks to borrow from each other. If Euribor drops low enough, banks will have to pay borrowers. It has already happened in Spain.

The Wall Street Journal reports Tumbling Interest Rates in Europe Leaves Some Banks Owing Money on Loans to Borrowers.

Tumbling interest rates in Europe have put some banks in an inconceivable position: owing money on loans to borrowers.

Inconceivable Payback

Reflections on the Inconceivable

Link if video does not play: Princess Bride.

Mike "Mish" Shedlock
http://globaleconomicanalysis.blogspot.com

Mike "Mish" Shedlock is a registered investment advisor representative for SitkaPacific Capital Management. Sitka Pacific is an asset management firm whose goal is strong performance and low volatility, regardless of market direction. Visit http://www.sitkapacific.com/account_management.html to learn more about wealth management and capital preservation strategies of Sitka Pacific.
France Considers Forcing Google to Disclose Search Algorithm; Too Much Satisfaction!
Posted: 14 Apr 2015 01:18 PM PDT

Too Much Satisfaction!

Heaven forbid consumers actually like something too much. If they do, they buy it or use it more than they buy or use competing products. And when that happens, well, it must be "unfair" competition. Lord knows we cannot possibly tolerate too much consumer satisfaction.

So, with that line of thinking: Europe to Accuse Google of Illegally Abusing its Dominance.

Google will on Wednesday be accused by Brussels of illegally abusing its dominance of search in Europe, a step that ultimately could force it to change its business model fundamentally and pay hefty fines.

Requiring Google to disclose its algorithms is tantamount to requiring Google to give away its trade secrets and patents for free. Requiring Google to list other search engines is like requiring Ford dealerships to sell GM autos.

Search Engine Choices

People can choose from any number of search engines. Here are the Top 15 Search Engines. I show a selection below.
No Tracking

DuckDuckGo bills itself as the "Search Engine That Does Not Track You". No tracking is an important issue to some people, not others. If the issue becomes important enough, Google will have to change its model or it will lose traffic to DuckDuckGo. That is how change should happen, not by EU witch hunts.

By the way, I do not believe Google locks any publishers into using Google search ads. If someone wants to use non-Google ads, they are free to do so. If the results are not as good, well, maybe the higher price of Google ads is worth it.

Why Do People Use Google Search?

Search users use Google for a simple reason: They like it. It does not matter why. In the eyes of the EU, Google provides too much satisfaction. And the EU will not allow that!

Mike "Mish" Shedlock
Experts Confounded: Retail Sales Rise First Time in Four Months, But Weaker Than Expected
Posted: 14 Apr 2015 10:45 AM PDT

The Bloomberg retail sales consensus estimate was for a 1.1% gain. Sales did rise for the first time in four months, but not as much as expected.

Retail sales in March rebounded 0.9 percent after dropping 0.5 percent in February. The market consensus for March was for a 1.1 percent boost. Excluding autos, sales gained 0.4 percent, following no change in February. Expectations were for a 0.6 percent increase. Gasoline sales dipped 0.6 percent after a 2.3 percent increase in February. Excluding both autos and gasoline, sales rebounded 0.5 percent after declining 0.3 percent in February. Expectations were for a 0.4 percent increase.

Experts Confounded

Please consider U.S. Retail Sales Rise for First Time in Four Months.

U.S. retail sales rose for the first time in four months in March, but the gain wasn't enough to offset weaker spending during the winter months as consumers continued to largely pocket savings from cheaper gasoline prices.

Possible Explanations
I opt for a combination of two and three.

Mike "Mish" Shedlock
Spotlight on China: Margin Debt, Trading Accounts, Construction Equipment
Posted: 14 Apr 2015 01:25 AM PDT

In response to my April 1 post China Margin Debt Soars to Record 1 Trillion Yuan; Another Central Bank Sponsored Bubble, I received an email from reader Nicolas. He writes ...

Hello Mish

I certainly was unaware I was followed by banks in Switzerland. Thanks!

The Bloomberg data is from SSE Margin, in Chinese. I asked my friend Chris Puplava at Financial Sense if he was aware of a Bloomberg tracking symbol. We do not believe there is such a symbol for margin. However, Chris did locate this interesting chart of the Shanghai stock market vs. new accounts that is available on Bloomberg.

Shanghai Stock Index vs. New Accounts

I get lots of data from readers, and I appreciate it! In regards to China, reader Norman writes ...

Hello Mish,

Komatsu Orders

Komatsu is just a single manufacturer. It may not be representative of all such activity and orders. But given the collapse in commodity prices such as iron ore, I suspect it is. If so, this segment of the Chinese economy looks like a disaster. Those expecting a rebound in Chinese housing or construction are likely mistaken. The new game in town is clearly stock market speculation.

Chinese Growth

My post Reality Check: How Fast is China Growing? Global Recession at Hand is also consistent with the China rapid slowdown thesis.

Mike "Mish" Shedlock