A Peek Under the Hood: How we Manage the SEOmoz Community
- A Peek Under the Hood: How we Manage the SEOmoz Community
- 58 billion URLs in the Latest, Largest Linkscape Index Update Yet
- How To Handle Downtime During Site Maintenance
A Peek Under the Hood: How we Manage the SEOmoz Community
Posted: 17 Jan 2012 04:39 PM PST Posted by jennita

Have you ever been a part of a community and wondered, "How does it all happen?" Well, today is your lucky day! In the spirit of TAGFEE, I've decided to lift the Moz hood and show you what it takes to manage a large community. In fact, this is the first post in a series on community management. Today I'll be explaining the who, what, when, where, and how of managing the SEOmoz community. Knowing who the people behind the scenes are, the ones keeping the community in order and running smoothly, matters just as much as knowing what exactly we consider the community to be and how we manage it. In the next posts, I'll dig deeper into subjects such as how we deal with negativity, how we gained over 100k Twitter followers, and what we're planning for our Google+ strategy. For now, let's jump into the SEOmoz community and see how it's done.

Who Are We?

Over the last couple of years, the community has grown immensely. It quickly became imperative to build a team to take care of different aspects of the community; I simply couldn't handle all of it on my own anymore. So, before I jump too far into the what, why, and how of managing the community, I'd like to introduce you to the "who."
58 billion URLs in the Latest, Largest Linkscape Index Update Yet
Posted: 17 Jan 2012 08:01 AM PST Posted by randfish

I've got good news. Today marks a new Linkscape index (only 14 days after our previous index rollout), which means new data in Open Site Explorer, the Mozbar, the Web App, and the Moz API. It's also more than 60% larger than our previous update in early January and shows better correlations with rankings in Google.com; I'm pretty excited.

For the past couple of years, SEOmoz has focused on surfacing quality links and high-quality, well-correlated-with-rankings metrics drawn from a large sample of the web's link graph. However, we've heard feedback that this isn't enough and may not be exactly what many who research links are seeking (or at least, it's not fulfilling all the functions you need). We're responding by moving, starting with today's launch, to a new, consistently larger link index.

Today's data is different from how we've built Linkscape indices in the past. Rather than take only those pages we've crawled in the past 3-4 weeks, we're using all of the pages we've found since October 2011, replacing anything that's been more recently updated/crawled with the newer version, and producing an index more like what you'd see from Google or Bing (where "fresh" content gets recrawled more frequently and static content is crawled/updated less often). This new index format will let us expose a much larger section of the web on an ongoing basis, and it reduces the redundancy of recrawling web pages that haven't been updated in months or years.

Below are two graphs showing the last year of Linkscape updates and their respective sizes in terms of individual URLs (at top) and root domains (at bottom). As you can see, this latest index is considerably larger than anything we've produced recently. We had some success growing URL counts over the summer, but that actually lowered our domain diversity (and hurt the correlation numbers of some metrics), so we rolled back to a previous index format until now.

This means you'll see more links pointing to your sites (on average, at least) and to those of your competitors. Our metrics' correlations are slightly increased (I hope to show off more detailed data on that in a future post with help from our data scientist, Matt), which was something we worried about with a much larger index, but we believe we've managed to retain mostly quality stuff (though I would expect there'll be more "junk" in this index than usual).

The oldest crawled URLs included here were seen 82 days ago, and the newest stuff is as fresh as the New Year. Despite this mix of old + new, the percentage of "fresh" material is actually quite high. You can see a histogram below (ignore the green line) showing the distribution of URLs from various timeframes going into this new index. The most recent portion, crawled in the last two-thirds of December, represents a solid majority.

Let's take a look at the raw stats for index 49:
In addition to this good news, I have some potentially more hilarious and/or tragic stuff to share. I've made a deal with our Linkscape engineering group: if they release an index with 100+ billion URLs by March 30th (just 72 days away), I will shave/grow my facial hair into whatever style they collectively approve*. Thus, you may be seeing a Whiteboard Friday with a beardless or otherwise peculiar-looking presenter in the early spring. :-)

As always, feedback on this new index is welcome and appreciated. If some of the pages or links are looking funny, please let us know.

* 20th century European dictator mustaches excluded
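As an aside, the "keep the freshest version of each URL" merge described at the top of this post is easy to picture in code. The sketch below is purely illustrative: the function name and data layout are invented, and this is not SEOmoz's actual pipeline.

    <?php
    // Illustrative only: merge crawl batches (oldest first), keeping the
    // most recently crawled version of each URL. Assumed layout:
    // each batch maps URL => ['crawled_at' => timestamp, ...page data].
    function mergeFreshest(array $batches): array
    {
        $index = [];
        foreach ($batches as $batch) {
            foreach ($batch as $url => $page) {
                // Replace an entry only when this crawl of the URL is newer.
                if (!isset($index[$url]) || $page['crawled_at'] > $index[$url]['crawled_at']) {
                    $index[$url] = $page;
                }
            }
        }
        return $index;
    }
    ?>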
How To Handle Downtime During Site Maintenance
Posted: 17 Jan 2012 03:30 AM PST Posted by Frederik Hyldig

This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author's views are entirely his or her own and may not reflect the views of SEOmoz, Inc.

In this post I will explain how to handle cases of planned downtime: a short period of time during which you purposely make your website inaccessible. This can be due to significant changes to the site or to server maintenance. Making the entire website inaccessible should always be the last resort, but in some cases it is necessary. Below you will find suggestions for how to proceed with SEO in mind.

Tell both humans and robots that it's only a temporary shutdown

In the case of a temporary shutdown, you should always inform both humans (visitors) and robots (search engines), so that they are aware that it is a planned closure and that it is only temporary. If possible, you should also state when the website is expected to be back online. This ensures that both humans and robots will return at a later time to find what they expected to find in the first place. Two mistakes are often seen when a website is made temporarily unavailable:

Mistake 1 - All files are removed from the server

When humans and robots attempt to reach the website, they get a 404 error, which means that the requested page cannot be found. This informs neither humans nor search engines of what is actually happening; one is typically shown a generic 404 error page. The worst-case scenario is that people will think the website no longer exists and will give up trying to find it again. Search engines handle this situation in a similar fashion: to them, a 404 error means that the page no longer exists, and in time it will be deleted from their index.

Mistake 2 - A simple page is put on the server with a short message explaining the closure

An alternative to the above is to remove all files and put one very simple file on the server that explains, in one or two sentences, why the website is closed. All the old pages are then redirected to this file. This method may tell humans what the problem is, but it still makes no sense to the search engines. They can in fact become so confused by this that they believe the temporary state of the website – the few sentences explaining the problem – to be the permanent website going forward. Depending on how the redirection of the other pages has been carried out, you also risk the search engines concluding that all the other pages of the website have been (re)moved and that only the front page should be ranked in search results. This is a sure way to lose rankings.

Briefly on HTTP status codes

Every time you visit a website, your browser receives a message from the server that hosts the website. This message is called an HTTP status code. As an SEO, it is necessary to understand what the most important codes mean.

200 OK - The request has succeeded. This is the standard response for successful HTTP requests.

301 Moved Permanently - The requested resource has been assigned a new permanent location, and this and all future requests should be directed there. This status code is used for 301 redirects, which in most instances are the best method for implementing redirects on a website. A 301 redirect will pass most, if not all, of the linkjuice from the original location.
302 Found - The requested resource resides temporarily at a different location. By using a 302 redirect instead of a 301, search engines will know that this is only a temporary state. No appreciable amount of linkjuice will be passed.

404 Not Found - The server has not found anything matching the requested location. No indication is given of whether the condition is temporary or permanent. In time, the page will be removed from the search engine's index.

503 Service Unavailable - The server is currently unavailable (for example, due to overload or maintenance). Search engines will know that this is a temporary state. This is the status code to use when taking down a site for maintenance.

You can read more about HTTP status codes here. Also check out this infographic on HTTP status codes by Dr. Pete.

How to inform search engines that the downtime is temporary

If you take down your website temporarily, you must inform search engines such as Google. As described above, this is done with the HTTP status code 503 Service Unavailable, which tells the search engines that the server is temporarily unavailable. To do this, you must first create a file on the server that returns a 503 status code; when a search engine sees it, it will understand the situation. This can be done by copying a few lines of PHP into Notepad (or the like) and saving the file as 503.php (see the sketch after this section). You must then place this file in the root of your server. The first two header lines declare the 503 status code, and the last line tells when the website is expected to be back online.

Google understands this message, so it is possible to tell Google when to visit the website again. You must provide either a number (of seconds) or a date. If you live in Denmark like I do and you expect to return on the 5th of January 2012 at 14:00, you would use a Retry-After date like the one in the sketch. Notice that the code says 13:00:00, even though the target time is 14:00:00. This is because the time must be provided in GMT/UTC, which in my case is 1 hour behind local time.

But it is not enough to just put a 503 message on your server. You will receive visitors (Google included) from many different sources, arriving at all sorts of pages on your website. They must all be redirected to the message explaining that the website is temporarily closed. On an Apache/Linux server, this is easily solved with a .htaccess file that redirects all pages to the 503.php file. The .htaccess file is often used for 301 redirects, but that is not our purpose here: we will use a 302 redirect. You may have been warned about this sort of redirect before, and for good reason; it can do a great deal of damage if used incorrectly. But in this case it is exactly what we need, and a 301 redirect would in fact be detrimental in its place.

Save the 6 lines shown in the second sketch after this section as a .htaccess file and place it in the root of your server as well. The 'R' flag in the last line indicates a redirect, and R is a 302 by default; to create a 301 redirect, it would instead say [R=301,L]. The clever thing about this file is that we can give ourselves access to the site while showing everyone else the 503 message. Let's say you have the following IP address: 12.345.678.910. You then put those numbers in line 4, as shown in the sketch. When you have placed the two files (503.php and .htaccess) on your server, you're done.
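For reference, a minimal 503.php along the lines described above might look like the sketch below. The exact lines are an illustration, not the author's original file:

    <?php
    header('HTTP/1.1 503 Service Temporarily Unavailable');
    header('Status: 503 Service Temporarily Unavailable');
    // Either a number of seconds, e.g. 'Retry-After: 3600',
    // or a date in GMT, as in the Denmark example above:
    header('Retry-After: Thu, 05 Jan 2012 13:00:00 GMT');
    ?>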
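Likewise, a sketch of the 6-line .htaccess described above. mod_rewrite directives of this shape are standard, but the exact lines here are assumed rather than taken from the original file:

    Options +FollowSymLinks
    RewriteEngine On
    RewriteBase /
    RewriteCond %{REMOTE_ADDR} !^12\.345\.678\.910$
    RewriteCond %{REQUEST_URI} !^/503\.php$
    RewriteRule .* /503.php [R,L]

Line 4 is where your own IP address goes, so that you can still reach the site, and the REQUEST_URI condition on line 5 keeps 503.php itself from being caught in a redirect loop.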
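Looking ahead to the visitor-facing message covered below: the same file can carry an HTML body after the header() calls, since anything after the closing ?> tag is sent as the page content. The markup here is a hypothetical example:

    <?php
    header('HTTP/1.1 503 Service Temporarily Unavailable');
    header('Status: 503 Service Temporarily Unavailable');
    header('Retry-After: Thu, 05 Jan 2012 13:00:00 GMT');
    ?>
    <html>
      <head><title>Down for maintenance</title></head>
      <body>
        <h1>We are doing a bit of maintenance</h1>
        <p>The site will be back online on 5 January 2012 at 14:00.</p>
      </body>
    </html>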
You now have peace and quiet to tinker with your website, as long as you leave those two files in the root of your server. If Google visits, it will know that the site will be back later, and you've even let it know when to try again. But what about passing on the message to your visitors?

How to tell your visitors that the website is only closed temporarily

With a few additions to the 503.php file we made just before, we can pass a message on to visitors: as the extended sketch above shows, any HTML placed after the header() calls is served as the body of the page, so visitors see the maintenance message instead of the normal site. And if we inspect the response the server sends to Google, with a tool such as Firebug, Web-Sniffer.net, or the like, we can confirm that it returns the 503 status code along with the Retry-After date.

Now you have informed both humans and robots to come back later. This is the best way to handle server maintenance and prevent Google from indexing the temporary version of the website. It should be possible to get through a temporary closure without the website's rankings suffering serious consequences.

A Quick Note about SOPA Protests

Keri from SEOmoz here! This post is also helpful if you want to protest SOPA tomorrow (January 18th) and want to minimize the effect on your rankings. Pierre Far from Google shared a post on Google+ called "Website outages and blackouts the right way" that you might want to check out for information straight from Google.