SEOmoz Daily SEO Blog |
Excel For SEOs: Manipulating URL Strings with Functions Posted: 19 Dec 2010 01:17 PM PST Posted by MikeCP Over the past few months, I've been preparing a guide with the tentative title, Excel for SEOs: Lessons for Aspiring Ninjas. I've embarked on this great journey because sometime during the #Mozinar back in September I decided that my Excel abilities were far too limited, especially compared to all the smarty pants that were giving great presentations and advice about SEO data analysis. After talking with some of my colleagues I found that I wasn't alone in my feelings of Excel inadequacy. So Excel for SEOs: Lessons for Aspiring Ninjas was born as a means to help others by documenting my quest for Excel Ninj-ocity. The full document will be released sometime in January, but here is a selection that I hope provides some value alone. While having a working knowledge of the functions described below can help with a multitude of SEO tasks, the specific tasks that we'll be completing in the following include:
-----SNIP----- Lesson 2: More Functions - Text ManipulationThe functions on which we'll be focusing in this lesson are useful for dealing with text manipulation. As we'll see from the examples, there are quite a few scenarios wherein the SEO has to manipulate a text string. Some of the formulas we'll talk about are pretty simple to grasp individually, but can get a bit confusing when used together. We'll touch on:
LENMicrosoft Excel Definition: Returns the number of characters in a text string. Syntax: LEN(text)
SEARCH/FINDMicrosoft Excel Definition: Syntax: SEARCH(find_text,within_text,start_num) and FIND(find_text,within_text,start_num) There are two differences between SEARCH and FIND. SEARCH is not case-sensitive, FIND is. SEARCH allows the use of wildcards, FIND does not. Under most circumstances, SEARCH is all you need, but it helps to know that FIND is always there if you've got to deal with pesky capital letters in URLs or something similar. Another reason to choose FIND is if you're dealing with URLs that contain parameters. Without properly escaping question marks, they will act as wild cards, which may cause some frustration. In our example below, we've pulled out the character number at which the "/blog/" string begins. Much like LEN, this function is a bit silly on its own, but can be combined with some of our other functions to do some cool things.
The IFERROR syntax: IFERROR(value,value_if_error)
LEFT, RIGHT, MIDMicrosoft Excel Definition: Syntax: Both LEFT and RIGHT return the characters from a given position in a text string starting from either side of a string. MID is great for extracting a portion of a text string. I've lumped the three together because they are often used in conjunction with each other (along with a few of the earlier functions). Let's dive into an example: Bringing it all together - Example 1Let's say we've been given a list of URLs, and we want to extract just the domain. This formula will do the job. Let's break down this nested formula, and see how it pulls just the domain out of our URL. Starting from the middle we see SEARCH, which uses the syntax: SEARCH(find_text,within_text,start_num) In plain terms, this formula finds the first instance of "/" in the cell to the left, starting at the 8th character from the beginning, which is done to start passed the double slash in http://. As we see below, the result for the first row of data is 22.
Now we are left with a simple LEFT formula. Remember, the syntax for LEFT is LEFT(text,num_chars). In plain terms: Give us the first 22 characters starting from the beginning. The number of characters differs depending on URL, but adjusts accordingly when applied to the rest of the table. We now have a nice listing of just root domains.
Example 2Let’s use SEARCH (with wildcards) and MID together to extract a portion of a URL:
We’ll definitely be making use of MID, as the text we want is in the MIDdle of our string. We’ll need to determine how many characters make up the "-tXXX.html" bit at the end of each URL. Since the length of this portion of the URL varies, but the format doesn’t (that is, "-t" + "numbers" + ".html"), we can use wildcards to find this character count. Again, the syntaxes for these 2 functions: Let's break down the formula for the first URL in our list. Cell A2: http://www.example.com/lamp-maintenance-t83.html =MID(A2,SEARCH("/",A2,8),SEARCH("-t*.html",A2)-SEARCH("/",A2,8)) =MID(A2,23, SEARCH("-t*.html",A2)-23) We've calculated the first instance of a "/" after the 8th character. This gives us our start_num values. We're also using the * wildcard to help us get the character count of the right-most chunk of text. =MID(A2,23,SEARCH("-t*.html",A2)-23) =MID(A2,23,40-23) We can easily calculate the number of characters for our MID once we know where our non-descriptive characters begin. =MID(A2,23,17) /lamp-maintenance Hooray! Example 2.5Let's make a small adjustment to our original URL to demonstrate how we can use LEN in this formula. Cell A2: http://www.example.com/t1521-lamp-maintenance.html =MID(A2,SEARCH("-",A2)+1,LEN(A4)-SEARCH("-",A2)-5) =MID(A2,29+1,50-29-5) lamp-maintenance The additional +1 and -5 are necessary to make minor adjustments to the final outcome. Without them, our final result would have been "-lamp-maintenance.html". -----SNIP----- Thus concludes the sample lesson on manipulating text strings in Excel. I hope you've found it useful, and not too confusing! If you have any questions, ask away in the comments and follow me (@MikeCP) and @Distilled on Twitter for more info on the full release of Excel for SEOs: Lessons for Aspiring Ninjas. There's much more where this came from, including bits on pivot tables, VLOOKUP, Index/Match, OFFSET, and more! |
You are subscribed to email updates from SEOmoz Daily SEO Blog To stop receiving these emails, you may unsubscribe now. | Email delivery powered by Google |
Google Inc., 20 West Kinzie, Chicago IL USA 60610 |
Niciun comentariu:
Trimiteți un comentariu