Last Updated: 06 Nov 2023


Author: 114.119.155.83

Basic Guide to SEO

Search Engine Optimization is a broad and rather unscientific field that changes constantly. Nevertheless, it can be the key to unlocking huge traffic…and huge profits…for a website.

Tools

Firefox Extensions

  • SearchStatus – Firefox extension for seeing page rank, etc.

Keyword Tools

  • Google Keyword Suggest Tool – I like this one best; it uses Google's data, which seems to be much more accurate than the WordZe data, etc. And as of July 2008, they're giving raw search numbers!
  • Google Search Based Keyword Tool – a newer tool from Google that finds keywords you may be missing and provides additional data about relevant keywords.
  • Google Sets – a new (Jan 09) tool from Google that helps you figure out 'sets' of related keywords.
  • Keyword Discovery – the best of the paid services (as of July 08)
  • SpyFu – shows you what your competitors are doing on Google

Webmaster Tools

Forums

Good Resources

SEO is divided into two major parts: on-page SEO and off-page SEO.

On Page SEO

  • Page Titles: every page should have a title with keywords and a good call to action. Make sure page titles and meta descriptions are different; making them the same is at best a wasted opportunity and at worst can hurt your rankings. (See the markup sketch after this list.)
    • Titles should be no more than 80 characters, and ideally 67 characters or less, which is the maximum Google will display. Don't repeat a keyword more than three times in a title tag, as Yahoo will penalize this. CTR has been observed to drop with more than two repetitions in a title tag, so two should probably be your max.
  • Page Headers (H1 to H6): pages should have <h1> to <h6> headers containing keywords, etc. Headers can be styled with CSS; do NOT fake them with <span>, <div>, etc. Also, do not duplicate keywords from the title tag and anchor text… this can hurt rankings. (Note that not everybody agrees this is important.)
  • If possible, put 'long tail' keywords in the <h1> tag. 'Long tail' keywords are terms that are individually searched infrequently but collectively can add up to significant traffic. Long tail search is the best way to go if it works for your site: popular keywords are very hard to crack, but you can frequently get decent rankings for long-tail keywords within a few weeks.
  • Invisible stuff on the page doesn't make much difference anymore. Don't bother with it.
  • Copy should contain variations on keywords, if possible. For example: “The 2008 Mardi Gras Parade Schedule is…” Next paragraph you might use “The Mardi Gras Schedule for 2008 is..” or even “In 2008 the Schedule for the Mardi Gras is..”. This helps different variations of keywords come up.
  • Keep links relevant. Also keep the structure of links relevant, so the most important pages are linked from the most visible, most important areas of your site. See Page Navigation Structure
  • Do pagination with 1,2,3,4 links rather than «Prev and Next». Page 4 takes four clicks to reach with «Prev and Next», but only one with the 1,2,3,4 method.
  • Meta description is important, as it is usually what Google uses for the context it provides with each result. Meta descriptions are limited to 150 chars at Google and 170 chars at Yahoo.
  • Meta keywords aren't very important, but Yahoo still uses them in some fashion. It can be particularly useful to put misspellings in the meta keywords tag. Keep it to no more than 300 characters.
  • alt= and title= attributes, if used, should be short and sweet. Do not spam alt and title attributes. If you use both, make sure they are the same.
  • Add relevant content over time, rather than all at once. This tells Google et al. that you're a growing website.
  • Use <strong> tags rather than <b> tags to make text bold.
  • Longer and more authoritative documents are better than breaking articles into many shorter pages.
  • Semantically correct markup does NOT make a huge difference, although it's good to use block-level elements (like <div>) for document layout and inline elements (like <span>) for text layout. This helps Google understand the document better, and keeps your pages smaller and faster-loading.
  • Think about adding rel="nofollow" to outbound links, particularly if those links are generated by users (e.g. via a comment or forum posting). Also look at rel="nofollow" on non-critical internal pages.
  • Try to eliminate other unnecessary meta tags, such as meta robots. All these do is push your actual content farther down the page, and they aren't generally that useful.
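
To make several of the points above concrete (title and meta tags, real headers, <strong>, keyword variation, numbered pagination, and nofollow), here is a minimal markup sketch for a hypothetical PHP page. All names, values, and URLs are invented for illustration:

    <?php
    // Hypothetical page data; on a real site this would come from your CMS or database.
    $title       = '2008 Mardi Gras Parade Schedule – Plan Your Trip Today';
    $description = 'The complete 2008 Mardi Gras parade schedule, with dates, routes, and travel tips.';
    $keywords    = 'mardi gras schedule 2008, mardi gras parade, madri gras'; // misspelling on purpose
    ?>
    <html>
    <head>
      <!-- Title: keywords plus a call to action; deliberately different from the description -->
      <title><?php echo htmlspecialchars($title); ?></title>
      <meta name="description" content="<?php echo htmlspecialchars($description); ?>">
      <meta name="keywords" content="<?php echo htmlspecialchars($keywords); ?>">
    </head>
    <body>
      <!-- A real header element (style it with CSS), not a <div> or <span> -->
      <h1>Mardi Gras Parade Schedule for 2008</h1>
      <!-- Copy varies the keyword phrasing from paragraph to paragraph -->
      <p>The <strong>2008 Mardi Gras Parade Schedule</strong> is listed below…</p>
      <p>The <strong>Mardi Gras Schedule for 2008</strong> covers every parade route…</p>
      <!-- Numbered pagination: every page is one click away -->
      <p><a href="/schedule/1/">1</a> <a href="/schedule/2/">2</a> <a href="/schedule/3/">3</a> <a href="/schedule/4/">4</a></p>
      <!-- User-generated outbound link: nofollow it -->
      <p><a href="http://example.com/" rel="nofollow">a user-submitted link</a></p>
    </body>
    </html>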

Duplicate Content

  • Duplicate content (from one site to another) can be really bad. This can happen if somebody steals your content.
  • Duplicate content, if it links back to your site, can actually be good. For example, if you have a series of articles and a site with a high PageRank wants to republish them, you should let them, but make sure they give you a byline linking back to your site.
  • Duplicate content within a site isn't that bad, although it should be avoided if at all possible. Generally, you want to make sure you have one URL for any given page, and use 301 redirects from any other URLs that display the 'same' page (see the sketch below). Using a 301 (permanent) redirect tells Google et al. that you want to transfer the link equity from one page to another.
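
A minimal PHP sketch of the one-URL-per-page idea, assuming a hypothetical $canonical path known to the page (e.g. from a routing table); any alternate URL gets a 301 to it:

    <?php
    // Hypothetical: the one canonical path for this page.
    $canonical = '/products/apple-macbook-pro';

    // Strip any query string off the requested URL before comparing.
    $requested = strtok($_SERVER['REQUEST_URI'], '?');

    if ($requested !== $canonical) {
        // 301 (permanent) transfers link equity to the canonical URL.
        header('Location: http://www.example.com' . $canonical, true, 301);
        exit;
    }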

Time-Based Factors

  • Domain age is important. Very important, according to some people. So is document age, but less so. Yahoo, in particular, seems to place great importance on site age.
  • Internal link popularity: link to the most important pages

URL Structure & Redirects

  • Use ONLY search-engine-friendly URLs (e.g. apple.com/products/apple-macbook-pro, not apple.com/product.php?product=apple-macbook-pro). Having keywords in URLs is important, but mainly as a call to action for the user when they see the result on the search page. You can use mod_rewrite to map the former to the latter, so you can still pass parameters internally (e.g. ?product=apple-macbook-pro) without putting them in the link.
  • You need to keep parameters (such as session IDs, extra parameters, etc.) out of the query string. Otherwise search engines may index the same page twice, because it has slightly different query strings. There are a number of ways to do this; e.g. use cookies instead of session IDs, pass data via POST rather than GET, etc.
  • Use dashes, not underscores in links. Dashes are treated like a space in most cases.
  • If you do have to pass parameters into a page, follow a standard parameter order.
  • Use 301 (moved permanently) and 302 (moved temporarily) redirects. A 301 transfers link equity from the old page to the new one, and the search engines will update their indexes. A 302 transfers no link equity and causes no index update, so I rarely use it unless a page has truly moved temporarily. (See the PHP sketch after this list.)
  • Use 404 for deleted pages, or 301 redirect deleted pages to non-deleted pages.
  • Use 500 for server errors, NOT a 404 not found. A 404 means the page is gone, and pages that return 404 will be de-indexed.
  • Make sure your website answers at only one hostname: www.apple.com and apple.com are seen as two different sites, so pick one and 301 redirect the other to it. For example, to canonicalize on the www version, use the following code in an .htaccess file:
    # Enable mod_rewrite (the snippets below assume this too)
    RewriteEngine On
    # Add www. if we don't have it
    RewriteCond %{HTTP_HOST} ^apple\.com$ [NC]
    RewriteRule ^(.*)$ http://www.apple.com/$1 [R=301,L]
  • Try to keep URLs to fewer than four slashes (directory levels).
  • Offsite links from your site to 'spammy' neighborhoods are a bad thing… if you've got user-generated content, this can be a real problem.
  • Make sure you redirect your index.php, index.html, index.shtml, etc. files to /, so you don't have duplicate content issues:
    # Rewrite index.php, index.html, etc.
    RewriteCond %{THE_REQUEST} ^GET\ .*/index\.(php|html|shtml)\ HTTP
    RewriteRule ^(.*)index\.(php|html|shtml)$ /$1 [R=301,L]
  • You don't have to do this, but you might want to make sure that you redirect all .php files to / :
    # 301 Redirect foo.php to foo/. We don't want to do this for certain types
    # of files, such as .js.php (javascript files generated by PHP)
    RewriteCond %{REQUEST_URI} !\.js\.php$
    # If client request header contains php file extension
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^.]+\.)+php\ HTTP
    # externally redirect to extensionless URI
    RewriteRule ^(.+)\.php$ /$1/ [R=301,L] 
    # Now, if we see something of the form foo/, internally (not externally
    # with a 301) get the actual PHP file requested. See this for details:
    # http://www.webmasterworld.com/apache/3371997.htm, post 3386144
    # If the requested URL contains a period in the final path-part
    RewriteCond %{REQUEST_URI} (\.[^./]+)$ [OR]
    # Or if it exists as a directory
    RewriteCond %{REQUEST_FILENAME} -d [OR]
    # Or if it exists as a file
    RewriteCond %{REQUEST_FILENAME} -f
    # Then leave URL alone and skip the next rule
    RewriteRule .* - [S=1]
    # Otherwise, if requested extensionless URL exists as .php
    RewriteCond %{REQUEST_FILENAME}.php -f
    # then add .php to get the actual filename
    RewriteRule (.*)/ /$1.php [L]
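
To complement the rewrite rules above, here is a minimal PHP sketch of sending the status codes discussed in this list (the 410 option comes up again in the Discussion below); the function names and URLs are invented for illustration:

    <?php
    // 301: permanent move; transfers link equity and triggers an index update.
    function redirect_permanent($url) {
        header('Location: ' . $url, true, 301);
        exit;
    }

    // 302: temporary move; no link equity transfer, no index update.
    function redirect_temporary($url) {
        header('Location: ' . $url, true, 302);
        exit;
    }

    // Deleted page: either 301 to a relevant live page, or send 410/404.
    function page_deleted() {
        header('HTTP/1.1 410 Gone'); // 410 says "deliberately removed"
        exit;
    }

    // Server error: send 500, never 404, or the page may be de-indexed.
    function server_error() {
        header('HTTP/1.1 500 Internal Server Error');
        exit;
    }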

Sitemaps

  • There is a new sitemap standard, available at sitemaps.org, that all the major search engines adhere to.
  • Generally, providing a sitemap is seen as a good thing, because it helps search engines spider your site more quickly. You can also include last-modified timestamps and change-frequency hints, to help tell them when to come back.
  • You can tell the search engines about a sitemap one of three ways:
    • via the search engine submission process (e.g. Webmaster Tools)
    • via robots.txt's Sitemap directive
    • via an HTTP ping
  • I usually set up both Google Webmaster Tools and the robots.txt directive.
  • For small sites, generating the sitemap with a tool and updating it by hand is probably the best option. For bigger sites, you should auto-generate the sitemap with your own code (see the sketch after this list). For reference/ideas, there is a good tool available at http://www.xml-sitemaps.com/
  • To comply with the spec, and to be authoritative:
    • The sitemap files must be located in the root directory of the webserver, e.g. http://startupcto.com/sitemap.xml
    • All URLs in the file must be fully qualified and be on the same host the sitemap file is located on.
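
As a sketch of the auto-generation approach, here is a minimal PHP example that writes a sitemaps.org-format file and announces it via the HTTP ping method. The page list, paths, and host are invented for illustration, and the ping URL is the one Google has documented; check their current docs before relying on it:

    <?php
    // Hypothetical page list; a real site would pull this from its database.
    $pages = array(
        array('loc' => 'http://startupcto.com/',       'lastmod' => '2009-01-15'),
        array('loc' => 'http://startupcto.com/about/', 'lastmod' => '2008-12-01'),
    );

    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($pages as $page) {
        $xml .= "  <url>\n";
        $xml .= '    <loc>' . htmlspecialchars($page['loc']) . "</loc>\n";
        $xml .= '    <lastmod>' . $page['lastmod'] . "</lastmod>\n";
        $xml .= "  </url>\n";
    }
    $xml .= "</urlset>\n";

    // Must live in the webserver root to be authoritative for the whole host.
    file_put_contents('/var/www/html/sitemap.xml', $xml);

    // The HTTP ping method; Webmaster Tools and a robots.txt line like
    // "Sitemap: http://startupcto.com/sitemap.xml" are the other two options.
    file_get_contents('http://www.google.com/ping?sitemap='
        . urlencode('http://startupcto.com/sitemap.xml'));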

Offsite SEO

Offsite SEO is even more powerful than onsite SEO, but a lot harder to accomplish.

  • Links in a 'topical community' are very important. That means trying to get links from sites whose content is related to yours.
  • Anchor text of inbound links is important, but less so than it was before. Text around the links also helps a bit.
  • You can find which links point to your site using a Google search: link:site.com. I'm not convinced this works that well, because many queries return far fewer links than I think they should (even for pages that contain links and that I know are in Google's index).
  • Try to get linked in from 'good neighborhoods', particularly from links that rank highly on Google results for the keywords you want.
  • That said, 'a link is a link': if someone wants to link to you, accept. BUT don't necessarily reciprocate. In fact, linking back can actually negate the value of the link, and can hurt your rankings if you link to somewhere 'shady'.
  • Google apparently likes better quality links (quality over quantity), while Yahoo likes sheer numbers (quantity over quality)
  • Google also looks at who you link to: this is considered to be another measure of trust because you presumably control all of your content. Link to highly ranked sites as appropriate from your site.

Choosing Keywords

  • It is very important to choose the keywords you want to rank highly on.
  • Depending on what you want to do, you can either go for 'long tail' keywords (a lot of keywords, none of which are searched very often), or you can go for high-traffic keywords.
  • Use the keyword research tools to figure out which keywords to target. My favorite is the Google Keyword Tool, because it is free, accurate, and uses Google's dataset. You can also use SpyFu to see what competitors are ranking for. Keyword Discovery is the best of the paid keyword tools. The rest of the keyword tools are mediocre at best, and should probably either be avoided or taken with a huge grain of salt.
    • If you use the Google Tool, I recommend choosing 'Exact Match' from the drop down, as that gives you the number of people searching for that exact set of words. If you're using it to figure out relevance for Adwords, you may want to select 'Broad Match' & 'Phrase Match' as well.
  • If you're running AdWords, see how those keywords are doing and use that as part of your dataset.
  • If your site has been running for a while, use your analytics package to see what people are currently searching for. If you're tracking conversions in your analytics package, see how those are doing too.
  • You may want to use Google Sets to do some additional idea generation.
  • From all these various sources, you need to compile a list of keywords to target your SEO efforts around. I like to divide this list into several categories:
    • Popular/Top Keywords: the popular, frequently searched terms for your topic.
    • Secondary Keywords: related terms that people also search for frequently on this topic.
    • [Optional] Long Tail Keywords: You may want to spend some time looking at possible 'long tail' keywords; you won't be able to (or want to) compile a whole list of these, but doing a bit of research may provide some insight.
  • You should include search volume data (from the Google Keyword Tool). You may also want to list secondary keywords under each main keyword; e.g. if you have a main keyword 'startup', you might note that 'startup ideas', 'startup financing' and 'startup businesses' are all top related keywords.
  • For each main keyword, also record the number of results that keyword returns from the main Google index. Then divide the number of searches by the number of results. This ratio tells you roughly how much demand there is relative to the competing supply of pages, and may help you decide whether to target the keyword (see the sketch after this list).
  • A note on plurals: if the keyword you're looking at has a plural version that just includes an 's' (e.g. book vs. books) then you can safely target the plural, and you don't need to worry about the singular (or worry about it much, anyway). If the plural does not contain the singular (e.g. category vs. categories) then you should either pick the most popular or target both.
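
A minimal sketch of the searches-to-results ratio, with made-up numbers; in practice the search counts would come from the Google Keyword Tool (exact match) and the result counts from a manual Google query:

    <?php
    // Hypothetical data: monthly exact-match searches and Google result counts.
    $keywords = array(
        'startup'           => array('searches' => 450000, 'results' => 98000000),
        'startup financing' => array('searches' => 12000,  'results' => 3400000),
    );

    foreach ($keywords as $keyword => $data) {
        // Higher ratio = more demand relative to the competing pages.
        $ratio = $data['searches'] / $data['results'];
        printf("%-20s searches/results = %.6f\n", $keyword, $ratio);
    }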

Specific Page Optimization

  • In addition to optimizing an entire site, you can also optimize specific pages. This typically makes sense on a larger or more broadly focused site. The general process is the same as what you'd do for a full site:
    • develop a list of keywords for that page
    • optimize on-page content for those keywords
    • get offsite links directly to that page, preferably with your keywords in the offsite link

Monitoring Rankings

  • Once you've figured out your keywords, it's a good idea to monitor your rankings for them.
  • I'm still looking for a good tool that does this; for now, I check by hand.

Targeting Search Engines: Google, Yahoo & MSN

There are three main search engines that you need to worry about. Google currently (April 2008) controls about 60% of the search engine market, so you should focus your efforts there. Yahoo is next, with about 20%. MSN/Live.com is third, at about 10%. The rest (Ask.com, AOL, etc.) are too small to consider.

Generally, techniques for the major engines are the same. However, there are some slight differences:

  • Yahoo apparently prefers quantity over quality of links, so getting as many back-links as possible is useful. Still, Google et al. will see the same backlinks, so you have to be somewhat careful: don't go spamming all over the place.
  • Yahoo crawls quickly and aggressively, but takes a long time (months, in some cases) to update its index. Google, on the other hand, both crawls and updates fairly quickly (days; sometimes less).

Discussion

81.104.155.186, Dec 31, 2008 11:01 AM

In response to:
Use 404 for deleted pages, or 301 redirect deleted pages to non-deleted pages.

I would say:
Use 410 for deleted pages, or 301 redirect deleted pages to replacement/relevant non-deleted pages.

404s indicate that the file is not there and that the cause is unknown. If you delete a file, make sure you replace it with a response that returns a 410 status code.
