Targeting Search Engines

Wolfgang Bromberger

Member info

User since: 14 Dec 1998

Articles written: 34

Search engines are one of the most important things to care about on the web. Even if you do not have the time to update your pages for every trendy new idea a search engine's algorithm might introduce, you should at least know the basics and act on them.

Overview

1 Intro
2 Register your site with search engines
3 Improving indexability through coding
4 Add a robots.txt file to your main directory
5 Add meta-tag descriptions and keywords
6 What to do before registering
7 The right way to register
8 What you should not do
9 How to check your position
10 What bots are coming to your place

Intro

back to top

If you look around for promotion on search engines, you can find services that promise to submit your site to hundreds, 500, or even a thousand search engines. Such numbers are misleading or just plain wrong, because not everything on those lists is a real search engine. There are:

  • search engines
  • directories
  • FFA (Free for all links) pages

Search engines are basically databases populated by a bot (an automatic software agent), which judges relevance by mostly secret algorithms.

Directories are human-edited databases, usually ordered by topic. Pros: mostly relevant information on each topic. Cons: few entries compared to the size of the web, and it is not always easy to get your site listed.

FFA pages hold only a fixed number of links and delete the oldest entries as new ones arrive. Mostly what you get out of them is spam, and with few exceptions they bring you no traffic. They can still be useful for adding relevance to your site in search engines, as some engines count the links back to you. (But as this mechanism gets smarter, it also takes into account that you are linked from such a page and not from cnn.com.)

According to Searchenginewatch.com: In April 1998, a Science magazine study estimated that there were 320 million indexable pages on the web.

This number is constantly rising, so it is a good idea to get familiar with ways of using search engines to your advantage.

Register your site with search engines

It is important that search engines list you. We have now learned that not everything that calls itself a search engine is a real one. So which are the important ones?

Once referred to as the "big seven", they are (in alphabetical order):

  • Alta Vista
  • Excite
  • Hotbot
  • Infoseek
  • Lycos
  • Northern Light
  • Webcrawler
  • Yahoo! (the most popular, or to be fair the most visited search site out there)

Things change constantly. Other, newer services include Go, Google, Magellan, and Snap, to name just a few.

You should try to get a good position in as many indexes as possible. If your site is about a niche or special topic, be sure to look for specialized engines or directories as well. Depending on the local, cultural, and language aspects of your site, you should also look at European search engines and directories if you plan to target those audiences.

Improving indexability through coding

Use a descriptive page title.

The title tag is still the most important factor for page indexing (after the domain name). Take care to write a good line with keywords in it.
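To make the title advice concrete, here is a minimal sketch that reads the title of a page and reports which of your target keywords do not appear in it. The function name `missing_title_keywords` and the sample page are assumptions made for illustration; how any particular search engine actually weighs the title is not known.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TITLE> also arrives as "title".
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def missing_title_keywords(html, keywords):
    """Return the keywords that do not occur in the page title (hypothetical helper)."""
    parser = TitleParser()
    parser.feed(html)
    title = parser.title.lower()
    return [kw for kw in keywords if kw.lower() not in title]


# Made-up sample page for illustration.
page = "<html><head><TITLE>Search Engine Tips for Web Developers</TITLE></head><body></body></html>"
print(missing_title_keywords(page, ["search engine", "robots.txt"]))
```

The check is a plain substring match, so it says nothing about word order or how often a keyword should appear.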

Some basic things to keep in mind: Bots cannot run scripts.

Basic HTML is best indexed.

Most bots will not index more than a certain number of lines of your page.

So if you have a long JavaScript at the beginning of your page, chances are low that the visible words in the content will be indexed.

If you have a nice drop-down menu and wonder why the spiders do not come to your other pages, it's because they can't read JavaScript. Add normal links.
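A rough way to see your page the way a simple bot might is to list only the plain links it contains. The sketch below assumes a bot that follows nothing but `href` attributes on `a` tags and ignores all scripts; real spiders differ, and the sample page and the name `plain_links` are made up for illustration.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects href values from <a> tags, the only links a script-blind bot can follow."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def plain_links(html):
    """Return all plain links found in the page (hypothetical helper)."""
    parser = LinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up page: one JavaScript drop-down entry and one normal link.
page = """
<select onchange="location.href=this.value">
  <option value="/articles.html">Articles</option>
</select>
<a href="/about.html">About</a>
"""
print(plain_links(page))
```

Only `/about.html` is found; the page behind the drop-down stays invisible until you add a normal link to it.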

Many bots cannot process question marks in URLs.

Most search engines rank you lower if your pages are in the cgi-bin, because they are afraid of bot traps.
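The two URL problems just mentioned can be checked mechanically. This is a minimal sketch, assuming you only want to flag a question mark anywhere in the URL and a `cgi-bin` directory in the path; the function name `url_warnings` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def url_warnings(url):
    """Return a list of reasons why a bot might skip or down-rank this URL."""
    warnings = []
    if "?" in url:
        warnings.append("question mark")
    # Split the path into segments so that only a real cgi-bin directory matches.
    if "cgi-bin" in urlsplit(url).path.split("/"):
        warnings.append("cgi-bin")
    return warnings


print(url_warnings("http://example.com/cgi-bin/show.pl?id=7"))
print(url_warnings("http://example.com/articles/7.html"))
```

The first URL is flagged twice, the second not at all.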

Frames are definitely something to avoid if you want to be visible in search engines. Without discussing whether frames are good or bad in general, for search engines frames are not OK.

While you see a single page assembled from several HTML files, the bots just get the separate files. If you have to use frames anyway, add the noframes tag and your navigation scheme to every content page. You can also use some JavaScript to check that the frames are set up correctly, but this brings back the points above about scripts and the number of lines a bot parses.

Comment tags are added to relevance counting by some search engines.

Well-written content with many keywords is still one of the best recipes for a good search engine position. Be careful not to repeat your keywords too often in the page, or your ranking will drop. Adding some is always a good idea. For pictures, add alt attributes with full descriptions. You can also use the title attribute for links (only displayed in IE4).
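To get a feeling for how often a keyword is repeated, you can measure its share of all words on the page. The following is a minimal sketch; the function name `keyword_share` and the sample text are made up, and what share counts as "too often" is not published by the search engines, so no threshold is assumed here.

```python
import re


def keyword_share(text, keyword):
    """Return the fraction of words in text that equal the single-word keyword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# Made-up sample text: 3 of its 7 words are the keyword.
sample = "Salzburg hotels and Salzburg tours in Salzburg"
print(round(keyword_share(sample, "Salzburg"), 2))
```

The function only handles single-word keywords; whole phrases would need a different count.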

Add a robots.txt file to your main directory

The robots.txt file is the first thing the bots will look for when coming to your site.

If you do not have one, some robots may not dig deeper. The robots.txt file also lets you specify which directories of your site should not be read.

You make a simple text file with this content:

# /robots.txt for http://evolt.org
User-agent: *
Disallow:

Under User-agent you name the bot a rule applies to, so you can add separate entries for bots you know should not index the site. The * means the rule applies to all bots. Disallow is the command that excludes pages from indexing; left empty, as above, it excludes nothing.

As this is something you usually do not want to change, most times the text above will be enough.
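If you want to check what a robots.txt file allows before uploading it, Python's standard `urllib.robotparser` module can read the rules from a string. The sketch below tests the file shown above and, for contrast, a second file that blocks a `/private/` directory. That directory, the helper name `is_allowed`, and the URLs are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# The file from the article: an empty Disallow excludes nothing.
ALLOW_ALL = """\
# /robots.txt for http://evolt.org
User-agent: *
Disallow:
"""

# Hypothetical variant that keeps all bots out of one directory.
BLOCK_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ALLOW_ALL, "Scooter", "http://evolt.org/private/notes.html"))
print(is_allowed(BLOCK_PRIVATE, "Scooter", "http://evolt.org/private/notes.html"))
print(is_allowed(BLOCK_PRIVATE, "Scooter", "http://evolt.org/index.html"))
```

This only tells you how a well-behaved parser reads the file; whether a given bot respects it is up to the bot.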


Add meta-tag descriptions and keywords

Add meta-tag descriptions and keywords, WISELY, to each HTML page. Unfortunately, this is no longer the "golden key" it used to be.

In general, meta tags are still a good idea. Excite does not support them; it relies on a different scheme that reads the common words or themes within a page. Then it selects sentences for the summary that either contain these words or convey the overall theme.

Meta tags would be very powerful if all of them were implemented in all search engines. Unfortunately, this is not the case. Many useful meta tags are unused or unsupported. Here are the most important ones:

<META name="author" Content="author_of_site">

<META name="publisher" Content="content_publisher">

<META name="copyright" Content="copyright_holder">

<META name="keywords" Content="keywords,single,whole phrases">

<META name="description" Content="what_this_is_about">

<META name="page-topic" Content="general_topic">

<META name="page-type" Content="what_this_is_for">

<META name="audience" Content="for_whom_the_site_was_made">

<META name="expires" Content="time_to_get_unindexed">

<META name="robots" Content="Index, Follow"> bots shall index this page and follow its links

<META name="robots" Content="Noindex, Follow"> bots shall not index this page, but follow its links

<META name="robots" Content="Index, Nofollow"> bots shall index this page, but not follow its links

<META name="robots" Content="Noindex, Nofollow"> bots shall neither index this page nor follow its links
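The four robots variants above boil down to two yes/no decisions. Here is a minimal sketch of how a bot might read the Content value, assuming that anything not explicitly forbidden is allowed; the function name `parse_robots_meta` is hypothetical and real bots may interpret the tag differently.

```python
def parse_robots_meta(content):
    """Turn a robots meta Content value into index/follow decisions."""
    # Directives are comma separated and not case sensitive.
    tokens = {token.strip().lower() for token in content.split(",")}
    return {
        "index": "noindex" not in tokens,
        "follow": "nofollow" not in tokens,
    }


print(parse_robots_meta("Noindex, Follow"))
print(parse_robots_meta("Index, Nofollow"))
```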

What to do before registering

Check the pages you want to submit. Make sure they have no broken links, have good meta tags, and a good title.
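Part of this check can be automated. The sketch below looks only for a title, a description meta tag, and a keywords meta tag; it does not test for broken links, which would need network access, and it cannot tell you whether the tags are any good. The names `HeadAudit` and `missing_items` and the sample page are assumptions for illustration.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Records the page title and all named meta tags."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names (META, Content, ...).
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def missing_items(html):
    """Return which of title, meta description, meta keywords are missing or empty."""
    audit = HeadAudit()
    audit.feed(html)
    missing = []
    if not audit.title.strip():
        missing.append("title")
    for name in ("description", "keywords"):
        if not audit.meta.get(name, "").strip():
            missing.append("meta " + name)
    return missing


# Made-up sample page that lacks a keywords tag.
page = (
    "<html><head><title>Targeting Search Engines</title>"
    '<META name="description" Content="Basics of search engine registration">'
    "</head><body></body></html>"
)
print(missing_items(page))
```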

In a plain text editor write up descriptions of your site in 25, 50, and 75 words.
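A few lines of code can confirm that each draft stays within its word limit. This is a minimal sketch that counts whitespace-separated words; a submission form may count differently (or count characters instead), and the name `overlong_descriptions` and the sample drafts are made up.

```python
def overlong_descriptions(descriptions):
    """Return {limit: actual_word_count} for every description over its limit.

    descriptions maps a word limit (25, 50, 75) to the draft text.
    """
    overlong = {}
    for limit, text in descriptions.items():
        count = len(text.split())
        if count > limit:
            overlong[limit] = count
    return overlong


# Placeholder drafts: 20 words for the 25-word slot, 60 words for the 50-word slot.
drafts = {25: "word " * 20, 50: "word " * 60}
print(overlong_descriptions(drafts))
```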

Keep the text editor open so that you can cut and paste the descriptions along with all the other information you are submitting to the search engines. You can also use a free clipboard program like Classic Clipboard to copy and paste.

Proofread your descriptions! Nothing is more embarrassing than being listed with misspellings, unless you misspell on purpose.

On purpose?

Yes, especially for heavily targeted keywords, this is sometimes a good way to get a good ranking, because you do not have to compete with other sites that much.

Of course, this does not bring as many visitors, but it still brings a constant trickle. You can use common misspellings as well.

The right way to register

There may be more than one right way, as people have different approaches (depending on what services they are trying to sell you).

Of course if there were just one right way, it would already be patented and locked by lawyers never to be seen again. There are different approaches: use additional software, buy services, do it yourself or let others do it for you.

Let's put it this way. It is not wrong to open your browser, type in the search engine submit URL, and enter the information, field by field, all by yourself.

I estimate that this should get you better results than most of the alternatives. So everyone has to find their own way, mainly according to their time and resources. You should not rely on search engines alone, though. There are other ways to bring new visitors to your site, but of course using search engines is one part of the promotion game.

What you should not do

  • spam search engines
  • add text in the same color as the background
  • use keywords too often, in your page in general and in your meta tags especially
  • insert multiple title tags (title stacking)
  • build identical pages and submit them as doorway pages
  • add keywords that have nothing to do with your site
  • use copyright-protected or trademarked terms: you could face legal action that leaves you paying for the rest of your coding days

How to check your position

Finally, check your efforts to confirm that you have been listed.

You can search for the URL of your site by hand and see if it turns up. For Alta Vista, the query is:

url:your_address

Or do you not have the time to keep searching by hand? Then use one of the free services: Tracerlock offers free Alta Vista monitoring, and The Informant is free as well.

What bots are coming to your place

Here are some:

  • Altavista: Scooter/1.0 scooter@pa.dec.com
  • Goto: robot@idealab.com
  • Lycos: spider2-uu.wisewire.com Lycos_Spider_(T-Rex)
  • Webcrawler: WebCrawler/3.0_Robot libwww/5.0a
  • Hotbot: slurp@inktomi.com http://www.inktomi.com/slurp.html
  • Infoseek: InfoSeek Sidewinder/0.9
  • Northernlight: taz.northernlight.com
  • Google: BackRub/2.1 backrub@google.stanford.edu
  • Answerz: AnswerzCrawl
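If you want to spot these visitors in your own server logs, a simple substring match is enough for a first pass. The sketch below takes its signatures from the list above; the choice of which fragment to match for each engine, the name `identify_bot`, and the sample browser string are assumptions, and bots change their identification over time.

```python
# Lowercase fragments taken from the list above; which part to match is a guess.
BOT_SIGNATURES = {
    "Altavista": ["scooter"],
    "Goto": ["robot@idealab.com"],
    "Lycos": ["lycos_spider", "wisewire.com"],
    "Webcrawler": ["webcrawler"],
    "Hotbot": ["slurp"],
    "Infoseek": ["infoseek"],
    "Northernlight": ["northernlight"],
    "Google": ["backrub"],
    "Answerz": ["answerzcrawl"],
}


def identify_bot(log_entry):
    """Return the engine whose signature appears in the log entry, or None."""
    entry = log_entry.lower()
    for engine, fragments in BOT_SIGNATURES.items():
        if any(fragment in entry for fragment in fragments):
            return engine
    return None


print(identify_bot("Scooter/1.0 scooter@pa.dec.com"))
print(identify_bot("Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)"))
```

Some entries in the list are host names rather than user-agent strings, so it helps to pass in the whole log line rather than only the user-agent field.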

If you have anything to add, corrections, or questions, let me know. Best wishes and good luck, sincerely, Wolf

Wolfgang .wolf Bromberger has been around online since 1996. He started to get into web design after he and some other students developed a concept for the online presence of their home town, Salzburg in Austria, a site Bill Gates used years later as a good example of e-government (as still not nearly all points of the concept have been made reality, .wolf disagrees).
Being interested in search engines and information systems, .wolf specialized in search engine optimization, online promotion and analysis.
.wolf was one of the founding fathers of evolt.org
He is working for Kreiseder.com and can also be reached there.
He is always interested in learning new programming or other web related skills, when time permits.

evolt.org Evolt.org is an all-volunteer resource for web developers made up of a discussion list, a browser archive, and member-submitted articles. This article is the property of its author, please do not redistribute or use elsewhere without checking with the author.