A web search engine is a tool designed to search for information on the World Wide Web. That information may consist of web pages, images, and other types of files. Whenever you enter a query in a search engine and hit 'enter', you get a list of web results that contain that query term. SEO (search engine optimization) is a set of techniques that helps search engines find your site and rank it higher than the millions of other sites in response to a search query. SEO thus helps you get traffic from search engines.
Friday, 9 March 2012
What is a spider?
A spider is a program that automatically fetches web pages. Spiders are used to feed pages to search engines. The program is called a spider because it crawls over the Web; another term for these programs is webcrawler.
Because most web pages contain links to other pages, a spider can start almost anywhere. As soon as it sees a link to another page, it goes off and fetches it. Large search engines, like AltaVista, have many spiders working in parallel.
How Web Search Engines Work
Crawler-based search engines are those that use automated software agents (called crawlers) that visit a website, read the information on the actual site, read the site's meta tags, and follow the links that the site connects to, performing indexing on all linked websites as well. The crawler returns all that information to a central repository, where the data is indexed. The crawler will periodically return to the sites to check for any information that has changed. The frequency with which this happens is determined by the administrators of the search engine.
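The crawl-and-follow behaviour described above can be sketched in a few lines of Python. This illustrates only the link-discovery step (a real spider also fetches pages over HTTP, respects robots.txt, and queues work across many machines); the sample HTML is invented for the demo:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, the way a spider
    discovers new pages to fetch from the one it just read."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A spider would fetch this HTML over HTTP; here it is inlined.
page = ('<html><body><a href="/about.html">About</a>'
        '<a href="http://example.com/">Example</a></body></html>')

parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # the URLs a crawler would queue next
```

Each discovered URL would be fetched in turn, which is why a spider can start almost anywhere and still reach most of the Web.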
Human-powered search engines rely on humans to submit information that is subsequently indexed and catalogued. Only information that is submitted is put into the index. In both cases, when you query a search engine to locate information, you are actually searching through the index that the search engine has created; you are not actually searching the Web. These indices are giant databases of information that is collected, stored, and subsequently searched. This explains why a search on a commercial search engine, such as Yahoo! or Google, will sometimes return results that are, in fact, dead links. Since the search results are based on the index, if the index hasn't been updated since a web page became invalid, the search engine treats the page as still active even though it no longer is. It will remain that way until the index is updated.
Major search engines
- Google
- Yahoo
- MSN/Bing
Robots
- Google: Googlebot
- MSN / Bing: MSNBOT/0.1
- Yahoo: Yahoo! Slurp
Robots.txt file
Robots.txt is a file that gives instructions to search engine spiders about whether to index or follow certain pages of a website. This file is normally used to prevent the spiders of a search engine from indexing unfinished pages of a website during its development phase. Many webmasters also use this file to avoid spamming. The creation and uses of the robots.txt file are listed below:
Robots.txt creation:
To exclude all robots from the entire site:
User-agent: *
Disallow: /
To prevent all crawlers from accessing a specific page or directory:
User-agent: *
Disallow: /page name/
To prevent a specific crawler from accessing a page or directory:
User-agent: GoogleBot
Disallow: /page name/
To prevent a specific crawler from indexing images:
User-agent: Googlebot-Image
Disallow: /
To allow all robots full access:
User-agent: *
Disallow:
Finally, some crawlers now support an additional field called "Allow:", most notably Google. To disallow all crawlers from your site except Google:
User-agent: *
Disallow: /
User-agent: Googlebot
Allow: /
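You can check how a crawler is expected to interpret rules like these with Python's standard urllib.robotparser module. A small sketch, using the rules from the last example and made-up example.com URLs:

```python
from urllib.robotparser import RobotFileParser

# The "everyone blocked except Googlebot" rules from above.
rules = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Googlebot is allowed everywhere; every other bot is blocked.
print(rp.can_fetch("Googlebot", "http://example.com/page.html"))     # True
print(rp.can_fetch("SomeOtherBot", "http://example.com/page.html"))  # False
```

Running your draft rules through a parser like this is a cheap way to catch mistakes before a real spider sees them.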
"Robots" Meta Tag
If you want a page indexed but do not want any of the links on the page to be followed, you can use:
<meta name="robots" content="index,nofollow"/>
If you don't want a page indexed but want all links on the page to be followed, you can use:
<meta name="robots" content="noindex,follow"/>
If you want a page indexed and all the links on the page followed, you can use:
<meta name="robots" content="index,follow"/>
If you want a page neither indexed nor followed, you can use:
<meta name="robots" content="noindex,nofollow"/>
To invite robots to index the page and follow all links (shorthand for index,follow):
<meta name="robots" content="all"/>
To stop robots from indexing the page or following any links (shorthand for noindex,nofollow):
<meta name="robots" content="none"/>
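A crawler reads these directives while parsing the page's HEAD. As a rough sketch of that step, here is how the directive could be extracted with Python's standard html.parser (the sample HEAD snippet is invented):

```python
from html.parser import HTMLParser

class RobotsMetaReader(HTMLParser):
    """Pulls out the robots meta directives the way a crawler would."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            a = dict(attrs)
            if a.get("name", "").lower() == "robots":
                # Directives are comma-separated, e.g. "noindex,follow"
                self.directives = [d.strip().lower()
                                   for d in a.get("content", "").split(",")]

head = '<head><meta name="robots" content="noindex,follow"/></head>'
reader = RobotsMetaReader()
reader.feed(head)
print(reader.directives)  # ['noindex', 'follow']
```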
Robots.txt Vs Robots Meta Tag
Robots.txt
While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results.
In order to use a robots.txt file, you'll need to have access to the root of your domain (if you're not sure, check with your web host). If you don't have access to the root of a domain, you can restrict access using the robots meta tag.
Robots Meta Tag
To entirely prevent a page's contents from being listed in the Google web index even if other sites link to it, use a noindex meta tag. As long as Googlebot fetches the page, it will see the noindex meta tag and prevent that page from showing up in the web index.
When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it. Other search engines, however, may interpret this directive differently. As a result, a link to the page can still appear in their search results.
Note that because we have to crawl your page in order to see the noindex meta tag, there's a small chance that Googlebot won't see and respect the noindex meta tag. If your page is still appearing in results, it's probably because we haven't crawled your site since you added the tag. (Also, if you've used your robots.txt file to block this page, we won't be able to see the tag either.)
If the content is currently in our index, we will remove it after the next time we crawl it. To expedite removal, use the URL removal request tool in Google Webmaster Tools.
Validate Your Code
There
are several ways to validate the accuracy of your website's source code. The
four most important, in my opinion, are validating your search engine
optimization, HTML, CSS and insuring that you have no broken links or images.
Start
by analyzing broken links. One of the W3C's top SEO tips would be
for you to use their tool to validate links. If you have
a lot of links on your website, this could take awhile.
Next, revisit the W3C to analyze HTML and CSS. Here is a link to the W3C's HTML Validation Tool and to their CSS Validation Tool.
The final step in the last of my top SEO tips is to validate your search engine optimization. Without having to purchase software, the best online tool I've used is ScrubTheWeb's Analyze Your HTML tool. STW has built an extremely extensive online application that you'll wonder how you've lived without.
One of my favorite features of STW's SEO tool is its attempt to mimic a search engine. In other words, the results of the analysis will show you (theoretically) how search engine spiders may see the website.
Install a sitemap.xml for Google
Though you may feel like it is impossible to get listed high in Google's search engine results page, believe it or not, that isn't Google's intention. They simply want to ensure that their viewers get the most relevant results possible. In fact, they've even created a program just for webmasters to help ensure that your pages get cached in their index as quickly as possible. They call the program Google Sitemaps. In this tool, you'll also find a great new linking tool to help discover who is linking to your website.
For Google, these two pieces of the top SEO tips would be to read the tutorial entitled How Do I Create a Sitemap File and to create your own. To view the one for this website, simply right-click this SEO Tips Sitemap.xml file and save it to your desktop. Open the file with a text editor such as Notepad.
Effective 11/06, Google, Yahoo!, and MSN use one standard for sitemaps. Below is a snippet of the standard code as listed at Sitemaps.org. Optional fields are lastmod, changefreq, and priority.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
</urlset>
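If your site has more than a handful of pages, you'll probably generate the sitemap rather than write it by hand. A minimal sketch in Python, using the standard xml.etree module and the example.com URL from the snippet above:

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """pages: list of (url, lastmod) tuples -> sitemap XML string."""
    ET.register_namespace("", NS)  # emit a default xmlns, no prefix
    urlset = ET.Element("{%s}urlset" % NS)
    for url, lastmod in pages:
        u = ET.SubElement(urlset, "{%s}url" % NS)
        ET.SubElement(u, "{%s}loc" % NS).text = url
        ET.SubElement(u, "{%s}lastmod" % NS).text = lastmod
    return ET.tostring(urlset, encoding="unicode")

xml = build_sitemap([("http://www.example.com/", "2005-01-01")])
print(xml)
```

The optional changefreq and priority fields could be added the same way as lastmod.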
The equivalent of the sitemap.xml file for Yahoo! is the urllist.txt file. Technically you can call the file whatever you want, but all it really contains is a list of every page on your website, one URL per line.
Include a robots.txt File
By far the easiest of the top SEO tips you will ever follow as it relates to search engine optimization is to include a robots.txt file at the root of your website. Open up a text editor, such as Notepad, type "User-agent: *", then save the file as robots.txt and upload it to the root directory of your domain. This one command will tell any spider that hits your website to "please feel free to crawl every page of my website".
Here's one of my best top SEO tips: because the search engine analyzes everything it indexes to determine what your website is all about, it might be a good idea to block folders and files that have nothing to do with the content we want analyzed. You can disallow unrelated files from being read by adding "Disallow: /folder_name/" or "Disallow: /filename.html".
Here is an example of the robots.txt file on this site:
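The original example file is not reproduced in this copy, but a robots.txt following the advice above might look like this (the folder names are purely hypothetical):

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /drafts/
Disallow: /old_pages/
```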
Nomenclatures
Whenever possible, you should save your images, media, and web pages with the keywords in the file names. For example, if your keyword phrase is "golf putters", you'll want to save the images used on that page as golf-putters-01.jpg or golf_putters_01.jpg (either will work). It's not confirmed, but many SEOs have experienced improvements in ranking by renaming images and media.
More important is your web page's filename, since many search engines now allow users to query using "inurl:" searches. Your filename for the golf putters page could be golf-putters.html or golf_putters.html. Anytime there is an opportunity to display or present content, do your best to ensure the content has the keywords in the filename (as well as a Title or ALT attribute).
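Renaming files by hand gets tedious, so here is a small Python sketch that turns a keyword phrase into the kind of hyphenated file name described above (the helper name is my own):

```python
import re

def keyword_filename(phrase, index=1, ext="jpg"):
    """Turn a keyword phrase into a crawler-friendly file name,
    e.g. 'golf putters' -> 'golf-putters-01.jpg'."""
    # Lowercase, then collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", phrase.lower()).strip("-")
    return "%s-%02d.%s" % (slug, index, ext)

print(keyword_filename("golf putters"))           # golf-putters-01.jpg
print(keyword_filename("golf putters", 2, "png")) # golf-putters-02.png
```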
Use Title and ALT Attributes
More often than not, web addresses (URLs) do not contain the topic of the page. For example, the URL www.myspace.com says nothing about being a place to make friends, whereas a site like www.placetomakefriends.com would tell Google right away that the site being pointed to is about making friends. So, to be more specific about where we are pointing in our links, we add a title attribute and include our keywords.
Using the Title attribute is a direct method of telling the search engines about the relevance of the link. It's also a W3C standard for making your page accessible to disabled people. In other words, blind folks can navigate through your website using a special browser that reads Title and ALT attributes. The syntax is:
<a href="http://www.seotips.com/seo_software.htm" title="SEO Software">SEO Software</a>
The ALT attribute is used for the same reasons as the Title attribute, but is specifically for describing an image to the search engine and to the visually disabled. Here's how you would use ALT in an IMG tag:
<img src="http://top10seotips.com/img/image01.jpg" alt="SEO Tips">
Use Headings
In college and some high schools, essays are written using a standard guideline created by the Modern Language Association (MLA). These guidelines include how to write your cover page, title, and paragraphs, how to cite references, etc. On the Web, we follow the W3C's guidelines as well as commonly accepted "best practices" for organizing a web page.
Headings play an important role in organizing information, so be sure to include at least H1-H3 when assembling your page. Using Cascading Style Sheets (CSS), I was able to make the h1 at the top of this page more appealing. Here's a piece of code you can pop into your heading:
<style type="text/css">
h1 { font-size: 18px; }
h2 { font-size: 16px; }
h3 { font-size: 14px; }
</style>
Since a page full of headings would look just plain silly, my SEO tip would be to fill in the blank space with paragraphs, ordered and unordered lists, images, and other content. Try to get at least 400 words on each page.
Optimize Your META Tags
The meta description tag or element does not appear anywhere on a web
page, so why bother making it part of your on-page optimization
strategy? Because optimizing it can drastically influence the amount of
organic traffic you get from search engines once your URL begins to
rank well.
META tags are hidden code read only by search engine webcrawlers (also called spiders). They live within the HEAD section of a web page. There are two very important META tags you need to worry about: description and keywords. Meta tags summarize what the site is about, and despite some SEO controversy, they still play an instrumental role in meta-based search engines. The META tags you need to be most concerned about are:
- Description
- Keywords
The sequencing of these tags may be extremely important. I say "may" because SEO is mostly hypothesis, due to the changing algorithms of the search engines. Even though the W3C states that tag attributes do not have to be in any particular sequence, I've noticed a significant difference when I have the tags and attributes in the order described here. The only deviation from the list above is that the Title tag should come before the META description.
The description META tag is the text that will be displayed under your title on the results page. There's a lot of controversy about the number of characters you should have in this tag. I've seen sites with a paragraph in their description listed in the top results, so I don't think the number of characters here plays much of a role with the search engines.
However, if you want the listing to look clear and to the point, my tip for the description META tag would be to keep it under 150 characters and not to repeat your keywords more than 3 times. It may be a coincidence, but I've also noticed ranking improvements when I put my keywords at the beginning of the description. Here's the syntax:
<meta name="description"
content="your_keywords_here followed by a statement about your product
service or organization." />
The last important META tag is the keywords META tag, which some time ago lost a lot of weight in Google's search engine algorithm. This tag is still important to many other search engines, however, and should not be ignored. Based on my experience with this tag, you can have approximately 800 characters in it (including spaces). If you repeat your keywords more than 3 times, it can be a pretty good indication to the search engine that you are trying to spam their search results. Also, don't waste your time including keywords that aren't used in the BODY section of your website; that could be seen as another spam technique. Here's the syntax used on this site:
<meta name="keywords"
content="top 10 seo tips, what is seo, resources, seo software, seo ebook,
search engine optimization" />
<meta name="description" content="This is a sample meta description value.">
or
<meta name="description" content="This is a sample meta description value." />
depending on whether the document is HTML or XHTML, respectively.
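The rules of thumb above (description under 150 characters, keyword repeated at most 3 times) are easy to check automatically. A small Python sketch; the function name and sample description are my own:

```python
def check_description(text, keyword):
    """Apply the rules of thumb from the text: under 150 characters,
    keyword repeated no more than 3 times. Returns a list of problems."""
    problems = []
    if len(text) > 150:
        problems.append("description longer than 150 characters")
    if text.lower().count(keyword.lower()) > 3:
        problems.append("keyword repeated more than 3 times")
    return problems

desc = "SEO software and SEO tips for better search engine rankings."
print(check_description(desc, "seo"))  # [] -> passes both checks
```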
Keyword density placement & Research
Keyword density :- No SEO consultant will tell you the correct keyword density for a keyword, that is, the ideal keyword density for your targeted keywords. The reason is that keyword density is tricky: a high keyword density does not guarantee you a top ranking, but an extremely low keyword density will guarantee you a low position in the SERP. The following techniques show how you can achieve a reasonable keyword density for your targeted keyword(s).
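There is no single standard formula, but keyword density is commonly measured as the number of occurrences of the phrase times its word count, divided by the total word count of the page. A Python sketch under that assumption, with an invented sample text:

```python
import re

def keyword_density(text, keyword):
    """Percentage of the words in `text` that belong to occurrences
    of `keyword` (a common, though not standardized, definition)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    kw_words = keyword.lower().split()
    n = len(kw_words)
    # Count every position where the full phrase appears.
    hits = sum(1 for i in range(len(words) - n + 1)
               if words[i:i + n] == kw_words)
    return 100.0 * hits * n / len(words) if words else 0.0

text = "Golf putters for sale: our golf putters suit every golfer."
print(round(keyword_density(text, "golf putters"), 1))  # 40.0
```

A number this high would only occur in a tiny snippet like the one above; on a real 400-word page the same two occurrences would yield about 1%.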
Placement of keywords is important because it gives the search engine an idea of what content you are stressing on your page. We cannot deny that your domain name is the most obvious place for any keyword(s) you are targeting, and choosing an appropriate name for your domain is the first step you should take when deciding on a website. The following explains the other useful areas, in addition to the domain name, that you should consider when developing your website.
Keyword Research :- You have come up with a great idea for your next web project (or your first web project) and are getting ready to build your site.
Most people go out and build a website; then, after a few months of not getting any site visitors, they either give up or think about trying to optimize their site for the search engines. If you do not plan for SEO to be part of the site-building process, you will probably spend a lot of time going back and almost rebuilding your entire site later on.
Part of your planning should be to do some research on keywords for your site and pages. Your selected keywords form the basis for the SEO of your website.
It is getting harder and harder to get a new website to the top of the Search Engines, so your choice of keywords and knowing a bit about them is important to your success. Search Engines index words (and phrases). You want the words you choose to be words that are searched for and not too competitive.
Optimize Your Website Title
The Title and META tags should be different on every page of your website if you wish for most search engines to store and list them in the search results. We SEO experts have experimented with these two pieces of code to help us reach an accepted conclusion about how best to use them. Don't click off this site until you've read the top 10 SEO tips below to see what I've discovered works best for search engine optimization.
There are different theories about how long your Title should be. Since Google only displays the first 66 or so characters (with spaces), my tip for the title would be to keep it under 66 characters and relevant to the content on the page. However, some may argue that the value of the homepage title may warrant additional search term inclusion.
Bar none, the most important of the top 10 SEO tips involves your keywords. If you wish to be on the first page of the search results, you must include your keywords in your Title tag, preferably before all other words in the Title. There is no need to repeat your keywords in the Title; that's interpreted as spam by the search engines.
This one exercise could make or break your SEO campaign. Click-through rate (CTR) plays an instrumental role in how relevant Google thinks your website is. By compelling users to click with clear calls to action (buy, order, download, beat, fix, etc.) and by using value propositions (guaranteed, on sale now, etc.), one can improve one's CTR and search engine ranking. Oh, and don't forget to squeeze your keywords in there as well.
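Since Google displays only the first 66 or so characters of the title, it can help to preview where a long title would be cut off. A rough Python sketch (the 66-character figure comes from the discussion above; the sample title is invented):

```python
def title_preview(title, limit=66):
    """Show roughly how a long title would be truncated on the
    results page; real truncation rules vary and change over time."""
    if len(title) <= limit:
        return title
    # Cut at the limit, then back off to the last whole word.
    return title[:limit].rsplit(" ", 1)[0] + "..."

t = ("SEO Software | Guaranteed Search Engine Optimization "
     "Tools, Tips and Tutorials for Webmasters")
print(title_preview(t))
print(title_preview("Short title"))
```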
Discover Your Competitors
It's a fact that search engines analyze incoming links to your website as part of their ranking criteria. Knowing how many incoming links your competitors have will give you a fantastic edge. Of course, you still have to discover your competitors before you can analyze them.
My tool of choice is SEO Elite, which digs through the major search engines by keyword to not only tell you who your competitors are, but also provide you with an in-depth analysis of each competitor. The analysis includes these extremely important linking criteria (super SEO tips), such as:
- Competitor rank in the Search Engines
- Number of incoming links
- What keywords are in the title of the linking page
- % of links containing keywords in the link text
- The PageRank of linking pages
- The Alexa traffic ranking information
Stats such as the above play a critical part in determining what tools your website will need to compete in the Internet marketing competition. SEO Elite also offers you the ability to see who the website owner is, and even to send emails to all websites discovered to have quality link potential.
Find the Best Keywords
It would be a waste of your time to optimize your website for keywords that are not even being searched for. Therefore, you should invest some energy into finding the best keywords. There are several SEO tools available on the Internet to help you find the best keywords. Don't be deceived by organizations that require you to register first. The two most popular resources are WordTracker and KeywordDiscovery.com.
For example, a WordTracker query for "putter" shows that "golf putters" has the highest search volume, with 100 searches in the last 24 hours, yet there are over 100,000 websites to compete against. Using the tool's Keyword Effectiveness Index (KEI), you'll be able to see that "custom putter" would have a better chance at a higher ranking, since there are only 2,640 competing websites.
When using any SEO tool for keyword research, start by keeping your searches broad, as we did in the example above with "putter". The results will always return suggestions, sometimes surprising ones that you may not have thought of.
You can get less comprehensive results by using DigitalPoint's Keyword Suggestion Tool. This SEO tool will give you a summary of information without the KEI. Personally, I like to know how many people are competing before I design a webpage.
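WordTracker's KEI is commonly described as the search count squared divided by the number of competing pages, so that popular-but-uncontested phrases score highest. A Python sketch of that published formula, using the figures from the "putter" example (the search count for "custom putter" is assumed equal to that of "golf putters" purely for comparison):

```python
def kei(searches, competing):
    """Keyword Effectiveness Index: popularity squared over
    competition (higher is better). Treat it as a heuristic."""
    if competing == 0:
        return float("inf")
    return searches ** 2 / competing

print(kei(100, 100000))  # "golf putters": 0.1
print(kei(100, 2640))    # "custom putter": much higher KEI
```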