Google Sitemaps

From Wikipedia, the free encyclopedia, by MultiMedia


Google Sitemaps is a service offered by Google to help its crawlers better index webpages.


About

The Google Sitemap Protocol allows you to inform search engines about URLs on your website that are available for crawling. A Sitemap is an XML file that lists the URLs for a site using the Google Sitemap Protocol. The protocol was written to be highly scalable so it can accommodate sites of any size. It also enables webmasters to include additional information about each URL (when it was last updated, how often it changes, how important it is in relation to other URLs in the site, and so on) so that search engines can crawl the site more intelligently.

Sitemaps are particularly beneficial where it is difficult for users to reach all areas of a website through the browsable interface. For example, any site where certain pages are only accessible via a search form would benefit from creating a Sitemap and submitting it to search engines.

Note that the Sitemap Protocol only supplements, and does not in any way replace, the existing crawl-based mechanisms that search engines already use to discover URLs. By submitting a Sitemap (or Sitemaps) to a search engine, you are only helping that engine's crawlers to do a better job of crawling your site.

Using this protocol does not guarantee that your webpages will be included in search indexes, nor does it influence the way your pages are ranked by a search engine.

If you submit an XML sitemap through a Google account, Google provides current crawler problem reports once you complete a simple verification procedure, which ensures that only the site owner gets access to the stats area. For WAP sites Google uses a different procedure, and the URLs contained in the XML sitemap must be renderable on a mobile device.

XML Sitemap Format

The Sitemap Protocol format consists of XML tags. The file itself must be UTF-8 encoded.

Sample

A sample Sitemap that contains just one URL and uses all optional tags is shown below.

<?xml version="1.0" encoding="UTF-8"?>
<urlset
xmlns="http://www.google.com/schemas/sitemap/0.84"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84
http://www.google.com/schemas/sitemap/0.84/sitemap.xsd">
  <url>
    <loc>http://www.yoursite.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
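
A file like the sample above could also be produced programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name build_sitemap and the dictionary layout of the entries are assumptions made for illustration, not part of the protocol or of Google's generator script. It requires Python 3.8 or later for the xml_declaration argument.

```python
import xml.etree.ElementTree as ET

# Namespace of the 0.84 schema referenced in the sample.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(entries):
    """Return UTF-8 encoded Sitemap bytes for a list of URL dicts.

    Each dict must contain "loc"; "lastmod", "changefreq" and
    "priority" are optional (hypothetical input layout).
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit the tags in the same order as the sample.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        {
            "loc": "http://www.yoursite.com/",
            "lastmod": "2005-01-01",
            "changefreq": "monthly",
            "priority": 0.8,
        }
    ])
    print(xml_bytes.decode("utf-8"))
```

ElementTree escapes markup characters in text values itself, so URLs can be passed in unescaped.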

As with all XML files, any data values (including URLs) must use entity escape codes for the following characters: ampersand (&), single quote ('), double quote ("), less than (<) and greater than (>).
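
If you write the XML as plain strings, the escaping can be delegated to Python's standard library. This is a small sketch; the helper name escape_value is made up for illustration, and the example URL is hypothetical.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; the two quote characters
# have to be mapped explicitly.
_QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_value(value):
    """Entity-escape a data value (for example a URL) for use in a Sitemap."""
    return escape(value, _QUOTE_ENTITIES)


if __name__ == "__main__":
    print(escape_value("http://www.yoursite.com/view?widget=3&count>2"))
```

Apply this only when assembling the XML by hand. An XML library that builds the document for you will normally escape values itself, and escaping twice would corrupt the URLs.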

You can compress your Sitemap files using gzip to reduce your bandwidth requirement. Note that Google requires the uncompressed Sitemap file to be no larger than 10MB; search engines will not process larger Sitemaps.

You can also provide multiple Sitemap files, but each file that you provide must have no more than 50,000 URLs and must be no larger than 10MB (10,485,760 bytes) when uncompressed. These limits help to ensure that your web server does not get bogged down serving very large files.
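
As a rough illustration, the size limit can be checked before compressing. The sketch below uses Python's standard gzip module; the function name compress_sitemap is an assumption made for this example.

```python
import gzip

# 10MB limit on the uncompressed file, as stated above.
MAX_UNCOMPRESSED_BYTES = 10485760


def compress_sitemap(xml_bytes):
    """Gzip a Sitemap, refusing input that exceeds the uncompressed limit."""
    if len(xml_bytes) > MAX_UNCOMPRESSED_BYTES:
        raise ValueError(
            "Sitemap is %d bytes uncompressed; the limit is %d"
            % (len(xml_bytes), MAX_UNCOMPRESSED_BYTES)
        )
    return gzip.compress(xml_bytes)


if __name__ == "__main__":
    compressed = compress_sitemap(b"<urlset></urlset>")
    print(len(compressed), "bytes after compression")
```

The check is made on the uncompressed bytes because that is the size the limit refers to, not the size of the .gz file.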

If you want to list more than 50,000 URLs, or you anticipate your Sitemap growing beyond 50,000 URLs or 10MB, create multiple Sitemap files and list them in a Sitemap index file. A Sitemap index file should list no more than 1,000 Sitemaps and should be named sitemap_index.xml.
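
Splitting a long URL list and building the index can be sketched as follows. The function names are invented for this example, and the sitemapindex, sitemap and loc element names reflect the protocol's index format as generally documented rather than anything stated in this article, so check them against the schema before relying on them.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"
MAX_URLS_PER_SITEMAP = 50000
MAX_SITEMAPS_PER_INDEX = 1000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into lists of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_locations):
    """Return UTF-8 encoded index bytes listing the given Sitemap URLs."""
    if len(sitemap_locations) > MAX_SITEMAPS_PER_INDEX:
        raise ValueError("an index should list at most 1,000 Sitemaps")
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for location in sitemap_locations:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = location
    return ET.tostring(index, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical site with 120,000 pages.
    urls = ["http://www.yoursite.com/page%d" % n for n in range(120000)]
    chunks = chunk_urls(urls)
    locations = [
        "http://www.yoursite.com/sitemap%d.xml.gz" % (n + 1)
        for n in range(len(chunks))
    ]
    print(build_index(locations).decode("utf-8"))
```

Chunking by URL count alone does not guarantee that each file stays under 10MB, so the size of every generated file still needs to be checked.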

It is strongly recommended that you place your Sitemap at the root directory of your Web server (http://yoursite.com/sitemap.gz). After you produce your Sitemap, you will need to notify search engines of the Sitemap's location. The search engines that you notify will then retrieve your Sitemap and make the URLs available to their crawlers.

Tools

Official Google Sitemaps Generator

Google provides a script that generates the XML file according to the Sitemap Protocol from your server logs, your web directory, or a list of URLs. The script is written in Python and hosted on SourceForge. It can be scheduled to run via cron or Windows Task Scheduler. Each time it runs, it notifies Google that the sitemap has changed so that a download of the sitemap can be scheduled.

XML Configuration File

To run, the script requires at least the configuration file:

$ python sitemap_gen.py --config=<path/config.xml>

The zip or gzip download includes an example_config.xml file, which is extensively documented.

Validation Tools

A number of tools are available to help you validate the structure of your Sitemap against this schema. You can find lists of XML-related tools at the following locations:

http://www.w3.org/XML/Schema#Tools

http://www.xml.com/pub/a/2000/12/13/schematools.html

http://www.smart-it-consulting.com/internet/google/submit-validate-sitemap/
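
Before reaching for a full schema validator, a few basic checks can be scripted. The sketch below is not a schema validation: it only tests well-formedness, the root element, and a handful of per-URL rules. The function name check_sitemap is made up for illustration, and the allowed changefreq values and the 0.0 to 1.0 priority range come from the protocol definition rather than from the text above.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.google.com/schemas/sitemap/0.84}"
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def check_sitemap(xml_bytes):
    """Return a list of problem descriptions; an empty list means none found."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return ["not well-formed: %s" % exc]

    problems = []
    if root.tag != NS + "urlset":
        problems.append("unexpected root element: %s" % root.tag)

    for number, url in enumerate(root.findall(NS + "url"), start=1):
        loc = url.findtext(NS + "loc")
        if not loc or not loc.strip():
            problems.append("url %d: missing <loc>" % number)

        freq = url.findtext(NS + "changefreq")
        if freq is not None and freq.strip() not in CHANGEFREQ_VALUES:
            problems.append("url %d: bad <changefreq> %r" % (number, freq))

        priority = url.findtext(NS + "priority")
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append("url %d: bad <priority> %r" % (number, priority))
    return problems


if __name__ == "__main__":
    with open("sitemap.xml", "rb") as handle:  # hypothetical file name
        for problem in check_sitemap(handle.read()):
            print(problem)
```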

Third Party Generation Tools

A number of third-party tools are available to generate, edit, view, submit and validate XML sitemaps. You can find lists of tools for Google Sitemaps at the following locations:

http://code.google.com/sm_thirdparty.html

http://www.sitemaptools.com/

http://www.smart-it-consulting.com/article.htm?node=133&page=41

XSLT Stylesheet

XML files are normally not easy for humans to read. To improve readability, you can add an XSLT stylesheet that transforms the Sitemap to HTML. More information can be found at http://enarion.net/google/sitemaps/stylesheet/


This guide is licensed under the GNU Free Documentation License. It uses material from the Wikipedia.
