
Robots.txt


Rod Hagen
This topic is 6265 days old and is no longer open for new replies.  Replies are automatically disabled after two years of inactivity.  Please create a new topic instead of posting here.  


Could someone walk me through exactly how to keep the archivist bots off my website? What exactly is the code, and where in my HTML do I put it? On which pages: all of them, or just index.htm? I assume inserting <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> into my index.htm isn't enough (or correct), right?

 

Once done, I assume that there is no way to remove my old pages (5/01/99-4/7/07) from the archivists, correct?

 

THANKS!


Rod,

 

One thing that might help is to create a file named robots.txt in the root of your website. It contains instructions telling web crawlers which folders, if any, they should stay out of.

 

For example, these lines indicate that no crawler should archive any files in the xxx and yyy folders:

 

# Disallow anyone from looking at xxx and yyy
User-agent: *
Disallow: /xxx
Disallow: /yyy
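
If you want to check how a crawler would read those lines, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URLs and the bot name "AnyBot" are made up for illustration; only the rules themselves come from the example above.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, written as one record with no blank lines.
RULES_ALL = """\
# Disallow anyone from looking at xxx and yyy
User-agent: *
Disallow: /xxx
Disallow: /yyy
"""

# Parse the rules from memory instead of downloading a live robots.txt.
parser_all = RobotFileParser()
parser_all.parse(RULES_ALL.splitlines())

# "AnyBot" and example.com are hypothetical; any crawler name matches "*".
blocked = parser_all.can_fetch("AnyBot", "http://www.example.com/xxx/page.htm")
allowed = parser_all.can_fetch("AnyBot", "http://www.example.com/index.htm")

print(blocked)  # False: the path starts with /xxx
print(allowed)  # True: no rule covers /index.htm
```

Note that the match is a simple prefix test on the path, so "Disallow: /xxx" also covers a file such as /xxx.htm, not just the xxx folder.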

 

You can provide hints to specific crawlers by using the name of the crawler in the User-agent field. For example:

 

User-agent: WebReaper
Disallow: /

User-agent: SiteSnagger
Disallow: /
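
As a rough check of the per-crawler rules, the same standard-library parser can be asked about each name. This is only a sketch: "SomeOtherBot" and the example.com URL are hypothetical, and a real crawler may match its own name differently than Python's parser does.

```python
from urllib.robotparser import RobotFileParser

# Two records, one per crawler, separated by a blank line.
RULES_NAMED = """\
User-agent: WebReaper
Disallow: /

User-agent: SiteSnagger
Disallow: /
"""

parser_named = RobotFileParser()
parser_named.parse(RULES_NAMED.splitlines())

url = "http://www.example.com/index.htm"  # hypothetical page

for agent in ("WebReaper", "SiteSnagger", "SomeOtherBot"):
    # The two named crawlers are refused everything; with no "*" record,
    # any other crawler is allowed by default.
    print(agent, parser_named.can_fetch(agent, url))
```

Only the two named crawlers are told to stay away here. Everyone else is unaffected unless you also add a "User-agent: *" record.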

 

Note that the robots.txt file can definitely help block web crawlers, but it is only a collection of hints that any visitor can freely ignore.
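
To make the "hints" point concrete, here is a small sketch of where the check actually happens: inside the crawler, by the crawler's own choice. The function may_fetch, its respect_robots flag, and the bot names are all invented for this illustration and are not part of any real crawler.

```python
from urllib.robotparser import RobotFileParser

RULES_HINT = """\
User-agent: *
Disallow: /xxx
"""

def may_fetch(parser, agent, url, respect_robots=True):
    """Decide whether a crawler will request url (hypothetical helper)."""
    if not respect_robots:
        # Nothing on the server stops a crawler that skips the check.
        return True
    return parser.can_fetch(agent, url)

parser_hint = RobotFileParser()
parser_hint.parse(RULES_HINT.splitlines())

target = "http://www.example.com/xxx/page.htm"  # hypothetical URL

print(may_fetch(parser_hint, "PoliteBot", target))                      # False
print(may_fetch(parser_hint, "RudeBot", target, respect_robots=False))  # True
```

The server never sees this decision. If a page really must stay private, it needs protection on the server side as well, such as a password.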

 

...Hoover


As I said before, each site has its own policies, so you'll have to visit each one to look them up. I gave the link to Google's above. There's no central clearinghouse.

 

The "big 3" right now are Google, Yahoo, and MSN. You could start with the list your browser gives you under search options, or you could use the search engines themselves to search for lists of search engines. As with everything about the web, the best source of information is the web itself.


