Robot.txt file






oraclewiz -> Robot.txt file (2/28/2006 15:35:56)

Alexa tells me that you should definitely have a robots.txt file in your root directory. They specify:
When a search engine crawler comes to your site, it will look for a special file on your site. That file is called robots.txt, and it tells the search engine spider which Web pages of your site should be indexed and which Web pages should be ignored. If it's missing, you may not be indexed at all.

The robots.txt file is a simple text file (no HTML) containing, for instance:

User-agent: *
Disallow:

It must be placed in your root directory, for example:

http://www.yourwebsite.com/robots.txt
Since I am a newbie and file illiterate, I'm not sure what this means. Is this my root directory, C:/MyWebs/oraclewiz, and does it have to be the first file in the hierarchy?
Using FP 2000, should I create another page, put this text in it, and upload it to GoDaddy, my provider?






Reflect -> RE: Robot.txt file (2/28/2006 16:18:30)

Hi,

Do not use a WYSIWYG editor to make this file; use Notepad.

Once created it goes in the root of your web site (where your index page resides). FP publish is OK to push it out with. As for hierarchy, don't worry about it.

Take care,

Brian




coreybryant -> RE: Robot.txt file (2/28/2006 17:44:20)

Your root is basically where your index.html file is (the home page of your website). Good robots will follow the rules in robots.txt; bad robots will not.

It does not have to be the first file - chances are it will not be - but the name does have to be robots.txt
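
To illustrate why the file has to sit in the root and be named exactly robots.txt, here is a minimal Python sketch of how a crawler might work out where to look. The helper name robots_url and the sample page path are made up for illustration; the point is only that the path part of any page URL is swapped for /robots.txt.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url.

    Hypothetical helper: keeps the scheme and host, replaces the path
    with /robots.txt, and drops any query string or fragment.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Any page on the site maps to the same root-level file.
print(robots_url("http://www.yourwebsite.com/listings/page.html"))
```

Because the location is derived from the host alone, a copy of the file in a subfolder, or one named robot.txt, would simply never be requested.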




Mojo -> RE: Robot.txt file (2/28/2006 18:53:56)

quote:

If it's missing, you may not be indexed at all


Not true. Missing this file has zero impact on whether or not your site will be indexed.
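
As a rough check of this, Python's standard urllib.robotparser module can be fed the two-line file from the first post, and also no rules at all. This is only a sketch: the helper name allowed and the agent name ExampleBot are made up, and real search engines use their own parsers, though well-behaved ones treat an absent robots.txt as "everything allowed" in the same way.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, url, agent="ExampleBot"):
    """Return True if agent may fetch url under the given robots.txt lines.

    Hypothetical helper; ExampleBot is an invented crawler name.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


home = "http://www.yourwebsite.com/index.html"

# The file from the first post: an empty Disallow blocks nothing.
print(allowed(["User-agent: *", "Disallow:"], home))

# No rules at all, standing in for an empty robots.txt.
print(allowed([], home))
```

Both calls report that the page may be fetched, so an allow-all file and no rules come to the same thing for a compliant crawler.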




Peppergal -> RE: Robot.txt file (3/23/2006 22:25:35)

Here's a dumb question.

If I have a few pages in the main directory that I don't want indexed, and I just want to use the robots.txt file instead of <meta> tags on each of those pages, do I just use /page.html ? or do I have to have the entire URL?

or should I use the meta tags?




womble -> RE: Robot.txt file (3/24/2006 5:30:53)

For a single page I'd use meta tags, and use robots.txt if you want to block whole directories.




Reflect -> RE: Robot.txt file (3/24/2006 9:06:05)

Peppergal,

robots.txt is also used to disallow individual pages. I use it to block individual pages all the time. Your syntax is correct also. I also put the noindex, nofollow meta tag on the page just to hedge my bets ;). I then make sure to leave the page out of my Google sitemap and my normal site map.
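
For anyone who wants to test rules before uploading them, here is a small sketch using Python's standard urllib.robotparser. The /private/ directory rule and the ExampleBot agent name are assumptions added for illustration; /page.html is the path form from the question, written relative to the site root rather than as a full URL.

```python
from urllib.robotparser import RobotFileParser

# Sample rules: one single page plus one whole directory (the directory
# name is hypothetical). Paths are relative to the site root.
rules = """\
User-agent: *
Disallow: /page.html
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

site = "http://www.yourwebsite.com"
for path in ("/index.html", "/page.html", "/private/form.html"):
    # ExampleBot is an invented crawler name matched by "User-agent: *".
    verdict = "allowed" if parser.can_fetch("ExampleBot", site + path) else "blocked"
    print(path, verdict)
```

With these rules the home page stays crawlable while /page.html and everything under /private/ are blocked for crawlers that honor the file.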

Take care,

Brian




Peppergal -> RE: Robot.txt file (3/24/2006 11:13:01)

Google sitemap vs. normal site map? I didn't know there was a difference! I've been away from frontpagewebmaster too long!!

I can't leave the page(s) out of the site map, as it would be something the clients may need (legal document, Consumer Notice, as well as a contact form.) They just don't need to be indexed in the search engines - as just about every other real estate website must have them, by PA law. I doubt that even the clients read them [consumer notice], but they must be there.




Reflect -> RE: Robot.txt file (3/27/2006 10:56:31)

Google sitemap explained (It's sort of new so don't sweat not knowing about it)....

http://www.google.com/webmasters/sitemaps/login

Take care,

Brian



