Google not following robots.txt?

 
All Forums >> Web Development >> Search Engine Optimization and Web Business >> Google not following robots.txt?
womble

 


 
Google not following robots.txt? - 7/3/2008 10:02:55   
Maybe I'm confused or bewildered, and maybe I'm doing it wrong, but I've just run a Google search on a term that's in one of my domain names. Unsurprisingly, some of the pages from that site came up. Then I switched to an image search, and although there aren't huge numbers of them, some images from the site are showing up. They sit in a folder unsurprisingly called "images", which should be blocked from search engine indexing.

The robots.txt file goes:

User-agent: *

Allow: /
Disallow: /test/
Disallow: /scripts/
Disallow: /images/
Disallow: /photos/
Disallow: /styles/
Disallow: /includes/


...and a few other folders I don't want indexing.

Looking at the robots.txt files on some of my other sites, they don't have the "Allow: /" line. Could that be what's causing the problem...or do I just need to go round and give Google a good slapping? :)

rdouglass

 


 
RE: Google not following robots.txt? - 7/3/2008 10:57:15   
Remove your Allow line, or move it to the end. I believe the parser goes down the list until it hits a matching rule and then stops. So with Allow as the first line, everything is allowed and the Disallow lines are never reached.

The Allow line is virtually redundant (at least IMO), 'cause if a bot doesn't find a rule for a path, it assumes the path is allowed. I also don't think 'Allow' is actually supported by all 'bots.
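
For anyone who wants to see the first-match behaviour described above, here is a minimal sketch using Python's standard-library urllib.robotparser, which applies the rules of a group in file order. It only illustrates that one parser, not Googlebot: Google's documentation describes picking the most specific (longest) matching path instead, so other crawlers may read the same file differently. The helper name is_allowed and the path /images/logo.png are made up for the example, and the rules are a trimmed copy of the file from the first post.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules: str, path: str, agent: str = "*") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `path`.

    Hypothetical helper: it wraps the standard-library parser, which
    returns the verdict of the first rule whose path prefix matches.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


# Allow line first, as in the original file (trimmed to one folder).
# No blank line inside the group: this parser treats a blank line
# as the end of a record.
ALLOW_FIRST = """\
User-agent: *
Allow: /
Disallow: /images/
"""

# The same group with the Allow line removed.
DISALLOW_ONLY = """\
User-agent: *
Disallow: /images/
"""

if __name__ == "__main__":
    for label, rules in (("Allow first", ALLOW_FIRST),
                         ("Disallow only", DISALLOW_ONLY)):
        verdict = is_allowed(rules, "/images/logo.png")
        print(f"{label}: /images/logo.png allowed = {verdict}")
```

With this parser the first file lets /images/logo.png through, because "Allow: /" matches before the Disallow line is reached, while the second file blocks it.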

Hope it helps.

< Message edited by rdouglass -- 7/3/2008 11:10:14 >



(in reply to womble)
jurgen

 


 
RE: Google not following robots.txt? - 7/3/2008 16:46:03   
He's right, Womble: the Allow line doesn't do you any good. Below is what I use, and it works. You can also name specific bots and set what each one is allowed to do.

quote:

User-agent: CazoodleBot
Disallow: /

User-agent: *
Disallow: /style/
Disallow: /images/
Disallow: /gfx/
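
One way to sanity-check a file like this before uploading it is to run it through a parser. The sketch below assumes Python's standard-library urllib.robotparser; the helper name check, the "Googlebot" agent string, and the example paths are hypothetical and only there for illustration. Real crawlers may interpret the file differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above: one group for a named bot, one for everyone else.
RULES = """\
User-agent: CazoodleBot
Disallow: /

User-agent: *
Disallow: /style/
Disallow: /images/
Disallow: /gfx/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def check(agent: str, path: str) -> bool:
    """Hypothetical helper: True if `agent` may fetch `path` under RULES."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent, path in (("CazoodleBot", "/index.html"),
                        ("Googlebot", "/index.html"),
                        ("Googlebot", "/images/photo.jpg")):
        print(f"{agent} -> {path}: allowed = {check(agent, path)}")
```

Here the named bot is shut out of the whole site, and every other bot is kept out of only the three listed folders.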



(in reply to rdouglass)
surajseo

 


 
RE: Google not following robots.txt? - 7/24/2008 2:01:11   

quote:

ORIGINAL: womble

The robots.txt file goes:

User-agent: *

Allow: /
Disallow: /test/
Disallow: /scripts/
Disallow: /images/
Disallow: /photos/
Disallow: /styles/
Disallow: /includes/




You don't need the "Allow: /" line. Just list the Disallow rules; every other folder will still be read by Google's bot and other search engines' bots.

I only define Disallow rules in my robots.txt and it works fine. I think that's the basic rule.

With your file, the first rule permits the bot to read everything ...... :)

Thanks :)

(in reply to womble)
womble

 


 
RE: Google not following robots.txt? - 7/24/2008 8:49:37   
OMG! There's an echo in here! :) Thanks guys! :)

Now all I need to do is remember which site this particular post was about so I can remove it! :)


(in reply to surajseo)