Google's only indexing some of my pages?


Mike_R -> Google's only indexing some of my pages? (9/10/2003 11:13:52)

For the past several weeks, I have been editing my pages, adding content and targeting specific keyword groups. Today I noticed that Google ran a crawl yesterday. Oh how excited I was! I immediately began running searches for my keyword sets, but didn't find myself anywhere in the SERPs. I went back and looked at the pages that Google indexed, and noticed that only some of them were indexed.

Does anyone know how that could happen? I have a site map that is accessible from all pages. All of my pages are linked to. The crucial pages that didn't get indexed are linked to from a pop out menu in my left sidebar. My in-site navigation is great . . . I think. This has got me talking to myself.

Mike R




erinatkins -> RE: Google's only indexing some of my pages? (9/10/2003 14:11:38)

Did you look at the information listed on Google's webmaster page?




Mike_R -> RE: Google's only indexing some of my pages? (9/10/2003 15:06:21)

Erin, yes, I've reviewed this page and seem to have done everything as I should. I am concerned about one thing, though. Google says it doesn't like doorway pages, but I may have some. I am somewhat forced to because I use Microsoft bCentral's Commerce Manager, and it's the only way I have control over presentation flow.

Actually, I'm not sure if they qualify as "doorway pages." My site works like this. In my left menu bar, I have links to product category pages. I'll use widgets as an example. My menu link to widgets takes the surfer to a page with content about widgets and shows the various groups of widgets that I sell. From here, the groups can be clicked on to see a list of products in that category. In the list, one can click further to access a detail page for a specific product and, if they wish, order it. Does my initial product category page of content and groups sound like a doorway page? I have read truckloads about these and am still not sure.

Mike R




Mike_R -> RE: Google's only indexing some of my pages? (9/11/2003 9:56:30)

In my research to determine why some of my pages were not indexed, I discovered a few more things.

1.) All of the pages that weren't indexed were new ones that I had created since the last crawl.

2.) All of the pages were ones that I removed two of the FrontPage meta tags from:

<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
<meta name="ProgId" content="FrontPage.Editor.Document">

Incidentally, I noticed that when I look at the tree structure for these pages, none of them has the FrontPage symbol associated with it. It has been replaced with the IE icon. Is this anything to worry about? Doesn't seem like it should be.

Any ideas?

Mike R




Reflect -> RE: Google's only indexing some of my pages? (9/11/2003 12:35:43)

quote:

<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
<meta name="ProgId" content="FrontPage.Editor.Document">


Mike,

I always remove these two METAs, so I would rule that one out. I think the new pages vs. last crawl is your key here.

Brian




burgi82 -> RE: Google's only indexing some of my pages? (9/11/2003 15:58:04)

hi,
to be honest, I remove all meta tags except title and description. I don't have meta keywords on my sites because I don't think they are worth it.
So, as stated above, the removal of these tags has nothing to do with it.




Mike_R -> RE: Google's only indexing some of my pages? (9/12/2003 12:21:16)

Originally, I didn't submit my site to Google. They picked it up in a crawl because I was linked to. What if I submit the site now? In doing so, can I submit individual pages (the ones that aren't being indexed currently), or do you have to submit the home page and just hope they crawl to all your pages?

Mike R




erinatkins -> RE: Google's only indexing some of my pages? (9/12/2003 14:00:09)

You might want to see what happens when they crawl your site again. My guess is they will pick them up.




Mike_R -> RE: Google's only indexing some of my pages? (9/24/2003 11:28:11)

Arghhh! It looks like Google's done another crawl because its cache of my home page shows my latest update from Sunday. But my new pages are still not being indexed! This is so frustrating. I have added clear text links to the pages, and they are all in my site map. The only other thing I can think of is that Google doesn't like the size of my pages. I heard once that Google won't index past 100k, and most of my pages are pushing that (When I redesign the site in a few months, I will be correcting this). Most of these pages, however, are in the 96 - 98k range, so they should be okay, right?

In any case, that doesn't mean that Google won't index any of the page; it just means that it stops at the 100k mark, right? I am without a clue here. Does anyone have any ideas?

Mike R
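
The page-size worry above can be checked with a few lines of Python. This is only a rough sketch: the 100k cutoff is the rumor quoted in this thread, not a confirmed Google limit, and the names LIMIT_BYTES and text_within_limit are made up for illustration. It measures the HTML source only, not the images.

```python
# Hypothetical helper: the 100k cutoff is the rumor quoted in the thread,
# not a confirmed Google limit.
LIMIT_BYTES = 100 * 1024


def text_within_limit(html, limit=LIMIT_BYTES):
    """Return (bytes past the limit, the part of the page inside the limit)."""
    overflow = max(0, len(html) - limit)
    return overflow, html[:limit]


# A stand-in for a page of roughly 98k, like the ones described above.
page = b"<html><body>" + b"x" * (98 * 1024) + b"</body></html>"
overflow, kept = text_within_limit(page)
print(len(page), overflow)
```

If a crawler really did stop reading at that size, a page just under the limit would lose nothing, and a larger page would still have everything before the cutoff available.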




Mike_R -> RE: Google's only indexing some of my pages? (9/24/2003 23:33:12)

I am examining my site in excruciating detail trying to figure out why Google doesn't like my new pages. One question: The FrontPage Navigation views only really matter if you've linked pages using FrontPage tools like "back" and "next," right? I just create my pages and make the appropriate links. This won't have any effect on how Google interprets my links, will it?

Second, does Google perform different types of crawls? I believe Google crawled my site because the page in its cache is my most current version. Do they have some crawls that update already indexed pages and other crawls that index the entire site?

Third, would it help if I used absolute pathnames? Currently, mine are all relative. It seems much easier to manage this way.

And finally, could someone take a look at my robots.txt? I have been looking at various sites on this subject, and I am seeing formats that differ slightly. I am concerned as I know that one little error could cost me. Here it is:

__________________
User-agent: *
Allow /
Disallow: /_private/
Disallow: /cgi-bin/
Disallow: /_derived/
Disallow: /_vti_log/
Disallow: /catalog_images/
Disallow: /catalog_templates/
Disallow: /images/
Disallow: /Includes/
Disallow: /fpdb/
Disallow: global.asa
Disallow: TEMPLATE.htm
Disallow: login.htm
Disallow: rollovers.js
Disallow: /*?
__________________

I just added 2 things: one, the "Allow: /"; and two, the "Disallow: /*?" to take care of my .asp pages.

Any help on any of these trouble points is greatly appreciated.

Thanks,

Mike
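
On the relative-versus-absolute question in the post above: a link follower normally resolves each relative href against the URL of the page it was found on, so both forms end up as the same absolute address. A small sketch using Python's standard urljoin; the example.com URLs and the resolve_links name are placeholders, not anything from the site under discussion.

```python
from urllib.parse import urljoin


def resolve_links(page_url, hrefs):
    """Resolve each href against the URL of the page that contains it."""
    return [urljoin(page_url, href) for href in hrefs]


# Placeholder page URL and links, for illustration only.
page_url = "http://www.example.com/products/widgets.htm"
links = resolve_links(page_url, ["gadgets.htm", "/site-map.htm"])
for link in links:
    print(link)
```

Whether a particular search engine treats the two forms identically is not something this shows; it only illustrates the standard resolution rule.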




Andy from Spain -> RE: Google's only indexing some of my pages? (9/25/2003 4:20:05)

Hi Mike

I think it's more a question of patience - you'll probably find that with each crawl a few more pages are indexed and later on are added to the Google database - I think you might be expecting things to happen too quickly.

I haven't seen your site but I'd always add a text-based site map; it helps both users and Google.

Cheers
Andy




Reflect -> RE: Google's only indexing some of my pages? (9/25/2003 7:26:05)

Hi Mike,

I am getting really curious. Can you drop a URL to two of the "not indexed" pages please?

Also you can validate your robots.txt file here...

http://www.searchengineworld.com/cgi-bin/robotcheck.cgi

I think these might be an issue but let the validator confirm it...

Here I believe that you have allowed all bots at the root level. It invalidates all disallows...

User-agent: *
Allow /

Here I believe that this is not proper syntax...

Disallow: /*?

Brian
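
The point about "Allow: /" can be tried locally with Python's standard robots.txt parser. This is a sketch under some loud assumptions: urllib.robotparser applies rules in file order (first match wins) and does not understand wildcard patterns such as "/*?", and other crawlers, Googlebot included, may resolve conflicting rules differently. The helper name allowed and the test path /_private/notes.htm are made up.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, path, agent="*"):
    """Parse robots.txt lines and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, path)


# Same rules with and without a leading "Allow: /" line.
with_allow = ["User-agent: *", "Allow: /", "Disallow: /_private/"]
without_allow = ["User-agent: *", "Disallow: /_private/"]

print(allowed(with_allow, "/_private/notes.htm"))
print(allowed(without_allow, "/_private/notes.htm"))
print(allowed(without_allow, "/remote-control-trucks.htm"))
```

With this first-match parser, the leading "Allow: /" matches every path before the Disallow lines are reached, so the private folder is reported as fetchable. Without it, the folder is blocked and ordinary pages remain allowed.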




Mike_R -> RE: Google's only indexing some of my pages? (9/25/2003 9:44:43)

quote:

I haven't seen your site but I'd always add a text-based site map; it helps both users and Google.


Andy, I have a text-based site map. In fact, the funny thing is that Google indexed the site map. Why does it ignore the contents of the site map? Odd. Maybe I do need to be more patient. But I have close to 15 new pages that have missed 2 crawls. I'm just getting a little concerned.

quote:

I am getting really curious. Can you drop a URL to two of the "not indexed" pages please?


Sure:

http://www.darncoolstuff.com/remote-control-trucks.htm
http://www.darncoolstuff.com/security-professional.htm

quote:

I think these might be an issue but let the validator confirm it...


Don't worry about the "Allow: /" and the "Disallow: /*?" lines. I removed them both. They were added after the last two crawls and didn't have any bearing on my indexing status. Another funny thing, though, is that I did run the validator on those two lines, and it didn't like either of them. But I got the idea to add them from a link at Google's site submission section. Oh well. By the way, I like the validator. Great tool!

Mike




_gail -> RE: Google's only indexing some of my pages? (9/25/2003 16:05:54)

quote:

quote:

I am getting really curious. Can you drop a URL to two of the "not indexed" pages please?

Sure:

http://www.darncoolstuff.com/remote-control-trucks.htm
http://www.darncoolstuff.com/security-professional.htm


I'm far from an expert in any of this but I note you have a lot of graphics in your site. Perhaps the following may apply:

http://www.searchengineworld.com/misc/avoid.htm

quote:

Site Promotion Things You Should Avoid

5: Don't Abuse Images. I recently had someone ask me why their site couldn't get indexed on the search engines. I wasn't surprised when I looked at their site - 41 pages of pure images only - not a shred of text on the site. That is the worst case scenario of course, but you should keep pages under 64k (max) total graphics and text. Anything else, you're losing your search engine food, and the load time is driving away users before the page ever loads.




Reflect -> RE: Google's only indexing some of my pages? (9/26/2003 7:34:27)

Hi Mike,

This is what the spider sees for the first link you dropped...

Become a Darn Cool Stuff dot com Member Educational Toys Educational Software Stun Guns Air Tasers Software Games Darn Cool Stuff.com Self Defense Remote Control Cars Planes Trucks Boats Pellet Guns Airguns BB Guns Air DarnCoolStuff.com Home Page Frequently Asked Questions Search Privacy Policy Contact Us Link Bar: Home | Frequently Asked Questions | Search Darn Cool Stuff.com | Email Us | Privacy Policy Software Games Educational Software Self Defense and Home Security Products Sporting Goods Pellet Guns BB Guns Paintball Guns Remote Control Radio Control Cars Trucks Planes Boats Educational Toys Custom Product Search Gift Certificate JUST ADDED!!! Voice Alert Home Protection System ------ Age of Empires 2 Gold ------ Megatech Sky Vector RTF Airplane ------ Splatmatic XJ40 Paintball Gun Coming Soon : The Ultimate Canasta Strategy Guide. Get a Preview Here "COOL" Fun Educational Trendy Cost-Saving What YOU Want !!! It's All About Choice!!! DarnCoolStuff.com Darn Cool Stuff dot com Custom Product Search YOU NAME IT, WE FIND IT!!! Cool Links to other Darn Cool Sites! Remote Control Trucks Airplanes - Cars - Trucks - Boats Experience monster truck madness with our Nikko, Tamiya, and Traxxas remote control trucks. High performance, high ground clearance, and high times are a part of the package. Many models are ready-to-run so you can get running off-road quickly. State-of-the-art suspension systems make these radio control truck a must have! New to R/C? Need to learn the remote control truck ropes? Check out our "Getting Started" series for newbies!!! Trucks Radio Control Trucks Remote Control - Traxxas Check out our Inventory We Carry Nikko, Tamiya, and Traxxas Trucks!!! Here are just a few of our R/C trucks: NIKKO 1/10 SCALE HUMMER WAGON BANDIT OFF-ROAD BUGGY RTR with Radio TRAXXAS STAMPEDE 1/10 SCALE HIGH-PERFORMANCE MONSTER TRUCK New to R/C? Need to learn the remote control truck ropes? Check out our "Getting Started" series for newbies!!! 
Airplanes - Cars - Trucks - Boats Search DCS.com Review Your Selections View Shopping Cart Darn Cool Deal of the Day - Daily Special Come here every day to find special Darn Cool Savings! Bookmark this page Don't Forget BOOKMARK US and Check in Daily!!! Bookmark this page Internet Shopping Newsletter Sign-Up GET THE DETAILS HERE! Gift Certificates RESELL OUR PRODUCTS!!! $$$ Wanna make big bucks reselling our products on the Internet--or anywhere? See our Reseller Information page CHECK OUT OUR LIST OF OVER 800 NAME-BRAND SUPPLIERS Darn Cool Stuff.com Home | FAQs | Search DCS.com | Links American Express Visa Master Card Contact Us | Privacy Policy | Site Map Darn Cool Deal of the Day! | Custom Product Search | Gift Certificates | Resell Our Products | Internet Shopping Newsletter Coming in December : Tad Roberts' long-awaited book, "Secret Strategies to Winning at Yahoo® Canasta" -- Get a preview here! Software: Games / Educational / Misc . | Security: Stun Guns / Self-Defense / Home / Professional | Sporting Goods: BB Guns and Air Rifles / Paintball Guns and Crossbows | Remote Control: Airplanes / Cars / Trucks / Boats | Toys: Educational Working Hard to Become the World's #1 Online Discounter!!

You really want your content to be a lot closer to the ending </head> tag. You can use CSS or you can insert a blank cell right above your DHTML menu. That way the spider sees the blank cell, then it sees the content next, then it sees your menu.
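
The "what the spider sees" dump can be approximated with Python's built-in HTML parser, which reads text in source order. This is a minimal sketch, not the simulator used for this post; the class name, function name, and sample page are made up, and real spiders may treat markup differently.

```python
from html.parser import HTMLParser


class TextInSourceOrder(HTMLParser):
    """Collect visible text in the order it appears in the HTML source."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self._skip:
            self.chunks.append(text)


def spider_view(html):
    parser = TextInSourceOrder()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up page: menu cell first in the source, content cell second.
page = """<html><head><title>Trucks</title></head><body><table><tr>
<td><a href="a.htm">Stun Guns</a> <a href="b.htm">Software Games</a></td>
<td><p>Experience monster truck madness.</p></td>
</tr></table></body></html>"""
print(spider_view(page))
```

Because the menu cell comes first in the source, its link text lands ahead of the page copy in the extracted string, which is the ordering problem described here.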

This is the keyword flow that the SE thinks you are targeting:

One word:
control
trucks
cool

Two words:
darn cool
remote control
control trucks

Next I looked at your root page. I am unsure why it got ranked. It shows pretty much the same thing.

Brian
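
A one-word and two-word tally like the one in this post can be roughed out by counting n-word phrases in the extracted text. This sketch is not the analyzer used above: top_phrases is a made-up name, the sample sentence is invented, and there is no stop-word filtering, which real analyzers usually apply.

```python
import re
from collections import Counter


def top_phrases(text, n, k=3):
    """Return the k most common n-word phrases in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return Counter(grams).most_common(k)


# Invented sample text, loosely in the style of the page copy.
sample = "Remote control trucks. Radio control trucks and remote control cars."
print(top_phrases(sample, 1))
print(top_phrases(sample, 2))
```

Running the same count over the full spider view of a page shows which words dominate once menus and footers are included.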




Mike_R -> RE: Google's only indexing some of my pages? (9/26/2003 10:08:37)

quote:

I'm far from an expert in any of this but I note you have a lot of graphics in your site.

Gail, you are correct. I do have more image content than is recommended. I plan on changing my entire layout and adding CSS in about 3 months. This should make my pages more efficient and reduce my page weight considerably.

quote:

You really want your content to be a lot closer to the ending </head> tag. You can use CSS or you can insert a blank cell right above your DHTML menu. That way the spider sees the blank cell, then it sees the content next, then it sees your menu.


A lot to talk about here: First, what keyword density analyzer are you using? Just curious. Second, I had read before that content was better placed as far up the page as possible; however, as long as the content is there, shouldn't the page still get indexed? Positioning would affect rank more, wouldn't it? Third, is it possible that my header, which contains 80% of my images, is stopping the spiders? Can they run into so much image content that they stop crawling the page? Fourth, if so, then your recommendation of getting the content up closer to the </head> tag seems right on target. I haven't learned CSS yet, but have gone through a mini tutorial on it, so I have some idea of how it works. Is it possible to create that cell with the content you were talking about and place it above my header, but have CSS position it where it currently is? If so, can you give me an example of the code or point me to a good tutorial on it? Finally, I'm not quite sure I understand what you mean by putting a cell above my DHTML menu. That would mess up the layout. Or is there a way around that that I'm missing?

I know that was a bunch, so just answer whenever you have time. No rush.

Thanks,

Mike




_gail -> RE: Google's only indexing some of my pages? (9/27/2003 15:55:27)

Oooh, a PR rather than a site review. Great idea!


quote:

This is what the spider sees for

And what would it see in mine (if someone has a moment)?

www.digicamhelp.com

thanks a bunch,

gail




Reflect -> RE: Google's only indexing some of my pages? (9/29/2003 8:17:25)

Tool used for this run ...

http://www.searchengineworld.com/cgi-bin/sim_spider.cgi


Digital camera resource guide for the beginner Photos of people Home | Newsletter | FAQs | About | Feedback Introduction to digital cameras Buying a digital camera Digicam features Digicam care Taking photos Working with pics Useful resources All the floppies in the world don't amount to Zip Tech Depot - An Office Depot Co. DELL Limited Time Offers: Digital Cameras, PDAs, and more! DIGITAL CAMERA RESOURCE GUIDE Digital camera help for the beginner - plain and simple So you just bought a digital camera and barely know how to use it. Or you're thinking about buying one but wonder if digital is the way to go . Besides, you don't have the time or inclination to wade through pages of mumbo-jumbo technical jargon to get the help you need. You've come to the right place! D igicamhelp.com is the digital camera resource guide created specifically for beginners who want basic and practical insights about using a digital camera . It can help you determine if you want to get into digital photography and if it's worth all the fuss and the expense. You'll find information and tips about taking photos , image editing , photo editing software , printing and supplies and more. Take a look too at digital image options . Don't miss the DIGI KNOW?! sections scattered throughout this site. They contain interesting tidbits relating to digital cameras and digital photography. RGB logo EASY AS 1-2-3 Why is this digital camera resource site different than others? 1- Digicamhelp is for the beginner digital camera user and those thinking about getting into digital photography. 2- Digicamhelp provides the basics about using a digital camera and working with digital photos. 3- Digicamhelp is non-technical, down-to-earth and easy-to-understand. RGB bullet FREE Newsletter Digital photography, image editing tips and more. Sign up now . 
Home | Newsletter | FAQs | Link to us | About | Feedback Copyright (c) 2002 - 2003 Digicamhelp.com All rights reserved Digicamhelp.com is not responsible for content found on other websites. All company & product names mentioned herein are trademarks and/or registered trademarks of their respective owners.




Reflect -> RE: Google's only indexing some of my pages? (9/29/2003 8:31:25)

quote:

First, what keyword density analyzer are you using?


This is just one of the ones I use (I use around six)...

http://www.searchengineworld.com/cgi-bin/kwda.cgi

quote:

Second, I had read before that content was better placed as far up the page as possible; however, as long as the content is there, shouldn't the page still get indexed? Positioning would affect rank more, wouldn't it?


I stand by my first comment on that in my prior post. The closer to the <body> tag in the code the better.

quote:

Third, is it possible that my header, which contains 80% of my images, is stopping the spiders?


Same as my last comment.

quote:

Is it possible to create that cell with the content you were talking about and place it above my header, but have CSS position it where it currently is? If so, can you give me an example of the code or point me to a good tutorial on it?


Do a search on CSS+absolute positioning on Google. Too involved to go into.

quote:

Finally, I'm not quite sure I understand what you mean by putting a cell above my DHTML menu. That would mess up the layout. Or, is there a way around that that I'm missing?


Here is a walkthrough...

http://www.apromotionguide.com/tabletrick.html

Brian




_gail -> RE: Google's only indexing some of my pages? (9/29/2003 8:48:02)

Is that good or bad?

thanks for the link...quite a place.

gail




Reflect -> RE: Google's only indexing some of my pages? (9/30/2003 7:06:59)

quote:

Is that good or bad?


I did not check your keywords. If they appear in the above that is a good thing. If they appear in the first sentence that is even better.

Brian




Mike_R -> RE: Google's only indexing some of my pages? (10/2/2003 14:05:30)

Brian, I figured out the CSS absolute positioning and got my content at the top near the body tag. I did have one last question, though. I had heard that the spiders pay particular attention to your first sentence or two. Obviously, then, you would want them to contain keywords that are in your title and content. I hope this doesn't sound like a dumb question, but how does the spider know where the first sentence is? I assume from the <p> tag. What I'm worried about is that I use a lot of the following to adjust line spacing:

<p align="center" style="line-height: 55%; margin-top: 0; margin-bottom: 0"> </p>

I don't, however, want the spiders to take that as my first significant element. Do I have anything to worry about?

Thanks a million,

Mike
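
On the spacer-paragraph worry above: a text extractor that ignores markup gets nothing from an empty <p>, so the first text it meets is the first paragraph with real words in it. The sketch below shows that with Python's built-in parser. How Google actually picks out a "first sentence" is not established anywhere in this thread, so this is an assumption-laden illustration, and the FirstText and first_text names and the sample page are made up.

```python
from html.parser import HTMLParser


class FirstText(HTMLParser):
    """Remember the first run of non-whitespace text inside <body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.first = None

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True

    def handle_data(self, data):
        text = " ".join(data.split())  # empty for whitespace-only spacers
        if self.in_body and text and self.first is None:
            self.first = text


def first_text(html):
    parser = FirstText()
    parser.feed(html)
    parser.close()
    return parser.first


# Made-up page that starts with a spacer paragraph like the one quoted above.
page = """<html><head><title>Trucks</title></head><body>
<p align="center" style="line-height: 55%; margin-top: 0; margin-bottom: 0"> </p>
<p>Remote control trucks from Nikko, Tamiya, and Traxxas.</p>
</body></html>"""
print(first_text(page))
```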




Reflect -> RE: Google's only indexing some of my pages? (10/3/2003 7:28:21)

quote:

I figured out the CSS absolute positioning and got my content at the top near the body tag.


Way to go!!!!!!!!!!!!!!

Most excellent!

quote:

<p align="center" style="line-height: 55%; margin-top: 0; margin-bottom: 0"> </p>


I would move that over to your CSS file and set up a class or ID for it. It isn't going to be a major obstacle for the spider but it would help.

Best of luck Mike,

Brian



