I want to be exposed. Google me now!



Spooky -> I want to be exposed. Google me now! (4/27/2003 16:35:11)

I've given up hope of this site being googled.
I'd like to find out why and what can be done.

Previously it involved a redirection to a page called first.asp when a sessionID failed. I've removed that, and no redirection occurs.

However, some session / cookie checking still occurs without redirection.
How do I trace why the site isn't indexed?




pageoneresults -> RE: I want to be exposed. Google me now! (4/27/2003 21:36:18)

Spooky, you are referring to www.frontpagewebmaster.com, correct?

It is in the Google index. If you are wondering why the forum topics don't get indexed, you have too many variables in the urls. Remember when I talked to you about parsing the urls?

IISRewrite will probably solve the problem.




pageoneresults -> RE: I want to be exposed. Google me now! (4/27/2003 21:41:01)

Server Response: http://www.frontpagewebmaster.com/
Status: HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Mon, 28 Apr 2003 01:39:40 GMT
MicrosoftOfficeWebServer: 5.0_Pub
pragma: no-cache
cache-control: private
REFRESH: 1140;url=http://www.frontpagewebmaster.com/default.asp?
Content-Length: 22801
Content-Type: text/html; Charset=ISO-8859-1
Expires: Sat, 20 Mar 1999 02:39:40 GMT
Set-Cookie: forum%5FpgdlastVisit=4%2F27%2F2003+9%3A39%3A40+PM; expires=Tue, 27-May-2003 04:00:00 GMT; path=/
Set-Cookie: forum%5Fpgdmembrowser=moz4; expires=Tue, 27-May-2003 04:00:00 GMT; path=/
Set-Cookie: ASPSESSIONIDSQCCAAQQ=JADPGGFBPLDEMJBJDIADNBMO; path=/
Cache-control: no-cache

What is that refresh doing there?
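
For anyone wanting to read that header programmatically, here is a minimal Python sketch that splits a Refresh value into its delay and target URL. The function name `parse_refresh` is made up for illustration; it only handles the `seconds;url=...` shape shown in the response above.

```python
from typing import Optional, Tuple


def parse_refresh(value: str) -> Tuple[int, Optional[str]]:
    """Split a Refresh header value like '1140;url=http://...' into (seconds, url)."""
    delay_part, _, rest = value.partition(";")
    delay = int(delay_part.strip())
    rest = rest.strip()
    url = None
    # The target is optional; a bare number means "reload the same page".
    if rest.lower().startswith("url="):
        url = rest[4:].strip() or None
    return delay, url


delay, target = parse_refresh("1140;url=http://www.frontpagewebmaster.com/default.asp?")
print(delay, target)
```

The delay is in seconds, so 1140 is a reload every 19 minutes.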




pageoneresults -> RE: I want to be exposed. Google me now! (4/27/2003 21:43:26)

There are also 167 html errors for the home page alone. You might want to take a look at some of those and see if they are fatal.




pageoneresults -> RE: I want to be exposed. Google me now! (4/27/2003 21:57:35)

Spooky, you've got 3 variables in most of the forum topic urls. Based on experience, Google usually only sees two of those.

My advice would be to purchase IISRewrite and parse all of the urls. You'll be surprised at the outcome. As soon as we parsed the urls for my directory, Google grabbed about 400+ pages that are generated from 3 asp templates. Google was not seeing those pages prior to the parsing.
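
To illustrate the idea of parsing urls, here is a rough Python sketch of the mapping a rewrite layer performs. It is not IISRewrite's own rule syntax. The `/forum/<appid>/<m>.htm` scheme and the function names are assumptions for illustration; `fb.asp`, `appid` and `m` are taken from the forum urls quoted in this thread.

```python
import re
from typing import Optional

# Hypothetical static-looking scheme: /forum/<appid>/<message id>.htm
STATIC_URL = re.compile(r"^/forum/(?P<appid>\d+)/(?P<m>\d+)\.htm$")


def to_dynamic(path: str) -> Optional[str]:
    """Translate a static-looking path back to the real query-string URL."""
    match = STATIC_URL.match(path)
    if match is None:
        return None  # not a rewritten forum URL; leave it alone
    return "/fb.asp?appid={appid}&m={m}".format(**match.groupdict())


def to_static(appid: int, m: int) -> str:
    """Build the spider-friendly form to use in links on the page."""
    return f"/forum/{appid}/{m}.htm"


print(to_static(4, 132212))
print(to_dynamic("/forum/4/132212.htm"))
```

The spider only ever sees the variable-free form in links, while the server quietly hands the request to the same asp template as before.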




Spooky -> RE: I want to be exposed. Google me now! (4/27/2003 22:02:42)

I remember that well :)
The problem I strike is 2 fold <edit> make that 3!
1) I believe the initial problem is that the forum coding / session management will nullify any advantage that rewrite provides.
I'd like to get that part sorted first (if it's the problem).
As you see - there's a refresh slipped in up there :)
2) With IISRewrite, there's a need to restart the server after every change.
With the amount of messing around I do, it could put the server down for days [;)]
3) It'll force a rewrite of the forum each time a new version is released [:@]




Spooky -> RE: I want to be exposed. Google me now! (4/27/2003 22:03:46)

Is the refresh going to cause problems for Google at "1140"?
Is the use of the meta tag:
<meta http-equiv="dyn-url-keys" content="appid;m"> the right approach?




pageoneresults -> RE: I want to be exposed. Google me now! (4/27/2003 22:42:20)

Unfortunately my experience ends at finding possible problems. I'm not too certain about that refresh, although certain types of refresh have been known to cause issues with the spider-based search engines.

You may want to head on over to WebmasterWorld and post some questions. You won't be able to post urls, but you can specify your problem and provide examples of code or server responses. I know your heart is here with Outfront but there are times when additional experience is required, especially when it comes down to the dynamics of a forum like this.

Take a close look at the forum urls while you are at WebmasterWorld. Google gobbles that site up daily. Same goes for our consultants directory. It too has been fortunate in hooking up with the Google Freshbot. We can add new pages and see them appear in the index within days sometimes.

With the activity at this forum, you would definitely qualify for the Freshbot. Problem is, if you don't make this forum search engine friendly, you can forget about getting these thousands of pages indexed. What is more important to you right now? It sounds like you need some exposure. Well, you can pay for it at Overture (PPC - Pay Per Click) or Google (CPC - Cost Per Click - AdWords). Or, you can bite the bullet and make these work for you.

I'm tellin' ya Spooky, the parsing of urls is the single most important factor in getting a dynamic site to rank well. There are different ways to go about this. You could write a piece of software that takes the content from the forums and generates static html pages. The problem with that solution is that you may end up with thousands of pages that don't need to be there. You could literally generate an infinite number of pages using one asp page, a few includes and IISRewrite. Get this, there are no actual pages. Content is being pulled from the database based on a query from a link somewhere.

Once you are able to parse the urls, then you need to get the spider in there. We currently use a couple of different approaches. Take our directory for example. We have it categorized in multiple ways. There are sub-directories that contain a few asp templates that generate data from the database. This data is formatted to generate index pages for each category. Those index pages contain links into the database with the parsed urls.

The possibilities are limitless once you parse. I can't emphasize that enough. I say that from real-world experience with three different asp-driven sites, all using a parsing method, whether it is IISRewrite or my programmer's rendition of it. He did one site for me before we bought IISRewrite. I sent you links to that previously and we went back and forth with code snippets. I don't think I was able to provide you with enough information to figure it out. We purchased IISRewrite to see what it would do. Now that he has figured it out, it is definitely the solution for rewriting urls. Even though you have to restart IIS each time you change the ini file, once you get it set up, you leave it alone.

Okay, I'm long-winded this evening, sorry about that. If you take the above advice, the rewards are great! Maybe you can work with aspplayground to do this. It would be to their advantage also. Just think, the only commercially available search engine friendly forum. Now that's an idea! ;)





pageoneresults -> RE: I want to be exposed. Google me now! (4/27/2003 23:00:16)

Just for reference, here is the process I go through to determine the quality of a site and whether there are any possible problems.

First, let's see if the page is search engine friendly by using the [url="http://www.searchengineworld.com/cgi-bin/sim_spider.cgi"]SIM Spider at WebmasterWorld[/url].

Once you've clicked on the Spider it! button, view the results returned. Some will find this very interesting. Scroll down the page and it will show you links that the spider found on your page. If you are using a javascript navigation of some sort, you may find that your links are invisible, not good!

After I've checked the spider friendliness, I then [url="http://validator.w3.org/"]check the validity of the html[/url]. Even though most sites will return errors, it is the fatal errors that I am concerned about. Unfortunately FrontPage users are at the top of the list for generating invalid html and lots of fatal errors. ;)

I'll then check css if it is present. I'll utilize the [url="http://jigsaw.w3.org/css-validator/validator-uri.html"]W3C CSS Validator[/url] to see if there are any errors there.

Now it is time to perform a [url="http://www.searchengineworld.com/cgi-bin/servercheck.cgi"]server header check[/url]. This is very important! You'll want to make sure that your indexable pages are returning a 200 status code. Here is an example of what the server header check might return...

Server Response: http://www.seoconsultants.com/
Status: HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Content-Location: http://www.seoconsultants.com/index.htm
Date: Mon, 28 Apr 2003 02:41:52 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Sun, 27 Apr 2003 18:07:02 GMT
ETag: "fa2e96cde7cc31:930"
Content-Length: 23792

I've bolded two of the most important areas that I concentrate on: the Status and the Last-Modified date. I want to make sure that the server is returning that last modified date as it is important to the indexing spider. This is where the Google Freshbot comes in.
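
The same two checks can be scripted. The Python sketch below works on a pasted header dump in the format the header checker prints (one `Name: value` pair per line, with the status line shown as `Status: ...`). The helper names `parse_headers` and `check_headers` are invented for illustration, and it deliberately checks only the two things mentioned above.

```python
from typing import Dict, List


def parse_headers(raw: str) -> Dict[str, List[str]]:
    """Turn 'Name: value' lines into a dict keyed by lower-cased header name."""
    headers: Dict[str, List[str]] = {}
    for line in raw.strip().splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip anything that is not a header line
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return headers


def check_headers(raw: str) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    headers = parse_headers(raw)
    problems = []
    status_parts = headers.get("status", [""])[0].split()
    if len(status_parts) < 2 or status_parts[1] != "200":
        problems.append("status is not 200")
    if "last-modified" not in headers:
        problems.append("no Last-Modified header")
    return problems


SAMPLE = "\n".join([
    "Status: HTTP/1.1 200 OK",
    "Server: Microsoft-IIS/5.0",
    "Last-Modified: Sun, 27 Apr 2003 18:07:02 GMT",
    "Content-Length: 23792",
])
print(check_headers(SAMPLE))
```

Run against the forum's home page response earlier in this thread, the second check would fail, since no Last-Modified line is returned there.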

I'll typically run the page through the [url="http://www.searchengineworld.com/cgi-bin/page_size.cgi"]web page size checker[/url] to see what type of balance there is between the text and html. The report might look like this...

Total WebPage Size
23792 (bytes)

Visible Text Size
3745 (bytes)

Size of HTML Tags
20047 (bytes)

Text to HTML Ratio
16.23%

Number of Images
26

Largest Image Size
10966 (bytes)

Size of All Images
38837 (bytes)

Grand Total: Images+Html
62629 (bytes)

You definitely want to take a close look at the text to html ratio. If it is below 10%, you may have some content issues to deal with.
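
For a rough local version of that ratio, here is a Python sketch using the standard library's html.parser. It is an approximation only: the size checker's exact formula is not given in this thread, so this may not reproduce its figures. The class and function names are made up for illustration, and the 10% threshold is simply the rule of thumb above.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_to_html_ratio(html: str) -> float:
    """Visible text bytes as a percentage of total page bytes."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    # Collapse runs of whitespace so indentation does not count as text.
    text = " ".join("".join(collector.chunks).split())
    total = len(html.encode("utf-8"))
    if total == 0:
        return 0.0
    return 100.0 * len(text.encode("utf-8")) / total


def needs_content_review(html: str, threshold: float = 10.0) -> bool:
    """Apply the 'below 10%' rule of thumb."""
    return text_to_html_ratio(html) < threshold
```

Calling `needs_content_review(page_source)` on a saved copy of a page gives a quick yes/no before digging into the full report.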

Okay, enough of my secret formula. The rest you'll need to figure out! ;)




pageoneresults -> RE: I want to be exposed. Google me now! (4/27/2003 23:10:55)

Let me also add that Google has an indexing limit of 100k per page. If your content is not within that first 100k, forget about ranking. If you plan on feeding the spider 100k of presentation code then kiss any rankings good-bye. That is why external css and javascript are key in eliminating html code bloat. The goal is to put your content right after that opening <body> tag. Ours looks like this...

<body>
<h1>
<p>

Then below all of the content is our site navigation and graphic content. My belief is that the spiders will favor pages that are light, error free and with very minimal presentational markup.
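
A quick way to see how far down the content starts is to measure the byte offset of the first heading. The Python sketch below assumes the content begins at the first `<h1`, as in the layout above. The 100k limit is the figure quoted in this post, not an official constant, and the function names are made up for illustration.

```python
import re

LIMIT_BYTES = 100 * 1024  # the 100k figure quoted in this thread


def content_offset(html: str, marker: str = "<h1") -> int:
    """Byte offset of the first occurrence of marker, or -1 if it is absent."""
    match = re.search(re.escape(marker), html, re.IGNORECASE)
    if match is None:
        return -1
    return len(html[:match.start()].encode("utf-8"))


def content_within_limit(html: str, marker: str = "<h1", limit: int = LIMIT_BYTES) -> bool:
    """True when the content marker appears before the size limit."""
    offset = content_offset(html, marker)
    return 0 <= offset < limit


page = "<body>\n<h1>Title</h1>\n<p>Content first, navigation later.</p>"
print(content_offset(page), content_within_limit(page))
```

The smaller that offset, the less presentation code the spider has to wade through before it reaches the text.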




Spooky -> RE: I want to be exposed. Google me now! (4/27/2003 23:45:28)

Let me absorb all of that :-)

Ultimately that will be my aim - I need to place a peg in the ground when the forum code is complete.
It may be soon when a new version is complete. After that, I'll take a look at doing a total rewrite with the aim of speed and basing it on current standards.

First I'd like to see at least ONE url in Google! :-)




pageoneresults -> RE: I want to be exposed. Google me now! (4/27/2003 23:51:27)

Outfront is there. Frontpagewebmaster is not. This is why...

quote:

We are sorry...
You are seeing this message because of one of the following reasons:

Either JavaScript and/or Cookies are Not Enabled on Your Browser.
You enter the forum with an URL that contains invalid Session ID.
Problem #1:

To use our forum (ASPPlayground.NET Forum SQL), JavaScript and Cookies must be enabled.

Netscape Communicator 4+:

Select "Edit" from the browser Menu bar
Select "Preferences"
Double click on "Advanced"
Verify that "Enable JavaScript" is selected
Verify that "Accept only cookies that get sent back to the originating server" is selected
Select the "OK" button
Microsoft Internet Explorer 4+:

Select "View" from the browser Menu bar
Select "Internet Options"
Select "Security" tab
Select "Custom (for expert users)" security level and select the "Settings" button
Locate "Scripting" and verify that "Active Scripting" is enabled
Locate "Cookies" and verify that "Allow per-session cookies (not stored)" is enabled
Select the "OK" button on the "Security" window
Select the "OK" button on the "Internet Options" window
Once you have enabled both JavaScript and Cookies you may try the forum again.

Problem #2:

You have bookmarked a page that contains your previous (invalid) Session ID. Always bookmark our forum's home page WITHOUT the Session ID. e.g.:

http://www.frontpagewebmaster.com/?cookieCheck=XXX will be invalid the next time you enter the forum.
http://www.frontpagewebmaster.com/ is the page you want to bookmark.
Try again?

Technical Information

For more information, please contact forum admin.
For forum software inquiry, please contact info@aspplayground.net or visit ASPPlayground.NET.

The above is exactly what Google is seeing! Here is the [url="http://216.239.33.100/search?q=cache:g71khF7kYJUC:www.frontpagewebmaster.com/+&hl=en&ie=UTF-8"]Google Cache[/url].




Spooky -> RE: I want to be exposed. Google me now! (4/27/2003 23:57:06)

quote:

First I'd like to see at least TWO urls in Google! :-)


[;)]

Yes, that's the problem I mean. For some reason, Google picks up on that: even though I've modified the redirection code, the session management kicks us out.




pageoneresults -> RE: I want to be exposed. Google me now! (4/28/2003 0:00:25)

ASPPlayground can't help you out with this? He seems to be very helpful over there. Definitely quick to answer questions about the forum. I have a client using aspplayground right now. We've got the #1 spot for two of the client's major search terms. Go figure...

<added>Hmmm, just checked the Google Cache on that one and get the same error page. Not good! Our page title is pulling its weight in the SERPs (Search Engine Results Pages). We have the same Google snippet as you do. This is definitely a problem that should be resolved. How close are you with those guys over there? Can you get them to provide a fix?




Spooky -> RE: I want to be exposed. Google me now! (4/28/2003 0:07:55)

I'm sure Sam is more than happy to help!
However, as of Monday (ish) there is a new version, so this discussion may turn out to be irrelevant.
If it doesn't work as expected, I'll investigate that version more too.




Spooky -> RE: I want to be exposed. Google me now! (4/28/2003 4:42:20)

quote:

There are also 167 html errors for the home page alone.

It's now only 36 - about 20 of those are:

reference to entity "m" for which no system identifier could be generated for code:
"><a href="fb.asp?appid=4&m=132212" class=c2>RE

I guess I'll need to change to &amp; for those, but otherwise it's mostly all good.
One fatal error is generated by not declaring doctype - is this enough to block Google though??
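
On the ampersand errors: the fix is to escape each raw `&` in an href as `&amp;` when the page is written out. The forum itself is ASP, so the Python below is only an illustration of the transformation, using the standard library's `html.escape`. The wrapper name `escape_href` is made up.

```python
from html import escape


def escape_href(url: str) -> str:
    """Escape a URL for use inside an HTML attribute value."""
    # quote=True also escapes double and single quotes, which matters inside attributes.
    return escape(url, quote=True)


print(escape_href("fb.asp?appid=4&m=132212"))
# fb.asp?appid=4&amp;m=132212
```

Apply it once only, at output time. Running it over a url that already contains `&amp;` would escape the ampersand a second time.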





pageoneresults -> RE: I want to be exposed. Google me now! (4/28/2003 10:45:22)

quote:

One fatal error is generated by not declaring doctype - is this enough to block google though??


No. The DOCTYPE is only required to pass html/xhtml validation. It also tells the browser which rendering mode it should be in.




Spooky -> RE: I want to be exposed. Google me now! (4/28/2003 11:01:12)

Hmm. The options are narrowing [&o]




pageoneresults -> RE: I want to be exposed. Google me now! (4/28/2003 11:06:18)

Unfortunately I think the only option is for Sam to address the session issues. Until that occurs, you shall remain un-exposed.




Spooky -> RE: I want to be exposed. Google me now! (5/4/2003 0:48:29)

Well, in theory, we should be good to go.
It's as valid as you can get (well, for the main page ;) and I've removed some meta refresh tags.

Let's see what happens. "Here bot bot bot. Here bot bot bot!"




pageoneresults -> RE: I want to be exposed. Google me now! (5/4/2003 12:39:39)

Spooky, can you get rid of the Expires header also...

Status: HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Sun, 04 May 2003 16:37:38 GMT
MicrosoftOfficeWebServer: 5.0_Pub
pragma: no-cache
cache-control: private
Content-Length: 21403
Content-Type: text/html; Charset=ISO-8859-1
Expires: Fri, 26 Mar 1999 17:37:38 GMT
Set-Cookie: forum%5FpgdlastVisit=5%2F4%2F2003+12%3A37%3A38+PM; expires=Tue, 03-Jun-2003 04:00:00 GMT; path=/
Set-Cookie: forum%5Fpgdmembrowser=moz4; expires=Tue, 03-Jun-2003 04:00:00 GMT; path=/
Set-Cookie: ASPSESSIONIDSQDCAAQR=FPJMPAFBNDJLHDCCBICALNCD; path=/
Cache-control: no-cache




Spooky -> RE: I want to be exposed. Google me now! (5/4/2003 15:27:04)

Si!




Spooky -> The Google Monster has arrived! (5/12/2003 16:25:01)

Wooo! Google knows where we are ;-)

There are a few other mods that may also allow the actual messages to be indexed; currently, it's only the main pages and profiles.

Watch this space :-)




pageoneresults -> RE: I want to be exposed. Google me now! (5/13/2003 15:32:01)

Hmmm, the most important part of this board to get indexed is the messages. That is why it is important that moderators and admins take the time to make sure that message titles are succinct and properly describe the topic.

Forum members should also take this into consideration when posting new topics. Titles should be no more than 7 words, use proper upper and lower case and should describe the topic appropriately.

Once Googlebot gets in here and starts indexing threads, those titles are going to be the most important element of the indexing.

Spooky, make the mods quickly!




Spooky -> RE: I want to be exposed. Google me now! (5/14/2003 14:32:22)

The later version of the forum is using regular expressions instead of a pgd code conversion - Sam seems to think that has restricted the amount of text that can be indexed.
I would have to agree, but we will find out for sure soon :)



