Google and Site Maps


RyanLanane
11-25-2005, 09:41 PM
I modified the .htaccess of funnyassjokes.com so all dynamic URLs are now static... made a site map with all 1600 URLs in it, and submitted it to Google.
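
For readers following along, here is a minimal sketch of how a site map with 1600 static URLs could be generated. It is not the method used in this thread. The `joke-N.html` URL pattern and the `build_sitemap` helper are made up for illustration, and the namespace is the sitemaps.org 0.9 one in use today, which may differ from what Google accepted at the time of the post.

```python
import xml.etree.ElementTree as ET

# Namespace from the sitemaps.org protocol (an assumption; the thread does
# not say which schema the submitted site map used).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical static URL pattern; the real paths are not given in the thread.
urls = [f"http://www.funnyassjokes.com/joke-{n}.html" for n in range(1, 1601)]
sitemap_xml = build_sitemap(urls)
```

Writing `sitemap_xml` to a file and uploading it would be the remaining step; counting the `<url>` entries before submitting is a cheap sanity check that nothing was dropped.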

A couple of weird things: it shows last downloaded 8 hours ago, and Google has downloaded it multiple times since I originally submitted it (Nov 23rd). When I click "Stats" - "Crawl Stats" the 'successfully crawled' bar is entirely red, leading me to believe that it's saying it has crawled all 1600 URLs. A couple of problems though.

In my referrer logs it is showing Googlebot with about 60-70 queries per day. I am assuming some of these are for AdSense... and there would be a LOT more if it had crawled all of the URLs I submitted. Also, when I do a search for http://www.google.com/search?q=site:www.funnyassjokes.com/&hl=en

It brings back only my main domain as I submitted it with a trailing /

The meta tags are fairly weak but I just wanted to do an original submit, get through all of it, then go from there... making tweaks and re-uploading the site map for another crawl etc.. Thing is even with the new index it is not showing any description or ANYTHING for funnyassjokes.com - just

"
www.funnyassjokes.com/ (http://www.funnyassjokes.com/)
Similar pages (http://www.google.com/search?hl=en&lr=&q=related:www.funnyassjokes.com/)
"
A couple of questions... Has anyone had anything similar happen to them? Does this mean Google has crawled everything it is going to, or is there no way to tell? Also, since I submitted all those URLs in the site map, shouldn't I eventually see 1600+ separate queries apart from the AdSense Googlebot queries, even if there are errors? Also, this is the one I wonder the most about... Once it shows it has crawled your site, does it mean it has also indexed it, meaning whatever pages are going to be listed are listed instantly at that point? Or are they 2 separate stages which can happen a lengthy amount of time apart from each other?

I know this is a long ass post, any answers would be appreciated though :)

Optimizing this thing is going to be a bitch because all calls are done by a script with a single template for every joke and every category meaning each page has the same meta tags, title, description, etc.. I am hoping Google doesn't look at it as a Spam of their index :(

sarettah
11-25-2005, 11:39 PM
I haven't done much with Google AdSense, but my experience with Google is that they crawl and crawl and crawl way before you are indexed.

Optimizing it could be made easier by doing it dynamically, i.e., having the script figure out the metas for each joke and inserting them on the fly.
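
A rough sketch of what "figuring out the metas on the fly" could look like, written in Python purely for illustration (the thread never says what language the joke script is in). The `build_meta` function, its parameters, and the title format are all hypothetical; the idea is simply to derive a unique title and description from each joke's own data instead of sharing one template.

```python
import html


def build_meta(joke_title, joke_text, category, max_desc=150):
    """Build per-page <title> and meta description tags from a joke's own data."""
    # Collapse whitespace so the description is a single line.
    summary = " ".join(joke_text.split())
    if len(summary) > max_desc:
        # Cut at the last word boundary that fits, then mark the truncation.
        summary = summary[:max_desc].rsplit(" ", 1)[0] + "..."
    title = f"{joke_title} - {category} Jokes"
    # Escape so quotes or ampersands in a joke cannot break the markup.
    return (
        f"<title>{html.escape(title)}</title>\n"
        f'<meta name="description" content="{html.escape(summary)}">'
    )


print(build_meta(
    "Chicken & Road",
    "Why did the chicken   cross the road?\nTo get to the other side.",
    "Animal",
))
```

The template would then print this output in the page head in place of the fixed tags, so every joke page ends up with its own title and description.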

RyanLanane
11-26-2005, 12:59 AM
Sarettah, when that site is pulling in some decent traffic I was actually planning on having you look into doing that for me. Right now, I can't manipulate the metas per joke... being able to do so would make a huge difference when it comes to SEO. The script really is pretty messy though..

It will probably be a couple of months; right now I am getting like 4 - 6 adword clicks a day from like 50 visitors lol.. $.03 a click so it isn't making money lol..

When it does, I'll dump it back in though. It's more one of my projects to have fun with... The design still leaves a lot to be desired too, lol

Dravyk
11-26-2005, 02:12 AM
Ok, the result in Google is a "stump", a place-holder in a sense.

The 60-70 queries a day are the spider working through the 1,600 links it has digested.

Google may decide either to wait until each link has been crawled before it gets put into the public DB, or it might do it in pieces. Not instant at all. Crawling simply means getting put into the DB; it has nothing to do with actually publicly publishing the DB.

Also, no, even if it crawls all 1,600, that does not mean all 1,600 will go in. In fact, I will guarantee you that they will not. MSN and Yahoo do put everything in; Google has purposely decided "bigger is not better" (or so they say publicly). You will probably get 1/4 to 1/3 of your site eventually listed in Google and no more. And no, you can't control what goes in and what stays out; it/they will decide.

GoogleBot is GoogleBot. "Adsense bot" usually shows up as: Google MediaPartners or mediapartners-google.
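
To see how many of those 60-70 daily hits come from the search crawler versus the AdSense crawler, the access log can be tallied by user agent. The sketch below assumes Apache combined log format, where the user agent is the last quoted field; the sample lines and the function names `classify_crawler` and `count_crawler_hits` are made up for illustration.

```python
from collections import Counter


def classify_crawler(user_agent):
    """Label a user-agent string as 'adsense', 'googlebot', or 'other'."""
    ua = user_agent.lower()
    if "mediapartners-google" in ua:
        return "adsense"
    if "googlebot" in ua:
        return "googlebot"
    return "other"


def count_crawler_hits(log_lines):
    """Count hits per crawler type, reading the last quoted field as the UA."""
    counts = Counter()
    for line in log_lines:
        parts = line.rsplit('"', 2)
        if len(parts) == 3:  # skip lines without a quoted user agent
            counts[classify_crawler(parts[1])] += 1
    return counts


# Made-up sample lines in combined log format.
sample_log = [
    '66.249.66.1 - - [25/Nov/2005:10:00:00 +0000] "GET /joke-1.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.2 - - [25/Nov/2005:10:00:05 +0000] "GET /joke-2.html HTTP/1.1" '
    '200 512 "-" "Mediapartners-Google/2.1"',
    '10.0.0.1 - - [25/Nov/2005:10:00:09 +0000] "GET / HTTP/1.1" '
    '200 1024 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]
hits = count_crawler_hits(sample_log)
```

Run against a real log file (for example `count_crawler_hits(open("access.log"))`, with the file name being an assumption), the `googlebot` count is the number to compare against the 1,600 URLs in the site map.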


Some advice. 1,600 joke pages? Re-META and rename each joke/file? Messy script?

I would forget it and leave it as it is. But if your heart is set on it, you might find it easier to take each one and pour it into a better CMS. (Or maybe you can get the one you have reworked. I can think of ways that could or could not be done, depending upon individual item functionality and how it writes to a DB at the moment.)

And even then you might want to hire a grunt to do it for you. If you change 25 jokes a day, that's about 4-5 hours of work a day for over two months. Can you afford that time or get back the ROI on that? (And that's not including a new script install or existing script overhaul to boot.)
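
The estimate above can be checked with a few lines of arithmetic. The figures are the ones from the post (1,600 jokes, 25 a day, 4-5 hours a day); only the variable names are introduced here.

```python
total_jokes = 1600
jokes_per_day = 25

# 1600 / 25 = 64 working days, i.e. a little over two months.
days = total_jokes / jokes_per_day

# At 4-5 hours a day, the whole job comes to 256-320 hours.
low_hours = days * 4
high_hours = days * 5

print(f"{days:.0f} days, {low_hours:.0f}-{high_hours:.0f} hours in total")
```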

RyanLanane
11-26-2005, 08:30 AM
Dravyk, I am the grunt guy right now - grunt work doesn't bother me as long as I alter the type of grunt work at least once a day ;)

That being said, this is more of a 'fun' project for me... Done right it could probably make me a few thousand dollars a month in a year or two. Investing spare time here and there is worth it to me. I want it to be the site I show to people and say "I started that from scratch" and actually be proud of every aspect of it. I have a LONG way to go in a LOT of areas... in fact in EVERY area lol.. In no rush though :)

Thanks for the Google answers, they helped put things into perspective. Been doing a LOT of reading and asking questions about Google the last few weeks. I have picked up a lot, but I am barely scratching the surface. Implementation is a whole new ball game...

sarettah
11-26-2005, 11:20 PM
Yes, you can manipulate the metas per joke :) I have faith in you Ryan. It probably would not take much messing with the script to make it happen.

If you end up with some free time, hit me up and I will point you in the right direction.

What script are you using for it?