Joined: Jun 2006
Posts: 67
journeyman
Since installing 7x and the spider tracking, I've noticed that Yahoo is all over our boards all of the time.

Our Who's Online

What gives? I am assuming something is wrong, or Yahoo wouldn't be doing this 24 hours a day.


[Linked Image from boards.collectors-society.com]
Joined: Jun 2006
Posts: 16,299
Likes: 116
UBB.threads Developer
I run 4 communities, and I see the same thing on several other sites as well; it's normal. Google used to do this a lot too, but it seems they have streamlined how they crawl data.


I am a Web Development Contractor, I do not work for UBBCentral. I have provided free User to User Support since the beginning of these support forums.
Do you need Forum Install or Upgrade Services?
Forums: A Gardeners Forum, Scouters World
UBB.threads: UBBWiki, UBB Styles, UBB.Sitemaps
Longtime Supporter & Resident Post-A-Holic
VNC Web Services: Code Modifications, Upgrades, Styling, Coding Services, Disaster Recovery, and more!
Joined: Jun 2006
Posts: 67
journeyman
I dunno... it seems weird. I agree that it's a quirk in the Yahoo spider, but it's probably interacting with something on the boards and looping continuously over the same junk?

That's a performance hit I could do without if we could track it down.

Has anyone analyzed their logs to see what the Yahoo spiders are actually doing? Is it legit traffic, or are they caught somehow?
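
One way to answer the log question above is to tally what the crawler actually requests. The following is a minimal Python sketch, assuming an Apache-style "combined" access log; the `summarize_spider` helper, the sample lines, and the IP addresses are made up for illustration and do not come from anyone's real logs.

```python
import re
from collections import Counter

# Matches an Apache "combined" log line: client IP, request path, user agent.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_spider(lines, agent_token="Slurp"):
    """Count hits per URL and collect distinct client IPs for one crawler."""
    hits = Counter()
    ips = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and agent_token.lower() in match.group("agent").lower():
            hits[match.group("path")] += 1
            ips.add(match.group("ip"))
    return hits, ips


# Made-up sample lines (documentation IP ranges), for illustration only.
SLURP_UA = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
SAMPLE_LOG = [
    '192.0.2.10 - - [19/Jul/2007:13:00:01 -0400] "GET /ubbthreads.php?ubb=calendar HTTP/1.0" 200 5120 "-" "%s"' % SLURP_UA,
    '192.0.2.11 - - [19/Jul/2007:13:00:02 -0400] "GET /ubbthreads.php?ubb=calendar HTTP/1.0" 200 5120 "-" "%s"' % SLURP_UA,
    '192.0.2.12 - - [19/Jul/2007:13:00:03 -0400] "GET /ubbthreads.php?ubb=cfrm HTTP/1.0" 200 9000 "-" "%s"' % SLURP_UA,
    '198.51.100.7 - - [19/Jul/2007:13:00:04 -0400] "GET /ubbthreads.php?ubb=cfrm HTTP/1.1" 200 9000 "-" "Googlebot/2.1"',
]

hits, ips = summarize_spider(SAMPLE_LOG)
for path, count in hits.most_common(5):
    print(count, path)
print(len(ips), "distinct IPs")
```

If one URL family (say, the calendar) dominates the tally, the spider is probably looping; many distinct IPs spread across many different URLs looks more like ordinary, if heavy, crawling.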


Joined: Dec 2003
Posts: 1,796
Pooh-Bah
They show up in the WOL (Who's Online) data reading different topics, forums, etc. It doesn't seem like they're 'stuck'; it does seem like Yahoo just sends a plethora of them out daily though, one right after another.


- Allen
- ThreadsDev | PraiseCafe
Joined: Jun 2006
Posts: 67
journeyman
Hm.

Wonder if they're somehow getting an error on their end parsing what they've collected and just going back to retry the next day. :(


Joined: Jun 2006
Posts: 16,299
Likes: 116
UBB.threads Developer
I see separate IPs on different pages, so it's not that they're getting "stuck"; it's just that they're sending a lot of bots out... It shouldn't affect anything too much (50 bots don't take up as many resources as you'd think).

There are some robots.txt rules to keep bots out of "un-needed spots" (such as the calendar, where they'll increment day by day into oblivion).


Joined: Jun 2006
Posts: 67
journeyman
50 bots would be nice. Right now I have 171 bots on my site, and by far most of them are Yahoo. And this is ALL the time.

http://boards.collectors-society.com/ubbthreads.php?ubb=online

There's just got to be something wrong with that.

Last edited by Architecht; 07/19/2007 1:18 PM.

Joined: Nov 2006
Posts: 3,095
Likes: 1
Carpal Tunnel
Well even here on UBB they were up to about 700 bots at one time.

Joined: Jun 2007
Posts: 286
enthusiast
MSN, Yahoo and Google are always on my forum and I do not like it.
In fact, they should pay all of us: they get our content for free and serve it to users they charge a fee to use their system, on top of the money they make from vendors who pay them.

If they do not pay, then there should be some method we can use to block them, as I also get some joint out of Japan that sucks up our content.

I have never made a penny from anyone coming via those ISPs.


JR
Team ZR-1 Corvette Racer's
Joined: Jun 2006
Posts: 16,299
Likes: 116
UBB.threads Developer
Actually, I feel the opposite; I feel we should pay them... Think of it: they download our pages for their DB, their users search their database, and they send their users to our site, who tend to register and click advertising links, which in turn makes us money... For those of us who advertise, anyway...

Now, if you want to stop them from visiting your site at all, you can; it's what robots.txt is for. Just stop them from visiting your forums and never worry about them again (though you'll soon notice traffic decreases, and some sites depend on search engines for new users).


Joined: Jun 2006
Posts: 67
journeyman

I don't mind them being there; I'm just assuming that all the thrashing going on indicates something bad, as it seems inefficient on its face.


Joined: Jun 2006
Posts: 16,299
Likes: 116
UBB.threads Developer
Well, there are some "black holes", such as the calendar, where they can go up day by day into oblivion, but that's an easy fix with robots.txt:
User-agent: *
Disallow: /forum/ubbthreads.php?ubb=calendar
Disallow: /forum/ubbthreads.php/ubb/calendar
Disallow: /forum/ubbthreads.php?ubb=showday
Disallow: /forum/ubbthreads.php/ubb/showday
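
Rules like these can be sanity-checked offline before they go live. Here is a minimal sketch using Python's standard `urllib.robotparser`; the `is_allowed` helper, the `example.com` host, and the extra URL parameters are placeholders for illustration, and the parser only approximates how any particular crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, copied as-is.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/ubbthreads.php?ubb=calendar
Disallow: /forum/ubbthreads.php/ubb/calendar
Disallow: /forum/ubbthreads.php?ubb=showday
Disallow: /forum/ubbthreads.php/ubb/showday
"""


def is_allowed(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# example.com and the parameters after "calendar"/"showday" are placeholders.
BASE = "http://example.com/forum/ubbthreads.php"
CHECKS = [
    BASE + "?ubb=calendar&month=8",
    BASE + "/ubb/showday/2007/8/1",
    BASE + "?ubb=cfrm",
]

results = {url: is_allowed(ROBOTS_TXT, "Slurp", url) for url in CHECKS}
for url, allowed in results.items():
    print("allowed" if allowed else "blocked", url)
```

Disallow lines match by prefix, so each rule also covers the deeper day-by-day calendar URLs that start with it, while ordinary forum pages stay crawlable.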


Joined: Jun 2007
Posts: 286
enthusiast
In almost 10 years with a forum online, I have never had one person register after coming through one of those ISPs.
In any case, what they are charging their customers for as a product is our content, which we not only produce but also pay the domain and web hosting costs for.


Originally Posted by Gizmo
Actually, I feel the opposite; I feel we should pay them... Think of it: they download our pages for their DB, their users search their database, and they send their users to our site, who tend to register and click advertising links, which in turn makes us money... For those of us who advertise, anyway...

Now, if you want to stop them from visiting your site at all, you can; it's what robots.txt is for. Just stop them from visiting your forums and never worry about them again (though you'll soon notice traffic decreases, and some sites depend on search engines for new users).


JR
Team ZR-1 Corvette Racer's
Joined: Jun 2006
Posts: 16,299
Likes: 116
UBB.threads Developer
You've never had a user who's come to your site through a search engine and registered? Somehow I find that hard to believe, unless there's some really un-searched-for content on your site...

If you want to block search engines altogether, just add this to your robots.txt:
User-agent: *
Disallow: /

They'll never touch your site again (so long as they follow the robots.txt standard, which most major ones do)

Honestly though, the most bandwidth I've ever seen used by search engines in a month is about a gig, and this was on a huge site with loads of content that brings in about 4k+ a month from advertising and depends on search engines...
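
To see what the blanket rule does compared with a narrower one, here is a small sketch with Python's standard `urllib.robotparser`. The Slurp-only variant is a hypothetical alternative that nobody in the thread posted, and the `allowed` helper and `example.com` URL are placeholders.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, url="http://example.com/forum/ubbthreads.php"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The blanket rule from the post above.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
# Hypothetical variant, not from the thread: shut out only Yahoo's crawler.
BLOCK_SLURP_ONLY = "User-agent: Slurp\nDisallow: /\n"

outcome = {}
for name, rules in (("block all", BLOCK_ALL), ("block Slurp only", BLOCK_SLURP_ONLY)):
    for agent in ("Slurp", "Googlebot"):
        outcome[(name, agent)] = allowed(rules, agent)
        print(name, "|", agent, "|", "allowed" if outcome[(name, agent)] else "blocked")
```

Either file only helps with crawlers that choose to honor robots.txt, and the traffic trade-off mentioned above applies to whichever engines get shut out.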


Joined: Dec 2003
Posts: 30
newbie
Originally Posted by Architecht
50 bots would be nice. Right now I have 171 bots on my site, and by far most of them are Yahoo. And this is ALL the time.

http://boards.collectors-society.com/ubbthreads.php?ubb=online

There's just got to be something wrong with that.

Yahoo is a pig. Googlebot and the others are nowhere near as bad as Yahoo is. I've been doing this since 1995, so I have some idea of what I'm talking about. :)

Right now I have 4 registered users on, 6 guests, and 135 spiders. Almost all of them are Yahoo from various IP addresses - as was pointed out that's not one spider stuck, that's a ton of them. There's no reason why they should be that friggin piggish.

Makes me wonder if I shouldn't do something like this to them.

However, Yahoo's own help has some ideas:

http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-03.html



Joe Siegler - Former Infopop Staff
Webmaster: Black Sabbath Online, Dopefish, & 3D Realms
Joined: Jun 2006
Posts: 16,299
Likes: 116
UBB.threads Developer
I will agree that the crawl-delay option they mention would be a valid way to slow down the pounding; thanks for the link, Joe :).

There are some areas of the UBB that spiders will get stuck in (as mentioned in the FAQ); the calendar is one of them, as are member files (neither of which needs to be crawled).


Joined: Dec 2003
Posts: 30
newbie
Look at this.

http://www.black-sabbath.com/forums/ubbthreads.php?ubb=online

Right now I have 3 users, 10 guests, and 162 friggin search spiders. The overwhelming majority are Yahoo. I implemented the delay option, but it didn't seem to make much of a difference. :(

My Texas Rangers site isn't nearly as bad. 1 user (me), 0 guests, and 9 spiders (all but one are Yahoo). Sigh.

http://www.rangerfans.com/forums/ubbthreads.php?ubb=online

Unless I did it wrong, but I don't think so.


Joe Siegler - Former Infopop Staff
Webmaster: Black Sabbath Online, Dopefish, & 3D Realms
Joined: Jun 2006
Posts: 16,299
Likes: 116
UBB.threads Developer
Lol, 160 Yahoo spiders is nothing; I had 500 one night lol


Joined: Jun 2007
Posts: 286
enthusiast
I added a delay to robots.txt.
It's in the root of my domain:

User-agent: *
Disallow: /cgi-bin/

User-agent: Slurp
Crawl-delay: 5

It has not slowed down Yahoo Slurp one bit (which must mean they are saying they want to suck up everyone's content).
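
For what it's worth, the file above can be checked with Python's standard `urllib.robotparser` to confirm how a parser reads it. This is only a sketch: `example.com` and `test.cgi` are placeholders, Crawl-delay is a non-standard extension, and the result says nothing about whether Yahoo has picked the file up yet.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the post above, copied as-is.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/

User-agent: Slurp
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The delay (in seconds) applies only to the group that names Slurp.
slurp_delay = parser.crawl_delay("Slurp")
other_delay = parser.crawl_delay("Googlebot")

# A crawler with its own group uses that group instead of the "*" group,
# so the /cgi-bin/ rule no longer covers Slurp here.
CGI_URL = "http://example.com/cgi-bin/test.cgi"  # placeholder URL
slurp_cgi = parser.can_fetch("Slurp", CGI_URL)
other_cgi = parser.can_fetch("Googlebot", CGI_URL)

print("Slurp delay:", slurp_delay, "| other delay:", other_delay)
print("Slurp may fetch cgi-bin:", slurp_cgi, "| others may:", other_cgi)
```

So the delay itself parses fine. One side effect worth knowing: if /cgi-bin/ should stay off limits to Slurp too, the Disallow line probably needs repeating inside the Slurp group.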


JR
Team ZR-1 Corvette Racer's
Joined: Jun 2006
Posts: 16,299
Likes: 116
UBB.threads Developer
Well, they don't check robots.txt every time they request something; they request it at a set interval that can take up to a few weeks to pass.
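
The effect described above can be pictured with a toy model. This is a hypothetical sketch of crawler-side caching, not Yahoo's actual implementation: the `CachedRobots` class, the one-week refresh interval, and the simulated clock are all assumptions made for illustration.

```python
import time
from urllib.robotparser import RobotFileParser


class CachedRobots:
    """Hypothetical crawler-side cache: re-reads robots.txt only after `ttl` seconds."""

    def __init__(self, fetch, ttl, clock=time.time):
        self.fetch = fetch      # callable returning the current robots.txt text
        self.ttl = ttl          # seconds to keep using the cached copy
        self.clock = clock      # injectable so the example can simulate time
        self.parser = None
        self.fetched_at = None

    def crawl_delay(self, agent):
        now = self.clock()
        if self.parser is None or now - self.fetched_at >= self.ttl:
            self.parser = RobotFileParser()
            self.parser.parse(self.fetch().splitlines())
            self.fetched_at = now
        return self.parser.crawl_delay(agent)


# Simulated site and clock; the one-week TTL is an arbitrary illustration.
DAY = 24 * 3600
site = {"robots": "User-agent: *\nDisallow: /cgi-bin/\n"}
now = [0.0]
cache = CachedRobots(lambda: site["robots"], ttl=7 * DAY, clock=lambda: now[0])

before = cache.crawl_delay("Slurp")   # first fetch: no delay configured yet
site["robots"] += "\nUser-agent: Slurp\nCrawl-delay: 5\n"
now[0] = 1 * DAY
stale = cache.crawl_delay("Slurp")    # a day later: the cached copy is still used
now[0] = 8 * DAY
fresh = cache.crawl_delay("Slurp")    # past the TTL: the new rule is finally seen
print(before, stale, fresh)
```

Under a model like this, a freshly added Crawl-delay changes nothing until the cached copy expires, which would explain seeing no difference in the first days after editing the file.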


Powered by UBB.threads™ PHP Forum Software 8.0.0
(Preview build 20230217)