Joined: Jun 2007
Posts: 286
Version 7.5.6
How can I stop Googlebot from eating up CPU by requesting the calendar hundreds of times a day? That feature is never really used, so the pages have no content.

Thanks


JR
Team ZR-1 Corvette Racer's
Joined: Dec 2005
Posts: 122
Put something like this in your robots.txt file:

Code
User-agent: *
Allow: /
Sitemap: http://jakchat.com/sitemap.xml
Disallow: /files/texts
Disallow: /forums/ubbthreads.php?ubb=calendar
Disallow: /forums/ubbthreads.php/ubb/calendar
Disallow: /forums/ubbthreads.php?ubb=showday
Disallow: /forums/ubbthreads.php/ubb/showday
Disallow: /forums/ubbthreads.php?ubb=showprofile
Disallow: /forums/ubbthreads.php/ubb/showprofile
Disallow: /forums/ubbthreads.php?ubb=showmembers
Disallow: /forums/ubbthreads.php/ubb/showmembers
Disallow: /forums/ubbthreads.php?ubb=online
Disallow: /forums/ubbthreads.php/ubb/online
Disallow: /forums/ubbthreads.php?ubb=search
Disallow: /forums/ubbthreads.php/ubb/search
Disallow: /forums/ubbthreads.php?ubb=faq
Disallow: /forums/ubbthreads.php/ubb/faq
Disallow: /forums/ubbthreads.php?ubb=viewprivacy
Disallow: /forums/ubbthreads.php/ubb/viewprivacy
Disallow: /forums/ubbthreads.php?ubb=mycookies
Disallow: /forums/ubbthreads.php/ubb/mycookies
Disallow: /forums/ubbthreads.php?ubb=markallread
Disallow: /forums/ubbthreads.php/ubb/markallread
Disallow: /forums/ubbthreads.php?ubb=newuser
Disallow: /forums/ubbthreads.php/ubb/newuser

This will stop well-behaved search engines from crawling unnecessary pages.


JakChat.com -- Forums for Indonesia's English-speaking community
Ubuntu-Indonesia.com -- Forums for Indonesia's Ubuntu Users
Joined: Jun 2007
Posts: 286
I have had those disallows in robots.txt for some months and they have not stopped this.
The file is in the root of the domain.
Does it have to be located somewhere else?


JR
Team ZR-1 Corvette Racer's
Joined: Jun 2006
Posts: 16,299
Likes: 116
The version I share also asks crawlers to slow down with "Crawl-Delay" (a non-standard directive, so not every crawler honors it):

Code
User-agent: *
Crawl-Delay: 3
Disallow: /files/texts
Disallow: /forum/ubbthreads.php?ubb=calendar
Disallow: /forum/ubbthreads.php/ubb/calendar
Disallow: /forum/ubbthreads.php?ubb=showday
Disallow: /forum/ubbthreads.php/ubb/showday
Disallow: /forum/ubbthreads.php?ubb=showprofile
Disallow: /forum/ubbthreads.php/ubb/showprofile
Disallow: /forum/ubbthreads.php?ubb=showmembers
Disallow: /forum/ubbthreads.php/ubb/showmembers
Disallow: /forum/ubbthreads.php?ubb=online
Disallow: /forum/ubbthreads.php/ubb/online
Disallow: /forum/ubbthreads.php?ubb=search
Disallow: /forum/ubbthreads.php/ubb/search
Disallow: /forum/ubbthreads.php?ubb=faq
Disallow: /forum/ubbthreads.php/ubb/faq
Disallow: /forum/ubbthreads.php?ubb=viewprivacy
Disallow: /forum/ubbthreads.php/ubb/viewprivacy
Disallow: /forum/ubbthreads.php?ubb=mycookies
Disallow: /forum/ubbthreads.php/ubb/mycookies
Disallow: /forum/ubbthreads.php?ubb=markallread
Disallow: /forum/ubbthreads.php/ubb/markallread
Disallow: /forum/ubbthreads.php?ubb=newuser
Disallow: /forum/ubbthreads.php/ubb/newuser


I am a Web Development Contractor, I do not work for UBBCentral. I have provided free User to User Support since the beginning of these support forums.
Do you need Forum Install or Upgrade Services?
Forums: A Gardeners Forum, Scouters World
UBB.threads: UBBWiki, UBB Styles, UBB.Sitemaps
Longtime Supporter & Resident Post-A-Holic
VNC Web Services: Code Modifications, Upgrades, Styling, Coding Services, Disaster Recovery, and more!
Joined: Dec 2005
Posts: 122
Originally Posted by teamzr1
I have those disallows in the robots.txt for some months and has not stopped this
It is in the root of the domain.
Does this txt file have to be located somewhere else ?

Yes, it must be located in the web root directory. To test whether it's in the right place, open it in a web browser like this: http://mydomain.com/robots.txt . If it doesn't appear, it's in the wrong place.
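
If you would rather script that check than open a browser, here is a minimal Python sketch. The function names are made up for illustration, and `mydomain.com` is just the placeholder host from the example above. The first function only works out where crawlers will look; the second one actually fetches the file and needs network access.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(page_url: str) -> str:
    """Return the one place crawlers look: /robots.txt on the same scheme and host."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def robots_is_reachable(page_url: str, timeout: float = 10.0) -> bool:
    """Fetch robots.txt and report whether the server answered 200 (needs network)."""
    try:
        with urlopen(robots_url(page_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures and HTTP errors such as 404.
        return False


# Any page on the forum maps back to the same robots.txt at the host root.
print(robots_url("http://mydomain.com/forum/ubbthreads.php?ubb=calendar"))
```

Note that a 200 answer only proves the file is where crawlers expect it, not that the rules inside it match your real URLs.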


Joined: Jun 2007
Posts: 286
I was able to stop Google flooding my forum 24/7 by adding this:

User-Agent: Googlebot
Disallow: /

The examples above on their own would not.


JR
Team ZR-1 Corvette Racer's
Joined: Jun 2006
Posts: 16,299
Likes: 116
Well, that would block Google entirely; a lot of us live off of our Search Engine traffic...
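
To make the difference concrete, here is a rough Python sketch using the standard library's `urllib.robotparser`. It compares the blanket Googlebot block with a Googlebot group that only lists the calendar pages. A crawler that finds a group naming it follows that group instead of the `*` group, so the targeted group has to repeat whatever you want kept away from Google. The `/forum/` prefix is copied from my example file, and the `showflat` topic URL is only a hypothetical stand-in for a normal thread page; adjust both to your own install. Python's parser is also not Google's parser, so treat this as a sanity check rather than proof.

```python
from urllib.robotparser import RobotFileParser

# Blocks Googlebot from everything, including the threads you want indexed.
BLANKET = """\
User-agent: Googlebot
Disallow: /
"""

# Blocks Googlebot only from the calendar pages (prefix is an assumption).
TARGETED = """\
User-agent: Googlebot
Disallow: /forum/ubbthreads.php?ubb=calendar
Disallow: /forum/ubbthreads.php/ubb/calendar
Disallow: /forum/ubbthreads.php?ubb=showday
Disallow: /forum/ubbthreads.php/ubb/showday
"""


def allowed(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Parse robots.txt text and ask whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


topic = "/forum/ubbthreads.php?ubb=showflat&Number=1"  # hypothetical thread URL
calendar = "/forum/ubbthreads.php?ubb=calendar&month=1&year=1916"

for name, rules in (("blanket", BLANKET), ("targeted", TARGETED)):
    print(name, "topic allowed:", allowed(rules, topic),
          "calendar allowed:", allowed(rules, calendar))
```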


Joined: Jun 2007
Posts: 286
Well, for whatever reason, when robots.txt contained what others have shown above, Googlebot ignored those disallows and constantly flooded the server my forum is on. With a couple hundred domains on that web server, the forum ran very slowly.

I think attracta.com and Google Crawler have something to do with my issues.


JR
Team ZR-1 Corvette Racer's
Joined: Jun 2007
Posts: 286
What is wrong with this robots.txt? If I take out the blanket disallow for Googlebot, it goes back to ignoring the disallows for things like events, calendar, etc.

User-agent: *
Disallow: /cgi-bin/
User-agent: Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)
Disallow:/

User-agent: Slurp
Crawl-delay: 5

User-agent: dotbot
Disallow: /

User-agent: *
Crawl-Delay: 120
Disallow: /forum/ubbthreads.php?ubb=calendar
Disallow: /forum/ubbthreads.php/ubb/calendar
Disallow: /forum/ubbthreads.php?ubb=showday
Disallow: /forum/ubbthreads.php/ubb/showday
Disallow: /forum/ubbthreads.php?ubb=showprofile
Disallow: /forum/ubbthreads.php/ubb/showprofile
Disallow: /forum/ubbthreads.php?ubb=showmembers
Disallow: /forum/ubbthreads.php/ubb/showmembers
Disallow: /forum/ubbthreads.php?ubb=online
Disallow: /forum/ubbthreads.php/ubb/online
Disallow: /forum/ubbthreads.php?ubb=search
Disallow: /forum/ubbthreads.php/ubb/search
Disallow: /forum/ubbthreads.php?ubb=faq
Disallow: /forum/ubbthreads.php/ubb/faq
Disallow: /forum/ubbthreads.php?ubb=viewprivacy
Disallow: /forum/ubbthreads.php/ubb/viewprivacy
Disallow: /forum/ubbthreads.php?ubb=mycookies
Disallow: /forum/ubbthreads.php/ubb/mycookies
Disallow: /forum/ubbthreads.php?ubb=markallread
Disallow: /forum/ubbthreads.php/ubb/markallread
Disallow: /forum/ubbthreads.php?ubb=newuser
Disallow: /forum/ubbthreads.php/ubb/newuser

Google Webmaster Tools has a throttle adjustment for the crawler, but I find it does not do much. If you reduce the request rate and increase the delay, the setting expires after 90 days; unless you renew it, they go right back to flooding your domain with crawler requests.

Also, when reviewing the robots.txt on that same tools page, they say they ignore Crawl-Delay: and its value.


JR
Team ZR-1 Corvette Racer's
Joined: Dec 2003
Posts: 6,562
Likes: 78
Originally Posted by teamzr1
Version 7.5.6
How can I stop Googlebot from eating up CPU by requesting the calendar hundreds of times a day? That feature is never really used, so the pages have no content.

Thanks
Then turn off the calendar feature.
You can start with:
Control Panel » Feature Settings
Un-select the option that enables the calendar.
I think you are getting too convoluted in your robots file.
Besides, many crawlers do not honor them anyway.
And I can't figure out why you exclude cgi-bin, since threads does not use that folder anyway.
Also, if you disable the calendar, Google will still look for the URL for some time. You might want to update your Google settings and exclude the URL.


Blue Man Group
There is no such thing as stupid questions. Just stupid answers
Joined: Jun 2007
Posts: 286
OK thanks, I will try that. Google is querying as often as 3 times a minute, wasting bandwidth and server CPU on events and calendar, as many as 60 times an hour, non-stop.

It makes no sense to have disallows in robots.txt if they are ignored by a service like Google that makes millions of dollars in profit from sucking up our content, and there is zero benefit to them constantly looking at an idle calendar.

Google
Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Referer:
66.249.67.215 6 minutes 25 seconds ago Viewing events for a day
Google
Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Referer:
66.249.68.169 6 minutes 33 seconds ago Looking at the calendar


JR
Team ZR-1 Corvette Racer's
Joined: Dec 2003
Posts: 6,562
Likes: 78
Just bear in mind that they will still attempt to read it for a while, but they should get an error and give up after a time.
I wish I had the problem.


Blue Man Group
There is no such thing as stupid questions. Just stupid answers
Joined: Jun 2007
Posts: 286
Turning off the calendar seems to have worked: spider reads from Google have ceased for the calendar and events. But it sucks to have to turn the calendar off just to stop what Google does.
They are still reading posts, but that is fine, and not as often as the ton of reads for calendar and events.


Next on the spider hit list was Enterweb (they claim to be out of Sweden). They also had a spider flooding my forum all the time, and it turns out they are in the land of hackers.
So I put in a server IP deny for their whole class A address range and a ban via the forum's user config.

NetRange: 88.0.0.0 - 88.255.255.255
CIDR: 88.0.0.0/8
OriginAS:
NetName: 88-RIPE
NetHandle: NET-88-0-0-0-1
Parent:
NetType: Allocated to RIPE NCC
Comment: These addresses have been further assigned to users in
Comment: the RIPE NCC region. Contact information can be found in
Comment: the RIPE database at http://www.ripe.net/whois
RegDate: 2004-04-01
Updated: 2009-05-18
Ref: http://whois.arin.net/rest/net/NET-88-0-0-0-1

OrgName: RIPE Network Coordination Centre
OrgId: RIPE
Address: P.O. Box 10096
City: Amsterdam
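
For anyone copying this deny rule, here is a small Python sketch of how wide a /8 is, using only the standard `ipaddress` module. The `is_denied` helper is a made-up name for illustration, not part of any server or forum config. The test addresses are the two Googlebot IPs from my earlier log plus an arbitrary address inside the blocked range.

```python
import ipaddress

# The range from the WHOIS record above: everything starting with 88.
BLOCKED = ipaddress.ip_network("88.0.0.0/8")


def is_denied(ip: str, networks=(BLOCKED,)) -> bool:
    """Return True if `ip` falls inside any of the denied networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


# A /8 covers 2**24 addresses, i.e. every RIPE customer in that block.
print(BLOCKED.num_addresses)

for candidate in ("88.1.2.3", "66.249.67.215", "66.249.68.169"):
    print(candidate, "denied" if is_denied(candidate) else "allowed")
```

Since that whole block is only allocated to RIPE NCC and then split among many providers, a deny this wide will also shut out ordinary visitors on those networks. A narrower range taken from the crawler's own RIPE record would be less of a sledgehammer.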


JR
Team ZR-1 Corvette Racer's
Joined: Jun 2007
Posts: 286
Here are the records from just a couple of hours showing how many times Google was requesting calendar info, a total waste of server CPU.
Restricting this now improves forum response times for the users.

URL Detail
ubb=calendar&month=1&day=1&year=1953 URL restricted by robots.txt
ubb=calendar&month=1&day=1&year=1976 URL restricted by robots.txt
ubb=calendar&month=1&day=13&year=1932 URL restricted by robots.txt
ubb=calendar&month=1&day=26&year=1983 URL restricted by robots.txt
ubb=calendar&month=1&day=26&year=2082 URL restricted by robots.txt
ubb=calendar&month=1&day=3&year=2023 URL restricted by robots.txt
ubb=calendar&month=1&year=1916 URL restricted by robots.txt
ubb=calendar&month=1&year=1994 URL restricted by robots.txt
ubb=calendar&month=1&year=2004 URL restricted by robots.txt
ubb=calendar&month=1&year=2055 URL restricted by robots.txt
ubb=calendar&month=1&year=2064 URL restricted by robots.txt
ubb=calendar&month=1&year=2068 URL restricted by robots.txt
ubb=calendar&month=1&year=2073 URL restricted by robots.txt
ubb=calendar&month=1&year=2086 URL restricted by robots.txt
ubb=calendar&month=10&day=12&year=1955 URL restricted by robots.txt
ubb=calendar&month=10&day=12&year=1964 URL restricted by robots.txt
ubb=calendar&month=10&day=12&year=1966 URL restricted by robots.txt
ubb=calendar&month=10&day=12&year=1994 URL restricted by robots.txt
ubb=calendar&month=10&day=12&year=2065 URL restricted by robots.txt
ubb=calendar&month=10&day=12&year=2071 URL restricted by robots.txt
ubb=calendar&month=10&year=1934 URL restricted by robots.txt
ubb=calendar&month=10&year=1937 URL restricted by robots.txt
ubb=calendar&month=10&year=1941 URL restricted by robots.txt
ubb=calendar&month=10&year=1949 URL restricted by robots.txt
ubb=calendar&month=10&year=1976 URL restricted by robots.txt
ubb=calendar&month=10&year=1993 URL restricted by robots.txt
ubb=calendar&month=10&year=2013 URL restricted by robots.txt
ubb=calendar&month=10&year=2035 URL restricted by robots.txt
ubb=calendar&month=11&day=19&year=1972 URL restricted by robots.txt
ubb=calendar&month=11&day=19&year=2064 URL restricted by robots.txt
ubb=calendar&month=11&day=24&year=2068 URL restricted by robots.txt
ubb=calendar&month=11&year=1931 URL restricted by robots.txt
ubb=calendar&month=11&year=2008 URL restricted by robots.txt
ubb=calendar&month=11&year=2019 URL restricted by robots.txt
ubb=calendar&month=11&year=2056 URL restricted by robots.txt
ubb=calendar&month=11&year=2061 URL restricted by robots.txt
ubb=calendar&month=11&year=2076 URL restricted by robots.txt
ubb=calendar&month=12&day=8&year=1995 URL restricted by robots.txt
ubb=calendar&month=12&day=8&year=2020 URL restricted by robots.txt
ubb=calendar&month=12&year=1937 URL restricted by robots.txt
ubb=calendar&month=12&year=1941 URL restricted by robots.txt
ubb=calendar&month=12&year=1946 URL restricted by robots.txt
ubb=calendar&month=12&year=1977 URL restricted by robots.txt
ubb=calendar&month=12&year=2004 URL restricted by robots.txt
ubb=calendar&month=12&year=2017 URL restricted by robots.txt
ubb=calendar&month=12&year=2029 URL restricted by robots.txt
ubb=calendar&month=12&year=2055 URL restricted by robots.txt
ubb=calendar&month=2&day=5&year=1948 URL restricted by robots.txt
ubb=calendar&month=2&day=5&year=1982 URL restricted by robots.txt
ubb=calendar&month=2&day=5&year=1996 URL restricted by robots.txt
ubb=calendar&month=2&day=5&year=2004 URL restricted by robots.txt
ubb=calendar&month=2&day=5&year=2013 URL restricted by robots.txt
ubb=calendar&month=2&day=5&year=2084 URL restricted by robots.txt
ubb=calendar&month=2&year=1941 URL restricted by robots.txt
ubb=calendar&month=2&year=1947 URL restricted by robots.txt
ubb=calendar&month=2&year=1986 URL restricted by robots.txt
ubb=calendar&month=2&year=2091 URL restricted by robots.txt
ubb=calendar&month=3&day=10&year=1970 URL restricted by robots.txt
ubb=calendar&month=3&day=10&year=1973 URL restricted by robots.txt
ubb=calendar&month=3&day=10&year=2052 URL restricted by robots.txt
ubb=calendar&month=3&day=10&year=2057 URL restricted by robots.txt
ubb=calendar&month=3&day=10&year=2082 URL restricted by robots.txt
ubb=calendar&month=3&year=1912 URL restricted by robots.txt
ubb=calendar&month=3&year=1916 URL restricted by robots.txt
ubb=calendar&month=3&year=1954 URL restricted by robots.txt
ubb=calendar&month=3&year=2054 URL restricted by robots.txt
ubb=calendar&month=3&year=2077 URL restricted by robots.txt
ubb=calendar&month=4&day=28&year=1956 URL restricted by robots.txt
ubb=calendar&month=4&day=28&year=2012 URL restricted by robots.txt
ubb=calendar&month=4&year=1994 URL restricted by robots.txt
ubb=calendar&month=4&year=1995 URL restricted by robots.txt
ubb=calendar&month=4&year=2003 URL restricted by robots.txt
ubb=calendar&month=4&year=2073 URL restricted by robots.txt
ubb=calendar&month=5&day=23&year=1981 URL restricted by robots.txt
ubb=calendar&month=5&day=24&year=1919 URL restricted by robots.txt
ubb=calendar&month=5&day=24&year=2092 URL restricted by robots.txt
ubb=calendar&month=5&year=1906 URL restricted by robots.txt
ubb=calendar&month=5&year=1958 URL restricted by robots.txt
ubb=calendar&month=5&year=1995 URL restricted by robots.txt
ubb=calendar&month=5&year=2030 URL restricted by robots.txt
ubb=calendar&month=5&year=2084 URL restricted by robots.txt
ubb=calendar&month=6&day=14&year=2054 URL restricted by robots.txt
ubb=calendar&month=6&day=14&year=2074 URL restricted by robots.txt
ubb=calendar&month=6&year=1936 URL restricted by robots.txt
ubb=calendar&month=6&year=1960 URL restricted by robots.txt
ubb=calendar&month=6&year=1999 URL restricted by robots.txt
ubb=calendar&month=7&year=1909 URL restricted by robots.txt
ubb=calendar&month=7&year=1942 URL restricted by robots.txt
ubb=calendar&month=7&year=1949 URL restricted by robots.txt
ubb=calendar&month=7&year=2063 URL restricted by robots.txt
ubb=calendar&month=7&year=2089 URL restricted by robots.txt
ubb=calendar&month=8&day=25&year=1925 URL restricted by robots.txt
ubb=calendar&month=8&day=25&year=1977 URL restricted by robots.txt
ubb=calendar&month=8&day=25&year=2017 URL restricted by robots.txt
ubb=calendar&month=8&day=25&year=2054 URL restricted by robots.txt
ubb=calendar&month=8&day=25&year=2094 URL restricted by robots.txt
ubb=calendar&month=8&day=30&year=1920 URL restricted by robots.txt
ubb=calendar&month=8&day=30&year=2039 URL restricted by robots.txt
ubb=calendar&month=8&day=30&year=2070 URL restricted by robots.txt
ubb=calendar&month=8&year=1927 URL restricted by robots.txt
ubb=calendar&month=8&year=1932 URL restricted by robots.txt
ubb=calendar&month=8&year=1995 URL restricted by robots.txt
ubb=calendar&month=8&year=2035 URL restricted by robots.txt
ubb=calendar&month=8&year=2053 URL restricted by robots.txt
ubb=calendar&month=8&year=2056 URL restricted by robots.txt
ubb=calendar&month=8&year=2072 URL restricted by robots.txt
ubb=calendar&month=8&year=2078 URL restricted by robots.txt
ubb=calendar&month=9&day=11&year=1910 URL restricted by robots.txt
ubb=calendar&month=9&day=11&year=1958 URL restricted by robots.txt
ubb=calendar&month=9&day=11&year=1991 URL restricted by robots.txt
ubb=calendar&month=9&day=19&year=1979 URL restricted by robots.txt
ubb=calendar&month=9&day=19&year=2086 URL restricted by robots.txt
ubb=calendar&month=9&day=23&year=1942 URL restricted by robots.txt
ubb=calendar&month=9&day=23&year=1995 URL restricted by robots.txt
ubb=calendar&month=9&day=23&year=2003 URL restricted by robots.txt
ubb=calendar&month=9&day=23&year=2037 URL restricted by robots.txt
ubb=calendar&month=9&day=23&year=2084 URL restricted by robots.txt
ubb=calendar&month=9&year=1926 URL restricted by robots.txt
ubb=calendar&month=9&year=1933 URL restricted by robots.txt
ubb=calendar&month=9&year=1938 URL restricted by robots.txt
ubb=calendar&month=9&year=1965 URL restricted by robots.txt
ubb=showday&day=1&month=1&year=1931 URL restricted by robots.txt
ubb=showday&day=1&month=1&year=1943 URL restricted by robots.txt
ubb=showday&day=1&month=1&year=2014 URL restricted by robots.txt
ubb=showday&day=1&month=1&year=2022 URL restricted by robots.txt
ubb=showday&day=10&month=3&year=1954 URL restricted by robots.txt
ubb=showday&day=10&month=3&year=1956 URL restricted by robots.txt
ubb=showday&day=10&month=3&year=1980 URL restricted by robots.txt
ubb=showday&day=10&month=3&year=1988 URL restricted by robots.txt
ubb=showday&day=10&month=3&year=1989 URL restricted by robots.txt
ubb=showday&day=10&month=3&year=2009 URL restricted by robots.txt
ubb=showday&day=10&month=3&year=2043 URL restricted by robots.txt
ubb=showday&day=10&month=3&year=2058 URL restricted by robots.txt
ubb=showday&day=10&month=3&year=2064 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=1926 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=1955 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=1969 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=1991 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=2016 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=2023 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=2031 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=2038 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=2051 URL restricted by robots.txt
ubb=showday&day=11&month=9&year=2061 URL restricted by robots.txt
ubb=showday&day=12&month=10&year=1949 URL restricted by robots.txt
ubb=showday&day=12&month=10&year=1981 URL restricted by robots.txt
ubb=showday&day=12&month=10&year=1984 URL restricted by robots.txt
ubb=showday&day=12&month=10&year=2056 URL restricted by robots.txt
ubb=showday&day=12&month=10&year=2094 URL restricted by robots.txt
ubb=showday&day=13&month=1&year=1912 URL restricted by robots.txt
ubb=showday&day=13&month=1&year=1959 URL restricted by robots.txt
ubb=showday&day=13&month=1&year=2010 URL restricted by robots.txt
ubb=showday&day=13&month=1&year=2035 URL restricted by robots.txt
ubb=showday&day=13&month=1&year=2061 URL restricted by robots.txt
ubb=showday&day=13&month=1&year=2068 URL restricted by robots.txt
ubb=showday&day=13&month=1&year=2073 URL restricted by robots.txt
ubb=showday&day=14&month=6&year=1988 URL restricted by robots.txt
ubb=showday&day=14&month=6&year=2003 URL restricted by robots.txt
ubb=showday&day=14&month=6&year=2034 URL restricted by robots.txt
ubb=showday&day=16&month=11&year=1920 URL restricted by robots.txt
ubb=showday&day=16&month=11&year=1957 URL restricted by robots.txt
ubb=showday&day=16&month=11&year=1975 URL restricted by robots.txt
ubb=showday&day=16&month=11&year=1977 URL restricted by robots.txt
ubb=showday&day=19&month=11&year=1983 URL restricted by robots.txt
ubb=showday&day=19&month=11&year=1991 URL restricted by robots.txt
ubb=showday&day=19&month=11&year=2033 URL restricted by robots.txt
ubb=showday&day=19&month=11&year=2034 URL restricted by robots.txt
ubb=showday&day=19&month=11&year=2077 URL restricted by robots.txt
ubb=showday&day=19&month=9&year=1926 URL restricted by robots.txt
ubb=showday&day=19&month=9&year=1932 URL restricted by robots.txt
ubb=showday&day=19&month=9&year=2023 URL restricted by robots.txt
ubb=showday&day=19&month=9&year=2038 URL restricted by robots.txt
ubb=showday&day=19&month=9&year=2071 URL restricted by robots.txt
ubb=showday&day=2&month=8&year=2003 URL restricted by robots.txt
ubb=showday&day=2&month=8&year=2046 URL restricted by robots.txt
ubb=showday&day=23&month=1&year=1993 URL restricted by robots.txt
ubb=showday&day=23&month=1&year=2025 URL restricted by robots.txt
ubb=showday&day=23&month=1&year=2048 URL restricted by robots.txt
ubb=showday&day=23&month=1&year=2073 URL restricted by robots.txt
ubb=showday&day=23&month=5&year=1949 URL restricted by robots.txt
ubb=showday&day=23&month=5&year=1960 URL restricted by robots.txt
ubb=showday&day=23&month=5&year=2021 URL restricted by robots.txt
ubb=showday&day=23&month=5&year=2088 URL restricted by robots.txt
ubb=showday&day=23&month=9&year=1914 URL restricted by robots.txt
ubb=showday&day=23&month=9&year=1926 URL restricted by robots.txt
ubb=showday&day=23&month=9&year=1997 URL restricted by robots.txt
ubb=showday&day=23&month=9&year=2005 URL restricted by robots.txt
ubb=showday&day=23&month=9&year=2013 URL restricted by robots.txt
ubb=showday&day=23&month=9&year=2023 URL restricted by robots.txt
ubb=showday&day=23&month=9&year=2036 URL restricted by robots.txt
ubb=showday&day=24&month=11&year=1944 URL restricted by robots.txt
ubb=showday&day=24&month=11&year=1953 URL restricted by robots.txt
ubb=showday&day=24&month=11&year=1959 URL restricted by robots.txt
ubb=showday&day=24&month=11&year=2012 URL restricted by robots.txt
ubb=showday&day=24&month=11&year=2018 URL restricted by robots.txt
ubb=showday&day=24&month=11&year=2019 URL restricted by robots.txt
ubb=showday&day=24&month=11&year=2048 URL restricted by robots.txt
ubb=showday&day=24&month=11&year=2085 URL restricted by robots.txt
ubb=showday&day=24&month=5&year=1931 URL restricted by robots.txt
ubb=showday&day=24&month=5&year=1958 URL restricted by robots.txt
ubb=showday&day=24&month=5&year=1994 URL restricted by robots.txt
ubb=showday&day=24&month=5&year=2062 URL restricted by robots.txt
ubb=showday&day=25&month=8&year=1951 URL restricted by robots.txt
ubb=showday&day=25&month=8&year=2078 URL restricted by robots.txt
ubb=showday&day=25&month=8&year=2095 URL restricted by robots.txt
ubb=showday&day=26&month=1&year=1940 URL restricted by robots.txt
ubb=showday&day=26&month=1&year=1988 URL restricted by robots.txt
ubb=showday&day=26&month=1&year=2066 URL restricted by robots.txt
ubb=showday&day=26&month=1&year=2074 URL restricted by robots.txt
ubb=showday&day=28&month=4&year=1956 URL restricted by robots.txt
ubb=showday&day=28&month=4&year=2039 URL restricted by robots.txt
ubb=showday&day=3&month=1&year=1942 URL restricted by robots.txt
ubb=showday&day=3&month=1&year=2075 URL restricted by robots.txt
ubb=showday&day=30&month=8&year=1930 URL restricted by robots.txt
ubb=showday&day=30&month=8&year=1955 URL restricted by robots.txt
ubb=showday&day=30&month=8&year=2003 URL restricted by robots.txt
ubb=showday&day=30&month=8&year=2013 URL restricted by robots.txt
ubb=showday&day=5&month=2&year=1981 URL restricted by robots.txt
ubb=showday&day=5&month=2&year=2096 URL restricted by robots.txt
ubb=showday&day=8&month=12&year=1911 URL restricted by robots.txt
ubb=showday&day=8&month=12&year=1960 URL restricted by robots.txt
ubb=showday&day=8&month=12&year=2055 URL restricted by robots.txt
ubb=showday&day=8&month=12&year=2082 URL restricted by robots.txt
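
A list like this is easier to read once it is summarized. Below is a minimal Python sketch that counts the entries per `ubb` action and reports the range of years requested. The five sample lines are copied from the report above; the `summarize` function is just an illustration and assumes each line starts with the query string followed by a space.

```python
from collections import Counter
from urllib.parse import parse_qs

# A few lines copied from the report above.
REPORT_LINES = [
    "ubb=calendar&month=1&day=1&year=1953 URL restricted by robots.txt",
    "ubb=calendar&month=1&year=1916 URL restricted by robots.txt",
    "ubb=calendar&month=5&year=1906 URL restricted by robots.txt",
    "ubb=showday&day=5&month=2&year=2096 URL restricted by robots.txt",
    "ubb=showday&day=8&month=12&year=2082 URL restricted by robots.txt",
]


def summarize(lines):
    """Count entries per ubb action and find the earliest and latest year."""
    actions = Counter()
    years = []
    for line in lines:
        query = line.split(" ", 1)[0]  # drop the "URL restricted..." detail
        params = parse_qs(query)
        if "ubb" not in params or "year" not in params:
            continue  # skip anything that is not a calendar-style URL
        actions[params["ubb"][0]] += 1
        years.append(int(params["year"][0]))
    return actions, min(years), max(years)


actions, first_year, last_year = summarize(REPORT_LINES)
print(dict(actions), first_year, last_year)
```

The years in the full report run from the early 1900s to the 2090s, which suggests the crawler keeps following the calendar's links from one month and year to the next and never runs out of pages, even though none of them hold any events.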


JR
Team ZR-1 Corvette Racer's
Joined: Dec 2003
Posts: 6,562
Likes: 78
Originally Posted by teamzr1
What is wrong with the content of this robots.txt content as if I take out the disallow for Google bot then it again ignores the disallows for like events, calendar,etc ?

User-agent: *
Disallow: /cgi-bin/
User-agent: Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)
Disallow:/

User-agent: Slurp
Crawl-delay: 5

User-agent: dotbot
Disallow: /

User-agent: *
Crawl-Delay: 120
Disallow: /forum/ubbthreads.php?ubb=calendar
Disallow: /forum/ubbthreads.php/ubb/calendar
Disallow: /forum/ubbthreads.php?ubb=showday
Disallow: /forum/ubbthreads.php/ubb/showday
Disallow: /forum/ubbthreads.php?ubb=showprofile
Disallow: /forum/ubbthreads.php/ubb/showprofile
Disallow: /forum/ubbthreads.php?ubb=showmembers
Disallow: /forum/ubbthreads.php/ubb/showmembers
Disallow: /forum/ubbthreads.php?ubb=online
Disallow: /forum/ubbthreads.php/ubb/online
Disallow: /forum/ubbthreads.php?ubb=search
Disallow: /forum/ubbthreads.php/ubb/search
Disallow: /forum/ubbthreads.php?ubb=faq
Disallow: /forum/ubbthreads.php/ubb/faq
Disallow: /forum/ubbthreads.php?ubb=viewprivacy
Disallow: /forum/ubbthreads.php/ubb/viewprivacy
Disallow: /forum/ubbthreads.php?ubb=mycookies
Disallow: /forum/ubbthreads.php/ubb/mycookies
Disallow: /forum/ubbthreads.php?ubb=markallread
Disallow: /forum/ubbthreads.php/ubb/markallread
Disallow: /forum/ubbthreads.php?ubb=newuser
Disallow: /forum/ubbthreads.php/ubb/newuser

Using Google Webmaster tools website there is a throttle adjustment for crawler but I find it does not do much and if you do reduce the requests and increase a delay if you do not renew that function after 90 days then they go right back to flooding your domain with crawler requests sick

Also via this tools webpage and reviewing the robots.txt they say they ignore the Crawl-Delay: and it's value

I was looking at your file and I just noticed that your paths are incorrect.
/forum/ does not exist on your site.

/ubbthreads/ does.

So maybe you should try:
Disallow: /ubbthreads/ubbthreads.php?ubb=calendar
Disallow: /ubbthreads/ubbthreads.php/ubb/calendar
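
Here is a quick Python sketch of why the prefix matters, using the standard library's `urllib.robotparser`. Disallow rules are matched against the start of the URL path, so a rule beginning with /forum/ never touches a URL that begins with /ubbthreads/. The calendar URL below is assembled from your domain and the path I suggested, so treat it as an assumption, and remember Python's parser is only an approximation of what Google does.

```python
from urllib.robotparser import RobotFileParser

WRONG_PREFIX = [
    "User-agent: *",
    "Disallow: /forum/ubbthreads.php?ubb=calendar",
    "Disallow: /forum/ubbthreads.php/ubb/calendar",
]

RIGHT_PREFIX = [
    "User-agent: *",
    "Disallow: /ubbthreads/ubbthreads.php?ubb=calendar",
    "Disallow: /ubbthreads/ubbthreads.php/ubb/calendar",
]


def is_blocked(robots_lines, url, agent="Googlebot"):
    """Return True if the given robots.txt lines forbid `agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return not parser.can_fetch(agent, url)


# Assumed real location of the calendar on the site.
calendar_url = (
    "http://www.teamzr1.com/ubbthreads/ubbthreads.php"
    "?ubb=calendar&month=1&year=1916"
)

print("wrong prefix blocks it:", is_blocked(WRONG_PREFIX, calendar_url))
print("right prefix blocks it:", is_blocked(RIGHT_PREFIX, calendar_url))
```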


Blue Man Group
There is no such thing as stupid questions. Just stupid answers
Joined: Jun 2007
Posts: 286
Thanks. I had copied that content from an earlier post here and did not think of that. Interestingly, when I had Google Webmaster Tools test the robots.txt, it never failed:
-------------

Check to see that your robots.txt is working as expected. (Any changes you make to the robots.txt content below will not be saved.)
robots.txt file Downloaded Status
http://www.teamzr1.com/robots.txt 2 hours ago 200 (Success)
----------------------

I got rid of the /forum prefix and will see how Google does, then turn the calendar back on and see how that goes.

On another note, how come this forum does not allow users to add a file manager (image) to their posts?


JR
Team ZR-1 Corvette Racer's
Joined: Jun 2006
Posts: 1,344
When you test robots.txt in Webmaster Tools, it only checks that it gets a 200 response when accessing the robots.txt file, not for errors within it.

To find errors within robots.txt, look at the crawl errors; they will tell you where the errors are, like page not found, etc.

You're a forum owner; I guess the owners here don't want to allow uploads ;) I can't blame them. So much stuff builds up and just gets left, and it adds up over the years.

Joined: Dec 2003
Posts: 6,562
Likes: 78
BTW,
looking at the file you posted:
http://www.teamzr1.com/robots.txt
The paths are still incorrect.
They point at your root, not where the scripts run.
So Google may indeed exclude the paths stated, but will still attempt to read the actual paths to your calendar.

My 2 cents.


Blue Man Group
There is no such thing as stupid questions. Just stupid answers
Joined: Jun 2007
Posts: 286
After I got rid of the incorrect /forum prefix, Google picked up the update yesterday morning, and since then their spider requests for NON-post pages have gone down greatly.
They are still querying post content, so I have not lost visitors, and Google Webmaster Tools shows the crawl errors have stopped, so the robots.txt content must be OK.

I will wait 1 more day, then turn the calendar back on and monitor Google's spider and crawl activity.

Response time on the forum is now much better for the users.


JR
Team ZR-1 Corvette Racer's
