|
Joined: Dec 2006
Posts: 1,235
veteran
Since the WOL now has mouseover info, I've seen quite a few visits from Indy Library and BecomeBot, amongst others. Apparently Indy Library is a spam bot / e-mail harvester from China. After reading through a few sites about these bots I've found some useful info on how to block them. Listed bots receive a 403 Forbidden error when they try to view your site, which can save a noticeable amount of bandwidth and server resources.
How to block them: http://www.javascriptkit.com/howto/htaccess13.shtml
List of bad bots: http://willmacc.wordpress.com/2006/11/15/updated-bots-to-block/
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
It's fairly simple: list a PHP script in your robots.txt as not to be crawled; have that script write the IP of anyone accessing it to a flat file; then set up your site to block all access from any host in that file...
Simply put, any "good" bot should not go to any script your robots.txt marks as off limits, so anyone hitting said PHP script is a "bad bot" (or some idiot who went to a URL that someone tricked them into going to lol)
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
robots.txt
/btra/index.php
<?php
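// extract() copies the $_SERVER entries (REMOTE_ADDR, REQUEST_URI, ...) into plain variables
if (version_compare(phpversion(), "4.2.0", ">=")) {
extract($_SERVER);
}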
?>
<html>
<head>
<title>Bad Bot!</title>
</head>
<body>
<span style="font-weight: bold;">There is nothing here to see. So what are you doing here?</span><br />
<a href="../">Go home</a><br />
<?php
$badbot = 0;
/* scan the blacklist.dat file for addresses of SPAM robots to prevent filling it up with duplicates */
$filename = "../blacklist.dat";
$fp = fopen($filename, "r") or die ("Error opening file ... <br />\n");
while ($line = fgets($fp,255)) {
$u = explode(" ",$line);
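// exact match on the logged address; ereg() was removed in PHP 7 and also matched partial addresses
if (trim($u[0]) === $REMOTE_ADDR) { $badbot++; }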
}
fclose($fp);
if ($badbot == 0) {
/* we just see a new bad bot not yet listed ! */
/* send a mail to hostmaster */
$tmestamp = time();
$datum = date("Y-m-d (D) H:i:s",$tmestamp);
$from = "badbot-watch@domain.tld";
$to = "hostmaster@domain.tld";
$subject = "domain-tld alert: bad robot";
$msg = "A bad robot hit $REQUEST_URI $datum \n";
$msg .= "address is $REMOTE_ADDR, agent is $HTTP_USER_AGENT\n";
/* See, I don't want mail; so lets disable it...
mail($to, $subject, $msg, "From: $from");
*/
/* append bad bot address data to blacklist log file: */
$fp = fopen($filename,'a+');
fwrite($fp,"$REMOTE_ADDR - - [$datum] \"$REQUEST_METHOD $REQUEST_URI $SERVER_PROTOCOL\" $HTTP_REFERER $HTTP_USER_AGENT\n");
fclose($fp);
}
?>
</body>
</html>
blacklist.dat (chmodded 777); a blank file (for now)
blacklist.php (include on every page to block users defined in blacklist.dat)
<?php
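// extract() copies the $_SERVER entries (REMOTE_ADDR, REQUEST_URI, ...) into plain variables
if (version_compare(phpversion(), "4.2.0", ">=")) {
extract($_SERVER);
}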
$badbot = 0;
/* look for the IP address in the blacklist file */
// $filename = $_SERVER["DOCUMENT_ROOT"] . "/blacklist.dat";
$filename = "blacklist.dat";
$fp = fopen($filename, "r") or die ("Error opening file ... <br />\n");
while ($line = fgets($fp,255)) {
$u = explode(" ",$line);
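// exact match on the logged address; ereg() was removed in PHP 7 and also matched partial addresses
if (trim($u[0]) === $REMOTE_ADDR) { $badbot++; }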
}
fclose($fp);
if ($badbot > 0) {
/* this is a bad bot, reject it */
sleep(20);
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title>You've been Blocked</title>
</head>
<body>
<div align="center">
<h1>Welcome...</h1>
Unfortunately, due to abuse, this site is not available to you.<br />
If you feel that your ban is in error, please send an email to the hostmaster of this site.<br />
If you're an anti-social, ill-behaving SPAM bot, please just go away.
</div>
</body>
</html>
<?php
exit;
}
?>
|
|
|
|
Joined: Dec 2006
Posts: 1,235
veteran
And why is that simple? I can just about manage HTML and CHMOD!
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
|
|
|
|
Joined: Aug 2006
Posts: 1,649 Likes: 1
Pooh-Bah
"blacklist.php (include on every page to block users defined in blacklist.dat)"
That code is to be placed on every page, for instance, in Threads?? That's too much work for me...
GangsterBB.NET (Ver. 7.6.1.1) PHP Version 5.6.40 / MySQL 5.7.23-23 (was 5.6.41-84.1) / Apache 2.4.54 2007 Content Rulez Contest - Hon Mention UBB.classic 6.7.2 - RIP
|
|
|
|
Joined: Nov 2006
Posts: 3,095 Likes: 1
Carpal Tunnel
The .htaccess method may or may not work depending on your host provider and how they have you set up.
The php method Gizmo shows should work for everyone.
|
|
|
|
Joined: Dec 2006
Posts: 1,235
veteran
OK, I'm giving it a try but I'm just getting "Error opening file ... "
I have:
/btra (775)
/btra/index.php (644)
/btra/blacklist.dat (777)
and blacklist.php pasted into UBB Default Footer. Should it be in /btra and called by an include?
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
"That code is to be placed on every page, for instance, in Threads?? That's too much work for me..."
No, an include of that file goes on every page, which in Threads is as simple as adding it to your header include as:
include("/path/to/blacklist.php");
And htaccess will only work in Apache, not to mention you'd have to make manual updates, whereas mine updates everything automatically with no work needed once it's installed.
blacklist.dat and blacklist.php go in your web root. The include of blacklist.php goes in the header include on your UBB.
Example:
/btra/ (755)
/btra/index.php (644)
/blacklist.dat (666 or 777)
/blacklist.php (644)
Your header will hold the include of your blacklist.php file. It needs to be in the header because it stops the page from loading if the user's IP is blacklisted; in the footer it would only stop after the page has loaded, which would be pointless.
|
|
|
|
Joined: Dec 2006
Posts: 1,235
veteran
So, this might sound like a stupid question but does it only work for .php files?
What I'm saying is, do I have to change my existing .shtml / .html files to .php?
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
Yes, it only works with PHP pages, BUT you can parse your shtml/html files through PHP with:
.htaccess:
AddType application/x-httpd-php .html .htm .shtm .shtml
(a benefit of this is if you have a phpaccelerator running)
|
|
|
|
Joined: Nov 2006
Posts: 3,095 Likes: 1
Carpal Tunnel
FYI: it will put more load on your server, as it will now parse EVERY html page for script information even if there is none.
(note: for most users though the hit would be very minimal and probably not noticed)
Last edited by ntdoc; 01/26/2007 5:00 PM.
|
|
|
|
Joined: Aug 2006
Posts: 1,649 Likes: 1
Pooh-Bah
Thanks, Giz - works great! But just to clarify for others, add this in your header:
<?php
include("/blacklist.php");
?>
|
|
|
|
Joined: Dec 2006
Posts: 1,235
veteran
That doesn't work for me. I have to put the full URL:
<?php include('http://www.website.com/blacklist.php'); ?>
I tried it with header.php and it didn't display without the full URL.
Edit: I've changed my .shtml pages to .php
|
|
|
|
Joined: Dec 2006
Posts: 1,235
veteran
When you put the include into the header, should you be able to see any code in "view source"? Because I can't.
|
|
|
|
Joined: Aug 2006
Posts: 1,649 Likes: 1
Pooh-Bah
No, the PHP code won't show up when you View Source
|
|
|
|
Joined: Aug 2006
Posts: 1,649 Likes: 1
Pooh-Bah
Come to think of it, maybe it's not entirely working. I assumed it was cuz I got the "go away" message when I went to /btra directly, and saw my IP address added to the dat file. But I still have access to my forums. I've tried both "http://www.website.com/blacklist.php" and "/blacklist.php" in the include.
|
|
|
|
Joined: Dec 2006
Posts: 1,235
veteran
Yes, I'm in exactly the same position. If I access a normal page it blocks me but the forum doesn't.
I've even tried pasting the code into header.tpl before <!DOCTYPE html PUBLIC...... but that doesn't work either.
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
"That doesn't work for me. I have to put the full URL: <?php include('http://www.website.com/blacklist.php'); ?>"
Path, not URL; I don't think it is capable of working with a URL...
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
Use the full server path to the file in the include.
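A minimal sketch of what "full server path" could look like, assuming blacklist.php sits in the web root as in the layout given earlier in the thread. The helper name `blacklist_include_path` is made up for illustration. Worth knowing: including by URL (where the host allows it at all) fetches the script's output over HTTP instead of running it inside your page, so its `exit` can never stop your own page from loading.

```php
<?php
// Hypothetical helper: builds the filesystem path to blacklist.php from
// the document root, so the include does not depend on which directory
// the including script lives in.
function blacklist_include_path($docRoot, $file = 'blacklist.php')
{
    return rtrim($docRoot, '/\\') . '/' . $file;
}

// Usage in the header include. This only does anything under a web
// server, where DOCUMENT_ROOT is set to a non-empty value.
if (!empty($_SERVER['DOCUMENT_ROOT'])) {
    $path = blacklist_include_path($_SERVER['DOCUMENT_ROOT']);
    if (is_readable($path)) {
        include $path;
    }
}
```

If your host reports a DOCUMENT_ROOT that differs from where the files really live, hard-coding the path your host gives you works just as well.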
|
|
|
|
Joined: Aug 2006
Posts: 1,649 Likes: 1
Pooh-Bah
"do the full server path to the file in the include."
Okay, using the full server path did it. Now I cannot access my forums (it works!) -- but it says "Error opening file..." What file? Any other permissions we need to set?
|
|
|
|
Joined: Aug 2006
Posts: 1,649 Likes: 1
Pooh-Bah
Dude - I couldn't access my forums after that, even after clearing the .dat file! I had to comment-out the include to get back in (for now)...
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
lol odd; myself I don't use it on my forums, I use it everywhere else. You may want to check that blacklist.dat (or whatever you named it) exists and can be written to; I have mine at 777.
The only difference between what I'm using and what you'd be using is that you're using the full path to the file whereas I use the relative path to it, but that shouldn't affect it at all...
I'll take a gander at it later when I'm at my PC; just had the telephone repair guy out and I can't get to my computer
|
|
|
|
Joined: Aug 2006
Posts: 1,649 Likes: 1
Pooh-Bah
Well, commenting the include out worked fine for FF, but I still cannot get my site to load here with IE7 -- not only that, when I couldn't access my site before, NO ONE could! It took the whole thing down somehow... lol
Edit: Okay, rebooting my computer w/ IE7 got me back in!
Last edited by jgeoff; 01/26/2007 9:18 PM.
|
|
|
|
Joined: Aug 2006
Posts: 1,649 Likes: 1
Pooh-Bah
Oh yeah, blacklist.dat is 777 -- do the others need any special permissions? And why would it take my entire site down like that?
|
|
|
|
Joined: Jun 2006
Posts: 811
old hand
Anyone have a good list of bot IPs that I can use for my .htaccess file? This list is probably so darned long if there is one out there.
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
lol jgeoff i'll look at integrating it into the ubb, it should be as simple as including that one file though; not sure why it'd block everyone from accessing anything though...
the only special perm needed is on the .dat file.
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
You may try updating the path to the dat file in both /btra/index.php and blacklist.php to be a full path; that is likely the problem, but I'm not completely sure offhand.
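A possible reason for the "Error opening file ..." message, offered as a guess rather than a diagnosis: a relative path passed to fopen() is resolved against the working directory of the page being served, not against the folder blacklist.php lives in, so the same "blacklist.dat" can point somewhere else once the include is pulled in from the forum directory. A rough sketch of anchoring the path to the script's own folder instead; `blacklist_data_path` is a made-up helper name.

```php
<?php
// Hypothetical helper: resolve the data file relative to a given
// directory rather than the current working directory.
function blacklist_data_path($baseDir, $file = 'blacklist.dat')
{
    return rtrim($baseDir, '/\\') . DIRECTORY_SEPARATOR . $file;
}

// In blacklist.php (web root): the data file sits next to this script.
$filename = blacklist_data_path(dirname(__FILE__));

// In /btra/index.php the data file sits one level up, so it would be:
// $filename = blacklist_data_path(dirname(dirname(__FILE__)));
```

dirname(__FILE__) is used rather than a newer shorthand so the sketch also runs on old PHP versions.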
|
|
|
|
Joined: Dec 2006
Posts: 1,235
veteran
Tried that, doesn't work. Works fine with single pages, it's just the UBB it doesn't like.
"myself i dont use it on my forums"
So what's the point in having it then? I would have thought that the whole idea of implementing this script was to stop bots completely.
|
|
|
|
Joined: Dec 2006
Posts: 1,235
veteran
Another thing that bothers me about this is what if the bad bots do take notice of the robots.txt and don't visit the /btra directory? They'll just carry on zapping your resources anyway.
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
"So what's the point in having it then? I would have thought that the whole idea of implementing this script was to stop bots completely."
I've used the script for like 3+ years, but I've only been running Threads for one (I'm from UBB.C, we had no such luxuries of PHP/MySQL lol); I just haven't looked into crafting it for use in Threads (although honestly I'm unsure why it would fail as it does). I'm currently on a project with another site but will definitely look into why this isn't working within Threads with the header include (you may be able to just place the code in ubbthreads.php).
"Another thing that bothers me about this is what if the bad bots do take notice of the robots.txt and don't visit the /btra directory? They'll just carry on zapping your resources anyway."
Most bots scan every page except those in your robots.txt's deny list; the bad bots will generally try to hit these asap.
|
|
|
|
Joined: Dec 2006
Posts: 1,235
veteran
"Anyone have a good list of bot IPs that I can use for my .htaccess file? This list is probably so darned long if there is one out there."
http://willmacc.wordpress.com/bot-ips/
|
|
|
|
Joined: Jun 2006
Posts: 811
old hand
|
|
|
|
Joined: Dec 2003
Posts: 6,568 Likes: 78
I just can't seem to modify this for global site use. So here is what I am attempting to do. I have never used the Threads 6.x series, so I have never had the file addpost_newpoll.php. Currently I block anyone that tries to access this file in htaccess:
<Files addpost_newpoll.php>
order allow,deny
deny from all
</Files>
This works fine, since when they try to access this file in any folder they get a 403. But since I know they are up to no good, I would like to block them from any other access, even a legit URL. And this file always seems to be attempted at various random folders. So how could I modify this to add the offending IP to the blacklist for the whole site?
Blue Man Group There is no such thing as stupid questions. Just stupid answers
|
|
|
|
Joined: Feb 2007
Posts: 1,294 Likes: 2
Veteran
The problem with adding an IP address is that you may get hit by 20 different IPs per day, then the next day by a totally different set of IPs for the bots. By the time they go back to the first set of IPs you may have 200 or more entered in the .htaccess file.
The best route is to block them if they did not arrive with a referrer from another page on the site for the URL they are trying to access directly. If I only knew how to do that, it would be a great thing.
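One rough way the referrer idea could be sketched in PHP, assuming the check runs at the top of a page that should only ever be reached through links on your own site. `has_internal_referer` is a made-up name. Keep in mind that the Referer header is optional and trivially forged, and some browsers and proxies strip it, so this is a heuristic that will also turn away some legitimate visitors.

```php
<?php
// Hypothetical helper: true when the Referer header points at a page on
// the same host as the request. The host comparison ignores case.
function has_internal_referer($referer, $host)
{
    if ($referer === null || $referer === '') {
        return false; // direct hit, or the client sent no Referer
    }
    $refHost = parse_url($referer, PHP_URL_HOST);
    return is_string($refHost) && strcasecmp($refHost, $host) === 0;
}

// Usage sketch (assumes HTTP_HOST carries no port number):
// $ref = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
// if (!has_internal_referer($ref, $_SERVER['HTTP_HOST'])) {
//     header('HTTP/1.1 403 Forbidden');
//     exit;
// }
```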
|
|
|
|
Joined: Dec 2003
Posts: 6,568 Likes: 78
I understand, and I do on occasion open the IPs back up. But in this case I see someone accessing a PHP file which was never on my site. So I would prefer to block the IP completely, for a while at least. It is just a lot of pushups to keep adding IPs to the htaccess file.
|
|
|
|
Joined: Mar 2008
Posts: 326
Enthusiast
You could use the solution presented by Gizmo in the 3rd post, but name the btra/index.php file to addpost_newpoll.php; if anyone accesses this file, their IP gets blocked.
|
|
|
|
Joined: Jun 2006
Posts: 16,304 Likes: 116
"You could use the solution presented by Gizmo in the 3rd post, but name the btra/index.php file to addpost_newpoll.php; if anyone accesses this file, their IP gets blocked."
:snicker: they read what I post!
|
|
|
|
Joined: Dec 2003
Posts: 6,568 Likes: 78
Yes, I could just change the file name to addpost_newpoll.php, but that is a problem when most times the URL the file is requested at is in a folder that does not exist at all. Even if I add an include in every page and in every folder and in the UBB, I still have the spammers hunting with bogus addresses for that file. I'll figure it out. Maybe a URL rewrite in htaccess to the file.
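A rough sketch of how the trap could catch that file name in any folder, assuming the server is set up (through a rewrite rule or a custom error page, not shown here) to hand requests for missing URLs to a single PHP script. `is_trap_request` is a made-up helper; the script would run the same logging code as /btra/index.php whenever it returns true.

```php
<?php
// Hypothetical helper: true when the requested path ends in the trap
// file name, whatever folder the request claims it is in.
function is_trap_request($requestUri, $trapFile = 'addpost_newpoll.php')
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    return is_string($path) && strcasecmp(basename($path), $trapFile) === 0;
}

// Usage sketch inside the catch-all script:
// if (is_trap_request($_SERVER['REQUEST_URI'])) {
//     // append $_SERVER['REMOTE_ADDR'] to blacklist.dat here
// }
```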
|
|
|
|