It's fairly simple: list a PHP script in your robots.txt as a disallowed path, have that script write the IP address of anyone who accesses it to a flat file, then configure your site to block all requests from any host listed in that file.
Simply put, any "good" bot will never request a URL your robots.txt says not to crawl, so anything hitting that PHP script is a "bad" bot (or the occasional person someone tricked into clicking the link, lol).
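A minimal sketch of the trap script (all names here are illustrative — `trap.php` and `blocked_ips.txt` are whatever you pick). Your robots.txt would contain `User-agent: *` followed by `Disallow: /trap.php`, and the script itself just logs the visitor's IP:

```php
<?php
// Hypothetical honeypot script, e.g. saved as "trap.php".
// robots.txt disallows this path, so only misbehaving clients land here.

$blocklist = __DIR__ . '/blocked_ips.txt';

// REMOTE_ADDR is set by the web server; fall back for CLI testing.
$ip = $_SERVER['REMOTE_ADDR'] ?? 'unknown';

// Append the offender's IP to the flat file; LOCK_EX prevents
// interleaved writes when several bots hit the trap at once.
file_put_contents($blocklist, $ip . PHP_EOL, FILE_APPEND | LOCK_EX);

// Send the bot away with nothing useful.
http_response_code(403);
echo 'Forbidden';
```

For the blocking step, one simple approach is a shared include at the top of every real page that rejects the request when `in_array($_SERVER['REMOTE_ADDR'], file($blocklist, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES), true)` is true; alternatively, feed the file to your firewall or web-server deny rules so PHP never even runs for blocked hosts.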