Greetings. I installed the latest 6.7.0 UBB and have enabled the "search engine friendly" URLs. Our site is mature and was pretty well represented in the search engines - all except for the forums. It's been over a month now and I cannot find any instances of threads from the new version in any search engine.
My question - is there something else to be done in UBB in order to encourage or allow the spiders to take the leap? Is there something in the UBB that's not allowing them to follow the path - I've got the forums linked to the main page with http://www.stripersonline.com/ubb547/ultimatebb.php which seems to me the most "spider friendly" option? Is there some .html link to use as the forum start page?
It simply doesn't seem that the board is "spider friendly" :)
Your site looks fine to me. It can take a while (up to a few months) for the spiders to see that there is new content to spider and get parts of the board rolling. Be sure that you're linking to either your ultimatebb.cgi page or your ultimatebb.php page from your main site.
Done both of them already :) The spiders will spider the profile links, but when you click on them, they come up 404. As to the robots.txt file, I've got it down to the bare minimum -
Very odd, I don't see any issues on it... How long ago did you upgrade to 6.7? I know that when I first got everything set up with 6.4 and used the spider mod it took a few months to even show something that looked good...
to your Header insert, and make sure you have some links to your forum, they crawl, not guess ;)
They should "spider" all links eventually ;) ... But not a bad suggestion.
Thanks Ian, I'll do that. As to the links to the forum, there are three on the index page - and one in the footer of each of about 700 pages on the site :)
But I even went one step further and made up a text page, a sort of TOC for the forums, listing each forum's name, the "spider friendly" URL directly to that forum, and its description. I did this for the spiders that might be a little stoopid ;)
Thanks guys, hopefully the spiders will march through and pick up some of the 70,000 threads and 3/4 million posts on our forums :)
lol, then in spider.php I just append what I want to get spidered and eventually it happens :x... No one has really noticed, although I believe that now al is probably bored enough to go take a peek at the file :/...
What I'm going to do eventually is just use a PHP include to read from a text file so that I can add more links to it on the fly; as it is, they're all hardcoded into the document.
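For anyone wanting to try the same thing, here's a minimal sketch of the idea, written in Python rather than PHP purely for illustration. The one-link-per-line `URL|title` format and the function names are assumptions made up for this example, not anything taken from spider.php.

```python
import html


def render_links(lines):
    """Turn 'URL|title' lines into HTML anchors, skipping blanks and # comments.

    The 'URL|title' layout is a hypothetical format chosen for this sketch.
    """
    anchors = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        url, _, title = line.partition("|")
        url = url.strip()
        title = title.strip() or url  # fall back to the URL as the link text
        anchors.append(
            '<a href="{}">{}</a>'.format(
                html.escape(url, quote=True), html.escape(title)
            )
        )
    return "\n".join(anchors)


def render_links_file(path):
    """Read the link list from a text file and render it."""
    with open(path, encoding="utf-8") as handle:
        return render_links(handle)
```

Adding a link then means adding one line to the text file, with no change to the page that includes the rendered output.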
Please be aware that turning on spider-friendly links can be a major pain if a spider comes through that does not know how to properly handle dynamic content.
Also be aware that forum number 1 on your board is a disaster waiting to happen - we advise keeping no more than a few thousand topics in a single forum. That forum has 24,000 topics... about eight times our recommended limit. I would highly suggest using the Mass Move tool to create an archive forum...
Originally posted by Charles Capps: [qb] Also be aware that forum number 1 on your board is a disaster waiting to happen - we advise keeping no more than a few thousand topics in a single forum. That forum has 24,000 topics... about eight times our recommended limit. [/qb]
So what you're saying is 330,640 posts in one forum is too many? ;)
Seriously, I know it's a lot more than recommended - I won't hold you guys to blame if it blows up :) But I do have a question - what are the potential problems with having too many threads/posts in one forum? What sort of explosions might be on the horizon? I'd forgotten about the limitations...now I'm scared again ;)
I have another spider question - how can I stop them from indexing email, profile, PM, edit and reply with quote sections of each and every post? That's how I'm seeing results like "http://www.stripersonline.com/cgi-bin/ubb_547C/ultimatebb.cgi?ubb=edit_post;f=1;t=032224;reply_num=000004;u=00001886" showing up - it's gotta be boring the snot outta the spiders ;)
They spider any links they see. They just have a lot of FYI pages indexed :P
The problem is that the spiders have to snag each and every little bit; your host may notify you about it ;) ... It also hinders performance of that section.
Originally posted by Gizmo: [qb] The problem is that the spiders have to snag each and every little bit; your host may notify you about it ;) ... It also hinders performance of that section.
Hey, more spider food the better man... [/qb]
I gave up on hosts back when forum #1 was only twice as large as Infopop says a forum should be allowed to get ;) Now Charles tells me it's 8X bigger than recommended - can you imagine what just that one forum would do to a virtual server account? :) Coupled with the 24 other forums, I don't think "host" is ever gonna be allowed in my vocabulary again ;)
About the "more spider food the better" - interesting - but from the perspective of someone searching Google and seeing that link above...when they click on it, in that example, they'd be taken directly into editing a post - it wouldn't work, but it would sure confuse the poor bastage ;)
Is there any way to put a meta/ignore kinda tag in the code so that those profile, PM, email, edit and reply with quote links would be ignored? Or are you gonna make me go hunt you down on UBBdev for a hack Gizmo? ;)
No hack on UBBDev for ignoring such stuff... BTW, the more food for the spiders, the more often they'll come lookin for more ;) ... At least in my experience. So long as you have SOME forums spidering correctly then it's going to work just fine, it'll just take time.
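Short of a hack, one thing that might be worth trying is the robots.txt file mentioned earlier in the thread. The sketch below is only that: it assumes the crawler treats each Disallow rule as a plain prefix matched against the path plus query string, and that the `ubb=` parameter always comes first, as it does in the edit_post URL quoted above. Only `edit_post` is taken from that URL; the parameter values for profile, email, PM and reply-with-quote links aren't given in this thread, so they're left out rather than guessed.

```python
SCRIPT = "/cgi-bin/ubb_547C/ultimatebb.cgi"  # path taken from the URL quoted in the thread

# One prefix per action to hide; only edit_post is known from the thread.
PREFIXES = [SCRIPT + "?ubb=edit_post"]


def build_robots_txt(disallow_prefixes, user_agent="*"):
    """Render a robots.txt body with one Disallow line per prefix."""
    lines = ["User-agent: {}".format(user_agent)]
    lines += ["Disallow: {}".format(prefix) for prefix in disallow_prefixes]
    return "\n".join(lines) + "\n"


def is_disallowed(path_and_query, disallow_prefixes):
    """Plain prefix matching: an assumption about how the crawler applies rules."""
    return any(path_and_query.startswith(prefix) for prefix in disallow_prefixes)


if __name__ == "__main__":
    print(build_robots_txt(PREFIXES))
```

Checking a few real URLs from the server logs against `is_disallowed` before publishing the file is a cheap way to make sure the thread pages themselves aren't caught by a rule that is too broad.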
The general problem with such a single large forum is with the sheer number of files.
UBB.classic is tied directly to the filesystem. It can only perform as fast as the operating system can open files.
Every time a file in that forum has to be opened, the operating system has to read the list of all 24,000 just to find the one that we're interested in. That might not be so bad when you're opening just one file... but in order to generate the topic list (i.e. the forum itself), between three and, oh, say, 50 files in that forum have to be opened up, and THAT can cause some major performance issues.
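To put rough numbers on that, here's a back-of-the-envelope sketch. It assumes the worst case described above, where every open reads the full directory list; filesystems with indexed directories won't behave this way, so treat the figures as an upper bound, not a measurement.

```python
def entries_examined(files_in_directory, files_opened):
    """Worst-case directory entries read if every open scans the whole list."""
    return files_in_directory * files_opened


if __name__ == "__main__":
    for topics in (3000, 24000):
        for opened in (3, 50):
            print(
                "{:>6} topics, {:>2} files opened: {:>9,} entries".format(
                    topics, opened, entries_examined(topics, opened)
                )
            )
```

Under that assumption, a topic-list request that opens 50 files looks at 1,200,000 entries in a 24,000-topic forum, against 150,000 in a 3,000-topic one.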
Now, this might not seem like too much of a problem at first... but our file locking scheme quickly makes it a problem.
6.0x (and to a lesser extent, the entire 5-series) had a major problem with file corruption. We eventually found that we had a number of race conditions in the code that were causing the corruption (among other things). Unfortunately, due to the nature of the race conditions and the file I/O we needed to do, we could not fix the problem directly. There would always be a chance of file corruption... that's just the way things worked.
So we developed a workaround - before doing any file operations, grab a lock on a central lock file... that central lock can be kept until we're done working. This way, only one UBB instance can operate on files at a single time. (There are actually multiple central locks - a global one, and one for each forum.)
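For readers who haven't seen this pattern, here's a minimal sketch of a central lock file in Python. It is not UBB's actual code: the lock file names, the helper names and the use of `flock` (Unix-only) are all assumptions chosen for illustration.

```python
import fcntl
import os
from contextlib import contextmanager


def forum_lock_path(lock_dir, forum_number):
    """Hypothetical naming scheme: one lock file per forum."""
    return os.path.join(lock_dir, "forum_{}.lock".format(forum_number))


@contextmanager
def central_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    handle = open(lock_path, "a")  # create if missing, never truncate
    try:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until any other holder releases
        yield handle
    finally:
        fcntl.flock(handle, fcntl.LOCK_UN)
        handle.close()


def lock_is_free(lock_path):
    """Probe without blocking: True if nobody currently holds the lock."""
    with open(lock_path, "a") as probe:
        try:
            fcntl.flock(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False
        fcntl.flock(probe, fcntl.LOCK_UN)
        return True
```

All file work for a request would go inside `with central_lock(...):`. Anything else that asks for the same lock waits at the `flock` call until the block exits, which is exactly the serialization described above.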
But, as I said, we're completely tied to the filesystem, so a single hung request that has the central lock file will block every single subsequent request.
So, let's say it takes two seconds to open a single file in forum 1... and let's say that we need to open three files. A single request to that forum would require at least six seconds, during which no other requests can be made.
That creates a major performance issue, and it's why we advise keeping no more than 3,000 topics in an active forum - going far beyond that could easily bring the board to a halt. (We also advise keeping no more than 20,000 members... 10,000 or fewer, if possible.)
But wait, it gets worse. :)
In addition to the performance problem, we have noticed that the chance of file corruption increases as the file count in a directory increases. We believe this to be a side-effect of the performance problem, but we've never managed to catch the corruption as it's happening.
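The arithmetic from the two-second example can be sketched as below. The two seconds per file and three files come from the example above; the queue of ten simultaneous requests is an assumption added here to show how the wait stacks up.

```python
def lock_hold_seconds(files_opened, seconds_per_open):
    """How long one request keeps the central lock."""
    return files_opened * seconds_per_open


def finish_time_for_nth_request(position_in_queue, hold_seconds):
    """Seconds until the request at a 1-based queue position is done,
    assuming all requests arrive together and are served one at a time."""
    return position_in_queue * hold_seconds


if __name__ == "__main__":
    hold = lock_hold_seconds(files_opened=3, seconds_per_open=2)
    print("one request holds the lock for", hold, "seconds")
    print("the 10th queued request finishes after",
          finish_time_for_nth_request(10, hold), "seconds")
```

Because the lock serializes everything, the delay grows linearly with the number of waiting requests: six seconds for one request becomes a full minute for the tenth.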
In any case, those topics are at risk of being corrupted to the point of being unrecoverable...
So, in summary: Lots of topics in a busy forum -> poor performance -> higher potential for corruption.
Thanks Charles, that helps me understand a number of things that have happened in the past :)
Question - If I "mass move" 20,000 threads:
#1 - Won't I need to create and use about 7 new archive forums? Or does the suggested 3K limit not apply to archived threads?
#2 - If I leave the archived threads in Forum 1 but archived so they can only be read, will that change any of the potential corruption or performance issues?
I must say, even having close to 30,000 threads in that one Forum, our forum has always been much, much faster than any other surf fishing website. New members always rave about how fast it is :) I imagine that's due to being on a dedicated server that does pretty much nothing else but run those forums :)
The 3,000 limit is generally for active forums. It's safe to have a read-only archive that is much larger.
In fact, what you might want to do to save lots of time would be to use the mass move tool to move *RECENT* topics into a new forum, then close the existing one as the read-only archive. Moving all those topics to another forum would probably take forever and a day... and honestly, I hate to say it, but I don't even know if the mass move tool will be able to cope with trying to move that many topics at once. (So if you do go with the move-into-archive idea, doing it in smaller chunks would be a wise idea.)
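The "smaller chunks" idea can be sketched as below. The thread doesn't say whether the Mass Move tool can be driven by a script, so `move_batch` is a hypothetical stand-in for whatever actually moves one batch, and the chunk size of 500 in the usage note is an arbitrary example, not a recommendation.

```python
def chunked(topic_ids, chunk_size):
    """Yield consecutive slices of topic_ids, each at most chunk_size long."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(topic_ids), chunk_size):
        yield topic_ids[start:start + chunk_size]


def move_in_chunks(topic_ids, chunk_size, move_batch):
    """Call move_batch (a hypothetical callable) once per chunk.

    Returns the number of batches processed.
    """
    batches = 0
    for batch in chunked(topic_ids, chunk_size):
        move_batch(batch)
        batches += 1
    return batches
```

With 20,000 topics and a chunk size of 500, that works out to 40 separate moves, each small enough to check before starting the next.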
Just to clarify on terms... when I speak of an archive, I speak of a forum that has been marked read-only.
There are two other possible definitions for archive. The first is a 5-series archive, which is like a forum, but completely different. All content in a 5-series archive is completely read only, with no way to delete or edit. I generally compare them to black holes - what goes in never comes back out ... until you upgrade to the 6-series, at least.
The second definition would be the way UBB.x/Eve handles archived topics. UBB.x/Eve plans have message storage limits (due to database size and speed), and admins have the ability to move messages out of the database and into "archived" states. UBB.x/Eve archived topics & posts exist inside the original forum, unlike both 5-series archives and 6-series read-only forums.
So, when I speak of archiving, I speak of moving topics into a read-only forum.