Hi, I'm the admin of the system that pianoworld is referring to.

During a system backup, there's a fair bit of file IO going on, and of course the forums are still up and running as usual. If there's a lot of activity (say a few forum searches in the large forum system), things can get very heavily loaded and slow for short periods of time. Sometimes, however, it seems to stop completely.

I watched this happen today, so I stopped the backup, checked for anything else on the server using IO and killed it, and the system was completely idle. The main website (outside UBB) was peppy, and mysql on the command line was very fast, even for complex queries. However, the forums were still not working and kept generating that error.

Only after stopping and restarting apache and mysql did things recover. (I'm thinking stopping mysql didn't do anything, and that it was apache.) My guess is there's a connection pool to mysql that UBB uses, and it got filled with a number of requests that wedged while the server was busy for a short period, so that all other queries then piled up behind them.

The most curious thing is that this has been happening off and on for the last few weeks, but the system eventually recovers. I'm wondering how it does, and when: what is it waiting for when the system has already been idle for several minutes before it recovers? I don't see any crontabs installed that would do some regular 'cleanup' operation... So what's causing it to unwedge, and why doesn't it happen faster once load dissipates?

(Obviously this connection pooling idea is my personal theory with no evidence; I've just seen it with other applications in the past. I'm wondering what I should do/run/look at to verify that's what's happening.)
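
One check I can think of is to capture the output of SHOW FULL PROCESSLIST while the forums are wedged and see whether the connections are sitting idle or stuck in long-running statements. Below is a rough Python sketch of how I might summarize that output. It assumes the rows have already been fetched by whatever MySQL client is handy and turned into dicts keyed by the standard processlist columns (Id, User, Command, Time, State, Info). The function name, the 60-second threshold, and the sample rows are made up for illustration; none of it comes from UBB itself.

```python
from collections import Counter


def summarize_processlist(rows, stuck_after=60):
    """Summarize rows shaped like SHOW FULL PROCESSLIST output.

    rows: list of dicts with at least the 'Command' and 'Time' columns.
    stuck_after: hypothetical threshold in seconds; tune to taste.
    Returns (count of connections per Command, list of long-running rows).
    """
    by_command = Counter(row["Command"] for row in rows)
    # 'Sleep' means the connection is idle, so only non-sleeping
    # connections that have been in their state a long time count as stuck.
    stuck = [
        row for row in rows
        if row["Command"] != "Sleep" and int(row["Time"] or 0) >= stuck_after
    ]
    return by_command, stuck


# Made-up sample rows, only to show the shape of the input.
sample = [
    {"Id": 11, "User": "ubb", "Command": "Sleep", "Time": 300, "State": "", "Info": None},
    {"Id": 12, "User": "ubb", "Command": "Query", "Time": 240, "State": "Locked", "Info": "SELECT ..."},
    {"Id": 13, "User": "ubb", "Command": "Query", "Time": 2, "State": "Sending data", "Info": "SELECT ..."},
]

by_command, stuck = summarize_processlist(sample)
print(dict(by_command))
for row in stuck:
    print(row["Id"], row["Time"], row["State"])
```

If the total connection count is at or near max_connections and most of them are from the forum user, that would at least be consistent with my theory. If the list is nearly empty while the forums are still failing, the problem is more likely on the apache side.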

Thanks.

-math