Previous Thread
Next Thread
Print Thread
Hop To
Joined: Dec 2005
Posts: 122
member
member
Joined: Dec 2005
Posts: 122
i am developing a new forum site which shall be a "subset" of an existing forum.

i want to mass delete all *topics* from the database that do not contain specified keywords.

clearly, this will require some mysql magic, with at least one JOIN.

can anyone point me in the right direction? guarantees not required; i know the warranty is null and void once i start stuffing around like this! i'm all nicely backed up, so i can afford a few catastophies ...



JakChat.com -- Forums for Indonesia's English-speaking community
Ubuntu-Indonesia.com -- Forums for Indonesia's Ubuntu Users
Joined: Jan 2010
Posts: 34
newbie
newbie
Joined: Jan 2010
Posts: 34
I'll try and do some testing to come up with something that will work. Since some posts in a topic might contain the keywords and some might not, I'm not quite sure on how this would be done with a single query. A small script might be in order, but will do some test queries when I get a chance.

Joined: Dec 2005
Posts: 122
member
member
Joined: Dec 2005
Posts: 122
thanks for your attention, scott. i hope you can come up with something!


JakChat.com -- Forums for Indonesia's English-speaking community
Ubuntu-Indonesia.com -- Forums for Indonesia's Ubuntu Users
Joined: Jan 2010
Posts: 34
newbie
newbie
Joined: Jan 2010
Posts: 34
Quick question. Are the keywords that you'll be looking for, would they be only in the initial post of the topic, or in any post of the topic?

Joined: Dec 2005
Posts: 122
member
member
Joined: Dec 2005
Posts: 122
they will be in any post within a topic.

so if any message within a topic contains one of the keywords, then i wish to have this topic retained in the message database. all other topics that do not contain one of the keywords have to be deleted.

this may be sufficiently complex to require a little program, which i can do using php. i just need to sort out the mysql part, as joining the topics and messages in UBB often confuses me.


Last edited by KuKuKaChu; 04/20/2010 7:59 PM. Reason: more info ...

JakChat.com -- Forums for Indonesia's English-speaking community
Ubuntu-Indonesia.com -- Forums for Indonesia's Ubuntu Users
Joined: Jan 2010
Posts: 34
newbie
newbie
Joined: Jan 2010
Posts: 34
Well, there is a way to do it in a single query, but it takes a long time to run even on a moderate amount of posts and topics, so it's probably best to just script it.

Couple ways to go about it. But, what you're probably going to need to do is just check one topic at a time. So, in your script, just select all of your topic ids, then loop through those and query the posts with that topic id, checking if those keywords exist, with a like '%keyword%' clause. If the posts in that topic don't contain the keywords, then you'd delete the topic from the topics table, and then all of the posts from the posts table with that topic id. No need for a join or anything fancy.

If you need more details, let me know.

Joined: Apr 2007
Posts: 3,940
Likes: 1
SD Offline
Former Developer
Former Developer
Joined: Apr 2007
Posts: 3,940
Likes: 1
i'd hijaack the content rebuilder and add another select case there...

that way you can leverage the looping part thru all the posts part right off the bat..

just add your own new special case there..

2c

Joined: Dec 2005
Posts: 122
member
member
Joined: Dec 2005
Posts: 122
ok. this is what i did:

Code
<?
$hostname = "localhost";
$database = "mydatabase";
$username = "myusername";
$password = "mypassword";
$jc = mysql_connect($hostname, $username, $password) or trigger_error(mysql_error(), E_USER_ERROR);
mysql_select_db($database, $jc); 

// get a list of unique topic numbers that contain the keywords
// your keywords will of course be different. hopefully.
$query = "SELECT DISTINCT TOPIC_ID FROM ubbt_POSTS WHERE 
					POST_BODY LIKE '%batam%' OR 
					POST_BODY LIKE '%bintan %' OR 
					POST_BODY LIKE '%karimun%'  OR 
					POST_BODY LIKE '%sentosa%' OR 
					POST_BODY LIKE '%nagoya%'  OR 
					POST_BODY LIKE '%tanjung pinang%'  OR 
					POST_BODY LIKE '%free trade zone%'  
					order by TOPIC_ID";
					
$list = mysql_query($query, $jc) or die(mysql_error($jc));

// stuff all the "good" topics into an array.
$tp = array();
while (($data = mysql_fetch_assoc($list))) $tp[] = $data['TOPIC_ID'];

// select all the post ids from the POSTS table
$query = "SELECT TOPIC_ID FROM ubbt_POSTS WHERE 1 ORDER BY TOPIC_ID";
$list = mysql_query($query, $jc) or die(mysql_error($jc));
$data = mysql_fetch_assoc($list);

$count = 0;
$keepcount = 0;
do { 
	$count++;
	$topic = $data['TOPIC_ID'];
	// if the topic number is not found in the list of "good" topics, then delete all posts with that topic id
	if (in_array($topic, $tp)) $keepcount++;
	else {
		mysql_query("DELETE FROM ubbt_POSTS WHERE TOPIC_ID=$topic", $jc) or die(mysql_error($jc));
	}  
} while ($data = mysql_fetch_assoc($list)); 

echo "<b><br />\nPosts: $count Posts Processed\n<br />\n$keepcount Posts Preserved\n<br /></b>\n";

// now we have to fix up the TOPICS table, which will contain lots of orphan topic ids; code is very similar to above
$query = "SELECT TOPIC_ID FROM ubbt_TOPICS WHERE 1 ORDER BY TOPIC_ID";
$list = mysql_query($query, $jc) or die(mysql_error($jc));
$data = mysql_fetch_assoc($list);

$count = 0;
$keepcount = 0;
do { 
	$count++;
	$topic = $data['TOPIC_ID'];
	if (in_array($topic, $tp)) $keepcount++;
	else {
		mysql_query("DELETE FROM ubbt_TOPICS WHERE TOPIC_ID=$topic", $jc) or die(mysql_error($jc));
	}  
} while ($data = mysql_fetch_assoc($list)); 

echo "<b><br />\nTopics $count Records Processed\n<br />\n$keepcount Topics Preserved\n<br /></b>\n";
?>

yes, it's rough and ready, but it does the job nicely, and doesn't take very long. mine took about 5 minutes to process 100,000 posts.

once finished, go into the control panel "Content Rebuilder" and execute Rebuild Posts, Rebuild Topics, Rebuild Forums, and Rebuild Post Counts.

i also found it useful at this stage to remove all users with 0 posts, using the control panel Member Management. then rebuild post counts again. (Prune Orphaned Private Topics should be run at this point, but for me this did not work! *bug*)

you may wish to Clear Cache at this point, just for good luck.

and there you have it. a new forum all ready to go. wink



JakChat.com -- Forums for Indonesia's English-speaking community
Ubuntu-Indonesia.com -- Forums for Indonesia's Ubuntu Users

Link Copied to Clipboard
ShoutChat
Comment Guidelines: Do post respectful and insightful comments. Don't flame, hate, spam.
Recent Topics
spam issues
by ECNet - 03/19/2024 11:45 PM
Looking for a forum
by azr - 03/15/2024 11:26 PM
Editing Links in Post
by Outdoorking - 03/15/2024 9:31 AM
Question on barkrowler and the like
by Mors - 02/29/2024 6:51 PM
Member Permissions Help
by domspeak - 02/27/2024 6:31 PM
Who's Online Now
1 members (Havenofsobriety), 522 guests, and 99 robots.
Key: Admin, Global Mod, Mod
Random Gallery Image
Latest Gallery Images
Los Angeles
Los Angeles
by isaac, August 6
3D Creations
3D Creations
by JAISP, December 30
Artistic structures
Artistic structures
by isaac, August 29
Stones
Stones
by isaac, August 19
Powered by UBB.threads™ PHP Forum Software 8.0.0
(Preview build 20230217)