|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
The search feature in our new forum doesn't seem to work real well. For example, if I search for baldeagl I get zero hits. If I search for baldeagle, I get plenty of hits. If I search for baldeagl*, I get plenty of unrelated hits. A member was searching for "MAR-K" and getting no hits even though there's a post in one forum the subject of which is "ATTN MAR-K". But searching for "mar-ktech" (his username) produced many hits, including the subject line above. Is there something I need to enable? Or do I need to run something to get the indexing working? If you want to test some searches, our board is The Stovebolt Forum
|
|
|
|
Joined: Aug 2006
Posts: 1,358
Veteran
|
Veteran
Joined: Aug 2006
Posts: 1,358 |
ubb.threads currently utilizes the full text index for searches, and while that is blazingly fast, it also comes with limitations.
|
|
|
|
Joined: Jun 2006
Posts: 16,299 Likes: 116
|
Joined: Jun 2006
Posts: 16,299 Likes: 116 |
I believe that Rick is planning on returning the "old style search" for those users who have difficulties with the FullText searches. I'm not sure when it'll be added, nor if it'll work along side the full text (but i believe it will be admin selectable on which behavior to use)
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
Are there any tricks to get the full text search to work better?
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
I've done some more testing on the search function and it is, to say the least, erratic. For example, if I set the search range to 5 years and search for door, I get posts going back to 10/12/2007 only. If I search for bed, I get zero hits (and we have an entire forum just for beds.) If I search for baldeagl, I get zero hits, but if I search for baldeagle, I get hits going back to 2004, but only 75 (and I have over 1800 posts during that period.)
Is there anything I can do to improve this? Or are we stuck with it until the new search function is coded up? Is it a function of the posts being in the database? Having been imported rather than natively added?
Throw me a bone, guys.
|
|
|
|
Joined: Aug 2006
Posts: 1,358
Veteran
|
Veteran
Joined: Aug 2006
Posts: 1,358 |
By default, words less then 4 letters are not indexed by the full text index. If you have server access, you could bring that down to 3 characters.
It's not my field of expertise, but I think fulltext only searches for "whole words", so that would exclude baldeagl if the word in the post is actually baldaegle. But that's a guess.
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
I've already got it set to 2 characters.
|
|
|
|
Joined: Aug 2006
Posts: 1,358
Veteran
|
Veteran
Joined: Aug 2006
Posts: 1,358 |
Did you rebuild the search index after that (I guess dropping and recreating the index). Repairing the posts table would do that repair table ubbt_POSTS quick
Last edited by blaaskaak; 11/04/2007 1:07 PM.
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
Doh! {{scrambles back into his burrow}}
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
OK, I'll bite. Where is the rebuild index function?
|
|
|
|
Joined: Aug 2006
Posts: 1,358
Veteran
|
Veteran
Joined: Aug 2006
Posts: 1,358 |
OK, I'll bite. Where is the rebuild index function? Like I said, repair the table, that rebuild them. Maybe an option for you to alter /scripts/dosearch.php and redo the queries to use LIKE instead of MATCH.
Last edited by blaaskaak; 11/04/2007 1:11 PM.
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
OK. I assume I have to close the board before running the repair table command, right?
|
|
|
|
Joined: Aug 2006
Posts: 1,358
Veteran
|
Veteran
Joined: Aug 2006
Posts: 1,358 |
no, not really. I never do. But that's me You could add that quick parameter, according to what I googled, that should do the trick.
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
After reading the mysql docs regarding MATCH IN BOOLEAN MODE, I think there's be a lot more to alter than just MATCH to LIKE. Not sure I'd want to tackle that without a lot of testing first, and that's what we pay the ubb folks to do when we buy their product, right? :-)
|
|
|
|
Joined: Aug 2006
Posts: 1,358
Veteran
|
Veteran
Joined: Aug 2006
Posts: 1,358 |
Yeah, that's right Oh well, if YOU really need to search something you can always create your very own custom sql query to find what you want
|
|
|
|
Joined: Apr 2007
Posts: 3,940 Likes: 1
Former Developer
|
Former Developer
Joined: Apr 2007
Posts: 3,940 Likes: 1 |
well the way the pre full text index (FTI) query did it was to just do it kinda like (no pun) this: assume you are looking for 'yarp' AND 'unit' the where clause would be : where post_body LIKE '%yarp%' AND post_body LIKE '%unit%' same idea applies to using OR and for the excluded words.. example: assume you are looking for 'yarp' AND 'unit' but NOT 'genius' the where clause would be : where (post_body LIKE '%yarp%' AND post_body LIKE '%unit%') AND post_body NOT LIKE '%genious%' ... so you could do it that way and replace the FTI functionality.. for small post tables, this might be fine ps: that last query always returns 0 results
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
My reading of the mysql docs says that the FULLTEXT search is much faster than matching with LIKE but it has to be done right or it doesn't work well. Also, there's some special gotchas when using php. For those of you who interface MySQL with PHP and wonder what the problem is with getting "exact phrases" working properly, here's the way to go.
$query= "SELECT code, category, header, date FROM articles WHERE MATCH (text,header,summary) AGAINST ('" . stripslashes (str_replace (""", "\"", ($_POST['keywords']))) . "' IN BOOLEAN MODE)";
PHP, or some setups or with some browsers, convert double quotes from POST data to their HTML-equivalents even without being asked to do that. The above will fix the issue. Stripslashes() is optional, I prefer to keep it in to keep things looking clean, though the \" doesn't seem to break the boolean literal search. But that post is old, so it may not still be relevant. I'm going to do some playing around with FULLTEXT searches and see if I can solve the problem. One thing is for certain, we have to have shorter character length search capabilities, because our members will definitely be searching for "bed" (among other short words.)
Last edited by Baldeagle; 11/04/2007 11:59 PM.
|
|
|
|
Joined: Apr 2007
Posts: 3,940 Likes: 1
Former Developer
|
Former Developer
Joined: Apr 2007
Posts: 3,940 Likes: 1 |
that's easily remedied.. just change your stop word length and or your actual stop word list (if you have access to your box via SSH)
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
While doing some testing I discovered that the ubbt_POSTS table did not have a FULLTEXT index. Is this not created when the tables are created? Is the index updated with each new post? Periodically? Not at all? (I created one using ALTER.) This seems rather critical to the proper functioning of searches.
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
Sirdude, I have no idea what you're referring to with "stop word". What is that and where do I find it?
|
|
|
|
Joined: Nov 2006
Posts: 3,095 Likes: 1
Carpal Tunnel
|
Carpal Tunnel
Joined: Nov 2006
Posts: 3,095 Likes: 1 |
|
|
|
|
Joined: Apr 2007
Posts: 3,940 Likes: 1
Former Developer
|
Former Developer
Joined: Apr 2007
Posts: 3,940 Likes: 1 |
While doing some testing I discovered that the ubbt_POSTS table did not have a FULLTEXT index. my bad, we are actually searching on default_post_body (the pre html marked up fiel -- which makes sense)
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
So the two columns that need to be indexed are POST_SUBJECT and POST_DEFAULT_BODY?
|
|
|
|
Joined: Apr 2007
Posts: 3,940 Likes: 1
Former Developer
|
Former Developer
Joined: Apr 2007
Posts: 3,940 Likes: 1 |
yes, you will see that they ARE already indexed. that is FULL text indexed.. now if you intend to go the non FTI route, you might then be suggesting that you index them normally.. and off the top of my head, i don't see that helping.. if anything it might hurt performance on the store.. since both are so random that the indexes b-tree would almost be meaningless.. take this as a simple analogy.. lets assume you have an index on user name.. ok. that's logical.. the b-tree for the index is probably going to be balanced.. so a query that is like "select * from blabla where username like 'b%'" will be very efficient. the index just goes to all last names starting with 'b' .... but you would make the index meaningless (or less significant), if you did a query like "select * from blabla where username like '%b%'" (any name with a 'b' in it), because the b-tree isn't setup that way. you are really causing the mysql optimizer to give up and give into a full table scan for results.. </geek sql mode> sorry for that, but feel free to have a go at it
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
How can I tell if they are indexed? I ran a "describe table" query, but I didn't see any info on indexing.
|
|
|
|
Joined: Apr 2007
Posts: 3,940 Likes: 1
Former Developer
|
Former Developer
Joined: Apr 2007
Posts: 3,940 Likes: 1 |
ideally in phpMyAdmin, you just click on the table name (left nav, when you select your DB0 and it shows all the fields.. then you can look below the field descriptions and there is a section about the indices used..
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
The syntax is SHOW INDEX FROM [tablename];
|
|
|
|
Joined: Apr 2007
Posts: 3,940 Likes: 1
Former Developer
|
Former Developer
Joined: Apr 2007
Posts: 3,940 Likes: 1 |
you don't have phpmyadmin ?
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
I don't use phpmyadmin. I login to mysql directly and type the commands that I want.
|
|
|
|
Joined: Apr 2007
Posts: 3,940 Likes: 1
Former Developer
|
Former Developer
Joined: Apr 2007
Posts: 3,940 Likes: 1 |
oooooh kay i like the console as much as the next guy, but probably use phpmyadmin for 80% of my work.. good luck
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
Here's what I've done so far. Added "ft_min_word_len=3" to the /etc/my.cnf file to allow three character word searches. Changed the character map so that dashes (-) are included in searches. Verified that the POST_SUBJECT and POST_BODY_DEFAULT tables are FULLTEXT indexed.
This corrected the two most pressing problems that I've had.
It seems that the character length entry in the Control Panel has no effect at all on searches. (Feature Settings/Search/Minimum Searchable Word Length:) Perhaps that's a bug?
|
|
|
|
Joined: Apr 2007
Posts: 3,940 Likes: 1
Former Developer
|
Former Developer
Joined: Apr 2007
Posts: 3,940 Likes: 1 |
you are correct. dosearch.inc.php only makes sure that if it's not set, that is is defaulted to 3 and then it's not used i verified for 7.2.2 and 7.3 as for FULLTEXT indices, i guess it's nice you verified.. but i think i indicated they were there.. maybe not good luck
|
|
|
|
Joined: Oct 2007
Posts: 361 Likes: 8
Enthusiast
|
Enthusiast
Joined: Oct 2007
Posts: 361 Likes: 8 |
Yes, you did, but I wanted to be sure. The search function now seems to be working quite well and is blazing fast. I've set the max timeframe to 10 years and the max hits returned to 1000, and a search that returns 700 hits took about 2 seconds. I assume that new posts are automatically indexed when they are inserted into the db?
|
|
|
|
Joined: Apr 2007
Posts: 3,940 Likes: 1
Former Developer
|
Former Developer
Joined: Apr 2007
Posts: 3,940 Likes: 1 |
yes, that's what an index is all about also, there is an improvement on the current FTI search coming for 7.3 where the timeframe is 'self JOINED' in to allow the mysql engine to include it in the optimization. FTI by definition/design only uses the one index and says 'screw you' to any other columns the might have indices. This pulls the time field in with the JOIN on itself. so your queries will just get faster
|
|
|
Bots
by Outdoorking - 04/13/2024 5:08 PM
|
|
|
|
|
|
1 members (Nightcrawler),
1,165
guests, and
234
robots. |
Key:
Admin,
Global Mod,
Mod
|
|
|
|