In 7.5.8, the UBB.Threads developers changed the way "Spider-Friendly URLs" were generated. That change caused several problems.
1. Search engine indexing problems caused by multiple "?", "&", "/" and other special characters in the URL
EXAMPLE: https://www.ubbcentral.com/forums/ubbthreads.php/topics/255499/?_/&/?%3E%22&+?&am This link should lead to the topic, but UBBT 7.5.8 does not parse it.
2. Problems with how UBBT 7.5.8 decides whether a line of text is plain text or a URL that should be treated as a link, caused by the special characters now placed in the URLs.
This link has brackets ( [ and ] ) in it, but UBBT 7.5.8 doesn't know how to parse it, so it treats only part of it as a URL or ignores it entirely, just as it failed to recognize the URL string in item #1 as a link. https://www.ubbdev.com/forums/ubbthreads.php/topics/318145/%5B7.x%5D_Google_Adsense_Search.html https://www.ubbdev.com/forums/ubbthreads.php/topics/318145/[7.x]_Google_Adsense_Search.html
3. "Duplicate content" flags are raised in Google's Webmaster Tools dashboard because the URL of every topic changed while the content remained the same. It's not too serious as far as Google's search engine and its advanced content tools are concerned, but other spiders may flag you and down-rank your pages because of it.
4. Possible login Cookie problems for Internet Explorer 7 and older. https://www.ubbcentral.com/forums/ubbthreads.php/topics/255498/Re:_"Remember_me_on_e#Post255498
The short answer is that you should use a hyphen for your SEO URLs. Google treats a hyphen as a word separator, but does not treat an underscore that way. Google treats an underscore as a word joiner, so red_sneakers is the same as redsneakers to Google. This has been confirmed directly by Google themselves, including the fact that using dashes over underscores will have a (minor) ranking benefit.
Again, SEO URLs should use hyphens to separate words. Do not use underscores, do not try to use spaces, and do not smash all the words together intoonebigword. As of 2012, dashes are still the best way to optimize your SEO URLs.
A video by Matt Cutts answering the hyphen vs. underscore SEO URL question. Matthew "Matt" Cutts leads the Webspam team at Google, and works with the search quality team on search engine optimization issues.
Regarding the video: since your pages are now formatted differently ("brand new") with UBBT 7.5.8's "SEO-friendly" URLs, they should be using hyphens (-) rather than underscores (_).
6. Since UBBT 7.5.8, here are some other characters that get placed in the URL which should not be there: ? > < " & + | ! % # \ ^ { } = : ; @ $
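To make the hyphen recommendation and the character list above concrete, here is a minimal sketch in Python (not PHP, and not UBBT's actual make_ubb_url code). The function name slugify_title is hypothetical. It assumes that every run of punctuation, whitespace, or underscores should collapse into a single hyphen.

```python
import re

# Hypothetical helper for illustration only; not part of UBB.Threads.
def slugify_title(title: str) -> str:
    # Lowercase first; Python's str.lower() is Unicode-aware.
    lowered = title.lower()
    # Replace every run of non-word characters or underscores with one hyphen.
    # This removes characters such as ? > < " & + | ! % # [ ] from the slug.
    slug = re.sub(r"[\W_]+", "-", lowered)
    # Trim hyphens left over at either end.
    return slug.strip("-")


if __name__ == "__main__":
    print(slugify_title("[7.x] Google Adsense Search"))
    print(slugify_title('Re: "Remember me" on every_page?'))
```

Because the pattern is Unicode-aware, accented letters survive in the slug; percent-encoding them for the final URL would be a separate step.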
seems a shorter way to do it. maybe i missed something.. :shrug:
Yup, you did.
While I detailed what each line does (for the reader), the most important step comes FIRST: replace the non-breaking space (& nbsp;) and ampersand (& amp;) markup codes with dashes. Some browsers (Firefox) like to auto-insert these into copied URLs. You might also want to check templates/default/forum.tpl, because it does the same thing to URLs in the "last post" column. (For further reading regarding forum.tpl, see line 330 and below of libs/functions_forums.inc.php, at the comment about escaping the quotes in the subject.)
The code I posted/linked to recognizes this and fixes it.
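A rough Python sketch of why the entity replacement has to come first. The function names are made up for illustration and this is not the PHP from the linked fix. It assumes the later step replaces every non-alphanumeric run with a hyphen.

```python
import re

# Hypothetical names; a sketch of the ordering, not the actual UBBT patch.
def slug_entities_first(title: str) -> str:
    # Step 1: turn the HTML entities into hyphens before anything else.
    for entity in ("&nbsp;", "&amp;"):
        title = title.replace(entity, "-")
    # Step 2: collapse remaining non-word runs (and underscores) into one hyphen.
    return re.sub(r"[\W_]+", "-", title.lower()).strip("-")


def slug_without_entity_step(title: str) -> str:
    # Skipping step 1 strips "&" and ";" but leaves the entity names behind.
    return re.sub(r"[\W_]+", "-", title.lower()).strip("-")


if __name__ == "__main__":
    sample = "Tips&nbsp;&amp;&nbsp;Tricks"
    print(slug_entities_first(sample))
    print(slug_without_entity_step(sample))
```

Without the first step, the words "nbsp" and "amp" end up inside the URL as if they were part of the title.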
In addition, since $title is already limited in the newtopic template to 50 characters (IIRC) -- and since long URLs are split at 40 & -15 (line 566 of libs\bbcode.inc.php), running a substr on it again may be redundant.
One possible improvement to how UBBT handles internal URLs, both for the forum reader and for SEO strategy: if a link points to the same domain and UBBT forum, display it as human-readable text, such as "[This Topic We Are Discussing]", rather than as a bare URL with an ellipsis halfway through it and only the first part of the URL's title/description.
To go one step further, if the link is on the same domain, do not add the "nofollow" rel attribute to it. This check would be added just above, in bbcode.inc.php: if the URL matches a preg_match against #site-domain-name-url#, then $this->nofollow = '';
As always in coding, there are a million ways to write something that produces the same output as what someone else has coded. It's no competition to see who is right and who is wrong in the process. In the end, having the correct output should be the ultimate goal.
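The same-domain check could look roughly like this. It is a Python sketch of the idea only, not the bbcode.inc.php change itself. FORUM_HOST and rel_attribute_for are hypothetical names, and the sketch assumes absolute URLs are passed in.

```python
from urllib.parse import urlsplit

# Hypothetical forum host; replace with the site's own domain.
FORUM_HOST = "www.example.com"


def rel_attribute_for(url: str, forum_host: str = FORUM_HOST) -> str:
    """Return the rel attribute text to append to a link, or '' for internal links."""
    # urlsplit() exposes the host part of an absolute URL (None if it has none).
    host = (urlsplit(url).hostname or "").lower()
    if host == forum_host.lower():
        # Same domain: leave nofollow off so spiders follow internal links.
        return ""
    # Anything else is treated as external in this sketch.
    return ' rel="nofollow"'


if __name__ == "__main__":
    print(repr(rel_attribute_for("https://www.example.com/forums/ubbthreads.php/topics/1/a-topic")))
    print(repr(rel_attribute_for("https://other.example.org/page")))
```

Comparing the parsed host avoids false matches that a plain substring or regex test on the whole URL can produce, for example when the forum's domain name appears in another site's query string.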
larger issue is make_ubb_url isn't the go to for every forum URL that is generated. much like array_get is used to sanitize all inputs, make_ubb_url should also be a common funnel.
i'm running the code i posted above on this forum right now and i see where a lot of routines ( portals, showflat, postlist and more ) don't do the right thing to take advantage of your improvements.
this points to maybe a way to leverage what you did, but really make it truly common throughout
periods don't make it thru the preg_replace for non-alpha
also, the problem is that not everyone calling make_ubb_url asks to have the $title sanitized
a real re-work is in order for maybe next version.
edit: as to the "Re" dealio... the real way to do that is to not have an indexed title for every post. the topic should be enough for the entire topic. this would cut way down on space as well as leverage big improvement on full_text searching on topics. my pre alpha 7.6 version does exactly this...
only losers will be the 'show threaded' peeps, who like to allow changing topic in mid stream
Also remember that there are "function ubbchars" and "function ubbchars_decode" at the end of the ubbthreads.inc.php which also get passed by various other scripts within UBBT 7.5.8's code. Take these into account when working with html markup code.
With the intent to look out for all of your current paid license holders (customers), is there a timeline for a public release of the above-mentioned solutions, which you've already adopted on UBB Central?
Since the most recent UBBT release on 11/30/2013, there has been a possible login cookie problem for all users of Internet Explorer 7 (and prior versions). On a url with an underscore, IE silently drops all cookies for that host and refuses to accept new ones. The server sends Set-Cookie response correctly, yet the cookie never shows up in IE.
I'd rather license a forum software with a company who properly maintained release timelines and enforced integrity assessments and mitigation instead of a forum software which shoots the messenger and dismisses all risks/bugs for every license holder.
On 05/10/2014, following my thorough report and suggested solution, you've fixed the problem here on your user-to-user support forums, UBBCentral.com, but you haven't made the fix available to your currently licensed subscribers.
Please don't be just another company who cares more about being "right" while ignoring the problem, rather than protecting/supporting your license holders (customers) from known flaws in your software.
A humble plea from me too. $59 a year for a once-a-year, point x.x.x bug-fix update is wild.
I'm looking forward to any update that will sort this for the rest of us who aren't super-techies.
Google Webmaster has been driving me NUTS now for months, as well as my domain's Links checking software, due to UBB SEO probs with the URLs as 242 outlined.
I would welcome any update soon. Many thanks.
Mark J.Cairns Producer, Airwolf Themes CD soundtracks
seems a shorter way to do it. maybe i missed something.. :shrug:
BTW, the reason I used "mb_convert_case" instead of your chosen "strtolower" is that PHP by default does not know about UTF-8. "strtolower" works on single bytes: it converts the bytes holding the uppercase letters A-Z to lowercase a-z and, depending on the locale, other single-byte codes as well. Non-ASCII letters in UTF-8 are written with two or more bytes, none of which fall in the ASCII range, so "strtolower" either leaves them untouched (an uppercase "É" stays uppercase) or, under a single-byte locale, converts individual bytes of the sequence separately. In the latter case the resulting sequence is broken: it no longer represents the correct character and could produce multiple unintended characters, especially in non-English languages.
To change this, you need to either configure the mbstring extension ( http://www.php.net/manual/en/book.mbstring.php ) to replace "strtolower" with "mb_strtolower", or use "mb_convert_case" as it was originally written in my recommended fix: mb_convert_case($title, MB_CASE_LOWER, "UTF-8")
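For readers who want to see the difference without a PHP setup, here is a small Python sketch. It only imitates an ASCII-only, byte-oriented lowercase; it is not PHP's strtolower, and the function name ascii_only_lower is made up. The assumption is that the byte-oriented version maps just A-Z and leaves every other byte alone.

```python
# Hypothetical helper that imitates a lowercase routine unaware of UTF-8.
def ascii_only_lower(text: str) -> str:
    # bytes.lower() only maps the ASCII bytes for A-Z; multi-byte UTF-8
    # sequences (such as the two bytes of "É") pass through unchanged.
    return text.encode("utf-8").lower().decode("utf-8")


def unicode_lower(text: str) -> str:
    # str.lower() works on whole characters, comparable in spirit to
    # mb_convert_case($title, MB_CASE_LOWER, "UTF-8") in PHP.
    return text.lower()


if __name__ == "__main__":
    title = "\u00c9T\u00c9"  # "ÉTÉ"
    print(ascii_only_lower(title))  # only the plain "T" is lowercased
    print(unicode_lower(title))     # every letter is lowercased
```

Two titles that differ only in the case of an accented letter would therefore produce two different URLs with the byte-oriented version, which is another route to duplicate-content flags.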
The generic URI syntax mandates that new URI schemes that provide for the representation of character data in a URI must, in effect, represent characters from the unreserved set without translation, and should convert all other characters to bytes according to UTF-8, and then percent-encode those values. This requirement was introduced in January 2005 with the publication of RFC 3986. URI schemes introduced before this date are not affected.
Because there were other accepted ways of doing URL encoding in the past, browsers seem to attempt several methods of decoding a URI, but if you're the one doing the encoding you should use UTF-8. UTF-8 should also be used because it is the only encoding allowed by the newer IRI standard (RFC 3987) that is replacing the older URL standard.
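As an illustration of "convert to UTF-8 bytes, then percent-encode", here is a short Python sketch using the standard library's urllib.parse.quote. The wrapper name encode_path_segment is hypothetical, and passing safe="" (so that "/" is encoded too) is my assumption about what a single path segment needs.

```python
from urllib.parse import quote, unquote

# Hypothetical wrapper for illustration only.
def encode_path_segment(segment: str) -> str:
    # quote() encodes the text as UTF-8 by default, then percent-encodes
    # every byte that is not an unreserved character (letters, digits, - . _ ~).
    return quote(segment, safe="")


if __name__ == "__main__":
    print(encode_path_segment("[7.x]_Google_Adsense_Search"))
    print(encode_path_segment("\u00e9t\u00e9"))  # "été"
    print(unquote("%5B7.x%5D_Google_Adsense_Search"))
```

The first result matches the %5B ... %5D form of the ubbdev.com link quoted earlier in this thread, which is the bracket pair after percent-encoding.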
---
The original intended purpose of this post was to 1) submit a bug report, 2) publish examples of what this bug affects, and 3) then submit a solution/fix for it.
My exact suggested fix (linked in the OP) resolves the following items:
- Keep the URLs spider-friendly
- Standardize across UBBT releases
- Compatible with 7.5.7 and prior, to avoid "duplicate content" flags
- Allow copy/pasted URLs from UBBT to be parsed correctly on UBBT and other internet software
- Resolve IE cookie issues
- Not break anything else
Why is it that the URL suggestion for hyphens in lieu of underscores has been implemented here, with no indication either in the member area or anyplace else of any updates to the current version?
It appears that someone has made some edits to the code without any notice to the public, leading us all to believe that the flavor of UBB running on this site is what we purchase, when in fact it is not.
I wonder what other changes have been implemented here that have not been released!
Blue Man Group There is no such thing as stupid questions. Just stupid answers
May I suggest upping the limit of 30 characters to 70
+1
This should be the default -- or no truncation at all. User-generated subjects are already limited to 50 characters by default. In addition, UBBT 7.5.8 raises duplicate content flags by also inserting four more characters to the URL, as "Re:_", to direct links to individual posts.
If it's obvious, I don't see it. Is there any reason for new SEO URLs to have truncated titles, if the title is intended to be used for SEO while still remaining user-readable... and it's already being limited to 50 characters elsewhere?
I noticed above, you posted a YouTube video, but it did not parse correctly. I am having that problem on my site. Is there any fix for that?
"No matter where you go, there you are." "If you can't do something smart, Do something right" "There are three kinds of people in the world, those who can count, and those who can't"
BF, check UBBDev. All the markup settings are in a post over there. The settings within stock UBBT do not cover YouTube, Vimeo or other services. They are just basic.
the link i posted here, was when those non-stock settings were used on this site. Now they reverted just those items to their default settings.
I have those, it still isn't parsing them, just shows the code in the post.
"No matter where you go, there you are." "If you can't do something smart, Do something right" "There are three kinds of people in the world, those who can count, and those who can't"