UBBCentral

Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs

Posted By: isaac

Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/10/14 07:33 AM

In 7.5.8, the UBB.Threads developers changed the way "Spider-Friendly URLs" were generated. That change caused several problems.

1. Search engine indexing problems caused by multiple "?", "&", "/" and other special characters in the URL

EXAMPLE: https://www.ubbcentral.com/forums/ubbthreads.php/topics/255499/?_/&/?%3E%22&+?&am
The link should take you to here, but it does not parse in UBBT 7.5.8.


2. Problems with how UBBT 7.5.8 decides whether a line of text is plain text or a URL, caused by the special characters being placed in the URLs.

This link has square brackets ( [ and ] ) in it, but UBBT 7.5.8 doesn't know how to parse it, so it treats only part of it as a URL or ignores the whole thing, just as it failed to treat the URL string in item #1 as a link.
https://www.ubbdev.com/forums/ubbthreads.php/topics/318145/%5B7.x%5D_Google_Adsense_Search.html
http://www.ubbdev.com/forums/ubbthreads.php/topics/318145/[7.x]_Google_Adsense_Search.html


3. "Duplicate content" flags are raised in Google's Webmaster Tools dashboard because the URL of every topic changed while the content remained the same. It's not a serious problem for Google's search engine and its advanced content tools, but other spiders may flag you and down-rank your pages because of it.

Read more on this at:
http://googlewebmastercentral.blogspot.no/2008/09/demystifying-duplicate-content-penalty.html


4. Possible login Cookie problems for Internet Explorer 7 and older.
https://www.ubbcentral.com/forums/ubbthreads.php/topics/255498/Re:_"Remember_me_on_e#Post255498

AGAIN, the URL doesn't get parsed correctly because of the quotes (") in it.
The manually-corrected URL is: http://www.ubbcentral.com/forums/ubbthreads.php/topics/255498#Post255498

Using the instructions to fix UBBT 7.5.8's URL bug -- at the bottom of this post -- the copy/pasted link would be parsed correctly and would look like:
https://www.ubbcentral.com/forums/ub...osing-user-login-status-in-ie#Post255498


5. In addition, here is some further reading on the use of hyphens (-) vs underscores (_):
SOURCE: http://www.ecreativeim.com/blog/2011/03/seo-basics-hyphen-or-underscore-for-seo-urls/

Quote:
The short answer is that you should use a hyphen for your SEO URLs. Google treats a hyphen as a word separator, but does not treat an underscore that way. Google treats an underscore as a word joiner, so red_sneakers is the same as redsneakers to Google. This has been confirmed directly by Google themselves, including the fact that using dashes over underscores will have a (minor) ranking benefit.

Again, SEO URLs should use hyphens to separate words. Do not use underscores, do not try to use spaces, and do not smash all the words together intoonebigword. As of 2012, dashes are still the best way to optimize your SEO URLs.


A video answering the hyphen vs underscore SEO URL question by Matt Cutts.
Matthew "Matt" Cutts leads the Webspam team at Google, and works with the search quality team on search engine optimization issues

Regarding the video; since your pages are now formatted differently ("brand new") with UBBT 7.5.8's "SEO-friendly" URLs, they should be using hyphens (-) rather than underscores (_).
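
As a minimal sketch of the hyphen recommendation above (the helper name underscores_to_hyphens is made up for illustration and is not a UBB.Threads function), an underscore-separated title segment could be rewritten like this:

```php
<?php
// Hypothetical helper, not part of UBB.Threads.
// Replaces every underscore in a URL title segment with a hyphen,
// then collapses any run of hyphens into a single one.
function underscores_to_hyphens(string $segment): string
{
    $segment = str_replace('_', '-', $segment);
    return preg_replace('/-{2,}/', '-', $segment);
}

echo underscores_to_hyphens('Google_Adsense_Search'), "\n"; // Google-Adsense-Search
```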


6. Since UBBT 7.5.8, here are some other characters that get placed in the URL which should not be there:
? > < " & + | ! % # \ ^ { } = : ; @ $
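
A quick way to test a title segment against the character list above might look like the following. This is only a sketch; the constant and function names are assumptions made up for the example and do not exist in UBB.Threads.

```php
<?php
// The characters listed in item 6, written as one PHP string.
// The backslash is doubled because of single-quoted string escaping.
const UNWANTED_URL_CHARS = '?><"&+|!%#\\^{}=:;@$';

// Hypothetical helper: returns true when $segment contains none of them.
function is_clean_segment(string $segment): bool
{
    // strpbrk() returns false when no listed character is found.
    return strpbrk($segment, UNWANTED_URL_CHARS) === false;
}

var_dump(is_clean_segment('losing-user-login-status-in-ie')); // bool(true)
var_dump(is_clean_segment('Re:_"Remember_me_on_e'));          // bool(false)
```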


---
How to fix all six items listed above:
http://www.ubbdev.com/forums/ubbthreads.php/topics/319241

Posted By: SD

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/10/14 04:02 PM

PHP Code
$title = preg_replace("#[^A-Za-z0-9]+#", "-", strtolower($title));
$title = preg_replace("#(-){2,}#", "$1", $title);
$title = trim($title, '-');
$title = substr($title, 0, 30);  



seems a shorter way to do it. maybe i missed something.. :shrug:
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/10/14 11:44 PM

Originally Posted by SD
PHP Code
$title = preg_replace("#[^A-Za-z0-9]+#", "-", strtolower($title));
$title = preg_replace("#(-){2,}#", "$1", $title);
$title = trim($title, '-');
$title = substr($title, 0, 30);  



seems a shorter way to do it. maybe i missed something.. :shrug:


Yup, you did.

While I detailed what each line does (for the reader), the most important step is this: FIRSTLY replace the common space (& nbsp;) and ampersand (& amp;) markup code with dashes. Some browsers (Firefox) like to auto-insert that markup into copied URLs. You might also want to check templates/default/forum.tpl, because it does the same thing to URLs in the "last post" column. (For further reading on forum.tpl, see line 330 and below of libs/functions_forums.inc.php, at the comment about escaping the quotes in the subject.)

The code I posted/linked to, recognizes this, and fixes it.
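
To illustrate why the markup code has to be dealt with first, here is a rough sketch. It is not the code from the linked fix: slug_from_title() is a made-up name, and it decodes the entities instead of replacing them with dashes, which ends up with the same result once the non-alphanumeric characters are turned into hyphens.

```php
<?php
// Sketch only: slug_from_title() is a hypothetical name, not the linked
// fix and not a UBB.Threads function.
function slug_from_title(string $title): string
{
    // 1. Turn markup such as &amp; or &nbsp; back into real characters,
    //    so the entity names ("amp", "nbsp") never reach the URL.
    $title = html_entity_decode($title, ENT_QUOTES, 'UTF-8');

    // 2. Lower-case, then replace every run of non-alphanumerics
    //    (including the decoded & and non-breaking space) with one hyphen.
    $title = preg_replace('#[^a-z0-9]+#', '-', strtolower($title));

    // 3. Drop leading and trailing hyphens.
    return trim($title, '-');
}

echo slug_from_title('Tips &amp; tricks&nbsp;for URLs'), "\n"; // tips-tricks-for-urls
```

Without step 1, the same title would come out as tips-amp-tricks-nbsp-for-urls.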

In addition, since $title is already limited to 50 characters in the newtopic template (IIRC), and since long URLs are split at 40 & -15 (line 566 of libs/bbcode.inc.php), running a substr on it again may be redundant.

One possible improvement to how UBBT handles internal URLs, both for the forum reader and for SEO strategy: if a link points to the same domain and UBBT forum, display it as human-readable text, such as "[This Topic We Are Discussing]", rather than as a raw URL with an ellipsis halfway through it followed by the first part of the topic's title/description. To go one step further, if the link is on the same domain, do not add the rel="nofollow" attribute to it. This would be added just above, in bbcode.inc.php, by checking the URL: if it preg_matches #site-domain-name-url#, then $this->nofollow = '';
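
The same-domain check could be sketched roughly as follows. Everything here is an assumption for illustration: rel_for_link() and its $siteHost parameter are hypothetical, and the real bbcode.inc.php may store the nofollow string differently.

```php
<?php
// Hypothetical helper: decides the rel attribute for a link found in a post.
// $siteHost is the forum's own host name (an assumed config value).
function rel_for_link(string $url, string $siteHost): string
{
    $host = parse_url($url, PHP_URL_HOST);

    // Relative URLs have no host; treat them as internal.
    if ($host === null || $host === false) {
        return '';
    }

    // Compare case-insensitively and ignore a leading "www.".
    $normalize = function (string $h): string {
        return preg_replace('/^www\./', '', strtolower($h));
    };

    return $normalize($host) === $normalize($siteHost) ? '' : ' rel="nofollow"';
}

echo '[', rel_for_link('http://www.ubbcentral.com/forums/ubbthreads.php/topics/255499', 'ubbcentral.com'), "]\n"; // []
echo '[', rel_for_link('http://example.org/page', 'ubbcentral.com'), "]\n"; // [ rel="nofollow"]
```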

As always in coding, there are a million ways to write something that produces the same output as what someone else has coded. It's no competition to see who is right and who is wrong in the process. In the end, having the correct output should be the ultimate goal.
Posted By: SD

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/11/14 12:42 AM

larger issue is make_ubb_url isn't the go to for every forum URL that is generated. much like array_get is used to sanitize all inputs, make_ubb_url should also be a common funnel.

THEN we're cooking with gas, so to speak :smile:

:2c:

also, in your code you have

Code
$title = str_replace(array("-"), "-", $title); //remove dash


i don't see how that removes anything... replace dash with a dash? :shrug:
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/11/14 12:44 AM

If you see a bug, fix it :smile:

I believe it was meant as a double-dash ( -- ) replacement rather than a single dash ( - )
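
For what it's worth, a plain double-dash replacement only goes so far. The snippet below is a small illustration (not taken from either posted fix) of the difference between one str_replace pass and a regex that matches two or more hyphens:

```php
<?php
$title = 'a----b';

// One pass of str_replace only halves the run: each "--" becomes "-".
$once = str_replace('--', '-', $title);
echo $once, "\n";       // a--b

// A regex on "two or more hyphens" collapses the whole run at once.
$collapsed = preg_replace('/-{2,}/', '-', $title);
echo $collapsed, "\n";  // a-b
```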
Posted By: SD

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/11/14 12:45 AM

i'm running the code i posted above on this forum right now and i see where a lot of routines ( portals, showflat, postlist and more ) don't do the right thing to take advantage of your improvements.

this points to maybe a way to leverage what you did, but really make it truly common throughout :smile:

good stuff and :ty: :smile:
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/11/14 12:48 AM

Don't forget to remove the title string's leading and trailing dashes and periods :cool:

http://www.ubbcentral.com/forums/ubbthreads.php/topics/255520/re-six-show-stopping-problems-#Post255520

EDIT: And the "Re: " string... for bonus points :2c:
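
The trailing dash in the URL above comes from the order of operations: trimming before truncating lets the cut land right after a hyphen. A small sketch, assuming the slug has already been lower-cased and hyphenated:

```php
<?php
// Assumed input: a title that has already been lower-cased and hyphenated.
$slug = 're-six-show-stopping-problems-with-ubbt';

// Trim first, truncate second: the cut can land right after a hyphen.
$a = substr(trim($slug, '-'), 0, 30);
echo $a, "\n"; // re-six-show-stopping-problems-

// Truncate first, trim second: the stray hyphen is removed.
// Adding "." to the trim list also covers leading and trailing periods.
$b = trim(substr($slug, 0, 30), '-.');
echo $b, "\n"; // re-six-show-stopping-problems
```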
Posted By: SD

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/11/14 12:49 AM

i do with the trim '-'

periods don't make it thru the preg_replace for non-alpha

also, the problem is that not everyone calling make_ubb_url asks to have the $title sanitized :wink:

a real re-work is in order for maybe next version.

edit: as to the "Re" dealio... the real way to do that is to not have an indexed title for every post. the topic should be enough for the entire topic. this would cut way down on space as well as leverage big improvement on full_text searching on topics. my pre alpha 7.6 version does exactly this...

only losers will be the 'show threaded' peeps, who like to allow changing topic in mid stream :laugh:
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/11/14 12:55 AM

it's getting there.

Also remember that there are "function ubbchars" and "function ubbchars_decode" at the end of the ubbthreads.inc.php which also get passed by various other scripts within UBBT 7.5.8's code. Take these into account when working with html markup code.
Posted By: SD

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/11/14 12:58 AM

yes, i wrote them, so i know them well... they actually cause havoc, when not used in the right place :wink:
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/11/14 12:58 AM

Originally Posted by SD
only losers will be the 'show threaded' peeps, who like to allow changing topic in mid stream :laugh:


lol - I too, hate those guys.

I've simply turned off that /feature/ of renaming the topic in the reply template with a style="display:none;"
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/11/14 03:35 AM

SD, if you're going to rewrite a bit of code that already works, just so you can have it in your own format, don't forget to also add this item -

Originally Posted by id242
FIRSTLY replace the common space (& nbsp;) and ampersand (& amp;) markup code with dashes


Demonstrated with this URL:
http://www.ubbcentral.com/forums/ubbthreads.php/topics/255499/amp-gt-quot-amp-amp-url-exampl#Post255499

It should instead be:
http://www.ubbcentral.com/forums/ubbthreads.php/topics/255499/url-exampl#Post255499

again, thanks for your hard work in squashing the bugs in UBBT 7.5.8 :wink:
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 05/30/14 04:12 PM

With the intent to look out for all of your current paid license-holders (customers), is there a timeline for a release-to-public of the above-mentioned solutions which you've adopted into UBB Central?

Since the most recent UBBT release on 11/30/2013, there has been a possible login cookie problem for all users of Internet Explorer 7 (and prior versions). On a URL with an underscore, IE silently drops all cookies for that host and refuses to accept new ones. The server sends the Set-Cookie response header correctly, yet the cookie never shows up in IE.

I'd rather license a forum software with a company who properly maintained release timelines and enforced integrity assessments and mitigation instead of a forum software which shoots the messenger and dismisses all risks/bugs for every license holder.

On 05/10/2014, following my thorough report and suggested solution, you've fixed the problem here on your user-to-user support forums, UBBCentral.com, but you haven't made the fix available to your currently licensed subscribers.

Please don't be just another company who cares more about being "right" while ignoring the problem, rather than protecting/supporting your license holders (customers) from known flaws in your software.
Posted By: Mark J.Cairns

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 07/22/14 05:41 PM

^^What he said, SD.

A humble plea from me too. 59 bucks a year for a once-a-year, point x.x.x bug-fix update is wild.

I'm looking forward to any update that will sort this for the rest of us who aren't super-techies.

Google Webmaster has been driving me NUTS now for months, as well as my domain's Links checking software, due to UBB SEO probs with the URLs as 242 outlined.

I would welcome any update soon. Many thanks.
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 07/27/14 12:08 AM

Originally Posted by SD
PHP Code
$title = preg_replace("#[^A-Za-z0-9]+#", "-", strtolower($title));
$title = preg_replace("#(-){2,}#", "$1", $title);
$title = trim($title, '-');
$title = substr($title, 0, 30);  


seems a shorter way to do it. maybe i missed something.. :shrug:


BTW, the reason I used "mb_convert_case" instead of your chosen "strtolower" is that PHP's string functions do not know about UTF-8 by default. "strtolower" treats a string as a sequence of single bytes and converts whatever bytes the current locale considers uppercase letters. Non-ASCII letters in UTF-8 are written with two or more bytes, so "strtolower" never sees them as whole characters: at best it leaves letters such as "É" un-lowercased, and under a single-byte locale it can change individual bytes of a multi-byte character. The resulting sequence is then broken, no longer represents the correct character, and can produce unintended characters, especially for non-English languages.

To avoid this, use the mbstring extension ( http://www.php.net/manual/en/book.mbstring.php ): replace "strtolower" with "mb_strtolower", or use "mb_convert_case", as it was originally written for you in my recommended fix: mb_convert_case($title, MB_CASE_LOWER, "UTF-8")

Some further reading on this at:
http://www.daniweb.com/web-development/php/threads/342307/utf-8-encoding-issues-with-strtolower
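
A short demonstration of the multibyte-aware call, assuming the mbstring extension is loaded. The input is written with explicit UTF-8 bytes so the example does not depend on the encoding of the file it is saved in:

```php
<?php
// "ÉCOLE" with the É given as explicit UTF-8 bytes (C3 89).
$title = "\xC3\x89COLE";

// mb_convert_case is told the string is UTF-8 and lower-cases É to é.
$lower = mb_convert_case($title, MB_CASE_LOWER, 'UTF-8');

// Printed as hex to make the bytes visible: c3 a9 is "é", the rest is "cole".
echo bin2hex($lower), "\n"; // c3a9636f6c65
```

What plain strtolower($title) returns for the first two bytes depends on the PHP version and the active locale, which is exactly the problem.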

Because this is directly related to URLs, I always recommend encoding them in UTF-8. From the Wikipedia page on percent encoding @ http://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_in_a_URI

Quote:
The generic URI syntax mandates that new URI schemes that provide for the representation of character data in a URI must, in effect, represent characters from the unreserved set without translation, and should convert all other characters to bytes according to UTF-8, and then percent-encode those values. This requirement was introduced in January 2005 with the publication of RFC 3986. URI schemes introduced before this date are not affected.


It seems that, because other ways of doing URL encoding were accepted in the past, browsers attempt several methods of decoding a URI, but if you're the one doing the encoding, you should use UTF-8. UTF-8 is also the encoding used by the newer IRI standard (RFC 3987), which extends the older URI standard.
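
In PHP, rawurlencode() does this kind of percent-encoding on whatever bytes it is given, so a UTF-8 string comes out as percent-encoded UTF-8. A minimal sketch (the sample strings are made up, apart from the bracketed title from item 2):

```php
<?php
// "café au lait" with the é given as explicit UTF-8 bytes (C3 A9).
$segment = "caf\xC3\xA9 au lait";

// rawurlencode() percent-encodes every byte outside the unreserved set
// (letters, digits, "-", "_", ".", "~"), following RFC 3986.
$encoded = rawurlencode($segment);
echo $encoded, "\n"; // caf%C3%A9%20au%20lait

// Unreserved characters pass through untouched.
echo rawurlencode('a-b_c.d~e'), "\n"; // a-b_c.d~e

// Square brackets are encoded, as in the first link of item 2.
echo rawurlencode('[7.x]_Google_Adsense_Search'), "\n"; // %5B7.x%5D_Google_Adsense_Search
```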

---

The original intended purpose of this post was to 1) submit a bug report, 2) publish examples of what this bug affects, and 3) then submit a solution/fix for it.

My exact suggested fix (linked in the OP) resolves the following items:
-Keep the URLs spider-friendly
-Standardize across UBBT releases
-Compatible with 7.5.7 and prior, to avoid "duplicate content" flags
-Allow copy/pasted URLs from UBBT to be parsed correctly on UBBT and other internet software
-Resolve IE cookie issues
-Not break anything else
Posted By: Ruben

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 07/28/14 10:10 PM

Why is it that the URL suggestion for hyphens in lieu of underscores has been implemented here, with no indication either in the member area or anyplace else of any updates to the current version?

It appears that someone has made some edits to the code without any notice to the public, leading us all to believe that the flavor of UBB running on this site is what we purchase, when in fact it is not.

I wonder what other changes have been implemented here that have not been released!
Posted By: Mark J.Cairns

Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 07/30/14 01:07 AM

May I suggest upping the limit of 30 characters to 70 from:

$title = substr($title, 0, 30);

to

$title = substr($title, 0, 70);


I found 30 chr$ too restrictive. I realise Google can only show 58 chr$ but many of my topics are up to 70.
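
If the limit is raised, it may also be worth cutting at a word boundary so the last word is not chopped in half. A rough sketch of that idea follows; truncate_slug() is a hypothetical helper, not part of the posted hack, and the default of 70 is just the value suggested above.

```php
<?php
// Hypothetical helper: shortens an already-hyphenated slug to at most
// $max characters without cutting a word in half.
function truncate_slug(string $slug, int $max = 70): string
{
    if (strlen($slug) <= $max) {
        return $slug;
    }

    $cut = substr($slug, 0, $max);

    // If the character right after the cut is not a hyphen, the last word
    // was chopped, so back up to the previous hyphen (when there is one).
    if ($slug[$max] !== '-') {
        $pos = strrpos($cut, '-');
        if ($pos !== false) {
            $cut = substr($cut, 0, $pos);
        }
    }

    return trim($cut, '-');
}

echo truncate_slug('url-example-for-testing', 14), "\n"; // url-example
```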

Hack works great for me. Thanks id242.
Posted By: Gizmo

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 07/30/14 03:23 AM

The latest rendition of id242's writeup at UBBDev ([7.5.8] Better URL Sanitization for SEO) covers this as well.
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 08/02/14 11:49 AM

Originally Posted by Mark J.Cairns
May I suggest upping the limit of 30 characters to 70


+1

This should be the default -- or no truncation at all. User-generated subjects are already limited to 50 characters by default. In addition, UBBT 7.5.8 raises duplicate content flags by also inserting four more characters to the URL, as "Re:_", to direct links to individual posts.

If it's obvious, I don't see it. Is there any reason for new SEO URLs to have truncated titles, if the title is intended to be used for SEO while still remaining user-readable... and it's already being limited to 50 characters elsewhere?
Posted By: Gizmo

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 08/02/14 06:45 PM

The RE should be stripped IMHO; the link to specific posts is done with an anchor anyhow.
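
Stripping the prefix before the title goes into the URL could be as simple as the sketch below. strip_reply_prefix() is a made-up name for illustration, not an existing UBB.Threads function.

```php
<?php
// Hypothetical helper: removes one or more leading "Re:" markers
// (any letter case, optional surrounding spaces) from a post subject.
function strip_reply_prefix(string $subject): string
{
    return preg_replace('/^(?:\s*re:\s*)+/i', '', $subject);
}

echo strip_reply_prefix('Re: Six show-stopping problems'), "\n"; // Six show-stopping problems
echo strip_reply_prefix('RE: re: Login cookies'), "\n";          // Login cookies
```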
Posted By: Bad Frog

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 11/28/14 08:37 PM

I noticed above, you posted a YouTube video, but it did not parse correctly.
I am having that problem on my site. Is there any fix for that?
Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 11/28/14 08:46 PM

BF, check UBBDev. All the markup settings are in a post over there. The settings within stock UBBT do not cover YouTube, Vimeo or other services. They are just basic.

The link I posted here was made when those non-stock settings were in use on this site. They have since reverted just those items to their default settings.
Posted By: Bad Frog

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 11/28/14 10:41 PM

I have those, it still isn't parsing them, just shows the code in the post.

Posted By: isaac

Re: Six show-stopping problems with UBBT 7.5.8's new Spider-Friendly "SEO" URLs - 11/29/14 03:14 AM

Bad Frog - I see you found what you were looking for at http://ubbdev.com :cool:

Posting this here for anyone else scanning through this thread looking for the same information you were -
http://www.ubbdev.com/forums/ubbthreads.php/topics/320649/re-7-3-gizmos-customtags.html#Post320649