Previous Thread
Next Thread
Print Thread
Hop To
Joined: Oct 2007
Posts: 464
Likes: 11
Addict
Addict
Joined: Oct 2007
Posts: 464
Likes: 11
But, I'm posting it here just in case some myql guru might know the answer. This is a copy of a post I made today on the mysql forum.

Quote
We recently had a problem in our UBB forum, after an upgrade from mysql 5.7 to mysql 8.0, with punctuation marks and emojis being rendered as Windows-1252 characters instead of utf8. e.g “ = left quote = “

I finally got the issue corrected by adding these two lines to my.cnf's mysqld section: character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci

(The db was already set to utf8mb4).

That left me with a cleanup job. (about 8.000 instances). So, I wrote a query to clean them up. However, I needed to exclude two forums where we were discussing the issues (because I didn't want to clean those up.)

So, I wrote this query, which did a great job of cleaning up the mess. (This is one query. I had to write several more to clean up post subjects, member signatures, and topic subjects as well.

This is the cleanup query:

Replace all Windows-1252 characters in post bodies with utf8 characters
Code
UPDATE sbf_POSTS a
JOIN sbf_TOPICS t ON a.TOPIC_ID = t.TOPIC_ID
JOIN sbf_FORUMS f ON t.FORUM_ID = f.FORUM_ID
SET a.POST_BODY = REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(REPLACE
(a.POST_BODY, '“' , '“'),
'”','”'),
'‘','‘'),
'’', '’'),
'—', '–'),
'–', '—'),
'…', '᠁')
WHERE (
a.POST_BODY LIKE '%“%' OR
a.POST_BODY LIKE '%”%' OR
a.POST_BODY LIKE '%‘%' OR
a.POST_BODY LIKE '%’%' OR
a.POST_BODY LIKE '%—%' OR
a.POST_BODY LIKE '%–%' OR
a.POST_BODY LIKE '%…%'
)
AND f.FORUM_ID != '24'
AND f.FORUM_ID != '57'

Unfortunately, that didn't clean up ALL of them. I found 181 instances that were missed using this simple query.
Code
select POST_BODY from sbf_POSTS where POST_BODY LIKE '%â€%'

My question is, why does the simple query find instances that the complex query does not? I'm assuming it's because of the joins, but I don't understand why those would create this problem.

UPDATE: I wrote this query to test my hypothesis. I basically took the cleanup script (posted above) and modified it to be a select without any replace statements.
Code
SELECT POST_SUBJECT, POST_BODY FROM  sbf_POSTS a
JOIN sbf_TOPICS t ON a.TOPIC_ID = t.TOPIC_ID
JOIN sbf_FORUMS f ON t.FORUM_ID = f.FORUM_ID
WHERE ( 
    a.POST_BODY LIKE '%“%' OR
    a.POST_BODY LIKE '%”%' OR
    a.POST_BODY LIKE '%‘%' OR
    a.POST_BODY LIKE '%’%' OR
    a.POST_BODY LIKE '%—%' OR
    a.POST_BODY LIKE '%–%' OR
    a.POST_BODY LIKE '%…%'
)
     AND f.FORUM_ID != '24'
     AND f.FORUM_ID != '57'
It found some characters that I planted in an old post, but the cleanup script did not fix them. So, the JOINS are not the problem. Now I'm even more mysitifed.


The Stovebolt Geek
https://www.stovebolt.com/ubbthreads/ubbthreads.php

Server Information
UBB.threads Version 8.0.0
Release 20240826
Server OS Linux
Server Load 0.11
Web Server Apache/2.4.37
PHP Version 8.3.11
MYSQL Version 8.0.39
Database Size 1.82 GB
1 member likes this: Gizmo
Joined: Jun 2006
Posts: 16,369
Likes: 126
UBB.threads Developer
UBB.threads Developer
Joined: Jun 2006
Posts: 16,369
Likes: 126
Its possible that they're abandoned threads, or in a forum that isn't in a category; I think its definitely the joins that didn't fetch the rest of the data.


I am a Web Development Contractor, I do not work for UBBCentral. I have provided free User to User Support since the beginning of these support forums.
Do you need Forum Install or Upgrade Services?
Forums: A Gardeners Forum, Scouters World
UBB.threads: UBBWiki, UBB Styles, UBB.Sitemaps
Longtime Supporter & Resident Post-A-Holic
VNC Web Services: Code Modifications, Upgrades, Styling, Coding Services, Disaster Recovery, and more!
Joined: Oct 2007
Posts: 464
Likes: 11
Addict
Addict
Joined: Oct 2007
Posts: 464
Likes: 11
Thanks Gizmo. I trust your knowledge of sal queries. 😌


The Stovebolt Geek
https://www.stovebolt.com/ubbthreads/ubbthreads.php

Server Information
UBB.threads Version 8.0.0
Release 20240826
Server OS Linux
Server Load 0.11
Web Server Apache/2.4.37
PHP Version 8.3.11
MYSQL Version 8.0.39
Database Size 1.82 GB

Link Copied to Clipboard
ShoutChat
Comment Guidelines: Do post respectful and insightful comments. Don't flame, hate, spam.
Recent Topics
Sticky posts and global announcements
by Larry Miller - 12/08/2024 2:30 PM
Profile avatar storage settings
by SenecaFlyer - 12/05/2024 1:24 PM
Not allowing attachment over 2m
by ehill - 12/03/2024 3:16 PM
New Admin Here
by SenecaFlyer - 12/02/2024 4:14 PM
Who's Online Now
2 members (Ruben, 1 invisible), 3,275 guests, and 86 robots.
Key: Admin, Global Mod, Mod
Random Gallery Image
Latest Gallery Images
Ride safe!
Ride safe!
by Morgan, December 7
Los Angeles
Los Angeles
by isaac, August 6
3D Creations
3D Creations
by JAISP, December 30
Artistic structures
Artistic structures
by isaac, August 29
Powered by UBB.threads™ PHP Forum Software 8.0.1
(Snapshot build 20240918)