When adding an emoji to a post, I get the following error (see below). A row is actually inserted into the Topics table, but not the Posts table. The Post ID is set to '0' in the topics table, so the post appears in the list of threads but is not actually there.
A user had this problem, so I tested it. I tested it in the old forum as well and in that forum the emoji is just completely stripped out. Maybe because the MySQL / PHP versions are old and can't handle the characters at all.
Perhaps it would be possible to support emojis with the new version and upgraded software, if I simply change the column collation from utf8 to utf8mb4 but.... I don't want to mess up tens of thousands of posts accidentally.
At minimum I would like to handle the error... any suggestion? There is a function that seems to parse the post body but I don't understand it.
Script: addpost.inc.php Line: 664 (likely different than factory) SQL Error: Incorrect string value: '\xF0\x9F\x98\x80' for column `ubbv7`.`ubbt_POSTS`.`POST_BODY` at row 1 SQL Error: 1366
The collation utf8mb4_general_ci on your table is required for emoji support, inserting extended UTF characters into a table that does not support them will cause database issues such as you posted.
It looks like I should change Post Body, Post Default Body, and Post Subject from utf8_general_ci to utf8mb4_general_ci. When I tried to do that it complained that it cannot be part of FULLTEXT index. I supposed what I need to do is drop the index, then change the collation, then add the index back. What I don't know is if there are any affects this will have on existing data or functionality.
Oh, good to know! .... In the database tools, there is an 'optimize' function for each table. Are you referring to that are something else? I'm not sure specifically what optimize does.
the tool is located at the bottom section of the content rebuilder.
this screenshot shows the page after the tool has already been run. prior to that, there is further information, directions, and a progress page while your forums are updated
OPTIMIZE TABLE reorganizes the physical storage of table data and associated index data, to reduce storage space and improve I/O efficiency when accessing the table. The exact changes made to each table depend on the storage engine used by that table.
Use OPTIMIZE TABLE in these cases, depending on the type of table:
After doing substantial insert, update, or delete operations on an InnoDB table that has its own .ibd file because it was created with the innodb_file_per_table option enabled. The table and indexes are reorganized, and disk space can be reclaimed for use by the operating system.
After doing substantial insert, update, or delete operations on columns that are part of a FULLTEXT index in an InnoDB table. Set the configuration option innodb_optimize_fulltext_only=1 first. To keep the index maintenance period to a reasonable time, set the innodb_ft_num_word_optimize option to specify how many words to update in the search index, and run a sequence of OPTIMIZE TABLE statements until the search index is fully updated.
After deleting a large part of a MyISAM or ARCHIVE table, or making many changes to a MyISAM or ARCHIVE table with variable-length rows (tables that have VARCHAR, VARBINARY, BLOB, or TEXT columns). Deleted rows are maintained in a linked list and subsequent INSERT operations reuse old row positions. You can use OPTIMIZE TABLE to reclaim the unused space and to defragment the data file. After extensive changes to a table, this statement may also improve performance of statements that use the table, sometimes significantly.
Make sure you make backups to your database prior to the convert. It is a big change that can't be undone without importing a backup. Also when complete on the first step. You would goto the UBB CP Paths & Database. area and set the MySQL Full Unicode Support (utf8mb4) checkbox to utf8mb4
Blue Man Group There is no such thing as stupid questions. Just stupid answers
Thanks. Well that does sound a bit scary. Is it actually changing the data? What bothers me is that perhaps I would not be able to identify right away if there had been a problem with the conversion. (For example, I might visit our Japanese forum in two weeks and realize that half the posts are gibberish, or something like that).
I do want to prevent the error from happening though at the very least. Perhaps there is a Javascript solution that would just trim out any added emoji characters when they are typed in.
That is why you should make a backup of the database and files any time you upgrade or alter the database in case something bad does happen. Anyway it alters the database only to accept the data for the emoji. as well as other 4byte items Also as I recall there is a convert status showing what is going on but it is fairly fast and painless even though it makes a lot of alters to the database. Without the tool it probably would take several hours to complete and then still miss some cell format somewhere..
Blue Man Group There is no such thing as stupid questions. Just stupid answers
I tried the database collation conversion tool, and here is what happened:
If seemed like it was working to start. It logged what it was doing, and all the log lines had green checkmarks.The first log line is:
ALTER TABLE ubbt_ADMIN_LOG MODIFY `LOG_IP` varchar(46) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci..... ....
... then it went through other tables in alphabetical order, but ended on the 26th one:
ALTER TABLE ubbt_PAYPAL_DATA MODIFY `SUBSCR_DATE` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general...... ..... ..... .... ........ ...... 4_general_ci, MODIFY `OPTION_SELECTION1`
That all happened really fast, then the page spinned for about 30 seconds and stopped.
When I examine the tables now, I see that earlier tables are indeed utf8mb4, however some tables are not. For example, `POSTS` and `STYLES` DO have utf8mb4 columns, while `TOPICS` does NOT.
There was no mysql error log created or otherwise no other indication of an error.
So my question, is this expected, or did the script fail for me?
Run it again, PayPal table is right before Posts in alphabetical order. The page should eventually automatically refresh, no matter if your browser times out due to a large Posts table.
When you visit the "Backup Tables" tab, there is an option already selected that will Analyze (check for errors) and Optimize your tables as well. That option is described on that page.
I ran the backup, which completed successfully (it took a long time - hours... I had not used it before, I always run mysqldump from the terminal which takes something like 15 seconds. Nevertheless it completed with no errors).
I then ran the collation updater again, with exactly the same results.
Quote
Run it again, PayPal table is right before Posts in alphabetical order.
Actually it's not, there are other tables like POLL_DATA which were updated successfully. POSTS itself appears to have been updated successfully, at least the columns are utf8mb.
In fact, all tables UP TO AND INCLUDING one called "STYLES" appear to be utf8mb. And all tables after that (from "SUBSCRIPTIONS") appear to be utf8 (not updated).
There are a couple strange things here. The log for PAYPAL_DATA query. It is a very long query string. The strange things about this are: 1. The query (as displayed in the browser) is truncated, and it's truncated in a slightly different spot each time I run the conversion tool. The second time I ran it, it display up to : "....MODIFY `OPTION_NAME2` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci, MODIFY `OP"
2. This query is the last query that appears on the page, even though subsequent tables (including POSTS) WERE updated.
3. The page continues to load, without giving any feedback on the page, for maybe a minute or so.
This makes me think that maybe the Paypal query display issue is a separate issue, and that the script is continuing (but not outputing any text) until it encounters a second problem at the subscriptions table, and then it quits. BTW all the results up to the paypal query are displayed instantly (less than a second).
- If I set execute_sql to false in the script, then it successfully echos all the queries and completes as expected.
- If I watch the processlist in mysql, I can see it takes about 100 seconds or so on the Posts table, but there don't appear to be any problems. If I log the queries, the last thing that is logged for the connection is
88 Query ALTER TABLE ubbt_SUBSCRIPTIONS DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci 88 Query SHOW FULL COLUMNS FROM ubbt_SUBSCRIPTIONS 88 Quit
- There are no error messages logged or visible anywhere.
It quits after about 230 seconds (more or less) for no apparent reason. (Execution time limit is set to 600 by the script). And sure enough, subscriptions is the last one that is done.
- I get the same result after I deleted a duplicate "POSTS" (backup) table that I had (it was spending time on that table as well) It makes no sense to me that the same thing still happens.
I got to work by commenting out the line that prints the alter table queries (having it print any short string instead of the query also works)
By watching the mysql log I can see that it completes - the page itself never refreshes, and in fact it (oddly) does not render the main content at all - just the breadcrumb and about half of the side menu. This is repeatable.
Anyhow, it's certainly strange but it seems to have worked now.