Support MySQL utf8mb4 (emojis)
Description
is related to
relates to
Activity

Matthew Jones February 12, 2018 at 4:39 PMEdited
I created this script based on the ideas here. Just need to fill in the database and run this. Locally it's looking good after in Sakai 12. Might want to give this more of a look in 13. I don't know how to easily automate this, and if we do try to increase any column sizes we'd have to make sure people run this. (And new installations create with the correct types too!)
After this is done set the property

Matthew Jones February 12, 2018 at 9:55 AM
Here's another writeup of a process we'd have to go through.
I feel like coming up with a generic conversion should be doable by queering all tables from the information schema, generating the SQL dynamically and running it.
https://dba.stackexchange.com/a/104866/85328
https://mathiasbynens.be/notes/mysql-utf8mb4
The problem is that the max key length is still 767. So we'd also need to have a dymanic SQL generated to change all tables to dynamic row format.
ALTER TABLE t1 ENGINE=INNODB ROW_FORMAT=DYNAMIC;
Then we'd probably want to increase a number of column sizes since we can go a lot larger. Has anyone worked on a script process or tested this for Sakai?

Matthew Jones March 28, 2017 at 8:46 AM
Yeah it "can" work on earlier version but it is easier if things are defaults. MySQL 5.7.9 made the dynamic indexes the default as stated on the JRA issue https://jira.atlassian.com/browse/JRA-36135
"Jan Lindström unfortunately this still only allows index keys longer than 767 bytes on tables with row format of DYNAMIC or COMPRESSED, but the default format is COMPACT in versions below MySQL 5.7.9. It's not possible to change the default row format at the database as far as I have found."
Though it looks like it can be only be specified as the global/default or on the table level, this would still take some work for Sakai. And you couldn't make this a global since likely many databases share a server. When utf8mb4 is the default (over utf8) then you just don't have to expect that people read the manual to create their database correctly. Though whether it's utf8 or utf8mb4 as long as the indexes are bigger it should be fine.
So the big amount of work would be on all the conversions, hibernate changes, auto.ddl changes, validators and such to make the these fields indexed 33% larger. Seems like a lot of effort to for the end result. Though having longer indexes (and lengths) on many fields I think would be a bigger benefit than emojis.
Matthew Buckett March 28, 2017 at 8:19 AM
Ah, yeah, on KNL-1431, it's because of the longer indexes.
Matthew Buckett March 28, 2017 at 8:15 AM
Why do we need to wait for MySQL 5.8 for utf8mb4, although it's not the default databases MySQL back to 5.5.3 supports utf8mb:
https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-3.html
and when you create a database you can specify the default character set.
Details
Details
Priority
Components
Assignee
Reporter

MySQL 5.8 may have a default of utf8mb4 and there are advantages of using this type. However there are likely a number of things that would need to be fixed to take advantage of it. This is just going to track the existing issues opened around this.