Common UTF-8 Problems
Check the following areas if you are having trouble entering or displaying UTF-8 characters in Sakai:
1) Make sure Tomcat's server.xml sets URIEncoding="UTF-8" on all of its connectors.
1a) The following example assumes port 8080 is the primary browser port:
<Connector port="8080" maxHttpHeaderSize="8192" URIEncoding="UTF-8"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
connectionTimeout="20000" disableUploadTimeout="true" />
1b) If you have set up your Tomcat instance to run behind Apache, you will also need to add the URIEncoding="UTF-8" attribute to the AJP connector in server.xml. For example:
<Connector port="8009"
enableLookups="false" redirectPort="8443" protocol="AJP/1.3" URIEncoding="UTF-8" />
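To see why the URIEncoding attribute matters, the following is a minimal sketch of what happens when the same percent-encoded request parameter is decoded with two different charsets. It uses the standard java.net.URLDecoder rather than Tomcat's own decoding code, and the class name UriDecodingDemo and the sample value are made up for illustration. Decoded as UTF-8 the value comes back as "café"; decoded as ISO-8859-1 (the default in older Tomcat versions when URIEncoding is not set) it comes back as "cafÃ©".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it does not call any Tomcat API.
public class UriDecodingDemo {

    // Decodes a percent-encoded value using the named charset.
    public static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "caf" followed by e-acute (U+00E9), percent-encoded as the UTF-8 bytes C3 A9
        String encoded = "caf%C3%A9";
        System.out.println("Decoded as UTF-8:      " + decode(encoded, "UTF-8"));
        System.out.println("Decoded as ISO-8859-1: " + decode(encoded, "ISO-8859-1"));
    }
}
```

If characters in Sakai URLs or GET parameters show up as two-character sequences like the second result, a missing URIEncoding attribute on one of the connectors is a likely cause.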
2) Make sure your database is created with UTF-8 encoding.
2a) For example, this will work for MySQL:
create database sakai default character set utf8;
2b) For Oracle, the UTF-8 settings are reportedly as follows:
http://docs.oracle.com/cd/B10500_01/server.920/a96529/ch10.htm#1009904
SQL> SHUTDOWN IMMEDIATE; -- or NORMAL
SQL> STARTUP MOUNT;
SQL> ALTER SYSTEM ENABLE RESTRICTED SESSION;
SQL> ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
SQL> ALTER DATABASE OPEN;
SQL> ALTER DATABASE CHARACTER SET AL32UTF8;
SQL> SHUTDOWN IMMEDIATE; -- or NORMAL
SQL> STARTUP;
A good reference for Oracle 9i (especially Figure 2-8 & 2-9) is at http://download.oracle.com/docs/cd/B10500_01/server.920/a96529/ch2.htm
If that doesn't work, or if it gives the error "ORA-12712: new character set must be a superset of old character set", you can try:
ALTER DATABASE CHARACTER SET INTERNAL_USE AL32UTF8;
The above command skips the check that the new character set is a superset of the old one.
3) Make sure the (MySQL) connector is configured for UTF-8 encoding in the sakai.properties file. Note that previous releases of Sakai had an incorrect default value:
WRONG: url@javax.sql.BaseDataSource=jdbc:mysql://127.0.0.1:3306/sakai?useUnicode=true&amp;characterEncoding=UTF-8
CORRECT: url@javax.sql.BaseDataSource=jdbc:mysql://127.0.0.1:3306/sakai?useUnicode=true&characterEncoding=UTF-8
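A quick way to sanity-check the URL value is to split its query string and confirm that both useUnicode and characterEncoding come out as separate parameters. The sketch below does this with a deliberately simple parser; it is not how the MySQL driver parses URLs, and the class and method names (JdbcUrlCheck, queryParams, requestsUtf8) are hypothetical. It assumes, for illustration, that a faulty value contains an XML-escaped "&amp;" separator, which turns the second parameter name into "amp;characterEncoding".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; a simplified check, not the driver's own URL parser.
public class JdbcUrlCheck {

    // Splits the part after '?' into name/value pairs, separated by '&'.
    public static Map<String, String> queryParams(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = url.indexOf('?');
        if (q < 0) {
            return params;
        }
        for (String pair : url.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(name, value);
        }
        return params;
    }

    // True only if both UTF-8 related parameters are present as expected.
    public static boolean requestsUtf8(String url) {
        Map<String, String> p = queryParams(url);
        return "true".equalsIgnoreCase(p.get("useUnicode"))
                && "UTF-8".equalsIgnoreCase(p.get("characterEncoding"));
    }

    public static void main(String[] args) {
        String good = "jdbc:mysql://127.0.0.1:3306/sakai?useUnicode=true&characterEncoding=UTF-8";
        String escaped = "jdbc:mysql://127.0.0.1:3306/sakai?useUnicode=true&amp;characterEncoding=UTF-8";
        System.out.println(requestsUtf8(good) + " " + queryParams(good));
        System.out.println(requestsUtf8(escaped) + " " + queryParams(escaped));
    }
}
```

Run against the value copied from your own sakai.properties, the check fails whenever characterEncoding does not appear as a parameter name in its own right.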
There is a fourth issue that only affects Sakai versions before 2.1.0 running on MySQL. A patch for it can be backported manually to these systems. More information, including the patch file, is available in JIRA ticket 1737.
One more thing: make sure your MySQL connector is v3.1.14 or later. Some tools (RWiki and JForum) store content as BLOBs and do byte-to-string conversions. The wrong version of the MySQL driver will cause puzzling problems with UTF-8 characters.
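The kind of damage a byte-to-string mismatch causes can be illustrated without a database. The sketch below does not reproduce the driver problem itself; it only shows, with a hypothetical BlobRoundTrip class and a made-up sample string, what happens when text is written to bytes as UTF-8 and read back with a different charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it stands in for a tool that stores text in a BLOB.
public class BlobRoundTrip {

    // Encodes the text with one charset and decodes the bytes with another.
    public static String roundTrip(String text, Charset writeAs, Charset readAs) {
        byte[] stored = text.getBytes(writeAs); // bytes as they would sit in the BLOB
        return new String(stored, readAs);      // string the tool builds when reading
    }

    public static void main(String[] args) {
        // "na" + i-diaeresis (U+00EF) + "ve"
        String original = "na\u00efve";
        String same = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mixed = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8 both ways preserved the text: " + original.equals(same));
        System.out.println("UTF-8 in, ISO-8859-1 out preserved the text: " + original.equals(mixed));
        System.out.println("Length before: " + original.length() + ", after mismatch: " + mixed.length());
    }
}
```

When the charsets match, the text survives unchanged. When they differ, each non-ASCII character turns into two characters, which matches the symptom typically seen in wiki or forum content.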
4) Windows users only (Tomcat running as a Windows service):
Set the property -Dfile.encoding=UTF-8 in the Tomcat properties. (Open a command window, type "tomcat5w", select the "Java" tab, and add the property under "Java Options:".)
You can check your file encoding with this Java file (FileEncoding.java):
public class FileEncoding {
    public static void main(String[] args) {
        // Print the JVM's default file encoding
        System.out.println("file.encoding=" + System.getProperty("file.encoding"));
    }
}