Getting UTF-8 and MySQL to work with ColdFusion 9 ORM

A question I've seen show up quite often in Hibernate forums is 'How do
I get Hibernate to generate UTF-8 Compliant Tables and/or columns when
it generates my database tables when using mySQL?'

It's
actually, quite easy to do, but often (myself included), people end up
looking deep into Hibernate for some mysterious setting.

In
reality, it's just a question of setting up your database correctly. 
The first thing you need to do is set your database to use UTF-8
character coding as its default, like so, when creating it:


create database my_new_database default CHARACTER SET = utf8 default COLLATE = utf8_general_ci;


This will ensure all tables that are created by ColdFusion's ORM are created with a UTF-8 character set.

That being said, I have seen issues when data is being sent up to MySQL in UTF-8 form getting corrupted.  To ensure the database drivers use the correct character set as well we add useUnicode=true&characterEncoding=UTF-8 to our MySQL connection string in the ColdFusion Admin under the datasource's Advanced Settings.

There you go! UTF-8 support in mySQL with ColdFusion ORM!

Leave a Comment

Comments

  • Paul | August 26, 2009

    …and of course make sure the version of mySQL actually supports unicode (i think 4 & above??).

    what about sql server?

  • Ben Nadel | August 29, 2009

    Mark, good tip. I had a beast of a time trying to get UTF to work on one database (outside of an ORM setup). Had to eventually bring Oguz in to help out, which he did masterfully. It all came down to default collation.

  • Christian St.Germain | May 8, 2010

    I have developed an application with CF8 and MySQL 5. Some of the data fields are encrypted using the MySQL AES_ENCRYPT function for INSERT and UPDATE and the AES_DECRYPT function for SELECT.

    Queries (and dumps) all run fine with either SQL statements alone or with cfc and cfm scripts. With CF9 however, the encrypted fields are returned as binary even though the SQL statements themselves all run fine. I can overcome this be recoding he relevant cfm using CharsetEncode(datafield, "utf-8") but this results in an error when used with CF8. Any thoughts or suggestions would be greatly appreciated.