UTF-8

1. Work on 5.0

New install

long blob binary for all files (Sylvie)

Upgrade

Improve detection (Nyloth)
Admin panel -> new panel "server" (Nyloth)

Later

Migration script

2. Goal

Check UTF-8 usage in Tiki.
Show that the current code on branches 4 and 5 and on trunk does not handle UTF-8 correctly, as also
explained here.
The most difficult part is to show that, despite what you see in your
web browser, the data in the database is not stored correctly.

3. Parameter check

First we need to check that our configuration is fully UTF-8 before running the tests.

3.1. Shell

[+]

3.2. MySQL

[+]

4. Testcase

This table shows the different situations and the test results.

| Tiki version | MySQL connector | Database structure | Test | Visual result | Database result |
|---|---|---|---|---|---|
| 3.X (ACTUAL) | ADODB | UTF-8 | Create page name 让弗朗索瓦 | OK | OK |
| 3.X | PDO with UTF-8 param | UTF-8 | Create page name 让弗朗索瓦 | OK | OK |
| 3.X | PDO without UTF-8 param | UTF-8 | Create page name 让弗朗索瓦 | OK | Double encoding in UTF-8 |
| 4.X | ADODB | UTF-8 | Create page name 让弗朗索瓦 | OK | OK |
| 4.X | PDO with UTF-8 param | UTF-8 | Create page name 让弗朗索瓦 | OK | OK |
| 4.X (ACTUAL) | PDO without UTF-8 param | UTF-8 | Create page name 让弗朗索瓦 | OK | Double encoding in UTF-8 |
| 5.X | ADODB | UTF-8 | Create page name 让弗朗索瓦 | OK | OK |
| 5.X | PDO with UTF-8 param | UTF-8 | Create page name 让弗朗索瓦 | OK | OK |
| 5.X (ACTUAL) | PDO without UTF-8 param | UTF-8 | Create page name 让弗朗索瓦 | OK | Double encoding in UTF-8 |

(*) Rows marked (ACTUAL) represent the current code on SVN.

As you can see, the current situation is not good at all. Tiki websites running version 4 or 5 will have problems with UTF-8 data in the
database.
The data will be double-encoded in the database because:

  • Tiki's PHP code works with UTF-8 content (input from users, for example) and queries the database using this UTF-8 content.
  • PDO, the new abstraction layer, does not know that the content is already in UTF-8 and sends it to MySQL without announcing it as UTF-8 content.
  • MySQL receives this data from Tiki through PDO and assumes it is in the connection's default character set (typically latin1), not UTF-8. Because the underlying DB structure (DB / tables) is in UTF-8, it wrongly converts the data into UTF-8 one more time.


In effect, you store double-encoded UTF-8 data in the database.
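
The mechanism described above can be reproduced outside Tiki. The following is a minimal Python sketch, not Tiki code. It assumes the MySQL connection character set is latin1 and approximates it with ISO-8859-1 (MySQL's latin1 is closer to cp1252, so the exact stored bytes can differ for some characters). The function name store_without_charset is hypothetical.

```python
def store_without_charset(text: str) -> bytes:
    """Model what lands in a UTF-8 column when UTF-8 bytes are sent
    over a connection that the server believes is latin1."""
    sent = text.encode("utf-8")           # bytes the PHP code actually sends
    misread = sent.decode("iso-8859-1")   # server reads each byte as one latin1 character
    return misread.encode("utf-8")        # server converts those characters to UTF-8 for storage


page_name = "让弗朗索瓦"
correct = page_name.encode("utf-8")
stored = store_without_charset(page_name)

print(len(correct))                         # 15
print(len(stored))                          # 30
print(stored.decode("utf-8") == page_name)  # False
```

Under the same assumption, reading the value back over the undeclared connection converts it to latin1 again, which returns the original UTF-8 bytes. That would explain why the Visual Result column still shows OK while the Database Result does not.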

5. Solution(s)

There are many situations:

Tiki2 (and previous) migrated to Tiki3

Tiki3 migrated to Tiki4
Good question
Tiki3 migrated to Tiki4 and to Tiki5
Good question
Tiki3 migrated to Tiki5
  • Modify /db/tiki-db-pdo.php
Diff
Index: db/tiki-db-pdo.php
===================================================================
--- db/tiki-db-pdo.php  (revision 27261)
+++ db/tiki-db-pdo.php  (working copy)
@@ -29,6 +29,8 @@
 try {
        //$dbTiki = new PDO("$db_tiki:host=$host_tiki;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
        $dbTiki = new PDO("$db_tiki:$db_hoststring;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
+       if ($dbTiki->getAttribute(PDO::ATTR_DRIVER_NAME) == 'mysql')
+               $dbTiki->exec("SET CHARACTER SET utf8");
        $dbTiki->setAttribute(PDO::ATTR_CASE,PDO::CASE_NATURAL);
        $dbTiki->setAttribute(PDO::ATTR_ERRMODE,PDO::ERRMODE_WARNING);
        $dbTiki->setAttribute(PDO::ATTR_ORACLE_NULLS,PDO::NULL_EMPTY_STRING);
Tiki4 migrated to Tiki5
  • Make a MySQL backup in latin1 with mysqldump or phpMyAdmin
mysqldump --default-character-set=latin1 -uuser_name -ppassword -h host db_name > dump.sql
  • Recreate database in UTF-8
mysql -uuser_name -ppassword -h host db_name < dump.sql
  • Modify /db/tiki-db-pdo.php
Diff
Index: db/tiki-db-pdo.php
===================================================================
--- db/tiki-db-pdo.php  (revision 27261)
+++ db/tiki-db-pdo.php  (working copy)
@@ -29,6 +29,8 @@
 try {
        //$dbTiki = new PDO("$db_tiki:host=$host_tiki;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
        $dbTiki = new PDO("$db_tiki:$db_hoststring;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
+       if ($dbTiki->getAttribute(PDO::ATTR_DRIVER_NAME) == 'mysql')
+               $dbTiki->exec("SET CHARACTER SET utf8");
        $dbTiki->setAttribute(PDO::ATTR_CASE,PDO::CASE_NATURAL);
        $dbTiki->setAttribute(PDO::ATTR_ERRMODE,PDO::ERRMODE_WARNING);
        $dbTiki->setAttribute(PDO::ATTR_ORACLE_NULLS,PDO::NULL_EMPTY_STRING);
Tiki5 fresh install
  • Make a MySQL backup in latin1 with mysqldump or phpMyAdmin
mysqldump --default-character-set=latin1 -uuser_name -ppassword -h host db_name > dump.sql
  • Recreate database in UTF-8
mysql -uuser_name -ppassword -h host db_name < dump.sql
  • Modify /db/tiki-db-pdo.php
Diff
Index: db/tiki-db-pdo.php
===================================================================
--- db/tiki-db-pdo.php  (revision 27261)
+++ db/tiki-db-pdo.php  (working copy)
@@ -29,6 +29,8 @@
 try {
        //$dbTiki = new PDO("$db_tiki:host=$host_tiki;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
        $dbTiki = new PDO("$db_tiki:$db_hoststring;dbname=$dbs_tiki", $user_tiki, $pass_tiki);
+       if ($dbTiki->getAttribute(PDO::ATTR_DRIVER_NAME) == 'mysql')
+               $dbTiki->exec("SET CHARACTER SET utf8");
        $dbTiki->setAttribute(PDO::ATTR_CASE,PDO::CASE_NATURAL);
        $dbTiki->setAttribute(PDO::ATTR_ERRMODE,PDO::ERRMODE_WARNING);
        $dbTiki->setAttribute(PDO::ATTR_ORACLE_NULLS,PDO::NULL_EMPTY_STRING);
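
The latin1 dump used in the migration steps relies on the fact that converting the stored text back to latin1 yields the bytes the client originally sent. A minimal Python sketch of that round trip for a single value follows. It is an illustration only, not a migration tool: it approximates MySQL's latin1 with ISO-8859-1, and the function name undo_double_encoding is hypothetical.

```python
def undo_double_encoding(stored: bytes) -> str:
    """Return the intended text for raw bytes read from a UTF-8 column.

    Values that do not look double-encoded are decoded as plain UTF-8.
    """
    text = stored.decode("utf-8")
    try:
        # Step back to the bytes originally sent, then read them as UTF-8.
        return text.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Characters outside latin1, or a result that is not valid UTF-8:
        # the value was stored correctly in the first place.
        return text


name = "让弗朗索瓦"
double_encoded = name.encode("utf-8").decode("iso-8859-1").encode("utf-8")

print(undo_double_encoding(double_encoded) == name)        # True
print(undo_double_encoding(name.encode("utf-8")) == name)  # True, already correct
```

This check is a heuristic: a correctly stored value that happens to consist only of latin1 characters forming a valid UTF-8 byte sequence would be converted as well, so results should be reviewed before any data is rewritten.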

