I have attempted to create an internationalization system for my PHP framework that I'm working on.
For the purpose I decided I will be using MySQL to store languages and translations. Respectively I have the tables
languages id | name ----+-------- 1 | English 2 | Spanish translations id | 1 | 2 | n ... ---+---------------+-------------+------ 1 | Hello, $user | Hola, $user | ... 2 | Friend | Amigo | ...
I was told that this is not a normalized database structure. I think that this structure is easier to use, because if I were to add a column lang_id
where I'd store the id of the language the string represents, then when I'm selecting I'd have to do
SELECT string FROM translations WHERE id = ? AND lang_id = {$lang_id}
Instead of what I currently use
SELECT {$lang_id} FROM translations WHERE id = ?
It looks simpler to me, also avoids the use of complex primary keys and uses less memory too. If I'm not correct please don't hesitate to criticize!
I have created two implementations of the logic class, one using eval
in its base, other one using preg_replace_callback
.
First implementation: Eval
class Translator {
private $stmt = null;
private static $instance;
public static function getInstance(){
if(!self::$instance){
self::$instance = new self;
}
return self::$instance;
}
private function __construct() {
$this->stmt = \X\Database::connection()->prepare('SELECT '.$_SESSION['lang_id'].' FROM translations WHERE id = ?')->getStmt();
}
public function get($id, $variables){
$this->stmt->execute(array($id));
foreach($variables as $k => $v){
$$k = $v;
}
unset($id); unset($variables); unset($k); unset($v);
return eval('return "'.addslashes($this->stmt->fetch(\PDO::FETCH_NUM)[0]).'";');
}
}
Example usage in a view, assuming $t = \X\Translator::getInstance();
With variables
<h1><?= $t->get(1, array('user'=>'John Doe')) ?></h1>
Without variables
<h1><?= $t->get(2) ?></h1>
Second implementation: Regular Expressions
class Translator {
public static function init(){
ob_start();
register_shutdown_function('\X\Translator::translateDocument');
}
public static function translateDocument(){
$t = \X\Database::connection()->prepare('SELECT '.$_SESSION['lang_id'].' FROM translations WHERE id = ?')->getStmt();
echo preg_replace_callback('/\{\{\s*(\d+)\s*(?:,(.+?))?\s*}}/', function($matches) use($t){
$t->execute(array($matches[1]));
if(count($matches) > 2){
$keys = array();
$vals = array();
preg_replace_callback('/([^\s]+):\s*([^\;]+);/', function($matchesInner) use(&$keys, &$vals){
$keys[] = $matchesInner[1];
$vals[] = $matchesInner[2];
}, $matches[2]);
return str_replace($keys,$vals,$t->fetch(\PDO::FETCH_NUM)[0]);
}else{
return $t->fetch(\PDO::FETCH_NUM)[0];
}
}, ob_get_clean());
}
}
Example usage, assuming that \X\Translator::init()
has been called
With variables
<h1>{{ 1, $user: John Doe; }}</h1>
Without variables
<h1>{{ 2 }}</h1>
Finally
I ran a benchmark consisting of 5 runs of 1000 translations each, showing the statistics
#Average execution time eval: 0.65 sec regex: 1.15 sec #Memory used eval: 240 kb regex: 253 kb
-
\$\begingroup\$ about the database: it will be a mess when it will have 20 languages. \$\endgroup\$Marco Acierno– Marco Acierno2014年08月21日 13:51:35 +00:00Commented Aug 21, 2014 at 13:51
-
\$\begingroup\$ @MarcoAcierno It will have a lot of columns, why is that considered a mess? Otherwise it will have 20 times the rows it has this way :? \$\endgroup\$php_nub_qq– php_nub_qq2014年08月21日 13:56:57 +00:00Commented Aug 21, 2014 at 13:56
-
\$\begingroup\$ So you like this and this? \$\endgroup\$Marco Acierno– Marco Acierno2014年08月21日 14:04:03 +00:00Commented Aug 21, 2014 at 14:04
-
2\$\begingroup\$ Why don't you just use gettext? Internationalization is a very hard problem, it's not just about replacing texts, you also need to care about grammar, some languages changes cases when the subject is male or female, localizing dates are just horrible, currency only a bit better, and there is a whole lot of subtleties. Why reinvent the wheel? \$\endgroup\$Lie Ryan– Lie Ryan2014年08月21日 16:11:01 +00:00Commented Aug 21, 2014 at 16:11
-
1\$\begingroup\$ And i would avoid to have PHP code, this $user looks so bad. \$\endgroup\$Marco Acierno– Marco Acierno2014年08月21日 16:40:09 +00:00Commented Aug 21, 2014 at 16:40
1 Answer 1
Creating the translations
table with an indefinite number of weirdly named columns is a bad idea. Subsequently adding a language will be awkward: it involves ALTER
ing the table, then a series of UPDATE
s.
I suggest a table with normalization...
CREATE TABLE translations (
id INTEGER NOT NULL,
lang_id INTEGER NOT NULL,
string VARCHAR(???),
PRIMARY KEY (id, lang_id),
FOREIGN KEY (lang_id) REFERENCES languages (id)
);
Then, for your programming convenience, create a view for each language that you support.
CREATE OR REPLACE VIEW en AS
SELECT string
FROM translations AS t
JOIN languages AS l
ON t.lang_id = l.id
WHERE l.name = 'English';
Alternatively, define a stored procedure to satisfy your laziness.
That way, your schema stays normalized. Adding a view is less risky than altering a table. Even your queries read better:
SELECT string FROM en WHERE id = ?;
However, for performance, I think you would be better off querying and caching all of the strings (probably a few hundred — no big deal) in PHP than issuing a separate query per string.
-
\$\begingroup\$ What do you mean
less risky than altering a table
. What are the risks? \$\endgroup\$php_nub_qq– php_nub_qq2014年08月21日 20:03:15 +00:00Commented Aug 21, 2014 at 20:03 -
3\$\begingroup\$ @php_nub_qq DBAs have a general aversion to schema changes.
ALTER TABLE
can be mildly problematic — for example, in MySQL, it breaks transactions; DDL changes cannot be rolled back.ALTER TABLE
requires more privileges thanINSERT
.ALTER TABLE
takes a lock, which would be bad if the table were huge. Mainly, I make this recommendation because supporting a new languages should be a conceptually additive operation, and therefore should be better done in a way that doesn't involve a schema change. \$\endgroup\$user50399– user503992014年08月21日 21:05:09 +00:00Commented Aug 21, 2014 at 21:05
Explore related questions
See similar questions with these tags.