I'm building a website that I plan to offer in multiple languages: not only the UI, but the content too.
I have several tables with text columns such as "title", "name", "description", "body" and so on. What's the best way to store translations of them? Will I have to create an additional table for every table that has text data I want to translate? For instance:
articles(id)
articles_content(article_id, title, description, body, language_id)
comments(id)
comments_content(comment_id, body, language_id)
And so on for each table I want to translate.
Are there any downsides to this solution?
Is there a better, yet simpler, way?
- I'd have thought the db structure didn't matter (to the scope of this question) as long as each language conformed to the schema and data types provided. (Hex, Feb 6, 2018 at 16:22)
- It depends if you are more in the case "all articles will be available in all languages" or in the case "the amount of translations for an article will vary a lot from an article to another". The alternative method would be to put the language tag (please use RFC5646) inside the articles_content/comments_content and to have a primary composite key on (article id, language tag). Also have a look at how CMS deal with that, for example in SPIP or Drupal. Studying existing code could give you ideas... (Patrick Mevzek, Feb 6, 2018 at 16:32)
- Adding to previous comment: depends on how/how often new languages are added. If you have a table per language, adding a language means making changes in the database schema, which is better to avoid if possible. So a schema where adding a new language/translation is just adding new contents in some existing tables is a better design. (Patrick Mevzek, Feb 6, 2018 at 16:33)
- Also: do you need to link articles/comments in various languages together? Like saying content X in English is the translation of content Y in German? That has impacts on your DB design. (Patrick Mevzek, Feb 6, 2018 at 16:33)
- @Raj, This is similar to the approach we've taken for an enterprise publishing company. However, we would have had language_id in the articles table, not articles_content. The reason for this is that articles were not always written for all languages, and if they were, we'd just have separate articles. (raterus, Feb 6, 2018 at 21:14)
2 Answers
Create each "table" as a base table (with the non-translatable elements) and a translation table. The translation table will contain the primary key of the base table and the language ID, along with the translated columns. You can then create a view layer that joins the two together for a particular language, and have the UI query those views. This avoids maintaining duplicate copies of the non-translatable elements.
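As a rough illustration of the base table, translation table and view layer described above, here is a minimal sketch. It uses Python's built-in sqlite3 module only so that it runs without a database server; the `created_at` column, the `articles_localized` view name, the `article_titles` helper and the sample rows are all hypothetical, and the table names follow the question rather than any confirmed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table: non-translatable columns only (created_at is a made-up example).
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        created_at TEXT NOT NULL
    );

    -- Translation table: keyed by the base table's primary key plus the language.
    CREATE TABLE articles_content (
        article_id INTEGER NOT NULL REFERENCES articles (id),
        language_id TEXT NOT NULL,
        title TEXT NOT NULL,
        description TEXT NOT NULL,
        body TEXT NOT NULL,
        PRIMARY KEY (article_id, language_id)
    );

    -- View layer: one row per (article, language) pair.
    CREATE VIEW articles_localized AS
    SELECT a.id, a.created_at, c.language_id, c.title, c.description, c.body
    FROM articles AS a
    JOIN articles_content AS c ON c.article_id = a.id;

    -- Sample data, invented for this sketch.
    INSERT INTO articles (id, created_at) VALUES (1, '2018-02-06');
    INSERT INTO articles_content VALUES
        (1, 'en', 'Hello', 'A greeting', 'Hello, world'),
        (1, 'de', 'Hallo', 'Ein Gruss', 'Hallo, Welt');
""")


def article_titles(language_id):
    """Return (id, title) pairs for every article available in one language."""
    rows = conn.execute(
        "SELECT id, title FROM articles_localized WHERE language_id = ? ORDER BY id",
        (language_id,),
    )
    return rows.fetchall()


print(article_titles("de"))  # [(1, 'Hallo')]
```

The UI only ever filters the view by language, so the non-translatable columns live in exactly one place. An article with no row for the requested language simply does not appear; whether to fall back to a default language instead is a separate design decision.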
I would recommend against a separate multilingual table per base table. Instead, have a single table representing a multilingual string and a single table storing the value of that string in each language.
For example, the create table statements for Postgres would be:
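-- Example language codes only (an assumption); list the languages you support.
create type lang as enum ('en', 'de');

-- One row per translatable string, whichever table and column it belongs to.
create table multilingual (
    id serial primary key
);

-- The value of each multilingual string in each language.
create table string (
    id int not null references multilingual (id),
    lang lang not null, -- custom lang type defined as enum
    contents text not null,
    primary key (id, lang)
);

-- Created last, because the table it references must already exist.
create table articles (
    id serial primary key,
    title int not null references multilingual (id),
    description int not null references multilingual (id),
    body int not null references multilingual (id)
);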
This approach involves more joins than a multilingual table per base table (one join per translatable column rather than one per table), so it could degrade performance unacceptably. However, it keeps the schema simpler.
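To make the join cost concrete, here is a minimal sketch of reading one article in one language from the single-string-table schema. It runs on Python's built-in sqlite3 module rather than Postgres, so the `lang` column is plain text instead of an enum; the `fetch_article` helper and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE multilingual (id INTEGER PRIMARY KEY);
    CREATE TABLE string (
        id INTEGER NOT NULL REFERENCES multilingual (id),
        lang TEXT NOT NULL,  -- plain text here; SQLite has no enum type
        contents TEXT NOT NULL,
        PRIMARY KEY (id, lang)
    );
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        title INTEGER NOT NULL REFERENCES multilingual (id),
        description INTEGER NOT NULL REFERENCES multilingual (id),
        body INTEGER NOT NULL REFERENCES multilingual (id)
    );

    -- Sample data, invented for this sketch.
    INSERT INTO multilingual (id) VALUES (1), (2), (3);
    INSERT INTO string (id, lang, contents) VALUES
        (1, 'en', 'Hello'),        (1, 'de', 'Hallo'),
        (2, 'en', 'A greeting'),   (2, 'de', 'Ein Gruss'),
        (3, 'en', 'Hello, world'), (3, 'de', 'Hallo, Welt');
    INSERT INTO articles (id, title, description, body) VALUES (1, 1, 2, 3);
""")


def fetch_article(article_id, lang):
    """Resolve all three translatable columns: one join on string per column."""
    return conn.execute(
        """
        SELECT t.contents, d.contents, b.contents
        FROM articles AS a
        JOIN string AS t ON t.id = a.title AND t.lang = :lang
        JOIN string AS d ON d.id = a.description AND d.lang = :lang
        JOIN string AS b ON b.id = a.body AND b.lang = :lang
        WHERE a.id = :article_id
        """,
        {"article_id": article_id, "lang": lang},
    ).fetchone()


print(fetch_article(1, "de"))  # ('Hallo', 'Ein Gruss', 'Hallo, Welt')
```

Three translatable columns mean three joins against `string`, where a per-table translation table would need one. Because these are inner joins, an article missing any one translation returns no row at all; left joins would be the alternative if partial translations should still be shown.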