odoo/odoo#97692

Created by Fabien Pinckaers (fp)
label
odoo-dev:master-lang-json-str-cwg
head
88f47804abc67339519d7a7f2dd3764da7121458
target
master
merged
2 years ago by Framework (ORM), Raphael Collet

[IMP] core: store translated fields as JSONB columns

Translated fields no longer use the model ir.translation.  Instead, their values
are stored as JSON in JSONB columns in the model's
table.  The field's column value is either NULL or a JSON dict mapping language
codes to text (the field's value in the corresponding language), and must
contain an entry for key 'en_US' (as it is used as a fallback for all other
languages).  Empty text is allowed in translation values, but not NULL.

Here are examples for a field with translate=True:

    `NULL`
    `{"en_US": "Foo"}`
    `{"en_US": "Foo", "fr_FR": "Bar", "nl_NL": "Baz"}`
    `{"en_US": "Foo", "fr_FR": "", "nl_NL": "Baz"}`

Like before, writing False to the field makes it NULL, i.e., False in all
languages.  However, writing "" to the field makes its value empty in the
current language, but does not discard the values in the other languages.
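
To make the write semantics above concrete, here is a minimal sketch in plain
Python.  It is not the ORM code: the function name `write_translated` is
hypothetical, and seeding 'en_US' when it is missing is an assumption derived
from the rule that the key 'en_US' must always be present.

```python
def write_translated(stored, lang, value):
    """Return the new column value after writing `value` in language `lang`.

    `stored` is None (SQL NULL) or a dict mapping language codes to text.
    """
    if value is False or value is None:
        # Writing False makes the column NULL, i.e. False in all languages.
        return None
    new = dict(stored or {})
    # Writing "" only empties the current language; other languages are kept.
    new[lang] = value
    # Assumption: if there is no 'en_US' entry yet, seed it with the written
    # value so that the fallback key always exists.
    new.setdefault("en_US", value)
    return new
```

With this sketch, writing "" in fr_FR on `{"en_US": "Foo", "fr_FR": "Bar"}`
keeps "Foo" for en_US and stores "" for fr_FR, while writing False returns None.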

Here are examples for a field with translate=xml_translate:

    `NULL`
    `{"en_US": "<div>Foo<p>Bar</p></div>", "fr_FR": "<div>Fou<p>Barre</p></div>"}`

Change for callable(translate) fields: one can now write any value in any
language on such a field.  The new value will be adapted in all languages, based
on the mapping of terms between languages in the old values.  Basically the
structure of the value must remain the same in all languages, like before.
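
The idea of adapting a written value in all languages can be sketched as
follows.  This is a deliberately simplified illustration, not the framework's
algorithm: the helper names `get_terms` and `adapt_translations` are
hypothetical, terms are assumed to be the text nodes between tags, and old terms
are paired across languages by position.  Leaving a new, untranslated term as
written in the other languages is also an assumption.

```python
import re

# Simplification: a "term" is any non-empty text between two tags.
TERM_RE = re.compile(r">([^<>]+)<")


def get_terms(value):
    """Return the list of terms found in `value`, in document order."""
    return TERM_RE.findall(value)


def adapt_translations(old_values, lang, new_value):
    """Return the new lang->value dict after writing `new_value` in `lang`."""
    old_terms = get_terms(old_values[lang])
    new_values = {}
    for other_lang, other_value in old_values.items():
        if other_lang == lang:
            new_values[other_lang] = new_value
            continue
        # Map each old term in `lang` to its counterpart in `other_lang`.
        mapping = dict(zip(old_terms, get_terms(other_value)))
        # Terms without a known translation are kept as written.
        new_values[other_lang] = TERM_RE.sub(
            lambda m: ">" + mapping.get(m.group(1), m.group(1)) + "<",
            new_value,
        )
    return new_values
```

For instance, writing `<div>Foo<p>Baz</p></div>` in en_US on the xml_translate
example above keeps "Fou" in the fr_FR value and leaves the new term "Baz"
untranslated there.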

Reading a translated field is now both simpler and faster than with the former
implementation.  We fetch the value of the field in the current language by
coalescing its value with the 'en_US' value of the field:

    `SELECT id, COALESCE(name->>'fr_FR', name->>'en_US') AS name ...`
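
The same fallback can be mirrored in plain Python on the JSON dict, as a rough
sketch (the function name `read_translated` is hypothetical).  Like COALESCE, it
only falls back when the language key is missing, so an empty translation ""
is returned as is.

```python
def read_translated(stored, lang):
    """Mimic COALESCE(col->>lang, col->>'en_US') on a column value.

    `stored` is None (SQL NULL) or a dict mapping language codes to text.
    """
    if stored is None:
        return None
    value = stored.get(lang)
    if value is None:
        # Missing language: fall back to the 'en_US' entry.
        value = stored.get("en_US")
    return value
```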

The raw cache of the field contains either None or a dict which is conceptually
a subset of the JSON value in the database (except for missing languages).  For the
sake of simplicity, most cache operations deal with the dict and return the text
value in the current language.

Trigram indexes have been adapted to the new storing strategy, and should make
it possible to search in any language.  Before this change, only the source value of the
field ('en_US') could be indexed.

Computed stored translated fields are not supported by the framework, because of
the complexity of the computation itself: the field would need to be computed in
all active languages.  We chose to not provide any hook to compute a field in
all languages at once, and the framework always invokes a compute method once to
recompute it.

Code translations are no longer stored into the database.  They become static,
and are extracted from the PO files when needed.  The worker simply uses a cache
with extracted code translations for performance.  This is reasonable, since
fr_FR code translations for all modules take around 2MB of memory, and the
cache can be shared among all registries in the worker.  Changing code
translations requires updating the corresponding PO file and reloading the
worker(s).
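
The caching strategy can be pictured with the sketch below.  It is only an
illustration under stated assumptions: `PO_FILES`, `parse_po` and
`get_code_translations` are hypothetical names, the in-memory dict stands in for
PO files on disk, and the parser only handles single-line entries without
escaped quotes.

```python
import functools
import re

# Stand-in for PO files on disk, keyed by (module, lang).
PO_FILES = {
    ("base", "fr_FR"): 'msgid ""\nmsgstr ""\n\nmsgid "Save"\nmsgstr "Enregistrer"\n',
}

# Simplification: only single-line msgid/msgstr pairs are recognized.
PO_ENTRY_RE = re.compile(r'msgid "(.*)"\nmsgstr "(.*)"')


def parse_po(text):
    """Return a source->translation dict, skipping the header and empty entries."""
    return {src: dst for src, dst in PO_ENTRY_RE.findall(text) if src and dst}


@functools.lru_cache(maxsize=None)
def get_code_translations(module, lang):
    """Extract code translations on first use, then serve them from the cache."""
    return parse_po(PO_FILES.get((module, lang), ""))
```

Because the cache lives at process level, it is shared by every caller in the
worker; in this sketch, `get_code_translations.cache_clear()` plays the role of
reloading the worker after a PO file has changed.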

Performance summary:
(+) reading 'model' translated fields is faster
(+) reading 'model_terms' translated fields is much faster (no need to inject
     translations into the source value)
(+) searching translated fields with operator 'ilike' is much faster when the
     field is indexed with 'trigram'
(+) updating translated fields requires less ORM flushing
(-) importing translations from PO files is 2x slower

Some extra fixes:
- make field 'name' of ir.actions.actions translated; because of the PG
   inheritance, this is necessary to make the column definition consistent in
   all models that inherit from ir.actions.actions.
- add some backend API for the web/website client for editing translations
- move methods get_field_string() to model ir.model.fields
- move _load_module_terms to model ir.module.module
- adapt tests in test_impex, test_new_api
- because env.lang is injected into SQL queries, its returned value is
   now guaranteed to correspond to a valid active language or None
- remove wizard to insert missing translations (no longer makes sense)

task-id: 2081307