odoo/upgrade-util#288

Created by Upgrade, Carsten Wolff (cawo)

Blocked

label: odoo-dev:master-imp_jinja_to_qweb_mem_use-cawo
head: 498e2c10ce617d9dad84884fcdf0a48bf0476dc8
odoo/upgrade-util master #288: missing r+

[IMP] jinja_to_qweb: avoid MemoryError

upg-2994884

Avoid a MemoryError and/or a killed process caused by malloc() failing within lxml / libxml2.

Debugging this by measuring the size of the data structures involved with sys.getsizeof() showed that:
1. The global variable templates_to_check grows to roughly 1.4GiB after the various calls to upgrade_jinja_fields() made by upgrade scripts. The process uses ~1.5GiB at that point.
2. At the start of verify_upgraded_jinja_fields(), the process is still at ~1.5GiB. While iterating over the templates in templates_to_check, no significant amount of memory is allocated on top of this for Python data structures. However, with each call to is_converted_template_valid(), the size of the process increases until it hits the RLIMIT. This function calls into lxml multiple times, which suggests that the memory is allocated by malloc() calls in the C library, evading Python's accounting. Internet research shows that lxml has a long history of memory leaks in its C code, plus a caching mechanism across documents that could be responsible [1]. More recent versions of the module seem to have improved, but we are still stuck with old versions.
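
A minimal sketch of how such a comparison can be made, not the code used for the investigation. The helper names `deep_getsizeof` and `peak_rss` are hypothetical. The first one sums sys.getsizeof() over nested containers, since sys.getsizeof() alone only reports the shallow size of an object. The second one reads the peak resident set size of the process, which also includes memory allocated by C libraries:

```python
import resource
import sys


def deep_getsizeof(obj, seen=None):
    """Sum sys.getsizeof() over obj and the built-in containers it holds.

    Hypothetical helper: it only descends into dict, list, tuple, set and
    frozenset, and counts each object once even if it is referenced twice.
    """
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(
            deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
            for key, value in obj.items()
        )
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size


def peak_rss():
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
```

If the process-level figure keeps growing while the Python-level figure stays flat, the extra memory is being allocated outside Python's accounting, which matches observation (2).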

This patch solves / works around (2) by running the body of is_converted_template_valid() in a forked subprocess that is killed immediately afterwards (instead of using a process pool). No additional memory is therefore ever allocated by lxml in the main process, and its memory use stays at ~1.5GiB for the duration of verify_upgraded_jinja_fields(). We also add a final line to that function that clears the global data once it has been processed, at which point the process's memory use drops to 100MiB.
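
The general technique can be sketched as follows. This is an illustration under assumptions, not the code of this patch: `run_in_fork` and `_child` are hypothetical names, and the sketch uses the standard multiprocessing module with the "fork" start method (POSIX only) and a one-way pipe to return the result.

```python
import multiprocessing


def _child(conn, func, args):
    # Runs in the forked child: compute the result and send it to the parent.
    try:
        conn.send((True, func(*args)))
    except Exception as exc:
        conn.send((False, repr(exc)))
    finally:
        conn.close()


def run_in_fork(func, *args):
    """Run func(*args) in a short-lived forked child and return its result.

    Memory allocated by func (including by C libraries) belongs to the child
    and is returned to the OS when the child exits.
    """
    ctx = multiprocessing.get_context("fork")
    recv_conn, send_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child, args=(send_conn, func, args))
    proc.start()
    send_conn.close()  # the parent keeps only the receiving end
    try:
        ok, payload = recv_conn.recv()
    except EOFError:
        # The child died (for example, it was killed) before sending anything.
        ok, payload = False, "child exited without a result"
    finally:
        recv_conn.close()
        proc.join()
    if not ok:
        raise RuntimeError(payload)
    return payload
```

A call such as `run_in_fork(check, template)` then replaces a direct `check(template)` call, where `check` stands for whatever function does the memory-hungry work. One child per call costs a fork each time, but unlike a long-lived pool worker it cannot accumulate leaked memory across calls.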

This patch does not solve (1). Solving (2) is enough to fully process upg-2994884, but a future upgrade request may require a fix for (1), too. That will probably be a bigger patch, because I think a viable solution would mean removing the global templates_to_check and storing the data in the database instead.

[1] https://benbernardblog.com/tracking-down-a-freaky-python-memory-leak-part-2/