Out of Memory Error vs jek.xmx size
Hi all,
We have some DSS users who get the error message "java.lang.OutOfMemoryError: GC overhead limit exceeded" when trying to join 2 files of about 80k records, with about 5k and 100 columns respectively. We traced this problem and think it might be solved by increasing the jek.xmx setting to 3 GB. We are not completely sure what will happen if we change this: will all JEK sessions be sized at 3 GB (thus lowering the maximum number of JEK sessions), or will this set the maximum size to 3 GB?
(ref: https://doc.dataiku.com/dss/latest/operations/memory.html).
And a second question: we do not have direct access to the setup of DSS; all changes need to be done via (Ansible) scripts. How can we proceed with this?
Thanks in advance.
Answers
-
Hi,
As per the documentation page you linked, please try increasing the jek.xmx value to 3g. As stated in the documentation, this sets the heap size of the JEK processes, so every JEK will now be able to use up to 3 GB.
If you still experience issues, you might consider offloading the job to an external computation engine, such as a database or a cluster (Hadoop or Kubernetes).
As for your second question, this edit can be done via Ansible using the blockinfile module.
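For reference, the change itself boils down to a single setting in DSS's install.ini (the exact path to install.ini depends on your installation; the section and key names are the ones from the documentation page linked above):

    [javaopts]
    jek.xmx = 3g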
Note: This example is provided as is, without any support.
- name: Stop DSS
  shell: /path/to/dss/bin/dss stop >> /some/remote/log.txt

- name: Change xmx setting for JEKs
  blockinfile:
    dest: /path/to/dss/install.ini
    backup: yes
    content: |
      [javaopts]
      jek.xmx = 3g

- name: Regenerate DSS configs
  shell: /path/to/dss/bin/dssadmin regenerate-config >> /some/remote/log.txt

- name: Restart DSS
  shell: /path/to/dss/bin/dss start >> /some/remote/log.txt

Take care,
Omar
Architect @ Dataiku -
Marc Robert
Hello Omar, thank you for your fast reply. We are going to use your suggestion, and I will keep you posted.
-
Perfect!
Please mark the thread as answered if it works for you.
Bye
Omar
Architect @ Dataiku -
The error java.lang.OutOfMemoryError: Java heap space usually indicates that the application has exhausted the memory allocated to the Java heap, often because objects are allocated faster than the garbage collector can reclaim them, or are never released at all. You can try to increase the heap size using JVM options such as -Xms and -Xmx, and then verify the behavior. If the issue still persists, it is most likely caused by excessive object allocation or a memory leak in the application.
In simple terms, the app does not have enough RAM to handle the data it is processing.
Similarly the JVM heap allocated to the JEK process seems to be too small for the workload being executed. In your case, joining datasets with tens of thousands of rows and a large number of columns can significantly increase memory pressure.
If you increase jek.xmx to 3 GB, this sets the maximum heap size per JEK session, not a fixed pre-allocation: the JVM will grow the heap as needed, up to 3 GB. However, this also means each active JEK session can consume up to 3 GB of RAM when required. As a result, if multiple sessions run concurrently, the total memory usage on the server can increase substantially, which may reduce the maximum number of concurrent sessions your infrastructure can safely support. Therefore, before increasing the value, it is important to ensure that the server has sufficient physical memory to handle peak concurrent usage without swapping or instability.
If you are interested in learning about the types of OutOfMemoryError, you can check out this blog: Types of OutOfMemoryError, Causes, and Solutions. You can also check out the blog How to Solve OutOfMemoryError: Java heap space to understand more about this Java heap space error.
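To make the sizing concern above concrete, here is a back-of-the-envelope calculation (a sketch only; the session count and overhead figures are illustrative assumptions, not values from your server or DSS defaults):

```python
# Rough peak-RAM estimate for concurrent JEK sessions.
# All numbers below are illustrative assumptions, not DSS defaults.
jek_xmx_gb = 3           # proposed jek.xmx value
max_concurrent_jeks = 5  # assumed peak number of concurrent jobs
backend_and_os_gb = 8    # assumed headroom for the DSS backend and the OS

peak_gb = jek_xmx_gb * max_concurrent_jeks + backend_and_os_gb
print(f"Plan for at least {peak_gb} GB of RAM")  # 15 GB of JEK heap + 8 GB overhead = 23 GB
```

If the result exceeds the physical RAM of the server, either lower jek.xmx, limit concurrent jobs, or offload the workload to an external engine as suggested above.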