Out of Memory Error vs jek.xmx size

MRvLuijpen
MRvLuijpen Partner, L2 Admin, L2 Designer, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Frontrunner 2022 Participant, Neuron 2023 Posts: 107 Neuron

Hi all,

Some of our DSS users get the error message "java.lang.OutOfMemoryError: GC overhead limit exceeded" when trying to join two files with about 80k records and about 5k and 100 columns, respectively. We traced the problem and think it might be solved by increasing the jek.xmx setting to 3 GB, but we are not completely sure what will happen if we change it. Will every JEK session be sized at 3 GB (thus lowering the maximum number of JEK sessions), or will this only set the maximum size to 3 GB?

(ref: https://doc.dataiku.com/dss/latest/operations/memory.html).

And a second question: we do not have direct access to the DSS setup; all changes need to be made through (Ansible) scripts. How can we proceed with this?

Thanks in advance.

Answers

  • Omar
    Omar Dataiker Posts: 30 Dataiker
    edited July 2024

    Hi,

    As per the documentation page you linked, please try increasing the jek.xmx value to 3g. This sets the maximum Java heap size (Xmx) of each JEK process to 3 GB, and the setting applies to all JEKs.

    If you still experience issues, you might consider offloading the job to an external computation engine, such as a database or a cluster (Hadoop or Kubernetes).

    As for your second question, this edit can be done via Ansible using the blockinfile module.

    Note: This example is provided as is, without any support.

    - name: Stop DSS
      shell: /path/to/dss/bin/dss stop >> /some/remote/log.txt
    - name: Change xmx setting for JEKs
      blockinfile:
        path: /path/to/dss/install.ini
        backup: yes
        # The block is written to install.ini between Ansible marker comments
        block: |
          [javaopts]
          jek.xmx = 3g
    - name: Regenerate DSS configs
      shell: /path/to/dss/bin/dssadmin regenerate-config >> /some/remote/log.txt
    - name: Restart DSS
      shell: /path/to/dss/bin/dss start >> /some/remote/log.txt
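
    As a possible alternative to the blockinfile task above, here is a minimal sketch using the ini_file module, which sets a single key inside a named section rather than appending a marked block. This is an assumption, not something taken from the Dataiku documentation: it presumes the community.general collection is installed on the control node, and it reuses the placeholder path from the example above.

    ```yaml
    # Sketch only: sets jek.xmx under [javaopts] in install.ini.
    # The path is a placeholder; community.general must be installed.
    - name: Set jek.xmx in the javaopts section of install.ini
      community.general.ini_file:
        path: /path/to/dss/install.ini
        section: javaopts
        option: jek.xmx
        value: 3g
        backup: true
    ```

    If install.ini already contains a [javaopts] section, this updates the key in place instead of adding a second section, and rerunning the playbook leaves the file unchanged. The stop, regenerate-config, and start steps would stay the same.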

    Take care,

    Omar
    Architect @ Dataiku

  • MRvLuijpen
    Hello Omar, thank you for your fast reply. We are going to use your suggestion, and I will keep you posted.

  • Omar
    Omar Dataiker Posts: 30 Dataiker

    Perfect!

    Please mark the thread as answered if it works for you.

    Bye

    Omar
    Architect @ Dataiku
