One of our data scientists is running jobs on some very large datasets. She complained earlier that they were very slow, so I increased backend.xmx to 12g. The jobs still run, but they take a long time to process.
In the job log I am seeing the messages below. Do they indicate an error?
5406.922: [GC (Allocation Failure) [PSYoungGen: 670569K->12608K(676864K)] 2011527K->1367487K(2075136K), 0.0228449 secs] [Times: user=0.41 sys=0.15, real=0.02 secs]
5408.211: [GC (Allocation Failure) [PSYoungGen: 669504K->13950K(676864K)] 2024383K->1381110K(2075136K), 0.0185378 secs] [Times: user=0.40 sys=0.04, real=0.02 secs]
[2022/09/20-14:49:28.786] [Thread-28] [INFO] [dku.format.csv] - CSV Emitted 252200000 lines from file, 29 columns - interned: 252256275 MEM: 100.0%.
Operating system used: CentOS
Hey @Khalid538 ,
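The "[GC (Allocation Failure)" lines are garbage collection events: the JVM could not allocate a new object in the young generation, so it ran a collection to free space. On their own they are not errors. If there are many of them, though, it could indicate that the Dataiku instance is under heavy memory pressure, and the MEM: 100.0% does indicate a problem.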
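To make those GC lines easier to read, here is a minimal Python sketch that parses them and reports heap occupancy after each collection. It is not a Dataiku tool; the regular expression and the function name `parse_gc_line` are my own and only assume the ParallelGC log format shown in the question.

```python
import re

# Pattern for the young-generation GC lines quoted in the question.
# Format: <uptime>: [GC (<cause>) [PSYoungGen: before->after(capacity)]
#         heap_before->heap_after(heap_capacity), <pause> secs]
GC_LINE = re.compile(
    r"(?P<ts>\d+\.\d+): \[GC \((?P<cause>[^)]+)\) "
    r"\[PSYoungGen: (?P<y_before>\d+)K->(?P<y_after>\d+)K\((?P<y_cap>\d+)K\)\] "
    r"(?P<h_before>\d+)K->(?P<h_after>\d+)K\((?P<h_cap>\d+)K\), "
    r"(?P<pause>\d+\.\d+) secs\]"
)


def parse_gc_line(line):
    """Return a dict of GC metrics, or None if the line is not a GC line."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    heap_after_kb = int(m["h_after"])
    heap_capacity_kb = int(m["h_cap"])
    return {
        "uptime_s": float(m["ts"]),
        "cause": m["cause"],
        "heap_after_kb": heap_after_kb,
        "heap_capacity_kb": heap_capacity_kb,
        # Share of the committed heap still in use after the collection.
        "heap_used_pct": round(100.0 * heap_after_kb / heap_capacity_kb, 1),
        "pause_s": float(m["pause"]),
    }


if __name__ == "__main__":
    sample = [
        "5406.922: [GC (Allocation Failure) [PSYoungGen: 670569K->12608K(676864K)] "
        "2011527K->1367487K(2075136K), 0.0228449 secs]",
        "5408.211: [GC (Allocation Failure) [PSYoungGen: 669504K->13950K(676864K)] "
        "2024383K->1381110K(2075136K), 0.0185378 secs]",
    ]
    for line in sample:
        info = parse_gc_line(line)
        capacity_gib = info["heap_capacity_kb"] / (1024 * 1024)
        print(
            f"t={info['uptime_s']}s cause={info['cause']} "
            f"heap after GC={info['heap_used_pct']}% of {capacity_gib:.2f} GiB "
            f"pause={info['pause_s']}s"
        )
```

One thing worth noticing when reading the numbers: the heap capacity in parentheses is about 2 GB, not 12 GB. That suggests these lines come from a JVM other than the one sized by backend.xmx, but treat that as a reading of the log rather than a confirmed diagnosis.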
I recommend that you open a ticket with Dataiku support (support@dataiku.com) with logs and job diags attached, please!
To get a Job Diag:
From the job page, click on Actions > Download job diagnosis.
If the resulting file is too large for mail (> 15 MB), you can use https://dl.dataiku.com to send it to us. Please don't forget to send the link that is generated when you upload the file.
Emma
Thanks Emma for the response.
I did open a support ticket with Dataiku. The engineer suggested the following, but I am still not clear on how to resolve this issue.
Response:
As mentioned in the support ticket you raised, pulling that amount of data into the DSS engine will cause out-of-memory (OOM) errors. Sync the data into a relational database first and then run the group recipe with the In-database engine (so the computation happens in the database). Alternatively, use the Spark engine to offload the computation.
Side note: when the DSS engine is used for group recipes on local or cloud filesystem connections, all the data is copied into DSS's internal H2 database, which takes up local disk space. Be careful with this, as you can easily run out of space.
The recommendation remains the same: In-database or Spark engine.
To add to my previous reply: increasing backend.xmx will not help either, because the crash is happening in the JEK (the Job Execution Kernel, a separate process from the backend).
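For anyone unsure what "the computation happens in the database" means in practice, here is a small Python sketch of the principle. It is not the Dataiku API: SQLite stands in for the relational database, and the table `sales`, its columns, and the function `grouped_totals` are made up for illustration. The point is that the GROUP BY runs inside the database engine, so only one row per group comes back to the client instead of every input row.

```python
import sqlite3


def grouped_totals(conn):
    """Aggregate inside the database and return one row per category."""
    cursor = conn.execute(
        "SELECT category, COUNT(*), SUM(amount) "
        "FROM sales GROUP BY category ORDER BY category"
    )
    return cursor.fetchall()


# Hypothetical data: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("a", 1.0), ("b", 2.5), ("a", 3.0)],
)
conn.commit()

# Only the aggregated rows cross the connection, not the raw rows.
result = grouped_totals(conn)
print(result)
```

With hundreds of millions of input rows, the client-side memory needed scales with the number of groups rather than the number of rows, which is why the In-database (or Spark) engine avoids the OOM described above.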