Scoring Error

AaronCrouch
AaronCrouch Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered, Dataiku Frontrunner Awards 2021 Winner, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant Posts: 18 ✭✭✭✭✭

I have been encountering an error in the scoring recipe over the last few days. I have already reverted my flow to the way it was the day before the error started, and I have retrained my model, but the error still occurs. The relevant section of the logs is attached. Can anyone see what is causing the error?

Best Answer

  • AaronCrouch
    Answer ✓

    I have no Jupyter Notebooks running, but it was indeed a RAM problem. I had to turn off individual explanations to get the scoring recipe to run, and we are working to add more RAM so we can use the explanations again.
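
For anyone hitting the same thing, a rough pre-check along these lines can help judge whether a large dense array would fit in the RAM that is currently free. This is only a minimal sketch, not a Dataiku feature: the function names, the 8-byte element size (float64), and the 80% headroom factor are all assumptions made for illustration, and the parsing assumes the Linux `/proc/meminfo` format.

```python
import math


def dense_array_bytes(shape, itemsize=8):
    """Bytes needed for a dense array of `shape` with `itemsize`-byte elements.

    itemsize=8 assumes float64; pass 4 for float32, and so on.
    """
    return math.prod(shape) * itemsize


def mem_available_bytes(meminfo_text):
    """Return MemAvailable in bytes from /proc/meminfo text, or None if absent.

    The kernel reports the value in kB, so it is multiplied by 1024.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024
    return None


def fits_in_memory(shape, meminfo_text, itemsize=8, headroom=0.8):
    """True if the array needs no more than `headroom` of the available RAM.

    The 0.8 headroom is an arbitrary illustrative safety margin.
    """
    needed = dense_array_bytes(shape, itemsize)
    available = mem_available_bytes(meminfo_text)
    return available is not None and needed <= available * headroom
```

On a Linux host you would pass in the contents of `/proc/meminfo`, for example `fits_in_memory((1_000_000, 500), open("/proc/meminfo").read())`. The shape here is made up; the real array sizes depend on the dataset and the model.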

Answers

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,601 Neuron
    edited July 17

    @AaronCrouch

    From a quick look, it appears that you might have run out of memory.

    [2021/03/31-14:37:13.035] [null-err-42] [INFO] [dku.utils] - return np.zeros(self.shape, dtype=self.dtype, order=order)
    [2021/03/31-14:37:13.036] [null-err-42] [INFO] [dku.utils] - MemoryError
    [2021/03/31-14:37:14.831] [FRT-35-FlowRunnable] [WARN] [dku.resource] - stat file for pid 71956 does not exist. Process died?

    Given that you have reverted to a known good model, and assuming the input data has not changed in any significant way, all I have to offer is that I had a similar problem a little while back. I went to the OS of the Dataiku Design node I was working on and eventually discovered that the available memory had gotten low. After further investigation, I also discovered that we had a bunch of Python-based Jupyter Notebooks with large datasets loaded that were consuming most of the working RAM on our little design node. Could that be related to your problem?

    That's all I've got. Good luck. Maybe a support ticket is warranted?

  • AaronCrouch

    That makes a lot of sense. This started around the time I took the certification exams, which involved uploading a massive amount of data to my server. I just deleted all of the exam files and I am still getting the error. Perhaps it is worth a support ticket.

  • tgb417

    @AaronCrouch,

    In my case, the problem was not files on disk but large Jupyter Notebooks held in RAM, with large datasets loaded into pandas DataFrames.
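
To check whether something like idle notebook kernels is holding RAM on the node, one option is to list processes by resident memory. The following is a hedged sketch that reads the Linux `/proc` filesystem with the standard library only; the function names are hypothetical, it returns an empty list on hosts without `/proc`, and it silently skips processes it cannot read.

```python
import os


def parse_status(status_text):
    """Extract (name, rss_bytes) from the text of a /proc/<pid>/status file.

    Kernel threads have no VmRSS line, so their resident size is reported as 0.
    """
    name, rss_kb = None, 0
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            parts = line.split(None, 1)
            name = parts[1].strip() if len(parts) > 1 else ""
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])  # value is reported in kB
    return name, rss_kb * 1024


def top_memory_processes(limit=10, proc_root="/proc"):
    """Return up to `limit` (rss_bytes, pid, name) tuples, largest first."""
    rows = []
    if not os.path.isdir(proc_root):
        return rows  # not a Linux host, nothing to inspect
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                name, rss = parse_status(f.read())
        except OSError:
            continue  # process exited or access was denied
        rows.append((rss, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]


for rss, pid, name in top_memory_processes(limit=5):
    print(f"{pid:>8}  {rss / 1024**2:10.1f} MiB  {name}")
```

Python kernels near the top of this list while no recipe is running would be consistent with the notebook situation described above, though the names alone do not prove which notebook owns a process.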
