<class 'json.decoder.JSONDecodeError'> when evaluating a deployed Random Forest model
How to replicate:
On Windows 10, download the latest Dataiku DSS on-premise version (13.2.3).
Create a new project and upload any dataset with a binary "target" column.
Click the dataset - Lab - AutoML Prediction - Quick Prototype - Train a Random Forest model on "target", using default settings.
Deploy the Random Forest model.
Click the deployed model and use an Evaluate recipe. Keep all defaults and run the recipe; you then get:
Error message:
[15:51:26] [INFO] [dku.utils] - *************** Recipe code failed **************
[15:51:26] [INFO] [dku.utils] - Begin Python stack
[15:51:26] [INFO] [dku.utils] - Traceback (most recent call last):
[15:51:26] [INFO] [dku.utils] - File "C:\Users\PCsPC\AppData\Local\Dataiku\DataScienceStudio\kits\dataiku-dss-13.2.3-win\python\dataiku\doctor\evaluation\reg_evaluation_recipe.py", line 392, in <module>
[15:51:26] [INFO] [dku.utils] - api_node_logs_config=dkujson.loads(sys.argv[13]),
[15:51:26] [INFO] [dku.utils] - File "C:\Users\PCsPC\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\json\__init__.py", line 348, in loads
[15:51:26] [INFO] [dku.utils] - return _default_decoder.decode(s)
[15:51:26] [INFO] [dku.utils] - File "C:\Users\PCsPC\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\json\decoder.py", line 337, in decode
[15:51:26] [INFO] [dku.utils] - obj, end = self.raw_decode(s, idx=_w(s, 0).end())
[15:51:26] [INFO] [dku.utils] - File "C:\Users\PCsPC\AppData\Local\Dataiku\DataScienceStudio\Python\python-3.7.13\lib\json\decoder.py", line 353, in raw_decode
[15:51:26] [INFO] [dku.utils] - obj, end = self.scan_once(s, idx)
[15:51:26] [INFO] [dku.utils] - json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[15:51:26] [INFO] [dku.utils] - End Python stack
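For anyone reading the trace: the last line is the message Python's `json` module raises when the character right after an opening `{` is not a double quote. The sketch below reproduces that message with a made-up payload. It is only an assumption that the `sys.argv[13]` value reached the recipe with its double quotes stripped (for example by Windows command-line quoting); the log does not confirm this, and the `try_parse` helper and the `enabled` key are hypothetical.

```python
import json


def try_parse(raw):
    """Return (parsed_value, None) on success or (None, error_message) on failure."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        return None, str(exc)


# Well-formed JSON object: property names are enclosed in double quotes.
valid = '{"enabled": false}'

# Hypothetical damaged argument: the same object with its double quotes lost.
stripped = '{enabled: false}'

ok_value, ok_err = try_parse(valid)
bad_value, bad_err = try_parse(stripped)

print(ok_value, ok_err)
print(bad_value, bad_err)
```

If this reading is right, logging the raw argument string just before it is parsed would show whether the quotes are already missing when the recipe process starts.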
Note that the "Score" recipe works fine.
Please find attached the whole error log.
Please kindly let me know if any additional information is required.
Operating system used: Windows 10
Answers
-
I am not able to reproduce this error:
- I created a new project from starter project (ML Basics from Dataiku)
- Used the "Customers Labeled" dataset and created a random forest model on "gender" as you describe above (all defaults)
- Ran the eval step with all defaults using the same "customers_labeled" dataset
- Runs with warnings (new data present not found in training)
Nothing in the error log file you gave jumps out at me immediately. Does your data have any nulls that could be throwing off the model?
-
Are you using the latest Dataiku DSS on-premise version (13.2.3) on Windows 10?
I'm not sure which "ML Basics from Dataiku" project you meant; I couldn't find it in my Dataiku instance. Could you tell me how to locate it?
The issue can also be replicated using the official DKU_CHURN project.
In this project, just use the Evaluate recipe of the Churn prediction model on the customer_within_segment dataset, and you can replicate my issue.
-
Have you resolved this issue? I am encountering it with the Evaluate recipe in another project.
-
I haven't.
I'm currently on version 13.2.4 and still getting this issue.
I believe it's a bug that Dataiku needs to fix.
-
I'm on 13.1.2 on-premise, and I imagine it is a Windows-based environment.
This is the ML basics project I started to try to recreate the issue:
I'll try it with the Churn example you gave and reply here again. It very well may be a bug between versions.
Edit:
I used the same sample dataset/project, even named my outputs exactly the same, and it ran successfully.
I'm wondering if someone from @Dataiku can look into this? @Alexandru (you were in a recent thread I looked at)?