Error while executing Spark
We are getting the following errors while executing Spark.
Answers
-
Hi,
You would need to attach your entire log and/or job diag, not just a part of the error message.
If it is not possible for confidentiality reasons, please submit this as a support ticket, with an attached job diagnosis. Please see https://doc.dataiku.com/dss/latest/troubleshooting/obtaining-support.html for more details.
-
There is a log. Kindly help us. Thanks in advance.
-
Hello,
- In your test set for this machine learning task, do you happen to have "false" and/or "true" as target for the prediction?
- If no, are the schemas or format of your train dataset and test dataset different, especially around the target column?
- If no, on each of those datasets, can you click on the target column > Analyse and send us a screenshot for both?
-
Thanks for your reply.
Yes, we are using true/false prediction. Does DSS not support TRUE/FALSE? If not, how can we do it?
-
DSS does support Yes/No, 0/1, true/false, etc. But the values need to have the same form on the train set and test set, or in the case of an evaluation recipe, on the original train & test set and on the scored set. Training a model on true/false, then evaluating on Yes/No won't work.
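As a rough illustration of the point above, the check can be sketched in plain Python outside DSS: collect the distinct raw target values of each set and see whether any value appears in only one of them. The function names, the `churn` column, and the sample rows are all hypothetical; this is not a DSS API.

```python
def label_forms(rows, target):
    """Return the set of distinct raw target values found in a list of dict rows."""
    return {str(row[target]) for row in rows}


def mismatched_labels(train_rows, test_rows, target):
    """Return the labels that appear in only one of the two sets.

    An empty result means both sets use the same label forms.
    """
    return label_forms(train_rows, target) ^ label_forms(test_rows, target)


# Hypothetical sample data: trained on true/false, evaluated on Yes/No.
train = [{"churn": "true"}, {"churn": "false"}]
test = [{"churn": "Yes"}, {"churn": "No"}]

# A non-empty result signals that the two sets do not share the same label forms.
print(sorted(mismatched_labels(train, test, "churn")))
```

A non-empty result here corresponds to the "trained on true/false, evaluated on Yes/No" situation described above.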
I only see one screenshot; is that from the scored dataset? What were the values of that column for the train & test sets?
-
Hi Adrein,
Glad to see a quick response from you.
We still have a few concerns; below are our responses to the points you raised:
a. Values need to have the same form on the train set and test set, or in the case of an evaluation recipe, on the original train & test set and on the scored set.
--> Yes, we use the same form, Yes/No, on every dataset in the recipe. This is consistent across our project. With the Dataiku internal engine the recipe worked end to end, but when we tried Spark MLLib we hit the issue on the same recipe that had previously run successfully.
b. What were the values of that column for the train & tests sets?
--> I am attaching them for the train and test sets.
Please help us in resolving this.
-
I was able to reproduce your issue. It seems that for the MLLib engine, if the target of a binary classification is not true/false, you should force its Meaning to Text in the Script part of the analysis in which you train your model (before training the model).
Another, more robust solution would be to use a Prepare recipe on your dataset to change the target values to true/false.
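To make the intended mapping concrete, here is a minimal Python sketch of the value conversion such a step would perform. It is only an illustration of the logic, not how a Prepare recipe is configured in DSS; the function name and the accepted spellings (`yes`, `y`, `1`, and so on) are assumptions chosen for the example.

```python
# Assumed spellings for each class; adjust to whatever the dataset actually contains.
TRUE_FORMS = {"yes", "y", "1", "true"}
FALSE_FORMS = {"no", "n", "0", "false"}


def to_true_false(value):
    """Map a Yes/No-style label to the string 'true' or 'false'.

    Raises ValueError on anything unrecognised so that unexpected
    labels are noticed instead of being silently passed through.
    """
    text = str(value).strip().lower()
    if text in TRUE_FORMS:
        return "true"
    if text in FALSE_FORMS:
        return "false"
    raise ValueError(f"Unrecognised target label: {value!r}")


# Hypothetical target column values before conversion.
labels = ["Yes", "No", "yes", " NO "]
converted = [to_true_false(v) for v in labels]
print(converted)
```

The same conversion has to be applied to every dataset involved (train, test, and scored) so that the target keeps one form throughout.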