Error while executing Spark

Swapnali
Swapnali Registered Posts: 38 ✭✭✭✭

We are getting the following errors while executing Spark:

sparlerro.PNG

Answers

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker

    Hi,

    You would need to attach your entire log and/or job diag, not just a part of the error message.

    If it is not possible for confidentiality reasons, please submit this as a support ticket, with an attached job diagnosis. Please see https://doc.dataiku.com/dss/latest/troubleshooting/obtaining-support.html for more details.

  • Swapnali
    Swapnali Registered Posts: 38 ✭✭✭✭

    Here is the log. Kindly help us. Thanks in advance.

  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker

    Hello,

    1. In your test set for this machine learning task, do you happen to have "false" and/or "true" as target for the prediction?
    2. If no, are the schemas or format of your train dataset and test dataset different, especially around the target column?
    3. If no, on each of those datasets, can you click on the target column > Analyse and send us a screenshot for both?

  • Swapnali
    Swapnali Registered Posts: 38 ✭✭✭✭

    Thanks for your reply.

    Yes, we are using true/false as the prediction target. Does DSS not support true/false? If not, how can we do it?

    sprkerro.PNG

  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker
    DSS does support Yes/No, 0/1, true/false, etc. But the values need to have the same form on the train set and test set, or in the case of an evaluation recipe, on the original train & test set and on the scored set. Training a model on true/false, then evaluating on Yes/No won't work.

    I only see one screenshot; is it from the scored dataset? What were the values of that column for the train & test sets?
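One way to check the "same form" requirement outside DSS is to compare the distinct target values of the two sets. The snippet below is only a sketch in plain Python: it does not use any DSS API, and the column name `churn` and the sample rows are hypothetical.

```python
def target_values(rows, target):
    """Return the distinct non-empty values of the target column, as strings."""
    return {str(row[target]) for row in rows if row.get(target) not in (None, "")}


def compare_target_forms(train_rows, test_rows, target):
    """Report target values that appear in only one of the two sets."""
    train_vals = target_values(train_rows, target)
    test_vals = target_values(test_rows, target)
    return {
        "only_in_train": train_vals - test_vals,
        "only_in_test": test_vals - train_vals,
    }


# Hypothetical rows: trained on true/false, evaluated on Yes/No.
train = [{"churn": "true"}, {"churn": "false"}]
test = [{"churn": "Yes"}, {"churn": "No"}]

report = compare_target_forms(train, test, "churn")
print(report)
```

If both sets in the report are empty, the two datasets use the same target values; any value listed in only one set points to the kind of mismatch described above.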
  • Amit_Singh
    Amit_Singh Registered Posts: 2 ✭✭✭✭

    Hi Adrien,

    Glad to see such a quick response from you.

    We still have a few concerns. Below are our responses to the points you raised:

    a. Values need to have the same form on the train set and test set, or in the case of an evaluation recipe, on the original train & test set and on the scored set.

    --> Yes, we are using the same form, Yes/No, on every dataset in the recipe. This is consistent across our project. With the Dataiku internal engine, the recipe ran successfully end to end, but when we ran the same recipe on Spark MLLib, we hit this issue.

    What were the values of that column for the train & tests sets?

    I am attaching the same for train and test.

    Please help us in resolving this.

  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker

    I was able to reproduce your issue. It seems that with the MLLib engine, if the target of a binary classification is not true/false, you should force its Meaning to Text in the Script part of the analysis in which you train your model (before training your model).

    Another more robust solution would be to use a Prepare recipe on your dataset to change it to true/false.
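For illustration, the remapping such a Prepare step would perform can be sketched in plain Python. This is not DSS code: the accepted spellings in `TRUE_FORMS` and `FALSE_FORMS` and the sample labels are assumptions, and should be adjusted to the values actually present in your target column.

```python
# Assumed spellings of the positive and negative class; adjust to your data.
TRUE_FORMS = {"yes", "y", "1", "true"}
FALSE_FORMS = {"no", "n", "0", "false"}


def normalize_target(value):
    """Map a Yes/No-style label to the string 'true' or 'false'."""
    text = str(value).strip().lower()
    if text in TRUE_FORMS:
        return "true"
    if text in FALSE_FORMS:
        return "false"
    # Fail loudly rather than silently guessing a class.
    raise ValueError(f"Unrecognized target value: {value!r}")


# Hypothetical target values as they might appear in the dataset.
labels = ["Yes", "No", " yes ", "NO"]
normalized = [normalize_target(v) for v in labels]
print(normalized)
```

Raising on unknown values is a deliberate choice here: an unexpected label is more likely a data problem than a third class in a binary task.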

    Screen Shot 2020-05-23 at 11.58.21.png
