Prepare Recipe Custom Step & Spark

lina Registered Posts: 12 ✭✭✭

Hi,

We are working on a custom Prepare recipe step that adds a user-input row to the dataset. It works correctly when run locally in DSS. However, when run on Spark, it adds one row for each file the dataset is partitioned into.

For example, if the dataset is stored in 10 HDFS files, the recipe step adds 10 duplicate rows instead of 1. Is there a way to avoid this other than converting the code into a visual recipe?
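
A minimal sketch of what may be happening, in plain Python with no Spark or DSS dependency. It assumes (this is not confirmed in the thread) that on Spark the step's code is executed once per partition rather than once for the whole dataset. The function names `add_row_per_partition`, `add_row_once` and `run_step` are hypothetical and are not part of any Dataiku or Spark API.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Partition = List[Row]


def add_row_per_partition(index: int, partition: Partition, new_row: Row) -> Partition:
    """Naive step logic: append the user row to whatever rows it receives."""
    return partition + [new_row]


def add_row_once(index: int, partition: Partition, new_row: Row) -> Partition:
    """Guarded variant: only the partition with index 0 appends the row."""
    if index == 0:
        return partition + [new_row]
    return list(partition)


def run_step(
    partitions: List[Partition],
    step: Callable[[int, Partition, Row], Partition],
    new_row: Row,
) -> List[Row]:
    """Simulate a distributed engine: apply the step to each partition
    independently, then concatenate the results."""
    out: List[Row] = []
    for index, partition in enumerate(partitions):
        out.extend(step(index, partition, new_row))
    return out


# Ten partitions of two rows each, standing in for ten HDFS files.
partitions = [[{"id": p * 2 + i} for i in range(2)] for p in range(10)]
user_row = {"id": "user-input"}

naive = run_step(partitions, add_row_per_partition, user_row)
guarded = run_step(partitions, add_row_once, user_row)

naive_copies = sum(1 for row in naive if row == user_row)
guarded_copies = sum(1 for row in guarded if row == user_row)
```

Under that assumption, the naive version yields one copy of the user row per partition, while the guarded version yields a single copy. In PySpark, a partition index is available through `RDD.mapPartitionsWithIndex`, but whether a custom Prepare step can access it is not something this thread establishes. Another option that avoids the problem entirely is to append the row in a separate step that runs once on the whole dataset.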

Thanks

Answers

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 315 Dataiker

    Hi @lina,

    If you are still curious, the easiest way for us to help would be for you to open a support ticket, or start a chat via the chat box, and attach a job diagnostic for both the local execution and the Spark execution of the job so that we can take a look.

    Thank you,
    Sarina
