Python pipelines

oscarM Registered Posts: 2 ✭✭✭✭

Hi,

Dataiku recently added SQL pipeline functionality to the DSS platform. I would like to know whether the Dataiku team is likely to add Python pipeline functionality in the near future.

Thank you.

Best Answer

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    Answer ✓

    Hi,

    We do not plan on adding this kind of capability in the near future. The main reason is that it would require significant rework of Python recipes in order to allow "functional" Python recipes.

    At the moment, Python recipes are fully arbitrary code, so the code itself has no precise notion of "inputs" and "outputs". Supporting pipelines would require creating an alternate "Python function" kind of recipe, which would take a dict of dataframes as input and return a dict of dataframes as output, or something like that.

    It would, however, strongly limit what you can do, making it impossible to iterate on dataframes, to iterate on rows, to use managed folders, ...
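
    To make the "Python function" idea concrete, here is a minimal sketch in plain pandas. It is not a DSS feature: the `Frames` alias, the step functions, `run_pipeline`, and the `orders` data are all hypothetical names made up for illustration. Each step takes a dict of dataframes and returns a dict of dataframes, so steps can be chained in memory without writing intermediate datasets.

```python
from typing import Callable, Dict, List

import pandas as pd

# Hypothetical convention: every step maps named dataframes to named dataframes.
Frames = Dict[str, pd.DataFrame]
Step = Callable[[Frames], Frames]


def clean_orders(frames: Frames) -> Frames:
    """Hypothetical step: drop rows whose amount is missing."""
    cleaned = frames["orders"].dropna(subset=["amount"])
    return {"orders_clean": cleaned}


def total_per_customer(frames: Frames) -> Frames:
    """Hypothetical step: sum the amounts for each customer."""
    totals = (
        frames["orders_clean"]
        .groupby("customer", as_index=False)["amount"]
        .sum()
    )
    return {"totals": totals}


def run_pipeline(steps: List[Step], inputs: Frames) -> Frames:
    """Run the steps in order, keeping every intermediate frame in memory."""
    frames = dict(inputs)  # copy so the caller's dict is not mutated
    for step in steps:
        frames.update(step(frames))
    return frames


# Illustrative data only.
orders = pd.DataFrame(
    {"customer": ["a", "a", "b"], "amount": [10.0, None, 5.0]}
)
result = run_pipeline([clean_orders, total_per_customer], {"orders": orders})
print(result["totals"])
```

    The sketch also shows the limitation mentioned above: a step only sees whole in-memory dataframes, so there is no place for chunked or row-by-row iteration, or for managed folders.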

    The main interest of Spark pipelines and SQL pipelines is to be able to pipe visual recipes. If you only want to pipe code, you can simply write more of your code in a single recipe, using things like project libraries to keep your recipe code neat and clean (which would also allow you to have both granular and more global recipes).
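
    As a rough illustration of the project-library approach, the sketch below keeps small reusable functions separate from the recipe body. The module name `orders_prep`, the function names, and the dataset names are assumptions made up for the example. The DSS read and write calls are shown only as comments, and a small in-memory dataframe stands in for the input so the block runs anywhere.

```python
import pandas as pd

# --- Would live in a project library module (hypothetical: orders_prep.py) ---


def drop_missing_amounts(df: pd.DataFrame) -> pd.DataFrame:
    """Granular step: remove rows with no amount."""
    return df.dropna(subset=["amount"])


def add_amount_with_tax(df: pd.DataFrame, rate: float = 0.2) -> pd.DataFrame:
    """Granular step: add a column with the tax-inclusive amount."""
    return df.assign(amount_with_tax=df["amount"] * (1 + rate))


def prepare_orders(df: pd.DataFrame, rate: float = 0.2) -> pd.DataFrame:
    """Global step: chain the granular steps in memory."""
    return add_amount_with_tax(drop_missing_amounts(df), rate)


# --- Recipe body ---
# Inside a DSS Python recipe the input would typically be read with:
#   import dataiku
#   raw = dataiku.Dataset("orders").get_dataframe()  # hypothetical dataset name
raw = pd.DataFrame({"amount": [100.0, None, 50.0]})  # stand-in data

prepared = prepare_orders(raw, rate=0.5)

# ...and the result written back with:
#   dataiku.Dataset("orders_prepared").write_with_schema(prepared)
print(prepared)
```

    A granular recipe could call just `drop_missing_amounts`, while a more global recipe calls `prepare_orders`; both reuse the same library code.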
