Dataiku recently added the SQL pipeline functionality to the DSS platform. I would like to know if there is any chance that the Dataiku team will add a Python pipeline functionality in the near future?
We do not plan on adding this kind of capability in the near future. The main reason is that it would require a significant rework of Python recipes in order to allow "functional" Python recipes.
At the moment, Python recipes are fully arbitrary code, so they don't have precise notions of "inputs" and "outputs". It would require creating an alternate "Python function" kind of recipe which would take a dict of dataframes as input and return a dict of dataframes as output, or something like that.
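To make the idea concrete, here is a minimal sketch of what such a "functional" recipe contract could look like. This is purely illustrative: `functional_recipe` and the dataset names are hypothetical, not part of any DSS API. The point is that the engine could only pipeline recipes whose inputs and outputs are declared through a signature like this, rather than arbitrary code.

```python
import pandas as pd

# Hypothetical "functional" Python recipe: a pure function from a dict of
# input dataframes to a dict of output dataframes. Only with a contract
# like this could the engine chain recipes without materializing each
# intermediate dataset. All names here are illustrative, not a DSS API.
def functional_recipe(inputs: dict) -> dict:
    orders = inputs["orders"]
    # One declared input, one declared output; no side effects,
    # no managed folders, no row-by-row iteration.
    totals = orders.groupby("customer_id", as_index=False)["amount"].sum()
    return {"customer_totals": totals}

orders = pd.DataFrame({"customer_id": [1, 1, 2], "amount": [10.0, 5.0, 7.5]})
result = functional_recipe({"orders": orders})
```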
It would, however, strongly limit what you can do, making it impossible to iterate on dataframes, to iterate on rows, to use managed folders, ...
The main interest of Spark pipelines and SQL pipelines is being able to pipe visual recipes. If you only want to pipe code, you can simply write more of your code in a single recipe, using features like project libraries to keep your recipe code neat and clean (which also allows you to have both granular and more global recipes).
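As a rough sketch of that approach: put the individual transformation steps in a project library module and chain them inside one recipe. The module name `cleaning` and the dataset names below are made up for the example; only the commented `dataiku.Dataset(...).get_dataframe()` / `write_with_schema(...)` calls reflect the actual recipe API.

```python
import pandas as pd

# Functions like these would live in a project library module
# (e.g. a hypothetical cleaning.py), so each step stays small and
# testable while one recipe chains them together.
def drop_incomplete(df: pd.DataFrame) -> pd.DataFrame:
    """Remove rows with any missing value."""
    return df.dropna()

def normalize_names(df: pd.DataFrame) -> pd.DataFrame:
    """Strip whitespace and lower-case the 'name' column."""
    out = df.copy()
    out["name"] = out["name"].str.strip().str.lower()
    return out

# Inside a DSS Python recipe you would then pipe the steps in code:
#   import dataiku
#   from cleaning import drop_incomplete, normalize_names
#   df = dataiku.Dataset("raw_customers").get_dataframe()
#   df = normalize_names(drop_incomplete(df))
#   dataiku.Dataset("clean_customers").write_with_schema(df)
df = pd.DataFrame({"name": ["  Alice ", None, "BOB"]})
clean = normalize_names(drop_incomplete(df))
```

This gives you the "pipe code" effect without needing a pipeline engine: the granular steps are the library functions, and the global step is the recipe that composes them.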