Preparation Recipes - ROWS mode documentation

florianbriand

Is there any documentation about the "ROWS" mode for Preparation Recipes?

The existing documentation only mentions the CELL and ROW modes ( https://doc.dataiku.com/dss/latest/plugins/reference/preparation.html )

Specifically, is there any way to control the execution of the process? For example, if I want to run the "process" function only once, how can I achieve that?

Clément_Stenac Dataiker
Re: Preparation Recipes - ROWS mode documentation

Hi,

Indeed, we do not have a detailed example for the ROWS mode, but in a plugin it works very much like it does in the regular custom Python processor: https://doc.dataiku.com/dss/latest/preparation/processors/python-custom.html#rows-mode

In other words, you can create a processor in the UI, switch it to rows mode, study the sample code, and port it to your plugin.

You cannot get the process function to be called only once. That would require the entire dataset to be in memory, which would not scale. Instead, the process function is called once per row, and each call can return as many rows as you want. For example, you can return zero rows while "remembering" the previous rows in a Python variable, then emit them later, for instance when some kind of "trigger" is reached. Beware that remembering all the rows of a dataset could easily lead to out-of-memory situations.
