I am looking for the equivalent of the Alteryx iterative macro. So far I have been unable to set up a working loop that takes the changes made to the first row and feeds the data back through the workflow until a condition is met.
I am attempting to achieve the following:
Run through every record and then loop the records back through the workflow, repeating the entire process as many times as is specified, or until a condition is met.
I need to capture the changes made by the first pass and loop the change and all other records back through my workflow dynamically.
What you are describing is not a requirement but how you think it is best achieved. For instance, iterating through every row, irrespective of the technology, is generally bad practice. You can read this Stack Overflow answer, which shows what other options you have to avoid iterating through every row in a pandas DataFrame.
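To illustrate the point about row iteration, here is a minimal pandas sketch comparing a row-by-row loop with a vectorized column operation. The column names (`price`, `quantity`, `total`) and the sample values are made up for the example and are not taken from the question.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# Row-by-row: works, but gets slow as the frame grows
totals_loop = []
for _, row in df.iterrows():
    totals_loop.append(row["price"] * row["quantity"])

# Vectorized: one expression applied to whole columns at once
df["total"] = df["price"] * df["quantity"]
```

Both approaches produce the same values here; the vectorized form is usually much faster and easier to read.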
A true requirement looks more like this: I have an input dataset for which I need to calculate an aggregate of column X, grouped by column Y (sales per quarter, for instance).
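A requirement phrased that way maps directly onto a grouped aggregation. As a rough sketch in pandas, assuming a made-up dataset with hypothetical `quarter` and `amount` columns (in Dataiku, a visual Group recipe should give the same result without code):

```python
import pandas as pd

# Hypothetical sales data, for illustration only
sales = pd.DataFrame({
    "quarter": ["Q1", "Q1", "Q2", "Q2", "Q3"],
    "amount": [100, 150, 200, 50, 300],
})

# Aggregate column X (amount) grouped by column Y (quarter),
# with no explicit loop over rows
sales_per_quarter = sales.groupby("quarter", as_index=False)["amount"].sum()
```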
The way you phrased your question makes me think you may be expecting Dataiku to behave in ways it is not really meant to work. Unlike an ETL tool that processes rows as they arrive, the Dataiku flow usually expects all of your inputs to be available up front and then processes all rows at once through the flow. It is possible to have "realtime" flows, but this is not usual, because you should not expect to be able to run your flow concurrently (i.e. multiple runs at the same time). Recipes may process their inputs row by row, but this is not a good design as it will be slow.
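If repeated passes really are needed, one option that stays in line with processing all rows at once is to loop over whole-dataset passes rather than over rows. The following is only a sketch under assumptions: it is plain pandas that could sit in a Python recipe or notebook, `one_pass` and `run_until_stable` are hypothetical helper names, and the halving rule and the `value` column stand in for whatever your actual transformation is.

```python
import pandas as pd

def one_pass(df):
    """Hypothetical transformation, applied to every row at once."""
    out = df.copy()
    mask = out["value"] > 10
    out.loc[mask, "value"] = out.loc[mask, "value"] / 2
    return out

def run_until_stable(df, max_passes=20):
    """Repeat one_pass until nothing changes or max_passes is reached."""
    for n in range(1, max_passes + 1):
        new_df = one_pass(df)
        if new_df.equals(df):      # stop condition: the pass changed nothing
            return df, n - 1       # number of passes that changed the data
        df = new_df
    return df, max_passes

# Hypothetical sample data, for illustration only
data = pd.DataFrame({"id": [1, 2, 3], "value": [80.0, 12.0, 5.0]})
result, passes = run_until_stable(data)
```

Each pass sees the changes made by the previous one, and the `max_passes` cap covers the "as many times as is specified" case while guarding against a loop that never converges.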