Flow Dataset preview respects the set Dataset filter
User Story:
As a data analyst who is trying to debug a flow with large datasets, I would like to be able to set a filter on datasets and then, when I get back to the flow, use the preview to see how my selected set of records is transformed throughout the flow.
Conditions of Satisfaction:
- This needs to run about as fast as the display of records when looking at a single dataset by itself.
Nice-to-Have Items:
- It would be nice to have a toggle button in the flow preview that one could click to switch between respecting and ignoring the filters.
Notes:
- Today, to do this type of analysis, you need to maintain lots of different tabs, each looking at one step in the flow. Once a filter is set, being able to go to the flow, click on each of the steps, and see what happens without actually changing the steps of the flow would be nice.
- I guess I could temporarily put a new step upstream in my flow that only allows the records I care about to get through the flow. However, in some cases with incrementally updated datasets that would not be possible, because I would likely lose days of data processing. Looking at the results of the sample filters is less destructive to the underlying datasets.
Comments
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,138 Neuron
I am not sure what this idea is asking. Is it asking for the ability to temporarily restrict/filter input datasets to a specific set of records (or even a single record) so you could run the flow on this smaller dataset to see what happens?
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,601 Neuron
This is specifically about the new Preview function in the Flow, where you can open a window at the bottom of the flow screen and, when you click on a dataset, see records from that dataset.
The idea is to have this preview respect the filter you would see when you opened up the dataset.
From time to time I might want to look at how a small set of records is transformed throughout my flow, particularly in long flows with dozens of steps. Not having to open each dataset in the flow would be nice. That is what the preview helps with. But right now it does not show just the records I particularly care about; it shows all of the records.
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,138 Neuron
Hi Tom, I think you meant to say "look at how a small set of records is transformed through my dataset flow". And now I get what you mean. Indeed this is a great idea and certainly gets my vote. Re-using the existing dataset filter in the preview section would make tracing records across the flow much easier. I think this is a case where a feature intended for exploring/understanding the flow can also be used for debugging, and it seems Dataiku didn't think about the latter use case. I would even argue for expanding your idea: allowing the header Filter/Analyze/Sort options to be visible in the Preview window would make a lot of sense too.
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,601 Neuron
As for the additional features you are suggesting, I would be all for them as long as they do not add significantly to the time it takes to show these previews. (Being very fast is very important in my mind.)