Quick identification when filtering or sampling is applied to a dataset

sudipta002 Partner, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Neuron 2023 Posts: 12 Neuron

In the explore view of a dataset, a mark or highlight would be useful when a filter or a non-default sampling method is applied. This would help quickly identify whether the default sampling or filtering has been changed, without having to click on configure sample.

Please allow me to explain with an example. I work with datasets containing millions of records stored in Redshift/S3. For troubleshooting and validation purposes, I often apply a filter to find certain phrases in a text column. When the dataset is rebuilt with a new set of records, the explore view appears empty because it retains the filter conditions, which the new records do not meet. Looking at this view, I can think of two possible cases: a dataset with no records, or a dataset with a filter applied (if I manage to remember it).

Hence, if there were a way to tell that a filter is applied to a dataset when we explore it, I think we would not need to check by clicking on configure sample.

1 votes

Released · Last Updated


  • Katie
    Katie Dataiker, Registered, Product Ideas Manager Posts: 105 Dataiker

    Hi @sudipta002

    Good timing, we've actually been working on some UX updates to sampling which will introduce much more transparency into whether you're looking at a sampled/filtered dataset before you expand the sampling panel.

    Stay tuned for more updates very soon.


  • Katie
    Katie Dataiker, Registered, Product Ideas Manager Posts: 105 Dataiker

    This enhancement is part of version 11.1, which was released today!

    For a bit more detail... we have revamped the UX of sampling in the dataset explore view, prepare recipes, and charts. This includes:

    • Sampling vs whole data badge to clarify whether you're looking at a sample or not
    • Language next to the badge clarifying sample settings, e.g. in the dataset explore view it might say something like "10,000 first rows (pre-filter) out of 350,000"
      • so, you can see that the sample is filtered without needing to expand the panel
    • Cleaned up buttons & language in sample settings tab & throughout screen to make calls to action clearer
      • for example, a clear "edit filter" button in sample settings & "columns" button on the right side to see displayed columns

    Check it out once you upgrade to 11.1 and let us know if you've got any questions or feedback!

