How to rebuild a “SQL query” dataset?

otassel Registered Posts: 9 ✭✭✭✭


I created a “SQL query” dataset based on a PostgreSQL connection. The data returned from this SQL query is updated every day.

How can I rebuild / update / refresh the dataset automatically via a Scenario (or something else)? It seems that the "Build" step of a scenario doesn't work for a SQL query dataset.

Best Answer

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    Answer ✓

    A dataset is a "view" of the underlying data. "Building" a SQL query dataset therefore has no real meaning, because DSS does not materialize it.

    If your SQL Query dataset is used as input of a recipe, the results of the SQL query will be read each time said recipe is run, in order to build the *output* datasets of the recipe.

    If you want to materialize the state of a SQL query dataset at a given moment, you need to use a "Sync" recipe to copy the virtual view of the SQL query dataset into a physical materialization in another dataset.

    One exception to this is the "Explore sample" of the dataset, which is computed only once for performance reasons. This is only a sample for exploration and does not represent the "state" of the dataset. You can force a refresh of the explore sample by clicking on "Configure sample settings" and then "Save & Refresh sample".
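
    To illustrate the difference between a "view" and a materialized copy described above, here is a minimal sketch that uses an in-memory SQLite database as a stand-in. It does not use any DSS API. The table and view names (`orders`, `big_orders`, `big_orders_snapshot`) are hypothetical, and the final copy step only mimics what re-running a "Sync" recipe would do.

```python
import sqlite3


def view_vs_snapshot_counts():
    """Return (live view rows, stale snapshot rows, re-synced snapshot rows)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Source table that changes over time (hypothetical name).
    cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    cur.executemany("INSERT INTO orders (amount) VALUES (?)", [(10.0,), (20.0,)])

    # Analogue of a SQL query dataset: a view, nothing is stored.
    cur.execute(
        "CREATE VIEW big_orders AS "
        "SELECT id, amount FROM orders WHERE amount >= 10"
    )

    # Analogue of a Sync output: a physical copy taken at this moment.
    cur.execute("CREATE TABLE big_orders_snapshot AS SELECT * FROM big_orders")

    # The underlying data is updated after the copy was taken.
    cur.execute("INSERT INTO orders (amount) VALUES (?)", (30.0,))
    conn.commit()

    # The view is re-evaluated on every read; the snapshot is not.
    live = cur.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0]
    stale = cur.execute("SELECT COUNT(*) FROM big_orders_snapshot").fetchone()[0]

    # Copying again (what re-running the sync amounts to) refreshes the snapshot.
    cur.execute("DELETE FROM big_orders_snapshot")
    cur.execute("INSERT INTO big_orders_snapshot SELECT * FROM big_orders")
    conn.commit()
    resynced = cur.execute("SELECT COUNT(*) FROM big_orders_snapshot").fetchone()[0]

    conn.close()
    return live, stale, resynced


if __name__ == "__main__":
    print(view_vs_snapshot_counts())
```

    In this sketch, reading the view always reflects the current rows, while the copy only changes when the copy step runs again. This mirrors why there is nothing to "build" on the query dataset itself, whereas a materialized dataset downstream of it is something that can be rebuilt.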