How does Dataiku store datasets in the project flow?

I've created two output datasets: the first is stored in PostgreSQL and the second in a managed folder. I then changed the connection of the second one (the one in the managed folder) to PostgreSQL. It turns out that:

  1. The first, which was stored directly in PostgreSQL, does not show a data size in storage.
  2. The second, which was stored in the managed folder and then changed to PostgreSQL, does show a data size in storage.

Where is the first output dataset stored? How does this work?

See the reference image in the attachment.

Thank you

1 Reply


The first dataset is a local managed dataset, so it is stored on the DSS instance, and we can calculate the size of its files on disk.

SQL datasets are stored in an SQL database. We don't calculate their size, since it can depend on various factors specific to the database. If needed, you can find it out by running queries directly against Postgres, and potentially add a custom SQL probe.
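As a rough illustration of such a query, here is a minimal sketch that builds a size query using PostgreSQL's built-in pg_total_relation_size and pg_size_pretty functions. The helper name table_size_query and the schema and table names are hypothetical; substitute the actual table behind your dataset. This is only one possible approach, not necessarily how a DSS probe should be written.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def table_size_query(schema: str, table: str) -> str:
    """Build a query reporting the total on-disk size of one table.

    pg_total_relation_size includes indexes and TOAST data;
    pg_size_pretty formats the byte count for reading.
    The function name and arguments are hypothetical helpers.
    """
    qualified = f"{quote_ident(schema)}.{quote_ident(table)}"
    # The regclass cast takes a string literal, so escape single quotes.
    literal = qualified.replace("'", "''")
    return (
        "SELECT pg_total_relation_size('{0}'::regclass) AS total_bytes, "
        "pg_size_pretty(pg_total_relation_size('{0}'::regclass)) AS total_pretty"
    ).format(literal)


# Hypothetical schema and table names, for illustration only.
print(table_size_query("public", "my_output"))
```

The printed SQL can be run in any Postgres client; it returns the size in bytes alongside a human-readable value.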

For SQL datasets, you would typically want to check other metrics, such as the number of rows, available from Status > Metrics on the dataset.

Let me know if that helps!