Creating a dataset from S3 and SQL

When we create a dataset from S3 or a SQL table, does the dataset get created on the local filesystem, or does the data remain in S3 or the SQL table and get streamed to Dataiku when needed?

1 Reply
Dataiker

Hi,

It is important to understand that a dataset in DSS is essentially just a "pointer" to the underlying data. So if your underlying files or table are in S3 or a SQL database, the data remains in that datastore and DSS does not "copy" it locally. When DSS then uses this dataset (as the input or output of a recipe, for example), it connects to the datastore and reads from or writes to it accordingly.
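To make the "pointer" idea concrete, here is a minimal sketch in plain Python. It is not how DSS is implemented and it does not use the DSS API: `PointerDataset` is a hypothetical class, and an in-memory SQLite database stands in for the remote SQL datastore. The point is only that the dataset object holds a connection and a table name, never the rows themselves.

```python
import sqlite3


class PointerDataset:
    """Hypothetical stand-in for a dataset: it stores only where the data lives."""

    def __init__(self, conn, table):
        self.conn = conn    # connection to the datastore (assumption: SQLite here)
        self.table = table  # name of the underlying table

    def read(self):
        # Rows are fetched from the datastore at read time; nothing is kept locally.
        query = f"SELECT id, name FROM {self.table} ORDER BY id"
        return self.conn.execute(query).fetchall()

    def write(self, rows):
        # Writes go straight to the underlying table.
        query = f"INSERT INTO {self.table} (id, name) VALUES (?, ?)"
        self.conn.executemany(query, rows)
        self.conn.commit()


# Set up the stand-in datastore with one row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'alice')")
conn.commit()

ds = PointerDataset(conn, "customers")
first_read = ds.read()

# Change the table directly in the datastore, without going through the dataset.
conn.execute("INSERT INTO customers VALUES (2, 'bob')")
conn.commit()
second_read = ds.read()

# Write through the dataset, then count the rows in the table itself.
ds.write([(3, "carol")])
table_rows = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

Because the dataset keeps no copy, the second read sees the row that was added directly in the datastore, and the write made through the dataset lands in the underlying table.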

Typically, a dataset is only stored locally if you do something like create a managed filesystem dataset or an Upload files dataset. I hope this helps; you may also find our DSS concepts documentation useful.

Thanks,

Andrew