What happens in DSS when I do this : pd.to_excel for quick export

Solved!
ishan42d
Level 1
I want to understand where my data is stored when I create a Python recipe in DSS and run df.to_excel(r'test.xlsx'). I want to look at this file, but I can't find where it is saved.

0 Kudos
1 Solution
Liev
Dataiker Alumni

Hi @ishan42d, datasets in DSS are "rich pointers": configurations that hold the shape and location details of your actual data. The underlying data can therefore reside in a filesystem, a database, or another type of storage.
Since you're asking about exporting to Excel, I assume your data is stored locally and that you want to export the resulting dataset after a transformation. You have several options:

- Click on the dataset; in the menu that comes up you should be able to export it (under Actions) and choose CSV or another format.

- If you absolutely need the underlying files, you can navigate to the DATA_DIR of DSS (the data directory chosen when DSS was installed) and find the files for the relevant project under the managed_datasets folder.

The second way is harder to work with, because the files are usually compressed for efficiency, so the first method is probably more suitable. As explained above, it is also only available if the dataset is stored on the filesystem and not in another type of storage.

EDIT: If you want to bypass DSS entirely and call pd.DataFrame.to_excel, you should include the directory in the filename. So if the filename you need is "dataset.xlsx", do something like this in your recipe or notebook:

df.to_excel("/Users/someuser/Desktop/dataset.xlsx")
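If you pass only a bare filename such as 'test.xlsx', Python resolves it against the current working directory of the process running the recipe or notebook, and the file is written on the machine where that process runs (normally the DSS server or a container, not your own computer). Here is a minimal sketch for finding that location; the helper name describe_output_location is hypothetical and only used for illustration, and what the working directory actually is depends on your DSS setup.

```python
import os
import socket


def describe_output_location(filename):
    """Return (hostname, absolute_path) for where `filename` would be written."""
    # A relative path is resolved against the current working directory
    # of the Python process that runs the recipe or notebook.
    absolute_path = os.path.abspath(filename)
    # The file is created on the machine running that process.
    hostname = socket.gethostname()
    return hostname, absolute_path


if __name__ == "__main__":
    host, path = describe_output_location("test.xlsx")
    print(f"Working directory: {os.getcwd()}")
    print(f"File would be written on {host} at {path}")
```

Running this inside the same recipe or notebook that calls df.to_excel should tell you which machine and which directory to look in.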

 


2 Replies
MarkPundurs
Level 3

Even with the edit, the original question is not answered: where exactly is that file? On what machine?

0 Kudos