Save out a file from an IPython notebook
UserBird
I can't figure out how to do this. I have tried two approaches, and neither worked.
1. Approach 1: Pandas way: data.to_csv('data.csv')
This does not throw an error, but I don't see the dataset anywhere in my flow...
2. Approach 2: Dataiku way:
# Recipe outputs
recommenderdata_tosave = dataiku.Dataset("recommenderdata_tosave")
recommenderdata_tosave.write_with_schema(data_for_recommender)
In the notebook, I get this error:
Exception: None: dataset does not exist: PROJECT.recommenderdata_tosave
This code works in a Python recipe in the Flow, but not in the notebook for some reason.
Any help would be appreciated.
Answers
Hi,
"write_with_schema" does not create the dataset; it only fills an existing one. You first need to declare the dataset in your Flow. The best way to do this is to create it as a "managed" dataset, so that DSS handles all the connection details: in the Flow, click on "+ Dataset" > "Internal" > "Managed dataset". You then only need to enter a name and select where the dataset should be stored. After that, you can use it in the notebook.
Thanks for your quick and helpful answer. I confirm that this works!
To anyone who comes after me, this is what I did:
1. Create a managed dataset as explained above (I named it: data_for_recommender_managed).
2. Save the dataframe to it from the notebook with the following code:
import dataiku

# Recipe outputs: handle to the managed dataset that already exists in the Flow
data_for_recommender_managed = dataiku.Dataset("data_for_recommender_managed")
# the dataframe in memory in the notebook is called: data_for_recommender
data_for_recommender_managed.write_with_schema(data_for_recommender)
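A note on Approach 1 from the question, for completeness: data.to_csv('data.csv') most likely did succeed, but with a relative path pandas writes a plain file into the working directory of the notebook's Python process, and a file written that way is not registered as a dataset, which would explain why nothing showed up in the Flow. The sketch below uses only the standard library to show where a relative name like 'data.csv' ends up; the helper name resolve_output_path is made up for illustration, and the actual directory depends on how your notebook kernel is launched.

```python
import os


def resolve_output_path(relative_name):
    """Return the absolute path that a relative file name resolves to.

    Hypothetical helper for illustration only: a relative path passed to
    something like DataFrame.to_csv is resolved against the current working
    directory of the Python process, which is what os.path.abspath computes.
    """
    return os.path.abspath(relative_name)


# Where would data.to_csv('data.csv') have written its file?
csv_path = resolve_output_path("data.csv")
print("Working directory:", os.getcwd())
print("CSV would be written to:", csv_path)
```

Running this in the same notebook should tell you where to look for the file. To get the data into the Flow, the managed dataset route described above is still the way to go.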