save out a file from IPython notebook

UserBird
UserBird Dataiker, Alpha Tester Posts: 535 Dataiker

Can't figure out how to do this. Have tried two approaches, neither worked.

1. Approach 1: Pandas way: data.to_csv('data.csv')

This does not throw an error, but I don't see the dataset anywhere in my flow...

2. Approach 2: Dataiku way:

    # Recipe outputs
    recommenderdata_tosave = dataiku.Dataset("recommenderdata_tosave")
    recommenderdata_tosave.write_with_schema(data_for_recommender)

In the notebook, I get this error:


Exception: None: dataset does not exist: PROJECT.recommenderdata_tosave

This code works in a Python recipe in the Flow, but not in the notebook for some reason.

Any help would be appreciated.


Answers

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    Hi,

    "write_with_schema" does not create the dataset; it only "fills" an existing one. You first need to declare the dataset in your Flow. The best way to do this is to create it as a "managed" dataset, so that DSS handles all the connection details: in the Flow, click "+ Dataset" > "Internal" > "Managed dataset". You then only need to enter a name and select where the dataset should be stored. After that, you can use it in the notebook.
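A minimal sketch of the point above, for anyone who wants a clearer message than the raw exception: it wraps the write and, when the error text contains "dataset does not exist" (the wording from the error quoted in the question), re-raises with a hint to declare the dataset first. The helper name `write_to_existing_dataset` is hypothetical and not part of the dataiku API; the only thing it assumes about `dataset` is a `write_with_schema` method.

```python
def write_to_existing_dataset(dataset, dataframe):
    """Fill an already-declared dataset with the contents of a DataFrame.

    `dataset` is expected to behave like dataiku.Dataset, i.e. to expose
    write_with_schema(). This helper (a hypothetical name) does not create
    the dataset; it only turns the "does not exist" error into a hint.
    """
    try:
        dataset.write_with_schema(dataframe)
    except Exception as error:
        # Match on the message text seen in the notebook error above.
        if "dataset does not exist" in str(error):
            raise RuntimeError(
                'Dataset not found. Declare it in the Flow first: '
                '"+ Dataset" > "Internal" > "Managed dataset".'
            ) from error
        # Any other failure is passed through unchanged.
        raise


# Intended use inside a DSS notebook (requires the dataiku package):
#   import dataiku
#   output = dataiku.Dataset("data_for_recommender_managed")
#   write_to_existing_dataset(output, data_for_recommender)
```

The wrapper changes nothing about how the data is written; it only makes the failure mode from the question easier to read.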
  • UserBird
    UserBird Dataiker, Alpha Tester Posts: 535 Dataiker
    Thanks for your quick and helpful answer. I confirm that this works!

    To anyone who comes after me, this is what I did:

    1. Create a managed dataset as explained above (I named it: data_for_recommender_managed).

    2. Save the DataFrame to it from the notebook with the following code:

    import dataiku

    # Recipe outputs: handle to the managed dataset created in step 1
    data_for_recommender_managed = dataiku.Dataset("data_for_recommender_managed")
    # the dataframe in memory in the notebook is called: data_for_recommender
    data_for_recommender_managed.write_with_schema(data_for_recommender)
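
A side note on Approach 1 from the question, as a rough sketch rather than a definitive explanation: pandas resolves a relative path such as 'data.csv' against the current working directory of the notebook kernel, so the file is most likely written to disk on the server and never registered as a dataset in the Flow. The snippet below uses only the standard library to show where such a path points; the variable names are illustrative.

```python
import os

# The relative path passed to data.to_csv(...) in Approach 1.
relative_path = "data.csv"

# Relative paths are resolved against the process's working directory,
# which for a notebook is the kernel's working directory.
resolved_path = os.path.abspath(relative_path)

print("Working directory:", os.getcwd())
print("to_csv would have written to:", resolved_path)
print("File currently present:", os.path.exists(resolved_path))
```

Running this in the same notebook should show where the CSV ended up, which explains why nothing new appeared in the Flow.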