Python code to create a new Dataiku dataset

N_JAYANTH
Level 2

I would like to create massive Dataiku datasets using the Python interpreter, without creating them manually in the recipe.



Note: The following command works only if I have already created a Dataiku dataset called "myoutputdataset" in my recipe. My problem is how to create a new Dataiku dataset without creating it beforehand in my recipe, and then save my pandas dataframe in it.




output_ds = dataiku.Dataset("myoutputdataset")
output_ds.write_with_schema(my_dataframe)
14 Replies
Thomas Dataiker
Dataiker
Re: Python code to create a new Dataiku dataset

Hi, 



"myoutputdataset" and "my_dataframe" are just placeholders that need to be replaced with your own names and code.



For instance, the following (complete) recipe has an output DSS dataset called "results", which is filled from a pandas dataframe called "o":





# -*- coding: utf-8 -*-
import dataiku
import pandas as pd

# Recipe inputs
titanic = dataiku.Dataset("titanic")
df = titanic.get_dataframe()

# Some Python code
# ...
o = df.sort_values('PassengerId')

# Recipe outputs
output = dataiku.Dataset("results")
output.write_with_schema(o)


Hope this helps. 

N_JAYANTH
Level 2
Re: Python code to create a new Dataiku dataset
I think you misunderstood my question. I know that "myoutputdataset" and "my_dataframe" are just placeholders. In your code

output = dataiku.Dataset("results")

what is "results"? I suppose it's a Dataiku database, so you must already have a Dataiku database named "results". That's why you are able to write into it. My question is: how do you create the "results" database in Dataiku using Python code?
Thomas Dataiker
Dataiker
Re: Python code to create a new Dataiku dataset

The "results" dataset is not created by the Python code. It is created beforehand, when you first create your recipe.



kenjil Dataiker
Dataiker
Re: Python code to create a new Dataiku dataset
The output dataset of a recipe is created in the recipe creation modal.

In case you really want to create datasets massively, there is a Python API to administer DSS that you can use:
http://doc.dataiku.com/dss/latest/api/public/index.html
Note that this API is NOT intended to be used to create the output dataset of a single recipe.
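As a rough illustration of what bulk creation through that public API could look like, here is a minimal sketch. It assumes the `dataikuapi` client package is installed and that an API key is available. The host URL, API key, project key, connection name ("filesystem_managed") and dataset names are hypothetical placeholders, and the CSV format parameters are only one plausible choice. The helper names `build_dataset_specs`, `create_datasets` and `connect` are made up for this example.

```python
def build_dataset_specs(names, connection="filesystem_managed"):
    """Build one set of create_dataset arguments per dataset name.

    The connection name and the CSV settings are assumptions; adapt them
    to the connections that actually exist on your DSS instance.
    """
    specs = []
    for name in names:
        specs.append({
            "dataset_name": name,
            "type": "Filesystem",
            "params": {"connection": connection, "path": "/" + name},
            "formatType": "csv",
            "formatParams": {
                "separator": ",",
                "style": "excel",
                "parseHeaderRow": True,
            },
        })
    return specs


def create_datasets(project, names, connection="filesystem_managed"):
    """Create one dataset per name on a project handle; return the names created."""
    created = []
    for spec in build_dataset_specs(names, connection):
        project.create_dataset(
            spec["dataset_name"],
            spec["type"],
            params=spec["params"],
            formatType=spec["formatType"],
            formatParams=spec["formatParams"],
        )
        created.append(spec["dataset_name"])
    return created


def connect(host, api_key, project_key):
    """Return a project handle from the public API client.

    The import is done here so the helpers above can be read and tested
    without the dataikuapi package installed.
    """
    import dataikuapi
    client = dataikuapi.DSSClient(host, api_key)
    return client.get_project(project_key)
```

With placeholder values, usage would be along the lines of `project = connect("http://localhost:11200", "<your API key>", "MYPROJECT")` followed by `create_datasets(project, ["results_%d" % i for i in range(10)])`. This only registers the datasets; filling them with data is a separate step.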
N_JAYANTH
Level 2
Re: Python code to create a new Dataiku dataset
So how do I create massive datasets like "results" without mentioning them in the recipe?
N_JAYANTH
Level 2
Re: Python code to create a new Dataiku dataset
Yes, @kenjil, I would like to create massive datasets.
kenjil Dataiker
Dataiker
Re: Python code to create a new Dataiku dataset
This has nothing to do with the size of the dataset, but with the number of datasets you want to create. There is no point using that API to create a single dataset, whatever its size.
N_JAYANTH
Level 2
Re: Python code to create a new Dataiku dataset
I want to create a large number of datasets. Is there any method to do this? Please note that I have a COMMUNITY EDITION license for DSS.
kenjil Dataiker
Dataiker
Re: Python code to create a new Dataiku dataset
I'm sorry. The admin API is not available in DSS Free Edition.