
create_dataset() missing 1 required positional argument: 'type'

Level 3

Hi, 

A few months ago, I used the function




create_dataset(name)


But now I am asked to use




create_dataset(dataset_name, type)


 



The "type" parameter is new and mandatory, but there is no explanation of what exactly it is. Does anyone have an idea?




project = self.automation.client.get_project('<PROJECT>')
project.create_dataset(dataset_name = '<dataset>', type='Filesystem')


Even when I pass a value for this parameter, I still get this error message:




TypeError: create_dataset() missing 1 required positional argument: 'type'






Thanks!



 

Dataiker

Hi,



The argument is not called "name" but "dataset_name", so you can use either:




create_dataset("administration", "Filesystem")


or




create_dataset(dataset_name="administration", type="Filesystem")
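To see how Python binds these arguments, here is a minimal sketch using a stand-in function. It is not the real Dataiku client method: the signature below is an assumption modeled on the keywords used in this thread (dataset_name, type, params, formatType, formatParams). Running inspect.signature on your own project.create_dataset should show what your installed version actually expects.

```python
import inspect


def create_dataset(dataset_name, type, params=None, formatType=None, formatParams=None):
    """Hypothetical stand-in for illustration only, NOT the real client method."""
    return {"name": dataset_name, "type": type}


# List the parameter names the function actually expects.
param_names = list(inspect.signature(create_dataset).parameters)

# Both call styles bind 'type' correctly.
positional = create_dataset("administration", "Filesystem")
keyword = create_dataset(dataset_name="administration", type="Filesystem")

# Omitting 'type' raises a TypeError about the missing argument.
try:
    create_dataset("administration")
except TypeError as exc:
    missing_type_error = str(exc)

# Using the old keyword 'name' raises a different TypeError.
try:
    create_dataset(name="administration", type="Filesystem")
except TypeError as exc:
    wrong_keyword_error = str(exc)
```

If the real call still reports 'type' as missing while you pass it by keyword, comparing the printed signature with your call is a quick way to check whether the code being run is the one you think it is.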
Level 3
Author
Sorry, but I had already tried the first solution, and the second changed nothing: the error still reports the "type" parameter as missing.
Dataiker
I'm sorry, we can't reproduce your issue. Here is some sample code that creates a filesystem dataset from scratch:

dataset = project.create_dataset("my-fs-dataset", "Filesystem")

definition = dataset.get_definition()
definition["params"]["connection"] = "filesystem_root"
definition["params"]["path"] = "/home/centos/titanic/kaggle_titanic_train.csv"
definition["formatType"] = "csv"
definition["formatParams"] = {"separator": ",", "style": "excel", "parseHeaderRow": True }
definition["schema"] = { "columns" : [{"name": "PassengerId", "type":"string"}, {"name": "Survived", "type": "int"}]}
dataset.set_definition(definition)
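The dictionary edits above can be checked without a DSS instance. The sketch below isolates them in a helper; the helper name with_csv_filesystem_settings and the {"params": {}} starting dict are assumptions for illustration (a real definition returned by get_definition() contains more keys).

```python
import copy


def with_csv_filesystem_settings(definition, connection, path, columns):
    """Hypothetical helper: return a copy of `definition` with the CSV fields set."""
    updated = copy.deepcopy(definition)
    updated.setdefault("params", {})
    updated["params"]["connection"] = connection
    updated["params"]["path"] = path
    updated["formatType"] = "csv"
    updated["formatParams"] = {"separator": ",", "style": "excel", "parseHeaderRow": True}
    updated["schema"] = {"columns": columns}
    return updated


# Stand-in for the dict returned by dataset.get_definition() (assumed shape).
blank = {"params": {}}
definition = with_csv_filesystem_settings(
    blank,
    connection="filesystem_root",
    path="/home/centos/titanic/kaggle_titanic_train.csv",
    columns=[
        {"name": "PassengerId", "type": "string"},
        {"name": "Survived", "type": "int"},
    ],
)
```

The resulting dict would then be passed to dataset.set_definition(definition) as in the sample above.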
Level 3
Author
First, thank you for your answers!
Here is the code I wrote:


import dataiku
import pandas as pd

# `project`, `design` and `automation` come from earlier code (not shown).
file_path = '/'.join([project.project_key,
                      "_V" + str(project.get_variables()['standard']['version']),
                      "administration"])
format_params = {
    'separator': ',',
    'style': 'unix',
    'parseHeaderRow': True
}

try:
    # Try to read the dataset; this fails if it does not exist yet.
    dataset = dataiku.Dataset('administration')
    df_dataset = dataset.get_dataframe()

except Exception:
    # The dataset could not be read, so create it.
    project = design.client.get_project('ADMINISTRATION')
    project.create_dataset('administration',
                           'Filesystem',
                           params={
                               "connection": "filesystem_managed",
                               "path": file_path
                           },
                           formatType='csv',
                           formatParams=format_params
                           )

df = pd.concat([
    design.table,
    automation.table], sort=False)

dataset = dataiku.Dataset('administration')
dataset.write_with_schema(df)

The problem is the call to write_with_schema(df). Here is the exception I get:
Exception: None: b'Internal error, caused by: NullPointerException: null.
I think it has something to do with the dropAndCreate parameter...