create_dataset() missing 1 required positional argument: 'type'
rmnvncnt
Hi,
A few months ago, I used the function
create_dataset(name)
But now I am asked to use
create_dataset(dataset_name, type)
The "type" parameter is new and mandatory, but there is no explanation of what exactly it is. Does anyone have an idea?
project = self.automation.client.get_project('<PROJECT>')
project.create_dataset(dataset_name = '<dataset>', type='Filesystem')
Even when I pass a value for this parameter, I still get this error message:
TypeError: create_dataset() missing 1 required positional argument: 'type'
Thanks!
Answers
-
Hi,
The argument is not called "name" but "dataset_name", so you can use either:
create_dataset("administration", "Filesystem")
or
create_dataset(dataset_name="administration", type="Filesystem")
-
Sorry, but I tried the first solution, and the second changed nothing: the "type" parameter is still reported as "missing".
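For reference, the error in the title is what Python raises when a function that requires two arguments receives only one. The sketch below uses a stand-in class, not the real Dataiku client: `StandInProject` and `try_call` are made-up names for illustration, and only the parameter names `dataset_name` and `type` come from this thread. It also shows how `inspect.signature` lists the parameters a method expects.

```python
import inspect


class StandInProject:
    """Hypothetical stand-in for a project handle; NOT the real Dataiku client."""

    def create_dataset(self, dataset_name, type):
        # Only echoes its arguments so the call shapes can be compared.
        return {"name": dataset_name, "type": type}


def try_call(func, *args, **kwargs):
    """Call func and return ("ok", result) or ("error", message)."""
    try:
        return ("ok", func(*args, **kwargs))
    except TypeError as exc:
        return ("error", str(exc))


project = StandInProject()

# Old-style call with a single argument: 'type' is missing.
old_style = try_call(project.create_dataset, "administration")

# Both arguments supplied, positionally or by keyword: accepted.
positional = try_call(project.create_dataset, "administration", "Filesystem")
keyword = try_call(project.create_dataset,
                   dataset_name="administration", type="Filesystem")

# Parameter names the bound method expects ('self' is already bound).
param_names = list(inspect.signature(project.create_dataset).parameters)

print(old_style)
print(positional)
print(keyword)
print(param_names)
```

In a real session, passing `project.create_dataset` to `inspect.signature` in the same way would show the signature of whichever client version is actually loaded, which may help confirm that the code being run is the code shown here.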
-
I'm sorry, we can't reproduce your issue. Here is some sample code to create a Filesystem dataset from scratch:
dataset = project.create_dataset("my-fs-dataset", "Filesystem")
definition = dataset.get_definition()
definition["params"]["connection"] = "filesystem_root"
definition["params"]["path"] = "/home/centos/titanic/kaggle_titanic_train.csv"
definition["formatType"] = "csv"
definition["formatParams"] = {"separator": ",", "style": "excel", "parseHeaderRow": True }
definition["schema"] = { "columns" : [{"name": "PassengerId", "type":"string"}, {"name": "Survived", "type": "int"}]}
dataset.set_definition(definition)
-
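The definition edits in the sample above can be gathered into a small helper. This is a sketch only: `fill_csv_definition` is a hypothetical name, not part of the Dataiku API, and it operates on a plain dictionary that is assumed to have the same shape as the one in the sample. The keys and values are copied from that sample.

```python
import copy


def fill_csv_definition(definition, connection, path, columns):
    """Return a copy of `definition` configured as a CSV file on a connection.

    Hypothetical helper; the keys mirror the sample in this thread.
    """
    updated = copy.deepcopy(definition)  # leave the caller's dict untouched
    updated.setdefault("params", {})
    updated["params"]["connection"] = connection
    updated["params"]["path"] = path
    updated["formatType"] = "csv"
    updated["formatParams"] = {
        "separator": ",",
        "style": "excel",
        "parseHeaderRow": True,
    }
    updated["schema"] = {
        "columns": [{"name": name, "type": col_type} for name, col_type in columns]
    }
    return updated


# Stand-in for what dataset.get_definition() returns (an assumption).
definition = {"params": {}}
new_definition = fill_csv_definition(
    definition,
    connection="filesystem_root",
    path="/home/centos/titanic/kaggle_titanic_train.csv",
    columns=[("PassengerId", "string"), ("Survived", "int")],
)
print(new_definition["schema"])
```

With a real dataset handle, the returned dictionary would then be passed to `dataset.set_definition(...)` as in the sample.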
First, thank you for your answers!
Then, here is the code I wrote:
file_path = '/'.join([project.project_key,
                      "_V" + str(project.get_variables()['standard']['version']),
                      "administration"])
format_params = {
    'separator': ',',
    'style': 'unix',
    'parseHeaderRow': True
}
try:
    dataset = dataiku.Dataset('administration')
    df_dataset = dataset.get_dataframe()
except:
    project = design.client.get_project('ADMINISTRATION')
    project.create_dataset('administration',
                           'Filesystem',
                           params={
                               "connection": "filesystem_managed",
                               "path": file_path
                           },
                           formatType='csv',
                           formatParams=format_params
                           )
df = pd.concat([
    design.table,
    automation.table], sort=False)
dataset = dataiku.Dataset('administration')
dataset.write_with_schema(df)
The problem is the call to write_with_schema(df). Here is the exception I get:
Exception: None: b'Internal error, caused by: NullPointerException: null.
I think it has something to do with the dropAndCreate parameter...