create_dataset() missing 1 required positional argument: 'type'

rmnvncnt Registered Posts: 41 ✭✭✭✭✭
edited July 18 in Using Dataiku

Hi,

A few months ago, I used the function


create_dataset(name)

But now I am asked to use


create_dataset(dataset_name, type)

The "type" parameter is new and mandatory, but there is no explanation of what exactly it is. Does anyone have an idea?


project = self.automation.client.get_project('<PROJECT>')
project.create_dataset(dataset_name = '<dataset>', type='Filesystem')

Even when I pass a value for this parameter, I still get this error message:


TypeError: create_dataset() missing 1 required positional argument: 'type'





Thanks !

Answers

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    edited July 18

    Hi,

    The argument is not called "name" but "dataset_name", so you can use either:


    create_dataset("administration", "Filesystem")

    or


    create_dataset(dataset_name="administration", type="Filesystem")
  • rmnvncnt
    rmnvncnt Registered Posts: 41 ✭✭✭✭✭
    Sorry, but I already tried the first solution, and the second changed nothing. The problem is still the "type" parameter, which is reported as "missing".
  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    I'm sorry, we can't reproduce your issue. Here is some sample code that generates a filesystem dataset from scratch:

    dataset = project.create_dataset("my-fs-dataset", "Filesystem")

    definition = dataset.get_definition()
    definition["params"]["connection"] = "filesystem_root"
    definition["params"]["path"] = "/home/centos/titanic/kaggle_titanic_train.csv"
    definition["formatType"] = "csv"
    definition["formatParams"] = {"separator": ",", "style": "excel", "parseHeaderRow": True }
    definition["schema"] = { "columns" : [{"name": "PassengerId", "type":"string"}, {"name": "Survived", "type": "int"}]}
    dataset.set_definition(definition)
  • rmnvncnt
    rmnvncnt Registered Posts: 41 ✭✭✭✭✭
    First, thank you for your answers!
    Here is the code I wrote:


    import dataiku
    import pandas as pd

    # Managed path: <project key>/_V<version>/administration
    file_path = '/'.join([project.project_key,
                          "_V" + str(project.get_variables()['standard']['version']),
                          "administration"])
    format_params = {
        'separator': ',',
        'style': 'unix',
        'parseHeaderRow': True
    }

    try:
        # Try to read the existing dataset
        dataset = dataiku.Dataset('administration')
        df_dataset = dataset.get_dataframe()
    except Exception:
        # If reading fails, create the dataset through the public API
        project = design.client.get_project('ADMINISTRATION')
        project.create_dataset('administration',
                               'Filesystem',
                               params={
                                   "connection": "filesystem_managed",
                                   "path": file_path
                               },
                               formatType='csv',
                               formatParams=format_params)

    df = pd.concat([
        design.table,
        automation.table], sort=False)

    dataset = dataiku.Dataset('administration')
    dataset.write_with_schema(df)

    The problem is the call to write_with_schema(df). Here is the exception I get:
    Exception: None: b'Internal error, caused by: NullPointerException: null.
    I think it has something to do with the dropAndCreate parameter...
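
For anyone landing on this thread with the same TypeError, here is a minimal sketch of the call shapes discussed above. It does not use the real Dataiku client: FakeProject is a hypothetical stand-in whose create_dataset signature only mirrors the parameter names that appear in this thread (dataset_name, type, params, formatType, formatParams), so treat it as an illustration of Python's argument binding rather than of the library itself. It suggests that both the positional and the keyword form satisfy the signature, and that the "missing 1 required positional argument: 'type'" message appears when a call supplies only the name. If the error still shows up after fixing your own call, one possibility (an assumption, since the issue could not be reproduced here) is that some other code path is still calling the old one-argument form.

```python
class FakeProject:
    """Hypothetical stand-in for a project handle; NOT the real Dataiku class."""

    def create_dataset(self, dataset_name, type, params=None,
                       formatType=None, formatParams=None):
        # Both dataset_name and type are required, as in the thread.
        # The returned dict is only an illustration, not a DSS definition.
        return {
            "name": dataset_name,
            "type": type,
            "params": params or {},
            "formatType": formatType,
            "formatParams": formatParams or {},
        }


project = FakeProject()

# Positional form and keyword form bind to the same parameters.
positional = project.create_dataset("administration", "Filesystem")
keyword = project.create_dataset(dataset_name="administration", type="Filesystem")

# The old one-argument form no longer satisfies the signature.
try:
    project.create_dataset("administration")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(positional == keyword)
print(error_message)
```

The exact prefix of the error text varies between Python versions, but the "missing 1 required positional argument: 'type'" part matches the message in the original post.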