Hi Frank,
You should do this with a custom Python recipe in a plugin or with Python code in a scenario (I usually do it in a scenario), with something similar to:
import dataiku
from dataikuapi import SyncRecipeCreator
from dataiku.scenario import Scenario
import pandas as pd
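client = dataiku.api_client()
project = client.get_project(dataiku.default_project_key())
scenario = Scenario()  # handle on the running scenario
dataset_name = "input_dataset"  # placeholder: name of the dataset to create
folder_path = "path/to/file/"  # don't forget the trailing /
file = "file.csv"
df = pd.read_csv(folder_path + file)  # adapt the parameters to your file (e.g. sep=';')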
# Create the dataset on top of the file; adapt the connection and format parameters too
dataset = project.create_dataset(
    dataset_name,
    'Filesystem',
    params={'connection': 'filesystem_root', 'path': folder_path + file},
    formatType='csv',
    formatParams={'separator': ';', 'style': 'no_escape_no_quote', 'parseHeaderRow': True},
)
# I usually set every column to string first and change the types afterwards
dataset.set_schema({'columns': [{'name': column, 'type': 'string'} for column in df.columns]})
builder = SyncRecipeCreator("sync_output_dataset", project)
builder = builder.with_input(dataset_name)
builder = builder.with_output("output_dataset", append=False)
recipe = builder.build()
scenario.build_dataset("output_dataset", build_mode='NON_RECURSIVE_FORCED_BUILD')
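If reading the whole file with pandas just to get the column names feels heavy, the header row alone is enough to build the all-string schema. Below is a minimal sketch of that idea in plain Python. The helper name `all_string_schema` is hypothetical (it is not part of the Dataiku API), and it assumes the same format as above: a `;` separator, a header on the first row, and no quoting or escaping.

```python
def all_string_schema(header_line, separator=";"):
    """Build a schema dict where every column is typed as 'string'.

    Hypothetical helper: assumes an unquoted, unescaped header row,
    matching the 'no_escape_no_quote' style used above.
    """
    stripped = header_line.strip()
    if not stripped:
        # No header content at all: return an empty schema
        return {"columns": []}
    names = [name.strip() for name in stripped.split(separator)]
    return {"columns": [{"name": name, "type": "string"} for name in names]}


# Example with a made-up header row
schema = all_string_schema("id;name;amount\n")
print(schema)
```

The resulting dict has the same shape as the one passed to `dataset.set_schema(...)` above, so it could probably be used in its place, with the first line of the file as `header_line`.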