-
Filtering files in a folder based on an external list
OK, I am beyond (or behind) newbie level on Dataiku, so bear with me on this. I have a folder containing CSV files; it currently holds 3,000 files, each probably 100KB at most, but all together they come to maybe 15MM. I've created a dataset based on this folder and filter only the rows that I need with a…
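A minimal sketch of one possible approach, using the Dataiku Python API in a Python recipe; the folder name "csv_folder", the dataset "files_to_keep", and its "filename" column are assumptions for illustration only:

```python
import dataiku
import pandas as pd

# Assumed names: "csv_folder" is the managed folder holding the CSV files,
# "files_to_keep" is a dataset with the external list in a "filename" column.
folder = dataiku.Folder("csv_folder")
keep = set(dataiku.Dataset("files_to_keep").get_dataframe()["filename"])

frames = []
for path in folder.list_paths_in_partition():
    # Paths come back like "/name.csv"; compare on the bare file name.
    if path.split("/")[-1] in keep:
        with folder.get_download_stream(path) as stream:
            frames.append(pd.read_csv(stream))

result = pd.concat(frames, ignore_index=True)
```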
-
Add a recipe in the middle of the flow
Dear Dataiku users, I am having trouble finding an answer to my question. I am building a Dataiku flow with multiple recipes. Is it possible to add a recipe (say, a Prepare recipe) in the middle of the flow? I just want to reorganize the order of my columns for the last datasets created in my flow. The only solution…
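The question is about inserting a recipe into the Flow itself, but if a code step is acceptable, a Python recipe that only reorders columns is a minimal alternative; the dataset names and column order below are placeholders:

```python
import dataiku

# Placeholder dataset names; this recipe only reorders columns.
df = dataiku.Dataset("my_input").get_dataframe()

desired_order = ["id", "date", "amount"]          # columns to put first (assumed)
remaining = [c for c in df.columns if c not in desired_order]
df = df[desired_order + remaining]

dataiku.Dataset("my_output").write_with_schema(df)
```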
-
Score Recipe - FileNotFoundError
I'm following the Scoring Basics course, and when I try to run the score recipe, I get the following error:
[09:46:39] [INFO] [dku.utils] - *************** Recipe code failed **************
[09:46:39] [INFO] [dku.utils] - Begin Python stack
[09:46:39] [INFO] [dku.utils] - Traceback (most recent call last):
[09:46:39] [INFO]…
-
Integrating with GitHub and Dataiku DSS
Hello, As per the documentation for working with GitHub (Working with Git — Dataiku DSS 12 documentation), I need a DSS user's public SSH key, and if I don't have one, I need to generate an SSH key. However, the example shown in the documentation is for those who use Dataiku Cloud. (I don't use Dataiku Cloud but the Enterprise edition.) How do I…
-
Dataiku server
Is the server down at the moment? I can't log in to the learning platform. Regards! Operating system used: Windows
-
Problem using data preparation recipe: Input dataset is not ready (table does not exist)
Using the data preparation recipe I am getting the following error, even though the table to prepare exists: "Input dataset is not ready (table does not exist)". More information on this error in DSS documentation. Additional technical details:
* Error code: ERR_JOB_INPUT_DATASET_NOT_READY_NO_TABLE
* Error type:…
-
Extract train/test/validation sets from visual ML
Hi, I am looking for ways to extract the exact train/test/validation sets used in visual ML. This means not only the data splits, but also datasets that include all the new features created by the data processing in visual ML. For example, if dummy encoding is used for a text column, I would like the…
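This does not recover the exact splits DSS drew internally, but as a rough illustration of the kind of output being asked for, here is a sketch that rebuilds dummy-encoded features and a fixed-seed split outside visual ML; the dataset name, column name, test size, and seed are assumptions:

```python
import dataiku
import pandas as pd
from sklearn.model_selection import train_test_split

# Placeholder dataset and column names.
df = dataiku.Dataset("training_data").get_dataframe()

# Dummy-encode a text column roughly the way visual ML would (approximation only;
# DSS may drop rare values or name the resulting columns differently).
df = pd.get_dummies(df, columns=["text_col"], prefix="text_col")

# Fixed-seed 80/20 split; this will NOT match the split DSS drew internally.
train, test = train_test_split(df, test_size=0.2, random_state=42)
```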
-
Using the MapQuest API from dockerized DSS: Name or service not known
Hi there, I would like to use MapQuest to geocode addresses, but I get the following error: open.mapquestapi.com: Name or service not known. Is that service still alive?
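Since "Name or service not known" is a DNS resolution failure rather than an HTTP error, a quick check from inside the container (for example in a DSS Python notebook) can help narrow it down; this is only a diagnostic sketch:

```python
import socket
import requests

host = "open.mapquestapi.com"

# Check whether DNS resolves at all from inside the container;
# "Name or service not known" means the lookup itself failed.
try:
    print(socket.gethostbyname(host))
except socket.gaierror as exc:
    print(f"DNS lookup failed inside the container: {exc}")

# If DNS works, a plain HTTPS request should at least reach the service
# (a real geocoding call would still need an API key).
resp = requests.get(f"https://{host}", timeout=10)
print(resp.status_code)
```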
-
Reading all files from a directory
Hello! I'm totally new to Dataiku, coming from KNIME. I'm trying to learn and replicate what I have done in the other software. I would like to read all the Excel files contained in a folder located on a shared drive, for example on a U:// or G:// drive. Once everything is read, I simply monitor the files that have been added/changed…
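A minimal sketch of one way to do this in a Python recipe, assuming the share is mounted on the DSS server at a path such as /mnt/shared_drive/reports (Windows drive letters like U:// are not visible to a Linux DSS server):

```python
from pathlib import Path
import pandas as pd

# Assumed mount point; the share must be readable by the DSS service user.
source = Path("/mnt/shared_drive/reports")

frames = []
for xlsx in sorted(source.glob("*.xlsx")):
    df = pd.read_excel(xlsx)
    # Keep the file name and modification time to monitor added/changed files.
    df["source_file"] = xlsx.name
    df["modified"] = pd.Timestamp(xlsx.stat().st_mtime, unit="s")
    frames.append(df)

all_files = pd.concat(frames, ignore_index=True)
```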
-
Is there a way to split the output file into smaller chunks?
Hi, I have a Dataiku flow to format the data and generate CSV files on Azure Blob Storage. However, it seems the output files are over 100MB each. We would like to have control over the output by splitting the data into smaller files of not more than 50MB each. I wonder if this is possible to do in a Dataiku flow? Thanks…
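One possible approach is a Python recipe that streams the input in chunks and writes each chunk as a separate CSV into a managed folder on the Azure connection; the dataset and folder names are placeholders, and the per-row size estimate is only approximate:

```python
import io
import dataiku
import pandas as pd

TARGET_BYTES = 50 * 1024 * 1024  # 50MB cap per output file

input_ds = dataiku.Dataset("formatted_data")      # placeholder dataset name
out_folder = dataiku.Folder("azure_output")       # managed folder on the blob connection

# Estimate how many rows fit in 50MB from a small sample, then stream in chunks.
sample = input_ds.get_dataframe(limit=1000)
bytes_per_row = max(1, len(sample.to_csv(index=False).encode()) // max(1, len(sample)))
rows_per_file = max(1, TARGET_BYTES // bytes_per_row)

for i, chunk in enumerate(input_ds.iter_dataframes(chunksize=rows_per_file)):
    data = chunk.to_csv(index=False).encode("utf-8")
    out_folder.upload_stream(f"part_{i:04d}.csv", io.BytesIO(data))
```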