Using Dataiku

Filter recipe: How to avoid stopping processing when there are no matched records
Hi Dataiku users, I want to know how to resolve the situation described in the subject. I use a Filter recipe only to process the exception data, and stack the result with the main data afterwards, so it is not a problem if the output dataset contains no filtered records. But in Dataiku, if not all of the datasets are present in the Stack recipe, it returns an error and stops…
How to use streaming python
Hi all! I'm trying to use streaming Python with the example given in the documentation: https://doc.dataiku.com/dss/latest/streaming/cpython.html#writing-to-datasets If I try to follow it, it doesn't quite work: 1) .get_continuous_writer() expects a source-id as one of the arguments 2) if I give something like…
Kafka - Restart Failed Process
I get random errors on my Kafka due to GCS bucket failures and BigQuery size limits. I'm working with my teams to resolve them, but I want to know if there is an easy way to restart a continuous process in the event of a failure. I thought about setting up a scenario to start the process every 30 minutes or so, but I'm sure…
How do I create backend APIs for the various transformations and visualizations in a flow?
I am trying to create an ML application which can display the various types of transformations that happen in a dataset, like the count of certain rows, their min, max, etc. This app will take data from the flow and display whatever information is needed for the particular run. I need a way to pass the information to this…
Extract underlying code of any recipe on dataiku
I have a similar question to the one posted a few years ago: https://community.dataiku.com/t5/General-Discussion/Python-script-to-export-any-kind-of-recipes-into-SQL/m-p/21298 I have a flow with tons of recipes. I want to convert that into a single piece of code: Python, SQL, PySpark... I do not care. The solution in the link works only…
Manage Permissions of Dataiku folder
I want to read a docx file in my Dataiku folder from a Python recipe, but it returns a permission denied error. How can I change my folder access permissions? PermissionError: [Errno 13] Permission denied: '/data/dataiku/dss_data/managed_folders/TUMING/DYV5ukXU/GDMS005_001.docx'
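A `PermissionError: [Errno 13]` like the one quoted above means the operating system refused the read for the user running the Python process. As a first diagnostic, a minimal standard-library sketch such as the following can show whether the path is visible to that user and what its mode bits are. The helper name `describe_access` is hypothetical, and this is only a way to inspect the situation, not a Dataiku-specific fix.

```python
import os
import stat


def describe_access(path):
    """Report whether `path` exists, is readable by this process, and its mode."""
    # Note: os.path.exists also returns False when a parent directory
    # cannot be traversed by the current user.
    info = {"path": path, "exists": os.path.exists(path)}
    if info["exists"]:
        # os.access checks permissions for the user running this process.
        info["readable"] = os.access(path, os.R_OK)
        # stat.filemode renders the mode bits in the style of `ls -l`.
        info["mode"] = stat.filemode(os.stat(path).st_mode)
    return info


# Hypothetical usage: replace "." with the path shown in your own traceback.
print(describe_access("."))
```

If `readable` comes back `False`, the mode string indicates which owner, group, or other bits would have to change, which is typically something for whoever administers the server.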
Dataiku
Hi, is there any functionality similar to MLflow within Dataiku? Regards, Varun
Writing an if statement to check if a value is not contained in an array
Hey everyone! Currently I am using the Prepare recipe, and specifically the "Create if, then, else statements" processor. For one of the if statements, I want to write a condition that checks whether each value in a specific column is not contained in a given array. I know that I can compare a value with another value;…
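The check described in this post (flag each value in a column that is not contained in a given array) can be sketched in plain Python, for readers who do this in a Python step instead. The column values and the `allowed` list below are made-up illustrations, and this is not the syntax of the Prepare recipe processor itself.

```python
def flag_missing(values, allowed):
    """Return one boolean per value: True when it is NOT in `allowed`."""
    # Build the set once so each membership check is constant time on average.
    allowed_set = set(allowed)
    return [value not in allowed_set for value in values]


# Hypothetical data: a column of values and the array to compare against.
column_values = ["red", "green", "purple", "blue"]
allowed = ["red", "green", "blue"]

flags = flag_missing(column_values, allowed)
print(flags)  # [False, False, True, False]
```

Only `"purple"` is flagged, because it is the only value absent from `allowed`.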
Using custom python model for Clustering (Agglomerative Clustering)
Hi all, I have a question regarding custom Python models for a clustering task. I am trying to do something really basic, like running Agglomerative Clustering using a different metric and linkage method (both included natively in sklearn). At the moment, I seem to be unable to do so with the default model in Dataiku, so I…
Run an Application Recipe from a Python Recipe
Hi, I created an application recipe in project A, which I want to use in project B. This recipe requires multiple parameters as input. In project B, these parameters are defined in my variables. As the application recipe does not allow for a reference to these variables, I am trying to find a workaround for this. I initially…

Trending Discussions

Docs for "pandasutils"?
Hello, My apologies if this is a remedial question, but at the start of every Python recipe the boilerplate code includes an import of: from dataiku import pandasutils as pdu Is there documentation for pandasutils? Is it a package that can be used in Python recipes? I've tried looking through the Dataiku Developer Guide,…
Run a Time Series Forecasting Model
I get the following error message: Failed to train : <class 'ImportError'> : libcuda.so.1: cannot open shared object file: No such file or directory Operating system used: 13.1.4
Identifying the Node Type in a DSS Notebook using Python
In Python, in a DSS notebook, I want to know if the code is running in the design node or the automation node. How can I do that?
