-
Remove duplicates
Hi, I have gone through a few of the posts on removing duplicates, but none of them gives a clear answer. Could you please show how I can use a column with a condition so that, if a value repeats, the repeated value is no longer counted and its entire row is left out of the output? Kind regards, Kalpesh
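One possible reading of this question is "keep the first row for each value of a chosen column and drop any later row that repeats it". A minimal Python sketch of that logic, under that assumption, might look like the following. The function name `drop_repeated_rows`, the column `customer`, and the sample rows are hypothetical and do not come from the original post; this illustrates the logic only and is not a Dataiku recipe.

```python
def drop_repeated_rows(rows, key):
    """Keep only the first row seen for each distinct value of `key`."""
    seen = set()   # values of `key` already encountered
    kept = []      # rows that survive de-duplication, in original order
    for row in rows:
        value = row[key]
        if value in seen:
            # This value already appeared, so skip the entire row.
            continue
        seen.add(value)
        kept.append(row)
    return kept


# Hypothetical sample data: customer "A" appears twice.
rows = [
    {"customer": "A", "amount": 10},
    {"customer": "B", "amount": 20},
    {"customer": "A", "amount": 30},
]
deduplicated = drop_repeated_rows(rows, "customer")
```

If the data is already in a pandas DataFrame, `DataFrame.drop_duplicates(subset=[...], keep="first")` expresses the same idea in one call.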
-
Filter recipe: How to avoid stopping processing when there are no matched records
Hi Dataiku users, I want to know how to resolve the situation in the subject. I use a Filter recipe only to process exception data, which I then stack with the main data, so it is not a problem if the output dataset has no filtered records. But in Dataiku, if not all datasets are present in the Stack recipe, it returns an error and stops…
-
How to use streaming python
Hi all! I'm trying to use streaming Python with the example given in the documentation: https://doc.dataiku.com/dss/latest/streaming/cpython.html#writing-to-datasets If I try to follow it, it doesn't quite work: 1) .get_continuous_writer() expects a source-id as one of the arguments 2) if I give something like…
-
Kafka - Restart Failed Process
I get random errors on my Kafka setup due to GCS bucket failures and BigQuery size limits. I'm working with my teams to resolve them, but I would like to know if there is an easy way to restart a continuous process in the event of a failure. I thought about setting up a scenario to start the process every 30 minutes or so, but I'm sure…
-
How do I create backend APIs for the various transformations and visualizations in a flow?
I am trying to create an ML application that can display the various transformations applied to a dataset, along with statistics such as the count of certain rows, their min, max, etc. This app will take data from the flow and display whatever information is needed for the particular run. I need a way to pass the information to this…
-
Extract the underlying code of any recipe in Dataiku
I have a question similar to one posted a few years ago: https://community.dataiku.com/t5/General-Discussion/Python-script-to-export-any-kind-of-recipes-into-SQL/m-p/21298 I have a flow with tons of recipes. I want to convert it into code in a single language: Python, SQL, PySpark... I do not care which. The solution in the link works only…
-
Manage permissions of a Dataiku folder
I want to read a docx file in my Dataiku folder from a Python recipe, but it returns "permission denied". How can I change my folder access permissions? PermissionError: [Errno 13] Permission denied: '/data/dataiku/dss_data/managed_folders/TUMING/DYV5ukXU/GDMS005_001.docx'
-
Dataiku
Hi, is there any functionality similar to MLflow within Dataiku? Regards, Varun
-
Writing an if statement to check if a value is not contained in an array
Hey everyone! Currently I am using the Prepare recipe, and specifically the "Create if, then, else statements" processor. For one of the if statements, I want to check whether each value in a specific column is not contained in a given array. I know that I can compare a value with another value;…
-
Using a custom Python model for clustering (Agglomerative Clustering)
Hi all, I have a question regarding custom Python models for a clustering task. I am trying to do something really basic: run Agglomerative Clustering with different metric and linkage methods (both included natively in sklearn). At the moment, I seem to be unable to do so with the default model in Dataiku, so I…