-
API Node - What is the use of the folder 'code-envs-cache'?
Hello, on the API node we noticed that the folder 'code-envs-cache' takes up a lot of disk space. Could you please tell us how this folder is initialized and used? Thanks, Annie
-
Is there a way to roll back the changes that we have done to a flow?
Hi all, when our flow involves a lot of transformations, there can come a point where something goes wrong and we want to revert to the point where everything was working. Is there a way I can check which changes were made to the flow, and go back to the previous version or branch out from there?
-
Efficiently applying window recipes across monthly database partitions
Hi all, I’m looking for best practices when applying window recipes that need to span multiple partitions. In my case, I have a dataset partitioned by month, and I’m using a window recipe with lead/lag functions to look ahead and behind by 3 months. To make this work, I currently: Unpartition the dataset using a sync…
-
SQL Query Recipes using API
I'm struggling to find good code examples on creating SQL query recipes via API version 14 (or compatible). I'm trying to get subsets of data pulled from a SQL table dataset into separate Azure blob datasets for consumption by other parts of our application. It seems like it should be straightforward to find examples, but…
-
How can I retrieve the list of foreign datasets used in my Dataiku project?
How can I retrieve the list of foreign datasets used in my Dataiku project that originate from other projects?
-
Failed to read data from DB
I have an empty prepare recipe, used to copy a dataset from MySQL to an Oracle database. The job used to run successfully, but for the past week it has been failing. I tried a sync, and I am getting this error: Failed to read data from DB, caused by: SQLException: Query failed : Query exceeded distributed user memory limit of…
-
"Enrich records with files info" in prepare recipes working only on csv files ?
Hi all, I'm working on a prepare recipe that comes right after an initial dataset grouping several files. I have no problem with the step when all my initial files are .csv. But when my files are .xml, the resulting column is empty. I get the same empty result when my files are "on data per-line" based. What am I missing here? (in…
-
How to retrieve input datasets for a specific dataset using the Python API?
Hi everyone, I'm trying to use the Dataiku Python API to identify which input datasets were used to create a specific dataset within a project. For example, in the project "PRISME_INTEGRATION_TABLES", I want to retrieve the direct input datasets that were used to generate the dataset "PRS_Decision_Complement". I attempted…
-
Label task
Hello, I have created a label task for the evaluation of a query. The input is a sample of rows from a bigger dataset. With the label task you can then assign one of five categories. Using the label task has led to changes to the query that feeds the label task. Now I want to reset the task / remove the data associated…
-
Read .xlsx file from managed folder using Python
Hi team, I'm using a Python recipe in Dataiku to read a specific .xlsx file from a managed folder, but I'm encountering an error when trying to load the file into a DataFrame. Here’s a simplified version of my code: folder = dataiku.Folder("FOLDER_ID") file_list = folder.list_paths_in_partition() last_month_str = "YYYYMM"…