-
Metrics and Checks
I am archiving the runs of a recipe as a CSV dataset in a folder on Dataiku and syncing the latest run to a separate dataset. Can I set up a check that compares the latest run with the previous run rather than with predefined numbers?
-
Duplicate rows: need to remove or replace a value
My rows repeat the same information over and over because I now have two columns that contain a computer name. One column has different computer names (information from another database), and because of this the results are duplicated to fill in a value for each computer name that differs. I need to get rid of…
-
Impersonation
We have created two nodes, design and automation, and we were able to move users and connections from the design node to the automation node using the .json files. Is there a similar option for user impersonation, where we can copy the details from design to automation? Operating system used: Linux
-
Setting up Scenario to Run a step once a day
All, I have a data flow that starts with a flow zone that takes a long time to run. I'd like to run this first flow zone just once a day. Is anyone aware of a good, simple way to do this, so that the first time the scenario runs each day the first step runs? Then every other time the scenario runs the step is checked and…
-
API Endpoint: Ingest Webhook Payload Data
Hello, We have a partner company that is offering to deliver JSON data via a webhook, and I am having trouble figuring out a way to take the data and store it in a dataset in my project. I originally tried a SQL query endpoint and passed the parameters to a landing DB table via INSERT - this worked when using the…
-
Is it possible to get notified when a job exceeds a pre-determined duration?
Hi, I was wondering if it is possible in Dataiku to get notified or emailed when a job exceeds a pre-determined duration. For example, I am a user administrator and I want to receive an email if any job has been running for more than 1 hour. Regards, Weng Kin
-
Can't delete any table or folder
When I click "delete" on any table or folder in my project, I get a modal with this error: "invalid argument : invalid loc : empty name". It works on recipes, though. I have all rights on the project, and it used to work well. Operating system used: Mac OS X
-
Save pandas DataFrame in a subfolder of a Managed folder (XLSX files)
Hello experts, hope you're well. I'm writing to you because I can't fix this code to save my DataFrame as an XLSX file in a directory named with today's date. It works fine in CSV, but I need to store my files in XLSX. I've tested to_excel but it doesn't work... the file remains at 0 bytes. Can you help me? Thanks a lot!
-
How to group by several things efficiently
Hi all, I analyse transaction data and I often need to break the metrics down by category while keeping the global value. I want to obtain rows that correspond to the metrics for a precise category of customer (region, for example). Currently I use two different Group recipes to obtain either the values grouped by regions or…
-
Extracting information from photos
Hello, I have a file with photos of glasses and I would like to extract their characteristics (colour, shape...) from those photos. For this, I was thinking about using ChatGPT with Dataiku and generalising the approach to all the photos. Do you know how I could proceed? Thanks in advance for your answer!