When I run recipes on tables built from huge datasets, they usually take a long time to compute. When a single recipe runs for more than 5 hours, I get a token expired error and the job fails. Is there a way to extend the allowed run duration?
Hi, I want to limit the number of rows in a Dataiku dataset. It should keep only the latest 90 rows and delete the oldest. The dataset is built by appending one row at a time. Instead of appending directly to the dataset, I tried creating another dataset with just the newest row and a Python recipe to implement the…
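The trimming logic described in this question can be sketched in plain Python, independent of any Dataiku API. This is only a minimal illustration, assuming the existing rows are already in chronological order (oldest first). The names `append_and_trim` and `MAX_ROWS` are hypothetical and do not come from the original post.

```python
from collections import deque

MAX_ROWS = 90  # retention limit taken from the question (hypothetical name)


def append_and_trim(existing_rows, new_row, max_rows=MAX_ROWS):
    """Append new_row and return at most max_rows rows, oldest first.

    Assumes existing_rows is ordered from oldest to newest.
    """
    # A deque with maxlen drops items from the left (the oldest rows)
    # as soon as the limit would be exceeded.
    window = deque(existing_rows, maxlen=max_rows)
    window.append(new_row)
    return list(window)


# Illustrative data only: 90 existing rows plus one new row.
rows = [{"id": i} for i in range(90)]
rows = append_and_trim(rows, {"id": 90})
print(len(rows), rows[0]["id"], rows[-1]["id"])  # prints: 90 1 90
```

In a recipe, the returned rows would presumably be written back by overwriting the output rather than appending to it, but how that is wired up depends on the dataset configuration and is not shown here.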
Hi, we are currently facing an issue when running a scoring recipe with the "Output Explanations" checkbox selected. The error message is captured in the attached screenshot. We have examined the input dataset but have been unable to pinpoint the root cause of the issue. Interestingly, the job completes successfully…
Hi there, I understand that a dataset that is the output of one recipe cannot also be used as the output of another recipe. Is there a workaround for this? My situation is as follows: one of the recipes has a condition under which, if necessary, it would need to update the output of another recipe with the datasets. The datasets of the other recipes…
Hi, is there any timeline for when Dataiku will be supporting Pandas v2? Thanks
Hello team, I am trying to redispatch a discrete partition through the Sync recipe, using the process described in this documentation. I have tried running it on the DSS engine; however, on my end the recipe fails with the "Job process died (killed - maybe out of memory ?)" error. The Spark, Hive, and/or Impala engines have…
Hi all, I am using the API Designer, in which I created an API that takes some datasets from the flow along with the API response fields, performs some calculations, and stores the result in a pandas DataFrame. This DataFrame is created every time the API is called with a new set of responses. I want to store…
I am taking a course that teaches Dataiku as an add-on to the curriculum. I would like to know more about techniques for improving model performance on a classification problem using decision trees and random forests. Also, I see that we can only view the test results when running the models. Is there a way to see…
Hello everyone. I am trying to train a simple cat vs. dog image classification model using Keras in Dataiku DSS. However, I am having difficulty constructing the path for flow_from_directory(). Before we get started, here is the structure of the training data I am using. "training_dataset" is present in the…
Good morning, we would like to print the size of every dataset in a specific flow zone through Python code, in order to monitor the disk space already used. Do you know if there is a way to accomplish this? Operating system used: Linux RedHat