-
Training Failed – Subprocess Did Not Connect in 60000ms (SocketTimeoutException)
Hi everyone, I'm encountering an issue when trying to train a model in Dataiku DSS. The training fails with the following error: Training failed. Subprocess did not connect in 60000ms, it probably crashed at startup. Check the logs., caused by: SocketTimeoutException: Accept timed out. I am on a MacBook Air, macOS…
-
How to run integration tests on flows with Python recipes
I've recently started to use the "Run integration test" scenario step for testing. It's definitely some work to create the test reference datasets, but once set up it's great to be able to run this test after later code changes to confirm the process works as expected. Our flows mostly use SQL script recipes.…
-
Examples for custom prediction in API Designer
Are there any actually useful code examples of using custom prediction in Python? I have a model that exists in my Flow, and I want to start by using that model to make a prediction just like the Prediction model API endpoint would, then add more custom code on top of that. The boilerplate code imports dataiku and…
-
Dataiku
While I am trying to add some column values in the resultant column I am getting the value as NaN. Operating system used: Windows
-
Unusual Error with Group By Recipe in Dataiku
Hello, I’ve encountered an unusual error while using the Group By recipe in Dataiku. Here’s a summary of the issue: Context: I created a Group By recipe on three columns, applying three custom aggregations using SQL. Input Data: The recipe takes as input a PostgreSQL (PGSQL) table, which is the output of a JOIN operation…
-
Storing and Retrieving Embeddings in Knowledge Bank via Python
Hello Team, I hope you are doing well. I am currently working on a project in Dataiku 13.1.2, where I am generating embeddings using LLM Mesh in Python code. At present, I am storing these embeddings in a PostgreSQL dataset. However, I would like to store them directly into a Knowledge Bank using Python code. Key…
-
Using a variable or not depending on the scenario
Hi guys, I have a flow with 2 different scenarios. I have one variable, v_idproduct, used in the post-filter of a join recipe, in SQL code (id_product IN v_id_product). In each scenario I have a different list of product IDs. I want to modify one of the scenarios so that this filter is no longer applied, allowing all product IDs…
-
How to get relationships between flow zones and the datasets
What functionality exists to show the relationships between flow zones? Say we have 42 flow zones with an average of 50 datasets each. Is there a way to summarize the relationships? I am interested in seeing things like: Which flow zones have the same color? Which flow zones have datasets that feed into other ones? Which…
-
Errors for a recipe with two outputs, a Dataset and a Managed Folder
Hi everyone, as of today I am getting an error for a recipe that has two outputs, one Dataset and one Managed Folder. Do we know what changed in the recent Dataiku update that could cause this?
-
Chrome binary not found inside Dataiku instance
Hello, I am trying to use a function that launches Chrome so the user can log in to a website. Once the user logs in, web scraping code will be executed. However, when I run the function I get the error below. Are we allowed to launch a Chrome login inside a Dataiku instance? Chrome binary not found at /usr/bin/chromium-browser