-
Can the hyperparameters change each time a model is retrained on a new dataset?
Dear Dataikuler, thanks for reading my question. My problem is this: when I retrain my model on a different dataset (for example, my first dataset runs from 12/10/2024 to 12/10/2025 and my second from 30/11/2024 to 30/11/2025) and then deploy the second model, I check the hyperparameters of each version and I see all of them…
-
Turn a custom model in the flow into a model object
I was told that it was possible to turn a custom trained model, typically stored in a managed folder, into a visual model object in the flow. Currently our flow looks like this: but we would like to see something like this in the flow: I couldn’t find any documentation on how to do this, so I’m turning to the Dataiku…
-
<class 'json.decoder.JSONDecodeError'> when evaluating a deployed Random Forest model
How to replicate: On Windows 10, download the latest Dataiku DSS on-premise version (13.2.3). Create a new project and upload any dataset with a "target" column holding a binary value. Click the dataset - Lab - AutoML Prediction - Quick Prototype - Train a Random Forest model on "target", using default settings. Deploy the…
-
RAG LLM for multiple datasets
Greetings, While working with the embedding recipe, we ran into a limitation: we have two datasets that we want to apply RAG to. How can we apply the knowledge bank to them specifically? Regards
-
Support image segmentation in labelling tasks and Visual ML
I have a couple of use cases where I need to train image (instance) segmentation models (as opposed to predicting bounding boxes in object detection). I'd love for the ML labelling to support image segmentation approaches. For example using SAM (Segment Anything Model) to pre-segment images which can then be annotated by…
-
Support for 2-way partial dependence plots
I'd love to see support for 2-way partial dependence plots in model summary reports, to get insights into how the interaction of 2 features affects the model. This would give some deeper insight into feature behavior in the model at hand. See here under 4.1.1 for the sklearn implementation 4.1. Partial Dependence and…
-
Feature handling Dummy encoding
With category handling = Dummy encoding and the drop-dummy option enabled, Dataiku seems to use the level with the least exposure/volume as the dropped dummy. Q1. Is there a way to set this dummy manually instead of relying on Dataiku's default method? I want to avoid using the category handling = custom preprocessing option. Q2. Using Variable type =…
-
Trouble Training new Models in an existing Project
Hey there, I am having trouble training new models in an existing project. Whether I update an existing recipe or deploy the newly trained model in a new visual tool in the flow, whenever I try to score a dataset I get the following error: Error in python process: <class…
-
Are there different ways to set up code environments?
I am trying to install PyTorch for Python 3 in a code environment in Data Science Studio. I can install it in the Python 3.5 installation on the system that Data Science Studio runs on. I've tried putting torch in the REQUESTED PACKAGES (PIP) part of the code environment administration, but that doesn't work because pytorch…
-
torch.cuda.is_available() is False
I am trying to load a pickle file of a pre-trained model to my code recipe but get the following error message: I have already selected the "ai-exec-t4-gpu" in the "Containerized Execution" tab of the environment. I do not understand exactly what could be going wrong here. Appreciate your help!
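A minimal diagnostic sketch for the situation described above, not a confirmed fix. It assumes the recipe runs in a Python code environment where PyTorch may or may not be installed. The helper name `cuda_report` is hypothetical. It only gathers the facts usually worth checking first: whether torch can be imported, whether the installed build was compiled with CUDA support, and whether any GPU is visible to the process.

```python
import importlib.util


def cuda_report():
    """Collect basic facts about the PyTorch install and its view of the GPU.

    Hypothetical helper for illustration; not part of Dataiku or PyTorch.
    """
    report = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if not report["torch_installed"]:
        # Nothing more to check if torch is missing from this environment.
        return report

    import torch

    report["torch_version"] = torch.__version__
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` comes back as `None`, the environment holds a CPU-only build of PyTorch, and `torch.cuda.is_available()` stays `False` regardless of the container configuration selected.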