-
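Dynamic filtering on metadata columns when retrieving data from a knowledge bank created by embedding a dataset
I ran an Embedding recipe on a dataset that has metadata columns such as age and gender plus a free-text column, and created a knowledge bank. I configured the age and gender columns as metadata columns. I am now trying to retrieve data from that knowledge bank from a visual agent, using the Knowledge Bank Search tool. I would like to filter dynamically on the metadata columns. Is this possible? I assumed that checking the Allow dynamic filtering option on the Knowledge Bank Search tool screen would enable it, but it does not work. As for the target columns' meaning, Bag of…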
-
Censored Regression
Modelers often encounter censored data, that is, data that falls above some threshold x or below some threshold y. There are typical approaches for building a regression model on such data, but they are currently not available in Visual ML. As such, Interval-censored regression, Tobit regression, or censored…
-
Can the hyperparameters change each time a model is retrained on a new dataset?
Dear dataikuler, thanks for reading my question. My problem is that when I retrain my model with a different dataset (for example, my first dataset runs from 12/10/2024 to 12/10/2025 and my second from 30/11/2024 to 30/11/2025) and then deploy the second model, I check the hyperparameters of each version and I see all of them…
-
Turn a custom model in the flow into a model object
I was told that it was possible to turn a custom trained model, typically stored in a managed folder, into a visual model object in the flow. Currently our flow looks like this: but we would like to see something like this in the flow: I couldn’t find any documentation on how to do this, so I’m turning to the Dataiku…
-
<class 'json.decoder.JSONDecodeError'> when evaluating a deployed Random Forest model
How to replicate: On Windows 10, download the latest Dataiku DSS on-premise version (13.2.3). Create a new project and upload any dataset with a "target" column holding a binary value. Click the dataset - Lab - AutoML Prediction - Quick Prototype - Train a Random Forest model on "target", using default settings. Deploy the…
-
RAG LLM for multiple datasets
Greetings, while working with the Embedding recipe, we ran into a limitation: we have two datasets that we want to apply RAG to. How can we apply the knowledge bank to both of them specifically? Regards
-
Support image segmentation in labelling tasks and Visual ML
I have a couple of use cases where I need to train image (instance) segmentation models (as opposed to predicting bounding boxes in object detection). I'd love for the ML labelling to support image segmentation approaches. For example using SAM (Segment Anything Model) to pre-segment images which can then be annotated by…
-
Support for 2-way partial dependence plots
I'd love to see support for 2-way partial dependence plots in model summary reports, to get insight into how the interaction of 2 features affects the model. This would give deeper insight into feature behavior in the model at hand. See here under 4.1.1 for the sklearn implementation 4.1. Partial Dependence and…
-
Feature handling Dummy encoding
With category handling = Dummy encoding and the drop-dummy option enabled, Dataiku seems to use the level with the least exposure/volume as the dropped dummy. Q1. Is there a way to set this dummy manually instead of using Dataiku's default method? I want to avoid the category handling = custom preprocessing option. Q2. Using Variable type =…
-
Trouble Training new Models in an existing Project
Hey there, I am having trouble training new models in an existing project. Whether I update an existing recipe or deploy the newly trained model in a new visual tool in the flow, whenever I try to score a dataset I get the following error: Error in python process: <class…