-
RAG LLM for multiple datasets
Greetings, While working with the embedding recipe, we ran into a limitation: we have two datasets that we want to apply RAG to. How can we apply the knowledge bank to each of them specifically? Regards
-
Support image segmentation in labelling tasks and Visual ML
I have a couple of use cases where I need to train image (instance) segmentation models (as opposed to predicting bounding boxes in object detection). I'd love for ML labelling to support image segmentation approaches, for example using SAM (Segment Anything Model) to pre-segment images which can then be annotated by…
-
Support for 2-way partial dependence plots
I'd love to see support for 2-way partial dependence plots in model summary reports, to get insight into how two features interact in their impact on the model. This would give deeper insight into feature behavior in the model at hand. See here under 4.1.1 for the sklearn implementation: 4.1. Partial Dependence and…
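For readers unfamiliar with the request, here is a minimal from-scratch sketch of what a 2-way partial dependence surface computes. It is not Dataiku's or sklearn's implementation; `partial_dependence_2way`, `toy_predict`, and the sample rows are hypothetical names and data made up for illustration, and the brute-force averaging shown is just one common way to compute it.

```python
from itertools import product
from statistics import mean


def partial_dependence_2way(predict, rows, i, j, grid_i, grid_j):
    """Brute-force 2-way partial dependence.

    For each pair (a, b) in grid_i x grid_j, overwrite features i and j of
    every row with (a, b) and average the model's predictions over all rows.
    """
    surface = {}
    for a, b in product(grid_i, grid_j):
        preds = []
        for row in rows:
            modified = list(row)  # copy so the input data is left untouched
            modified[i] = a
            modified[j] = b
            preds.append(predict(modified))
        surface[(a, b)] = mean(preds)
    return surface


# Hypothetical toy model with an interaction between features 0 and 1.
def toy_predict(x):
    return x[0] * x[1] + x[2]


rows = [[1.0, 2.0, 0.0], [3.0, 4.0, 2.0], [5.0, 6.0, 4.0]]
pd_surface = partial_dependence_2way(
    toy_predict, rows, 0, 1, grid_i=[0.0, 1.0], grid_j=[0.0, 2.0]
)
```

Plotting `pd_surface` as a heatmap or contour over the two grids gives the 2-way plot. The interaction shows up when the surface cannot be written as the sum of the two 1-way curves.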
-
Feature handling Dummy encoding
With category handling = Dummy encoding and the option to drop a dummy, Dataiku seems to use the level with the least exposure/volume as the dropped dummy. Q1. Is there a way to set this dummy manually instead of relying on Dataiku's default method? I want to avoid using the category handling = custom preprocessing option. Q2. Using Variable type =…
-
Trouble Training new Models in an existing Project
Hey there, I am having trouble training new models in an existing project. Whether I update an existing recipe or deploy the newly trained model in a new visual tool in the flow, I get the following error whenever I try to score a dataset: Error in python process: <class…
-
Are there different ways to set up code environments?
I am trying to install PyTorch for Python 3 in a code environment in Data Science Studio. I can install it into the Python 3.5 installation on the system that hosts Data Science Studio. I've tried putting torch in the REQUESTED PACKAGES (PIP) section of the code environment administration, but that doesn't work because pytorch…
-
torch.cuda.is_available() is False
I am trying to load a pickle file of a pre-trained model into my code recipe but get the following error message: I have already selected "ai-exec-t4-gpu" in the "Containerized Execution" tab of the environment. I do not understand what could be going wrong here. Appreciate your help!
-
I have pre-trained models that I would like to use in a Dataiku recipe.
Hi, I have pre-trained models on my local machine that I would like to use in a recipe. One model is trained using the alibi-detect library and the other is the popular SAM model. I'd appreciate any tips on how to use these models in a Dataiku recipe.
-
How to add a custom calculated model metric into the evaluation store in Dataiku?
Hello everyone, I am currently working on a project in Dataiku and trying to log a custom model metric into the evaluation store. The model is not a Visual ML model; it's a custom model logged in Dataiku as a saved model using the MLflow integration that Dataiku offers. However, I am not sure how to add a custom…
-
Turn a custom model in the flow into a model object
I was told that it was possible to turn a custom trained model, typically stored in a managed folder, into a visual model object in the flow. Currently our flow looks like this: but we would like to see something like this in the flow: I couldn’t find any documentation on how to do this, so I’m turning to the Dataiku…