I use DSS v13 to push execution of visual recipes to containerized execution on a Kubernetes (k8s) cluster, using Spark as the execution engine. I pushed two images to the registry: dku-exec-base and dku-spark-base. However, when I run the recipe it runs forever (creating and deleting pods in k8s). I found this line…
Hi all, generally speaking, what are the best ways to host e.g. an instruct fine-tuned Falcon 7B model in Dataiku? Would it be building a Code Studio and using vLLM, or something along those lines? Or is there a capability for this as part of the LLM Mesh? We'd like to host open source models that are instruct fine…
Hello, I used feature generation in an AutoML prediction model and noticed that my core variables were repeated with "computed" in parentheses, each with a different regression coefficient. I was wondering if there is any documentation on how Dataiku handles interactions for feature generation, or if anyone could…
Hello, I would like to delete the nth row of my output dataset, but I don't know how to do that.
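One way to approach this, for example in a Python recipe, is to drop the row by position with pandas before writing the output. This is a minimal sketch: the helper name `drop_nth_row` is hypothetical, rows are counted from 1, and reading from or writing to the Dataiku dataset is not shown.

```python
import pandas as pd


def drop_nth_row(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """Return a copy of df without its n-th row (1-based position)."""
    if not 1 <= n <= len(df):
        raise IndexError(f"n must be between 1 and {len(df)}, got {n}")
    # Slice by position so the result does not depend on index labels
    kept = pd.concat([df.iloc[: n - 1], df.iloc[n:]])
    return kept.reset_index(drop=True)


# Example with made-up data: remove the 3rd row
df = pd.DataFrame({"id": [10, 20, 30, 40]})
result = drop_nth_row(df, 3)
print(result["id"].tolist())  # [10, 20, 40]
```

Note that "the nth row" is only meaningful if the dataset has a stable row order; if it does not, filtering on a column value is likely safer.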
In the above example, with Root element XPath = /ORDERS/ORDER/ORD_DETAIL_set/ORD_DETAIL and XPaths to context = /ORDERS/ORDER/ORD_NUM/text() → ORD_NUM, the resulting ORD_NUM is not aligned with the ORD_DETAIL_set but lags by one row, I believe because it expects the XPaths to context to come before the Root element XPath. Is there…
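If the built-in XML extraction cannot be made to align the two paths, one alternative is to flatten the file in Python, taking ORD_NUM from the ORDER that encloses each ORD_DETAIL so that element order within an ORDER does not matter. This is only a sketch under assumptions: the sample XML and the ITEM child element are made up for illustration, and the function name `flatten_orders` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample: ORD_NUM deliberately comes AFTER ORD_DETAIL_set
SAMPLE_XML = """
<ORDERS>
  <ORDER>
    <ORD_DETAIL_set>
      <ORD_DETAIL><ITEM>A</ITEM></ORD_DETAIL>
      <ORD_DETAIL><ITEM>B</ITEM></ORD_DETAIL>
    </ORD_DETAIL_set>
    <ORD_NUM>1001</ORD_NUM>
  </ORDER>
  <ORDER>
    <ORD_DETAIL_set>
      <ORD_DETAIL><ITEM>C</ITEM></ORD_DETAIL>
    </ORD_DETAIL_set>
    <ORD_NUM>1002</ORD_NUM>
  </ORDER>
</ORDERS>
"""


def flatten_orders(xml_text):
    """Return one dict per ORD_DETAIL, tagged with its parent ORD_NUM."""
    root = ET.fromstring(xml_text.strip())  # root is <ORDERS>
    rows = []
    for order in root.findall("ORDER"):
        # Read the context value from the enclosing ORDER itself
        ord_num = order.findtext("ORD_NUM")
        for detail in order.findall("ORD_DETAIL_set/ORD_DETAIL"):
            row = {"ORD_NUM": ord_num}
            row.update({child.tag: child.text for child in detail})
            rows.append(row)
    return rows


for row in flatten_orders(SAMPLE_XML):
    print(row)
```

Because each detail row looks up ORD_NUM through its own parent ORDER, the value cannot lag by one row regardless of where ORD_NUM appears in the document.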
Is it possible to create a custom error message for a check? I've created some checks for important datasets but find the default error message of "Checks on the output produced 1 error" a bit lacking, as it gives no information on what's wrong to users who are not as versed in Dataiku or programming in…
When attempting to set up a connection with Chroma, I get an error about needing sqlite3 >= 3.35.0. The code is: vectorstore = Chroma(collection_name="full_documents", embedding_function=OpenAIEmbeddings()) and I get the following error: RuntimeError: Your system has an unsupported version of sqlite3. Chroma requires sqlite3 >=…
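A quick way to confirm the diagnosis is to check which SQLite library the Python environment is actually linked against. The sketch below uses only the standard library; the 3.35.0 threshold is taken from the question above, and the helper names are hypothetical.

```python
import sqlite3

# Minimum version quoted in the question above
REQUIRED = (3, 35, 0)


def parse_version(text):
    """Turn a dotted version string such as '3.34.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def sqlite_is_recent_enough(version_text=sqlite3.sqlite_version, required=REQUIRED):
    """Compare a SQLite version string against the required minimum."""
    return parse_version(version_text) >= required


# sqlite3.sqlite_version is the version of the SQLite library in use,
# not the version of the Python sqlite3 module
print("SQLite library version:", sqlite3.sqlite_version)
print("Meets requirement:", sqlite_is_recent_enough())
```

If this prints a version below 3.35.0, the problem lies in the code environment or base image rather than in the Chroma call itself.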
Hi Dataiku Team, I am trying to create a project-level API key that can be used to execute scenarios in that single project from an external application. I created the API ID and secret in the project security settings on the Design (or Dev) instance. This secret works and the external application is able to run the scenario on…
Hi, I initially trained a model on a partitioned dataset and deployed the partitioned model (partitioned on the CITY column), with model ID = 'XYZ'. The requirement is to retrain the partitioned model for all partitions (the number of partitions changes every time the dataset is reloaded) and activate the latest…
Hi, I have built and deployed a linear regression model on partitioned data, so there is a regression model for each partition. I want to get the regression coefficients of each of those models with Python code or a recipe (basically to automate this; I do not want to download the coefficients manually). Does anyone have any idea…