-
How can I change the default location where a .conf file is created to any custom location?
Hello Community, I am using the Great Expectations package in a DSS project for my data quality checks. I have already installed it in my code env and am using it in a Python script for the time being. Although the package runs properly in a Python notebook, when I save it back to the recipe I get the following error: " Job…
-
RAG Webapp
Hello everyone, In my Dataiku Flow, I have a RAG setup that includes embeddings and prompts. I’d like to replicate this process—achieving the same results as in Prompt Studio—in a Dash web app. The goal is to reuse the knowledge base built in the Flow and leverage the augmented LLM created by the embedding recipe. Does…
-
Getting an error while importing a Dataiku project
While migrating from dev to val, I am getting the error below. Can you please tell me how I can fix it? Importing archive... Traceback (most recent call last): File "/app/dss_install/dataiku-dss-13.2.2/python/dataikuapi/dssclient.py", line 1490, in _perform_http response.raise_for_status() File…
-
MS SQL SERVER CONNECTION
Hello everyone, I created a connection to my Azure SQL DB using the MS SQL Server connector. The connection went well, but when I clicked on "Get table list", I got the following error message: Oops: an unexpected error occurred The connection is closed. Please see our options for getting help HTTP code: 500, type:…
-
Seeking Optimization Tips for DSS Flow and Spark Configuration
Hello everyone, I am currently working on optimizing my DSS flow. I have a scenario that currently takes 20 minutes to execute, and I am looking to reduce this time to just 5 minutes. I would greatly appreciate any tips or strategies for optimization. Additionally, I am interested in understanding how to configure Spark…
-
Spark configuration for optimal resource allocation
Hello, I am interested in understanding how to configure Spark settings to ensure optimal resource allocation. Specifically, I am looking for guidance on configuring parameters like spark.driver.cores, spark.dynamicAllocation.initialExecutors, spark.executor.cores, spark.dynamicAllocation.enabled, spark.executor.instances,…
-
How to dynamically output the most recent 3 days of data with a partitioned dataset
Hi all, I am new to the Dataiku world and would like to ask what the right way is to output data for a specific time range using partitioning. What I want to do is: dynamically build the most recent 3 days of data from the input datasets (using a Time Range day partition). I've tested this, but the output seems to grow even more than I expect, and the…
-
Delete records based on multiple JOINs
Newbie here. I am trying to convert a SQL query from Hive that pulls records partly based on several JOIN conditions but limits those records based on other JOIN conditions. In SQL it is a "WHERE NOT EXISTS" condition. The following is the code - SELECT x FROM y, z Multiple left joins… (and) WHERE NOT EXISTS ( SELECT 1 FROM…
-
Prevent a Python recipe from changing data types
Hello everyone, I would like to prevent Python from inferring the data types of my dataframe in a Python recipe. For example, I would like an id column to remain a string rather than have Dataiku convert it to float. I could, for example, convert each of my columns manually, but this is tedious for datasets with…
-
How to leverage a locally hosted LLM using Python scripts
Dataiku Version: 13.3.1 I have several LLMs in my DSS cache from the HuggingFace connection. I can leverage these models using the prompt recipe. However, I am struggling to use them in custom Python scripts. For example, if I get the LLM ID of all the models I have a connection to using the following script, and…