Using Dataiku

  • Hello Everyone, I created a connection with my Azure SQL DB using the MS SQL Server connector, the connection went well, but when I clicked on get table list, I got the following error message: Oops: …
    Answered
    Started by samuel_acr_96
    Most recent by Alexandru
    Last answer by Alexandru

    Hi @samuel_acr_96,

    If the issue still persists, can you please open a support ticket along with instance diagnostics taken immediately after your tests? The logs may provide more information on the exact exception.

    Thanks

    Alexandru

  • I have a dataset which has Jan, Jan_1, Feb, Feb_1... I want to use Column index to pick the last column. Can you help using Column index without Python?
    Answered
    Started by Poornima
    Most recent by Yasmine_T
    Last answer by Yasmine_T

    Hi again!

    This use case would be better handled by our team in a support ticket, if you'd like to create one and follow up there:

    https://support.dataiku.com/support/tickets/new

    We would be happy to provide support and help you with your use case :)

    Best,

    Yasmine


  • I found this article but I have some questions, hope someone can help me. I created a standard webapp with a Python server, and I'm trying to access the endpoints from Postman. I'm sending my project apikey …
    Answered
    Started by pPGrillo
    Most recent by Turribeach
    Last answer by Turribeach

    Well, the post you linked says you should pass the key as an HTTP header ("X-DKU-APIKey"), not using basic authentication. So change your Postman request to pass the key via that header. And the API Designer is something completely different from Webapps, so it's not relevant to your issue.
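
    For illustration, a minimal Python sketch of such a request (the URL, backend path and key below are placeholders, not values from this thread):

    import requests

    DSS_URL = "https://dss.example.com"                            # hypothetical instance URL
    ENDPOINT = "/web-apps-backends/PROJECT/webAppId/my-endpoint"   # hypothetical backend path
    API_KEY = "your-project-api-key"                               # hypothetical key

    # Pass the key as the X-DKU-APIKey header, not via basic auth
    response = requests.get(DSS_URL + ENDPOINT, headers={"X-DKU-APIKey": API_KEY})
    response.raise_for_status()
    print(response.json())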

  • hello everyone, I am currently working on optimizing my DSS flow. I have a scenario that currently takes 20 minutes to execute, and I am looking to reduce this time to just 5 minutes. I would greatly …
    Answered
    Started by HAFEDH
    Most recent by Turribeach
    Last answer by Turribeach

    So the first thing you need to realise is that while Dataiku allows you to build a complex data pipeline in a visual way without writing any code, this is never going to be the most optimal way of loading/preparing large datasets as fast as possible. The fact that DSS persists all the intermediate datasets is both a big advantage (explainability, debugging, etc.) and a big disadvantage too (lots of redundant data, lots of reads and writes). Depending on the recipes and connections that you use, you may be able to enable SQL pipelines in part of your flow, which should make those recipes run faster.

    You should change the flow view to Recipe engines. Any recipe showing as DSS engine should be reviewed, because this means the data will have to be moved to the DSS server for processing and back to the database for writing the output. This tends to be slower than the SQL engine, where execution happens entirely in the database without data moving to Dataiku.

    Finally, you should review your SQL database and make sure it's sized and tuned accordingly. Once you get into millions of rows, traditional RDBMSs start to struggle, so moving to technologies that can handle billions of rows at speed will help (like Databricks, Snowflake, BigQuery, etc.).
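
    If it helps, a sketch of enabling that per dataset through the Python API client. Note the flowOptions/virtualizable raw keys are an assumption about where the "Virtualizable in build" checkbox is stored, so verify them against your DSS version:

    import dataikuapi

    # Hypothetical host, key and dataset names
    client = dataikuapi.DSSClient("https://dss.example.com", "admin-api-key")
    project = client.get_project("MY_PROJECT")

    # Assumption: the "Virtualizable in build" flag that SQL pipelines rely on
    # is exposed in the raw settings under flowOptions.virtualizable
    for name in ["intermediate_ds_1", "intermediate_ds_2"]:
        settings = project.get_dataset(name).get_settings()
        settings.get_raw()["flowOptions"]["virtualizable"] = True
        settings.save()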

  • Hello, I am interested in understanding how to configure Spark settings to ensure optimal resource allocation. Specifically, I am looking for guidance on configuring parameters like spark.driver.cores…
    Answered
    Started by HAFEDH
    Most recent by Grixis
    Last answer by Grixis

    Hi HAFEDH,

    If you are working in a large enterprise, I suppose your Dataiku instance is managed by an IT or dedicated infrastructure service? In that case, it's not your responsibility to change these settings if you have not been informed of their effects. And if changing the params has an impact on performance, that means you can affect the availability of the Spark queue. It's up to you to decide whether you want to take the risk, given that you say demand for Spark jobs in your enterprise is significant and that you have no prior knowledge of Spark sessions.

    Nonetheless, technically you can just create a PySpark recipe to script whatever you want to benchmark and try tuning a batch of different configs empirically. But the result will depend on your task's resource needs, as each type of job you want to optimize performs differently depending on caching, shuffle, memory and parallelization.
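
    For what it's worth, a standalone PySpark sketch of that empirical benchmarking (the config values are made up; in DSS you would normally select a named Spark configuration on the recipe's Advanced tab, and some settings such as driver cores/memory only take effect when the session is launched):

    import time
    from pyspark.sql import SparkSession

    # Hypothetical parameter grid to benchmark; adjust to your cluster
    configs = [
        {"spark.executor.memory": "4g", "spark.sql.shuffle.partitions": "64"},
        {"spark.executor.memory": "8g", "spark.sql.shuffle.partitions": "200"},
    ]

    for conf in configs:
        builder = SparkSession.builder.appName("tuning-benchmark")
        for key, value in conf.items():
            builder = builder.config(key, value)
        spark = builder.getOrCreate()

        # Stand-in workload: replace with the job you actually want to optimize
        start = time.time()
        df = spark.range(10_000_000).selectExpr("id", "id % 100 AS bucket")
        df.groupBy("bucket").count().collect()
        print(conf, "->", round(time.time() - start, 1), "s")

        spark.stop()  # fresh session so the next config is applied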

  • Is there any tool available to calculate pairwise distance? I have 2 different geo points available in a dataframe.
    Answered
    Started by Deep
    Most recent by Yasmine_T
    Last answer by Yasmine_T

    Hi,

    I hope that you are doing well.

    From my understanding, you have two columns (col_1 and col_2), both with geopoint data, and you would like to calculate the distance between these two points as output in a third column (col_3).

    If that is the case, we do have a processor called Geo distance that computes the geodesic distance between a geospatial column and another geospatial object. The computation outputs the distance, in a chosen unit (kilometers, miles), in another column.

    In the case of two geometries, the distance is the shortest distance between them.

    (see: https://doc.dataiku.com/dss/latest/preparation/processors/geo-distance.html).

    We have a step by step tutorial available here: https://knowledge.dataiku.com/latest/data-preparation/prepare-recipe/tutorial-geo-processing.html#compute-the-distance-between-two-geopoints

    If this does not cover your use case/your request is different, please let me know!

    Best,

    Yasmine
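
    If a code-based route is ever preferable, a minimal pandas + geopy sketch of the same computation (column names follow the example above; the coordinates are made up):

    import pandas as pd
    from geopy.distance import geodesic

    # Hypothetical frame: two geopoint columns held as (lat, lon) tuples
    df = pd.DataFrame({
        "col_1": [(48.8566, 2.3522), (40.7128, -74.0060)],    # Paris, New York
        "col_2": [(51.5074, -0.1278), (34.0522, -118.2437)],  # London, Los Angeles
    })

    # Row-wise geodesic distance, written to a third column in kilometers
    df["col_3"] = df.apply(lambda r: geodesic(r["col_1"], r["col_2"]).km, axis=1)
    print(df)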

  • I have an Excel input file. Col A to Col T till row T26, I have data where Col T has the latest month's data. Col V to Col AO have a second set of data till AO50. Now, it's dynamic data: every month, a new co…
    Answered
    Started by Poornima
    Most recent by Turribeach
    Last answer by Turribeach

    No. This has nothing to do with Regex. File-based Dataiku datasets use fixed data type schemas. If the file changes, you have to manually update the schema. Only Python recipes can write dynamic schemas as their output. And even doing so will complicate your flow, so your best option is to pivot the data so that months are rows, not columns.
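
    A sketch of that pivot in a Python recipe using pandas (the input frame below is invented for illustration):

    import pandas as pd

    # Hypothetical wide frame: one column per month, new columns appear over time
    df = pd.DataFrame({
        "account": ["A", "B"],
        "Jan": [100, 200],
        "Feb": [110, 190],
        "Mar": [120, 205],
    })

    # Pivot months from columns to rows: the output schema stays fixed at
    # (account, month, value) no matter how many month columns arrive
    long_df = df.melt(id_vars=["account"], var_name="month", value_name="value")
    print(long_df)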

  • Newbie here. Trying to convert SQL from Hive that pulls records partly based on several JOIN conditions but limits those records based on other JOIN conditions. In SQL it is a "WHERE NOT EXISTS" cond…
    Answered
    Started by Dbase3tate
    Most recent by Turribeach
    Last answer by Turribeach

    Use a SQL recipe and you can copy / paste your SQL.

  • Hello everyone, I would like to prevent Python from inferring the data types of my dataframe during a Python recipe. For example, I would like an id column to remain in string type rather than dataiku…
    Answered
    Started by Natpap
    Most recent by Turribeach
    Last answer by Turribeach
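
    One common approach, sketched here under the assumption of a standard Python recipe (not necessarily what was suggested in this thread): dataiku.Dataset.get_dataframe() accepts infer_with_pandas=False, which makes dtypes follow the dataset schema instead of pandas inference:

    import dataiku

    # With infer_with_pandas=False, column types follow the dataset schema,
    # so a string-typed id column stays a string instead of becoming numeric
    ds = dataiku.Dataset("my_input")   # hypothetical dataset name
    df = ds.get_dataframe(infer_with_pandas=False)

    print(df.dtypes)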
  • Hello everyone! I have a dataset with empty values in one of the columns (col1) and I use a group by recipe on another column (col2) without empty values, with col1_distinct as aggregation. I get a v…
    Answered ✓
    Started by Max334
    Most recent by LucOBrien
    Solution by Max334

    OK, I found the solution: when I run

    if(isNonBlank(col1) && col1 > col3, '', col1)

    that works because the '' value isn't the default value.
