Using Dataiku
- I have a problem when running a Python recipe that uses the Kubernetes engine. I checked the log step by step; the Python process should have completed, but for some reason the kuberne…Last answer by Turribeach
This is an old version which has had many fixes released already. I suggest you test on a v13 environment to see if you can reproduce the issue.
- SELECT DISTINCT N1.COLUMN1 as "Column 1", max(N1.COLUMN2) as "Column 2" FROM DB GROUP BY N1.COLUMN1 ORDER BY N1.COLUMN1 Hello, I'm trying to reproduce an SQL script with Dataiku recipes, and in that s…Last answer by DarinB
Hi @cretois,
I believe @Turribeach is correct in that what you're asking would result in invalid SQL.
You can accomplish what you want by using the Group recipe followed by a Join recipe.
1. Use Group to get your aggregations (from your pseudo code, group by "Column1" and calculate max of "Column2")
2. Use Join to combine your aggregations with your non-aggregated columns from the original dataset (join on "Column1" and include "Column2_max" & "Column3"). A pandas sketch of the same pattern follows.
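For illustration only, here is a minimal pandas sketch of the Group-then-Join pattern described above; the dataset name is a placeholder, and the column names mirror the pseudo-code in the question.

import dataiku

# Read the input dataset (name is a placeholder)
df = dataiku.Dataset("my_input_dataset").get_dataframe()

# Step 1 (Group recipe): max of Column2 per Column1
agg = df.groupby("Column1", as_index=False).agg(Column2_max=("Column2", "max"))

# Step 2 (Join recipe): left-join the aggregate back onto the non-aggregated columns
result = df[["Column1", "Column2", "Column3"]].merge(agg, on="Column1", how="left")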
- I've developed a custom Python API endpoint for regression and successfully predicted outcomes for individual records. However, when I attempt to process a batch of records, I encounter the following …Last answer by jpham3
@dchoudhari were you able to get this to work?
I am attempting what @Alexandru suggested in my test queries for my custom Python endpoint and getting an error:
java.lang.UnsupportedOperationException: JsonObject
This is my test query using valid json:
{
  "features": {
    "items": [
      {
        "name": "name1"
      }
    ]
  }
}
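For reference, a minimal sketch of what the endpoint's Python function might look like for this payload, assuming (as in Dataiku's Python function endpoints) that the query's top-level JSON keys map to keyword arguments; the scoring logic is a placeholder.

def api_py_function(features):
    # 'features' arrives as the parsed JSON object from the test query above;
    # iterate over the batch rather than assuming a single record
    results = []
    for item in features.get("items", []):
        # placeholder: call your regression model here for each record
        results.append({"name": item.get("name"), "prediction": None})
    return {"results": results}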
- I'm using visual recipes to create a simple RAG system. Can I create an API from my RAG setup? Operating system used: Linux. Last answer by DarinB
Hi @lji,
There may be a couple of ways to accomplish what you want.
One way, and maybe the easiest, is to use the augmented LLM that you've already created in your simple RAG setup. Here's a tutorial called Perform Completion Queries on LLMs. This example shows you how to find and use your augmented LLM programmatically with Python.
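As a minimal sketch of that programmatic route (the LLM id below is a placeholder; you can list the ids available in your project with list_llms()):

import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# Placeholder id; your retrieval-augmented LLM appears in project.list_llms()
llm = project.get_llm("YOUR_AUGMENTED_LLM_ID")

completion = llm.new_completion()
completion.with_message("What does the knowledge bank say about X?")
resp = completion.execute()
if resp.success:
    print(resp.text)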
Another way to go about this is to use the Dataiku Answers API. Dataiku Answers is a chat UI built by Dataiku for scalable conversational applications. This approach requires a few more steps than the previous one:
1. Set up your RAG (which you've already done).
2. Install Dataiku Answers (I believe you need Dataiku 12.5 or later) via the plugin store.
3. Configure your Dataiku Answers chat application (documentation).
4. Enable and use the Dataiku Answers API (documentation).
5. Create and expose a Python function that sends queries to your chat application as an API service (documentation); see the sketch below.
Using Dataiku Answers and the Dataiku Answers API offers a couple of advantages: you get a webapp without having to write any code, and it includes a conversation history dataset (which you'd otherwise have to build yourself).
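A rough sketch of step 5, with heavy caveats: the URL, route, and payload shape below are placeholders, and the real contract is defined in the Dataiku Answers API documentation linked above.

import requests

# Placeholders; take the actual base URL, route and schema from the Answers API docs
ANSWERS_URL = "https://DSS_HOST/web-apps-backends/PROJECT_KEY/WEBAPP_ID/api/ask"
API_KEY = "YOUR_API_KEY"

def ask_answers(question):
    # Forward a user question to the Dataiku Answers chat application
    resp = requests.post(
        ANSWERS_URL,
        headers={"Authorization": "Bearer " + API_KEY},
        json={"question": question},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()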
- We are experiencing long execution times for a recipe in Dataiku due to handling large datasets. While we have implemented partitioning using a filter on a specific column, it still takes 1.5-2 hours t…Last answer by Turribeach
What type of connection is your dataset using?
I suggest you move away from partitioning data in Dataiku. Partitioning in Dataiku does not improve performance the way it does in other technologies; it merely avoids having to compute the whole dataset. That may reduce compute time, but it introduces a whole set of limitations and issues. If your only reason for partitioning in Dataiku is performance, you should move away from it and look at data technologies that can handle your datasets while matching your performance requirements. A few examples are Databricks, BigQuery or Snowflake.
- Hello Dataiku friends, I would like to enforce better documentation standards across our DSS projects. One way I would like to do this is by having a "template" project description which each author com…Solution by Sergey
Hi @ben_p
Have you had a chance to check out project creation macros:
https://doc.dataiku.com/dss/latest/plugins/reference/project-creation-macros.html
So when you're finished, your macro should show up in the dropdown for new projects.
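For illustration, a minimal sketch of the Python runnable such a macro could use to pre-fill a template project description. The Runnable interface follows the plugin docs linked above; the template text, the config keys, and the assumption that run() creates the project itself are all placeholders to adapt.

import dataiku
from dataiku.runnables import Runnable

# Assumption: the documentation template each author is asked to complete
TEMPLATE_DESCRIPTION = """Purpose: (describe what this project does)
Inputs: (list source datasets)
Outputs: (list produced datasets)
Owner: (name and team)"""

class TemplateProjectMacro(Runnable):
    def __init__(self, project_key, config, plugin_config):
        self.config = config

    def get_progress_target(self):
        return None

    def run(self, progress_callback):
        client = dataiku.api_client()
        # Assumption: the target key comes from the macro's config (wired in its JSON descriptor)
        key = self.config["targetProjectKey"]
        owner = client.get_auth_info()["authIdentifier"]
        # Create the project with the template description already filled in
        client.create_project(key, self.config.get("targetProjectName", key), owner,
                              description=TEMPLATE_DESCRIPTION)
        return key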
- Hello everyone, I am trying to parse XML values within a column of a Dataiku dataset using a visual recipe (preparation recipe) in Dataiku. For JSON values, I can use processors like "Unnest Object (F…Last answer by KYOUNGJIN
Hello younhyun,
I tried the process again using the XML file you provided.
I'm sharing my progress this time, along with an additional screenshot, since the feature shown in your example doesn't seem to be enabled on my side. Could you please let me know which Dataiku version you're currently using?
Thank you!
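If that processor turns out not to be available in your version, one fallback is a Python recipe. A minimal sketch with xml.etree.ElementTree, assuming a column named "xml_col" and placeholder dataset names:

import dataiku
import xml.etree.ElementTree as ET

df = dataiku.Dataset("input_dataset").get_dataframe()  # placeholder name

def extract_tag(xml_string, tag="value"):
    # Parse one XML document and return the text of the first matching tag
    try:
        root = ET.fromstring(xml_string)
        node = root.find(".//" + tag)
        return node.text if node is not None else None
    except (ET.ParseError, TypeError):
        return None

df["value"] = df["xml_col"].apply(extract_tag)
dataiku.Dataset("output_dataset").write_with_schema(df)  # placeholder name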
- Hello, I am currently using an Other SQL Database connection in Dataiku and utilizing Visual Recipes to save data to a table. In some cases, such as when uploading files or handling simple datasets, t…Last answer by KYOUNGJIN
Hello Turribeach,
I initially suspected the issue was related to using an "Other database" connection, but it seems we need to test this again with an officially supported SQL connection.
Thank you for reviewing this issue.
- Hello everyone, I would like to store a column of type geometry or geopoint in my HDFS dataset with the aim of later performing a geojoin recipe between a geometry column containing polygons and a geo…
- Hi, I have a project that was working fine, but since I took this project to the automation env some of my recipes in the dev env and automation env are returning empty. It's important to know that the dev and aut…Last answer by shira
I'm getting an error that I'm not sure affects the process:
"error- dip.hiveserver2.log.sniffer- failed to get the log from the statement java.lang.reflect.InvocationTargetException…
couldn't find log associated with operation handle"
The oddest thing is that if I clone the recipe and create it on a new dataset, I get results.