-
RAG LLM for multiple datasets
Greetings, While working with the embedding recipe, we ran into a limitation: we have two datasets that we want to apply RAG to. How can we apply the knowledge bank to them specifically? Regards
-
Looking to replicate a SUM(COUNTIF) formula in Dataiku
I am working on a scorecard in Dataiku and I would like to calculate the percentage of completion across a set number of columns. Basically, I would like to replicate this Excel formula: =SUM(COUNTIF(ColumnX:ColumnXX,"*")/Total Number of Columns), and I am having issues. The columns are a mix of strings, integers, and text,…
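A minimal sketch of the per-row calculation described above, in plain Python, for readers who want to see the arithmetic spelled out. It assumes "complete" means the cell is not empty; note that Excel's COUNTIF with "*" counts only text cells, so this is a looser rule than the formula. The names is_filled, completion_ratio, and the sample columns are hypothetical and not part of Dataiku.

```python
import math


def is_filled(value):
    """Return True when a cell holds a usable value (not None, NaN, or blank text)."""
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False
    # Assumption: whitespace-only text counts as empty.
    return str(value).strip() != ""


def completion_ratio(row, columns):
    """Fraction of the listed columns that are filled in for one row (a dict)."""
    if not columns:
        return 0.0
    filled = sum(1 for name in columns if is_filled(row.get(name)))
    return filled / len(columns)


# Hypothetical row: two of the four columns are filled in.
row = {"name": "Acme", "score": 7, "notes": "", "owner": None}
print(completion_ratio(row, ["name", "score", "notes", "owner"]))  # 0.5
```

The same logic could presumably be applied row by row in a Python recipe; multiply the result by 100 for a percentage.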
-
How to append a dataframe to an existing output dataset
Hello experts, In Dataiku v12.3.0, I was trying to append a dataframe to an existing dataset (with the same schema) using write_dataframe(), but it always overwrites the dataset with the last dataframe, even though the dataset spec is configured like this: dataset.spec_item["appendMode"] = True. The dataset is classified as an output, so it doesn't let me…
-
How can I replace a dataset created from a csv?
I have uploaded a CSV and stored it in the filesystem_folders. I have built several recipes from this dataset. I have now received an updated version of the CSV, but cannot figure out how to upload it and overwrite the original dataset. It seems to require that I create a new dataset. If I do create a new dataset, there doesn't…
-
How to implement a feedback loop on a dataset?
Hello, Each month, I have to compute a dataset that takes the previous month's dataset (M-1) and adds some data to it. I wonder how I could do it in Dataiku, since the recipe should take the last output dataset (M-1) as its input. I don't think it is currently possible to produce a feedback loop in Dataiku: do you…
-
How do I create a categorisation model for a reviews dataset
Hi there - new to Dataiku. Let's say I have an Excel sheet with 2 columns, where one has app reviews and the other has the dates they were posted. Is there a video tutorial or example anywhere showing how I can create a model to categorise the app reviews into categories (e.g. UX/UI problem or customer service problem) as well as include…
-
Window recipe not producing expected results when using DSS engine
Hi there, The issue I am having is that the DSS engine produces a completely different result than the SQL engine. Has anyone faced a similar issue? I would appreciate some insight on this. Basically, all I want to do is produce a column with the MAX() value inferred from another column. No partitions, no…
-
How to output to / update my Snowflake table using Dataiku
I have a Snowflake table, I've set up the connection, and everything looks good. Dataiku requires me to create a dataset from that Snowflake table, which I can then use as my input / output. The issue is that I have that dataset as my output, and when I run my flow I can see my results, but it isn't actually outputting to my…
-
How to correctly do time conversions
I have a column that has been parsed and is in UTC. When I try to format the date in Eastern / New York time, I get a new column that is -5 hours, but isn't the current difference -4 hours? I'm sure this has something to do with daylight saving time vs. standard time, but I just want to ensure that my…
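To illustrate the -5 vs. -4 hour question above, here is a small sketch using Python's standard zoneinfo module rather than any Dataiku processor. It shows that a named zone such as America/New_York yields a different UTC offset depending on the date being converted, not on today's date. The helper name utc_to_new_york and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # on Windows this may need the tzdata package

NEW_YORK = ZoneInfo("America/New_York")


def utc_to_new_york(dt_utc):
    """Convert a UTC datetime to New York local time, honouring DST rules."""
    if dt_utc.tzinfo is None:
        # Assumption: naive values are already in UTC.
        dt_utc = dt_utc.replace(tzinfo=timezone.utc)
    return dt_utc.astimezone(NEW_YORK)


# Same UTC clock time on a winter date and a summer date.
winter = utc_to_new_york(datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc))
summer = utc_to_new_york(datetime(2024, 7, 15, 12, 0, tzinfo=timezone.utc))

print(winter.isoformat())  # 2024-01-15T07:00:00-05:00
print(summer.isoformat())  # 2024-07-15T08:00:00-04:00
```

If the values in the column fall in winter months, a -5 hour shift would be the expected result even when the conversion is run during the summer.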
-
How to choose between updating and appending an output table
I have a connection / output recipe that outputs to a table in Snowflake. I don't see any option here to choose between overwriting and appending data. Where can I set that up? Does the option pop up after I press run, or is that something I can set up beforehand? Operating system used: Windows