Hi, I am new to Dataiku and am trying to find areas of overlap between two datasets using fuzzy matching. Is there a way to get a numerical score for how close each match is, so I can identify the best matches and remove duplicate suggestions if needed? Thanks,
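To illustrate what a numerical closeness score could look like, here is a minimal sketch in plain Python using the standard library's difflib. It is not Dataiku's fuzzy join implementation; the function names `similarity` and `best_matches` and the 0.8 threshold are assumptions made up for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; 1.0 means identical after trimming and lowercasing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_matches(left, right, threshold=0.8):
    """For each value in `left`, keep only its single highest-scoring match in `right`.

    `threshold` is an illustrative cut-off, not a recommended value.
    """
    results = []
    for a in left:
        scored = [(similarity(a, b), b) for b in right]
        if not scored:
            continue
        score, b = max(scored)  # highest score wins; ties fall back to string order
        if score >= threshold:
            results.append((a, b, round(score, 3)))
    # Strongest matches first, so the ranking is easy to review
    return sorted(results, key=lambda r: r[2], reverse=True)
```

Keeping only the top-scoring candidate per left-hand value is one simple way to drop duplicate suggestions; sorting by score gives the ranking.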
Hello, I have an issue with my Dataiku project. I wrote a Python script that appends new data from the input dataset to the output dataset. I think the problem may be related to recursion in Dataiku. Could you please suggest a solution? Thank you in advance!
I'm working on a project that requires me to send an R Markdown report to a Box folder via email daily. However, the emailed file has the same name each day, leading to overwrites and versioning issues (e.g., v1, v2, v3). I'd like to automate this process by adding the current date to the filename of the emailed report.…
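The idea of stamping the current date into a filename can be sketched as follows. This is a generic Python illustration, not the R Markdown or Dataiku reporting mechanism itself; the helper name `dated_filename` and the `name_YYYY-MM-DD.ext` pattern are assumptions chosen for the example.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def dated_filename(filename: str, day: Optional[date] = None) -> str:
    """Insert an ISO date between the file's base name and its extension."""
    day = day or date.today()  # default to today's date when none is given
    path = Path(filename)
    return f"{path.stem}_{day.isoformat()}{path.suffix}"


print(dated_filename("report.html", date(2025, 1, 25)))  # report_2025-01-25.html
```

Because each day produces a distinct name, the receiving folder no longer sees the same file overwritten.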
Hello experts, In Dataiku v12.3.0, I was trying to append a dataframe to an existing dataset (with the same schema) using write_dataframe(). But it always overwrites the dataset with the last dataframe, even though the dataset spec is configured like: dataset.spec_item["appendMode"] = True The dataset is classified as output so it doesn't let me…
I often send screenshots of the charts to my stakeholders over Teams chat to get quick confirmation/alignment on intermediate results. Feedback I often hear is that they have a hard time reading the legend, axis titles, axis values, etc. I don't want to manually change the font size for all these different…
I would like to make a custom Webapp where the Python backend talks to the frontend and vice versa. I see that the Dataiku Answers webapp uses websockets and I would like to do the same. My current attempts using Flask-SocketIO did not work unfortunately, as it seems to use Werkzeug under the hood and I cannot start the…
I trained and deployed a model using mlflow in Dataiku. I want to make predictions on a test dataset using this deployed model. However, I don't want to use the "predict" visual recipe. Instead, I want to load the model in a script and make predictions. But I am not able to do it. Operating system used: Linux Operating…
Hi - I am trying to create a scenario that will auto-trigger once other time-based scenarios (in other projects) have completed. I think this is possible for 1 scenario using "Trigger after scenario" which automatically checks the status of a scenario at the frequency you set but I can't figure out how to do this using…
I have added a current_date column to my table in Greenplum using a Prepare recipe (with now() in Formula language). I want to sync this column to an Oracle database, but I need to keep only the date part of the value. For example, I want to convert a value like 2025-01-25T21:50:28.102Z into 2025-01-25 and store it as a…
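As an illustration of the conversion being asked about, the following Python sketch parses the timestamp format shown above and keeps only the date part. It is not Dataiku Formula syntax; the function name `to_date_only` is hypothetical, and the sketch assumes the value always ends in `Z` with fractional seconds.

```python
from datetime import date, datetime


def to_date_only(timestamp: str) -> date:
    """Parse a value like 2025-01-25T21:50:28.102Z and drop the time part.

    No time zone conversion is applied: the result is the UTC calendar date.
    """
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").date()


print(to_date_only("2025-01-25T21:50:28.102Z").isoformat())  # 2025-01-25
```

If the target column should reflect a local calendar day rather than UTC, the timestamp would need to be shifted to that time zone before truncating.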
Hi, Is it possible, in a Split recipe (and only a Split recipe, not an SQL one), to use a formula to check a condition across joined rows? For example, to check that one of the child lines a of a parent b contains a certain value. If so, put all the corresponding lines a ->b* (of which at least one b satisfies the condition) in the split. Best…