Can we use a Kubernetes cluster with the free edition of Dataiku?
Hi, can we use a Kubernetes cluster with the free edition of Dataiku? Let's say we have a Linux VM in an AWS or Azure environment where we have deployed the free edition of DSS version 12 or later. Is it possible to use a Kubernetes cluster in this environment to reduce model training time? Thank you. Taka Operating system used:…
Changing sample size in data preview while a sample computation is loading
I've changed the sampling method on a big dataset and the "Waiting for other sample computation" prompt keeps loading indefinitely. Is there any way to change those settings before the current computation finishes? (The abort button doesn't help.) Operating system used: Windows 10
Project folders should be able to manage permissions for underlying projects
Hi everyone, I’d like to suggest an improvement for Dataiku's folder and project permission management. I find it strange that Dataiku doesn’t inherit folder permissions into project permissions. When project folders are set up for different teams of entities, it shouldn't just be a visual organisation on the…
External data catalog integration
Hi everyone, I'm looking for a way to integrate Dataiku with a standalone data catalog tool such as DataHub. This stems from the fact that some initial data loading and transformation happens inside the DWH through an orchestration tool like Airflow and a transformation tool like dbt. This creates initial datasets that are…
Using Scenarios to automatically retrain models
Hi, I'm new to Dataiku and the community and I'm using Dataiku online. Documentation indicates that scenarios can be used to "Automate the retraining of “saved models” on a regular basis, and only activate the new version if the performance is improved". This is exactly what I need to set up but I can't seem to find…
Change model version
Hello, I have trained and created a model in the flow. I have since trained new versions of the model, but I can't find a way to change the version used in the flow (the active model is always the first one trained). Thanks. Operating system used: Linux
Standalone Recipe Configuration
I am having difficulty configuring the Standalone Evaluate recipe. It does not provide performance drift however I configure it. Below is what I see. It was also not clear what a reference dataset is needed for, as it is optional and there is no mention of it here.
Removing Multiple Spaces from Data in all columns
One of the great features of DSS is that you can trim the leading and trailing spaces from all of the data in all of your columns with one visual recipe step. Very cool. I was then trying to figure out how to clean up all the remaining white space, such as runs of multiple spaces inside values. Any ideas? Operating system used: Ubuntu 18.04 - WSL2 - Windows 11
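One possible approach, sketched here in plain Python rather than as a confirmed DSS feature: collapse every run of whitespace into a single space with a regular expression and apply it to each string value in a row. The helper names `collapse_whitespace` and `clean_row` are hypothetical, and how the function is wired into a recipe is left out.

```python
import re

# Matches any run of one or more whitespace characters (spaces, tabs, newlines).
_WS_RUN = re.compile(r"\s+")


def collapse_whitespace(value):
    """Replace each whitespace run with one space and trim both ends.

    Non-string values (numbers, None, ...) are returned unchanged.
    """
    if not isinstance(value, str):
        return value
    return _WS_RUN.sub(" ", value).strip()


def clean_row(row):
    """Apply collapse_whitespace to every column of a row given as a dict."""
    return {col: collapse_whitespace(val) for col, val in row.items()}


if __name__ == "__main__":
    print(clean_row({"name": "  Ada    Lovelace ", "age": 36}))
```

Note that `\s+` also collapses tabs and line breaks; if only literal spaces should be merged, a pattern such as `" {2,}"` may be a better fit.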
Data validation compared to previous data
Hello, is there a way to check and validate data? I have webshop traffic data in my spreadsheet on a daily basis. These are divided into our different channels like SEA, Price Search Engines, SEO and so on. I'm looking for a way to check if there are major discrepancies in new data compared to previous ones. In this way I…
Executing a .exe in Dataiku DSS
I have a folder of data which can be converted to a desired format using a .exe application. The input data and the .exe are provided by a third party. How can I do this in DSS? Operating system used: Cloud
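A minimal sketch of one way this might look from Python code, assuming the machine that runs the code can actually execute the program (a Windows .exe will not run natively on a Linux host) and assuming a hypothetical command-line form `<exe> <input file> <output file>`, which the post does not specify. The function names, the `.out` suffix, and the folder paths are all illustrative.

```python
import subprocess
from pathlib import Path


def build_command(exe_path, input_file, output_dir):
    """Build the argument list for one conversion.

    Assumes a hypothetical CLI of the form: <exe> <input> <output>.
    """
    output_file = Path(output_dir) / (Path(input_file).stem + ".out")
    return [str(exe_path), str(input_file), str(output_file)]


def convert_folder(exe_path, input_dir, output_dir):
    """Run the converter once per file in input_dir.

    Returns a list of (file name, exit code) pairs so failures can be inspected.
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    results = []
    for input_file in sorted(Path(input_dir).iterdir()):
        if not input_file.is_file():
            continue
        cmd = build_command(exe_path, input_file, output_dir)
        # An argument list (no shell) avoids quoting issues with odd file names.
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append((input_file.name, proc.returncode))
    return results
```

Collecting exit codes instead of raising on the first error lets the whole folder be processed before deciding what to retry.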