-
Save MLlib pipeline model in PySpark recipe to HDFS managed folder without using local file system
I can't use a Dataiku Lab feature to train our model for various reasons, so I need to do it in a PySpark recipe (spark-submit). I am training an MLlib GBTRegressor. Once the pipeline model is trained, I would like to save it. I have no access to the local filesystem (our IT policies). I also don't have access to HDFS…
-
How to keep original data when deleting shared data
When I share a dataset using the "share to a flow zone" function and then delete the shared dataset, the original dataset is also deleted. I want to keep the original dataset when deleting the shared one. How can I do this?
-
Why does Dataiku consistently choose the wrong data type for Excel files I upload?
Should I be using Excel files? No. But that's the world I live in. Dataiku needs to do better with data types. It requires significant and repetitive manual intervention to make sure that it continues to apply the right data type to my Excel files. There is no good reason it has to be this bad. Particularly annoying is the…
-
How Can I Integrate Speakatoo’s Text-to-Speech API with Dataiku for Audio Data Insights?
I’m looking to enhance my Dataiku workflows by integrating Speakatoo’s text-to-speech (TTS) API to turn data insights or alerts into audio. Has anyone tried using a TTS service like Speakatoo within Dataiku for this purpose? I think it could help make data monitoring or reporting more accessible. What challenges should I…
-
Data behavior when importing data from a DB connected to Dataiku
When Snowflake and Dataiku are connected and data is imported, does Dataiku physically import the data? Thank you in advance.
-
Possible lazy loading bug
I'm running version 13.0.3. I've been mass importing some tables, and at one point you get to a screen where the names of the datasets are shown. I then want to change those one by one, navigating with the Tab key. I've noticed that the navigation goes wrong after a few tabs, and I have to scroll to where I was and place my…
-
Possible flow copy bug
I've been copying a few flows within a project, reusing them in a sense. When you copy to the same project, you get a screen where you can rename the datasets (or it will add the _1 suffix). I've noticed some recipes seem to be missing from these lists. I haven't tested extensively, but I've seen missing join recipes, filter…
-
GitHub Configuration within Dataiku
Hi Team, do you have any recommendations on how to clone our Git repo within Dataiku? I tried cloning within our project and had trouble retrieving that code. Is it necessary to have the GitHub plugin installed by default? Thanks!
-
Is there any limit to the number of projects within a subscription?
Is there any upper limit on the following within a subscription: the number of projects and the number of workspaces?
-
Dataiku flow documentation
The Flow documentation includes values for datasets (total size and record count). How do I add more dataset metrics, such as Min, Max, Std Dev, etc., to the documentation? I hope someone can share the specific steps. Samples would be much appreciated.