-
Dataiku to Greenplum: Performance Lag on Large Data Loads & Batch Read Control
Hello, during a Proof of Concept (PoC), we're experiencing performance degradation when loading 20 million rows with 500 columns into GPDB (Greenplum Database). The Dataiku logs show that it continuously reads data in batches of 2000 rows. We're looking for a setting to adjust this batch size. We've…
-
Run Python recipe with a scenario
Hi, I have a Python recipe that takes two datasets as inputs and produces a dataset as output. I want to run this recipe with a scenario, every day at a specific time. How can I do this? Thanks
-
Question about the install path of Dataiku
Because my laptop username contains a space (Zhao Guanghao), I cannot run Dataiku correctly. The folder name under the 'User' folder on the C drive can't be changed: if you rename it directly, the system no longer recognizes the account and crashes. Besides, Dataiku can only be installed in the path 'C…
-
Dataiku library MessageSender email CC
Hi all, I am trying to use MessageSender to send some emails in custom recipes. Is there a way to pass a CC email list to the send function? Please let me know if there are any suggestions or alternatives. from dataiku.core.message_sender import MessageSender s = MessageSender(channel_id='SMTP',…
-
Saving Vector Store as KB
I was wondering if there is any way to save a FAISS vector store that I create in a Python notebook as a knowledge bank I can use later on. I created a vector store (see code below) which has summaries as the embedded objects, and the parent documents as the retrieved documents. I did this based on LangChain's…
-
Using dates in Dataiku
Hi, despite going through the documentation multiple times, I still don't really understand how dates work in DSS. I'm importing a dataset from a connection. Without turning on any of the options in Date & Time handling, this is what the data looks like: It says that the data type is string, while in the database itself it is, in…
-
SOLVED. Cannot replicate GLM predictions
SOLVED. It was the offset: I needed to take its natural log before calibrating. Hello, I built a model using the GLM Classification plugin. The AUC is ~0.8, so it's fitting my data well, but when I implement the GLM formula manually in Tableau the predictions are far too low despite having the correct shape. The model…
-
How to turn on the chart zoom in/out feature in a dashboard
Hi community, when I plot a chart, there is a nice feature for zooming in and out by date (at the bottom of the chart), as shown below. However, when I publish the chart to a dashboard, this feature seems to be dropped by Dataiku (see chart below). Is there a way to turn on this timeline/date zoom in/out feature in a dashboard…
-
How to extract rows flagged by a custom Python rule in the Data Quality tab?
Hi everyone, I'm working with Dataiku DSS version 13.5, and I'm using the Data Quality tab on datasets to define validation rules. When I use standard rules (e.g., missing values, uniqueness, etc.), I can easily export the rows in error. However, when I define a custom Python rule, I can see the column status marked as…
-
How to define a helper in Python code in a Dataiku project
In my Dataiku project I have Python code, but I also need to declare additional Python code that will then be used like a library: from HelperLibrary.library1.codev1 import testprocessing. Is it possible? Operating system used: Windows