-
trigger based on python script
I am trying to trigger a scenario based on a Python script, but it is not working. It doesn't trigger at all. Can someone tell me what I am doing wrong? The script is: from dataiku.scenario import Trigger from datetime import date import pytz tz = pytz.timezone('Europe/Amsterdam') t = Trigger() now = datetime.datetime.now(tz…
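One likely cause, as a hedged guess from the visible part of the snippet: it imports date from datetime but then calls datetime.datetime.now(tz), which raises a NameError inside the trigger code, so the trigger never fires. A minimal sketch of a custom Python trigger that avoids that, with an assumed time-window condition purely for illustration:

    # Custom Python trigger sketch (the 08:00-09:00 window is an assumption)
    from dataiku.scenario import Trigger
    from datetime import datetime
    import pytz

    tz = pytz.timezone('Europe/Amsterdam')
    t = Trigger()

    now = datetime.now(tz)      # timezone-aware current time
    if 8 <= now.hour < 9:       # hypothetical condition for when to fire
        t.fire()                # tells DSS to start the scenario run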
-
Joins recipe on large datasets causing issue.
Hi @AlexT, I am using the Join recipe, joining two datasets, one of which has 7M records. It is caching that dataset in memory and running for a long time, and later I get an out-of-space issue. Kindly help me understand how I can resolve this. Regards, Ankur.
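If the join cannot be pushed down to a SQL or Spark engine and has to run in Python, one way to avoid holding all 7M rows in memory is to stream the large dataset in chunks and merge each chunk against the small side. A rough sketch, assuming hypothetical dataset names and a join key column called "key":

    import dataiku

    small_df = dataiku.Dataset("small_input").get_dataframe()   # small side fits in memory
    large_ds = dataiku.Dataset("large_input")                   # ~7M rows, streamed in chunks
    out_ds = dataiku.Dataset("joined_output")

    writer = None
    for chunk in large_ds.iter_dataframes(chunksize=200000):
        merged = chunk.merge(small_df, on="key", how="left")
        if writer is None:
            out_ds.write_schema_from_dataframe(merged)          # set output schema once
            writer = out_ds.get_writer()
        writer.write_dataframe(merged)                          # append this chunk
    if writer is not None:
        writer.close()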
-
Logging metrics
Hi everyone, I am running SQL recipes that create datasets in Dataiku. On occasion I need to troubleshoot the output of these, and I am interested in seeing metrics such as row count, trended over time. I could write another SQL recipe that looks at the output dataset and creates another, but I wondered if there was a way…
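Datasets in DSS carry built-in metrics (the record count among them); they can be computed from a "Compute metrics" scenario step and their history is kept, which gives the trend over time in the dataset's Status > Metrics tab. A hedged sketch of reading the last computed row count in Python, assuming a placeholder dataset name and that the standard record-count metric is enabled:

    import dataiku

    ds = dataiku.Dataset("my_output_dataset")        # placeholder name
    metrics = ds.get_last_metric_values()            # last computed metric values
    count_metric = metrics.get_metric_by_id("records:COUNT_RECORDS")
    print(count_metric)                              # structure containing the latest row count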
-
synchronize dataset schema from metastore
Hi Everyone, If I have an S3 dataset and it has a linked Glue metastore, I know we can sync the dataset definitions to the metastore. But I was wondering, is there a way to sync the dataset schema from the metastore as well?
-
getting 404 error while accessing the file from directory
Hi Folks! Thanks in advance. I have been using Dataiku for the last 4 months, for a standard webapp type project in Dataiku. I want to render a PDF file in the frontend from a local Dataiku directory. I have got the path and name of the file from the get_pdfs function below and am trying to display the PDF file in HTML with the #pdfpara id. When I…
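A 404 in a standard webapp usually means the frontend is requesting a URL the Flask backend does not serve, or a raw filesystem path the browser cannot reach. A minimal sketch of a backend route that streams a PDF out of a managed folder, assuming a hypothetical folder id "pdfs" and a "name" query parameter (this is not the poster's actual get_pdfs code):

    import io
    import dataiku
    from flask import request, send_file

    folder = dataiku.Folder("pdfs")            # hypothetical managed folder id

    # `app` is the Flask app object DSS provides to a standard webapp backend
    @app.route('/get_pdf')
    def get_pdf():
        name = request.args.get('name')
        with folder.get_download_stream(name) as stream:
            data = stream.read()
        return send_file(io.BytesIO(data), mimetype='application/pdf')

The frontend can then point an embed/object tag inside #pdfpara at getWebAppBackendUrl('/get_pdf') plus the file name, instead of a local filesystem path.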
-
Please help Prepare simple multiplication Recipe-Automation
Hi, I'm a complete beginner and am trying to do a simple automation after finishing the core design trainings. I have a raw file that comes in monthly in the format below. There are five products, and I need to multiply each product (the product value is in text format) by its corresponding month in that row. Please see my example…
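This can be done with a formula step in a Prepare recipe, or with a small Python recipe. A rough sketch of the Python variant, using placeholder dataset and column names ("product_value" stored as text, "month_value" numeric) since the example table is not reproduced here:

    import dataiku
    import pandas as pd

    df = dataiku.Dataset("monthly_raw").get_dataframe()

    # The product column arrives as text, so convert it to a number first
    df["product_value"] = pd.to_numeric(df["product_value"], errors="coerce")
    df["result"] = df["product_value"] * df["month_value"]

    dataiku.Dataset("monthly_prepared").write_with_schema(df)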
-
Reduce Default Sampling Nb. Records from 10000
Hi, I am just wondering, at a user level, am I able to reduce the number of records queried when sampling a dataset from the default 10k to a custom amount? Often I want to adjust the sampling parameters or filters, and waiting for 10k records to load before doing that is wasted time. Thanks!
-
save digraph from notebook
Hi, From this example:

    # Create Digraph object
    dot = Digraph()
    # Add nodes
    dot.node('1')
    dot.node('3')
    dot.node('2')
    dot.node('5')
    # Add edges
    dot.edges(['12', '13', '35'])
    # Visualize the graph
    dot

I would like to save this graph to a folder with an svg or png extension. Thanks
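graphviz can render the image itself, either straight to disk or in memory. A hedged sketch, with writing into a Dataiku managed folder (hypothetical id "graphs") shown as one possible destination:

    import io
    import dataiku
    from graphviz import Digraph

    dot = Digraph()
    for n in ['1', '3', '2', '5']:
        dot.node(n)
    dot.edges(['12', '13', '35'])

    # Option 1: render straight to the local filesystem (writes graph.png)
    dot.render('graph', format='png')

    # Option 2: render in memory and push the bytes into a managed folder
    svg_bytes = dot.pipe(format='svg')
    folder = dataiku.Folder("graphs")          # hypothetical folder id
    folder.upload_stream("graph.svg", io.BytesIO(svg_bytes))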
-
FutureAborter error, not listed in docs
Hi, We have a partitioned scenario that looks to be failing, we think due to memory:

    [2022/03/17-10:24:25.351] [ShortTaskExec-0] [INFO] [dku.future.aborter] - Executing abort on FutureAborter FutureAborter@1215237485,createdInThread=ActivityExecutor-56-56,forChildThread=false
    [2022/03/17-10:24:25.353] [ShortTaskExec-0]…
-
Most optimized way to keep the libraries loaded ?
Hi, I want to create a realtime API using Dataiku. The API will be called many times per day, but the libraries required for the processing take very long to load (about 20 seconds). Is there a way to keep the libraries permanently loaded in DSS, so that I only need to run my ML/DL jobs without having…
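On the API node, a Python-function endpoint keeps its process alive between calls, so anything done at module level (imports, model loading) runs once at startup and is reused by every request. A minimal sketch under that assumption; the model path and function name are placeholders:

    import pickle

    # Runs once when the endpoint process starts, then stays in memory
    with open('/path/to/model.pkl', 'rb') as f:     # hypothetical path
        model = pickle.load(f)

    def predict(features):
        # Called on every API request; the heavy objects above are already loaded
        return {'prediction': model.predict([features]).tolist()}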