Using Dataiku

Default project key is not specified (no DKU_CURRENT_PROJECT_KEY in env)
I'm creating a Python function endpoint with this script: And I don't know how to deal with this error: Dev server deployment FAILED Failed to initiate function server : <class 'Exception'> : Default project key is not specified (no DKU_CURRENT_PROJECT_KEY in env)
How to increase character length for a column in a table I am loading from Oracle to Snowflake
I keep getting the error below when I try to load data from Oracle to Snowflake using a Dataiku SQL query recipe. User character length limit (25) exceeded by string 'Gomez Hidalgo del Castillo' File 'snowflake_stage/snowflake_stage/tmp.FBe55Q5TgEnla6Jj/out-s0-c1.csv.gz', line 2939, character 1 Row 2939, column…
Using the API, what are my options for finding the last time a scenario ran in a project?
I have tried looping through the project's scenarios using get_last_finished_run(), but it looks as if that throws an exception if a scenario has never run, which can be the case. Since that loop is already within a try/except block, the coding gets tricky. Any help much appreciated here. Operating system used: AWS
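One way to keep the outer try/except clean is to isolate the call that can fail in a small helper. The sketch below is only an illustration: `last_finished_runs` is a hypothetical name, it assumes `project` is a project handle exposing `list_scenarios()` and `get_scenario()`, and because the exact exception raised for a never-run scenario is not confirmed here, it catches `Exception` broadly and records `None` instead.

```python
def last_finished_runs(project):
    """Map each scenario id to its last finished run, or None if it never ran.

    Assumption: `project` exposes list_scenarios() (items with an "id" key)
    and get_scenario(id); each scenario exposes get_last_finished_run().
    """
    results = {}
    for item in project.list_scenarios():
        scenario_id = item["id"]
        scenario = project.get_scenario(scenario_id)
        try:
            results[scenario_id] = scenario.get_last_finished_run()
        except Exception:
            # Assumed behaviour: a scenario with no finished run raises here,
            # so record None rather than aborting the whole loop.
            results[scenario_id] = None
    return results
```

Scenarios that map to `None` can then be skipped or reported separately, without touching the surrounding error handling.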
Simplest way to get the aggregate value from one dataset, and bring it in to another
I have dataset A and dataset B. I need the aggregate total from one column called "Total Commission" from B. I want to bring it into A and populate a single column with that value. I know I can do this in Python with two dataframes and I know I can do this with a join if I create a join key in the datasets. Is there a…
Data validation compared to previous data
Hello, is there a way to check and validate data? I have webshop traffic data in my spreadsheet on a daily basis. These are divided into our different channels like SEA, Price Search Engines, SEO and so on. I'm looking for a way to check if there are major discrepancies in new data compared to previous ones. In this way I…
Handling Project variables in Scenarios
Hi Team, I have created a scenario that consists of 5 steps. Step 1: SQL query to check the latest data. Step 2: set a project variable as the execution start time using the now() formula. Step 3: build the dataset. Step 4: run the ML model. Step 5: set a project variable as the execution end time using the now() formula. Here the question is step 2 and step 5…
Display user id on Scenario Last Runs screen
I am surprised this is missing from the GUI, but what surprises me more is that it's not even shown in the logs. The fact that one needs to query the API to get this data should be a good indication that both logging and the GUI need an overhaul. I have posted a Product Idea; feel free to vote for it. It also asks…
Data Architecture Diagram
Is there a data architecture diagram available for Dataiku that shows a complete project?
Dataset change trigger in a scenario
Hello, Does anyone have an idea why the "Dataset change" trigger on a scenario doesn't work for me? The dataset that I'm using is created out of files from Azure Blob Storage. Even though a new file arrives there every 30 minutes, my scenario is not firing. Thanks! TP Operating system used: Windows
Remove old Versioned Environments and Kernels after importing bundle
Hello, I noticed that when importing bundles to an automation node, environments are versioned and all old kernels are available for use in Jupyter notebooks, along with all the old environments. Now I understand that this versioning is done so that if projects/bundles which are using the same environment name can continue…

Trending Discussions

Logging in Dataiku notebook / recipe ...
Hello Team, I am working on PySpark recipes. I use a notebook to build the logic and then convert it back into a recipe. The Dataiku and Spark operations (e.g. df.count()) emit a lot of log statements to the console, which makes the notebook very difficult to use. Is there a way for me to suppress logging from the Dataiku and Spark APIs?…
Defining a global variable in the base name of the output file for a dataset
Hello, I am working on a flow that has a Python recipe that sets global variables. In the output dataset of the recipe, a couple of these variables are used to set the path and filename of the dataset, which is stored in Azure. From researching how to define the filename, it states to set the "Force single output…
I am seeing a strange error while accessing my dataflows
Flows were working previously, but now this error window prevents me from using any of my projects.
