-
How to learn Dataiku
Hi guys, I want to use Dataiku as an administrator, but I don't know where to start. Do you know where I can start? The documentation doesn't really work for me.
-
Running multiple SQL statements within a Python recipe to return a dataframe
Hi, here is some SQL that our internal Python recipe creates dynamically as it boots, to check whether the database, schema and table have proper permissions before performing a function. show grants on database MYDB;create or replace temporary table db_check as select EXISTS( select * from table(result_scan(-1)) where "privilege" =…
-
Wrong date format when reading Excel file
Hi, I have an issue when reading Excel files stored in a folder. I import one file per month into my folder, and I create a dataset to read all the files and obtain one stacked table as a result. For some of them (but not all), the format is not read the way it is stored: * In my Excel file, the column is stored with dd/MM/yyyy format…
-
Very big dataset
I have a very large dataset: 16.8 billion records and about 8 TB. It takes days to do any operation on the data, and the project owner wants to use all the data, not a subset. Dataiku and S3 run into memory errors after several hours of running. I'm looking for some general guidelines on how to handle this situation. Thank you.
-
Spark integration problem on k8s
Hi all, I have a problem deploying Spark on k8s, and it shows me the error below. dataiku@dataiku---design:~$ /data/design/bin/dssadmin install-spark-integration -standaloneArchive spark-3.3.4-bin-hadoop3.tgz [+] Saving installation log to /data/design/run/install.log [+] Standalone mode selected + Using…
-
Dependency between Projects for Building DataSets
Is there an option to set dependencies when triggering jobs, as we do with DataStage ETL jobs? For example, setting a predecessor for a job that is waiting to be executed in the queue. Ex: Imagine Project B is being built or refreshed, and within Project B there is a dataset which gets refreshed only when Project A is built/refreshed. Is…
-
Project Export Null Pointer Exception
I'm trying to export a project and this error comes up. I suspected it could be due to limited disk space. I tried not exporting any data but just the shell project and the error persists. Can anyone help? An internal error occurred Please report this issue to Dataiku DSS Support Technical details follow: * Internal error,…
-
Run a recipe for all partitions available
When I run a recipe, how do I run it for all the partitions of one variable? In the photo below, I would like to run this code recipe for all partitions in RW_Index.
-
CSV files with Date Column and Others with no Date Column - How to combine them?
Hi, I have several CSV files in a 2024 folder that feed into Dataiku. These files already have a date column. The new files that I will be adding to this folder will not have a date column. Example: Existing files - Column A | Column B | Column C 02-10-2024 | Name | Product New files Column A | Column B | Column C Name |…
-
User-level Environment Variables
Is there a way to set up user-level environment variables? I know that we can set up Dataiku user variables, but what I am looking for is environment variables seen by non-Dataiku applications that are used within Dataiku. I know this may sound confusing, so here is my use case: I want to set up a cross-account S3…