-
Wrong date format when reading Excel file
Hi, I have an issue when reading Excel files stored in a folder. I import one file per month into the folder and create a dataset that reads all the files and produces one stacked table. For some of them (but not all), the format is not read as it is stored: * In my Excel file, the column is stored in dd/MM/yyyy format…
-
Very big dataset
I have a very large dataset: 16.8 billion records and about 8 TB. Any operation on the data takes days, and the project owner wants to use all the data rather than a subset. Dataiku and S3 run into memory errors after several hours of running. I'm looking for some general guidelines on how to handle this situation. Thank you.
-
Spark integration problem on k8s
Hi all, I have a problem deploying Spark on k8s; it shows me the error below. dataiku@dataiku---design:~$ /data/design/bin/dssadmin install-spark-integration -standaloneArchive spark-3.3.4-bin-hadoop3.tgz [+] Saving installation log to /data/design/run/install.log [+] Standalone mode selected + Using…
-
Dependency between Projects for Building DataSets
Is there an option to set dependencies when triggering jobs, as we do with DataStage ETL jobs? For example, setting a predecessor for a job that is waiting in the queue to be executed. Ex: Imagine Project B is being built or refreshed, and within Project B there is a dataset that gets refreshed only when Project A is built/refreshed. Is…
-
Project Export Null Pointer Exception
I'm trying to export a project and this error comes up. I suspected it could be due to limited disk space, so I tried exporting just the shell project without any data, but the error persists. Can anyone help? An internal error occurred Please report this issue to Dataiku DSS Support Technical details follow: * Internal error,…
-
Run a recipe for all partitions available
When I run a recipe, how do I run it for all the partitions of one variable? In the photo below, I would like to run this code recipe for all partitions in RW_Index.
-
CSV files with Date Column and Others with no Date Column - How to combine them?
Hi, I have several CSV files in a 2024 folder that feed into Dataiku. These files already have a date column. The new files that I will be adding to this folder will not have a date column. Example: Existing files - Column A | Column B | Column C 02-10-2024 | Name | Product New files Column A | Column B | Column C Name |…
-
User-level Environment Variables
Is there a way to set up user-level environment variables? I know that we can set up Dataiku user variables, but what I am looking for is environment variables seen by non-Dataiku applications that are used within Dataiku. I know this may sound confusing, so here is my use case: I want to set up a cross-account S3…
-
OAUTH authentication possibilities for Python library
Hi, we have an internal library that queries Snowflake. In JupyterLab, users are authenticated using an external browser. Is this possible in Dataiku? If not, is there a way for our Python code library to pass the username to the Dataiku Snowflake connection and get back an access token to run queries? Can we access…
-
Change Project image through code
This might be a bit of a silly question, but is there a possibility to change the image used for a project through code (or another automated way)? The reason for this is that, after duplicating projects from Development to Acceptance, we would like to distinguish between them through the image but want to get rid of the…