-
External data catalog integration
Hi everyone, I'm looking for a way to integrate Dataiku with a standalone data catalog tool, for example DataHub. This stems from the fact that some initial data loading and transformation happens inside the DWH through an orchestration tool like Airflow and a transformation tool like dbt. This creates initial datasets that are…
-
How to prevent DSS from replacing NA with null?
Hi, I'm using a Python recipe to query and insert data into the output SQL Server dataset, as below.
import dataiku
import pandas as pd
from dataiku import SQLExecutor2
# Read recipe inputs
p787PDMItem = dataiku.Dataset("_P787PDMItem_src")
p787PDMItem_df = p787PDMItem.get_dataframe()
# Initialize an empty DataFrame to collect all…
-
How to ignore the identity column of a SQL Server table in Dataiku
I have the following SQL Server datasets: TableSource has 2 columns, [Name] string and [Address] string, from connection 1; TableDest has 3 columns, [Id] int identity, [Name], and [Address], from connection 2. How can I import data from TableSource into TableDest in Dataiku? Note: connection 1 and connection 2 are on different SQL servers.
-
HDFS - Force Parquet as default settings for recipe output
Greetings! I'm currently on a platform with Dataiku 11.3.1, writing datasets to HDFS. IT requires all datasets to be written in Parquet, but the default setting is CSV (Hive), which can generate errors. Is there a way to configure the connection to force the default setting to be Parquet? Best regards,
-
Is the "Admin" privilege necessary to create branches in a project?
Hello, I am an administrator on our DSS servers and can create branches in a project without issue. However, it seems that a non-admin user who did not create the project cannot create branches (or switch branches) in it unless they are made "Admin" under the "Security" tab, either by giving the permission directly or…
-
Using Script Scenario
The webhook URL of an MS Teams channel can be used in the 'Reporter' to send scenario-related messages to that Teams channel. However, I would like to know whether it is possible to send messages to the same MS Teams channel programmatically using the Dataiku Scenario API. I simply want to use two separate channels (mail…
-
Exclude specific dataset build from scenario
Hello, I need to exclude part of my flow from being built during the execution of a scenario, because this part is to be built only once in a while. For now, I have to do it in "Build only these items" mode and list all the datasets I actually want to build. Is there a way to exclude the few datasets I DON'T want to…
-
Starting a project with a Python script that generates data: is it possible?
Hello everyone, I'm new to Dataiku. I developed a Python script that collects data from different sources (web scraping, local files...) before generating a pandas DataFrame, and then I perform my analysis on it. I would like to move this project into Dataiku. BUT, when I start a project, I need a dataset whereas I…
-
Using dates in Dataiku
Hi, Despite going through the documentation multiple times, I still don't really understand how dates work in DSS. I'm importing a dataset from a connection. Without turning on any of the options in Date & Time handling, this is what the data looks like: It says that the data type is string, while in the database itself it is, in…
-
Dataset settings - userModified
Hello, When we look into the JSON configuration of each dataset, there's a setting "userModified" under the "schema" section. Does anybody know if it's just a flag indicating whether a user modified the schema, or is it used for anything else? In one of our datasets, it was set to "true" by Dataiku, but after merging two…