-
How to establish a connection to Cosmos DB in Dataiku
Could you please provide guidance on how to establish a connection between Dataiku and Cosmos DB for data integration and analysis purposes?
-
"IF contains" on file path
* File_Path: \USER\FOLDER_A\ABC\FILE_A
* Formula: if(contains(toUppercase(File_Path),"\ABC\"), "yes", "no")
* This returns an "Invalid Formula" message: Unexpected 'yes' (Parsing error at offset )
I tested the formula using "ABC" instead of "\ABC\" and it works as intended; however, for my project, retaining the slash symbols…
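A likely cause, stated as an assumption because the error text is cut off: many expression languages treat the backslash as an escape character inside string literals, so the final \" in "\ABC\" would swallow the closing quote and the parser would then trip on 'yes'. If Dataiku formulas follow that convention, doubling each backslash ("\\ABC\\") may be enough. The same check, sketched in a Python step (the function name `contains_folder` is hypothetical):

```python
def contains_folder(file_path: str, folder: str) -> str:
    """Return "yes" if `folder` appears enclosed by backslashes in the path, else "no"."""
    # "\\" in Python source code is one literal backslash character.
    needle = "\\" + folder.upper() + "\\"
    return "yes" if needle in file_path.upper() else "no"


# Raw string (r"...") so the backslashes in the sample path are taken literally.
print(contains_folder(r"\USER\FOLDER_A\ABC\FILE_A", "ABC"))  # prints: yes
```

Keeping the backslashes in the search term is what stops a folder such as FOLDER_ABC from matching.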
-
NLP Finding
I need to use NLP to read logs and find where an error has occurred, so that the item can be marked as an exception. Can anyone help with this, please?
-
Functionality questions on Dataiku
* Can Spark be configured for ML algorithms? It looks like current processing is in memory.
* Is there a Spark processing option available for k-means clustering, PCA, and linear regression?
* Is LightGBM available with Spark?
* Is automated hyperparameter tuning available in Dataiku?
* Schema visibility:
* Current Dataiku…
-
Sync-recipe to Snowflake
I have a flow that gets data from two Snowflake sources. A Python recipe then compares the max(date column) of both and extracts the rows that are missing from the other dataset. I first tried syncing that back to Snowflake (that is, updating the other source by appending the missing rows) but encountered…
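For reference, the comparison step described above could look roughly like this. It is only a sketch: it assumes each row is a dict with a comparable `date` field (the key name and the helper `missing_rows` are hypothetical), and it does not address the sync problem itself, since that part of the post is cut off.

```python
from datetime import date


def missing_rows(source_rows, target_rows, date_key="date"):
    """Return the source rows dated after the latest date found in the target."""
    if not target_rows:
        # Nothing in the target yet, so every source row is missing.
        return list(source_rows)
    target_max = max(row[date_key] for row in target_rows)
    return [row for row in source_rows if row[date_key] > target_max]


source = [{"date": date(2024, 5, 5)}, {"date": date(2024, 5, 6)}, {"date": date(2024, 5, 7)}]
target = [{"date": date(2024, 5, 5)}]
print(len(missing_rows(source, target)))  # prints: 2
```

Appending only these rows to the target would then be a separate write step.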
-
Regex function to return string between 2 characters
I'm trying to create a regex function that returns the string between two characters. I have the string below: word1_word2_word3_word4_word5_word6_word7_word8_length_string.txt and I'm trying to return everything after the 7th instance of "_" and before ".txt". Desired output: word8_length_string. Is there a way to use a regex…
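One possible pattern, shown here with Python's `re` module as a minimal sketch (the function name is hypothetical, and whether the same pattern can be pasted unchanged into a Dataiku processor is an assumption): skip seven "anything up to an underscore" groups, then capture everything before the trailing ".txt".

```python
import re

# (?:[^_]*_){7}  consumes the first seven underscore-terminated segments
# (.*)           captures the remainder
# \.txt$         requires the literal ".txt" suffix at the end
PATTERN = re.compile(r"^(?:[^_]*_){7}(.*)\.txt$")


def after_seventh_underscore(name):
    """Return the text between the 7th "_" and ".txt", or None if there is no match."""
    match = PATTERN.match(name)
    return match.group(1) if match else None


sample = "word1_word2_word3_word4_word5_word6_word7_word8_length_string.txt"
print(after_seventh_underscore(sample))  # prints: word8_length_string
```

Names with fewer than seven underscores, or without the ".txt" suffix, return None rather than a partial match.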
-
Split columns to rows with new line delimiter
I have data structured like this:
category | category details
1 | categoryname categorytype categorylength
My category details cell has three values separated only by a new line. Is there a way to split that field on a new-line delimiter so that I get 2 new rows for those extra fields? My desired output is:
category | category details
1…
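The split can be sketched in plain Python as follows, assuming the rows are available as a list of dicts and the cell uses ordinary line breaks (the helper name `split_cell_to_rows` is hypothetical). Each line of the multi-line cell becomes its own row, and the other columns are copied unchanged.

```python
def split_cell_to_rows(rows, column):
    """Expand each row into one row per line found in `column`."""
    expanded = []
    for row in rows:
        value = str(row[column])
        # splitlines() handles \n, \r\n and \r; keep empty cells as a single row.
        parts = value.splitlines() or [value]
        for part in parts:
            expanded.append({**row, column: part})
    return expanded


data = [{"category": 1, "category details": "categoryname\ncategorytype\ncategorylength"}]
for row in split_cell_to_rows(data, "category details"):
    print(row)
```

With the sample above, one input row yields three output rows that all share category 1.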
-
delete copied zone in current project
I am unable to delete a copied zone in my current project. How can I delete it? Operating system used: Windows
-
partitioning
Hi there, I have a problem aggregating large file-based data on S3. The data is stored like this: /2022-12-21T00/B00/part-00004-d89b41ad-3d2b-4350-9880-c5f1dfbdbea6.c000.csv.gz. The T00 stands for the hour, and B00 is a group that always contains the same subjects. What I am trying to achieve is to aggregate the B00…
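As a starting point, the day, hour, and group can be pulled out of each path with a regular expression and the files bucketed per group. This is a sketch under assumptions: the group folder is taken to be "B" followed by digits, the goal is taken to be combining the hourly files of one group, and the helper names and the short file names in the example are hypothetical.

```python
import re
from collections import defaultdict

# Matches ".../<YYYY-MM-DD>T<HH>/<group>/..." as in the path layout shown above.
PATH_RE = re.compile(r"/(?P<day>\d{4}-\d{2}-\d{2})T(?P<hour>\d{2})/(?P<group>B\d+)/")


def parse_partition(path):
    """Return {"day", "hour", "group"} for a path, or None if it does not match."""
    match = PATH_RE.search(path)
    return match.groupdict() if match else None


def paths_by_day_and_group(paths):
    """Bucket file paths by (day, group) so each bucket spans all hours of that day."""
    grouped = defaultdict(list)
    for path in paths:
        parts = parse_partition(path)
        if parts is not None:
            grouped[(parts["day"], parts["group"])].append(path)
    return dict(grouped)


print(parse_partition("/2022-12-21T00/B00/part-00004-d89b41ad-3d2b-4350-9880-c5f1dfbdbea6.c000.csv.gz"))
```

Each (day, group) bucket can then be read and aggregated on its own, which keeps a single aggregation from loading every hour of every group at once.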
-
calculate the difference between 2 datetime values in hh:mm
I have two datetime values, date1: "2024-05-60 12:00:00" and date2: "2024-05-07 09:58:00". Is it possible to calculate the difference in hh:mm between those two datetimes in Dataiku? Operating system used: Windows
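In a Python step, the difference can be computed with the standard `datetime` module. A minimal sketch follows; note that "2024-05-60" is not a valid date, so the example assumes date1 was meant to be "2024-05-06 12:00:00". That value and the function name `diff_hhmm` are assumptions made for illustration.

```python
from datetime import datetime


def diff_hhmm(start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """Return end - start as "hh:mm"; hours may exceed 24, negatives get a "-" prefix."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    seconds = delta.total_seconds()
    total_minutes = int(abs(seconds) // 60)
    hours, minutes = divmod(total_minutes, 60)
    sign = "-" if seconds < 0 else ""
    return f"{sign}{hours:02d}:{minutes:02d}"


# date1 is assumed to be 2024-05-06, since 2024-05-60 cannot be parsed.
print(diff_hhmm("2024-05-06 12:00:00", "2024-05-07 09:58:00"))  # prints: 21:58
```

Working from total minutes rather than the day and second fields of the timedelta keeps multi-day gaps expressed in hours, for example 49:30 instead of 2 days and 01:30.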