Using Dataiku
- Hi All, I would appreciate it if someone could provide me with a Python script to select (read_csv) the latest CSV file in an SFTP folder. Currently I am using the following script to read csv files from…
Last answer by PrathameshPatil
Hi, were you able to find a solution for this, where Dataiku reads the latest uploaded file out of the list of uploaded files?
- Currently using Scenario reporters to send data to a dataset with the configuration below. { "flowname": "${scenarioName}", "status": "${outcome}", "summary": "${failedEventsSummary}" } The issue is faile…
Solution by Turribeach
There isn't a quick way of getting the error. You will need to process this variable yourself or look at using the Python API to get the error. Here is a sample way of doing it:
https://community.dataiku.com/t5/Using-Dataiku/Get-the-run-error-message-from-Dataiku-API/m-p/41593
- Hi, I want to upload a TDE file from Tableau Server to DSS. How should I go about it? Narayan
- colAB310 if(contains(toUppercase(col),"A"),"letters",if(contains(toUppercase(col),"B"),"letters","others")) The code above works but is it possible to shorten it to combine the logic for "A" and "B"? …
Solution by Turribeach
It's possible but not going to help you simplify the expression. Using the boolean operators:
if(contains(toUppercase("ABC"),"D") || contains(toUppercase("ABC"),"A"), "letters", "others")
Or using boolean functions:
if(or(contains(toUppercase("ABC"),"D"), contains(toUppercase("ABC"),"A")), "letters", "others")
You will be better off using match() with a regular expression and checking the output with length().
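To illustrate the regular-expression idea outside the formula language, here is a minimal Python sketch. It is not Dataiku formula syntax, and the function name `classify` is hypothetical. It only shows how a single character class can cover both "A" and "B", so the two `contains()` checks collapse into one test.

```python
import re

def classify(value: str) -> str:
    """Return "letters" if value contains A or B (any case), else "others"."""
    # [AB] matches either letter; IGNORECASE plays the role of toUppercase().
    if re.search(r"[AB]", value, flags=re.IGNORECASE):
        return "letters"
    return "others"
```

Adding another letter then only means extending the character class, for example `[ABC]`.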
- Could you please provide guidance on how to establish a connection between Dataiku and Cosmos DB for data integration and analysis purposes?
- * File_Path: \USER\FOLDER_A\ABC\FILE_A * Formula: if(contains(toUppercase(File_Path),"\ABC\"), "yes", "no") * This returns an "Invalid Formula" message: Unexpected 'yes' (Parsing error at offset ) I teste…
- I have to read the logs and find where the error has occurred so the item is marked as an exception using NLP. Can anyone help on this please?
Last answer by LouisDHulst
Hi @haritha1, can you expand on your use case a bit please? Is this a machine learning project? Which logs do you need to parse? Do you need this to run after a Scenario fails? We need more information in order to help you.
- * Can Spark be configured for ML algorithms; it looks like current processing is in memory? * Is there Spark processing option available for K means clustering and PCA linear regression? * Is Light GB…
Last answer by Turribeach
I suggest you split these into individual questions / threads, as you are asking too many questions in a single post. Where you have an error you should post the error you get. "Doesn't work" or "times out" doesn't really say much, so post the full error trace from the backend log. Where you say Dataiku lacks schema visibility, please post what you see (screenshot) and why you think it's missing something.
- I have a flow that gets data from two Snowflake sources, then Python recipe checks the difference of the max(date columns) of both and extracts the rows that are missing from the other dataset. I firs…
Solution by Turribeach
If you need to maintain a historical dataset I strongly advise you NOT to use append mode. While append mode does work in some instances, it is not safe to use. Any schema changes or differences will cause Dataiku to drop the table and recreate it, causing you to lose all your historical data. Even when you use write_from_dataframe(), which should not change the schema, I have seen cases where append dataset tables get dropped. Dataiku simply does not handle this ETL concept well. There are different solutions, but the safest one is to handle the inserts in code (i.e. Python), which "hides" the output table from Dataiku and prevents accidental table dropping.
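As a rough sketch of what "handling the inserts in code" could look like, the example below appends only rows newer than the latest date already stored and never drops or recreates the table. It uses an in-memory SQLite database purely as a stand-in for the real target (Snowflake in the question). The table name `history`, its columns, and the function `append_new_rows` are assumptions made for illustration, not part of any Dataiku API.

```python
import sqlite3

def append_new_rows(conn, rows):
    """Insert only rows newer than the latest stored date; return how many were added."""
    # Hypothetical history table; created once and never dropped afterwards.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history (event_date TEXT, value REAL)"
    )
    (max_date,) = conn.execute("SELECT MAX(event_date) FROM history").fetchone()
    # ISO-formatted dates (YYYY-MM-DD) compare correctly as plain strings.
    new_rows = [r for r in rows if max_date is None or r[0] > max_date]
    conn.executemany(
        "INSERT INTO history (event_date, value) VALUES (?, ?)", new_rows
    )
    conn.commit()
    return len(new_rows)

conn = sqlite3.connect(":memory:")
first = append_new_rows(conn, [("2024-01-01", 1.0), ("2024-01-02", 2.0)])
# The second batch overlaps the first; only the 2024-01-03 row is new.
second = append_new_rows(conn, [("2024-01-02", 2.0), ("2024-01-03", 3.0)])
total = conn.execute("SELECT COUNT(*) FROM history").fetchone()[0]
```

Because the recipe issues its own INSERT statements, the history table is only ever added to, which is the property the answer above is after.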
- I'm trying to create a regex function that gives me the string between 2 characters. I have the string below word1_word2_word3_word4_word5_word6_word7_word8_length_string.txt and I'm trying to return e…
Last answer by AdrienL
Some alternatives:
- For the regex, it really depends on what the constraints are. For instance,
^.*_(.+)\.\w+$
would match between the last _ and the extension, see (and edit, explain, play with) the test cases and regex here
- For the tool, you can use many things depending on the need:
  - a python recipe
  - a step in a data preparation recipe, with multiple possibilities
    - python step
    - formula step (as suggested by Louis)
    - text extraction step (probably the simplest for a simple need), with its more exotic options
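As a quick check of what the pattern `^.*_(.+)\.\w+$` captures, here is a small Python sketch of the kind that could run in a Python recipe or Python step. The helper name `last_token` is hypothetical, and the sample file name is the one from the question.

```python
import re

# Pattern from the answer: greedy ".*_" runs up to the last underscore,
# the group captures what follows, and "\.\w+" consumes the extension.
PATTERN = re.compile(r"^.*_(.+)\.\w+$")

FILENAME = "word1_word2_word3_word4_word5_word6_word7_word8_length_string.txt"

def last_token(name):
    """Return the text between the last underscore and the extension, or None."""
    match = PATTERN.match(name)
    return match.group(1) if match else None

print(last_token(FILENAME))  # prints: string
```

A name without an underscore does not match at all, so the helper returns None rather than raising an error.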