Using Dataiku
- Hi, I'm trying to create a new column, "Parent", based on the 2 columns X and Y. It's kind of a basic tree structure in Y. 2 is the daughter of 1 and so on, as in the example below. In the "Parent", w…Last answer by Turribeach
What I need to be sure of is that the data is sortable, not sorted. These are different things. You can use a Window recipe to calculate a parent, but for that to work you need to specify a sort order. So it doesn't matter what sort order the data comes in; what matters is whether I can sort by columns X and Y and obtain the same result. With regards to G, I can't help unless I understand the rule of how you want the logic to work. For all the other rows you want to populate with the Level 1 parent X value when Y (level) > 1. That's a rule I can work with. With G I really don't know what you want.
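For illustration, a minimal pandas sketch of that rule, assuming X holds the node label, Y its level, and the rows are in (or can be sorted into) tree order; the data and column names are hypothetical:
import pandas as pd

# Hypothetical tree rows: X is the node label, Y is its level.
df = pd.DataFrame({"X": ["A", "B", "C", "D", "E"],
                   "Y": [1, 2, 2, 1, 2]})

# Carry the most recent Level 1 X value forward, then use it as the
# parent of every deeper row (Y > 1); Level 1 rows get no parent.
df["Parent"] = df["X"].where(df["Y"] == 1).ffill()
df.loc[df["Y"] == 1, "Parent"] = None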
- Hi Team Trying to achieve the result by joining and then applying the window function to get Labor only in 1 record for project instead of all records. Attached is the data and the current o/p vs desi…Solution by satishkurra
Hi All
I was able to achieve this with a window function by selecting the row number; once that was done, I applied a formula so the data shows Labor only once per project, as below:
if(rownumber==1,Labor,0)
It worked. Thanks
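Outside a Window recipe, the same idea as a minimal pandas sketch (the project and Labor column names are assumptions based on the thread):
import pandas as pd

# Hypothetical data: Labor is repeated on every row of a project.
df = pd.DataFrame({"project": ["P1", "P1", "P1", "P2", "P2"],
                   "Labor":   [100, 100, 100, 50, 50]})

# Row number within each project, then keep Labor on the first row only,
# mirroring if(rownumber==1,Labor,0).
df["rownumber"] = df.groupby("project").cumcount() + 1
df["Labor"] = df["Labor"].where(df["rownumber"] == 1, 0)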
- Hello, I am contacting you because I am having difficulty using EKS S3 on Dataiku. When saving data on the Dataiku server, there is no limit to the number of records, but when saving data on EKS S3, no mo…Last answer by Alexandru
Hi @SunghoPark,
This is an unusual limit to hit, in terms of 4k items. It's unlikely to be related to S3 per se, or to EKS, but to something else, like the credentials used for S3, e.g. you are hitting the 1h STS token limit when using role chaining:
https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_terms-and-concepts.html
To investigate this further, please raise a support ticket with the diagnostics of the job running on EKS. You could also try using a connection with an Access Key and Secret instead.
Kind Regards,
- I am using the Dashboard facility to show metrics that have been rounded to 1 d.p. within a prepare recipe. The resulting metrics sometimes show several trailing zeros after the 1st decimal place. Is…Solution by tgb417
As a possible workaround that should work fully in the formula language:
For example, to get 1 decimal place:
- multiply the number you have in [my_column] by 10,
- then round,
- then divide by 10:
(round([my_column]*10))/10
To get two decimal places, multiply and divide by 100, and so on.
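As a minimal Python sketch of the same multiply/round/divide trick, generalized to n decimal places (the helper name is illustrative; unlike the formula language here, Python's own round already accepts a digit count):
# Scale up, round to an integer, then scale back down.
def round_to(value, n_decimals):
    factor = 10 ** n_decimals
    return round(value * factor) / factor

round_to(3.14159, 1)  # 3.1
round_to(3.14159, 2)  # 3.14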
That said, it might be nice to put in a product idea to extend the round function to take a number of decimal places.
- Hello everyone, I would like to know if it is possible in Gemini Pro Vision, through Prompt Studio, to add images as a parameter. If so, should this image be a URL, a decoded image, bytes? Thank you v…
- Hi All, I have a partitioned dataset which is partitioned by column 'current', and each partition contains 2,000,000 rows. That's why I want to filter rows in the attached dataset so that whenever ther…
- Hi, I want to synchronize an Oracle table of 1 billion rows to another Oracle table. The query is very long and I end up with the following Oracle error: [11:06:27] [INFO] [dku.output.sql] - appended …Solution by jrouquie
Partitioning might indeed be an answer to your case. Beware that it requires learning a bit about it first, and it needs some practice (don't be discouraged if things don't work on the first attempt!).
> synchronize an Oracle table of 1 billion rows to another Oracle table
If both datasets are partitioned, and you set the partition dependency to be "Equal" (which is the default), then DSS will indeed run the recipe partition by partition, as “multiple little queries”.
> How can I run several partition keys in one go
Specifying which partitions you want DSS to sync is done on the recipe page, just above the "run" button.
See http://doc.dataiku.com/dss/latest/partitions/identifiers.html for how to specify a (list of) partitions, and more generally http://doc.dataiku.com/dss/latest/partitions/index.html to start learning about partitioning.
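If you would rather drive this from code than from the recipe page, here is a hedged sketch using the Dataiku Python API's dataset build call; the project key, dataset name, and partition identifiers are hypothetical:
import dataiku

# Build two named partitions of a partitioned dataset in one job.
client = dataiku.api_client()
project = client.get_project("MY_PROJECT")
dataset = project.get_dataset("my_partitioned_dataset")
dataset.build(partitions="2024-01,2024-02", wait=True)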
> How can I run all partitions in one go without listing the 200 possible values
This isn't directly supported, but there is a workaround for now: add a recipe from the dataset for which you want to build all partitions to a dummy unpartitioned dataset, and define partition dependencies as “all available”.
- Hi all, Is anyone else having trouble viewing a Python Dash web app from VSCode using the Dataiku DSS plugin? To be clear, we can see the app itself in the side-bar menu, but we can't view or edit the…
- Hi All, I have made a dashboard with charts and I wish to send the dashboard to an end user without a DSS license. Is it possible? Regards
- Hi, We need this behavior to work when executing SQL from Python. show grants on database MYDB;-- DKU_END_STATEMENTcreate or replace temporary table db_check as select EXISTS( select * from table(resu…Last answer by Zach
Hi @info-rchitect,
You can accomplish this by using pre_queries to create the temporary tables.
For example, the following script will create a temporary table based on the existing table "other_table", then access the temporary table in the main query:
from dataiku import SQLExecutor2

executor = SQLExecutor2(connection="MY_CONNECTION")
pre_queries = [
    "CREATE TEMPORARY TABLE temp_table AS SELECT * FROM other_table",
]
query = "SELECT * FROM temp_table"
df = executor.query_to_df(query, pre_queries=pre_queries)
Reference documentation: https://developer.dataiku.com/12/api-reference/python/sql.html#dataiku.SQLExecutor2.query_to_df