Can we get the sum or count of a column in Dataiku?
For example, I have two columns:

ColumnA-id     ColumnB-sessioncnt
12345          20
23456          10
Grand total    30
Is it possible to get this format? Please help me with this, as I am new to Dataiku.
Answers
-
Ignacio_Toledo Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 415 Neuron
-
Thanks for the reply!
-
Thanks! My case is different, though: I need the sum to appear in the output itself. The dataset has only two columns, an ID and a visit count, and I need the sum of the second column as a grand total row above or below the data.
I have attached a screenshot for reference. Please let me know if this is possible. Thanks in advance!
-
Ignacio_Toledo
Hi @kusuma,
I don't see a Dataiku feature that would produce this kind of summary table. When we build a final summary table, we usually either export the data to an Excel file and do the last steps in Excel, or use a Python recipe that creates an insight shown as a summary table or directly writes an Excel file in that format.
If you want a Python recipe example, let me know and I can add it here.
Cheers
-
Thanks a lot, @Ignacio_Toledo. Yes, please add the Python recipe. Thanks in advance!
-
Ignacio_Toledo
Working on it!
-
@kusuma, is the objective to show it as an insight in a Dataiku dashboard?
One thing I tried once was to create a dummy column (like SUM_COL in the example image below), then go to Charts, use a Pivot table, and add the dummy column to the COLUMNS parameter.
One limitation of this approach is that we cannot add labels such as Total or Grand Total.
-
Ignacio_Toledo
Hi @kusuma,
This is one option for Python code that writes the output to a Dataiku dataset:
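```python
import dataiku
import pandas as pd

# Read the recipe input
data_for_summary = dataiku.Dataset("data_for_summary")
data_for_summary_df = data_for_summary.get_dataframe()

# Build a one-row frame holding the grand total of the Sessions column
total_row = pd.DataFrame(
    [["Grand Total", data_for_summary_df["Sessions"].sum()]],
    columns=["User Logon", "Sessions"],
)

# DataFrame.append was removed in pandas 2.0, so use pd.concat instead
summarized_df = pd.concat([data_for_summary_df, total_row], ignore_index=True)

# Write recipe outputs
summary = dataiku.Dataset("summary")
summary.write_with_schema(summarized_df)
```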
Where the input dataset was called "data_for_summary" and the output "summary", as shown in this next screenshot:
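To try the same idea outside DSS, here is a minimal pandas-only sketch that uses the two sample rows from the original question. The helper name `add_grand_total` and its `position` parameter are hypothetical, introduced only for illustration; `position` covers the request to place the total above or below the data.

```python
import pandas as pd


def add_grand_total(df, label_col, value_col, label="Grand Total", position="bottom"):
    """Return a new frame with one extra row holding the sum of value_col.

    Hypothetical helper for illustration; not part of Dataiku or pandas.
    """
    total_row = pd.DataFrame([{label_col: label, value_col: df[value_col].sum()}])
    # The label column will mix IDs and text, so store the IDs as strings.
    body = df.astype({label_col: str})
    parts = [total_row, body] if position == "top" else [body, total_row]
    return pd.concat(parts, ignore_index=True)


# Sample data taken from the question at the top of the thread.
sessions = pd.DataFrame({"id": [12345, 23456], "sessioncnt": [20, 10]})

summary_bottom = add_grand_total(sessions, "id", "sessioncnt")
summary_top = add_grand_total(sessions, "id", "sessioncnt", position="top")
print(summary_bottom)
```

In a Python recipe, the resulting frame could be passed to `write_with_schema` in the same way as `summarized_df` above.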
But @NN's solution looks even better, I think!
What DSS is currently missing, in my view, is an option to create summary tables like this one using insights or the chart capabilities.
Cheers!
-
@Ignacio_Toledo
Completely agree with you.
Summary tables are something all the BI tools I have worked with offered, and I too wish for something similar in Dataiku.
-
Apologies for the late response, @Ignacio_Toledo.
Thanks a lot for sharing the code!
-
@NN
Yes, thanks a lot for sharing it; it was very helpful for me.