# univariate analysis of several variables in one graph

Registered Posts: 13 ✭✭✭✭

Hi,

I have a table like the one below. DSS can show univariate analysis (such as a boxplot) under 'statistics', but only for each variable separately. Can I create a single graph that includes all three boxplots, so that one can clearly see the comparison among those variables?

Thanks a lot!

| work hours 2017 | work hours 2018 | work hours 2019 |
|---|---|---|
| 42 | 40 | 42 |
| 41 | 45 | 43 |
| 45 | 40 | 41 |
| 43 | 42 | 40 |
| 42 | 41 | 41 |

• Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron

Several ideas come to mind:

1. Restructure the data in long form, with one column for the year and one column for the hours. In the R world this is sometimes called tidy data.

2. Use a dashboard: publish the three charts you already know how to make to a single dashboard.
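To make the first idea concrete, here is a minimal plain-Python sketch (not DSS code) of what long form looks like for the table in the question, and why it helps: once the year is a column of its own, the values can be grouped by year and summarized side by side, which is what a chart with three boxplots does. The variable names and the "inclusive" quartile convention are assumptions made for illustration; a charting tool may compute quartiles slightly differently.

```python
from collections import defaultdict
from statistics import quantiles

# Wide table from the question: one column per year.
wide = {
    "work hours 2017": [42, 41, 45, 43, 42],
    "work hours 2018": [40, 45, 40, 42, 41],
    "work hours 2019": [42, 43, 41, 40, 41],
}

# Long ("tidy") form: one (year, hours) pair per row.
# The year is taken from the last word of each column name.
long_rows = [
    (int(column.split()[-1]), hours)
    for column, values in wide.items()
    for hours in values
]

# With the year as its own column, grouping is a single pass over the rows.
by_year = defaultdict(list)
for year, hours in long_rows:
    by_year[year].append(hours)

# Print the five numbers a boxplot draws for each year.
# The quartile method is an assumption, not necessarily what DSS uses.
for year, hours in sorted(by_year.items()):
    q1, q2, q3 = quantiles(hours, n=4, method="inclusive")
    print(year, min(hours), q1, q2, q3, max(hours))
```

In long form the 5 x 3 table becomes 15 rows of two columns, so a single boxplot chart can use the year as the grouping dimension and the hours as the measure.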

• Registered Posts: 13 ✭✭✭✭

Thanks @tgb417
the first option is good!

• Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron

Excellent!

Long form or so-called "tidy data" can be super helpful for graphing.

Knowing how to use the visual Prepare recipe to "fold" multiple columns can be very helpful for creating tidy datasets in DSS. Here is a similar example I've been working with recently, involving a number of customer name columns, where I want the column name in one column and the names in another. (You can think of this as an "unpivot" function.)

In your case, you would add each of the columns in your table to the column list:

• "work hours 2017"
• "work hours 2018"
• "work hours 2019"
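For readers who want to see the effect of folding outside the visual recipe, the sketch below mimics the intended result in plain Python. It is not the DSS implementation: the function `fold_columns` and the output column names `source_column` and `hours` are hypothetical names chosen for this example, and only the first two rows of the table are used to keep it short.

```python
def fold_columns(rows, columns, name_column, value_column):
    """Unpivot `columns`: emit one output row per (input row, folded column)."""
    folded = []
    for row in rows:
        # Columns that are not folded are copied onto every output row.
        kept = {k: v for k, v in row.items() if k not in columns}
        for column in columns:
            folded.append(
                {**kept, name_column: column, value_column: row[column]}
            )
    return folded


# First two rows of the table from the question.
rows = [
    {"work hours 2017": 42, "work hours 2018": 40, "work hours 2019": 42},
    {"work hours 2017": 41, "work hours 2018": 45, "work hours 2019": 43},
]

# The column list given above.
columns = ["work hours 2017", "work hours 2018", "work hours 2019"]

# Hypothetical output column names: one for the original column name,
# one for the value that was in that column.
long_rows = fold_columns(rows, columns, "source_column", "hours")

for row in long_rows:
    print(row)
```

Each input row yields three output rows, one per folded column, so the original column name ends up in one column and its value in another.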

Help me…