Survey banner
The Dataiku Community is moving to a new home! Some short term disruption starting next week: LEARN MORE

Problem with pivot -- help?

wjkelly
Level 3
Problem with pivot -- help?

I have a dataset that contains, among other fields:
* Store name

* Product purchased

* Brand of product

* Date of purchase

* Whether the product delivered was a trade sample (field is called is_sample and is boolean)

 

I'm trying to build a pivot where the columns are: Store name, Brand, and two pivot columns based on the "is_sample" field, so column one would be Max(order_date) for items that are "false" (i.e. they are NOT trade samples), and the next column would be max(order_date) for items that are "true" (i.e. they ARE trade samples).

The structure of the resulting dataset would have multiple lines per store name, each line being the brand ordered, and there would be dates or empties in each of the last two columns that represent the most recent date the store ordered the brand and/or received a trade sample of the brand.

I have the pivot set up to deliver this, but every time I run it, it puts empty values in 100% of the pivot columns. In other words, I get "Store Name" | "Brand" | <empty value for max date of non-sample order> | <empty value for max date of sample order>. That's where I'm flummoxed. 

When I run the same exercise in Excel with exactly the same data (i.e. I download the input dataset from Dataiku as an Excel sheet and define the pivot off that sheet), it works like a charm -- every line is Store, Brand, Max date of non-sample order, Max date of sample order. And if there isn't data for either of those last two columns, it's just empty.

Ideas about how to make this work in Dataiku?


Operating system used: MacOS Monterey v12.4

0 Kudos
1 Reply
wjkelly
Level 3
Author

Hey -- I'm still actually interested in solving this one. Anyone have any suggestions for me? 

Honestly, it seems like Dataiku could simply do the pivot the way I anticipated -- let me separate the results by a binary field, and display the max value under each of the "true"/"false" columns. 

 

0 Kudos