
Sorting dataset for pivot

Dataiker
Is there a way to guarantee a dataset is sorted in order to pivot it?

In the past I had to add an extra Python or SQL recipe to sort the data, but I have realized that even this no longer works reliably. Sometimes, after an SQL ORDER BY or a Pandas sort_values, the resulting dataset contains a few rows that are out of order, even though 99% of the rows are sorted correctly. For example, this is what I have right now:

ID Tag
1 automation
..
100 automation
101 biotech <--- should not be there!
102 automation
...
152 automation
153 biotech
....

Since it doesn't matter whether I sort using Python or SQL, I guess it has something to do with how Dataiku works internally. Is there anything I can do?
1 Reply
Dataiker
The fact that neither Python nor SQL manages to sort your dataset unfortunately has nothing to do with how Dataiku works internally: we orchestrate the execution of recipes and queries.

The issue is somewhere else...

In any case, investigating this issue further would require a dataset extract so that we can try to reproduce it.
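
When preparing such an extract, it may help to pin down exactly where the order breaks. The following is only a minimal sketch, not something from the thread itself: the helper name `out_of_order_positions` and the sample values are hypothetical, and it assumes the data is expected to be sorted in ascending order on a single column such as Tag.

```python
def out_of_order_positions(values):
    """Return the positions whose value is smaller than the value just before it.

    An empty list means the sequence is in ascending (non-decreasing) order.
    """
    values = list(values)
    return [i for i in range(1, len(values)) if values[i] < values[i - 1]]


# Hypothetical sample mimicking the symptom: one "biotech" row sits
# in the middle of the "automation" rows.
tags = ["automation", "biotech", "automation", "biotech"]

# Position 2 is reported because "automation" follows "biotech".
print(out_of_order_positions(tags))

# After an explicit sort, no position should be reported.
print(out_of_order_positions(sorted(tags)))
```

With Pandas, the same check can be run on `df["Tag"].tolist()`, once right after `sort_values` inside the recipe and once after reading the output dataset back. Comparing the two results should at least show whether the rows are already out of order when the sort finishes or only after the dataset has been written.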