Sorting dataset for pivot

UserBird Dataiker, Alpha Tester Posts: 535 Dataiker
Is there a way to guarantee a dataset is sorted in order to pivot it?

In the past I had to add an additional Python or SQL recipe to sort the data, but I realized that even this no longer works reliably. Sometimes, after an SQL ORDER BY or a Pandas sort_values, the resulting dataset contains a few rows that are out of order, even though 99% of the rows are sorted correctly. For example, this is what I have right now:

ID Tag
1 automation
..
100 automation
101 biotech <--- should not be there!
102 automation
...
152 automation
153 biotech
....

Since it doesn't matter whether I sort using Python or SQL, I guess it has something to do with how Dataiku works internally. Is there anything I can do?

Answers

  • cperdigou
    cperdigou Alpha Tester, Dataiker Alumni Posts: 115 ✭✭✭✭✭✭✭
    The fact that neither Python nor SQL manages to sort your dataset unfortunately has nothing to do with how Dataiku works internally: we orchestrate the execution of recipes and queries.

    The issue is somewhere else...

    In any case, investigating this further would require a dataset extract so that we can try to reproduce the issue.
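
When preparing such an extract, it may help to pinpoint exactly where the order breaks. The following is only a minimal sketch, assuming the data has been loaded into a pandas DataFrame with the `ID` and `Tag` columns from the question. `out_of_order_rows` is a hypothetical helper written for illustration, not a Dataiku or pandas function, and the sample values are made up.

```python
import pandas as pd


def out_of_order_rows(df, key):
    """Return the rows whose `key` value is smaller than that of the row just before them."""
    values = df[key].tolist()
    # Compare each value with its predecessor using plain Python comparison.
    breaks = [i for i in range(1, len(values)) if values[i] < values[i - 1]]
    return df.iloc[breaks]


# Made-up sample that mimics the pattern described in the question.
sample = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4],
        "Tag": ["automation", "biotech", "automation", "biotech"],
    }
)

# Positions where the Tag order breaks before sorting.
unsorted_breaks = out_of_order_rows(sample, "Tag")

# Sort on Tag, using ID as a tie-breaker so the result is fully determined.
sorted_sample = sample.sort_values(["Tag", "ID"]).reset_index(drop=True)
sorted_breaks = out_of_order_rows(sorted_sample, "Tag")
```

If the check comes back empty immediately after `sort_values` but flags rows once the output dataset is read back, that narrows down the step at which the order is lost, which is useful context to share along with the extract.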