I have a MySQL table of a billion rows that can be split into 100,000 partitions by a category column.
From that table I would like to run a complex query against each partition to produce another subset of rows, unaggregated.
How can I set this up in DSS so that these queries run a few at a time in parallel, and the results are then consolidated?
Hi,
If your SQL dataset is partitioned, see: https://doc.dataiku.com/dss/latest/partitions/sql_datasets.html
Each partition then gets its own activity, so the per-partition queries run in parallel. By default, 5 activities can be executed simultaneously on a DSS instance. This limit can be increased depending on your use case.
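To illustrate the general pattern (a bounded number of per-partition jobs running at once, with their unaggregated rows consolidated afterwards), here is a minimal Python sketch. It is not DSS code and does not use any DSS API: `run_partition_query`, `process_all`, and the sample rows are hypothetical stand-ins, and the in-memory filter only takes the place of the real complex query against MySQL.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: mirrors the default limit of 5 simultaneous activities mentioned above.
MAX_PARALLEL = 5


def run_partition_query(category, rows):
    """Hypothetical stand-in for the complex per-partition query.

    A real version would query MySQL with a filter on the category column;
    here we simply keep the rows of that category with an even value.
    """
    return [r for r in rows if r["category"] == category and r["value"] % 2 == 0]


def process_all(categories, rows, max_parallel=MAX_PARALLEL):
    """Run one query per partition, at most max_parallel at a time,
    and consolidate the unaggregated result rows into a single list."""
    results = []
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map yields results in the same order as the input categories.
        for subset in pool.map(lambda c: run_partition_query(c, rows), categories):
            results.extend(subset)
    return results


# Tiny illustrative dataset: 3 categories with 4 rows each.
SAMPLE_ROWS = [{"category": c, "value": v} for c in ("a", "b", "c") for v in range(4)]
CONSOLIDATED = process_all(["a", "b", "c"], SAMPLE_ROWS)
```

In DSS itself, the scheduling is handled by the partition activities described above, so a pool like this is only a way to picture what the activity limit does.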