Some data systems impose a maximum file size, so loading a full dataset into them takes multiple uploads. It would be helpful if Dataiku's dataset download functionality, under advanced features, could automatically split a CSV into files of a maximum size and download them quickly. The workaround of using the split recipe can be cumbersome, since the total number of files varies with the number of records in the dataset. For example, I have a 50 MB dataset that I'd like to download as a series of 5 MB files. Right now, my options are the split recipe and a Python recipe that writes to a managed folder. Both are time-consuming when the data is needed ad hoc. Building this into the download system would save that time.
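To illustrate the kind of splitting described above, here is a minimal sketch in plain Python of cutting rows into CSV chunks that stay under a size limit. The function name `split_csv_by_size` and its parameters are hypothetical, not part of Dataiku; reading the dataset and writing each chunk to a managed folder are left out because they depend on the Dataiku API. It assumes UTF-8 output and that every file should repeat the header row.

```python
import csv
import io


def split_csv_by_size(rows, header, max_bytes):
    """Yield CSV chunks as bytes, each at most max_bytes where possible.

    Every chunk repeats the header row. A single row that is larger than
    the limit still gets its own chunk, which will exceed max_bytes.
    """
    def encode(row):
        # Serialize one row with the csv module so quoting is handled.
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerow(row)
        return buf.getvalue().encode("utf-8")

    header_bytes = encode(header)
    chunk = [header_bytes]
    size = len(header_bytes)
    for row in rows:
        row_bytes = encode(row)
        # Start a new chunk if this row would push the current one over
        # the limit and the current chunk already holds a data row.
        if size + len(row_bytes) > max_bytes and len(chunk) > 1:
            yield b"".join(chunk)
            chunk = [header_bytes]
            size = len(header_bytes)
        chunk.append(row_bytes)
        size += len(row_bytes)
    # Emit the final chunk only if it contains data rows.
    if len(chunk) > 1:
        yield b"".join(chunk)


# Example: for the 50 MB case above, max_bytes would be 5 * 1024 * 1024.
rows = [[i, "x" * 10] for i in range(100)]
parts = list(split_csv_by_size(rows, ["id", "value"], max_bytes=200))
```

Because the number of chunks follows from the data size rather than being fixed up front, this avoids the variable-file-count problem of the split recipe, though it is still a workaround and not a replacement for a built-in download option.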