-
How can I apply a sequence of "Preparation" steps I have made for one dataset to another?
I have several datasets where some of the columns are identical (for example, Country and Product-name), but which otherwise differ. I need to prepare them before analysing and eventually merging them, in order to make sure that rows with the same values for Country and Product-name can be matched. Now that I have defined a set of…
-
How to stack partitioned datasets with incoherent partitions?
Hi, let's say we have 2 datasets, partitioned by letters. Dataset 1: partition A, partition B. Dataset 2: partition B, partition C. I would like to get a "summed" dataset, where existing partitions are stacked. Dataset 3: partition A (from 1), partition B (from 1 + 2), partition C (from 2). Simply stacking those…
-
How to import a batch of CSVs with different schemas from a local folder
I have been using the visual recipes to clean and stack a set of data files (CSVs). They do not all share a single schema; each has various parts of an overall schema (what the final 'stacked' dataset will be). Instead of uploading each file individually (over 100), I would like to upload them all into my flow at once.…
-
After creating a group of steps in one recipe, can I use that group in another recipe?
I want to process several datasets in the same way. I have put all the steps of one recipe (except one, which should be individual) into two groups, which I now want to use for several other datasets. Is that possible, and if so, how?
-
Cannot remove input dataset from stack recipe
According to the UI, it should be possible to remove an input when editing a stack recipe in the Settings tab: there is a trash can icon next to each dataset. However, nothing seems to happen when I press this icon. The alternative is of course to make a new recipe without this input, but if there is a trash can, it should work.
-
Stack recipe not working?
The stack recipe was previously working well. Then suddenly, after building the dataset, all the rows are missing, although the columns specified in the recipe are present. If we click Analyse on a single column, this is what it prints: `An invalid argument has been encountered : NaN is not a valid double value as per JSON…
-
Convert stack to merge
I have a Dataiku dataset like the following. I want to merge these rows based on "PATNO" into a single row. Note that there are different groups of PATNO, so I need to iterate over them too. Is there any way to do this?