
Can we detach the downstream flow from one dataset and attach it to another dataset?

Tsurapaneni
Level 3

Hi Team,

Hope you are all doing well during this pandemic!

My question concerns a dataset with more than 70 million rows: even a simple Prepare recipe takes a long time to run on it. To speed up development, we thought of working on a sample of the data for now. For that I have to create a Sample/Filter recipe to take a chunk of the data (let me know if something else can be used instead). But for the final build, and when going further, I have to remove the Sample/Filter recipe and attach its downstream flow to the other dataset. Also, can we change the input dataset of a recipe after it has been created? If yes, please help me with how to do it.

Can you also suggest ways to handle such long, large datasets so that they run faster?

Regards,

Tejasri.

tgb417
Neuron

In my experience, with a much smaller file system and PostgreSQL datasets, I can change both the input and output datasets of a recipe.

I do need to be careful that the schemas of the two datasets being swapped are by and large the same.

Steps:

  1. Open your recipe
  2. Choose "Input/Output" at the top
  3. Choose the "Change" button
  4. Choose from the list of existing datasets, or create a new dataset.
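
If you would rather script the swap than click through the UI, here is a minimal sketch using the Dataiku public Python API. It assumes `project` is a project handle from the `dataikuapi` client, and that the recipe settings object offers `replace_input` and `save`, as I recall from the API docs; please verify against the documentation for your DSS version. The helper name `swap_recipe_input` and all recipe and dataset names are hypothetical.

```python
def swap_recipe_input(project, recipe_name, old_input, new_input):
    """Point an existing recipe at a different input dataset.

    `project` is assumed to be a dataikuapi project handle (an assumption,
    not confirmed in this thread). The two datasets should have largely
    the same schema, otherwise downstream steps may break.
    """
    recipe = project.get_recipe(recipe_name)
    settings = recipe.get_settings()
    # Swap one input reference for another in the recipe definition
    settings.replace_input(old_input, new_input)
    # Persist the modified settings back to DSS
    settings.save()
    return settings


# Example call against a live instance (host, key, and names are placeholders):
# import dataikuapi
# client = dataikuapi.DSSClient("https://dss.example.com", "YOUR_API_KEY")
# project = client.get_project("MY_PROJECT")
# swap_recipe_input(project, "compute_prepared", "big_table_sample", "big_table")
```

With this approach the Sample/Filter recipe can stay in the flow during development, and the downstream recipe is repointed at the full dataset only for the final build.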

[Screenshot: Changing Input and Output Datasets]

--Tom