Split recipe

Solved!
emate
Level 5

Hi Team,

I have a project whose flow has 3 parts: quarterly data, monthly data, and "CD" data. Since all 3 parts have to go through the same prep, we stack the 3 files together, run the prep once, and at some point split the result back into 3 separate flows with a Split recipe.
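For readers following along, the stack-then-split pattern described above can be sketched in plain Python. This is only an illustration of the idea, not how DSS implements its recipes; the `new_dataset` tag column comes from the thread, while the function names and sample rows are hypothetical.

```python
from collections import defaultdict


def stack(sources):
    """Concatenate rows from several sources, tagging each row with its origin."""
    stacked = []
    for name, rows in sources.items():
        for row in rows:
            stacked.append({**row, "new_dataset": name})
    return stacked


def split(rows, column="new_dataset"):
    """Route each row to an output keyed by the value of `column`."""
    outputs = defaultdict(list)
    for row in rows:
        outputs[row[column]].append(row)
    return dict(outputs)


# Hypothetical sample data, one small source per part of the flow.
sources = {
    "quarterly": [{"value": 1}, {"value": 2}],
    "monthly": [{"value": 3}],
    "CD": [{"value": 4}],
}

stacked = stack(sources)  # the shared prep would run on this combined table
outputs = split(stacked)  # then each part goes back to its own flow
```

If one of the outputs comes back empty in a setup like this, the first thing to compare is the exact value in the tag column against the value the split expects.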

It was working just fine last month, but now that I am re-running the quarterly flow, the Split recipe works only for the monthly flow: only the monthly table gets populated, and both the quarterly and CD tables are completely empty. Do you have any idea what might have caused that?

As you can see in the screenshots: screen 1 shows the input table, which has the column "new_dataset" = quarterly, screen 2 shows the recipe, and screen 3 shows the output.

The quarterly scenario just ignores the fact that this table is empty, skips all the remaining recipes, and finishes as a "success".
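One way to keep a run from being reported as a "success" when an output is empty is an explicit guard. Below is a minimal sketch in plain Python, where the `require_rows` helper, the exception class, and the row counts are all hypothetical; how the counts would be obtained inside DSS is left out.

```python
class EmptyDatasetError(Exception):
    """Raised when a dataset that must contain rows is empty."""


def require_rows(dataset_name, row_count):
    """Fail loudly if `dataset_name` has no rows; otherwise return the count."""
    if row_count <= 0:
        raise EmptyDatasetError(
            f"{dataset_name} is empty; stopping before downstream recipes run"
        )
    return row_count


# Hypothetical row counts observed after the split.
row_counts = {"monthly": 1200, "quarterly": 0, "CD": 0}

# Collect every output that would trip the guard.
empty = [name for name, count in row_counts.items() if count <= 0]
```

Calling `require_rows("quarterly", row_counts["quarterly"])` would raise here, which turns a silent skip into a visible failure.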

Thanks for any suggestions,

emate

3 Replies
emate
Level 5
Author

An update: when I try to run only this recipe manually, I get the attached error, "No files in the dataset", even though screenshot 1.png clearly shows that the file exists.

ATsao
Dataiker

Hi Emate,

This error indicates that your input dataset is empty. More specifically, you appear to have partitioning enabled for your datasets, and the "quarterly" partition is likely empty. Could you double-check your partitioning setup to make sure that it's correct, and then try rebuilding your impala_result_tab_pq input dataset?
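To illustrate the distinction being drawn here, the sketch below models a partitioned dataset as a plain dictionary from partition identifier to rows. It is a simplified model rather than the DSS API, and the partition identifiers and rows are made up. It shows how rows tagged `quarterly` can exist in the dataset as a whole while the partition that a job actually reads is empty.

```python
def rows_in_partition(dataset, partition_id):
    """Return the rows stored under one partition (empty list if none)."""
    return dataset.get(partition_id, [])


def rows_matching(dataset, column, value):
    """Scan every partition and return rows where `column` equals `value`."""
    return [
        row
        for rows in dataset.values()
        for row in rows
        if row.get(column) == value
    ]


# Hypothetical layout: the quarterly rows sit under an unexpected partition.
dataset = {
    "monthly": [{"new_dataset": "monthly", "value": 3}],
    "quarterly": [],
    "2020-03": [{"new_dataset": "quarterly", "value": 1}],
}

in_partition = rows_in_partition(dataset, "quarterly")        # what a partitioned job would read
anywhere = rows_matching(dataset, "new_dataset", "quarterly")  # what a whole-dataset filter would find
```

In this model a filter over the whole dataset finds quarterly rows, yet a job reading only the "quarterly" partition sees nothing, which is why checking the partitioning setup is a reasonable first step.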

If you continue to face issues, we will ultimately need a job diagnosis to help you investigate further. If you don't feel comfortable sharing this job diagnosis publicly, please open a support ticket with us directly and we can assist you from there. Also, if the job diagnosis is too big to be attached, feel free to upload it to https://dl.dataiku.com instead and share the resulting link with us.

Thanks,

Andrew

emate
Level 5
Author

Hi @ATsao 

Thanks for your answer. I have opened a support ticket with the job diagnosis attached. This is a very strange situation, as I didn't change anything crucial in the flow and it was working over a month ago. I tried rebuilding impala_result_tab_pq and there was no problem with it. As you can see in the screenshot, I even filtered this dataset by new_dataset=quarterly to check whether it contains any quarterly records, and it seems that it does. This is the dataset that sits directly before the Split recipe.
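A quick way to double-check a filter like the one described above is to count rows per distinct value of `new_dataset`, using `repr` so that stray whitespace or case differences become visible. This is a sketch in plain Python with hypothetical rows; it only illustrates one possible cause of a mismatch and is not a diagnosis of this flow.

```python
from collections import Counter


def count_raw_values(rows, column="new_dataset"):
    """Count rows per exact value, shown with repr() to expose hidden characters."""
    return Counter(repr(row.get(column)) for row in rows)


def count_normalized_values(rows, column="new_dataset"):
    """Count rows per value after trimming whitespace and lowercasing."""
    return Counter(str(row.get(column)).strip().lower() for row in rows)


# Hypothetical rows: two spellings that look alike in a table preview.
rows = [
    {"new_dataset": "quarterly"},
    {"new_dataset": "Quarterly "},
    {"new_dataset": "monthly"},
]

raw = count_raw_values(rows)
normalized = count_normalized_values(rows)
```

If the raw and normalized counts disagree, an exact-match condition on the column would route fewer rows than a visual check of the table suggests.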

Mateusz
