
Empty dataset after running prepare, filter and Python recipe.

vinayk
Level 2

Hello Team,

I have loaded data from Snowflake into Dataiku, and now I want to filter it for a specific country. When I use a Prepare, Filter, or Python recipe, the output dataset is empty, even though in the Prepare recipe we can see some rows for that specific country.

This is the first time I have faced this issue. Can you please tell me what the issue could be and why the output dataset has 0 rows?

Thanks in advance.

SarinaS
Dataiker

Hi @vinayk,

I think it would be easiest if we could take a look at a job diagnostic for the job(s) that produce the empty output datasets. I would also suggest running a simple Sync recipe, or a Prepare recipe with no steps, to verify whether it produces the expected output data. It would also be helpful to see some screenshots of the input and output datasets showing the data that should be included in the output dataset. To pass along the job diagnostic, you can either chat with us via the "chat" icon on our website or open a support ticket with us.

Thank you!
Sarina

vinayk
Level 2
Author

Thank you @SarinaS. It was caused by a schema issue, and the Sync recipe helped.

vinayk
Level 2
Author

@SarinaS Now that the schema issue is fixed, filtering the data for a specific country is taking a long time. The dataset has 80 million records. Is there any way to reduce the filtering time?

SarinaS
Dataiker

Hi @vinayk,

Thank you for the update, I'm glad that you were able to resolve the schema issue! For the performance question, I do think we need to look at a job diagnostic for the job to see whether any performance improvements can be made. If you want, you can open a chat with us on our website or a support ticket to send us the diagnostic.

That will allow us to review all of your settings. In the meantime, to get a quick picture: I understand that your input dataset is a Snowflake dataset. What type of output dataset are you using? And what engine is your recipe running on?

Thanks,
Sarina
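
As a rough illustration of why the output dataset and engine questions above matter for an 80-million-row filter: when the filter runs inside the database, only the matching rows are returned, whereas a recipe that streams every row out of the database and filters afterwards has to move the whole table. The sketch below is not Dataiku or Snowflake code. It uses Python's built-in sqlite3 module as a stand-in for Snowflake, and the `sales` table, its columns, and the function names are all hypothetical, chosen only to show the two approaches side by side.

```python
import sqlite3

def build_demo_table(conn, n_rows=10_000):
    """Create a hypothetical 'sales' table; sqlite3 stands in for Snowflake here."""
    conn.execute("CREATE TABLE sales (id INTEGER, country TEXT, amount REAL)")
    countries = ["IN", "US", "FR", "DE"]
    rows = [(i, countries[i % 4], float(i)) for i in range(n_rows)]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

def filter_client_side(conn, country):
    """Pull every row to the client, then filter in Python (costly at scale)."""
    all_rows = conn.execute("SELECT id, country, amount FROM sales").fetchall()
    return [row for row in all_rows if row[1] == country]

def filter_in_database(conn, country):
    """Push the predicate into SQL so only matching rows leave the database."""
    query = "SELECT id, country, amount FROM sales WHERE country = ?"
    return conn.execute(query, (country,)).fetchall()

conn = sqlite3.connect(":memory:")
build_demo_table(conn)

client_rows = filter_client_side(conn, "IN")
pushed_rows = filter_in_database(conn, "IN")

# Both approaches return the same rows; they differ in how much data is moved.
print(len(client_rows), len(pushed_rows))
```

Both functions return the same result, but the first one transfers the full table before discarding most of it. Whether a given recipe can run its filter in the database depends on the input and output dataset types and the selected engine, which is why the job diagnostic is still the reliable way to find the actual bottleneck.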
