joins in spark engine

Mahesh_M
Level 1

When I use a left join in the Join recipe with the Spark engine and Parquet datasets, it does not give the expected results.

I have 2M records in the left table and 1M in the right table, but the result has only 2 records.

4 Replies
SarinaS
Dataiker

Hi @Mahesh_M,

Can you please open a support ticket, and attach a job diagnostic for the job that results in the unexpected number of output rows? In addition, can you make sure to attach screenshots for both input datasets highlighting several rows of data that you expected to join together based on the join condition that then do not appear in the output dataset? That should help with troubleshooting!  
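In the meantime, a quick sanity check on a small sample can help pick the rows to highlight. The sketch below is plain Python, not the Join recipe or the Spark API, and the names `left_join`, `unmatched_left_keys`, `id`, `name`, and `city` are hypothetical. It illustrates two things: a left join should return at least as many rows as the left input, and listing the left keys that have no match on the right shows which rows to look at. The string-versus-integer key in the sample is only one assumed way a match can fail, not a diagnosis of this job.

```python
from collections import defaultdict


def left_join(left_rows, right_rows, key):
    """Left join two lists of dicts on `key`.

    Left rows without a match are kept, with None for the right-side columns.
    """
    index = defaultdict(list)
    for r in right_rows:
        index[r[key]].append(r)
    right_cols = {c for r in right_rows for c in r if c != key}

    out = []
    for l in left_rows:
        matches = index.get(l[key])
        if matches:
            for r in matches:
                out.append({**l, **{c: r.get(c) for c in right_cols}})
        else:
            out.append({**l, **{c: None for c in right_cols}})
    return out


def unmatched_left_keys(left_rows, right_rows, key):
    """Return the left key values that have no equal key on the right."""
    right_keys = {r[key] for r in right_rows}
    return [l[key] for l in left_rows if l[key] not in right_keys]


# Hypothetical sample data: the right side stores id 2 as the string "2",
# so it does not equal the integer 2 on the left.
left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
right = [{"id": 1, "city": "x"}, {"id": "2", "city": "y"}]

joined = left_join(left, right, "id")
missing = unmatched_left_keys(left, right, "id")

print(len(left), len(joined))  # output row count is never below the left count
print(missing)                 # left keys worth highlighting in the screenshots
```

If the real output has fewer rows than the left input, something other than the join type itself (for example a filter, or a different join type being applied) is probably involved, and the job diagnostic should show that.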

Thanks,
Sarina

tgb417

@Mahesh_M,

Definitely follow up with support.

--Tom
vinayshankar
Level 1

Hi,

 

I am facing the same issue. I am using Dataiku version 9.0.3.

If your issue is resolved, can you please post the resolution here?

SarinaS
Dataiker

Hi @vinayshankar,

The issue is likely specific to your data and join setup. If you would like us to review it, please also open a support ticket and attach a job diagnostic for the job that results in the unexpected number of output rows. In addition, can you make sure to attach screenshots for both input datasets highlighting several rows of data that you expected to join together based on the join condition but that do not appear in the output dataset? That should help with troubleshooting!
 
Thanks,
Sarina
