Changing data during a join

Amarilla · March 2021

Good morning all!

I am a student using Dataiku, and I am stuck on one point.

At this stage I made a join between two datasets, but the data of a column has been changed.

The column holds review sources: we go from 5,975 Qualitelis, 3,862 Booking, and 118 Trip before the join to 8,838 Booking and 1,162 Tripodvisor after it.

Am I making a mistake? Thanks in advance :D

Andrey · March 2021

Hi, sorry for the late response. Did you try running the analysis on the whole column's data and not just on a sample?

You can select the whole data in the dropdown that currently says "Sample".
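To illustrate why the sample matters, here is a minimal pandas sketch. It is not Dataiku code, and the data, the column name `source`, and the helper `count_values` are all made up for the example. It only shows that counts computed on the first rows of a dataset can differ from counts on the whole column. As a side note, the post-join counts quoted above add up to exactly 10,000 (8,838 + 1,162), which would be consistent with counts taken on a fixed-size sample, though that is only a guess.

```python
import pandas as pd

# Hypothetical data: 12 rows, ordered so the first rows are not representative.
full = pd.DataFrame({"source": ["Booking"] * 6 + ["Qualitelis"] * 4 + ["Trip"] * 2})

def count_values(df, column, sample_size=None):
    """Count the values of a column, optionally on the first `sample_size` rows only."""
    data = df if sample_size is None else df.head(sample_size)
    return data[column].value_counts().to_dict()

# Counts on a "sample" made of the first 8 rows vs. counts on all rows.
sample_counts = count_values(full, "source", sample_size=8)
full_counts = count_values(full, "source")

print("sample:", sample_counts)
print("full:  ", full_counts)
```

In this toy case the sample misses the "Trip" value entirely and undercounts "Qualitelis", so the distribution looks different even though the underlying data is unchanged.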

Andrey · March 2021

Hello,

If I understood the question correctly, you can go to the dataset settings -> schema and redetect the schema from the changed data. Then you can use the schema propagation tool on the flow to apply the new schema downstream. If needed, you can also change the join recipe settings to join the data differently.

Amarilla · March 2021

Thanks for your feedback, @Andrey! In the settings I can't find the option to redetect the schema from the changed data.

Andrey · March 2021

It's the "Check now" button under the "Schema" tab


Amarilla · March 2021

It tells me this: "The schema and the data are consistent."

Unfortunately, that didn't solve my problem.

Andrey · March 2021

Looks like I didn't understand the question. What did you mean by "but the data of a column has been changed"?

Is it the data in one of the datasets that got changed? Did the structure of that data change (e.g. did the schema become different)?

Amarilla · March 2021

In my first dataset I have a column containing 5,975 rows with the value "Qualitelis", 3,862 with "Booking", and 118 with "Trip".

In the output of my join, this same column contains 8,838 "Reservation" and 1,162 "Tripodvisor".

I don't know if that is clearer.

Andrey · March 2021

Could you please send a screenshot of both datasets' contents and of the join recipe settings (the tab with all join conditions), so we can see exactly how they're being joined?
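In the meantime, here is a rough pandas sketch of two ways a join can alter a column's value counts: a key that matches several rows on the other side (which duplicates rows), and a same-named column that comes from the other dataset. This is not how the Dataiku join recipe is implemented, and the tables `reviews` and `hotels`, the key `hotel_id`, and all values are hypothetical; it may or may not match what is happening in your flow.

```python
import pandas as pd

# Hypothetical inputs: both tables carry a column named "source".
reviews = pd.DataFrame({
    "hotel_id": [1, 1, 2, 3],
    "source": ["Qualitelis", "Booking", "Booking", "Trip"],
})
hotels = pd.DataFrame({
    "hotel_id": [1, 2, 2, 3],  # hotel 2 appears twice, so its matches are duplicated
    "source": ["Reservation", "Reservation", "Reservation", "Tripodvisor"],
})

# Left join on the key; suffixes keep the two "source" columns apart.
joined = reviews.merge(hotels, on="hotel_id", how="left", suffixes=("_left", "_right"))

before = reviews["source"].value_counts().to_dict()
after_left = joined["source_left"].value_counts().to_dict()
after_right = joined["source_right"].value_counts().to_dict()

print("rows before/after:", len(reviews), len(joined))
print("before:     ", before)
print("after left: ", after_left)
print("after right:", after_right)
```

If the output column you are looking at is actually the one from the second dataset, its values will look "replaced", and duplicated keys will shift the counts even for the column from the first dataset. The join conditions and selected columns in the recipe should tell us which case applies.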