SharePoint Files with Different Field Counts, but Same Field Names

munoj9
Level 2

Hello everyone, 

 

I am currently importing files from SharePoint that are sent to me via email on a weekly basis. 

Each file may have a different number of fields each time it is sent. 

I noticed that Dataiku is not appending the files by field name, and it is not creating null values for the fields that are missing from a given file.

 

Instead, it shifts the values to the left to fill the columns in order.

example:

data1.csv

name,lastname,money,activity

jj,mounts,1000,100

 

data2.csv

name,lastname,activity

jj,mounts,100

 

SharePoint dataset built in Dataiku

name  lastname  money  activity
jj    mounts    1000   100
jj    mounts    100

 

 

The result above is wrong, but it is what Dataiku produces.

For the second row, it should have left money blank and put 100 in activity.

How can I fix this?

6 Replies
Turribeach

How exactly are you loading these files? Describe your flow. 

munoj9
Level 2
Author

I built a flow that creates a dataset with the SharePoint plugin, then I use R to pull the data from that dataset, apply transformations, and join it to other datasets.

The issue is at the very beginning when Dataiku pulls in the data from Sharepoint though. 

Turribeach

It's still not clear to me how your flow works. Do you have one dataset per file, or are you reusing the same dataset for all files? If it's the latter, that's the problem: a dataset has a defined schema, and you can't change the files behind it without updating the schema. This might be a case for a Python recipe, where you can handle the different schemas programmatically.
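As a rough sketch of what "handling different schemas programmatically" could look like (plain pandas, not an official Dataiku method): read each file with its own header, then stack the rows so they line up by column name rather than by position. The `FILES` dictionary and the `stack_by_column_name` helper are made up for illustration, and the sample contents are copied from the example in the question. In a real recipe you would read the weekly files from wherever they land instead of from inline strings.

```python
import io

import pandas as pd

# Hypothetical stand-in for the weekly files; contents copied from the question.
FILES = {
    "data1.csv": "name,lastname,money,activity\njj,mounts,1000,100\n",
    "data2.csv": "name,lastname,activity\njj,mounts,100\n",
}


def stack_by_column_name(files):
    """Read each CSV with its own header and stack rows aligned on column names."""
    frames = [pd.read_csv(io.StringIO(text)) for text in files.values()]
    # pd.concat aligns on column labels, so a column that is missing
    # from one file is filled with NaN for that file's rows.
    return pd.concat(frames, ignore_index=True, sort=False)


combined = stack_by_column_name(FILES)
print(combined)
```

With this approach, the second row keeps 100 under activity and gets a missing value under money, instead of having its values shifted left. The combined frame can then be written to an output dataset whose schema covers all of the columns.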

munoj9
Level 2
Author

I am reusing the same dataset for all files. 

I thought the process would give me an option to append the files based on the field names, which seems like a very simple ask.

Do you have any documentation on the Python solution? I would love to explore that option.

 

Thanks!

munoj9
Level 2
Author

Thank you very much, I will look into this!
