How to manage .xlsb files

GeorgeAlex
Level 1

I have a requirement to read 7 Excel files as part of my source data.

I am selecting a folder and am supposed to grab all the Excel files available in that folder.

6 of the files are .xlsx files. However, 1 of the files is an .xlsb file.

I am getting formatting errors. Please let me know how I should proceed.


Operating system used: Windows

tgb417

@GeorgeAlex 

Do you need to read content from the .xlsb file or can you ignore it?

If you can ignore the .xlsb file, you might look at creating a dataset from a folder and adding a glob-based inclusion rule for *.xlsx. This should treat all 6 .xlsx files as one dataset and leave out the .xlsb file.
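To preview which files a *.xlsx glob would pick up before configuring the dataset, a small local check along these lines may help. This is only a sketch using Python's standard fnmatch module; the file names are hypothetical, the helper name split_by_glob is made up for illustration, and Dataiku's own glob matching may differ in details such as case sensitivity.

```python
from fnmatch import fnmatchcase


def split_by_glob(filenames, pattern="*.xlsx"):
    """Split file names into those matching the glob and those that do not.

    fnmatchcase is case-sensitive on every platform, so "REPORT.XLSX"
    would NOT match "*.xlsx" here. That is an assumption of this sketch,
    not a statement about how Dataiku evaluates its inclusion rules.
    """
    included = [name for name in filenames if fnmatchcase(name, pattern)]
    excluded = [name for name in filenames if not fnmatchcase(name, pattern)]
    return included, excluded


if __name__ == "__main__":
    # Hypothetical folder listing: six .xlsx files and one .xlsb file.
    listing = [f"source_{i}.xlsx" for i in range(1, 7)] + ["source_7.xlsb"]
    included, excluded = split_by_glob(listing)
    print("included:", included)
    print("excluded:", excluded)
```

With a listing like the one in the question, the six .xlsx names land in the included list and the single .xlsb name in the excluded list.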

The documentation seems to be a bit thin on this point. That said, I use this approach often. One caution: the worksheets in all of the .xlsx workbooks must have exactly the same name. If the worksheet names differ between workbooks, you will have problems with this approach.
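One way to check the worksheet-name caution before building the dataset is to list the sheet names of each workbook and compare them. The sketch below uses only the Python standard library and relies on the fact that an .xlsx file is a zip archive whose sheet names are listed in xl/workbook.xml. The function names are made up for illustration, and it assumes the usual (non-strict) spreadsheetml namespace. It does not work for .xlsb files, which store the workbook in a binary format.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by workbook.xml in ordinary .xlsx files (assumption:
# the files were not saved in "Strict Open XML" mode, which uses another one).
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_names(xlsx_path):
    """Return the worksheet names of one .xlsx file, in workbook order."""
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.findall("m:sheets/m:sheet", NS)]


def files_with_different_sheets(paths):
    """Return the files whose sheet names differ from the first file's."""
    paths = list(paths)
    if not paths:
        return []
    reference = sheet_names(paths[0])
    return [p for p in paths[1:] if sheet_names(p) != reference]
```

If files_with_different_sheets returns an empty list for the six .xlsx files, their worksheet names line up; any path it returns is a workbook that would need renaming or separate handling.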

You might find this thread of some interest.
https://community.dataiku.com/t5/Using-Dataiku/Data-Refresh-out-of-a-Managed-Folder/m-p/25259

If that does not work for your use case, this Dataiku plugin might help. I've not used it myself.

https://www.dataiku.com/product/plugins/excel-sheet-importer/

 

--Tom