
Encrypted Excel files

Gwosa-Sweden
Level 1

Does anyone have experience loading an encrypted Excel file?

3 Replies
Turribeach
Level 6

The msoffcrypto-tool Python package may be one approach, but it doesn't support the latest Excel formats. What format is your Excel file in?

https://github.com/nolze/msoffcrypto-tool
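
As a rough way to answer the format question: a workbook that needs a password to open is normally stored in an OLE compound-file container, while an unprotected .xlsx is a plain ZIP archive, so the first few bytes give a hint. The sketch below uses only the standard library; the name `sniff_container` is made up for illustration. A legacy .xls file is also an OLE container even when it is not encrypted, so "ole" is only a hint, not proof of encryption.

```python
# Well-known magic numbers at the start of each container type.
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE compound file (.xls, encrypted .xlsx)
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP archive (plain .xlsx)


def sniff_container(header: bytes) -> str:
    """Classify a file by its leading bytes (hypothetical helper)."""
    if header.startswith(OLE_MAGIC):
        return "ole"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

With a managed folder, the header could be obtained by reading the first 8 bytes of the stream returned by `get_download_stream(path)`.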

greistaen
Level 1

Could you provide Python code to:

1. read Excel files from non-local source folders

2. test whether they are password-protected

3. if so, decrypt them with the password using msoffcrypto-tool

4. save the unprotected Excel files back to the source folder

 

Thanks.
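
The four steps above could be sketched roughly as follows. This is a minimal sketch, not a tested recipe: it assumes a `dataiku.Folder`-like object exposing `list_paths_in_partition()`, `get_download_stream()` and `upload_stream()`, and a recent msoffcrypto-tool release in which `OfficeFile` objects provide `is_encrypted()`. The helper names (`open_office_file`, `decrypt_if_needed`, `decrypt_folder`) are hypothetical, and the sketch overwrites each protected file in place, so it is worth trying on a copy of the folder first.

```python
import io
import shutil


def open_office_file(stream):
    """Wrap a seekable stream with msoffcrypto (imported lazily)."""
    import msoffcrypto  # third-party package: msoffcrypto-tool
    return msoffcrypto.OfficeFile(stream)


def decrypt_if_needed(raw, password, opener=open_office_file):
    """Return (buffer, was_encrypted) for one workbook given as bytes."""
    office_file = opener(io.BytesIO(raw))
    if not office_file.is_encrypted():  # assumed available in your version
        return io.BytesIO(raw), False
    office_file.load_key(password=password)
    decrypted = io.BytesIO()
    office_file.decrypt(decrypted)
    decrypted.seek(0)  # rewind so the next reader starts at byte 0
    return decrypted, True


def decrypt_folder(folder, password, opener=open_office_file):
    """Decrypt every protected .xlsx in the folder, overwriting it in place."""
    changed = []
    for path in folder.list_paths_in_partition():
        if not path.lower().endswith(".xlsx"):
            continue
        raw = io.BytesIO()
        with folder.get_download_stream(path) as stream:
            shutil.copyfileobj(stream, raw)  # download stream is not seekable
        buffer, was_encrypted = decrypt_if_needed(raw.getvalue(), password, opener)
        if was_encrypted:
            folder.upload_stream(path, buffer)
            changed.append(path)
    return changed
```

In a recipe this might be called as `decrypt_folder(dataiku.Folder("input"), password)`, with the password taken from a project variable or secret rather than hard-coded.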


I tried the code below. However, the output Excel file could not be read by Create Dataset, which showed this error message:

  •  Used /NEW_SPREADSHEET.xlsx (244.18 KB) to parse data
  •  Failed to detect file format. Please manually fix

Even when I manually select Excel as the format, it still cannot be loaded into the preview.

 

==code==

 

 

import io
import shutil
import dataiku
import msoffcrypto, openpyxl


# Read recipe inputs
source = dataiku.Folder("input")
source_info = source.get_info()

paths = source.list_paths_in_partition()

# Write recipe outputs
target = dataiku.Folder("output")
target_info = target.get_info()


for path in paths:
    decrypted = io.BytesIO()
    with source.get_download_stream(path) as input_file:
        with io.BytesIO() as seekable:
            shutil.copyfileobj(input_file, seekable)
            file = msoffcrypto.OfficeFile(seekable)
            file.load_key(password="xxxxx")  # Use password
            file.decrypt(decrypted)
            
            xlfile = openpyxl.load_workbook(decrypted)
            xlfile.save(decrypted)
            decrypted.seek(0)
            target.upload_stream("NEW_SPREADSHEET.xlsx", decrypted)
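
One possible cause of the unreadable output, offered as a guess rather than a confirmed diagnosis: `xlfile.save(decrypted)` writes into the same `BytesIO` that still holds the decrypted bytes, starting at whatever position openpyxl left it in, so the re-saved archive may end up after or on top of the earlier content instead of replacing it. The standard-library sketch below illustrates the effect with `zipfile` standing in for `workbook.save`; `write_to_fresh_buffer` and `make_demo_zip` are hypothetical helper names.

```python
import io
import zipfile


def write_to_fresh_buffer(write_fn):
    """Run write_fn(buffer) against a new BytesIO and return it rewound.

    write_fn is any callable that writes a complete file to a binary
    stream, for example the bound method `save` of an openpyxl workbook.
    """
    fresh = io.BytesIO()
    write_fn(fresh)
    fresh.seek(0)  # rewind so the next reader starts at byte 0
    return fresh


def make_demo_zip(stream, name, payload):
    """Stand-in for workbook.save: writes a small ZIP archive to stream."""
    with zipfile.ZipFile(stream, "w") as archive:
        archive.writestr(name, payload)


# Reusing one buffer: the second archive is written after the first one.
reused = io.BytesIO()
make_demo_zip(reused, "old.xml", b"old")
make_demo_zip(reused, "new.xml", b"new")

# Fresh buffer: holds only the archive that was just written.
fresh = write_to_fresh_buffer(lambda s: make_demo_zip(s, "new.xml", b"new"))

print(len(reused.getvalue()), len(fresh.getvalue()))
```

Applied to the recipe, that would mean either uploading the result of `write_to_fresh_buffer(xlfile.save)`, or dropping the openpyxl round trip and uploading `decrypted` directly after `decrypted.seek(0)`. Deriving the output name from `path` instead of the fixed "NEW_SPREADSHEET.xlsx" would also keep several input files from overwriting one another.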

 

 
