Exporting data traceable in log files

MRvLuijpen · May 2020

Hello Dataiku Community,

I was wondering: if a user exports data outside of DSS, what traces can be found in the log files?

I know there are several ways to export data, and I would like to know which of them can be traced in the log:

- the main project menu "Export this project", which results in a ZIP download

- the dataset Action menu "Export", which lets the user download/export a dataset

- selecting and copy/pasting from inside a dataset

- Python/R code, which can export data to files that are connected to DSS

From a data security point of view, it is sometimes necessary to restrict or trace the export of sensitive information.

The second part of this question: what are the possibilities for restricting the export of data? We did find this link, but it seems to apply at the DSS instance level: https://doc.dataiku.com/dss/latest/security/advanced-options.html#restricting-exports

Clément_Stenac · May 2020

Hi,

User actions can be traced using the DSS audit log (https://doc.dataiku.com/dss/latest/security/audit-trail.html), i.e. the JSON files in the "run/audit" folder of DSS.

Each line contains a msgType indicating the action, plus additional details about it (such as the name of the project, dataset, ...).

- Exporting a project can be tracked by looking for "msgType": "project-export-download".
- Exporting a dataset can be tracked with "msgType": "dataset-export".
- Copy-pasting is a purely client-side action that cannot be tracked by any means. It will always be preceded by a dataset-read-data-sample event, though.
- For custom code, it's structurally impossible to know what people "do" with the data they get, since it's purely arbitrary code. However, you can track which datasets were read using "msgType": "dataset-read-data".
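
For anyone who wants to scan the audit files for these events, here is a minimal sketch rather than an official tool. It assumes every line of each file in the audit folder is a standalone JSON object, and that msgType sits either at the top level or nested under a "message" key; that nesting is an assumption to check against your own files. The function names are hypothetical.

```python
import json
from pathlib import Path

# msgType values mentioned in the answer above.
TRACKED_TYPES = {
    "project-export-download",
    "dataset-export",
    "dataset-read-data-sample",
    "dataset-read-data",
}


def extract_msg_type(record):
    """Return the msgType of one parsed audit line, or None.

    Assumption: msgType is either a top-level key or nested under a
    "message" object; adjust this to the layout of your own audit files.
    """
    if not isinstance(record, dict):
        return None
    if "msgType" in record:
        return record["msgType"]
    message = record.get("message")
    if isinstance(message, dict):
        return message.get("msgType")
    return None


def find_export_events(audit_dir, tracked=TRACKED_TYPES):
    """Yield (file name, line number, parsed record) for tracked events."""
    for path in sorted(Path(audit_dir).iterdir()):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line_no, line in enumerate(handle, start=1):
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip blank or non-JSON lines
                if extract_msg_type(record) in tracked:
                    yield path.name, line_no, record
```

Point it at the "run/audit" folder of your DSS data directory, for example `for name, line_no, record in find_export_events("run/audit"): print(name, line_no, record)`, and filter further on whatever project or dataset fields your records carry.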

We confirm that the restrictions you found are the ones that are implemented. It's important to understand that it's structurally impossible to completely prevent data export, if only because users can "see" data in DSS, which at the very least allows them to take a picture of their screen. It's more a matter of having an appropriate level of restrictions against errors and as much tracing as feasible, but this is necessarily incomplete in a coding environment.

MRvLuijpen · May 2020

Hello Clément,

Thank you for your response.
