More intuitive security and access control for DSS

tgb417
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron

User Story

As a part-time administrator of a few DSS instances, I would like a more intuitive and discoverable way to set up DSS security and access control: one that helps me understand my use case and make the matching adjustments across the various security settings spread throughout the system. This would give me more confidence in deploying important applications to users throughout my organization.

Questions:

  • Does this need to be looked at from scratch?
  • Does this suggest the need for an alternate, overall security module that supplements the security settings spread throughout the system?
  • Is a DSS Academy security course needed?

Notes:

This idea started as a question about:

As an administrator working with inexperienced end users, I'd like to share an application that can use read-only datasets shared from the original project. The need to copy datasets when using an application limits the size of dataset that is realistic to use with applications.

COS

This should be an option. The current behavior should continue to be the default behavior for backward compatibility.

Below is a thread about setting up this use case. It ended up somewhere else entirely: how a group of people working together can be given read access to a dataset.


Comments

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    It depends on the specific scenario, but breaking down your template project and making use of shared datasets might be a solution for what you describe. See the pattern below:

    Screenshot 2021-07-05 at 14.44.39.png

  • tgb417

    @Manuel, I think that “shared datasets” is the key idea here, but I’m not completely clear on what you are describing.

    In your example you have two projects: one doing the data ETL, and the application project consuming some of that data.

    In the ETL project, the authors of that project are working with a connection to a PostgreSQL database. I understand the idea of using a scenario to keep a dataset fresh. I don’t understand exactly what you mean by “user group has read access to connection”. How does group security work on connections? Does one need multiple connections to the same underlying data source? Does one need multiple database accounts? I’m not clear on exactly how to set something like this up.

    Is there current documentation or training materials describing this set of techniques?

  • Manuel

    The use of the shared dataset is what mitigates the need for duplicating a large dataset with every application instance. https://doc.dataiku.com/dss/latest/security/exposed-objects.html

    Connections can be restricted to selected security user groups. So, when you are using a shared dataset, you need to make sure that the users who will instantiate applications have at least read access to that connection, so that they can read the shared dataset. These users won't need access to the ETL project.
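
    To make the rule above concrete, here is a toy Python model of it. It is not DSS code and does not call any DSS API: the `Connection` and `User` classes, the `readable_by` field, and the group names are all hypothetical, and the check only encodes the statement that a user needs group-level read access to the connection behind the shared dataset.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Hypothetical stand-in for a DSS connection (not the real object model)."""
    name: str
    # Groups granted read access to this connection (assumed field name).
    readable_by: set = field(default_factory=set)


@dataclass
class User:
    """Hypothetical stand-in for a DSS user and the groups they belong to."""
    login: str
    groups: set = field(default_factory=set)


def can_read_shared_dataset(user: User, connection: Connection) -> bool:
    """Return True when the user shares at least one group with the connection's readers.

    Membership in the ETL project itself is deliberately not part of the check.
    """
    return bool(user.groups & connection.readable_by)


# Illustrative data only: names are made up for this sketch.
etl_connection = Connection("postgres_curated", readable_by={"etl_authors", "app_users"})
app_user = User("alice", groups={"app_users"})
outsider = User("bob", groups={"marketing"})

print(can_read_shared_dataset(app_user, etl_connection))
print(can_read_shared_dataset(outsider, etl_connection))
```

    In this sketch the application user can read the shared dataset purely through group membership on the connection, which is the point of the pattern: no copy of the dataset and no access to the ETL project.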

    I hope this makes it clear.

  • tgb417

    @Manuel, I think that is making a bit more sense. I’m not somewhere I can look at a DSS instance at the moment, so I’m not sure I fully understand this yet. I’ll follow up if I have further questions.

  • tgb417

    @Manuel, going somewhat beyond this initial use case (outside the scope of applications):

    Is there a way to share a curated set of datasets not with other projects, but with specific other groups and users who have read-only access to specific datasets?

    Based on your description above, do you need to do something like this?

    • Create an ETL project ->
    • Share specific datasets with a sharing project ->
    • Give the people who need to see those datasets access to that sharing project.

  • Manuel

    To do what you now ask, ignore the application pattern:

    • To share datasets with specific user groups, use the connection's "freely usable by" configuration
    • To make the datasets read-only, use the connection's "allow write" configuration

    So, you could configure two connections pointing to the same database:

    • The first, only accessible by you, to write the curated datasets
    • The second, accessible to specific groups, to read the curated datasets
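
    As a rough sketch of the two-connection idea, the Python below derives a read-only "reader" definition from a "writer" definition. It works on plain dictionaries only. The key names `usableBy`, `allowedGroups`, and `allowWrite` are my assumption of how the "freely usable by" and "allow write" settings might be represented, and the host, database, and group names are made up; check the real connection settings on your own instance before relying on any of this.

```python
import copy


def make_read_only_variant(definition: dict, reader_groups: set) -> dict:
    """Return a copy of a connection definition restricted to readers.

    Assumed keys (not confirmed against DSS):
      usableBy      -> "ALLOWED" means only the listed groups may use the connection
      allowedGroups -> the groups for the "freely usable by" setting
      allowWrite    -> False makes datasets on this connection read-only
    """
    variant = copy.deepcopy(definition)  # leave the writer definition untouched
    variant["usableBy"] = "ALLOWED"
    variant["allowedGroups"] = sorted(reader_groups)
    variant["allowWrite"] = False
    return variant


# First connection: only the curator's group, with write access (illustrative values).
writer_definition = {
    "type": "PostgreSQL",
    "params": {"host": "db.example.internal", "db": "curated"},
    "usableBy": "ALLOWED",
    "allowedGroups": ["etl_authors"],
    "allowWrite": True,
}

# Second connection: same database, specific reader groups, no write access.
reader_definition = make_read_only_variant(writer_definition, {"analysts", "app_users"})

print(reader_definition["allowedGroups"], reader_definition["allowWrite"])
```

    Both definitions keep the same `params`, so they point at the same database; only the group list and the write flag differ.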

    I have never had to do this, but I hope it helps.
