Setting up Event Server

clayms Registered Posts: 52 ✭✭✭✭

I have an enterprise Design Node instance controlled through a Fleet Manager.

I have followed these instructions: https://doc.dataiku.com/dss/latest/operations/audit-trail/eventserver.html

I have also seen several of the other posts here:

https://community.dataiku.com/t5/Setup-Configuration/DSS-event-server-setup/m-p/22948

https://community.dataiku.com/t5/Setup-Configuration/DSS-Event-server-setup-Destination/m-p/29870

I still cannot get the event server working.

What should the URL be: the IP address of the Design Node, the https URL we have registered that redirects to the Design Node, or just `localhost`?

What should the port number be? When I go to the S3 bucket "connection" that I want the logs sent to, I don't see any port in the URL, although some posts have said that is where to get it from.

What should the "Write as user" be: `admin`, or any user that has credentials registered in their Dataiku user's "Personal Credentials"? If it is a user name, which one: `dssuser_first_last`, `First.Last`, `@firstlast`, `first.last@corp-email.com`? The user names are inconsistent across the entire platform.

What should be put in the "Path within connection"? The connection I want to use is named the same as the bucket. Do I need to include the bucket name in the Path? Do I need to include a leading and/or trailing `/` in the Path?

Where can I see the logs of the Event Server?

The instructions tell me to stop DSS and then install the Event Server with a shell command. However, if I stop DSS, then I can no longer SSH into it to run the command. This makes no sense. This might be because our DSS is controlled by Fleet Manager, but it is a big discrepancy in the documentation.


Operating system used: CentOS


Answers

  • Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209

    Hi @clayms,

    To enable the Event Server on a Fleet Manager-controlled instance, you can set the event server through FM in the virtual network settings:

    [Screenshot: event server setting in the FM virtual network]

    For more details, please see:

    https://knowledge.dataiku.com/latest/kb/setup-admin/dss-and-aws/modify-on-aws.html

    The node name is the name of an existing node as defined in Fleet Manager (it does not have to be provisioned yet). After enabling this, the instance(s) must be re-provisioned.

    Let us know if you have any issues.

  • clayms Registered Posts: 52 ✭✭✭✭

    That was already set to the current design node. However, it still is not working. I want the logs sent to an S3 bucket. How do I do that?

    This link does not say anything about that. https://knowledge.dataiku.com/latest/kb/setup-admin/dss-and-aws/modify-on-aws.html

  • Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209

    After you enable the event server in the FM virtual network and re-provision the node, you can then edit Settings - Event Server to point to an S3 connection you defined.

    [Screenshots: Settings - Event Server and the S3 connection target]

    This design node will then write to this location.


    For another node that you want to send events to the automatically installed event server, you can use the private IP of the node you defined as the event-server node.

    [Screenshot: audit target pointing to the event-server node's private IP]

    If this is not what you are seeing, we may want to continue this over a support ticket.

    Hope this helps.
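
    If you want to double-check what the UI wrote, here is a minimal sketch using the public dataikuapi client. get_general_settings() is a real call; the exact key names that hold the event-server and audit-target configuration vary by DSS version, so the snippet searches the raw settings dict rather than assuming names. The instance URL and API key are placeholders.

    # Inspect event-server / audit-related settings via the DSS public API.
    # Placeholders: the instance URL and an admin API key.
    import dataikuapi

    client = dataikuapi.DSSClient("https://<design-node-url>", "<admin-api-key>")
    general = client.get_general_settings()
    for key, value in general.get_raw().items():
        # Key names are version-dependent, so search instead of hard-coding.
        if "audit" in key.lower() or "event" in key.lower():
            print(key, "->", value)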

  • clayms Registered Posts: 52 ✭✭✭✭

    I had done all of this before even creating this post. It is still not working, which leaves all of my original questions:

    What should the Settings > Auditing > Targets > Destination URL be: the IP address of the Design Node, the https URL we have registered that redirects to the Design Node, or just `localhost`?

    What should the Settings > Auditing > Targets > Destination URL:PORT number be? When I go to the S3 bucket "connection" that I want the logs sent to, I don't see any port in the URL, although some posts have said that is where to get it from.

    What should the Settings > Event Server > "Write as user" be: `admin`, or any user that has credentials registered in their Dataiku user's "Personal Credentials"? If it is a user name, which one: `dssuser_first_last`, `First.Last`, `@firstlast`, `first.last@corp-email.com`? The user names are inconsistent across the entire platform.

    What should be put in the Settings > Event Server > "Path within connection"? The connection I want to use is named the same as the bucket. Do I need to include the bucket name in the Path? Do I need to include a leading and/or trailing `/` in the Path?

    Where can I see the logs of the Event Server?

  • Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209

    Hi @clayms

    Thanks for the clarifications. To answer your questions about auditing setup:

    1) Settings > Auditing > Targets > Destination URL

    All 3 would work; the easiest (if the event server is local) is http://localhost:10000

    You can also use your registered https:// URL or http://<private-ip>:10000. (A reachability sketch follows after this list.)

    2) If you can use admin, it would be easier. If you want to use a specific user, they would need to have write access to that S3 connection. The actual username would be the value of "login" as seen under Administrator - Security - Users.

    3) No need to include the bucket name; the "path in bucket" is relative to the root of the bucket defined in the connection.

    So if you leave it blank, it will create 2 folders, apicall and generic, in the root of the S3 bucket defined in the connection.

    If you want to write within a subpath, you can simply define the path in bucket as e.g. "audit-data-dss".
    Trailing/leading slashes are ignored. Slashes in between, e.g. auditpart1/auditpart2, are preserved. (A listing sketch follows after this list.)

    4) To check the logs, go to Administrator - Maintenance - Logs - eventserver.log.
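
    Two hedged sketches to go with 1) and 3). First, a minimal reachability check for the Destination URL, assuming the event server is local on port 10000 as above; it only proves that something is listening on the port, not that events are accepted:

    # TCP reachability check for the audit target (host and port assumed).
    import socket

    with socket.create_connection(("localhost", 10000), timeout=5):
        print("event server port is reachable")

    Second, once events are flowing, you can verify the folder layout described in 3) with boto3. The bucket name and path are placeholders; credentials are taken from your environment:

    # List what the event server has written; expect keys under
    # audit-data-dss/apicall/... and audit-data-dss/generic/...
    import boto3

    s3 = boto3.client("s3")
    resp = s3.list_objects_v2(Bucket="my-audit-bucket", Prefix="audit-data-dss/")
    for obj in resp.get("Contents", []):
        print(obj["Key"])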

    Let me know if you have any further issues.

  • clayms Registered Posts: 52 ✭✭✭✭

    The event server log error is below. I have tried `admin` and a user that has explicit write access to that bucket saved in their user profile's connection credentials.

    java.io.IOException: Failed to connect to S3
     at com.dataiku.dip.datasets.fs.S3FSProvider.tryCode(S3FSProvider.java:230)
     at com.dataiku.dip.datasets.fs.S3FSProvider.initClientIfNeeded(S3FSProvider.java:267)
     at com.dataiku.dip.datasets.fs.S3FSProvider.write(S3FSProvider.java:623)
     at com.dataiku.dip.output.ResplittableExtensibleFileOutputWriter.init(ResplittableExtensibleFileOutputWriter.java:130)
     at com.dataiku.dip.eventserver.targets.ConnectionPathTarget.getOrOpenPartition(ConnectionPathTarget.java:254)
     at com.dataiku.dip.eventserver.targets.ConnectionPathTarget.process(ConnectionPathTarget.java:290)
     at com.dataiku.dip.eventserver.ProcessingQueue$QueueHandler.run(ProcessingQueue.java:61)
     at java.lang.Thread.run(Thread.java:750)
    Caused by: org.springframework.beans.factory.NoSuchBeanDefinitionException: No qualifying bean of type [com.dataiku.dip.security.model.ICredentialsService] is defined
     at org.springframework.beans.factory.support.DefaultListableBeanFactory.getBean(DefaultListableBeanFactory.java:296)
     at com.dataiku.dip.server.SpringUtils.getBean(SpringUtils.java:46)
     at com.dataiku.dip.connections.ConnectionCredentialUtils.getDecryptedCredential_autoTXN(ConnectionCredentialUtils.java:17)
     at com.dataiku.dip.connections.ConnectionCredentialUtils.getDecryptedBasicCredential_autoTXN(ConnectionCredentialUtils.java:13)
     at com.dataiku.dip.connections.EC2Connection.getResolvedCredential(EC2Connection.java:319)
     at com.dataiku.dip.connections.EC2Connection.getS3Client(EC2Connection.java:436)
     at com.dataiku.dip.datasets.fs.S3FSProvider.initClientIfNeeded(S3FSProvider.java:265)
        ... 6 more

  • Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209

    Hi @clayms,

    Based on the stack trace, this resembles an S3 permissions issue. Depending on how you set up your IAM policy, it may be related to the actual S3 path ("Resource") that DSS is attempting to write to.

    You can test your IAM policy using https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_testing-policies.html

    The permissions required by DSS are listed here:

    https://doc.dataiku.com/dss/latest/connecting/s3.html#required-s3-permissions

    For DSS-managed datasets in an S3 connection, DSS does not write to the root of the bucket; instead it uses bucket/"Managed data subpath" or bucket/"Managed data subpath"/"Path in bucket" as defined in the connection. Perhaps your IAM policy was based on this and does not allow writing to the path currently defined for events.
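
    If you want to test this outside of DSS, here is a hedged sketch using boto3's IAM policy simulator. The principal ARN, bucket, and prefix are placeholders for your own values, and the action list should mirror the required-permissions page linked above:

    # Simulate whether the connection's IAM identity can perform the S3
    # actions DSS needs on the event-server path. Replace all ARNs.
    import boto3

    iam = boto3.client("iam")
    resp = iam.simulate_principal_policy(
        PolicySourceArn="arn:aws:iam::123456789012:user/dss-s3-connection",  # hypothetical
        ActionNames=["s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:ListBucket"],
        ResourceArns=[
            "arn:aws:s3:::my-audit-bucket",                   # for s3:ListBucket
            "arn:aws:s3:::my-audit-bucket/audit-data-dss/*",  # for object actions
        ],
    )
    for result in resp["EvaluationResults"]:
        print(result["EvalActionName"], "->", result["EvalDecision"])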

    If the above doesn't help, I would suggest we continue over a support ticket with the instance diagnostics and the exact IAM policy used for that S3 connection, so we can help further.

    Thanks,

  • Abdoulaye Dataiku DSS Core Designer, Registered Posts: 42 ✭✭✭✭✭

    Thanks for this exchange; I have followed it and my configuration works well. But how can I get the most out of these event logs? Is there any documentation about them?

    Thanks @AlexT and @clayms

  • Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209

    Hi @abdoulaye,
    Once you have the event server and audit log dispatching set up, you can leverage the logs in the following ways:

    1) Follow this solution, https://knowledge.dataiku.com/latest/solutions/governance/solution-cru.html, which will allow you to track compute usage, estimated costs per project, etc.

    2) For API query logs, you can now leverage the monitoring feedback loop from API queries back to your source models, so you can monitor and improve your models: https://knowledge.dataiku.com/latest/mlops-o16n/model-monitoring/tutorial-api-endpoint-monitoring-basics.html
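
    If you want to explore the raw files directly, here is a hedged sketch. The bucket and key are placeholders, and the assumption that the event server writes gzipped JSON lines is mine; check the actual names and format of the files in your bucket first.

    # Load one event-server output file into pandas for ad-hoc analysis.
    import gzip
    import json

    import boto3
    import pandas as pd

    s3 = boto3.client("s3")
    obj = s3.get_object(
        Bucket="my-audit-bucket",
        Key="audit-data-dss/generic/example.log.gz",  # placeholder key
    )
    lines = gzip.decompress(obj["Body"].read()).splitlines()
    df = pd.DataFrame([json.loads(line) for line in lines])
    print(df.head())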
