Using an EKS backend, can I suppress the "Reporting update of resource usage" status messages?

Peter_R_Knight
Peter_R_Knight Registered Posts: 32 ✭✭✭✭
edited July 16 in Using Dataiku

These get reported every minute, and on a long job (>1 hour) that makes it quite hard to see my own print/status messages. Can I adjust the frequency, e.g. to every 5 minutes? Or can I see these values in a dashboard/graph somewhere? Is there any documentation on what all the values mean?

Here is an example:

[07:03:33] [INFO] [dku.usage.computeresource.jek]  - Reporting update of resource usage: {"context":{"type":"JOB_ACTIVITY","authIdentifier":"108008065","projectKey":"GE90_ENGINE_REMOVAL_SIMULATION","jobId":"Build_engine_removal_predictions_2020-10-05T11-01-01.315","activityId":"compute_engine_removal_predictions_NP","activityType":"recipe","recipeType":"pyspark","recipeName":"compute_engine_removal_predictions"},"type":"LOCAL_PROCESS","id":"JkqpIjflMvXKEFZu","startTime":1601895693822,"localProcess":{"pid":22157,"commandName":"/u01/dataiku/dataiku-dss-8.0.2/spark-standalone-home/bin/spark-submit","cpuUserTimeMS":29880,"cpuSystemTimeMS":2070,"cpuChildrenUserTimeMS":230,"cpuChildrenSystemTimeMS":160,"cpuTotalMS":32340,"cpuCurrent":0.01199760047990402,"vmSizeMB":19579,"vmRSSMB":1058,"vmHWMMB":1058,"vmRSSAnonMB":998,"vmDataMB":19366,"vmSizePeakMB":19579,"vmRSSPeakMB":1058,"vmRSSTotalMBS":85920,"majorFaults":0,"childrenMajorFaults":0}}
[07:03:33] [DEBUG] [dku.resource]  - Process stats for pid 22157: {"pid":22157,"commandName":"/u01/dataiku/dataiku-dss-8.0.2/spark-standalone-home/bin/spark-submit","cpuUserTimeMS":29880,"cpuSystemTimeMS":2070,"cpuChildrenUserTimeMS":230,"cpuChildrenSystemTimeMS":160,"cpuTotalMS":32340,"cpuCurrent":0.01199760047990402,"vmSizeMB":19579,"vmRSSMB":1058,"vmHWMMB":1058,"vmRSSAnonMB":998,"vmDataMB":19366,"vmSizePeakMB":19579,"vmRSSPeakMB":1058,"vmRSSTotalMBS":85920,"majorFaults":0,"childrenMajorFaults":0}
[07:04:33] [INFO] [dku.usage.computeresource.jek]  - Reporting update of resource usage: {"context":{"type":"JOB_ACTIVITY","authIdentifier":"108008065","projectKey":"GE90_ENGINE_REMOVAL_SIMULATION","jobId":"Build_engine_removal_predictions_2020-10-05T11-01-01.315","activityId":"compute_engine_removal_predictions_NP","activityType":"recipe","recipeType":"pyspark","recipeName":"compute_engine_removal_predictions"},"type":"LOCAL_PROCESS","id":"JkqpIjflMvXKEFZu","startTime":1601895693822,"localProcess":{"pid":22157,"commandName":"/u01/dataiku/dataiku-dss-8.0.2/spark-standalone-home/bin/spark-submit","cpuUserTimeMS":30650,"cpuSystemTimeMS":2110,"cpuChildrenUserTimeMS":230,"cpuChildrenSystemTimeMS":160,"cpuTotalMS":33150,"cpuCurrent":0.001999600079984003,"vmSizeMB":19712,"vmRSSMB":1060,"vmHWMMB":1060,"vmRSSAnonMB":1000,"vmDataMB":19499,"vmSizePeakMB":19712,"vmRSSPeakMB":1060,"vmRSSTotalMBS":149465,"majorFaults":0,"childrenMajorFaults":0}}
[07:04:33] [DEBUG] [dku.resource]  - Process stats for pid 22157: {"pid":22157,"commandName":"/u01/dataiku/dataiku-dss-8.0.2/spark-standalone-home/bin/spark-submit","cpuUserTimeMS":30650,"cpuSystemTimeMS":2110,"cpuChildrenUserTimeMS":230,"cpuChildrenSystemTimeMS":160,"cpuTotalMS":33150,"cpuCurrent":0.001999600079984003,"vmSizeMB":19712,"vmRSSMB":1060,"vmHWMMB":1060,"vmRSSAnonMB":1000,"vmDataMB":19499,"vmSizePeakMB":19712,"vmRSSPeakMB":1060,"vmRSSTotalMBS":149465,"majorFaults":0,"childrenMajorFaults":0}

Answers

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker
    edited July 17

    Hey Peter,

    The resource usage reporting that you are seeing in the logs is described in the documentation page "Compute resource usage reporting".

    You can set the dku.resource and dku.usage.computeresource.jek loggers to a higher log level, so that these lines are no longer written to your job logs.

    The information on setting log levels per logger can be found here: Customizing log levels.

    In this case, you can copy the default installation logging config files into your DATA_DIR, and then modify them to update the log level for the referenced loggers. This setup would look like:

    mkdir -p DATA_DIR/resources/logging
    cp INSTALL_DIR-VERSION/resources/logging/* DATA_DIR/resources/logging/
    

    Now you can configure the following log levels:

    1. Set the log level for dku.resource to WARN, by editing your DATA_DIR/resources/logging/dku-log4j.properties file to include the line: log4j.logger.dku.resource = WARN (INFO would also work here, since the "Process stats" lines are logged at DEBUG)
    2. Create the file DATA_DIR/resources/logging/dku-jek-log4j.properties and add the line log4j.logger.dku.usage.computeresource.jek = WARN to it.
    3. Restart DSS.

    This should prevent the two log lines that you reference from showing up in your logs.
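
    The two file edits in steps 1 and 2 can be sketched as a small script. This is only an illustration, not an official procedure: a throwaway directory stands in for DATA_DIR so it can be tried safely, and append_once is a hypothetical helper name introduced here, not a DSS command.

```shell
#!/bin/sh
set -eu

# Assumption: a temporary directory stands in for the real DATA_DIR.
# Replace it with the actual data directory to apply the change.
DATA_DIR="$(mktemp -d)"
LOG_DIR="$DATA_DIR/resources/logging"
mkdir -p "$LOG_DIR"

# append_once FILE LINE (hypothetical helper): add LINE to FILE
# unless that exact line is already present, so reruns are harmless.
append_once() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Step 1: silence the DEBUG "Process stats" lines.
append_once "$LOG_DIR/dku-log4j.properties" \
    'log4j.logger.dku.resource = WARN'

# Step 2: silence the INFO "Reporting update of resource usage" lines.
append_once "$LOG_DIR/dku-jek-log4j.properties" \
    'log4j.logger.dku.usage.computeresource.jek = WARN'

# Step 3 (restarting DSS) is deliberately left as a manual action.
echo "Logging overrides written under $LOG_DIR"
```

    Appending only when the line is missing keeps the files free of duplicate entries if the script is run more than once.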

    Please note that raising log levels should be done sparingly, as it can hide information that is needed to troubleshoot DSS.
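
    If changing the logging configuration is not an option, a job log that has been saved to a file can also be filtered after the fact. The sketch below is a workaround and not a DSS feature: job.log is a hypothetical file name, the first two sample lines are abbreviated from the example in the question, and the third line is a made-up stand-in for a recipe's own message.

```shell
#!/bin/sh
set -eu

workdir="$(mktemp -d)"

# Sample log: two abbreviated resource usage lines plus one
# invented line representing the recipe's own output.
cat > "$workdir/job.log" <<'EOF'
[07:03:33] [INFO] [dku.usage.computeresource.jek]  - Reporting update of resource usage: {"type":"LOCAL_PROCESS"}
[07:03:33] [DEBUG] [dku.resource]  - Process stats for pid 22157: {"pid":22157}
[07:03:40] [INFO] [my.recipe]  - my own status message
EOF

# -v drops matching lines; -F treats the patterns as literal strings,
# so the dots and brackets are not interpreted as regex syntax.
# Note: grep exits with status 1 if nothing is left after filtering.
grep -vF -e '[dku.usage.computeresource.jek]' -e '[dku.resource]' \
    "$workdir/job.log" > "$workdir/job.filtered.log"

cat "$workdir/job.filtered.log"
```

    Filtering a copy leaves the original log intact, so the resource usage lines remain available if they are needed for troubleshooting later.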

    Thanks,
    Sarina

  • Peter_R_Knight
    Peter_R_Knight Registered Posts: 32 ✭✭✭✭

    Thanks for the detailed answer. I've forwarded it to our admins to see if we can try that, and I will let you know how we get on.

    Pete

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker

    Hi @Peter_R_Knight,

    I just wanted to let you know that as of DSS 10, Kubernetes resource usage reporting can be disabled from the Administration > Settings > Misc. tab.

    Let me know if you have any questions about this!

    Thanks,
    Sarina 
