Using an EKS backend, can I suppress the "Reporting update of resource usage" status messages?

Peter_R_Knight
Level 2
These are reported every minute, and on a long job (>1 hour) they make it quite hard to see my own print/status messages. Can I adjust the frequency, e.g. to every 5 minutes? Can I see these values in a dashboard or graph somewhere instead? Is there any documentation on what all the values mean?

Here is an example:

[07:03:33] [INFO] [dku.usage.computeresource.jek]  - Reporting update of resource usage: {"context":{"type":"JOB_ACTIVITY","authIdentifier":"108008065","projectKey":"GE90_ENGINE_REMOVAL_SIMULATION","jobId":"Build_engine_removal_predictions_2020-10-05T11-01-01.315","activityId":"compute_engine_removal_predictions_NP","activityType":"recipe","recipeType":"pyspark","recipeName":"compute_engine_removal_predictions"},"type":"LOCAL_PROCESS","id":"JkqpIjflMvXKEFZu","startTime":1601895693822,"localProcess":{"pid":22157,"commandName":"/u01/dataiku/dataiku-dss-8.0.2/spark-standalone-home/bin/spark-submit","cpuUserTimeMS":29880,"cpuSystemTimeMS":2070,"cpuChildrenUserTimeMS":230,"cpuChildrenSystemTimeMS":160,"cpuTotalMS":32340,"cpuCurrent":0.01199760047990402,"vmSizeMB":19579,"vmRSSMB":1058,"vmHWMMB":1058,"vmRSSAnonMB":998,"vmDataMB":19366,"vmSizePeakMB":19579,"vmRSSPeakMB":1058,"vmRSSTotalMBS":85920,"majorFaults":0,"childrenMajorFaults":0}}
[07:03:33] [DEBUG] [dku.resource]  - Process stats for pid 22157: {"pid":22157,"commandName":"/u01/dataiku/dataiku-dss-8.0.2/spark-standalone-home/bin/spark-submit","cpuUserTimeMS":29880,"cpuSystemTimeMS":2070,"cpuChildrenUserTimeMS":230,"cpuChildrenSystemTimeMS":160,"cpuTotalMS":32340,"cpuCurrent":0.01199760047990402,"vmSizeMB":19579,"vmRSSMB":1058,"vmHWMMB":1058,"vmRSSAnonMB":998,"vmDataMB":19366,"vmSizePeakMB":19579,"vmRSSPeakMB":1058,"vmRSSTotalMBS":85920,"majorFaults":0,"childrenMajorFaults":0}
[07:04:33] [INFO] [dku.usage.computeresource.jek]  - Reporting update of resource usage: {"context":{"type":"JOB_ACTIVITY","authIdentifier":"108008065","projectKey":"GE90_ENGINE_REMOVAL_SIMULATION","jobId":"Build_engine_removal_predictions_2020-10-05T11-01-01.315","activityId":"compute_engine_removal_predictions_NP","activityType":"recipe","recipeType":"pyspark","recipeName":"compute_engine_removal_predictions"},"type":"LOCAL_PROCESS","id":"JkqpIjflMvXKEFZu","startTime":1601895693822,"localProcess":{"pid":22157,"commandName":"/u01/dataiku/dataiku-dss-8.0.2/spark-standalone-home/bin/spark-submit","cpuUserTimeMS":30650,"cpuSystemTimeMS":2110,"cpuChildrenUserTimeMS":230,"cpuChildrenSystemTimeMS":160,"cpuTotalMS":33150,"cpuCurrent":0.001999600079984003,"vmSizeMB":19712,"vmRSSMB":1060,"vmHWMMB":1060,"vmRSSAnonMB":1000,"vmDataMB":19499,"vmSizePeakMB":19712,"vmRSSPeakMB":1060,"vmRSSTotalMBS":149465,"majorFaults":0,"childrenMajorFaults":0}}
[07:04:33] [DEBUG] [dku.resource]  - Process stats for pid 22157: {"pid":22157,"commandName":"/u01/dataiku/dataiku-dss-8.0.2/spark-standalone-home/bin/spark-submit","cpuUserTimeMS":30650,"cpuSystemTimeMS":2110,"cpuChildrenUserTimeMS":230,"cpuChildrenSystemTimeMS":160,"cpuTotalMS":33150,"cpuCurrent":0.001999600079984003,"vmSizeMB":19712,"vmRSSMB":1060,"vmHWMMB":1060,"vmRSSAnonMB":1000,"vmDataMB":19499,"vmSizePeakMB":19712,"vmRSSPeakMB":1060,"vmRSSTotalMBS":149465,"majorFaults":0,"childrenMajorFaults":0}

 

3 Replies
SarinaS
Dataiker

Hey Peter,

The lines you are seeing come from DSS compute resource usage reporting, which is described in the documentation page "Compute resource usage reporting".

You can raise the log level of the dku.resource and dku.usage.computeresource.jek loggers so that these lines are no longer written to the job log.

Setting log levels per logger is described in the documentation page "Customizing log levels".

In this case, you can copy the default logging config files from the installation directory into your DATA_DIR and then edit the copies to change the log level for the two loggers above. The setup would look like this:

mkdir -p DATA_DIR/resources/logging
cp INSTALL_DIR-VERSION/resources/logging/* DATA_DIR/resources/logging/

 

Now you can configure the following log levels: 

  1. Set the log level for dku.resource to WARN by adding the following line to DATA_DIR/resources/logging/dku-log4j.properties: log4j.logger.dku.resource = WARN (INFO would also work, since these lines are logged at DEBUG).
  2. Create the file DATA_DIR/resources/logging/dku-jek-log4j.properties and add the following line to it: log4j.logger.dku.usage.computeresource.jek = WARN
  3. Restart DSS.

This should prevent the two log lines that you reference from showing up in your logs.
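If your admins would rather script the two file edits than make them by hand, something like the following might help. It is only a sketch: set_logger_level is a hypothetical helper (not part of DSS), and it assumes the properties files use the plain "log4j.logger.NAME = LEVEL" form shown in the steps above.

```python
from pathlib import Path


def set_logger_level(properties_file: Path, logger: str, level: str) -> None:
    """Add or replace a 'log4j.logger.<logger> = <level>' line in a properties file.

    Hypothetical helper for illustration; it is not a DSS API.
    """
    key = f"log4j.logger.{logger}"
    # Read the existing file if there is one, otherwise start from nothing
    lines = properties_file.read_text().splitlines() if properties_file.exists() else []
    # Drop any existing entry for this exact logger so only one setting remains
    kept = [ln for ln in lines if ln.split("=", 1)[0].strip() != key]
    kept.append(f"{key} = {level}")
    # Create the parent directories (e.g. resources/logging) when they are missing
    properties_file.parent.mkdir(parents=True, exist_ok=True)
    properties_file.write_text("\n".join(kept) + "\n")
```

It would be called once per file, for example set_logger_level(Path("DATA_DIR/resources/logging/dku-log4j.properties"), "dku.resource", "WARN"), with DATA_DIR replaced by the real data directory. DSS still needs to be restarted afterwards.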

Please note that changing log levels should be done sparingly, as it can make DSS harder to troubleshoot.
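If changing the log levels is not an option, the same lines can be filtered out after the fact when reading a downloaded job log, and the values can be pulled out for plotting. The following is a minimal sketch, assuming the log lines look exactly like the example in the question ("[time] [LEVEL] [logger]  - message"); the function names are made up for illustration and other DSS log files may use a different layout.

```python
import json
import re

# Logger names taken from the example log lines in the question
NOISY_LOGGERS = ("dku.usage.computeresource.jek", "dku.resource")

# Matches lines such as: [07:03:33] [INFO] [dku.resource]  - message
LINE_RE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<level>[^\]]+)\] \[(?P<logger>[^\]]+)\]\s+- (?P<message>.*)$"
)

USAGE_PREFIX = "Reporting update of resource usage: "


def drop_usage_lines(lines):
    """Return the log lines that do not come from the resource usage loggers."""
    kept = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("logger") in NOISY_LOGGERS:
            continue  # skip the per-minute resource usage lines
        kept.append(line)
    return kept


def usage_samples(lines):
    """Extract (time, cpuTotalMS, vmRSSMB) tuples from the resource usage lines."""
    samples = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match or not match.group("message").startswith(USAGE_PREFIX):
            continue
        # The rest of the message is the JSON document shown in the example
        payload = json.loads(match.group("message")[len(USAGE_PREFIX):])
        process = payload.get("localProcess", {})
        samples.append((match.group("time"), process.get("cpuTotalMS"), process.get("vmRSSMB")))
    return samples
```

drop_usage_lines leaves your own print/status messages untouched, and the tuples from usage_samples can be fed to any plotting tool to graph CPU time or resident memory over the run.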

Thanks,
Sarina

Peter_R_Knight
Level 2
Author

Thanks for the detailed answer. I've forwarded it to our admins to see if we can try that, and will let you know how we get on.

Pete

SarinaS
Dataiker

Hi @Peter_R_Knight,

I just wanted to update you that as of DSS 10, Kubernetes resource usage reporting can be disabled from the Administration > Settings > Misc. tab: 

[Screenshot: Screen Shot 2021-12-01 at 12.31.49 PM.png]

Let me know if you have any questions about this!

Thanks,
Sarina