Hi Dataiku admins, I'm wondering if this post/question could find its way to the backend engineers.
I often have to dig through the logs of failed Python jobs, and I've noticed that the Dataiku logging could be improved: every line carries a duplicate timestamp.
The duplication likely happens because, in addition to the timestamp set by the log4j settings, a second one comes from the Python logging module being initialised with a 'basicConfig' (line 4 in the 'python-exec-wrapper') that includes asctime in its format. It makes the lines quite long and redundant. Take a look at the log generated by your code below (and notice the typo, too):
[2020/03/05-16:51:53.039] [null-err-101] [INFO] [dku.utils] - 2020-03-05 16:51:53,037 INFO Running a DSS Python recipe locally, uinsetting env
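To illustrate, here's a minimal sketch of what I assume the wrapper does (not its actual code): initialising logging with asctime in the format stacks a second timestamp on top of the one log4j already prepends when it captures the process output.

import logging

# Assumed reproduction of the wrapper's setup, not the real code: asctime
# in the format duplicates the timestamp log4j prepends to captured output.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logging.info("Running a DSS Python recipe locally, unsetting env")
# Once log4j wraps the captured line, it ends up with two timestamps:
# [...] [INFO] [dku.utils] - 2020-03-05 16:51:53,037 INFO Running a DSS Python recipe locally, unsetting env

# Dropping asctime would leave just the single log4j timestamp:
# logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")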
I should probably also mention that the 'python-exec-wrapper' launching each recipe is a little alarming to look at. There are a lot of best-practice violations going on in there, from the argument handling (there are standard libraries to do that for you, see the sketch below), to the imports of modules all over the place, the exception handling, etc. It doesn't look like it's had a code review in a while, nor does it look testable at all in its current state.
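For the argument handling specifically, argparse from the standard library would do the job; a hypothetical sketch (the argument names are mine, not the wrapper's real interface):

import argparse

# Illustrative only: these arguments are made up, not the wrapper's actual
# command-line interface.
parser = argparse.ArgumentParser(description="Run a DSS Python recipe")
parser.add_argument("script", help="path of the recipe script to execute")
parser.add_argument("--cwd", help="working directory to run the recipe in")
args = parser.parse_args()
print(args.script, args.cwd)

That gives you usage/help messages and basic validation for free, and it's trivially testable.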
Just wanted to put this out there without being too snarky; if there were a repo somewhere, I'd gladly submit a pull request...
Hi,
Thanks for your feedback, we'll be having a look at that.