Python Recipe execution - backend improvements

jmac Registered Posts: 3 ✭✭✭✭
edited July 16 in Using Dataiku

Hi Dataiku admins, I'm wondering whether this post/question could find its way to the backend engineers.

I often have to dig through the logs of failed Python jobs, and I've noticed that the Dataiku logging could be improved: every line carries a duplicate timestamp.

The duplication most likely arises because, in addition to the timestamp set by the log4j settings, a second timestamp comes from the python logging module being initialised with a 'basicConfig' call (line 4 of the 'python-exec-wrapper') that includes asctime in its format. This makes every line long and redundant. Take a look at the log generated by your code below (and notice the typo, too):

[2020/03/05-16:51:53.039] [null-err-101] [INFO] [dku.utils]  - 2020-03-05 16:51:53,037 INFO Running a DSS Python recipe locally, uinsetting env
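For anyone curious, here is a minimal sketch of how the two timestamps end up stacked. The format string is my assumption for illustration; I haven't copied the wrapper's exact one:

```python
import logging

# The recipe process configures its own timestamped format via basicConfig
# (as python-exec-wrapper does). NOTE: this format string is an assumption
# for illustration, not the wrapper's exact configuration.
logging.basicConfig(
    format="%(asctime)s %(levelname)s %(message)s",
    level=logging.INFO,
)

# This line already carries a Python-side timestamp on stderr...
logging.info("Running a DSS Python recipe locally")
# ...and the DSS backend, which stamps every captured stderr line via
# log4j, then wraps it a second time, producing output like the line above.
```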

I should probably also mention that the 'python-exec-wrapper' launching each recipe is a little alarming to look at. There's a lot of best-practice violation going on in there: the hand-rolled argument handling (there are standard libraries to do that for you, as sketched below), imports scattered all over the place, the exception handling, etc. It doesn't look like it's had a code review in a while, nor does it look testable at all in its current state.
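To be concrete, here is a hypothetical sketch using argparse from the standard library. The arguments are made up for illustration and are not the wrapper's actual interface:

```python
import argparse

# Hypothetical interface: 'script' and '--unset-env' are invented for this
# sketch and do not reflect python-exec-wrapper's real arguments.
parser = argparse.ArgumentParser(description="Launch a DSS Python recipe")
parser.add_argument("script", help="path of the recipe script to execute")
parser.add_argument("--unset-env", action="store_true",
                    help="unset DSS-specific environment variables first")

# Parsing a sample command line; in the real wrapper this would be
# parser.parse_args() on sys.argv.
args = parser.parse_args(["my_recipe.py", "--unset-env"])
print(args.script, args.unset_env)
```

Something like this would also make the wrapper far easier to unit-test, since the parser can be exercised in isolation.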

Just wanted to put this out there without being too snarky; if there were a repo somewhere, I'd gladly submit a pull request...
