I am using Dataiku to predict which of our job sites will have injuries. It has been very successful in lowering our injury rate and improving our safety performance. One way it has done this is by requiring bi-weekly management audits on jobs flagged as high risk for safety defects. We have data showing that high-risk jobs with audits are less likely to have injuries.
Now the critical matter. Audits are a lot of work, and management wants to improve the model so they can do fewer audits. Management correctly argues that if a job is high risk and they perform an audit, some of the risk has been mitigated and the chance of an injury should decrease.
I added a flag for whether a job has had an audit and retrained the model. Of course, the model flagged jobs with an audit as having a higher risk because... we require audits on jobs that the model already flagged as high risk for an injury.
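To make the feedback loop concrete, here is a minimal simulation sketch. Every number in it (the audit threshold, the size of the audit effect, the uniform risk distribution) is a made-up assumption for illustration, not an estimate from the real data. It shows how an audit that genuinely lowers risk can still look "dangerous" when only risky jobs are audited.

```python
import random

def simulate(n_jobs=20000, audit_threshold=0.6, audit_effect=0.8, seed=0):
    """Simulate jobs where only high-risk jobs get audited.

    All parameters are illustrative assumptions, not real estimates.
    audit_effect=0.8 means an audit cuts a job's injury probability by 20%.
    """
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # audited -> [jobs, injuries]
    for _ in range(n_jobs):
        base_risk = rng.random()                  # latent injury probability
        audited = base_risk > audit_threshold     # audits go only to risky jobs
        risk = base_risk * audit_effect if audited else base_risk
        injured = rng.random() < risk
        counts[audited][0] += 1
        counts[audited][1] += int(injured)
    return {flag: injuries / jobs for flag, (jobs, injuries) in counts.items()}

rates = simulate()
print(f"injury rate, audited:   {rates[True]:.2f}")
print(f"injury rate, unaudited: {rates[False]:.2f}")
```

In this toy setup the audited group ends up with the higher raw injury rate even though each audit reduces risk, which is the same pattern the retrained model picked up.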
Can you think of any way to mitigate that bias?
(If you are curious about my model, I did a user group presentation on it, you can watch here: https://community.dataiku.com/t5/Online-Events/Defect-Detection-Watch-on-Demand/ba-p/5687 )
This is a really cool use of a model. It is also a great example of information leaking in from unintended data sources, and of where the "art" of model design can be hard.
I think the issue is that, because of the cost of audits, you are not randomizing which jobs get audited.
I do not have a good answer.
That said, I'm wondering about implementing more random audits, maybe in the form of short-form audits or self-audits, and then building a feature based on the successful completion of these short-form self-audits. The form could include a few Likert scales about how dangerous the activity is perceived to be and how well the team is managing the danger, and maybe a text field asking for comments (2-3 minutes max, randomly deployed over time).
Because those quick-form surveys would be deployed to all projects, they would not be tied to the most hazardous activities, so less unexpected information would leak into the model. This would increase general safety awareness and might show signs of slippage in safety in an area, particularly if scores change from one sample to the next.
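A minimal sketch of the random deployment idea, assuming a plain list of job identifiers. The function name `pick_random_audits`, the `JOB-001` style IDs, and the 20% fraction are all hypothetical placeholders.

```python
import random

def pick_random_audits(job_ids, fraction=0.2, seed=None):
    """Return a random subset of jobs to receive a short-form audit.

    Selection deliberately ignores the model's risk score, so the audit
    flag does not double as "the model already thought this job was risky".
    """
    job_ids = list(job_ids)
    rng = random.Random(seed)
    # At least one job when any exist, never more than the number of jobs.
    k = min(len(job_ids), max(1, round(len(job_ids) * fraction)))
    return set(rng.sample(job_ids, k))

jobs = [f"JOB-{i:03d}" for i in range(1, 51)]   # hypothetical job IDs
selected = pick_random_audits(jobs, fraction=0.2, seed=42)
print(sorted(selected))
```

Re-drawing the sample each period (with a different seed or none) would spread the short-form audits over time, as suggested above.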
Those are my $0.02. Love to hear what you end up choosing to do.
Thanks for your feedback. Your suggestion of Random Audits is well received, but above my pay grade.
In the immediate future, I've come up with a workaround. It isn't perfect, but I think it makes sense.
We had 812 jobs labeled High Risk with no audits (or no audits before an event took place) since 2018; 543 of those jobs had an event (67%).
We had 171 jobs labeled High Risk with audits since 2018; 102 of those jobs had an event (60%).
That is a 7 percentage point reduction in the event rate.
Would it be reasonable to simply subtract 0.07 from the proba_1 field when an audit is completed?
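For reference, the arithmetic behind that proposal can be sketched as follows. The counts come from the post above; the helper names (`adjust_additive`, `adjust_relative`) are hypothetical, and the relative version is only shown as a contrast, not as something anyone in the thread recommended.

```python
def event_rate(events, jobs):
    return events / jobs

rate_no_audit = event_rate(543, 812)   # high-risk jobs without an audit
rate_audit = event_rate(102, 171)      # high-risk jobs with an audit

absolute_diff = rate_no_audit - rate_audit       # in probability units
relative_diff = absolute_diff / rate_no_audit    # as a share of the no-audit rate

def adjust_additive(proba_1, audited, reduction=absolute_diff):
    """Subtract a flat amount for audited jobs, never going below 0."""
    return max(0.0, proba_1 - reduction) if audited else proba_1

def adjust_relative(proba_1, audited, factor=rate_audit / rate_no_audit):
    """Scale the probability for audited jobs instead of shifting it."""
    return proba_1 * factor if audited else proba_1

print(f"absolute difference: {absolute_diff:.3f}")
print(f"relative difference: {relative_diff:.3f}")
for p in (0.05, 0.40, 0.90):
    print(p, round(adjust_additive(p, True), 3), round(adjust_relative(p, True), 3))
```

One thing the sketch makes visible: a flat subtraction has to be clipped at 0 for low scores, while a relative scaling does not. Both versions assume the difference seen among high-risk jobs carries over to every audited job, which the numbers above do not establish.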
I'm no statistician, but subtracting a flat 7% seems a bit... how to say this... suspect? (It might be valid; I just don't really know.)
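One rough way to probe that suspicion is to ask how much of the gap between the two groups could be sampling noise. The sketch below runs a standard pooled two-proportion z-test on the counts quoted above, using only the standard library; the function name is my own, and the test itself is an assumption about what would be a sensible check, not something the model or Dataiku produces.

```python
import math

def two_proportion_z(events_a, n_a, events_b, n_b):
    """Pooled two-proportion z-test; returns (difference, z, two-sided p-value)."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail area
    return p_a - p_b, z, p_value

# No-audit group first, audited group second.
diff, z, p_value = two_proportion_z(543, 812, 102, 171)
print(f"difference = {diff:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

Even a small p-value here would not make the difference causal, since the audited jobs were not chosen at random, so this only speaks to noise, not to the selection issue discussed earlier in the thread.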
I'm wondering if some of the data scientists working at Dataiku might be able to help out by jumping on this thread.
If you feel comfortable sharing the results of the conversation with the Dataiku staff, I'd love to hear what conclusions you have drawn about approaching this interesting problem.
Hi @AaronCrouch, I'm speaking with the Dataiker now; they should be reaching out to you shortly.
Hi @AaronCrouch ,
@AndrewS, thanks for your response. It looks like option 1 is similar to an idea I had above. I'm just not sure what thresholds to use. Currently, I'm using a 35% threshold suggested by the system. Let me throw some numbers at you and see if you have any thoughts about how to adjust:
We had 812 jobs labeled High Risk with no audits (or no audits before an event took place) since 2018; 543 of those jobs had an event (66.9%).
We had 171 jobs labeled High Risk with audits since 2018; 102 of those jobs had an event (59.6%).
Taking the sum of proba_1 on high risk jobs with no audit (or no audit prior to a defect), we would predict 482.75 jobs with at least one defect; we had 543 high risk jobs with no audit with a defect (112% of predicted).
Taking the sum of proba_1 on high risk jobs WITH an audit, we would predict 99.9 jobs with at least one defect; we had 102 jobs with a defect (102% of predicted).
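The observed-versus-predicted comparison above can be reproduced with a few lines. The numbers are the ones quoted in this post; the function and group names are just illustrative labels.

```python
def observed_to_predicted(observed_events, predicted_events):
    """Ratio above 1 means the model under-predicted events for that group."""
    return observed_events / predicted_events

# (observed jobs with a defect, sum of proba_1) per group, from the post above
groups = {
    "high risk, no audit": (543, 482.75),
    "high risk, audit": (102, 99.9),
}
ratios = {name: observed_to_predicted(obs, pred) for name, (obs, pred) in groups.items()}
for name, ratio in ratios.items():
    print(f"{name}: observed/predicted = {ratio:.3f}")
```

Read this way, the two percentages are calibration ratios for each group rather than event rates, so they are not directly comparable to the 66.9% and 59.6% figures.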