In 2019, IBM reported that it could predict which employees would leave a job with 95% accuracy. The goal was to intervene before those employees left, thereby reducing attrition, which in 2019 was costing companies as much as twice an employee’s annual salary each time someone had to be replaced.
The end result was a predictive attrition program run on IBM Watson that could help any organization identify the employees most likely to leave with 95% accuracy, though the company refused to share where the information on employees came from.
However, that benchmark seems low to those of us with an IT background, where systems are expected to stay up 99.99% of the time. It also seems low to anyone who has consulted a surgeon about a prospective operation and been told that the likelihood of survival was 95%, with a 5% chance of dying.
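To put the two benchmarks side by side, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the 365-day year and the batch of 10,000 predictions are assumptions chosen for round numbers, not figures from IBM.

```python
# Illustrative comparison of a 99.99% availability target with a 95% target.
# The batch size of 10,000 predictions below is a made-up example value.
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes per year a system may be down at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def expected_errors(accuracy: float, predictions: int) -> float:
    """Expected number of wrong predictions at the given accuracy."""
    return (1.0 - accuracy) * predictions


# 99.99% uptime leaves roughly 52.6 minutes of downtime per year.
print(round(allowed_downtime_minutes(0.9999), 1))

# A system that was "up" only 95% of the time could be down about 18.25 days.
print(round(allowed_downtime_minutes(0.95) / (24 * 60), 2))

# A 95%-accurate model is expected to get about 500 of 10,000 predictions wrong.
print(round(expected_errors(0.95, 10_000)))
```

Seen this way, a 95% target tolerates an error budget roughly 500 times larger than a 99.99% target does.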
How did we decide to declare analytics programs successful once algorithms executed against high-quality data achieved 95% accuracy?
“Instinctively, we feel that greater accuracy is better and all else should be subjected to this overriding goal,” said Patrick Bangert, CEO of Algorithmic Technologies. “This is not so. While there are a few tasks for which a change in the second decimal place in accuracy might actually matter, for most tasks this improvement will be irrelevant–especially given that this improvement usually comes at a heavy cost.”
I get that, but I must confess that I didn’t fully appreciate it a few years ago, when I was in charge of a financial institution’s credit card operation and one of our board members was declined at the checkout of a home improvement store because an analytics system issued a false positive.
Data science, IT, and business leaders responsible for analytics face the same quandary: To what degree of accuracy must the algorithm operating on the data perform for an analytics program to be declared “ready” for production?
The answer depends on the nature of the problem that you’re trying to solve. If you’re formulating a vaccine, you want to achieve results that exceed 95%. If you’re predicting a general trend, the low 90s or even the 80s might suffice.
Bangert elaborates: “A real-life task of machine learning sits in between two great sources of uncertainty: Life and the user. The data from life is inaccurate and incomplete. This source of uncertainty is usually so great that a model accuracy difference of tenths of a percent may not even be measurable in a meaningful way. The user who sees the result of the model makes some decision on its basis, and we know from a large body of human-computer interaction research that the user cares much more about how the result is presented than the result itself. The usability of the interface, the beauty of the graphics, the ease of understanding, and interpretability count more than the sheer numerical value on the screen.”
Life and human reasoning can be capricious, and analytics operates on both. Given this, 95% seems to be a comfortable accuracy level for most, but not all, analytics deployments.
This is exactly why part of the analytics deployment process should be giving managers and users a “heads up” that a 95% accuracy goal means the results will be wrong about 5% of the time, whether as false positives or false negatives, and that the company will have to deal with those errors. Human review of analytics results before they are finalized can help reduce this 5% chance of inaccuracy, but it isn’t foolproof. When human review doesn’t catch an error, the fallback is a defined procedure for mitigating the situations or business cases in which the analytics results fail.
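A short sketch can make that 5% more concrete. Overall accuracy lumps false positives and false negatives together, so the same 95% figure can hide very different error rates of each kind. The counts below are hypothetical, invented for an attrition-style example of 1,000 employees in which 100 actually leave; they are not IBM's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary classifier (hypothetical example values)."""

    true_pos: int   # predicted "will leave", did leave
    false_pos: int  # predicted "will leave", stayed
    true_neg: int   # predicted "will stay", stayed
    false_neg: int  # predicted "will stay", left

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.true_neg + self.false_neg

    @property
    def accuracy(self) -> float:
        """Share of all predictions that were correct."""
        return (self.true_pos + self.true_neg) / self.total

    @property
    def false_positive_rate(self) -> float:
        """Share of actual negatives that were wrongly flagged."""
        return self.false_pos / (self.false_pos + self.true_neg)

    @property
    def false_negative_rate(self) -> float:
        """Share of actual positives that were missed."""
        return self.false_neg / (self.false_neg + self.true_pos)


# Hypothetical outcome: 1,000 employees, 100 of whom actually leave.
counts = ConfusionCounts(true_pos=70, false_pos=20, true_neg=880, false_neg=30)

print(f"accuracy:            {counts.accuracy:.1%}")
print(f"false positive rate: {counts.false_positive_rate:.1%}")
print(f"false negative rate: {counts.false_negative_rate:.1%}")
```

In this made-up case the model is 95% accurate overall, yet it wrongly flags only about 2% of the employees who stay while missing 30% of those who leave. Which of the two error types matters more depends on the business case, so it helps to agree on both rates, not just on the headline accuracy.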
In the end, Bangert said that using analytics and machine learning “should not be perfectionism but pragmatism.”
Analytics leaders should balance analytics capabilities and limitations against the precision needs of a given business case or situation, and determine if the “95%” rule is adequate. They should do this in concert with end business users and the managers who are sponsoring the application.
This helps ensure that everyone agrees on what is acceptable for analytics results before the analytics are placed into production and become part of a mission-critical business process.