Why engineers can be tough customers for machine learning-based solutions
Throughout my career, the end-users of almost all the analytical solutions I have been involved in have been engineers. I have served them both as an internal provider, as part of an R&D group supporting commercial operations, and as an external market provider. So, when I say that engineers are a tough audience, I claim to speak with some authority and some pain, but most of all with insight into and empathy for their concerns about the veracity and use of data-driven machine learning solutions. It has always been a push and pull between the data-driven, statistics-based solutions that I would offer as a mathematician/statistician and the deterministic physical modelling with which the engineers were more comfortable. Concerns about understanding relationships, cause and effect, and whether a model was “realistic” were at the center of the debate 30 years ago. Today, it appears that perhaps not too much has changed, or has it? First, let’s dig a bit deeper into where the unease may arise, especially in the pipeline and gas utility environment.
Statistical versus semantic modelling in machine learning
There is no doubt that artificial intelligence (AI) and machine learning have changed, and will continue to change, our lives at a rapid pace. I suspect few will want to argue with that! Some may argue the pluses and minuses of the impact, but that’s another topic for another day. From that point of agreement, we venture into a minefield of disagreement among academics, practitioners, and end-users about which “flavor” of machine learning will hold water as problems get more complex, and which flavors are better suited to specific applications. The arguments will rage. Perhaps the most famous illustration is the debate between statistical modelling proponents such as Google’s Research Director Peter Norvig and the semantic modelling championed by the renowned linguist Noam Chomsky, who is considered the “father of modern linguistics.”
The essence of the argument between these two giants of their respective fields is Chomsky’s charge that treating predictive accuracy as the measure of success in applications of machine learning (in natural language processing) is not science! In Chomsky’s mind, “real science” is about discovering explanatory principles that provide a real understanding of the phenomena being studied. Pertinent to this article is his apparent disdain for Bayesian statistics, expressed in a remark during this debate to the effect of “… you know, Bayesian this and that…” when referring to the arguments of the advocates of statistically driven modelling.
Peter Norvig picked up the gauntlet thrown down by Chomsky with a robust defense of statistically driven modelling. In that defense, he brings out another important division between schools of machine learning: one identifies and fits an appropriate model to the data, while the other takes an algorithm-first viewpoint that fundamentally assumes the underlying relationships in the data are too complex to represent in a simple, fixed model type. The latter is the harder of the two for the likes of Chomsky to swallow, but it is the one strongly advocated by the likes of Norvig. It is harder to swallow because, fundamentally, it says to put your trust in the data and let it take the driving seat. Implied is a message that maybe not everything can be explained in a structured semantic form using our current knowledge, so get over it! Now, there’s fighting talk to fuel the debate!
Discomfort with data centricity slows adoption of AI and machine learning
The uneasiness that Chomsky expresses is, I believe, shared by many in the engineering and scientific community, and it aligns with my experience. In a prior article, I mentioned that the oil and gas (O&G) and energy utility sectors had been slow to seize the opportunities of digital transformation. I would argue that this discomfort with data centricity has been one of the main reasons why the uptake of AI and machine learning has lagged in industries such as O&G and gas utilities, and specifically in my own environment of pipeline safety and operational performance. Fundamentally, engineers are trained to understand how things work and why they may or do go wrong. Their instinct is to strive to understand the unknown and factor it into the physical modelling of their problem. They are not alone in their discomfort. The idea that you can consistently achieve better predictive performance in many tasks using data-driven statistical learning than with theoretical models, and without any deep understanding of the underlying physical relationships, is a tough ask for many in the broader scientific community. The wider impact of this discomfort becomes clear when you consider who fills many of the senior positions in these industries: people who were mainly engineering practitioners and have progressed up through the system to the senior roles they hold today. This makes it especially tough to gain acceptance in an industry where safety and keeping risk to a minimum are the bedrock of business operations.
Extracting causality from data
But what about “causation” and the most powerful question of all, “Why?” Prediction is largely centered on correlation and, as the adage goes, correlation is not causation. For example, wet ground and recent rain are highly correlated, but that correlation tells us nothing about which one is the cause and which is the effect. Causality analysis focuses on the “Why?” and would tell us that the rain caused the ground to get wet and not that the wet ground caused the rain (or at least that is true if we set aside condensation effects and the water cycle for now). This need to extract causality from data is an important concept for engineers in all disciplines. It is this, for example, that lets the risk engineer identify what is driving higher levels of risk in some parts of the pipeline system and, with that, gives guidance on what can best be addressed to lower the risk level.
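To make the rain and wet ground example concrete, here is a minimal Python sketch of a toy world in which rain causes wet ground. Everything in it is an assumption made for illustration: the function names, the 30% chance of rain, and the 95% and 5% chances of wet ground with and without rain. It contrasts what we learn by observing wet ground with what happens when we intervene and wet the ground ourselves.

```python
import random

def simulate(n, seed=0, do_rain=None, do_wet=None):
    """Draw n days from a toy world in which rain causes wet ground.

    do_rain / do_wet force a variable to a fixed value (an intervention)
    instead of letting it follow its usual mechanism.
    """
    rng = random.Random(seed)
    days = []
    for _ in range(n):
        # Rain has its own base rate (assumed 30%) unless we intervene on it.
        rain = (rng.random() < 0.30) if do_rain is None else do_rain
        # Wet ground depends on rain (assumed 95% vs 5%) unless we intervene,
        # for example by hosing the ground down.
        wet_prob = 0.95 if rain else 0.05
        wet = (rng.random() < wet_prob) if do_wet is None else do_wet
        days.append((rain, wet))
    return days

def fraction(flags):
    """Share of True values in a sequence of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)

N = 20_000
observed = simulate(N)
hosed = simulate(N, do_wet=True)        # we make the ground wet ourselves
rain_made = simulate(N, do_rain=True)   # we (hypothetically) make it rain

# Observation: seeing wet ground makes rain far more likely than its base rate.
p_rain_given_wet = fraction(r for r, w in observed if w)
# Intervening on the effect: wetting the ground does not change the weather.
p_rain_do_wet = fraction(r for r, _ in hosed)
# Intervening on the cause: making it rain does make the ground wet.
p_wet_baseline = fraction(w for _, w in observed)
p_wet_do_rain = fraction(w for _, w in rain_made)

print(f"P(rain | wet ground observed) = {p_rain_given_wet:.2f}")
print(f"P(rain | we wet the ground)   = {p_rain_do_wet:.2f}")
print(f"P(wet ground), no intervention = {p_wet_baseline:.2f}")
print(f"P(wet ground | we make it rain) = {p_wet_do_rain:.2f}")
```

The observed data alone cannot separate the two directions, because the correlation is symmetric. Only the interventions show the asymmetry: forcing the cause moves the effect, while forcing the effect leaves the cause at its base rate.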
The future of machine learning
This is where it again gets interesting and, of course, where there is yet another schism in thought leadership on the future of machine learning. Geoff Hinton believes that deep learning, of which he is the acknowledged father, is equipped to deal with all future needs given a few additional “breakthroughs,” as explained in a recent MIT review article. A little closer to the market is the success enjoyed by machine learning approaches based on Bayesian networks. These map intuitively onto how we, as individuals, make decisions and how we modify our thinking in light of new information. The recent focus on these networks, and on the concept of “do-calculus” introduced by Judea Pearl, author of “The Book of Why,” has brought significant progress in the quest for a more inferential, or causal-friendly, machine learning toolset.
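The idea of modifying our thinking based on new information can be sketched with a single application of Bayes' rule, which is the building block that a Bayesian network applies across many linked variables. The Python example below is only an illustration, not a description of any particular product: the function name, the leak scenario, and all of the probabilities are hypothetical.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence

# All numbers are hypothetical, chosen only to illustrate the update.
PRIOR_LEAK = 0.01           # belief in a leak before any alarm
P_ALARM_IF_LEAK = 0.90      # assumed sensor sensitivity
P_ALARM_IF_NO_LEAK = 0.05   # assumed false-alarm rate

after_first_alarm = bayes_update(
    PRIOR_LEAK, P_ALARM_IF_LEAK, P_ALARM_IF_NO_LEAK
)
# The first posterior becomes the prior for the next piece of evidence.
# Reusing the same rates assumes the two alarms are conditionally
# independent given the true leak state.
after_second_alarm = bayes_update(
    after_first_alarm, P_ALARM_IF_LEAK, P_ALARM_IF_NO_LEAK
)

print(f"Belief in a leak before any alarm: {PRIOR_LEAK:.3f}")
print(f"Belief after one alarm:            {after_first_alarm:.3f}")
print(f"Belief after two alarms:           {after_second_alarm:.3f}")
```

With these assumed numbers, one alarm moves the belief in a leak from 1% to roughly 15%, and a second alarm moves it to roughly 77%. A single alarm is weak evidence when leaks are rare, which matches how an experienced engineer would reason.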
Bayesian Network models: the foundation for next-generation risk management
We at DNV GL have invested significantly in the use of Bayesian Network models as a foundation for next-generation risk management and in the advancement of pipeline leak detection systems. These and other data-driven solutions will become an important part of our product line. In a subsequent article, I will look at the fundamentals of Bayesian statistics and why it is such a powerful vehicle for machine learning. Most importantly, we will get into why Bayesian methods may well become an engineer’s best friend when answering their “why?” questions!