Despite giant leaps in artificial intelligence (AI) and machine learning (ML) since the 1950s, we remain a long way from machines developing two things that most humans take for granted: general intelligence and common sense. That’s the opinion of two leading researchers in the field.

Charles Simon, founder and CEO of AI/ML start-up FutureAI, believes that while AI may be more adept than humans at finding hidden patterns in data – at least, in large datasets that can be processed at speed – artificial general intelligence (AGI) remains tantalisingly out of reach. 

To this I would add that we should never write off humans’ ability to spot patterns in data and formulate testable theories from them. The thought experiments of scientists such as Albert Einstein, Paul Dirac, Richard Feynman, and Stephen Hawking prove that people can look at problems in entirely new ways. Arguably, this is a form of seeing correlations in data that a machine would miss, and the results were elegant equations that describe the universe.

Yet data was only one part of those scientists’ thought processes. Their leaps of intuition also came from a deep understanding of subjects such as physics, maths, geometry, and time, from observing the world around them, and from thinking about it deeply – all characteristics of being human.

Simon broadly concurs with this viewpoint. Writing in TechTalks this month, he explains that one reason for machines lagging behind humans in general intelligence is that, while we can propose a broad definition of it – a machine’s ability to acquire knowledge and understanding for itself – this is a long way from being able to build functional AGI from scratch. 

At present, no one knows how to do that. However, we have a ready-made model for AGI in the human brain, he explains. “By studying how the brain works and building biologically plausible approaches, we should be able to get closer to actually creating AGI.”

But to take the next step on the road to genuine machine intelligence, Simon believes that AGI needs to emulate the learning abilities of an infant. “Take a look at how a three-year-old playing with blocks learns. Using multiple senses and interaction with objects over time, the child learns that blocks are solid and can’t move through each other, that if the blocks are stacked too high, they will fall over, that round blocks roll and square blocks don’t, and so on.”

The toddler has an advantage over AI in that they learn everything in the context of everything else they see, hear, touch, taste, smell, and imagine. “Today’s AI has no context. [To a computer] images of blocks are just different arrangements of pixels.

“Neither image-based AI (think facial recognition) nor word-based AI (like Alexa) has the context of a ‘thing’, like the child’s block, which exists in reality, is more-or-less permanent, and is susceptible to the basic laws of physics.”
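To make the point concrete, here is a minimal sketch of what a pixel-based system actually receives: two tiny, invented ‘images’ of the same block seen from slightly different camera positions. Nothing here comes from FutureAI’s work; it is simply an illustration of Simon’s point.

```python
import numpy as np

# Two hypothetical 4x4 greyscale "images" of the same toy block, seen from
# slightly different camera positions. To a pixel-based system they are
# simply two grids of numbers.
view_a = np.array([
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
])
view_b = np.roll(view_a, shift=1, axis=1)  # the same block, camera nudged one pixel sideways

# At the pixel level the two views simply disagree; nothing in the numbers
# says "this is the same solid, persistent object".
print(np.array_equal(view_a, view_b))                          # False
print(f"Pixels that agree: {np.mean(view_a == view_b):.0%}")   # 75%, and only because of the shared background
```

Nothing in those numbers binds the two views to one solid, persistent object; that binding is precisely the context of a ‘thing’ that Simon says today’s AI lacks.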

This is a core point and a real challenge when designing autonomous vehicles, for example. No computer vision system has an innate understanding that a particular arrangement of pixels is a living, breathing human being. Or that a slightly different arrangement is also a person – one running, sitting, or carrying an object, for example, or one with different coloured hair or skin, or crossing the road in bad weather conditions.

To any pixel-based system, there is also no obvious difference between a real car parked on the hard shoulder and an image of a car on a roadside billboard. Meanwhile, a robot assisting astronauts on a space station needs to be taught that a particular shape – seen from different angles and in different lights – is a screwdriver, a tool that can be picked up, rather than, say, part of a control panel that must not be touched.

The point is that humans can intuit this knowledge from very little information, but machines cannot. After millennia of evolution, we still learn about the world through play and experimentation, as well as from our DNA, a formal education, and centuries of accumulated knowledge, all of which gives us an advantage over machines.

And that’s not all. Simply modelling the human brain may not be a sensible idea, given that human intelligence has evolved through shared goals, emotions, and instincts that have one overriding purpose: our survival. By contrast, AGI “can be planned and be largely about being intelligent,” Simon claims. “Given that, AGI will be given different goals and instincts and is unlikely to be like human intelligence.”

The fact that machine intelligence will be distinct from that of its human creators may (or may not) be a comfort to those commentators who are concerned about the Terminator archetype and other dystopian futures. 

That said, it will be necessary for robots to continue their parallel development so that true AGI can come about, admits Simon. This is because AGI will need to learn about the real world in depth, and it can only do that by interacting directly with it. In other words, AGI will need eyes, hands, ears, and the ability to move about so it can acquire deep knowledge about its environment – all of which will be less of a comfort to alarmists.

“AGI ultimately will require robotics to deal with the variability and complexity that the real world presents. Once AGI has been able to develop via interaction with the real world, that ability can be cloned to static hardware and the knowledge, abilities, and understanding will be retained.”

Today’s AGI prototypes can already explore two-dimensional simulated environments and begin to understand them, paving the way for entry into a three-dimensional simulator, where “the prototype will finally begin to approach the capabilities of the average three-year-old”, he writes.

However, this suggests that, once AGI has acquired a comparable ability to learn, it may begin to do so at an accelerated rate, given that processors could analyse thousands of different simulated outcomes before arriving at one that works – at which point, machine intelligence may begin to exceed our own in many respects.
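As a toy illustration of what ‘thousands of simulated outcomes’ can mean in practice, the sketch below has a machine blindly trying random plans in a tiny two-dimensional grid world until one of them reaches a goal. It is not a description of FutureAI’s prototype; the grid, the moves, and the numbers are all invented.

```python
import random

# A toy 5x5 grid world: 0 = free cell, 1 = wall. The "agent" starts at the
# top-left corner and wants to reach the bottom-right corner.
GRID = [
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (4, 4)
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def simulate(plan):
    """Run one candidate action sequence in the simulator; return True if it reaches the goal."""
    r, c = START
    for action in plan:
        dr, dc = MOVES[action]
        nr, nc = r + dr, c + dc
        if 0 <= nr < 5 and 0 <= nc < 5 and GRID[nr][nc] == 0:
            r, c = nr, nc  # the move is legal, so the agent actually moves
        if (r, c) == GOAL:
            return True
    return False

# Try thousands of random plans and keep the first one that works:
# cheap for a machine, impossibly tedious for a person.
for attempt in range(10_000):
    plan = [random.choice(list(MOVES)) for _ in range(30)]
    if simulate(plan):
        print(f"Found a working plan on attempt {attempt + 1}: {plan}")
        break
else:
    print("No working plan found in 10,000 simulated attempts.")
```

Random search is not intelligence, but a machine can churn through ten thousand simulated attempts in a fraction of a second, and it is that raw throughput which could let an AGI learn at an accelerated rate once it has something worth simulating.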

But of course, this may simply mean that humans can research subjects and test theories far more quickly than they could before, with the aid of AGI tools. Perhaps we will reach an even deeper understanding of the universe with their help.

Another expert has been exploring the topic this month. Mayank Kejriwal is a research assistant professor at the University of Southern California. Writing in Nextgov, he picks up Simon’s theme of machine intelligence lacking common sense.

“Despite being both universal and essential to how humans understand the world around them and learn, common sense has defied a single precise definition,” he writes. 

“GK Chesterton famously wrote at the turn of the 20th century that ‘common sense is a wild thing, savage, and beyond rules.’ Modern definitions today agree that, at minimum, it is a natural, rather than formally taught human ability that allows people to navigate daily life.”

This is why modelling or simulating it is such a challenge, he explains. “[Common sense] includes not only social abilities, like managing expectations and reasoning about other people’s emotions, but also a naïve sense of physics, such as knowing that a heavy rock cannot be safely placed on a flimsy plastic table. Naïve, because people know such things despite not consciously working through physics equations.”
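To see why that is so hard to hand-code, consider a deliberately crude sketch that encodes the rock-on-a-table judgement as an explicit rule; the objects, weights, and load limits below are invented for illustration.

```python
# A single hand-coded "naive physics" rule: an object may rest on a support
# only if the support can bear its weight. All figures are made up.
OBJECTS = {
    "heavy rock": {"weight_kg": 40.0},
    "coffee mug": {"weight_kg": 0.3},
}
SUPPORTS = {
    "flimsy plastic table": {"max_load_kg": 5.0},
    "oak workbench": {"max_load_kg": 150.0},
}

def can_place(obj, support):
    """Return True if the rule says the object can safely sit on the support."""
    return OBJECTS[obj]["weight_kg"] <= SUPPORTS[support]["max_load_kg"]

print(can_place("heavy rock", "flimsy plastic table"))  # False
print(can_place("coffee mug", "flimsy plastic table"))  # True
```

The rule handles the two objects it was written for, but a person applies the same judgement to surfaces they have never weighed or load-tested, and to countless situations nobody thought to enumerate, which is why common sense resists being captured as a set of explicit rules.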

It also includes background knowledge of abstract notions, such as time, space, and events, which allows people to plan, estimate, and organise “without having to be exact”, says Kejriwal. So, despite the many advances in AI, notably in game-playing and computer vision, machine common sense remains only a distant possibility at present. 

“Modern AI is designed to tackle highly specific problems, in contrast to common sense, which is vague and can’t be defined by a set of rules. Even the latest models make absurd errors at times, suggesting that something fundamental is missing in the AI’s world model.” 

The concept of a language-based AI appearing to be intelligent but lacking any real understanding of its own actions was explored last year in this article in MIT Technology Review.