Artificial intelligence still lacks common sense


Natural language processing (NLP) has made great strides recently, but how much does artificial intelligence actually understand of what it reads? Less than we thought, according to researchers from the Department of Computer Science at USC.

In a recent paper, Assistant Professor Xiang Ren and PhD student Yuchen Lin found that despite these advances, artificial intelligence still lacks the common sense needed to generate plausible sentences.

"Current automatic text generation models can write an article that may be compelling to many humans, but they are basically mimicking what they have seen in the training phase," Lin said. "Our goal in this work is to study the problem of whether current next-generation text generation models can write sentences to describe natural scenarios in our daily lives."

Specifically, Ren and Lin tested the models' reasoning ability and showed that there is a large gap between current text generation models and human performance. Given a series of common nouns and verbs, state-of-the-art NLP models were tasked with composing credible sentences describing an everyday scenario. While the models generated grammatically correct sentences, they were often logically incoherent.

For example, here is a sentence generated by a state-of-the-art model given the words "dog, frisbee, throw, catch":

"Two dogs are throwing frisbees at each other."

The test is based on the assumption that coherent ideas (in this case: "a person throws a frisbee and a dog catches it") cannot be generated without a deeper awareness of common-sense concepts. In other words, common sense is more than just a correct understanding of language; it means not having to explain everything in a conversation. This is a fundamental challenge in the quest to develop generalizable artificial intelligence, but beyond the academic realm, it also matters for consumers.

The best natural language processing systems still generate nonsense phrases, such as "two dogs throw Frisbees at each other." (Photo: Adriana Sánchez)

Without this understanding, chatbots and voice assistants built on these state-of-the-art natural language models are vulnerable to failure. Common sense is also crucial if robots are to become more present in human environments. After all, if you ask a robot for hot milk, you expect it to know that you want a cup of milk, not the entire carton.

Common-sense reasoning, or the ability to make inferences using basic knowledge about the world - such as the fact that dogs cannot throw frisbees to each other - has resisted the efforts of artificial intelligence researchers for decades. State-of-the-art deep learning models can now reach around 90% accuracy on some benchmarks, so it may appear that NLP has come close to its goal.

"Human beings acquire the ability to compose sentences by learning to understand and use common concepts that they recognize in their environment," Lin said. "Acquiring this ability is considered an important milestone in human development. But we wanted to test whether machines can actually acquire such common sense generative reasoning ability.

To evaluate different models, the pair developed a constrained text generation task called CommonGen, which can serve as a benchmark for testing the generative common sense of machines. The researchers presented a dataset consisting of 35,141 concept sets associated with 77,449 sentences. They found that even the best-performing model only achieved an accuracy rate of 31.6%, versus 63.5% for humans.
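The task format described above can be illustrated with a minimal sketch. The example data and the naive word-prefix matching below are illustrative assumptions, not the benchmark's actual evaluation, which relies on proper lemma matching and reference-based metrics; the sketch only checks whether a candidate sentence mentions every required concept, a necessary but not sufficient condition for a good generation.

```python
def covers_concepts(sentence: str, concepts: set[str]) -> bool:
    """Naively check that every concept appears in the sentence.

    A concept counts as covered if some word in the sentence starts
    with it (so "throws" covers "throw"). Real CommonGen evaluation
    uses lemmatization and reference-based metrics instead.
    """
    words = sentence.lower().replace(".", "").split()
    return all(any(w.startswith(c) for w in words) for c in concepts)


# Hypothetical example in the task format: a concept set plus candidates.
concepts = {"dog", "frisbee", "throw", "catch"}

# The incoherent model output quoted in the article:
model_output = "Two dogs are throwing frisbees at each other."
# A plausible human sentence for the same concept set:
human_output = "A person throws a frisbee and the dog catches it."

print(covers_concepts(human_output, concepts))  # True: all four concepts used
print(covers_concepts(model_output, concepts))  # False: "catch" never appears
```

Note that coverage alone cannot detect incoherence: a sentence could mention all four concepts and still describe an impossible scene, which is exactly why the benchmark compares generations against human-written references.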

"We were surprised that the models couldn't recall the simple common sense knowledge that 'a human throwing a frisbee' should be much more reasonable than a dog throwing it," Lin said. "We found that even the strongest model, called T5, after training with a large data set, can still make silly mistakes."

It appears, the researchers said, that past tests have not sufficiently challenged the models' common-sense abilities, instead allowing them to mimic what they have seen in the training phase.
