Natural Language Processing: how AI understands our languages


Knowing how an AI understands languages is useful whether you build your own Intelligent Virtual Agent (IVA) or select an “off the shelf” conversational AI platform. Yes, we mean Intelligent Virtual Agent and not chatbot, because Natural Language Processing skills put Virtual Agents in another league. Here are the main technical elements that enable an AI to understand a variety of written and spoken languages.

 

An introduction to NLU and NLP

Natural language processing (NLP) is a subfield of linguistics, computer science, information engineering and artificial intelligence. NLP focuses on the interactions between computers and human languages. It can use deep learning algorithms (a subset of machine learning) and, for spoken input, speech recognition to detect patterns in language.

The word “natural” refers to the languages that humans speak and write, as opposed to the “artificial” or “machine” languages, such as programming languages, that are used by developers.

Natural Language Understanding (NLU) is the first step necessary to achieve Natural Language Processing, because NLU is the science of deducing an intention (Intent) and related information (Entity) from natural conversations through information extraction.

NLU is broken down into three linguistic levels: Syntax (understanding the grammar, in different languages), Semantics (understanding the meaning), and Pragmatics (understanding the entities and intents).

 

Here is an example

Based on a real user’s utterance (input) in a conversation with Konverso’s Virtual Agent: 

Conversation with Konverso's Intelligent Virtual Agent

The user’s intent is to “change a password” and “Windows” is an entity. The Virtual Agent has identified the right intent and then asks for a clarification about the connection mode. 

Based on the user’s answer, the Virtual Agent can send information on “How to change a Windows password over VPN”.

More complex requests can include several entities relating to objects, dates, or elements of context.

These can be software names, but also names that are specific to a company, such as references to an internal portal or department names.
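The example above can be sketched in a few lines of Python. This is only an illustration: the utterance, the keyword patterns and the entity vocabulary are hypothetical, and a real NLU model would be trained on data rather than written by hand.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical hand-written patterns; a production NLU model would be trained on data.
INTENT_PATTERNS = {
    "change_password": re.compile(r"\b(change|reset|update)\b.*\bpassword\b"),
    "install_software": re.compile(r"\binstall\b"),
}

# Hypothetical entity vocabulary: surface form -> entity type.
KNOWN_ENTITIES = {"windows": "operating_system", "vpn": "connection_mode"}


@dataclass
class ParsedUtterance:
    intent: Optional[str]
    entities: Dict[str, str] = field(default_factory=dict)


def parse(utterance: str) -> ParsedUtterance:
    """Return the first matching intent and every known entity in the utterance."""
    text = utterance.lower()
    intent = next(
        (name for name, pattern in INTENT_PATTERNS.items() if pattern.search(text)),
        None,
    )
    entities = {}
    for token in re.findall(r"[a-z0-9]+", text):
        if token in KNOWN_ENTITIES:
            entities[KNOWN_ENTITIES[token]] = token
    return ParsedUtterance(intent, entities)


result = parse("I need to change my Windows password")
print(result.intent, result.entities)
```

In this sketch the connection mode is missing from the parsed entities, which is the kind of gap that would lead a Virtual Agent to ask a clarification question.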

 

Identifying entities 

The role of entities in Natural Language Processing (NLP) is to collect specific pieces of information from the user during the conversation with the Virtual Agent. With this extracted information, a conversational AI can understand the user’s intent and its context, to determine the best answer to a request.

Konverso’s NLP engine includes thousands of entities related to IT service desk and the Digital Workplace (for instance for our Microsoft Virtual Agent and ServiceNow Virtual Agent).

We use the following approaches for identifying entities in the user utterance: Named Entity Recognition (NER) based on Machine Learning; NER based on an ontology of entities (a formal description of knowledge as a set of concepts within a domain and the relationships that hold between them); and NER based on grammar rules.
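Two of these approaches, the ontology lookup and the grammar rules, can be illustrated with a minimal sketch. The mini-ontology, the ticket number format and all function names below are assumptions made for the example, not a description of any vendor's engine; the Machine Learning approach is left out because it requires a trained model.

```python
import re

# Hypothetical mini-ontology: each concept has a parent concept and surface forms.
ONTOLOGY = {
    "outlook": {"is_a": "email_client", "labels": ["outlook", "ms outlook"]},
    "teams": {"is_a": "collaboration_tool", "labels": ["teams", "microsoft teams"]},
}

# Hypothetical grammar rule: a ticket reference is "INC" followed by at least 5 digits.
TICKET_RULE = re.compile(r"\bINC\d{5,}\b")


def ontology_entities(text):
    """Find concepts whose labels appear as whole words in the text."""
    lowered = text.lower()
    found = []
    for concept, node in ONTOLOGY.items():
        for label in node["labels"]:
            if re.search(r"\b" + re.escape(label) + r"\b", lowered):
                found.append({"value": concept, "type": node["is_a"], "source": "ontology"})
                break  # one hit per concept is enough
    return found


def rule_entities(text):
    """Find entities that follow a fixed written pattern."""
    return [
        {"value": match.group(), "type": "ticket_id", "source": "grammar_rule"}
        for match in TICKET_RULE.finditer(text)
    ]


def extract_entities(text):
    return ontology_entities(text) + rule_entities(text)


entities = extract_entities("Microsoft Teams crashes, see ticket INC0012345")
for entity in entities:
    print(entity)
```

The ontology gives each entity a type through its parent concept, while the rule catches values that cannot be listed in advance.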

 

Understanding intents

Understanding human interactions relies on the ability to identify the user’s intention, extract useful information from their utterance, and map both to relevant actions or tasks.

Konverso has built a rich NLU model that enables our Virtual Agent to detect a user’s intents and entities with very high accuracy thanks to semantic analysis. In fact, Konverso’s method is unique because we use several NLP engines to process the user input against several NLU algorithms, and we then rank the results.
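The article does not detail how the results of several engines are ranked. As one possible illustration, the sketch below combines the scores of two toy engines with a weighted sum. The engines, the weights and the example phrases are all hypothetical.

```python
from collections import defaultdict


def keyword_engine(utterance):
    """Toy engine: a single keyword triggers a high-confidence guess."""
    if "password" in utterance.lower():
        return [("change_password", 0.9)]
    return [("unknown", 0.1)]


def overlap_engine(utterance):
    """Toy engine: Jaccard similarity between the utterance and one example per intent."""
    examples = {
        "change_password": "i want to change my password",
        "unlock_account": "my account is locked",
    }
    words = set(utterance.lower().split())
    scores = []
    for intent, example in examples.items():
        example_words = set(example.split())
        scores.append((intent, len(words & example_words) / len(words | example_words)))
    return scores


def rank_intents(utterance, engines):
    """Sum the weighted scores of every engine and sort intents, best first."""
    totals = defaultdict(float)
    for engine, weight in engines:
        for intent, score in engine(utterance):
            totals[intent] += weight * score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank_intents(
    "please change my password",
    [(keyword_engine, 0.5), (overlap_engine, 0.5)],
)
best_intent = ranking[0][0]
print(ranking)
```

A weighted sum is only one option; voting or a learned ranking model would fit the same interface.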

As a result, the Virtual Agent will match the user’s intent with the most relevant Knowledge available in its Knowledge bases, FAQs, or any enterprise content it has been allowed to access.

The Virtual Agent’s accuracy will also improve over time, as machine learning algorithms and learning models integrate new Knowledge from past conversations with users.

 

Understanding human errors 

To understand natural languages, an AI needs more than vocabulary and grammar rules. A conversational AI also needs to understand human errors, because they are inherent to human nature. 

This means that to fully understand human language, an AI needs to recognise the kinds of mistakes a human could make while engaging in conversation and making a request.

For instance, we need to anticipate spelling errors from the user, to enable the Virtual Agent to make automatic typographic corrections in the input.

To solve this problem, it is possible to use a typographic approach that determines the user’s keyboard layout based on their language (QWERTY, QWERTZ, AZERTY). The AI will then correct spelling errors based on neighbouring keys or other predictable typographic errors.

To be really accurate, those corrections need to be done not only against standard dictionaries but also against the Virtual Agent’s built-in corpus and the enterprise’s corpus.

This enterprise corpus is made of all the defined custom terms found in extracted entities and other validated textual content. For instance, the names of the enterprise’s departments or in-house solutions, and other company jargon that users would naturally use in a conversation.

Using a generic corpus without customisation often leads to unwarranted corrections, and is a disappointing reminder for users that they are talking with a machine.
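The keyboard-aware correction and the effect of an enterprise corpus can be sketched as follows. This is a simplified illustration under stated assumptions: the QWERTY neighbour map ignores the physical stagger between rows, the substitution cost of 0.5 for neighbouring keys is arbitrary, and “konnect” is a made-up in-house product name.

```python
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_neighbors(rows):
    """Map each key to the keys around it (approximate: ignores row stagger)."""
    neighbors = {}
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            near = set()
            for rr in (r - 1, r, r + 1):
                if 0 <= rr < len(rows):
                    for cc in (c - 1, c, c + 1):
                        if 0 <= cc < len(rows[rr]):
                            near.add(rows[rr][cc])
            near.discard(key)
            neighbors[key] = near
    return neighbors


def typo_distance(typed, target, neighbors):
    """Edit distance where replacing a key by a neighbouring key costs 0.5, not 1."""
    prev = [float(j) for j in range(len(target) + 1)]
    for i, a in enumerate(typed, start=1):
        cur = [float(i)]
        for j, b in enumerate(target, start=1):
            if a == b:
                sub = 0.0
            elif b in neighbors.get(a, ()):
                sub = 0.5
            else:
                sub = 1.0
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + sub))
        prev = cur
    return prev[-1]


def correct(word, vocabulary, neighbors, max_distance=1.0):
    """Return the closest vocabulary word, or the input if nothing is close enough."""
    word = word.lower()
    if word in vocabulary:
        return word
    best = min(sorted(vocabulary), key=lambda cand: typo_distance(word, cand, neighbors))
    return best if typo_distance(word, best, neighbors) <= max_distance else word


NEIGHBORS = build_neighbors(QWERTY_ROWS)
STANDARD = {"connect", "password", "printer"}
ENTERPRISE = {"konnect"}  # hypothetical in-house product name

print(correct("passwprd", STANDARD | ENTERPRISE, NEIGHBORS))
print(correct("konnect", STANDARD, NEIGHBORS))
print(correct("konnect", STANDARD | ENTERPRISE, NEIGHBORS))
```

With the standard dictionary alone, the in-house name is rewritten to the nearest generic word; once the enterprise corpus is added, it is left untouched.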

 

Natural languages also imply Small Talk

To reach the maximum level of understanding of natural languages, and not “sound like a robot”, a Virtual Agent also needs to understand “small talk”, because chit-chat or light conversation is a natural way to grease the wheels of an interaction and create a connection with users.

Being able to answer “great” to a simple question like “How are you” and reply “you are welcome” to a “thank you” message helps make the Virtual Agent sound more natural. It is also a way to show users that they are truly heard and that their requests are understood, not just processed.

This ability to express empathy through social messages, also called “phatics”, is very important, especially in contexts where users express frustration or have negative preconceptions about chatbots.

That’s why the most advanced Virtual Agents integrate fine-tuned “phatics” and even jokes and the ability to talk about the weather depending on the user’s location.  

Small talk with Konverso's Intelligent Virtual Agent
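A minimal sketch of a small-talk layer, assuming it runs before the main intent detection, could look like this. The patterns and replies are hypothetical examples, not an actual phatics library.

```python
import re

# Hypothetical phatic patterns and replies, checked in order.
SMALL_TALK = [
    (re.compile(r"\bhow are you\b"), "Great, thank you! How can I help?"),
    (re.compile(r"\b(thanks|thank you)\b"), "You are welcome!"),
    (re.compile(r"^(hi|hello|hey)\b"), "Hello! How can I help?"),
]


def small_talk_reply(message):
    """Return a social reply, or None so the message goes on to intent detection."""
    normalized = re.sub(r"[^a-z\s]", "", message.lower()).strip()
    for pattern, reply in SMALL_TALK:
        if pattern.search(normalized):
            return reply
    return None


print(small_talk_reply("How are you?"))
print(small_talk_reply("Thank you!"))
print(small_talk_reply("My printer is broken"))
```

Returning None for anything that is not small talk keeps this layer from intercepting real requests.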

A two-way conversation with several rounds

We have detailed how AI understands languages, but let’s not forget that our real objective is to have a Virtual Agent capable of carrying a conversation.

This implies the ability to identify not just one intent but several intents and orchestrate a conversation and multiple actions in an order that is relevant. This capacity is called “dialog management”. 

A typical example of this ability to prioritize answers is how a Virtual Agent could first direct the user to a “self-service” knowledge article before proposing a corrective action on a system.

If several intents are close enough to be ambiguous, the Virtual Agent can ask disambiguation questions, propose alternatives, and carry on the conversation to find more relevant solutions.

Another important aspect of the “conversational intelligence” of a Virtual Agent is its ability to handle digressions.

Digressions are another very human way to carry a conversation. If the user decides to skip some questions, the Virtual Agent should not remain stuck in its scenario. This means that the Virtual Agent should be able to come back to clarification questions and “slot filling” questions (asking the user specific questions to narrow down the request) to stay focused on the user’s intent.
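Slot filling with a return to the pending question after a digression can be sketched as below. The slot names, questions and accepted values are hypothetical, and a real dialog manager would also answer the digression itself instead of only acknowledging it.

```python
import re
from dataclasses import dataclass, field

# Hypothetical slot definitions for a single intent.
REQUIRED_SLOTS = {"change_password": ["system", "connection_mode"]}
SLOT_QUESTIONS = {
    "system": "Which system is the password for?",
    "connection_mode": "Are you connected over VPN or on the office network?",
}
SLOT_VALUES = {
    "system": {"windows", "sap"},
    "connection_mode": {"vpn", "office"},
}


@dataclass
class Dialog:
    intent: str
    slots: dict = field(default_factory=dict)

    def next_question(self):
        """Return the question for the first unfilled slot, or None when complete."""
        for slot in REQUIRED_SLOTS[self.intent]:
            if slot not in self.slots:
                return SLOT_QUESTIONS[slot]
        return None

    def handle(self, message):
        words = set(re.findall(r"[a-z]+", message.lower()))
        filled = False
        for slot, values in SLOT_VALUES.items():
            match = words & values
            if match:
                self.slots[slot] = match.pop()
                filled = True
        pending = self.next_question()
        if pending is None:
            return "Thank you, I have everything I need for your request."
        if not filled:
            # Digression: acknowledge it, then come back to the pending question.
            return "Noted. To continue with your request: " + pending
        return pending


dialog = Dialog("change_password", {"system": "windows"})
first_reply = dialog.handle("By the way, what time is it?")
second_reply = dialog.handle("I am on VPN")
print(first_reply)
print(second_reply)
```

The dialog state lives in the filled slots, so the agent can resume where it left off however many turns the digression takes.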

 

The challenges of foreign languages

Even if the English language is used widely across the world, it is important to remember that not all users can have written or voice conversations in English. To be considered a “digital workforce” working in true partnership with human users, a conversational AI must therefore master local languages fluently.

In order to be multilingual, an AI can either integrate language-specific versions of its NLU and Natural Language Generation (NLG) capabilities (training sets, entities, etc.) or, alternatively, translate the user input and the bot responses directly using a Machine Translation component.
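The second option, wrapping a single-language agent with a translation component, can be sketched as follows. The translator here is a toy phrase table standing in for a real Machine Translation service, and every name in the block is hypothetical.

```python
def make_translated_agent(agent, translate, agent_lang="en"):
    """Wrap an agent that speaks one language so it can reply in the user's language."""
    def reply(message, user_lang):
        if user_lang == agent_lang:
            return agent(message)
        understood = translate(message, user_lang, agent_lang)
        return translate(agent(understood), agent_lang, user_lang)
    return reply


# Toy stand-in for a Machine Translation component: (source, target) -> phrase table.
PHRASES = {
    ("fr", "en"): {"merci": "thank you"},
    ("en", "fr"): {"you are welcome": "de rien"},
}


def toy_translate(text, source, target):
    # Unknown phrases are returned unchanged in this toy version.
    return PHRASES[(source, target)].get(text.lower(), text)


def english_agent(message):
    return "you are welcome" if message == "thank you" else "sorry, I did not understand"


french_agent = make_translated_agent(english_agent, toy_translate)
print(french_agent("Merci", "fr"))
print(french_agent("thank you", "en"))
```

The trade-off is that translation errors propagate into intent detection, whereas language-specific NLU models avoid that step at the cost of maintaining training sets and entities per language.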

With the introduction of Neural Machine Translation (NMT), translation quality has greatly improved in the last few years, and this language technology keeps getting better.

 

With Konverso’s virtual agents, our clients increase end-user satisfaction by 80%.

Contact our team to learn more about Konverso’s Virtual Agents.