How dialogue systems learn

July 31, 2017 Maluuba

We’re seeing more applications of conversational AI, through in-home speakers, in-car assistants and on our phones. These dialogue systems are interfaces that users interact with using words.

To be effective, dialogue systems need to master language well enough both to understand users’ sentences and to generate new sentences of their own.

To achieve these capabilities, the research community has been exploring a two-step process. First, given dialogue history, the system learns to generate sentences that are sensible in this context. If I’m asking several questions about movies playing at my local theatre (films, times, ticket availability), the system needs to remember and use this contextual information.

Beyond this, the system learns to select between the different possible sentences depending on what it is trying to achieve. For instance, a system which tries to maximize user engagement learns to ask questions to keep the dialogue moving forward.

One can think of this process as language acquisition followed by learning to accomplish tasks through language. The first step of language acquisition is often performed by supervised learning: the system observes sentences produced by humans and learns how to map the sentence generated by one speaker to the next sentence generated by the second speaker. After observing many sentence-response pairs, the system can produce reasonable responses for a given dialogue context.
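As a minimal sketch of this supervised step, the toy model below simply counts which response most often follows each utterance in a handful of sentence-response pairs. The data, the function names `train_response_model` and `respond`, and the exact-match lookup are all assumptions made for illustration; real systems learn a neural mapping that generalizes to unseen contexts rather than a lookup table.

```python
from collections import Counter, defaultdict


def train_response_model(pairs):
    """Count how often each response follows each utterance in the data."""
    counts = defaultdict(Counter)
    for utterance, response in pairs:
        counts[utterance.lower()][response] += 1
    return counts


def respond(model, utterance, fallback="Could you rephrase that?"):
    """Return the response most often observed after this utterance."""
    seen = model.get(utterance.lower())
    if not seen:
        # The utterance never appeared in the training pairs.
        return fallback
    return seen.most_common(1)[0][0]


# Hypothetical sentence-response pairs standing in for human dialogue data.
pairs = [
    ("what movies are playing tonight?", "There are three films showing tonight."),
    ("what movies are playing tonight?", "There are three films showing tonight."),
    ("what movies are playing tonight?", "I am not sure."),
    ("are tickets still available?", "Yes, seats are still available."),
]
model = train_response_model(pairs)
print(respond(model, "What movies are playing tonight?"))
```

The fallback branch shows the limitation of this toy version: it can only answer utterances it has already seen, which is why learned models that generalize across contexts are used in practice.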

The next step consists of generating words, not only based on what would be reasonable, but also based on the system’s goal. This goal might be maximizing user engagement, helping a user to accomplish a task such as booking a movie ticket, or playing a game of 20-questions.

The system is trained to perform this step through reinforcement learning. During this step, the system learns how to plan the responses it generates based on the long-term outcome that they will have. For instance, in the case of maximizing user engagement, the system learns that asking questions results in longer dialogues compared to generating declarative responses.

A major challenge with teaching a dialogue system to plan is that a user simulator is needed. The system will generate sentences that differ from the ones observed in the data, so it is necessary to simulate how a user would respond to these new sentences in order to estimate their long-term outcome.
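The reinforcement learning step and its reliance on a simulator can be sketched roughly as follows. This is a toy, single-state example rather than the method used in any particular system: the simulated user is just a pair of assumed probabilities (`CONTINUE_PROB`) of replying after a question or a statement, and the learner uses every-visit Monte Carlo value estimates with epsilon-greedy exploration. The reward for a turn is the number of user replies still to come in the dialogue.

```python
import random

# Assumed simulator behaviour: probability that the simulated user replies.
CONTINUE_PROB = {"question": 0.8, "statement": 0.4}


def run_episode(q_values, rng, epsilon=0.2, max_turns=20):
    """Play one simulated dialogue; return a list of (action, continued) turns."""
    turns = []
    for _ in range(max_turns):
        if rng.random() < epsilon:
            action = rng.choice(list(CONTINUE_PROB))  # explore
        else:
            action = max(q_values, key=q_values.get)  # exploit current estimate
        continued = rng.random() < CONTINUE_PROB[action]
        turns.append((action, continued))
        if not continued:
            break  # the simulated user left the conversation
    return turns


def train(episodes=2000, seed=0):
    """Estimate the long-term value of each response style from simulated dialogues."""
    rng = random.Random(seed)
    q_values = {action: 0.0 for action in CONTINUE_PROB}
    counts = {action: 0 for action in CONTINUE_PROB}
    for _ in range(episodes):
        turns = run_episode(q_values, rng)
        future = 0  # user replies from this turn to the end of the dialogue
        for action, continued in reversed(turns):
            future += 1 if continued else 0
            counts[action] += 1
            # Running average of the observed long-term outcome for this action.
            q_values[action] += (future - q_values[action]) / counts[action]
    return q_values


q_values = train()
print(max(q_values, key=q_values.get))
```

Under these assumed probabilities the learner ends up valuing questions more highly than statements, because questions lead to longer simulated dialogues. The result is only as trustworthy as `CONTINUE_PROB`, which is the simulator problem described above.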

In other words, to efficiently train a dialogue system… we need a dialogue system! To overcome this limitation, research has moved towards making systems learn faster so that they can learn by communicating with humans directly.

We’re working towards a future where users program their AI assistants through language, by directly giving them feedback on how they perform.

Layla El Asri, Research Manager, Maluuba

How dialogue systems learn was originally published in Machine Learnings on Medium, where people are continuing the conversation by highlighting and responding to this story.
