Attention-based response generation using parallel double Q-learning for dialog policy decision in a conversational system

Research output: Contribution to journal, Article

Abstract

This article proposes an approach to response generation that uses a Parallel Double Q-learning algorithm for dialog policy decision in a conversational system. First, a new semantic representation of the user's input sentence is obtained by using the CKIP parser to derive the semantic dependency sequence of the input sentence. Then, a Gated Recurrent Unit-based Autoencoder is used to obtain the user's turn representation as well as the context representation. A Parallel Double Q-learning algorithm with a Deep Neural Network (PD-DQN), which combines two Double DQNs in parallel for the contextual and semantic information in the user's message, respectively, is proposed to determine the dialog act. Finally, the user's input and the determined dialog act are fed to an attention-based Transformer model to generate the response template. The semantic slots in the generated response template are then filled with their corresponding values to obtain the final sentence response. This article collects a multi-turn conversation database consisting of 4186 turns in the travel domain and 447 chitchat question-answer pairs as the evaluation corpus. Five-fold cross-validation is employed for performance evaluation. Experimental results show that the proposed approach based on semantic dependency for intent detection increases the accuracy by 4.3%. For dialog policy decision, the PD-DQN achieves an 87.57% task success rate, which is 13.9 percentage points higher than the baseline Double DQN (73.67%). Finally, using the attention-based Transformer for response template generation obtains a BLEU score of 13.6, an improvement of 1.5 over the Sequence-to-Sequence model. In subjective evaluation, both the dialog policy and the sentence generation model achieve higher appropriateness and grammatical correctness scores than the baseline system.
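The policy and slot-filling steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Double DQN target is the standard formulation (the online network selects the next action, the target network scores it). Everything else is an assumption made for illustration: summing the two branches' Q-values to choose a dialog act, the `<slot>` placeholder syntax, and all function names and example values.

```python
import re
from typing import Dict, Sequence


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float],
                      done: bool) -> float:
    """Standard Double DQN target for a single transition."""
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # The online network selects the greedy next action...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...and the target network evaluates that action.
    return reward + gamma * q_target_next[best_action]


def select_dialog_act(q_context: Sequence[float],
                      q_semantic: Sequence[float]) -> int:
    """Pick a dialog act from two parallel Q-value vectors.

    ASSUMPTION: the two branches are combined by summation. The abstract
    does not state how the parallel Double DQNs are merged.
    """
    combined = [c + s for c, s in zip(q_context, q_semantic)]
    return max(range(len(combined)), key=lambda a: combined[a])


def fill_template(template: str, slots: Dict[str, str]) -> str:
    """Replace <slot> placeholders (hypothetical syntax) with slot values.

    Placeholders with no matching slot value are left unchanged.
    """
    return re.sub(r"<(\w+)>",
                  lambda m: slots.get(m.group(1), m.group(0)),
                  template)


if __name__ == "__main__":
    print(double_dqn_target(1.0, 0.5, [0.2, 0.8, 0.5], [1.0, 2.0, 3.0], False))
    print(select_dialog_act([1.0, 3.0, 2.0], [2.5, 1.0, 0.5]))
    print(fill_template("The <hotel> is in <city>.",
                        {"hotel": "Grand Hotel", "city": "Tainan"}))
```

In the first call the online network prefers action 1, so the target uses the target network's value for action 1 rather than its own maximum, which is how Double Q-learning reduces overestimation.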

Original language: English
Article number: 8883052
Pages (from-to): 131-143
Number of pages: 13
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 28
DOIs: 10.1109/TASLP.2019.2949687
Publication status: Published - 2020 Jan 1

Fingerprint

Q-learning
semantics
sentences
learning
Template
Transformer
Learning algorithms
evaluation
Baseline
conversation
Subjective Evaluation
messages
Cross-validation
slots
travel
Performance Evaluation

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering

Cite this

@article{b37ef571c61849a0b28c247711ed32b4,
title = "Attention-based response generation using parallel double Q-learning for dialog policy decision in a conversational system",
abstract = "This article proposes an approach to response generation that uses a Parallel Double Q-learning algorithm for dialog policy decision in a conversational system. First, a new semantic representation of the user's input sentence is obtained by using the CKIP parser to derive the semantic dependency sequence of the input sentence. Then, a Gated Recurrent Unit-based Autoencoder is used to obtain the user's turn representation as well as the context representation. A Parallel Double Q-learning algorithm with a Deep Neural Network (PD-DQN), which combines two Double DQNs in parallel for the contextual and semantic information in the user's message, respectively, is proposed to determine the dialog act. Finally, the user's input and the determined dialog act are fed to an attention-based Transformer model to generate the response template. The semantic slots in the generated response template are then filled with their corresponding values to obtain the final sentence response. This article collects a multi-turn conversation database consisting of 4186 turns in the travel domain and 447 chitchat question-answer pairs as the evaluation corpus. Five-fold cross-validation is employed for performance evaluation. Experimental results show that the proposed approach based on semantic dependency for intent detection increases the accuracy by 4.3{\%}. For dialog policy decision, the PD-DQN achieves an 87.57{\%} task success rate, which is 13.9 percentage points higher than the baseline Double DQN (73.67{\%}). Finally, using the attention-based Transformer for response template generation obtains a BLEU score of 13.6, an improvement of 1.5 over the Sequence-to-Sequence model. In subjective evaluation, both the dialog policy and the sentence generation model achieve higher appropriateness and grammatical correctness scores than the baseline system.",
author = "Su, {Ming Hsiang} and Wu, {Chung Hsien} and Chen, {Liang Yu}",
year = "2020",
month = "1",
day = "1",
doi = "10.1109/TASLP.2019.2949687",
language = "English",
volume = "28",
pages = "131--143",
journal = "IEEE/ACM Transactions on Audio, Speech, and Language Processing",
issn = "2329-9290",
publisher = "IEEE",

}

TY - JOUR

T1 - Attention-based response generation using parallel double Q-learning for dialog policy decision in a conversational system

AU - Su, Ming Hsiang

AU - Wu, Chung Hsien

AU - Chen, Liang Yu

PY - 2020/1/1

Y1 - 2020/1/1

N2 - This article proposes an approach to response generation that uses a Parallel Double Q-learning algorithm for dialog policy decision in a conversational system. First, a new semantic representation of the user's input sentence is obtained by using the CKIP parser to derive the semantic dependency sequence of the input sentence. Then, a Gated Recurrent Unit-based Autoencoder is used to obtain the user's turn representation as well as the context representation. A Parallel Double Q-learning algorithm with a Deep Neural Network (PD-DQN), which combines two Double DQNs in parallel for the contextual and semantic information in the user's message, respectively, is proposed to determine the dialog act. Finally, the user's input and the determined dialog act are fed to an attention-based Transformer model to generate the response template. The semantic slots in the generated response template are then filled with their corresponding values to obtain the final sentence response. This article collects a multi-turn conversation database consisting of 4186 turns in the travel domain and 447 chitchat question-answer pairs as the evaluation corpus. Five-fold cross-validation is employed for performance evaluation. Experimental results show that the proposed approach based on semantic dependency for intent detection increases the accuracy by 4.3%. For dialog policy decision, the PD-DQN achieves an 87.57% task success rate, which is 13.9 percentage points higher than the baseline Double DQN (73.67%). Finally, using the attention-based Transformer for response template generation obtains a BLEU score of 13.6, an improvement of 1.5 over the Sequence-to-Sequence model. In subjective evaluation, both the dialog policy and the sentence generation model achieve higher appropriateness and grammatical correctness scores than the baseline system.

UR - http://www.scopus.com/inward/record.url?scp=85077190562&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85077190562&partnerID=8YFLogxK

U2 - 10.1109/TASLP.2019.2949687

DO - 10.1109/TASLP.2019.2949687

M3 - Article

AN - SCOPUS:85077190562

VL - 28

SP - 131

EP - 143

JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing

JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing

SN - 2329-9290

M1 - 8883052

ER -