Follow-up question generation using pattern-based seq2seq with a small corpus for interview coaching

Ming Hsiang Su, Chung Hsien Wu, Kun Yi Huang, Qian Bei Hong, Huai Hung Huang

Research output: Conference article

1 citation (Scopus)

Abstract

The interview is a vital part of the recruitment process and is especially challenging for beginners. In an interactive and natural interview, interviewers ask follow-up questions or request further elaboration when they are not satisfied with the interviewee's initial response. In this study, as only a small interview corpus is available, a pattern-based sequence-to-sequence (Seq2seq) model is adopted for follow-up question generation. First, word clustering is employed to automatically transform the question/answer sentences into sentence patterns, in which each sentence pattern is composed of word classes, to decrease the complexity of the sentence structures. Next, a convolutional neural tensor network (CNTN) is used to select a target sentence in an interviewee's answer turn for follow-up question generation. The selected target sentence pattern is then fed to a Seq2seq model to obtain the corresponding follow-up question pattern. Then the word class positions in the generated follow-up question pattern are filled in with words using a word class table obtained from the training corpus. Finally, an n-gram language model is used to rank the candidate follow-up questions and choose the most suitable one as the response to the interviewee. This study collected 3390 follow-up question and answer sentence pairs for training and evaluation. Five-fold cross-validation was employed, and the experimental results show that the proposed method outperformed the traditional word-based method and achieved more favorable performance according to a statistical significance test.
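The pattern-related steps of the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the word class table is written by hand rather than learned by clustering, the CNTN sentence selection and the Seq2seq model are omitted (the generated question pattern is assumed to be given), and the ranking uses an add-one smoothed bigram model as a stand-in for the n-gram language model. All function names and the toy data are hypothetical.

```python
import itertools
import math
from collections import Counter

# Hypothetical word class table (class label -> member words). In the paper the
# classes come from automatic word clustering; here they are written by hand.
WORD_CLASSES = {
    "<LANG>": ["python", "java"],
    "<ROLE>": ["intern", "engineer"],
}
WORD_TO_CLASS = {w: c for c, words in WORD_CLASSES.items() for w in words}


def to_pattern(sentence):
    """Replace every word that belongs to a class with its class label."""
    return [WORD_TO_CLASS.get(w, w) for w in sentence.lower().split()]


def fill_pattern(pattern):
    """Expand a pattern into all candidate sentences via the word class table."""
    slots = [WORD_CLASSES.get(tok, [tok]) for tok in pattern]
    return [list(words) for words in itertools.product(*slots)]


def train_bigram(corpus):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a token sequence."""
    vocab = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(padded, padded[1:])
    )


def best_question(pattern, unigrams, bigrams):
    """Fill the pattern and return the candidate the language model ranks highest."""
    candidates = fill_pattern(pattern)
    return " ".join(max(candidates, key=lambda c: log_prob(c, unigrams, bigrams)))
```

Given a toy corpus such as `["why do you like python", "what did you build with python", "why did you learn java"]`, calling `best_question(["why", "do", "you", "like", "<LANG>"], *train_bigram(corpus))` fills the `<LANG>` slot with each class member and keeps the candidate that the bigram model scores highest. Working at the pattern level means the generator only has to learn sequences of word classes, which is the reason the abstract gives for using patterns with a small corpus.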

Original language: English
Pages (from-to): 1006-1010
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2018-September
DOI: 10.21437/Interspeech.2018-1007
Publication status: Published - 1 Jan 2018
Event: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018 - Hyderabad, India
Duration: 2 Sep 2018 to 6 Sep 2018

Fingerprint

Statistical tests
Tensors
Significance Test
N-gram
Target
Language Model
Statistical Significance
Statistical test
Corpus
Coaching
Cross-validation
Table
Fold
Tensor
Choose
Clustering
Transform
Decrease
Evaluation
Experimental Results

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

@article{8b6f8bb995f04373b2b8adcda7fc4dec,
title = "Follow-up question generation using pattern-based seq2seq with a small corpus for interview coaching",
abstract = "Interview is a vital part of recruitment process and is especially challenging for the beginners. In an interactive and natural interview, the interviewers would ask follow-up questions or request further elaborations when they are not satisfied with the interviewee's initial response. In this study, as only a small interview corpus is available, a pattern-based sequence to sequence (Seq2seq) model is adopted for follow-up question generation. First, word clustering is employed to automatically transform the question/answer sentences into sentence patterns, in which each sentence pattern is composed of word classes, to decrease the complexity of the sentence structures. Next, the convolutional neural tensor network (CNTN) is used to select a target sentence in an interviewee's answer turn for follow-up question generation. In order to generate the follow-up question pattern, the selected target sentence pattern is fed to a Seq2seq model to obtain the corresponding follow-up question pattern. Then the word class positions in the generated follow-up question sentence pattern is filled in with the words using a word class table obtained from the training corpus. Finally, the n-gram language model is used to rank the candidate follow-up questions and choose the most suitable one as the response to the interviewee. This study collected 3390 follow-up question and answer sentence pairs for training and evaluation. Five-fold cross validation was employed and the experimental results show that the proposed method outperformed the traditional word-based method, and achieved a more favorable performance based on a statistical significance test.",
author = "Su, {Ming Hsiang} and Wu, {Chung Hsien} and Huang, {Kun Yi} and Hong, {Qian Bei} and Huang, {Huai Hung}",
year = "2018",
month = "1",
day = "1",
doi = "10.21437/Interspeech.2018-1007",
language = "English",
volume = "2018-September",
pages = "1006--1010",
journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
issn = "2308-457X",

}

TY - JOUR

T1 - Follow-up question generation using pattern-based seq2seq with a small corpus for interview coaching

AU - Su, Ming Hsiang

AU - Wu, Chung Hsien

AU - Huang, Kun Yi

AU - Hong, Qian Bei

AU - Huang, Huai Hung

PY - 2018/1/1

Y1 - 2018/1/1

AB - Interview is a vital part of recruitment process and is especially challenging for the beginners. In an interactive and natural interview, the interviewers would ask follow-up questions or request further elaborations when they are not satisfied with the interviewee's initial response. In this study, as only a small interview corpus is available, a pattern-based sequence to sequence (Seq2seq) model is adopted for follow-up question generation. First, word clustering is employed to automatically transform the question/answer sentences into sentence patterns, in which each sentence pattern is composed of word classes, to decrease the complexity of the sentence structures. Next, the convolutional neural tensor network (CNTN) is used to select a target sentence in an interviewee's answer turn for follow-up question generation. In order to generate the follow-up question pattern, the selected target sentence pattern is fed to a Seq2seq model to obtain the corresponding follow-up question pattern. Then the word class positions in the generated follow-up question sentence pattern is filled in with the words using a word class table obtained from the training corpus. Finally, the n-gram language model is used to rank the candidate follow-up questions and choose the most suitable one as the response to the interviewee. This study collected 3390 follow-up question and answer sentence pairs for training and evaluation. Five-fold cross validation was employed and the experimental results show that the proposed method outperformed the traditional word-based method, and achieved a more favorable performance based on a statistical significance test.

UR - http://www.scopus.com/inward/record.url?scp=85054976202&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85054976202&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2018-1007

DO - 10.21437/Interspeech.2018-1007

M3 - Conference article

AN - SCOPUS:85054976202

VL - 2018-September

SP - 1006

EP - 1010

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

SN - 2308-457X

ER -