Imitation learning for sentence generation with dilated convolutions using adversarial training

Jian Wei Peng, Min-Chun Hu, Chuan Wang Chang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this work, we treat sentence generation as an imitation learning problem, in which the goal is to learn a policy that mimics an expert. Recent works have shown that adversarial learning can be applied to imitation learning problems. However, it has been indicated that the reward signal from the discriminator is not robust in reinforcement learning (RL)-based generative adversarial networks (GANs), and that estimating the state-action value is usually computationally intractable. To deal with these problems, we propose to use two discriminators that provide two different reward signals, yielding a more general imitation learning framework for sequence generation. Monte Carlo (MC) rollout is therefore unnecessary, which keeps our algorithm computationally tractable when generating long sequences. Furthermore, our policy and discriminator networks are integrated by sharing a state encoder network built on dilated convolutions instead of recurrent neural networks (RNNs). In our experiments, we show that the two reward signals control the trade-off between the quality and the diversity of the output sequences.
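
The abstract only sketches the architecture, so the following is a minimal, hypothetical PyTorch sketch of the general idea it describes: a causal dilated-convolution state encoder shared by a policy head and two discriminator heads, whose per-step outputs are mixed into a single reward. All names (DilatedStateEncoder, PolicyAndCritics, mixed_reward), layer sizes, the roles of the two heads, and the mixing weight alpha are assumptions for illustration, not the authors' released implementation.

```python
# Hypothetical sketch, not the paper's code: a shared dilated-convolution
# state encoder feeding a policy head and two discriminator heads.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedStateEncoder(nn.Module):
    """Encodes a token prefix into per-step states with causal dilated convolutions."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=128, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.dilations = dilations
        self.convs = nn.ModuleList()
        in_ch = emb_dim
        for d in dilations:
            # kernel_size=2 plus left padding keeps each position causal
            self.convs.append(nn.Conv1d(in_ch, hid_dim, kernel_size=2, dilation=d))
            in_ch = hid_dim

    def forward(self, tokens):                  # tokens: (batch, seq_len) of ids
        x = self.embed(tokens).transpose(1, 2)  # (batch, emb_dim, seq_len)
        for conv, d in zip(self.convs, self.dilations):
            x = F.pad(x, (d, 0))                # left-pad so step t sees only <= t
            x = torch.relu(conv(x))
        return x.transpose(1, 2)                # (batch, seq_len, hid_dim)

class PolicyAndCritics(nn.Module):
    """Policy head plus two discriminator heads on top of the shared encoder."""
    def __init__(self, vocab_size, hid_dim=128):
        super().__init__()
        self.encoder = DilatedStateEncoder(vocab_size, hid_dim=hid_dim)
        self.policy_head = nn.Linear(hid_dim, vocab_size)  # next-token logits
        self.disc_a = nn.Linear(hid_dim, 1)  # first per-step reward signal (role assumed)
        self.disc_b = nn.Linear(hid_dim, 1)  # second per-step reward signal (role assumed)

    def forward(self, tokens):
        states = self.encoder(tokens)
        logits = self.policy_head(states)
        r_a = torch.sigmoid(self.disc_a(states)).squeeze(-1)  # (batch, seq_len)
        r_b = torch.sigmoid(self.disc_b(states)).squeeze(-1)
        return logits, r_a, r_b

def mixed_reward(r_a, r_b, alpha=0.5):
    # alpha is a hypothetical knob for the quality/diversity trade-off
    # that the abstract attributes to the two reward signals.
    return alpha * r_a + (1.0 - alpha) * r_b
```

Because the encoder is causal, every prefix position already yields a state and hence a per-step reward in this sketch, which is consistent with the abstract's claim that Monte Carlo rollout is not required when generating long sequences.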

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 435-440
Number of pages: 6
ISBN (Electronic): 9781538692141
DOIs: https://doi.org/10.1109/ICMEW.2019.00081
Publication status: Published - 2019 Jul 1
Event: 2019 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2019 - Shanghai, China
Duration: 2019 Jul 8 - 2019 Jul 12

Publication series

Name: Proceedings - 2019 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2019

Conference

Conference: 2019 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2019
Country: China
City: Shanghai
Period: 19-07-08 to 19-07-12

Fingerprint

  • Discriminators
  • Convolution
  • Recurrent neural networks
  • Reinforcement learning
  • Experiments

All Science Journal Classification (ASJC) codes

  • Media Technology
  • Computer Vision and Pattern Recognition

Cite this

Peng, J. W., Hu, M-C., & Chang, C. W. (2019). Imitation learning for sentence generation with dilated convolutions using adversarial training. In Proceedings - 2019 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2019 (pp. 435-440). [8795047] (Proceedings - 2019 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICMEW.2019.00081