Conditional Sequence Generative Adversarial Learning with Attention-Based Rewards

  • 蔡 子健

Student thesis: Master's Thesis


With the rapid development of deep learning techniques, more and more neural networks have been proposed to solve intricate sequence-generation problems. Generative adversarial learning is one of the most novel strategies. Models applying this idea, generally called Generative Adversarial Nets (GANs), consist of two parts: a generator and a discriminator. The discriminator guides the training of the generator so that the two components improve each other. GANs have already made great contributions to image processing; however, their effectiveness on text generation has been shown to be unstable. Three major limitations make it hard for GANs to achieve a breakthrough in Natural Language Processing (NLP). First, when dialogue generation is framed as a decision-making process, the discrete outputs produced by the sampling operation make it difficult to pass gradients from the discriminator back to the generator. Second, prediction errors accumulate during sequence generation because recurrent neural networks (RNNs) follow different strategies at training and test time; this problem is known as "exposure bias". Last but not least, the discriminator can only evaluate a complete sequence; it is difficult to extract a score for the partial sequence generated up to each time step. In summary, resolving this series of problems is the critical factor in applying GANs to the NLP field. In this thesis, we propose a conditional sequence generative adversarial network that solves these problems with an attention-based reward strategy. We jointly train an attention mechanism with the GAN. The model dynamically assigns weights to the feedback from the discriminator before passing it back to the generator, conditioned on the potential associations between words and sentences, which makes the training process more stable and computationally efficient.
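The core idea above — redistributing a sequence-level discriminator score over individual tokens via attention weights — can be illustrated with a minimal sketch. This is not the thesis implementation; the function name `attention_rewards`, the use of a plain softmax over attention scores, and the NumPy setting are all illustrative assumptions:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_rewards(seq_reward, attn_scores):
    """Distribute a single sequence-level reward from the discriminator
    over the tokens of the generated sequence, weighted by attention.

    seq_reward  -- scalar score the discriminator gave the full sequence
    attn_scores -- 1-D array of unnormalized attention scores, one per token
    Returns a per-token reward vector that sums to seq_reward.
    """
    weights = softmax(attn_scores)
    return seq_reward * weights

# Tokens the attention mechanism deems more relevant to the
# discriminator's judgement receive a larger share of the reward.
scores = np.array([0.1, 2.0, 0.5])
per_token = attention_rewards(1.0, scores)
```

In a policy-gradient (REINFORCE-style) update, each token's log-probability would then be scaled by its per-token reward instead of the flat sequence score, giving the generator a finer-grained training signal at every time step.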
Experimental results on synthetic data demonstrate that our model generates better sequences. Moreover, we report a significant improvement over previous baselines on several real-world tasks.
Date of Award: 26 Jul 2018
Original language: English
Supervisor: Hung-Yu Kao (Supervisor)