CNN and LSTM Based Facial Expression Analysis Model for a Humanoid Robot

Tzuu Hseng S. Li, Ping Huan Kuo, Ting Nan Tsai, Po Chien Luan

Research output: Contribution to journal › Article

Abstract

Robots must be able to recognize human emotions to improve the human-robot interaction (HRI). This study proposes an emotion recognition system for a humanoid robot. The robot is equipped with a camera to capture users' facial images, and it uses this system to recognize users' emotions and responds appropriately. The emotion recognition system, based on a deep neural network, learns six basic emotions: happiness, anger, disgust, fear, sadness, and surprise. First, a convolutional neural network (CNN) is used to extract visual features by learning on a large number of static images. Second, a long short-term memory (LSTM) recurrent neural network is used to determine the relationship between the transformation of facial expressions in image sequences and the six basic emotions. Third, CNN and LSTM are combined to exploit their advantages in the proposed model. Finally, the performance of the emotion recognition system is improved by using transfer learning, that is, by transferring knowledge of related but different problems. The performance of the proposed system is verified through leave-one-out cross-validation and compared with that of other models. The system is applied to a humanoid robot to demonstrate its practicability for improving the HRI.
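The abstract states that the system is verified through leave-one-out cross-validation but does not say which unit is held out in each fold. The following is a minimal sketch, assuming that each fold holds out all samples belonging to one group (for example, one subject); the function names `leave_one_group_out` and `cross_validate`, and the `train_fn`/`predict_fn` callables, are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_one_group_out(
    groups: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_group, train_indices, test_indices), one fold per group.

    Assumption: the held-out unit is a group label such as a subject ID.
    With one sample per group this reduces to plain leave-one-out.
    """
    by_group: Dict[Hashable, List[int]] = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)

    for held_out, test_idx in by_group.items():
        # Every sample that does not belong to the held-out group is used for training.
        train_idx = [i for g, idx in by_group.items() if g != held_out for i in idx]
        yield held_out, train_idx, test_idx


def cross_validate(
    samples: Sequence,
    labels: Sequence,
    groups: Sequence[Hashable],
    train_fn: Callable,
    predict_fn: Callable,
) -> float:
    """Return the mean per-fold accuracy over all leave-one-group-out folds."""
    accuracies = []
    for _, train_idx, test_idx in leave_one_group_out(groups):
        # Fit a fresh model on the training portion of this fold only.
        model = train_fn(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        hits = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

Here `train_fn(samples, labels)` would wrap whatever model is being evaluated (such as a combined CNN and LSTM network) and `predict_fn(model, sample)` would return its predicted emotion label. Training a new model per fold keeps the held-out group unseen during fitting.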

Original language: English
Article number: 8760246
Pages (from-to): 93998-94011
Number of pages: 14
Journal: IEEE Access
Volume: 7
DOI: 10.1109/ACCESS.2019.2928364
Publication status: Published - 2019 Jan 1

Fingerprint

Robots
Neural networks
Human robot interaction
Recurrent neural networks
Cameras
Long short-term memory
Deep neural networks

All Science Journal Classification (ASJC) codes

  • Computer Science (all)
  • Materials Science (all)
  • Engineering (all)

Cite this

Li, Tzuu Hseng S.; Kuo, Ping Huan; Tsai, Ting Nan; Luan, Po Chien. CNN and LSTM Based Facial Expression Analysis Model for a Humanoid Robot. In: IEEE Access. 2019; Vol. 7, pp. 93998-94011.
@article{3da74b7cf0014594845a84563e37ab6a,
title = "CNN and LSTM Based Facial Expression Analysis Model for a Humanoid Robot",
author = "Li, {Tzuu Hseng S.} and Kuo, {Ping Huan} and Tsai, {Ting Nan} and Luan, {Po Chien}",
year = "2019",
month = "1",
day = "1",
doi = "10.1109/ACCESS.2019.2928364",
language = "English",
volume = "7",
pages = "93998--94011",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR
T1 - CNN and LSTM Based Facial Expression Analysis Model for a Humanoid Robot
AU - Li, Tzuu Hseng S.
AU - Kuo, Ping Huan
AU - Tsai, Ting Nan
AU - Luan, Po Chien
PY - 2019/1/1
Y1 - 2019/1/1
UR - http://www.scopus.com/inward/record.url?scp=85073888886&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073888886&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2019.2928364
DO - 10.1109/ACCESS.2019.2928364
M3 - Article
AN - SCOPUS:85073888886
VL - 7
SP - 93998
EP - 94011
JO - IEEE Access
JF - IEEE Access
SN - 2169-3536
M1 - 8760246
ER -