TY - JOUR
T1 - CNN and LSTM Based Facial Expression Analysis Model for a Humanoid Robot
AU - Li, Tzuu Hseng S.
AU - Kuo, Ping Huan
AU - Tsai, Ting Nan
AU - Luan, Po Chien
N1 - Funding Information:
This work was supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 106-2218-E-153-001-MY3 and Grant MOST 106-2221-E-006-009-MY3.
Publisher Copyright:
© 2013 IEEE.
PY - 2019
Y1 - 2019
AB - Robots must be able to recognize human emotions to improve human-robot interaction (HRI). This study proposes an emotion recognition system for a humanoid robot. The robot is equipped with a camera to capture users' facial images, and it uses this system to recognize users' emotions and respond appropriately. The emotion recognition system, based on a deep neural network, learns six basic emotions: happiness, anger, disgust, fear, sadness, and surprise. First, a convolutional neural network (CNN) is used to extract visual features by learning on a large number of static images. Second, a long short-term memory (LSTM) recurrent neural network is used to determine the relationship between the transformation of facial expressions in image sequences and the six basic emotions. Third, the CNN and LSTM are combined to exploit the advantages of both in the proposed model. Finally, the performance of the emotion recognition system is improved by using transfer learning, that is, by transferring knowledge of related but different problems. The performance of the proposed system is verified through leave-one-out cross-validation and compared with that of other models. The system is applied to a humanoid robot to demonstrate its practicability for improving HRI.
UR - http://www.scopus.com/inward/record.url?scp=85073888886&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073888886&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2019.2928364
DO - 10.1109/ACCESS.2019.2928364
M3 - Article
AN - SCOPUS:85073888886
SN - 2169-3536
VL - 7
SP - 93998
EP - 94011
JO - IEEE Access
JF - IEEE Access
M1 - 8760246
ER -