Walking motion generation, synthesis, and control for biped robot by using PGRL, LPI, and fuzzy logic

Tzuu-Hseng S. Li, Yu Te Su, Shao Wei Lai, Jhen Jia Hu

Research output: Contribution to journal › Article

49 Citations (Scopus)

Abstract

This paper proposes the implementation of fuzzy motion control based on reinforcement learning (RL) and Lagrange polynomial interpolation (LPI) for gait synthesis of biped robots. First, the procedure of a walking gait is redefined into three states, and the parameters of this designed walking gait are determined. Then, the machine learning approach applied to adjusting the walking parameters is policy gradient RL (PGRL), which can execute real-time performance and directly modify the policy without calculating the dynamic function. Given a parameterized walking motion designed for biped robots, the PGRL algorithm automatically searches the set of possible parameters and finds the fastest possible walking motion. The reward function mainly considered is first the walking speed, which can be estimated from the vision system. However, the experiment illustrates that there are some stability problems in this kind of learning process. To solve these problems, the desired zero moment point trajectory is added to the reward function. The results show that the robot not only has more stable walking but also increases its walking speed after learning. This is more effective and attractive than manual trial-and-error tuning. LPI, moreover, is employed to transform the existing motions to the motion which has a revised angle determined by the fuzzy motion controller. Then, the biped robot can continuously walk in any desired direction through this fuzzy motion control. Finally, the fuzzy-based gait synthesis control is demonstrated by tasks and point- and line-target tracking. The experiments show the feasibility and effectiveness of gait learning with PGRL and the practicability of the proposed fuzzy motion control scheme.
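The two core techniques named in the abstract can be sketched briefly. The policy gradient RL used for gait tuning in work of this kind is typically a finite-difference search: perturb each walking parameter, score the perturbed gaits with the reward (here, walking speed plus a zero-moment-point tracking term), and step along the estimated gradient. Lagrange polynomial interpolation then blends stored joint-angle motions. The sketch below is illustrative only: the reward function, step sizes, and parameter vector are stand-ins, not the paper's actual gait parameterization.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)  # basis polynomial L_i(x)
        total += term
    return total

def finite_difference_pg(theta, reward_fn, eps, alpha, iters=50):
    """Finite-difference policy-gradient search over gait parameters.

    Per iteration: perturb each parameter by +/- eps[i], estimate the
    partial derivative of the reward, then take a gradient-ascent step.
    In the paper's setting reward_fn would combine measured walking
    speed and ZMP-trajectory error; here it is any scalar function.
    """
    theta = list(theta)
    n = len(theta)
    for _ in range(iters):
        grad = [0.0] * n
        for i in range(n):
            plus = list(theta); plus[i] += eps[i]
            minus = list(theta); minus[i] -= eps[i]
            grad[i] = (reward_fn(plus) - reward_fn(minus)) / (2 * eps[i])
        theta = [t + alpha * g for t, g in zip(theta, grad)]
    return theta
```

No robot is needed to see the mechanics: with a toy quadratic reward peaking at a known parameter set, `finite_difference_pg` converges to that optimum, and `lagrange_interpolate` exactly reproduces any polynomial of degree below the number of sample points.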

Original language: English
Article number: 5640679
Pages (from-to): 736-748
Number of pages: 13
Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Volume: 41
Issue number: 3
DOI: 10.1109/TSMCB.2010.2089978
Publication status: Published - 2011 Jun 1

Fingerprint

Fuzzy logic
Interpolation
Motion control
Fuzzy control
Polynomials
Robots
Reinforcement learning
Target tracking
Learning systems
Tuning
Experiments
Trajectories
Controllers

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

@article{02da7b74c44b40f3b04fe24b001512a2,
title = "Walking motion generation, synthesis, and control for biped robot by using PGRL, LPI, and fuzzy logic",
abstract = "This paper proposes the implementation of fuzzy motion control based on reinforcement learning (RL) and Lagrange polynomial interpolation (LPI) for gait synthesis of biped robots. First, the procedure of a walking gait is redefined into three states, and the parameters of this designed walking gait are determined. Then, the machine learning approach applied to adjusting the walking parameters is policy gradient RL (PGRL), which can execute real-time performance and directly modify the policy without calculating the dynamic function. Given a parameterized walking motion designed for biped robots, the PGRL algorithm automatically searches the set of possible parameters and finds the fastest possible walking motion. The reward function mainly considered is first the walking speed, which can be estimated from the vision system. However, the experiment illustrates that there are some stability problems in this kind of learning process. To solve these problems, the desired zero moment point trajectory is added to the reward function. The results show that the robot not only has more stable walking but also increases its walking speed after learning. This is more effective and attractive than manual trial-and-error tuning. LPI, moreover, is employed to transform the existing motions to the motion which has a revised angle determined by the fuzzy motion controller. Then, the biped robot can continuously walk in any desired direction through this fuzzy motion control. Finally, the fuzzy-based gait synthesis control is demonstrated by tasks and point- and line-target tracking. The experiments show the feasibility and effectiveness of gait learning with PGRL and the practicability of the proposed fuzzy motion control scheme.",
author = "Li, {Tzuu-Hseng S.} and Su, {Yu Te} and Lai, {Shao Wei} and Hu, {Jhen Jia}",
year = "2011",
month = "6",
day = "1",
doi = "10.1109/TSMCB.2010.2089978",
language = "English",
volume = "41",
pages = "736--748",
journal = "IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics",
issn = "1083-4419",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3",

}

Walking motion generation, synthesis, and control for biped robot by using PGRL, LPI, and fuzzy logic. / Li, Tzuu-Hseng S.; Su, Yu Te; Lai, Shao Wei; Hu, Jhen Jia.

In: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 41, No. 3, 5640679, 01.06.2011, p. 736-748.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Walking motion generation, synthesis, and control for biped robot by using PGRL, LPI, and fuzzy logic

AU - Li, Tzuu-Hseng S.

AU - Su, Yu Te

AU - Lai, Shao Wei

AU - Hu, Jhen Jia

PY - 2011/6/1

Y1 - 2011/6/1

N2 - This paper proposes the implementation of fuzzy motion control based on reinforcement learning (RL) and Lagrange polynomial interpolation (LPI) for gait synthesis of biped robots. First, the procedure of a walking gait is redefined into three states, and the parameters of this designed walking gait are determined. Then, the machine learning approach applied to adjusting the walking parameters is policy gradient RL (PGRL), which can execute real-time performance and directly modify the policy without calculating the dynamic function. Given a parameterized walking motion designed for biped robots, the PGRL algorithm automatically searches the set of possible parameters and finds the fastest possible walking motion. The reward function mainly considered is first the walking speed, which can be estimated from the vision system. However, the experiment illustrates that there are some stability problems in this kind of learning process. To solve these problems, the desired zero moment point trajectory is added to the reward function. The results show that the robot not only has more stable walking but also increases its walking speed after learning. This is more effective and attractive than manual trial-and-error tuning. LPI, moreover, is employed to transform the existing motions to the motion which has a revised angle determined by the fuzzy motion controller. Then, the biped robot can continuously walk in any desired direction through this fuzzy motion control. Finally, the fuzzy-based gait synthesis control is demonstrated by tasks and point- and line-target tracking. The experiments show the feasibility and effectiveness of gait learning with PGRL and the practicability of the proposed fuzzy motion control scheme.

AB - This paper proposes the implementation of fuzzy motion control based on reinforcement learning (RL) and Lagrange polynomial interpolation (LPI) for gait synthesis of biped robots. First, the procedure of a walking gait is redefined into three states, and the parameters of this designed walking gait are determined. Then, the machine learning approach applied to adjusting the walking parameters is policy gradient RL (PGRL), which can execute real-time performance and directly modify the policy without calculating the dynamic function. Given a parameterized walking motion designed for biped robots, the PGRL algorithm automatically searches the set of possible parameters and finds the fastest possible walking motion. The reward function mainly considered is first the walking speed, which can be estimated from the vision system. However, the experiment illustrates that there are some stability problems in this kind of learning process. To solve these problems, the desired zero moment point trajectory is added to the reward function. The results show that the robot not only has more stable walking but also increases its walking speed after learning. This is more effective and attractive than manual trial-and-error tuning. LPI, moreover, is employed to transform the existing motions to the motion which has a revised angle determined by the fuzzy motion controller. Then, the biped robot can continuously walk in any desired direction through this fuzzy motion control. Finally, the fuzzy-based gait synthesis control is demonstrated by tasks and point- and line-target tracking. The experiments show the feasibility and effectiveness of gait learning with PGRL and the practicability of the proposed fuzzy motion control scheme.

UR - http://www.scopus.com/inward/record.url?scp=79957534331&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79957534331&partnerID=8YFLogxK

U2 - 10.1109/TSMCB.2010.2089978

DO - 10.1109/TSMCB.2010.2089978

M3 - Article

C2 - 21095871

AN - SCOPUS:79957534331

VL - 41

SP - 736

EP - 748

JO - IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

JF - IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

SN - 1083-4419

IS - 3

M1 - 5640679

ER -