Vision-Based Robotic Object Grasping—A Deep Reinforcement Learning Approach

Ya Ling Chen, Yan Rou Cai, Ming Yang Cheng

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)


This paper develops a robotic object grasping approach that is capable of self-learning, is suited to small-volume, large-variety production, and achieves a high success rate in object grasping and pick-and-place tasks. The proposed approach combines a computer vision-based object detection algorithm with a deep reinforcement learning algorithm that has self-learning capability. In particular, the You Only Look Once (YOLO) algorithm is employed to detect and classify all objects of interest within the camera's field of view. Based on the detection, localization, and classification results provided by YOLO, the Soft Actor-Critic (SAC) deep reinforcement learning algorithm provides a desired grasp pose for the robot manipulator (i.e., the learning agent) to perform object grasping. To speed up training, reduce the cost of collecting training data, and reduce the likelihood of damaging the robot manipulator through improper actions during training, this paper employs the Sim-to-Real technique. The V-REP platform is used to construct a simulation environment for training the deep reinforcement learning neural network. Experimental results indicate that a 6-DOF industrial manipulator successfully performs object grasping with the proposed approach, even for previously unseen objects.
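The data flow described in the abstract (detector output, then policy, then grasp pose) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the observation or action layout, so the `Detection` container, the normalized box observation, the three-component action (two planar offsets and a yaw angle), and the `max_offset_m` scale are all hypothetical. The policy is a stand-in with fixed mean and log standard deviation rather than a trained network; it only shows the tanh-squashed Gaussian sampling commonly used for Soft Actor-Critic policies.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detector output: label, confidence, pixel box (x1, y1, x2, y2)."""
    label: str
    confidence: float
    box: tuple


def detection_to_observation(det, image_width, image_height):
    """Normalize the box center and size to [0, 1] (an assumed observation layout)."""
    x1, y1, x2, y2 = det.box
    cx = (x1 + x2) / 2.0 / image_width
    cy = (y1 + y2) / 2.0 / image_height
    w = (x2 - x1) / image_width
    h = (y2 - y1) / image_height
    return (cx, cy, w, h)


def sample_squashed_action(mean, log_std, rng):
    """Sample a Gaussian per dimension and squash it with tanh into (-1, 1)."""
    return tuple(
        math.tanh(m + math.exp(s) * rng.gauss(0.0, 1.0))
        for m, s in zip(mean, log_std)
    )


def action_to_grasp(action, max_offset_m=0.05):
    """Map an action in [-1, 1]^3 to (dx, dy) in meters and yaw in radians.

    The scaling is an assumption for illustration only.
    """
    dx, dy, yaw = action
    return (dx * max_offset_m, dy * max_offset_m, yaw * math.pi / 2.0)


# Example: one detection in a 640x480 image.
det = Detection(label="cup", confidence=0.9, box=(100, 50, 300, 250))
obs = detection_to_observation(det, 640, 480)

# Stand-in policy parameters; a trained actor network would compute
# the mean and log standard deviation from the observation instead.
rng = random.Random(0)
action = sample_squashed_action(mean=(0.0, 0.0, 0.0), log_std=(-1.0, -1.0, -1.0), rng=rng)
grasp = action_to_grasp(action)
print(obs, grasp)
```

In a Sim-to-Real setup such as the one the abstract describes, a loop of this shape would run against the simulator during training, with the grasp outcome supplying the reward.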

Original language: English
Article number: 275
Issue number: 2
Publication status: Published - 2023 Feb

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science (miscellaneous)
  • Mechanical Engineering
  • Control and Optimization
  • Industrial and Manufacturing Engineering
  • Electrical and Electronic Engineering

