I'm working with CoppeliaSim and PyRep, and I'm trying to set up a simplified reinforcement learning scene using only a gripper without arm similar to the setup of Breyer et al. [1], and the work of Baris Yazici [2].
The idea is to setup a scene with only a gripper equipped with an eye-in-hand depth sensor for top-down grasping. At the start of each training episode, the gripper spawns at a start height. The action the gripper can take is limited to a \(\delta x, \delta y\) translation relative to the current gripper position and a yaw rotation \(\delta \phi\) relative to the current gripper yaw orientation. The action of the gripper is represented by \([\delta x, \delta y, \delta\phi]\), and every timestep the gripper moves downward a certain amount \(\delta z\), until it reaches a suitable height where the gripper automatically closes and tries to grasp one of the floor objects. The simplified environment I'm trying to recreate is illustrated in [2, 3].
I have created a simple scene with a static, respondable Baxter gripper oriented downwards to the plane, at a certain height.
I created a small test program with PyRep, that moves the gripper downwards a small amount every timestep by calling
gripper.set_position().
Code: Select all
# python
from os.path import dirname, join, abspath
from pyrep import PyRep
from pyrep.robots.end_effectors.baxter_gripper import BaxterGripper
import numpy as np
SCENE_FILE = join(dirname(abspath(__file__)), 'scenes/gripper_rl.ttt')
pr = PyRep()
pr.launch(SCENE_FILE, headless=False)
pr.start()
gripper = BaxterGripper()
LOOPS = 2000
gripper_start_pos = gripper.get_position()
gripper_x, gripper_y, gripper_z = gripper_start_pos.tolist()
close_height = 0.1
gripper_target_pos = [gripper_x, gripper_y, close_height]
gripper_target_pos = np.array(gripper_target_pos, dtype=np.float)
dz = 0.01
# Move gripper downwards and close when gripper has reached close_height
for i in range(LOOPS):
curr_pos = gripper.get_position()
# close gripper
if np.isclose(curr_pos, gripper_target_pos).all():
gripper.actuate(1, velocity=0.5) # make sure gripper is open
pr.step()
# 0 for close and 1 for open.
gripper.actuate(0, velocity=0.5)
pr.step() # Step the physics simulation
break
# move gripper downward
diff = gripper_target_pos - curr_pos
pos = curr_pos + dz * diff
gripper.set_position(pos.tolist())
pr.step()
pr.stop()
pr.shutdown()
Is it correct to call set_matrix()/simSetObjectMatrix() directly on the gripper object to account for automatic downwards translation, as well as the \(x, y\) translation and the yaw \(\phi\) rotation constituted by the action, instead of separately calling methods for the translation and the rotation? Or is there a better way?
Highly appreciate any tips/advice on how to set up this environment.
Cheers
references:
[1] https://arxiv.org/abs/1803.04996
[2] https://github.com/BarisYazici/deep-rl-grasping
[3] https://www.youtube.com/watch?v=ii16Zejmf-E