I'm working with CoppeliaSim and PyRep, and I'm trying to set up a simplified reinforcement learning scene that uses only a gripper, without an arm, similar to the setups of Breyer et al. and Baris Yazici.
The idea is to set up a scene with only a gripper, equipped with an eye-in-hand depth sensor, for top-down grasping. At the start of each training episode, the gripper spawns at a start height. The action the gripper can take is limited to a \(\delta x, \delta y\) translation relative to the current gripper position and a yaw rotation \(\delta \phi\) relative to the current gripper yaw orientation, so an action is represented by \([\delta x, \delta y, \delta\phi]\). In addition, at every timestep the gripper moves downward by a fixed amount \(\delta z\) until it reaches a suitable height, where it automatically closes and tries to grasp one of the objects on the floor. The simplified environment I'm trying to recreate is illustrated in [2, 3].
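To make the per-timestep update concrete, here is a minimal sketch of the arithmetic I have in mind, written without any simulator calls. The function name `step_pose` and the `(x, y, z, yaw)` pose tuple are hypothetical, and I am assuming that \(\delta x, \delta y\) are offsets in the world frame and that the descent is clamped at the closing height.

```python
import math


def step_pose(pose, action, dz, close_height):
    """Apply one timestep to a top-down gripper pose.

    pose   : (x, y, z, yaw) of the gripper (assumed world frame)
    action : (dx, dy, dphi) chosen by the agent
    dz     : fixed downward displacement per timestep
    Returns (new_pose, should_close).
    """
    x, y, z, yaw = pose
    dx, dy, dphi = action

    # Automatic descent, clamped so the gripper never goes below close_height
    new_z = max(z - dz, close_height)

    # Relative yaw update, wrapped back into [-pi, pi]
    new_yaw = math.atan2(math.sin(yaw + dphi), math.cos(yaw + dphi))

    new_pose = (x + dx, y + dy, new_z, new_yaw)
    should_close = new_z <= close_height
    return new_pose, should_close


# Example: one step from the start height
pose, should_close = step_pose((0.0, 0.0, 0.5, 0.0), (0.01, -0.02, 0.1), 0.05, 0.1)
```

The episode would then end with a grasp attempt as soon as `should_close` becomes true.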
I have created a simple scene with a static, respondable Baxter gripper oriented downwards to the plane, at a certain height.
I created a small test program with PyRep that moves the gripper downwards a small amount every timestep by calling set_position():
```python
from os.path import dirname, join, abspath

import numpy as np
from pyrep import PyRep
from pyrep.robots.end_effectors.baxter_gripper import BaxterGripper

SCENE_FILE = join(dirname(abspath(__file__)), 'scenes/gripper_rl.ttt')
LOOPS = 2000

pr = PyRep()
pr.launch(SCENE_FILE, headless=False)
pr.start()
gripper = BaxterGripper()

gripper_x, gripper_y, _ = gripper.get_position().tolist()
close_height = 0.1
# np.float was removed in NumPy 1.24; the builtin float is equivalent
gripper_target_pos = np.array([gripper_x, gripper_y, close_height], dtype=float)
dz = 0.01  # maximum displacement per simulation step


def actuate_until_done(amount):
    """Step the simulation until the gripper reaches `amount` (0 = closed, 1 = open)."""
    done = False
    while not done:
        done = gripper.actuate(amount, velocity=0.5)
        pr.step()


# Move the gripper downwards and close it once it has reached close_height
for i in range(LOOPS):
    curr_pos = gripper.get_position()

    if np.allclose(curr_pos, gripper_target_pos, atol=1e-4):
        actuate_until_done(1)  # make sure the gripper is open
        actuate_until_done(0)  # then close it
        break

    # Move towards the target by at most dz per axis, so the last step cannot overshoot
    diff = gripper_target_pos - curr_pos
    pos = curr_pos + np.clip(diff, -dz, dz)
    gripper.set_position(pos.tolist())
    pr.step()  # step the physics simulation

pr.stop()
pr.shutdown()
```
Is it correct to call set_matrix()/simSetObjectMatrix() directly on the gripper object to apply the automatic downward translation together with the \(x, y\) translation and the yaw rotation \(\delta\phi\) given by the action, instead of calling separate methods for the translation and the rotation? Or is there a better way?
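To show what I mean by a single matrix update, here is a minimal NumPy-only sketch. The helper names `rot_z` and `apply_action_to_matrix` are hypothetical, and I am assuming that the translation is expressed in the world frame and that the yaw is a rotation about the world z axis through the gripper origin. I am also assuming the result could then be handed to set_matrix(); the exact format it expects may depend on the PyRep version.

```python
import numpy as np


def rot_z(angle):
    """3x3 rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def apply_action_to_matrix(matrix, action, dz):
    """Return a new 4x4 pose after applying (dx, dy, dphi) and a descent of dz.

    matrix : current 4x4 homogeneous transform of the gripper (assumed world frame)
    """
    dx, dy, dphi = action
    new_matrix = np.eye(4)
    # Rotate the current orientation about the world z axis
    new_matrix[:3, :3] = rot_z(dphi) @ matrix[:3, :3]
    # Translate in the world frame, including the automatic descent
    new_matrix[:3, 3] = matrix[:3, 3] + np.array([dx, dy, -dz])
    return new_matrix


# Example: gripper pointing straight down (180 degree rotation about x) at z = 0.5
current = np.eye(4)
current[:3, :3] = np.diag([1.0, -1.0, -1.0])
current[:3, 3] = [0.0, 0.0, 0.5]
updated = apply_action_to_matrix(current, (0.1, 0.0, np.pi / 2), dz=0.01)
```

With this composition the gripper's approach axis stays pointing down, because the yaw is applied about the world z axis rather than about a tilted local axis.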
I would highly appreciate any tips or advice on how to set up this environment.