## Reinforcement learning scene for gripper without arm

Typically: "How do I... ", "How can I... " questions
josdan
Posts: 1
Joined: 10 Feb 2021, 18:57

### Reinforcement learning scene for gripper without arm

Hello,

I'm working with CoppeliaSim and PyRep, and I'm trying to set up a simplified reinforcement learning scene using only a gripper without arm similar to the setup of Breyer et al. [1], and the work of Baris Yazici [2].

The idea is to setup a scene with only a gripper equipped with an eye-in-hand depth sensor for top-down grasping. At the start of each training episode, the gripper spawns at a start height. The action the gripper can take is limited to a $$\delta x, \delta y$$ translation relative to the current gripper position and a yaw rotation $$\delta \phi$$ relative to the current gripper yaw orientation. The action of the gripper is represented by $$[\delta x, \delta y, \delta\phi]$$, and every timestep the gripper moves downward a certain amount $$\delta z$$, until it reaches a suitable height where the gripper automatically closes and tries to grasp one of the floor objects. The simplified environment I'm trying to recreate is illustrated in [2, 3].

I have created a simple scene with a static, respondable Baxter gripper oriented downwards to the plane, at a certain height.
I created a small test program with PyRep, that moves the gripper downwards a small amount every timestep by calling
gripper.set_position().

Code: Select all

# python
from os.path import dirname, join, abspath
from pyrep import PyRep
from pyrep.robots.end_effectors.baxter_gripper import BaxterGripper
import numpy as np

SCENE_FILE = join(dirname(abspath(__file__)), 'scenes/gripper_rl.ttt')
pr = PyRep()
pr.start()
gripper = BaxterGripper()

LOOPS = 2000

gripper_start_pos = gripper.get_position()
gripper_x, gripper_y, gripper_z = gripper_start_pos.tolist()

close_height = 0.1
gripper_target_pos = [gripper_x, gripper_y, close_height]
gripper_target_pos = np.array(gripper_target_pos, dtype=np.float)
dz = 0.01

# Move gripper downwards and close when gripper has reached close_height
for i in range(LOOPS):
curr_pos = gripper.get_position()
# close gripper
if np.isclose(curr_pos, gripper_target_pos).all():
gripper.actuate(1, velocity=0.5)  # make sure gripper is open
pr.step()
# 0 for close and 1 for open.
gripper.actuate(0, velocity=0.5)
pr.step()  # Step the physics simulation
break

# move gripper downward
diff = gripper_target_pos - curr_pos
pos = curr_pos + dz * diff
gripper.set_position(pos.tolist())
pr.step()

pr.stop()
pr.shutdown()

I am a little bit unsure of how to properly set up this environment.
Is it correct to call set_matrix()/simSetObjectMatrix() directly on the gripper object to account for automatic downwards translation, as well as the $$x, y$$ translation and the yaw $$\phi$$ rotation constituted by the action, instead of separately calling methods for the translation and the rotation? Or is there a better way?

Highly appreciate any tips/advice on how to set up this environment.

Cheers

references:
[1] https://arxiv.org/abs/1803.04996
[2] https://github.com/BarisYazici/deep-rl-grasping

coppelia
Posts: 8450
Joined: 14 Dec 2012, 00:25

### Re: Reinforcement learning scene for gripper without arm

Hello,

I cannot speak for how things are made via PyRep, but in CoppeliaSim, if you would just manipulate a gripper, you'd have to consider mainly that you want to move a static gripper base, with dynamic content attached on top of it: this can be seen as teleportation of the gripper base, i.e. instantly changing the position/orientation of the base of the gripper, which can generate very high forces/torques between the base and the finger joints, if done over a large distance. And using sim.setObjectPosition / sim.setObjectOrientation / sim.setObjectMatrix, etc. is fine.

And just your gripper base should be set to static obviously.

If you need to instantly move the gripper to completely other position/orientation, i.e. over a large distance/orientation, then best would be to do it while simulation is not running. If simulation is already running, you should do:

Code: Select all

sim.setThreadAutomaticSwitch(false) -- only if running in a threaded script
<move gripper base>
local objs=sim.getObjectsInTree(gripperBase,sim.handle_all,1)
for i=1,#objs,1 do
sim.resetDynamicObject(objs[i])
sim.setObjectMatrix(objs[i],sim.handle_parent,initialMatrices[objs[i]])
end
sim.setThreadAutomaticSwitch(true) -- only if running in a threaded script
In the initialization phase of above script, i.e. at simulation start, memorize the pose of all gripper objects:

Code: Select all

function sysCall_init()
gripperBase=sim.getObjectHandle(...)
initialMatrices={}
local objs=sim.getObjectsInTree(gripperBase,sim.handle_all,1)
for i=1,#objs,1 do
initialMatrices[objs[i]]=sim.getObjectMatrix(objs[i],sim.handle_parent)
end
end
Resetting objects to their initial pose relative to their parent is important if you would repeat the procedure many times without restarting the simulation, since joints will have small errors that come from physics engine calculations (sometimes joints can even break apart when the errors are large because of the forces acting upon them)

Above is the way you'd do it in CoppeliaSim. From an external application or PyRep, you could call a script function that handles the tasks for you.

Cheers