I have tried to follow the forum questions to come up with a solution for simulating Kinect behavior. What I aim for is simple: there is a Kinect in my scene (the one from the components) and I want to publish depth and RGB images from CoppeliaSim and process them in an external node.
At the bottom I share my CoppeliaSim code and have marked the question-relevant part with --QRP_start and --QRP_end
tags. Additionally, I share some code segments from the Python node that are relevant to my question: they read the image data with OpenCV and republish it, for both the RGB and the depth image.
CoppeliaSim Lua code:
function sysCall_init()
    -- Get some handles:
    kinectRGB=sim.getObjectHandle('kinect_rgb')
    kinectDepth=sim.getObjectHandle('kinect_depth')

    -- Enable image publishers:
    if simROS then
        rgb_pub=simROS.advertise('/coppelia/camera/rgb_image','sensor_msgs/Image')
        simROS.publisherTreatUInt8ArrayAsString(rgb_pub)
        depth_pub=simROS.advertise('/coppelia/camera/depth_image','sensor_msgs/Image')
        simROS.publisherTreatUInt8ArrayAsString(depth_pub)
    else
        print("<font color='#F00'>ROS interface was not found. Cannot run.</font>@html")
    end
end

function sysCall_sensing()
    if simROS then
        -- Publish the image of the RGB vision sensor:
        local data,width,height=sim.getVisionSensorCharImage(kinectRGB)
        rgb_d={}
        rgb_d['header']={stamp=simROS.getTime(), frame_id="rgb_mert"}
        rgb_d['height']=height
        rgb_d['width']=width
        rgb_d['encoding']='rgb8'
        rgb_d['is_bigendian']=1 -- byte order of the pixel data
        rgb_d['step']=width*3 -- bytes per row: 3 channels (r,g,b)
        rgb_d['data']=data
        simROS.publish(rgb_pub,rgb_d)

        --QRP_start
        local data=sim.getVisionSensorDepthBuffer(kinectDepth+sim.handleflag_codedstring)
        local res,nearClippingPlane=sim.getObjectFloatParameter(kinectDepth,sim.visionfloatparam_near_clipping)
        local res,farClippingPlane=sim.getObjectFloatParameter(kinectDepth,sim.visionfloatparam_far_clipping)
        nearClippingPlane=nearClippingPlane*1000 -- we want mm
        farClippingPlane=farClippingPlane*1000 -- we want mm
        data=sim.transformBuffer(data,sim.buffer_float,farClippingPlane-nearClippingPlane,nearClippingPlane,sim.buffer_uint16)
        local res=sim.getVisionSensorResolution(kinectDepth)
        width,height=res[1],res[2]
        depth_d={}
        depth_d['header']={stamp=simROS.getTime(), frame_id="depth_mert"}
        depth_d['height']=height
        depth_d['width']=width
        depth_d['encoding']='16UC1'
        depth_d['is_bigendian']=1
        depth_d['step']=width*2 -- bytes per row: 2 bytes per uint16 pixel
        depth_d['data']=data
        simROS.publish(depth_pub,depth_d)
        --QRP_end
    end
end

function sysCall_cleanup()
    if simROS then
        -- Shut down the publishers. Not really needed from a simulation
        -- script (shutdown is automatic):
        simROS.shutdownPublisher(rgb_pub)
        simROS.shutdownPublisher(depth_pub)
    end
end
Pieces of Python code:
Pieces from the RGB image subscriber:
...
rgb_image = self.bridge.imgmsg_to_cv2(rgb_data, "bgr8")
... # No processing on rgb_image
rgb_pub.publish(self.bridge.cv2_to_imgmsg(rgb_image, "bgr8"))
...
Pieces from the depth image subscriber:
...
depth_image = self.bridge.imgmsg_to_cv2(depth_data, '16UC1')
... ## No processing on depth_image
depth_pub.publish(self.bridge.cv2_to_imgmsg(depth_image, '16UC1'))
...
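To make my endianness question below concrete, here is a small numpy sketch of the byte layout such a 16UC1 message carries, and why the is_bigendian flag matters when reading it back. No ROS is needed to run it, and the pixel values are made up:

```python
import numpy as np

# Made-up 2x3 depth image in millimetres, stored as little-endian uint16,
# standing in for the '16UC1' payload of a sensor_msgs/Image message.
depth = np.array([[500, 1000, 1500],
                  [2000, 2500, 3000]], dtype='<u2')

height, width = depth.shape
step = width * 2        # bytes per row: 2 bytes per uint16 pixel
data = depth.tobytes()  # raw bytes, like the message's 'data' field

# Reading the bytes back as little-endian uint16 (is_bigendian = 0)
# recovers the original values exactly:
restored = np.frombuffer(data, dtype='<u2').reshape(height, width)

# Reading the same bytes as big-endian instead swaps the bytes of every
# pixel, producing wrong distances:
swapped = np.frombuffer(data, dtype='>u2').reshape(height, width)
```

So if the flag in the message disagrees with the actual byte order of the buffer, a consumer that honors the flag would decode garbage even though rviz might still show something plausible.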
Initially, I thought the problem was in the Python script, since rviz can visualize the depth image published by the simulator well. However, I do not do any processing on the retrieved messages besides using the OpenCV bridge to convert from and to messages. I was also not sure why I was doing some parts of my code, so I wanted to ask a few questions.
1) Does my code for the simulator have any flaws? I thought it might be possible that rviz skips some information or uses default values, so even if it shows a good image, that does not necessarily mean my publisher is correct.
2) In case there are no flaws:
a) Why do we use the 16UC1 encoding?
b) Are all ROS publishers treating variables as uint8 by default?
Am I correct in thinking the three statements below?
i) The reason we get the data as a string from the depth buffer is to make it faster (the line related to getVisionSensorDepthBuffer). Apparently it is not a regular string but a coded string, whose type or length I cannot get with print.
ii) The reason we use the transform function is two-fold: first, we have float values as a coded string but need to transform them to uint16; second, since we have a string, we cannot do mathematical operations on the retrieved buffer directly.
iii) What is the role of this line: simROS.publisherTreatUInt8ArrayAsString(depth_pub)? I was thinking it was to make things faster, but removing it causes the publisher to crash, so I suppose it is also related to the fact that the data is already in some string format.
3) An alternative, without treating uint8 as string in the publisher, would be to remove the +sim.handleflag_codedstring part and calculate the distances manually on the float table, which is already normalized. I can do that in a for loop for each element; however, if I wanted it done the same way for all entries, I could benefit from some matrix operation, but I am not sure whether CoppeliaSim lets me do that.
4) Any help related to OpenCV would also be appreciated.
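Regarding 3), the vectorized version I have in mind would look like this in Python with numpy (a sketch, not my actual node; the clipping-plane values are made up). As far as I understand, this is the same mapping transformBuffer applies with multiplier far-near and offset near:

```python
import numpy as np

# Made-up clipping planes, converted to millimetres as in the Lua script.
near_mm, far_mm = 10.0, 3500.0

# Normalized depth buffer as getVisionSensorDepthBuffer returns it without
# the coded-string flag: floats in [0, 1] between near and far plane.
depth_norm = np.array([0.0, 0.25, 0.5, 1.0], dtype=np.float32)

# Element-wise scaling to millimetres, then cast to uint16 -- no explicit
# per-element for loop needed.
depth_mm = (depth_norm * (far_mm - near_mm) + near_mm).astype(np.uint16)
```

If a table-based equivalent of this one-liner exists on the Lua side, that would answer my question 3.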
Sorry for the long post; I did not want to neglect any part.
Simulator Scene:
Rviz depth image from the topic published by the simulator
Rviz depth image from the topic published by the Python node