ROS kinect depth image encoding problem

ljklonepiece
Posts: 105
Joined: 10 Oct 2013, 14:51

ROS kinect depth image encoding problem

Post by ljklonepiece »

Hi developers,

I am currently publishing the simulated kinect depth stream as a ros topic as follows:

depthCam=simGetObjectHandle('kinect_depth')
depth_pub = simExtROS_enablePublisher('vrep_depth_image',1,simros_strmcmd_get_vision_sensor_image,depthCam,0,'')

I realize that the published depth image (of type sensor_msgs/Image) has encoding rgb8, while the real Kinect publishes its depth image with encoding 16UC1.

Is there a way to publish the depth image with encoding 16UC1 in V-REP?

The reason is that I want to preserve the absolute depth values, while the rgb8 encoding only gives me relative values among pixels.

Thanks a lot for your help!

coppelia
Site Admin
Posts: 7966
Joined: 14 Dec 2012, 00:25

Re: ROS kinect depth image encoding problem

Post by coppelia »

Hello,

It seems that you are still using the old ROS plugin. We highly recommend that you use the new ROS interface, which is more flexible, more easily extendable, and closely mirrors the ROS C++ API. It can also run in parallel with the old ROS plugin.

With the new RosInterface, you would do this with something like:

Code: Select all

if (sim_call_type==sim_childscriptcall_initialization) then
    activeVisionSensor=simGetObjectHandle('Vision_sensor')

    -- Enable an image publisher and subscriber:
    pub=simExtRosInterface_advertise('/image', 'sensor_msgs/Image')
    simExtRosInterface_publisherTreatUInt8ArrayAsString(pub) -- treat uint8 arrays as strings (much faster, tables/arrays are kind of slow in Lua)
end

if (sim_call_type==sim_childscriptcall_sensing) then
    -- Publish the image of the active vision sensor:
    local data=simGetVisionSensorDepthBuffer(activeVisionSensor+sim_handleflag_codedstring)
    local res=simGetVisionSensorResolution(activeVisionSensor)
    d={}
    d['header']={seq=0,stamp=0, frame_id="a"}
    d['height']=res[2]
    d['width']=res[1]
    d['encoding']='32FC1' -- the depth buffer holds 32-bit floats; a real Kinect would use 16UC1
    d['is_bigendian']=0
    d['step']=res[1]*res[2]
    d['data']=data
    simExtRosInterface_publish(pub,d)
end
With the above code (and with the latest V-REP version, i.e. V3.3.2), the data returned by simGetVisionSensorDepthBuffer will be a string (i.e. a buffer) in which each depth value is packed into 4 bytes (a 32-bit float).

Question: how is the data organized in the 16UC1 encoding? Or better: what would the C/C++ code look like that produces that encoding?

Cheers

ljklonepiece
Posts: 105
Joined: 10 Oct 2013, 14:51

Re: ROS kinect depth image encoding problem

Post by ljklonepiece »

Hi,

Thanks for your great help!

Well, I am not so sure how to write code that organizes the data into 16UC1, but according to http://wiki.ros.org/openni_camera#openn ... hed_Topics,

the OpenNI driver encodes depth as uint16 values in millimetres.
Is there perhaps a way to convert the string representation in V-REP into uint16? Please advise.

Thanks a lot!

coppelia
Site Admin
Posts: 7966
Joined: 14 Dec 2012, 00:25

Re: ROS kinect depth image encoding problem

Post by coppelia »

I have prepared a revision of V-REP (i.e. V3.3.2 rev3) that should be online within 2-3 days. With the new version, you will be able to do something like:

Code: Select all

if (sim_call_type==sim_childscriptcall_initialization) then
    activeVisionSensor=simGetObjectHandle('Vision_sensor')

    -- Enable an image publisher and subscriber:
    pub=simExtRosInterface_advertise('/image', 'sensor_msgs/Image')
    simExtRosInterface_publisherTreatUInt8ArrayAsString(pub) -- treat uint8 arrays as strings (much faster, tables/arrays are kind of slow in Lua)
end

if (sim_call_type==sim_childscriptcall_sensing) then
    -- Publish the image of the active vision sensor:
    local data=simGetVisionSensorDepthBuffer(activeVisionSensor+sim_handleflag_codedstring)
    local r,nearClippingPlane=simGetObjectFloatParameter(activeVisionSensor,sim_visionfloatparam_near_clipping)
    local r,farClippingPlane=simGetObjectFloatParameter(activeVisionSensor,sim_visionfloatparam_far_clipping)
    nearClippingPlane=nearClippingPlane*1000 -- we want mm
    farClippingPlane=farClippingPlane*1000 -- we want mm
    data=simTransformBuffer(data,sim_buffer_float,farClippingPlane-nearClippingPlane,nearClippingPlane,sim_buffer_uint16)
    local res=simGetVisionSensorResolution(activeVisionSensor)
    d={}
    d['header']={seq=0,stamp=0, frame_id="a"}
    d['height']=res[2]
    d['width']=res[1]
    d['encoding']='16UC1' 
    d['is_bigendian']=0
    d['step']=res[1]*res[2]
    d['data']=data
    simExtRosInterface_publish(pub,d)
end
The above is not tested, but it should be along those lines.
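For completeness, the way the resulting uint16 values end up in the message's byte array can be sketched in C++ (a hypothetical helper, not part of V-REP; with is_bigendian set to 0, each value is stored least-significant byte first):

```cpp
#include <cstdint>
#include <vector>

// Pack uint16 depth values into the byte buffer of a sensor_msgs/Image,
// little-endian (is_bigendian = 0): low byte first, then high byte.
std::vector<uint8_t> packLittleEndian(const std::vector<uint16_t>& depth)
{
    std::vector<uint8_t> bytes;
    bytes.reserve(depth.size() * 2);
    for (uint16_t v : depth) {
        bytes.push_back(static_cast<uint8_t>(v & 0xFF)); // low byte
        bytes.push_back(static_cast<uint8_t>(v >> 8));   // high byte
    }
    return bytes;
}
```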

Cheers

ljklonepiece
Posts: 105
Joined: 10 Oct 2013, 14:51

Re: ROS kinect depth image encoding problem

Post by ljklonepiece »

Thank you so much!

Looking forward to the next revision release!

ljklonepiece
Posts: 105
Joined: 10 Oct 2013, 14:51

Re: ROS kinect depth image encoding problem

Post by ljklonepiece »

The previous code works perfectly on V-REP V3.3.2 rev3.

Thanks for the help!

Billie1123
Posts: 62
Joined: 08 Jun 2016, 22:47

Re: ROS kinect depth image encoding problem

Post by Billie1123 »

The new simTransformBuffer() function works great.

Nevertheless, I would like to make an observation: in the rosInterfaceTopicPublisherAndSubscriber scene, the step is defined as h*w*3 (3 RGB bytes), and in the example code above it has been defined as res[1]*res[2] (w*h). According to the sensor_msgs/Image message definition, step is the row length in bytes. So for an rgb8 image the step should be w*3, and for a 32FC1 image it should be w*4 (each distance is a 4-byte float). I know this is not a V-REP issue, but I wanted to point it out since it can be misleading. I have observed that some visualization tools like RViz work fine even with a wrong step size, as long as it is larger than the values specified above. Correct me if I am wrong.
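To make the step rule concrete, here is a small C++ sketch (a hypothetical helper, limited to the encodings discussed in this thread):

```cpp
#include <cstdint>
#include <string>

// Row step (bytes per image row) for a few sensor_msgs/Image encodings:
// step = width * bytes-per-pixel.
uint32_t rowStep(const std::string& encoding, uint32_t width)
{
    if (encoding == "rgb8")  return width * 3; // 3 x uint8 per pixel
    if (encoding == "16UC1") return width * 2; // 1 x uint16 per pixel
    if (encoding == "32FC1") return width * 4; // 1 x 4-byte float per pixel
    return 0; // unknown encoding
}
```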

Another thing I have observed: if I have a depth vision sensor and add a "flip work image vertically" filter, the published image remains the same. Why is the filter being ignored? My filters are:

Original depth image to work image
Flip work image vertically
Intensity scale work image
Work image to output image

Regards

coppelia
Site Admin
Posts: 7966
Joined: 14 Dec 2012, 00:25

Re: ROS kinect depth image encoding problem

Post by coppelia »

You are perfectly right, thanks!

About the depth vision sensor with the vertical flip filter: I cannot reproduce that behaviour (i.e. it works here). Are you sure that filter component is enabled?

Cheers

Billie1123
Posts: 62
Joined: 08 Jun 2016, 22:47

Re: ROS kinect depth image encoding problem

Post by Billie1123 »

Hello,

Yes, my filters are enabled. I think that when simGetVisionSensorDepthBuffer() is called, it retrieves the original depth image; that must be why my published image remains the same (i.e. in a floating viewer in V-REP the image does take the flip into account, but in an image viewer in RViz the image doesn't change).

I need to flip my images because V-REP uses a different coordinate system from RViz. Is there a way to change the frame of the vision sensor? According to the image message definition:
# +x should point to the right in the image
# +y should point down in the image
# +z should point into the plane of the image
The only workaround I can come up with is writing a plugin in which I manage all these issues.
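The flip part of such a plugin would be simple enough; a minimal C++ sketch (hypothetical function, operating on a row-major byte buffer):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Flip a row-major image buffer vertically by swapping whole rows.
// step is the row length in bytes.
void flipVertically(std::vector<uint8_t>& img, size_t height, size_t step)
{
    for (size_t top = 0, bottom = height - 1; top < bottom; ++top, --bottom)
        std::swap_ranges(img.begin() + top * step,
                         img.begin() + (top + 1) * step,
                         img.begin() + bottom * step);
}
```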

Furthermore, the simTransformBuffer() function treats areas in which no point has been detected as points with value "1". That is interpreted as a detected point at the farthest distance. Shouldn't pixels with value "1" be omitted?
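A possible post-processing step for that (a hypothetical helper; it assumes consumers treat a depth value of 0 as "no reading", as the OpenNI driver's output is conventionally interpreted):

```cpp
#include <cstdint>
#include <vector>

// Replace "no hit" pixels (depth at or beyond the far clipping plane,
// i.e. normalized value 1 before conversion) with 0, which depth
// consumers conventionally treat as "no reading".
void maskNoHits(std::vector<uint16_t>& depthMm, uint16_t farPlaneMm)
{
    for (uint16_t& v : depthMm)
        if (v >= farPlaneMm) v = 0;
}
```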

coppelia
Site Admin
Posts: 7966
Joined: 14 Dec 2012, 00:25

Re: ROS kinect depth image encoding problem

Post by coppelia »

You are right, there is currently no way to write the modified image back to the depth buffer. In the next release there will be a filter item called Work image to output depth image that does exactly that.

The simTransformBuffer function doesn't know what the buffer contains (an image, a depth buffer, or arbitrary data). So I guess your best option would be to handle this inside a plugin. It is always convenient anyway to have a plugin hosting the various helper functions you might need in your project, so that you don't have to write a new plugin each time.

Cheers
