## Conversion of Depth Map to 3D point cloud using Kinect

hassan
Posts: 10
Joined: 20 Aug 2013, 12:31

### Conversion of Depth Map to 3D point cloud using Kinect

Hello,

I have a question regarding the conversion of depth values into 3D point clouds and their transformation to the world reference frame. I am using a Kinect vision sensor. I set the angle of view from the focal length that I got by calibrating a real camera: e.g. for a camera with a focal length of 534 pixels and an image width of 640 pixels, I use the formula Angle=2*atan(640/(534*2)), which comes to about 61.86 degrees.
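The relation used above can be checked with a few lines of Python. This is only a sketch of the ideal pinhole relation between image width, focal length in pixels, and angle of view; the function names are made up for illustration and are not part of any V-REP API.

```python
import math

def view_angle_deg(image_width_px, focal_px):
    """Angle of view (degrees) of an ideal pinhole camera along the image width."""
    return math.degrees(2.0 * math.atan(image_width_px / (2.0 * focal_px)))

def focal_from_view_angle(image_width_px, angle_deg):
    """Inverse relation: focal length in pixels from the angle of view."""
    return image_width_px / (2.0 * math.tan(math.radians(angle_deg) / 2.0))

angle = view_angle_deg(640, 534)            # about 61.86 degrees
focal = focal_from_view_angle(640, angle)   # recovers the 534 pixels
```

The inverse function is handy when the sensor's view angle is fixed first and matching values of fx and fy are needed for the back-projection.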

After that I retrieve the depth buffer and write it to a .txt file, as explained in the other post a few days ago. Now, using MATLAB, I want to convert this depth map into a point cloud using something like:


``````
%% dImage contains the depth values retrieved from V-REP
PointCloud=zeros(3,480*640); %% preallocate one column per pixel
Count=1;
for m=1:480
    for n=1:640
        Z=dImage(m,n)*1000; %% multiplying by 1000 converts m to mm
        PointCloud(1,Count)=(n-cx)*(Z/fx); %% X
        PointCloud(2,Count)=(m-cy)*(Z/fy); %% Y
        PointCloud(3,Count)=Z;
        Count=Count+1;
    end
end
``````
I also get the absolute pose of each sensor using 'simGetObjectMatrix'. I then apply this transformation matrix to the computed point cloud, for two different sensors. I should get a well-registered point cloud from the two sensors, but instead I get weird transformations. I know there is something wrong in my interpretation of the coordinate system, but I am not able to work out what exactly it is.
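For comparison, here is a minimal Python sketch of the same pipeline: pinhole back-projection followed by a 3x4 transform. It rests on several assumptions that should be checked against the actual setup: the matrix is taken to be 12 values in row-major order (the layout simGetObjectMatrix is understood to return), depths are taken to be metric, and `normalized_to_metric` assumes the depth buffer holds values in 0..1 that map linearly between the near and far clipping planes. All function names are hypothetical. The direction of the sensor's X and Y axes relative to image columns and rows is not handled here and is a likely place for a sign flip.

```python
def normalized_to_metric(d, near, far):
    """Assumed mapping of a 0..1 depth value to a distance between the clipping planes."""
    return near + d * (far - near)

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project rows of metric depths into camera-frame (x, y, z) points."""
    points = []
    for v, row in enumerate(depth):      # v: row index (image y)
        for u, z in enumerate(row):      # u: column index (image x)
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

def transform_points(matrix12, points):
    """Apply a 3x4 row-major transform (12 values) to a list of points."""
    m = matrix12
    return [(m[0] * x + m[1] * y + m[2] * z + m[3],
             m[4] * x + m[5] * y + m[6] * z + m[7],
             m[8] * x + m[9] * y + m[10] * z + m[11]) for x, y, z in points]

# Tiny 2x2 example with unit focal lengths and the principal point at pixel (0, 0).
depth = [[1.0, 1.0], [2.0, 2.0]]
camera_points = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
shift_x = [1, 0, 0, 5,  0, 1, 0, 0,  0, 0, 1, 0]   # pure translation of 5 along X
world_points = transform_points(shift_x, camera_points)
```

If two sensors still fail to register after this, comparing a single known pixel (for example the image centre) in both world-frame clouds is a quick way to isolate which axis convention is off.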

I hope you will be able to help.

best regards

coppelia
Posts: 6766
Joined: 14 Dec 2012, 00:25

### Re: Conversion of Depth Map to 3D point cloud using Kinect

Hello,

Before going further: did you have a look at the simulation model Models/components/sensors/3D laser scanner Fast.ttm? That model is doing exactly what you are describing, but much faster, since the coordinate calculation is done in a vision sensor filter (Extract coordinates from work image).

To customize your vision sensor to match the resolution and view angle of the Kinect, just take the vision sensor fast3DLaserScanner_sensor and, in its script simulation parameters, adjust its scan angle (scanAngle). Then open the vision sensor's dialog, open the filter dialog, double-click Extract coordinates from work image, and adjust the resolution (Point count along X/Y). Just make sure that the vision sensor resolution is higher than the filter resolution. Also, in the child script attached to the vision sensor, make sure you have a buffer large enough for the points to draw into the scene:


``points=simAddDrawingObject(sim_drawing_spherepoints,0.01,0,-1,bufferSize,nil,nil,nil,red)``
Cheers

hassan
Posts: 10
Joined: 20 Aug 2013, 12:31

### Re: Conversion of Depth Map to 3D point cloud using Kinect

Hello,

Thanks a lot for your reply. I just saw it. OK, I understand the idea. I have one question though: I am reading the script associated with the 3D laser scanner Fast and I don't understand the transformation it is carrying out. It gets the transformation matrix of the vision sensor, and then it gets the transformation matrix of the object the script is attached to, which I am assuming is the laser scanner. It then inverts the second matrix and multiplies it with the first. After that, before each 3D point is put in the table, it is also transformed using the resultant matrix.

Can you please briefly explain what is happening? Are the points which we get after applying the filter initially in the vision sensor's reference frame? What does this transformation do? It seems like it just transforms the points first from the vision sensor's reference frame to the world, and then transforms them back to the laser scanner's reference frame. Is that right?

Best regards,
Hassan

coppelia
Posts: 6766
Joined: 14 Dec 2012, 00:25

### Re: Conversion of Depth Map to 3D point cloud using Kinect

Hello Hassan,

the 3D points returned by the filter are relative to the vision sensor object. But:
• We need to display those 3D points in the scene, so we multiply them (the original points) with the vision sensor transformation matrix
• We need to send those 3D points to other applications or scripts. And in that case, we send the points expressed relative to the vision sensor model (i.e. not the vision sensor object). But here you are free to send them relative to another reference frame.
Cheers
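The combination described in this exchange (invert the model's matrix, then multiply by the sensor's matrix) can be sketched in Python as follows. It assumes both matrices are rigid 3x4 transforms stored as 12 row-major values; the helper names and the example poses are invented for illustration.

```python
def invert_rigid(m):
    """Invert a rigid 3x4 row-major transform [R|t] as [R^T | -R^T t]."""
    r = [[m[0], m[1], m[2]], [m[4], m[5], m[6]], [m[8], m[9], m[10]]]
    t = [m[3], m[7], m[11]]
    out = []
    for i in range(3):
        row = [r[0][i], r[1][i], r[2][i]]   # row i of R^T
        out.extend(row + [-(row[0] * t[0] + row[1] * t[1] + row[2] * t[2])])
    return out

def multiply(a, b):
    """Compose two 3x4 row-major transforms; the result applies b first, then a."""
    out = []
    for i in range(3):
        ar = a[4 * i:4 * i + 4]
        for j in range(4):
            val = ar[0] * b[j] + ar[1] * b[4 + j] + ar[2] * b[8 + j]
            if j == 3:
                val += ar[3]                # implicit bottom row [0, 0, 0, 1]
            out.append(val)
    return out

# Hypothetical poses: model rotated 90 degrees about Z at (1, 0, 0); sensor unrotated at (1, 2, 3).
model_in_world = [0.0, -1.0, 0.0, 1.0,  1.0, 0.0, 0.0, 0.0,  0.0, 0.0, 1.0, 0.0]
sensor_in_world = [1.0, 0.0, 0.0, 1.0,  0.0, 1.0, 0.0, 2.0,  0.0, 0.0, 1.0, 3.0]

# Maps points from the sensor frame into the model frame.
sensor_in_model = multiply(invert_rigid(model_in_world), sensor_in_world)
```

Multiplying sensor-frame points by `sensor_in_world` gives world coordinates for display, while `sensor_in_model` gives coordinates relative to the model, which matches the two cases listed above.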

ahundt
Posts: 112
Joined: 29 Jan 2015, 04:21

### Re: Conversion of Depth Map to 3D point cloud using Kinect

I've got a floating-point depth image and an RGB image in the Python API. What would be the best way to display them as a point cloud?

Right now it looks like the best option might be to calculate the xyz and rgb arrays in Python, print all the data as a Lua string that calls simCreatePointCloud and simInsertPointsIntoPointCloud, and then pass that string to simxCallScriptFunction, as in the example below.


``````
# 3. Now send a code string to execute some random functions:
code="local octreeHandle=simCreateOctree(0.5,0,1)\n" \
     "simInsertVoxelsIntoOctree(octreeHandle,0,{0.1,0.1,0.1},{255,0,255})\n" \
     "return 'done'"
res,retInts,retFloats,retStrings,retBuffer=vrep.simxCallScriptFunction(clientID,"remoteApiCommandServer",vrep.sim_scripttype_childscript,'executeCode_function',[],[],[code],emptyBuff,vrep.simx_opmode_blocking)
if res==vrep.simx_return_ok:
    print ('Code execution returned: ',retStrings[0])
else:
    print ('Remote function call failed')
``````
Perhaps there is a way things could be more efficiently passed via the buffer parameter?

Ideally it would be cool if there were a function to specify a depth image, an rgb image, and the parameters of the intrinsic matrix.

coppelia
Posts: 6766
Joined: 14 Dec 2012, 00:25

### Re: Conversion of Depth Map to 3D point cloud using Kinect

Hello,

why are you sending a code string instead of sending the float data?
I would have a hard-coded function inside of a child script that receives a buffer of ints and a buffer of floats. The ints indicate the size of the respective floats that belong to the depth points or the image data. Of course you can also send the code to execute as a string, but I wouldn't go the way of converting all the data to a string and executing that string, since this would represent too much data to send.
Maybe I misunderstood you...

Cheers
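The scheme suggested here, one flat float buffer plus an int buffer giving the length of each segment, might look like this on the Python side. This is a sketch only: the helper names are hypothetical, and the splitting function merely mirrors what the receiving child script would have to do in Lua.

```python
def pack_segments(*segments):
    """Concatenate float sequences; return (sizes, flat) for the int and float buffers."""
    sizes = [len(s) for s in segments]
    flat = [float(x) for s in segments for x in s]
    return sizes, flat

def split_segments(sizes, flat):
    """Inverse operation, as the receiving script would perform it."""
    out, start = [], 0
    for n in sizes:
        out.append(flat[start:start + n])
        start += n
    return out

depth_points = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]   # two xyz points
image_data = [1.0, 0.0, 0.0]                    # one rgb triplet stored as floats
in_ints, in_floats = pack_segments(depth_points, image_data)
recovered = split_segments(in_ints, in_floats)
```

The two lists would then go into the integer and float input arguments of simxCallScriptFunction in place of a generated code string.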

ahundt
Posts: 112
Joined: 29 Jan 2015, 04:21

### Re: Conversion of Depth Map to 3D point cloud using Kinect

Perhaps there is some Lua example code I missed for the point cloud, or the docs could be updated with an example? I did a text search for the two API functions simCreatePointCloud and simInsertPointsIntoPointCloud, but only saw an example in C.

You understood right, and yes, your suggestion sounds much better; it is just that the available code examples fit this slower approach better. It had occurred to me, but I'm not sure how to access/split the buffers in Lua. Perhaps you know of some reference code I could use to understand the Lua API component better?

Also, the documentation of the point cloud function is not very clear about the data layout of the points and colors. Should they be float [x,y,z,x,y,z...] and char [r,g,b,r,g,b...]?

I'm also trying to find an example of configuring simint options... I tried the V-REP docs Google search tool for "simint options example" and skimmed all the tutorials, but couldn't find Lua examples. There also don't seem to be any point cloud object parameter ID definitions for those options, unless I'm misunderstanding the use case for those definitions.

Hopefully I didn't just miss everything... thanks for your help!

coppelia
Posts: 6766
Joined: 14 Dec 2012, 00:25

### Re: Conversion of Depth Map to 3D point cloud using Kinect

It is true that Lua is not very fast when you have to manipulate large arrays. So there are several approaches you can take:
• You can use LuaJIT to make it run faster. As far as I can tell, this functionality is only working on Windows: in system/usrset.txt, set useExternalLuaLibrary to true. Then all Lua functionality will go via the very thin wrapper v_repLua.dll, which by default links to LuaJIT. On other platforms things have not been tested, or not tested enough.
• Instead of sending arrays of floats, you can also send the floats packed inside of strings. This way you could send a string for each specific data buffer you have to transmit, then unpack the floats with simUnpackFloatTable (which is fast).

The API function simInsertPointsIntoPointCloud expects a simple array of xyz values, i.e. {x1,y1,z1,x2,y2,z2,...,xn,yn,zn}.
If bit 1 (i.e. 2) of options is set, then the color argument should be {r1,g1,b1,r2,g2,b2,...,rn,gn,bn}. If that bit is not set, then you can specify a single triplet that defines the color of all points: {R,G,B}.

I am not sure what you mean with simint options... is this in relation with simInsertPointsIntoPointCloud?

Cheers
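The layout described in this reply can be prepared on the Python side roughly as follows. This is a sketch under the stated description only (interleaved xyz values, and bit 1 of options selecting per-point colors); the function and constant names are hypothetical.

```python
PER_POINT_COLORS = 2   # bit 1 of the options argument, as described above

def build_insert_arguments(points, colors):
    """Return (options, flat_points, flat_colors) in the interleaved layout.

    points: sequence of (x, y, z); colors: one (r, g, b) for all points,
    or exactly one (r, g, b) per point.
    """
    flat_points = [float(c) for p in points for c in p]
    if len(colors) == 1:
        # Single triplet shared by every point: the per-point bit stays cleared.
        return 0, flat_points, [int(c) for c in colors[0]]
    if len(colors) != len(points):
        raise ValueError("need one color in total or one color per point")
    flat_colors = [int(c) for rgb in colors for c in rgb]
    return PER_POINT_COLORS, flat_points, flat_colors

options, xyz, rgb = build_insert_arguments(
    [(0, 0, 1), (1, 2, 3)], [(255, 0, 0), (0, 255, 0)])
```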

ahundt
Posts: 112
Joined: 29 Jan 2015, 04:21

### Re: Conversion of Depth Map to 3D point cloud using Kinect

I'm trying to send the arrays of floats packed inside of strings, but it appears there is a bug in vrep.py when running Python with an ASCII default encoding. It appears the current code might expect the default encoding to be UTF-8.

I decided to give that a try and pack everything into a string. I encoded a numpy array of float32 values with the normal numpy array tostring() function, with the default ASCII encoding (I'm in the USA on OSX 10.11.6 with Python 2.7.13 and numpy 1.12.1), and I get the following:


``````
File "/Users/athundt/source/costar_plan/costar_google_brainrobotdata/vrep_grasp.py", line 215, in _visualize_one_grasp_attempt
self.create_point_cloud(point_cloud_display_name, point_cloud, base_to_camera_vec_quat_7, rgb_image, parent_handle)
File "/Users/athundt/source/costar_plan/costar_google_brainrobotdata/vrep_grasp.py", line 137, in create_point_cloud
v.simx_opmode_blocking)
File "/Users/athundt/source/costar_plan/costar_google_brainrobotdata/vrep/vrep.py", line 1375, in simxCallScriptFunction
a=a.encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 35: ordinal not in range(128)
``````
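One plausible reading of this traceback: in Python 2, calling `.encode('utf-8')` on a byte string first decodes it implicitly with the default ASCII codec, and packed float data almost always contains bytes above 0x7F. The sketch below, written for Python 3 with only the standard library, shows that packed 32-bit floats are not ASCII text and should be carried as raw bytes. The little-endian 32-bit layout is an assumption about what the receiving side expects.

```python
import struct

values = [-1.0, 0.5, 3.25]

# Pack as little-endian 32-bit floats: 4 raw bytes per value.
packed = struct.pack('<%df' % len(values), *values)

# Raw float bytes are binary data, not text: -1.0 contains the bytes 0x80 and 0xBF.
try:
    packed.decode('ascii')
    is_ascii = True
except UnicodeDecodeError:
    is_ascii = False

# Unpacking the same bytes recovers the values exactly (all three are representable).
restored = list(struct.unpack('<%df' % (len(packed) // 4), packed))
```

Under this reading, any step that treats the buffer as text and re-encodes it is where the failure would occur, so the data needs to stay a byte string from packing through to the ctypes call.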
Then I tried making this change to simxCallScriptFunction to deal with ascii vs unicode:


``````
    concatStr=unicode('', 'utf-8').encode('utf-8')
    for i in range(len(inputStrings)):
        a=unicode(inputStrings[i], 'utf-8')
        a=a+'\0'
        if type(a) is str:
            a=a.encode('utf-8')
        concatStr=concatStr+a
    #c_inStrings  = (ct.c_char*len(concatStr))(*concatStr)
    c_inStrings  = ct.c_char_p(concatStr)
``````
And I get this error:


``````
File "/Users/athundt/source/costar_plan/costar_google_brainrobotdata/vrep/vrep.py", line 1377, in simxCallScriptFunction
c_inStrings  = (ct.c_char*len(concatStr))(*concatStr)
TypeError: one character string expected
``````
I also tried


``````from __future__ import unicode_literals
``````
which got the same one character string error. After that I tried:


``````import sys
sys.setdefaultencoding("utf-8")
``````
which didn't even start running.

Here is where I looked for the solutions:
https://stackoverflow.com/questions/213 ... -0-ordinal
https://stackoverflow.com/questions/418 ... g-to-utf-8

For reference, here is my Lua function to create point clouds, but I'm not getting to this point because sending the strings fails:


``````
createPointCloud_function=function(inInts,inFloats,inStrings,inBuffer)
    -- Create (or reuse) a point cloud with a specific name and pose, then fill it
    if #inStrings>=2 and #inFloats>=3 then
        local cloudHandle=simGetObjectHandle(inStrings[1])
        if cloudHandle==-1 then
            cloudHandle=simCreatePointCloud(0.01,10,0)
        end
        local parent_handle=inInts[1]
        local errorReportMode=simGetInt32Parameter(sim_intparam_error_report_mode)
        simSetInt32Parameter(sim_intparam_error_report_mode,0) -- temporarily suppress error output (because we are not allowed to have two times the same object name)
        local result=simSetObjectName(cloudHandle,inStrings[1])
        if result==-1 then
            simDisplayDialog('Setting object name failed',inStrings[1],sim_dlgstyle_ok,false)
        end
        simSetInt32Parameter(sim_intparam_error_report_mode,errorReportMode) -- restore the original error report mode
        simSetObjectPosition(cloudHandle,parent_handle,inFloats)
        if #inFloats>=7 then
            local orientation={unpack(inFloats,4,7)} -- the 4 quaternion entries, from 4 to 7
            simSetObjectQuaternion(cloudHandle,parent_handle,orientation)
        end
        local cloud=simUnpackFloatTable(inStrings[2])
        local colors=nil
        -- bit 0 (value 1) set: points are expressed in the cloud's reference frame
        local options=1
        if #inStrings>2 then
            -- bit 1 (value 2) also set: each point has its own color
            options=3
            colors=simUnpackUInt8Table(inStrings[3])
        end
        simInsertPointsIntoPointCloud(cloudHandle,options,cloud,colors)
        return {cloudHandle},{},{},'' -- return the handle of the point cloud
    end
end
``````
Any suggestions?

coppelia wrote:I am not sure what you mean with simint options... is this in relation with simInsertPointsIntoPointCloud?

I was trying to ask whether there is a clean way to set the bits, other than just using numbers that happen to set them. For now I'm just using 1, 2, or 3 as appropriate.
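One common way to avoid bare numbers, shown here only as a sketch, is to define named constants for the bits and combine them with bitwise OR. The constant names are hypothetical (no such definitions are mentioned in this thread), and the bit meanings are taken from the comments in the Lua function above and from the earlier reply.

```python
# Hypothetical names; bit meanings as stated earlier in this thread.
POINTS_IN_CLOUD_FRAME = 1 << 0   # value 1: points are relative to the point cloud
COLOR_PER_POINT = 1 << 1         # value 2: one rgb triplet per point

def has_flag(options, flag):
    """True if the given bit is set in options."""
    return options & flag != 0

options = POINTS_IN_CLOUD_FRAME | COLOR_PER_POINT   # same value as the literal 3
```

The same pattern carries over to Lua by defining the two constants at the top of the child script and adding them together.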

coppelia