ppf simulation problems

emmanuelluque · Post by **emmanuelluque** » 21 May 2020, 10:21

Hello,

I am having problems with the ppf simulation setting. I am developing a deep-reinforcement-learning algorithm that teaches a mobile robot to follow a circuit. To speed up training I tried raising the ppf value, but the higher the ppf value, the more information is missing from my plots, which show the number of decisions my robot takes in a trial. With a ppf of 2 instead of 1 (no ppf), under exactly the same conditions (only the ppf value changes), my plots show only half of the decisions taken by my robot, so I conclude that I am losing information. Could you help me solve this problem? I understood that the ppf setting would not change the execution of the code at all...

I am using a Python remote script in synchronous mode.

coppelia · Post by **coppelia** » 22 May 2020, 12:54

Hello,

what you describe about the ppf parameter is correct: normally ppf (simulation passes per frame) should have no influence on the simulation, only on the display (with ppf=2, half of the frames are displayed).
What exactly is going on in your situation/set-up I cannot say. But maybe you can check directly at the source of the plot data what happens... How is the data acquired, and when? If you print a debug message each time a data point is acquired, does that debug print behave correctly, or does it show the same kind of behaviour?
Can you provide a minimalistic, self-contained scene that illustrates your problem only in CoppeliaSim, to make sure it is a CoppeliaSim problem?
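
One way to run the check described above is to log the simulation time of every acquired data point and look at the spacing between consecutive points. The following is a minimal sketch, independent of the simulator: the helper name `steps_per_decision`, the sample timestamps and the 50 ms time step are assumptions made for illustration, not values from this thread. With the legacy remote API, `simxGetLastCmdTime` is one possible source for such timestamps.

```python
def steps_per_decision(sim_times, dt):
    """For each pair of consecutive data points, return how many
    simulation steps of length dt elapsed between them."""
    counts = []
    for previous, current in zip(sim_times, sim_times[1:]):
        # round() absorbs floating-point noise in the division
        counts.append(round((current - previous) / dt))
    return counts


# Hypothetical timestamps (in seconds) logged next to each data point.
one_pass_log = [0.05, 0.10, 0.15, 0.20]  # one simulation step per data point
two_pass_log = [0.10, 0.20, 0.30, 0.40]  # two simulation steps per data point

print(steps_per_decision(one_pass_log, 0.05))  # [1, 1, 1]
print(steps_per_decision(two_pass_log, 0.05))  # [2, 2, 2]
```

If the counts stay at 1 for every ppf value, the data source behaves the same in both runs and the difference has to come from somewhere else.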

Cheers

emmanuelluque · Post by **emmanuelluque** » 23 May 2020, 12:05

Hello,
Thank you for your reply.

I have tried to debug the code and I have taken two captures to show you what is wrong with ppf mode.

First capture: ppf = 1

https://upcomillas-my.sharepoint.com/:i ... Q?e=KxYdYc

Second capture: ppf = 2

https://upcomillas-my.sharepoint.com/:i ... A?e=mr9yOI

In the first capture, with ppf = 1, the terminal window shows that my robot is at its 47th step at the moment it touches the wall. In the second capture, with ppf = 2, the robot is at its 23rd step when it touches the wall, which is about half the count in the first capture.
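
The two counts are consistent with, though they do not prove, one particular reading: the robot needs the same number of simulation passes to reach the wall in both runs, but with ppf = 2 each decision spans two passes. The arithmetic is sketched below; `decisions_before_event` is a hypothetical helper, and only the numbers 47 and 23 come from the captures.

```python
def decisions_before_event(passes_until_event, passes_per_decision):
    """Number of whole decisions completed before the event, if every
    decision consumes passes_per_decision simulation passes."""
    return passes_until_event // passes_per_decision


print(decisions_before_event(47, 1))  # 47
print(decisions_before_event(47, 2))  # 23
```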

Code:

# env, dqn_agent, trials, trial_len and the lists used below are defined
# earlier in the script; plt is matplotlib.pyplot.
for trial in range(trials):
    print('trial:', trial)
    cur_state = env.reset()[:12]

    # Store the step count of the previous trial before resetting it.
    if num_steps > 0:
        num_steps_evolution.append(num_steps)
    if len(num_steps_evolution) > 0:
        distance_mean_num.append(len(num_steps_evolution))
        max_reward_list.append(116)
    num_steps = 0

    for step in range(trial_len):
        print('step', step)
        action = dqn_agent.act(cur_state)
        print('accion: ')
        print(action)
        new_state, reward, done, info = env.step(action)
        #if step > 0:
            #trial_reward += reward
        if num_steps > 0:
            distance_step.append(min(new_state[:10]))
            #robot_position_x.append(new_state[-2])
            #robot_position_y.append(new_state[-1])
        num_steps += 1
        # Keep every valid sensor reading (10000 marks "nothing detected").
        for i in range(0, len(new_state[:10])):
            if new_state[i] != 10000:
                distance_step_full.append(new_state[i])

        #print(new_state,reward,done,info)

        # reward = reward if not done else -20     (theirs)
        new_state = new_state[:12]
        dqn_agent.step += 1
        #dqn_agent.remember(cur_state, action, reward, new_state, done, info)

        #dqn_agent.replay()       # internally iterates default (prediction) model
        #dqn_agent.target_train() # iterates target model

        cur_state = new_state
        print(cur_state, reward, done, info)
        if done:
            dqn_agent.trial += 1
            dqn_agent.step = 0
            break

    if info == {'is_success': False}:
        print("Failed to complete in trial {}".format(trial))
    dqn_agent.epsilon *= dqn_agent.epsilon_decay
    # Plot the number of steps per trial once trial 30 is reached.
    if trial == 30:
        plt.figure()
        plt.xlabel('Trial')
        plt.ylabel('Steps')
        plt.plot(distance_mean_num, num_steps_evolution)
        mean_num_steps_evolution = sum(num_steps_evolution) / len(num_steps_evolution)
        valores = [[max(num_steps_evolution), mean_num_steps_evolution, (num_steps_evolution.index(max(num_steps_evolution)) + 1)]]
        etiquetas_col = (u'Max', u'Mean', u'Max Trial')
        plt.table(cellText=valores, colLabels=etiquetas_col, loc='top')
        plt.savefig("/home/cpsr/Graficos/Num_steps_evolution_{}.png".format(name_exp))

This is the loop where I update the step count; when my robot reaches trial 30, I make the plot from the information I have recorded. I acquire the data with the 12 ultrasonic sensors of my robot (Pioneer-P3DX), which measure the distance to the wall, and this part is working well. The part that is not working is the number of steps my robot takes to follow the circuit when ppf is greater than 1, which I measure in the loop shown above.

Thank you very much for your help.

coppelia · Post by **coppelia** » 26 May 2020, 10:19

My guess is that you are not correctly using the synchronous mode. I cannot reproduce your situation.
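
For reference, a minimal sketch of a synchronous stepping loop with the legacy remote API follows. It assumes the usual pattern of one `simxSynchronousTrigger` call per simulation pass, followed by `simxGetPingTime` to wait until that pass has finished. `run_synchronous_steps` and `FakeApi` are hypothetical names: `FakeApi` only records calls so that the sketch runs without CoppeliaSim, and its opmode constant is a placeholder.

```python
def run_synchronous_steps(api, client_id, n_steps):
    """Advance the simulation by n_steps passes, one pass per trigger."""
    api.simxSynchronous(client_id, True)  # enable synchronous mode
    api.simxStartSimulation(client_id, api.simx_opmode_blocking)
    for _ in range(n_steps):
        api.simxSynchronousTrigger(client_id)  # request exactly one pass
        api.simxGetPingTime(client_id)         # wait until that pass is done
    api.simxStopSimulation(client_id, api.simx_opmode_blocking)


class FakeApi:
    """Hypothetical stand-in for the remote API module; records call names."""
    simx_opmode_blocking = 0  # placeholder, not the real constant

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that are not defined above.
        if not name.startswith('simx'):
            raise AttributeError(name)

        def record(*args):
            self.calls.append(name)
            return 0
        return record


fake = FakeApi()
run_synchronous_steps(fake, client_id=0, n_steps=3)
print(fake.calls.count('simxSynchronousTrigger'))  # 3
```

With a real connection, the module imported from the legacy remote API Python bindings and the client ID returned by `simxStart` would take the place of `FakeApi()` and `0`.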

Are you using the legacy remote API, or the B0-based remote API?

Cheers

emmanuelluque · Post by **emmanuelluque** » 29 May 2020, 10:53

Hello,

I have checked the synchronous-mode procedure with my project supervisors and it seems to be all right... I am using the legacy remote API.

Thank you for your kind reply.

CoppeliaSim forums
