Advances in humanoid control and perception
143 p.
Doctoral thesis: Università della Svizzera italiana, 2016

English
One day there will be humanoid robots among us doing our boring, time-consuming, or dangerous tasks. They might cook a delicious meal for us or do the groceries. For this to become reality, many advances need to be made in the artificial intelligence of humanoid robots. The ever-increasing available computational processing power opens new doors for such advances. In this thesis we develop novel algorithms for humanoid control and vision that harness this power, and we apply these methods to an iCub humanoid upper body with 41 degrees of freedom. For control, we develop Natural Gradient Inverse Kinematics (NGIK), a sampling-based optimiser that applies natural evolution strategies to inverse kinematics. The resulting algorithm makes very few assumptions and allows far more freedom in the constraints that can be defined than its Jacobian-based counterparts. A special graph-building procedure builds Task-Relevant Roadmaps (TRMs) by iteratively applying NGIK and storing the results. TRMs form searchable graphs of kinematic configurations on which a wide range of task-relevant humanoid movements can be planned. By coordinating several instances of NGIK, we develop a fast parallelised version of the TRM-building algorithm. In contrast to the offline TRM algorithms, we also develop Natural Gradient Control (NGC), which uses the optimisation pass of NGIK directly as an online control signal. For vision, we develop dynamic vision algorithms that form cyclic information flows affecting their own processing. Deep Attention Selective Networks (dasNet) implement feedback in convolutional neural networks through a gating mechanism steered by a learned policy. Through this feedback, dasNet can focus on different features of an image in light of previously gathered information and improve classification, with state-of-the-art results at the time of publication.
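As a rough illustration of the sampling-based idea behind NGIK, the sketch below runs a simplified evolution-strategy search (a plain search-gradient step with fitness shaping, not the thesis's actual NGIK algorithm) over the joint angles of a hypothetical 2-link planar arm. All function names, the arm model, and the hyperparameters are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def fk(angles, lengths=(1.0, 1.0)):
    """Forward kinematics of a toy 2-link planar arm: joint angles -> end-effector (x, y)."""
    a1, a2 = angles
    l1, l2 = lengths
    return np.array([l1 * np.cos(a1) + l2 * np.cos(a1 + a2),
                     l1 * np.sin(a1) + l2 * np.sin(a1 + a2)])

def es_ik(target, iters=300, pop=50, sigma=0.3, lr=0.5, decay=0.98, seed=0):
    """Sampling-based IK: estimate a search gradient from fitness-shaped
    Gaussian samples and move the mean of the search distribution."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(2)                                   # current joint-angle estimate
    for _ in range(iters):
        eps = rng.standard_normal((pop, 2))              # sampled perturbations
        cand = mean + sigma * eps                        # candidate joint configurations
        cost = np.array([np.linalg.norm(fk(c) - target) for c in cand])
        u = -(cost - cost.mean()) / (cost.std() + 1e-8)  # fitness shaping: lower cost, higher utility
        mean = mean + lr * sigma * (u @ eps) / pop       # search-gradient step on the mean
        sigma *= decay                                   # anneal exploration
    return mean

# Usage: find joint angles that place the end effector near a target point.
angles = es_ik(np.array([1.2, 0.8]))
```

Because the optimiser only ever evaluates the forward model, any constraint that can be folded into the cost function can be used, which is the freedom the abstract contrasts with Jacobian-based methods.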
We then develop PyraMiD-LSTM, which processes 3D volumetric data by employing a novel convolutional Long Short-Term Memory network (C-LSTM) to compute pyramidal contexts for every voxel, and combines them to perform segmentation. This resulted in state-of-the-art performance on a segmentation benchmark. The work on control and vision is integrated into an application on the iCub robot: a Fast-Weight PyraMiD-LSTM is developed that dynamically generates the weights of a C-LSTM layer given the actions of the robot. An explorative policy using NGC generates a stream of data that the Fast-Weight PyraMiD-LSTM has to predict. The resulting integrated system learns to model the effects of head and hand movements on future visual input. To our knowledge, this is the first effective visual prediction system on an iCub.
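The fast-weight mechanism in the final paragraph can be caricatured in a few lines: a small "slow" network maps the robot's action to the weights of another layer, so the same visual input is transformed differently depending on what the robot is doing. This minimal numpy sketch shows only the weight-generation idea (the thesis generates weights for a C-LSTM layer, not a linear one); the sizes, names, and the linear hypernetwork are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: action vector, input features, output features.
A, D_IN, D_OUT = 4, 8, 8

# "Slow" parameters: a linear hypernetwork mapping an action
# to a full weight matrix for the fast layer.
W_gen = rng.standard_normal((A, D_OUT * D_IN)) * 0.1

def fast_layer(x, action):
    """Fast-weight layer: weights are generated on the fly from the action,
    then applied to the input features."""
    W_fast = (action @ W_gen).reshape(D_OUT, D_IN)  # action-conditioned weights
    return np.tanh(W_fast @ x)

# The same input is processed with different effective weights per action.
x = rng.standard_normal(D_IN)
out_a = fast_layer(x, np.array([1.0, 0.0, 0.0, 0.0]))
out_b = fast_layer(x, np.array([0.0, 1.0, 0.0, 0.0]))
```

Conditioning the weights (rather than just the inputs) on the action is what lets the predictor model how the robot's own movements reshape future visual input.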
- Language: English
- Classification: Computer science and technology
- Persistent URL: https://n2t.net/ark:/12658/srd1318642