
3-D scans of objects used in new perception algorithm
CREDIT: BEN BURCHFIEL

In this new age of more sophisticated AI and machine learning algorithms, more autonomous robots are being developed every day. But for the past 50 years, most robotic deployments performed only a narrow, fixed set of tasks. Most if not all industrial robots over that time exclusively performed repetitive tasks, so those equipped with camera sensors needed only 2-D perception. A new breed of collaborative robot, however, will take on more complicated responsibilities, like picking up and placing objects of varying sizes and shapes. Many robots will also need to map free space in 3-D so they can position themselves and plan their movements without collisions in constrained environments. The challenge in building an autonomous robot, then, is making it capable of performing helpful tasks in a real-world setting. To do this, 3-D machine vision is often needed so the robot can sense variations in its physical environment and adapt accordingly.

Duke University graduate student Ben Burchfiel and George Konidaris, an assistant professor of computer science at Brown University, have built a new software algorithm that enables machines to interpret 3-D objects in a deeper, more human-like way. People can glance at a new object and intuitively know what it is and how it is oriented in space relative to them, even when the view of the object is partially obstructed. The researchers' 3-D perception algorithm can identify what a new object is, and its orientation in space, without first studying it from multiple views. It can also infer and reconstruct any parts that are partially obscured. The team showcased their new 3-D learning design at the 2017 Robotics: Science and Systems Conference in Cambridge, Massachusetts.
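
To ground what "interpreting a 3-D object" means here, the sketch below shows, in Python with NumPy, the kind of input such a system typically works from: a depth-camera point cloud converted into a fixed-size voxel occupancy grid. The `voxelize` helper and the grid resolution are illustrative assumptions, not details from the paper.

```python
import numpy as np

def voxelize(points, resolution=30):
    """Map a 3-D point cloud (e.g. from a depth camera) into a
    binary occupancy grid. A partial scan simply leaves many cells
    empty -- the 'blind spots' the algorithm must later fill in.
    Note: resolution=30 is an illustrative choice, not a figure
    from the paper.
    """
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    # One uniform scale factor preserves the object's proportions.
    scale = (resolution - 1) / max((points.max(axis=0) - lo).max(), 1e-9)
    idx = np.floor((points - lo) * scale).astype(int)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid
```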

The big positives of this new approach:

  • The 3-D perception algorithm makes fewer mistakes and is three times faster than the best current methods.
  • The robot needs only a limited number of training examples and uses them to generalize to new objects.
  • The algorithm learns categories of objects by scanning through full 3-D images, using a version of a technique called probabilistic principal component analysis (sketched in code after this list).
  • The 3-D system fills in the blind spots in its field of vision to reconstruct the parts it can't see.
  • When a robot tries to identify something new, it has the power to generalize, from prior examples, what characteristics similar objects tend to have.
  • The algorithm recognizes objects that have been rotated in 3-D space in various ways, which the best competing approaches can't do.
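
For a concrete sense of how a principal-component approach supports both recognition and completion, here is a minimal sketch in Python with NumPy. It stands in a plain truncated SVD for the probabilistic PCA the researchers describe, and every function name and parameter is an illustrative assumption: each object class gets a low-dimensional subspace learned from voxelized examples, a new partial scan is fit to each subspace using only its observed voxels, and the best-fitting class supplies both the label and the reconstruction of the unseen regions.

```python
import numpy as np

def fit_class_subspace(voxel_grids, n_components=10):
    """Learn a low-dimensional linear basis for one object class.

    voxel_grids: (n_examples, n_voxels) flattened occupancy vectors.
    A plain truncated SVD stands in for the probabilistic PCA the
    researchers describe; both recover a mean shape plus a subspace.
    """
    mean = voxel_grids.mean(axis=0)
    # Rows of vt are the principal directions in voxel space.
    _, _, vt = np.linalg.svd(voxel_grids - mean, full_matrices=False)
    return mean, vt[:n_components]            # (n_voxels,), (k, n_voxels)

def complete_and_score(partial, observed_mask, mean, basis):
    """Fit a partially observed grid to one class subspace.

    Solves least squares over the observed voxels only, then
    reconstructs the full grid -- filling in the blind spots.
    Returns the completed grid and the fit error (lower = better).
    """
    a = basis[:, observed_mask].T              # (n_obs, k)
    b = partial[observed_mask] - mean[observed_mask]
    coeffs, *_ = np.linalg.lstsq(a, b, rcond=None)
    completed = mean + coeffs @ basis          # includes unseen voxels
    err = np.mean((completed[observed_mask] - partial[observed_mask]) ** 2)
    return completed, err

def classify(partial, observed_mask, class_models):
    """Pick the class whose subspace best explains the observed voxels."""
    fits = {name: complete_and_score(partial, observed_mask, mean, basis)
            for name, (mean, basis) in class_models.items()}
    label = min(fits, key=lambda name: fits[name][1])
    return label, fits[label][0]               # label and completed shape
```

Classification here reduces to choosing the subspace with the lowest reconstruction error on the observed voxels, which is why a handful of training examples per class can still generalize to new object instances.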

Where this new approach is better but still lacking:

  • While the system takes only about a second to process each object, that is still much slower than human vision.
  • The best alternative makes a mistake almost half the time, while the team's approach errs a little less than a quarter of the time.

What's Next:

The researchers are working on scaling up their 3-D perception algorithm from hundreds of object types to distinguishing between thousands of types of objects at a time.