Electrical engineers at the University of Nebraska-Lincoln have written the world's top-ranked algorithm for a 3-D imaging process poised to enhance robotic surgery, navigate driverless cars and assist rescue missions.
In October, the UNL algorithm earned the best overall score to date on the Middlebury Stereo Benchmark, a widely accepted measure of algorithmic accuracy and speed.
The algorithm directs a process called stereo matching, which allows computers to mimic the depth perception of human eyesight by using images from two video cameras to construct a three-dimensional equivalent.
"The amount of data being processed by our brains is actually kind of remarkable," said Eric Psota, a research assistant professor of electrical engineering who co-wrote the algorithm with doctoral candidate Jedrzej Kowalczuk. "The problem of stereo matching in the digital age is: How do you get a computer to do what human beings do?"
The human visual system employs multiple methods to provide a sense of three-dimensional space, Psota said. These methods include convergence — the ability of both eyes to focus on the same object — a principle that also forms the core of stereo matching.
"The amount of convergence that's required for our eyes is proportional to the depth of the thing that we're looking at," Psota said. "If you're looking at the moon, there's really no convergence of the eyes. If you're looking at your finger, your eyes converge a great deal. Determining the amount of convergence and calculating the corresponding depth from that is the overall goal of stereo matching."
Stereo matching emulates eyesight by calculating the position of a pixel in the left camera relative to its location in the right, Psota said. An algorithm's ability to map the distance between respective pixels from each camera dictates how well it captures the dimensionality of a given scene.
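Psota's convergence analogy corresponds to the standard pinhole stereo relation, in which depth is inversely proportional to disparity, the horizontal pixel shift of a point between the left and right images. A minimal sketch of that relation (the focal-length and baseline values below are illustrative, not drawn from the article):

```python
# Standard pinhole stereo triangulation: depth = focal_length * baseline / disparity.
# A large disparity (strong "convergence") means a nearby point; a small
# disparity means a distant one, like looking at the moon.
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return depth in meters for a given pixel disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px

# Example with assumed camera parameters (700 px focal length, 12 cm baseline):
near = depth_from_disparity(64.0, 700.0, 0.12)  # large shift -> close point
far = depth_from_disparity(4.0, 700.0, 0.12)    # small shift -> distant point
```

Doubling the disparity halves the estimated depth, which is why accurate pixel matching translates directly into accurate 3-D structure.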
The Middlebury Stereo Benchmark tasks an algorithm with matching multiple high-resolution images under various conditions, then calculates a weighted score by averaging its performance across 15 image pairs. The benchmark ranked UNL's algorithm ahead of entries from researchers in North America, Asia and Europe, many of which were also submitted in 2014.
The team's stereo-matching algorithm uses parallel processing to independently yet simultaneously compute the depth of individual pixels, Psota said, which contributes to its efficiency. The algorithm can also use this data to make probabilistic inferences about the depth of more ambiguous areas, such as blank white walls in the background of an image.
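The article does not describe the UNL team's matching cost, but the per-pixel independence Psota mentions can be illustrated with a basic sum-of-absolute-differences window search; because each pixel's search depends only on the two input images, every call below could run in parallel. Window size and disparity range here are arbitrary:

```python
import numpy as np

# Illustrative sketch (not the UNL implementation): find one pixel's disparity
# by comparing a small window in the left image against horizontally shifted
# windows in the right image, keeping the shift with the lowest matching cost.
def best_disparity(left, right, row, col, max_disp=8, win=2):
    best_d, best_cost = 0, float("inf")
    patch = left[row - win:row + win + 1, col - win:col + win + 1]
    for d in range(max_disp + 1):
        if col - win - d < 0:  # candidate window would fall off the image
            break
        cand = right[row - win:row + win + 1, col - d - win:col - d + win + 1]
        cost = np.abs(patch - cand).sum()  # sum of absolute differences
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Each output pixel is computed without reference to any other output pixel, which is what makes this style of search a natural fit for parallel hardware.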
"We establish an initial set of matches, and then we try to share this information among pixels to refine the initial estimate of depth in the following iterations," said Kowalczuk, whose dissertation centers on the algorithm. "In every iteration, we're able to isolate very reliable matches for which we can say, with high confidence, 'This must be the true match for a given pixel.' This is the information we pass along to the pixels we deem less reliable."
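The iterative refinement Kowalczuk describes can be sketched as follows; the confidence threshold, the median rule, and the function name are assumptions for illustration, not details from the article. Pixels with reliable matches keep their disparity, and each iteration the unreliable pixels adopt a value from reliable neighbors:

```python
import numpy as np

# Hedged sketch of confidence propagation: reliable pixels are fixed, and each
# iteration an unreliable pixel takes the median disparity of any reliable
# neighbors, then becomes reliable itself on the next pass.
def propagate_reliable(disparity, confidence, threshold=0.8, iterations=3):
    disp = np.asarray(disparity, dtype=float).copy()
    reliable = np.asarray(confidence) >= threshold
    rows, cols = disp.shape
    for _ in range(iterations):
        new_disp = disp.copy()
        new_reliable = reliable.copy()
        for r in range(rows):
            for c in range(cols):
                if reliable[r, c]:
                    continue  # high-confidence match: keep it
                neighbors = [disp[rr, cc]
                             for rr in range(max(0, r - 1), min(rows, r + 2))
                             for cc in range(max(0, c - 1), min(cols, c + 2))
                             if reliable[rr, cc]]
                if neighbors:  # pass information along from reliable pixels
                    new_disp[r, c] = float(np.median(neighbors))
                    new_reliable[r, c] = True
        disp, reliable = new_disp, new_reliable
    return disp
```

This mirrors the quoted idea: confident matches act as anchors, and their estimates spread outward over successive iterations to fill in ambiguous regions.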
Though iterative methods typically run more slowly than their non-iterative counterparts, the UNL algorithm is one of the few capable of generating 3-D imagery in real time, Psota said. This greatly broadens the range of applications for which it might be used, he said.