Earlier this year, car manufacturer Volvo announced that by the year 2020, they will have developed automotive technology that ensures "no one will be killed or seriously injured in one of its new vehicles." A lofty goal, to be sure, but how?
For one thing, they want to reduce the damage done and lives lost when a car crashes into an animal crossing the road. One mechanism for improving the safety of their vehicles is a system in which a camera at the front of the car detects the presence of a large animal in the road – horses, cows, deer, elk, moose – and responds accordingly. (No word, unfortunately, on whether the system would detect any road-crossing chickens…)
According to a Volvo press release, around 200 people each year are killed in the US as the result of car accidents with wild animals, mainly deer. The numbers are higher in other countries: Canada reports 40,000 car-animal collisions each year, and in 2010, Sweden counted 47,000. If Volvo can train their cars to identify large animals in the road and automatically hit the brakes, lives would surely be saved, both human and non-human.
Of course, Volvo isn't the only company that is experimenting with this sort of augmented reality technology. Google Goggles is an app for smartphones that identifies text and objects via a phone's camera. And Google's latest announcement, Project Glass, promises to take the Google Goggles approach and install it on a pair of…well, goggles.
On its surface, object recognition and identification might seem like trivial tasks for a machine, given the ease with which our own minds and brains solve them. Try imagining something you're familiar with, such as your dog (or your cat, or a friend's cat). Now picture it from the side. Or from the top, looking down. Or from the bottom, looking up. Picture Fido on your lap, on a table, in your car, in a flowerpot. Imagine your dog resting on top of a horse. Despite the fact that you have (probably) never seen your dog riding atop a horse, you probably had no trouble imagining what it might look like.
If you were to build up a database of all possible viewpoints of your dog, from all possible viewing angles, in all possible surroundings, you'd never finish the task, because your dog can exist in what is essentially an infinite number of settings. Yet despite the computational complexity of this problem, your mind solves it with relative ease.
One clue that this process – called object recognition by cognitive and computational neuroscientists – is more complicated than it might at first seem comes from the fact that around half of the cerebral cortex (that thin layer of grey matter covering the brain) in primates is devoted to processing the information that enters the brain through the eyes.
And this, essentially, is the process that scientists and engineers working on projects like Google's Project Glass and Volvo's animal detection system must attempt to build into a computer. What's easy for your mind is still a challenge for a machine.
Cognitive scientists call this the problem of invariance. That is, an efficient object recognition program must be able to recognize an object despite differences in its position, size, orientation, or background context, just as you can imagine your dog in an endless variety of environments.
In a 2012 paper in the journal Neuron, cognitive scientists James J. DiCarlo and colleagues write, "in the real world, each encounter with an object is almost entirely unique," because of what they call "identity-preserving image transformations." For example, your brain is able to precisely identify your car (or dog, or phone, or TV remote control) despite the wide variety of circumstances in which it appears.
From the perspective of your eye, your car can appear at any location. It can appear near the top of your visual field, or near the bottom. If you're standing close to your car, your car will appear very large. If you're far away, your car will appear very small. The image of your car that forms on your retina – the information transmitted to your brain – will be very different depending on whether you're looking at the front of your car or at the side or at any of the angles in between. Your car appears different in the dark and in the light, yet your brain identifies it as the same car. Your brain can identify your car whether it is in a parking lot, a grassy field, or your own garage. Even more impressively, your brain can recognize your car even if its overall shape changes, such as when the doors are opened.
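To see why these identity-preserving transformations are so hard for a machine, consider the most naive possible recognizer: one that compares images pixel by pixel. The toy Python sketch below (a hypothetical illustration, not Volvo's or Google's actual method, which is not described in detail in their announcements) shows that a small "object" shifted by just two pixels looks, to a pixel matcher, like a largely different image.

```python
# Toy illustration of the invariance problem: a naive pixel-by-pixel
# matcher treats a shifted copy of the same "object" as a mismatch.
# Real vision systems use far richer, transformation-tolerant features.

def pixel_match(a, b):
    """Fraction of pixels that agree between two equal-sized binary images."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    return sum(x == y for x, y in zip(flat_a, flat_b)) / len(flat_a)

# A 5x5 "image" containing a 2x2 square in the top-left corner...
original = [[1 if r < 2 and c < 2 else 0 for c in range(5)] for r in range(5)]
# ...and the very same square shifted two pixels right and down.
shifted = [[1 if 2 <= r < 4 and 2 <= c < 4 else 0 for c in range(5)]
           for r in range(5)]

print(pixel_match(original, original))  # 1.0 -- identical images match perfectly
print(pixel_match(original, shifted))   # 0.68 -- same object, yet many pixels disagree
```

The score drops even though the object itself is unchanged; scaling, rotation, lighting, and background changes would degrade it further. Any workable recognition system, biological or artificial, has to extract something about the object that survives all of these transformations.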
Scientists have made a great deal of progress in understanding how it is that the human mind – your mind – solves this and other problems of object identification, though the picture is still incomplete. Researchers have a working knowledge of object recognition at the more abstract, psychological level. They understand how your mind determines what constitutes an object, and how it addresses the problems of occlusion and invariance. All this, despite the fact that these processes operate outside of conscious awareness or control.
The main challenge moving forward is in determining how those psychological mechanisms emerge out of the basic biology of the brain. As our understanding of the brain improves, entrepreneurs will continue attempting to leverage these and other technologies in an effort to surpass the capabilities of the human mind and body. It is therefore perhaps paradoxical that the more we wish to transcend the human experience, the more we have to understand just what it means to be human in the first place.
Technology is "learning" a great deal from the human brain. How does this blurring of mind and machine make you feel?
Car graphics adapted from DiCarlo et al. (2012) Neuron.
photo of moose by jhoc