ROSALIND PICARD'S eyes were wide open. I couldn't blame her. We were sitting in her office at the Massachusetts Institute of Technology's Media Lab, and my questions were stunningly incisive. In fact, I began to suspect that I must be one of the savviest journalists she had ever met.
Then Picard handed me a pair of special glasses. The instant I put them on I discovered that I had got it all terribly wrong. That look of admiration, I realised, was actually confusion and disagreement. Worse, she was bored out of her mind. I became privy to this knowledge because a little voice was whispering in my ear through a headphone attached to the glasses. It told me that Picard was "confused" or "disagreeing". All the while, a red light built into the specs was blinking above my right eye to warn me to stop talking. It was as though I had developed an extra sense.
The glasses can send me this information thanks to a built-in camera linked to software that analyses Picard's facial expressions. They're just one example of a number of "social X-ray specs" that are set to transform how we interact with each other. By sensing emotions that we would otherwise miss, these technologies can thwart disastrous social gaffes and help us understand each other better. Some companies are already wiring up their employees with the technology, to help them improve how they communicate with customers. Our emotional intelligence is about to be boosted, but are we ready to broadcast feelings we might rather keep private?
We project many subtle facial expressions that mirror our feelings. In the 1970s, US psychologist Paul Ekman identified a basic set of seven: happiness, sadness, fear, anger, disgust, contempt and surprise. They became the foundation for a theory of lie detection, which posited that involuntary micro-expressions can briefly unmask deception before the liar restores a facade of honesty to their face. Though the theory was later debunked, the principle wasn't entirely unsound.
In conversation, we pantomime certain emotions that act as social lubricants. We unconsciously nod to signal that we are following the other person's train of thought, for example, or squint a bit to indicate that we are losing track. Many of these signals can be misinterpreted - sometimes because different cultures have their own specific signals.
More often, we fail to spot them altogether. During a face-to-face conversation, thousands of tiny indicators on a person's face - arching the brow, puckering or parting the lips - add up to a series of non-verbal hints that augment our verbal communication. Blink, and you'll miss them.
The idea that technology could amplify these signals was first explored by Rana el Kaliouby at the University of Cambridge, UK. She wanted to help autistic people, who find it particularly hard to pick up on other people's emotions.
El Kaliouby realised that the "Ekman seven" would not be particularly helpful for enhancing the average conversation: after all, how often do you expect to see expressions of contempt or disgust? So in 2005, she enlisted Simon Baron-Cohen, also at Cambridge, to help her identify a set of more relevant emotional facial states. They settled on six: thinking, agreeing, concentrating, interested - and, of course, the confused and disagreeing expressions with which I had become so uncomfortably familiar in my conversation with Picard. To create this lexicon, they hired actors to mime the expressions, then asked volunteers to describe their meaning, taking the majority response as the accurate one.
To build the prototype glasses that could exploit these signals, el Kaliouby worked with Picard, who is an electrical engineer. Inside the glasses is a camera the size of a rice grain connected to a wire snaking down to a piece of dedicated computing machinery about the size of a deck of cards. The camera tracks 24 "feature points" on your conversation partner's face, and software developed by Picard analyses their myriad micro-expressions, how often they appear and for how long. It then compares that data with its bank of known expressions (see diagram).
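To get a feel for how that comparison might work, here is a minimal Python sketch. It is not Affectiva's actual algorithm: it simply treats each known expression as a template vector of feature-point displacements and labels a video frame with whichever template it most closely resembles. The expression bank and all the numbers in it are invented for illustration.

```python
import numpy as np

# Hypothetical bank of known expressions: each label maps to a template
# vector of feature-point displacements (24 points x 2 axes = 48 numbers).
# The values are random placeholders, purely for illustration.
rng = np.random.default_rng(0)
EXPRESSION_BANK = {
    "agreeing": rng.random(48),
    "confused": rng.random(48),
    "concentrating": rng.random(48),
    "interested": rng.random(48),
}

def classify_frame(feature_points):
    """Label one frame with the closest template (nearest-neighbour match)."""
    distances = {label: np.linalg.norm(feature_points - template)
                 for label, template in EXPRESSION_BANK.items()}
    return min(distances, key=distances.get)

# One frame's worth of made-up tracked displacements.
print(classify_frame(rng.random(48)))
```

A real system would track the points over time and weigh how often and how long each micro-expression appears, rather than judging a single frame in isolation.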
While I spoke with Picard in her office, a computer screen coached me about her evolving facial expressions. But I could also get a summary of this information via an earphone and the glasses with the light on the lens. "If I were smiling and nodding, you'd get the green light," Picard said. I cringed a little when I realised how often the red light had appeared during our conversation.
The prototype proved popular with autistic people who were invited to test it. "They approached people and tested out new facial expressions on themselves and tried to elicit facial expressions from other people," Picard says. Eventually, she thinks the system could be incorporated into a pair of augmented-reality glasses, which would overlay computer graphics onto the scene in front of the wearer.
When Picard and el Kaliouby were calibrating their prototype, they were surprised to find that the average person correctly interpreted only 54 per cent of Baron-Cohen's expressions on real, non-acted faces. This suggested to them that most people - not just those with autism - could use some help sensing the mood of people they are talking to. "People are just not that good at it," says Picard. The software, by contrast, correctly identifies 64 per cent of the expressions.
Picard and el Kaliouby have since set up a company called Affectiva, based in Waltham, Massachusetts, which is selling their expression recognition software. Their customers include companies that, for example, want to measure how people feel about their adverts or movies. And along with colleague Mohammed Hoque, they have been tuning their algorithms to pick up ever more subtle differences between expressions, such as smiles of delight and frustration, which can look very similar without context. Their algorithm does a better job of detecting the faint differences between those two smiles than people do. "The machines had an advantage over humans in analysing internal details of smiles," says Hoque.
Affectiva is also in talks with a Japanese company that wants to use their algorithm to distinguish smiles on Japanese faces. The Japanese have given names to 10 different types of smile - including bakushu (happy smile), shisho (inappropriate giggle), and terawari (acutely embarrassed smile).
Picard says the software amplifies the cues we already volunteer, and does not extract information that a person is unwilling to share. It is certainly not a foolproof lie detector. When I interviewed Picard, I deliberately tried to look confused, and to some extent it worked. Still, it's hard to fool the machine for long. As soon as I became engaged in the conversation, the pretence slipped and my true feelings revealed themselves again.
Subconscious cues
Some of the cues we give off during conversations are much harder to fake. In addition to facial expressions, we radiate a panoply of involuntary "honest signals", a term coined by MIT Media Lab researcher Alex Pentland in the early 2000s to describe the social signals that we use to augment our language. They include body language such as gesture mirroring, and cues such as variations in the tone and pitch of the voice. We do respond to these cues, but often not consciously. If we were more aware of them in others and ourselves, then we would have a fuller picture of the social reality around us, and be able to react more deliberately.
To capture these signals and depict them visually, Pentland worked with MIT doctoral students Daniel Olguín Olguín, Benjamin Waber and Taemie Kim to develop a small electronic badge that hangs around the neck. Its audio sensors record how aggressive the wearer is being, the pitch, volume and clip of their voice, and other factors. They called it the "jerk-o-meter". The information it gathers can be sent wirelessly to a smartphone or any other device that can display it graphically.
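As a rough illustration of the kind of numbers such a badge might pull out of raw audio, the Python sketch below estimates loudness, pitch and a talk-time proxy from a mono recording. It is a toy stand-in, not Pentland's code, and the thresholds are arbitrary.

```python
import numpy as np

def badge_features(audio, rate=8000, frame_ms=40):
    """Crude loudness, pitch and talk-time estimates from mono audio,
    in the spirit of the badge's measurements (not the actual device)."""
    frame = int(rate * frame_ms / 1000)
    volumes, pitches = [], []
    for i in range(0, len(audio) - frame, frame):
        chunk = audio[i:i + frame]
        volumes.append(np.sqrt(np.mean(chunk ** 2)))              # RMS loudness
        ac = np.correlate(chunk, chunk, mode="full")[frame - 1:]  # lags 0..N-1
        lag = np.argmax(ac[20:]) + 20                             # skip tiny lags
        pitches.append(rate / lag)                                # rough pitch, Hz
    talking = np.mean(np.array(volumes) > 0.5 * max(volumes))     # talk-time proxy
    return np.mean(volumes), np.mean(pitches), talking

# One second of a synthetic 180 Hz "voice" as a stand-in for real badge audio.
t = np.linspace(0, 1, 8000, endpoint=False)
print(badge_features(np.sin(2 * np.pi * 180 * t)))
```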
It didn't take the group long to notice that they had stumbled onto a potent technology. For a start, it helped people realise when they were being either obnoxious or unduly self-effacing. "Some people are just not good at being objective judges of their own social interactions," Kim says. But it isn't just individual behaviour that changes when people wear these devices.
In a 10-day experiment in 2008, Japanese and American college students were given the task of building a complex contraption while wearing the next generation of jerk-o-meter - which by that time had been more diplomatically renamed a "sociometric badge". As well as audio, their badge measured proximity to other people.
At the end of the first day they were shown a diagram that represented three things: speaking frequency, speaking time, and who they interacted with. Each person was indicated by a dot, which ballooned for loquacious individuals and withered for quiet ones. Their tendency for monologues versus dialogue was represented by red for Hamlets and white for conversationalists. Their interactions were tracked by lines between them: thick if two participants were engaged in frequent conversation and hair-thin if they barely spoke.
"We were visualising the social spaces between people," Kim says. The results were immediately telling. Take the case of "A", whose massive red dot dominated the first day. Having seen this, A appeared to do some soul-searching, because on the second day his dot had shrivelled to a faint white. By the end of the experiment, all the dots had gravitated towards more or less the same size and colour. Simply being able to see their role in a group made people behave differently, and caused the group dynamics to become more even. The entire group's emotional intelligence had increased (Physica A, vol 378, p 59).
The first test of the badges in the real world revealed that there was money to be made from these revelations. Pentland and his team upgraded the sociometric badges to analyse the speech patterns of customer service representatives at Vertex Data Science in Liverpool, UK, which provides call centre services for a number of companies. This revealed that it is possible to identify units of speech that make a person sound persuasive, and hence to teach them how to sound more persuasive when talking to customers (International Journal of Organisational Design and Engineering, vol 1, p 69). The team claims that the technology could increase telephone sales performance by as much as 20 per cent.
With results like these, it's not hard to see why businesses are taking a keen interest in the badges. Last month, Kim and Olguín used the results of their doctoral research to found a start-up called Sociometric Solutions that already has several customers, including Bank of America, a bank in the Czech Republic and a consulting firm.
Some of our body's responses during a conversation are not designed for broadcast to another person - but it's possible to monitor those too. Your temperature and skin conductance can also reveal secrets about your emotional state, and Picard can tap them with a glove-like device called the Q Sensor. In response to stresses, good or bad, our skin becomes clammy, increasing its conductance, and the Q Sensor picks this up.
Physiological responses can now even be tracked remotely, in principle without your consent. Last year, Picard and one of her graduate students showed that it was possible to measure heart rate without any surface contact with the body. They used software linked to an ordinary webcam to read information about heart rate, blood pressure and skin temperature based on, among other things, colour changes in the subject's face (Optics Express, vol 18, p 10762).
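The underlying signal processing is less exotic than it sounds. The published method separates the webcam's colour channels with independent component analysis; the simplified sketch below skips that step and just takes the average green-channel brightness of the face region in each frame, then reads the pulse off the strongest frequency between 0.75 and 4 hertz. Treat it as a rough sketch of the idea, not the paper's algorithm.

```python
import numpy as np

def heart_rate_bpm(green_means, fps=30.0):
    """Estimate pulse from the mean green-channel brightness of the face
    region in each video frame (a stripped-down stand-in for the ICA method)."""
    signal = np.asarray(green_means) - np.mean(green_means)   # remove baseline
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    band = (freqs > 0.75) & (freqs < 4.0)                     # 45-240 beats per minute
    return 60.0 * freqs[band][np.argmax(spectrum[band])]

# Synthetic 10-second trace pulsing at 1.2 Hz (72 bpm) plus noise.
t = np.arange(0, 10, 1 / 30.0)
trace = 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.05 * np.random.randn(len(t))
print(heart_rate_bpm(trace))   # prints roughly 72
```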
It's not too much of a stretch to imagine that these sensors could combine to populate the ultimate emotion-reading device. How would the world change if we could all read each other's social signals accurately? Baron-Cohen has already seen the benefits in his studies of people with Asperger's syndrome at the Autism Research Centre in Cambridge, UK. "It's not a miracle cure," he says, but people equipped with Picard's technology were learning extra social skills. Baron-Cohen says the wearers retained some ability to read emotions accurately after they removed the glasses. Such enhancements for the rest of the population might increase emotional intelligence through the generations.
But giving people unfettered access to each other's emotions has dangers too, Baron-Cohen warns. "The ability to read someone's emotions doesn't necessarily come with empathy," he says.
Picard is keen to stress that her technologies should not be used covertly, and that people should always be asked whether they wish to use them, rather than being forced to do so. Use of her gloves is by their very nature voluntary - you have to choose to wear them - but remote heart-rate monitoring does not require consent. Pentland takes a similar view on the need for privacy. Data generated by the sociometric badge should only be visible to the employee, he says, and not be shared with an employer without the employee's consent.
I got a taste of how it can feel to have my most private thoughts exposed when I slipped on one of Picard's Q Sensor gloves to measure my skin conductance. A purple neoprene band pressed two electrodes into the palm of my hand, measuring subtle moisture changes on my skin when my stress levels changed. I watched a trace on Picard's screen, reminiscent of a seismogram. "OK, now just think about anything that will make your heart beat faster," she told me. I immediately suppressed my first intrusive thought because I found it just too embarrassing - and stared in horror as the scribble nevertheless exploded into a vertical spike. "Wow," Picard said, her eyes widening. "What was that?" I felt my face go beetroot red.
Picard considered my reaction for a second. She didn't need a headset to know that if I aired this particular thought it might make our conversation rather awkward.
"Never mind," she said, "I don't want to know."
New reality goggles
Augmented-reality glasses already make it possible to visualise what was once unseeable, by overlaying graphics onto your field of view. Add the right sensors and apps, and here are some of the things you might one day see through those specs
That guy's name
In Rio de Janeiro and São Paulo, police officers can check whether someone is a known criminal just by looking at them. Their glasses scan the features of a face, and match them against a database of criminal mugshots. A red light blinks if there's a match.
So far, tech like this is available mainly to governments, but that won't last. Facebook has recently turned on face recognition for its automatic photo tagging. AR software will likely soon interface with social networks, so that you will never again forget a name at a party - but anonymity in a crowd may become a thing of the past.
Digestive danger
The recent E. coli outbreak in Germany shows how important it can be to know exactly what you are eating.
One way to analyse food could be via "hyperspectral" imaging, which uses special sensors to image far more of the electromagnetic spectrum than the small window visible light normally allows. Bacteria and spoilt food give off specific signatures. And the proper sensors can identify bacteria, insect repellent, mysterious smells or contaminants on the chopping board. "It's like doing CSI on the fly," says Paul Lucey, who works on hyperspectral vision at the University of Hawaii.
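In spirit, the matching step can be as simple as correlating a pixel's measured spectrum against a library of reference signatures, as in the sketch below. The material names and reflectance values are invented; a real system would use far more wavelength bands and properly calibrated spectra.

```python
import numpy as np

# Hypothetical reference signatures: reflectance sampled at a few wavelength
# bands for materials we might want to flag (all values are invented).
SIGNATURES = {
    "fresh spinach": np.array([0.1, 0.4, 0.7, 0.6, 0.3]),
    "E. coli film": np.array([0.2, 0.2, 0.3, 0.8, 0.9]),
    "chopping board": np.array([0.6, 0.6, 0.5, 0.5, 0.5]),
}

def best_match(pixel_spectrum):
    """Label a pixel by whichever reference spectrum it correlates with most."""
    scores = {name: np.corrcoef(pixel_spectrum, ref)[0, 1]
              for name, ref in SIGNATURES.items()}
    return max(scores, key=scores.get)

print(best_match(np.array([0.18, 0.22, 0.33, 0.75, 0.85])))  # -> "E. coli film"
```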
The past
Thad Starner at Georgia Institute of Technology in Atlanta wears a small device he has built that looks like a monocle. It can retrieve video, audio or text snippets of past conversations with the people he is talking to, and even provide real-time links between past chats and the topics he is currently discussing. In principle, it could also be used to view other people's conversations, if they allowed them to be shared.
See through walls
The US military has built a radar-imaging device that can see through walls to capture 3D images of people and objects beyond. In a similar vein, Yaser Sheikh of Carnegie Mellon University in Pittsburgh, Pennsylvania, is building an augmented-reality system for cars that lets drivers see round a blind corner. Taking video from a camera focused on the part of the road hidden from the driver, it distorts the image so that it appears on the AR display, letting the driver essentially see through the obstacle.
Subtitles
You could use microphones and speech-recognition technology to convert the voice of someone you are conversing with into scrolling, real-time subtitles. Thad Starner uses a basic version of this technology on his wearable set. A more sophisticated app would include subtitles that appear under specific people in the form of speech bubbles.