They Can Hear Your Words With Wi-Fi!

Now and again a research project comes along that is simply jaw-droppingly amazing.

This month Guanhua Wang and colleagues from Shenzhen University published a study that demonstrates that “We Can Hear You with Wi-Fi!”

Whoa! Really?

The idea is to observe the speaker’s mouth to infer what he or she is saying. Basically, it uses Wi-Fi signals as a focused radar to implement sophisticated lip reading!

This use of radar is similar to other gesture detection systems, though the trick is much harder. They have to detect the subtle motions of the mouth in a sea of much larger body movements and extraneous signals. The researchers report that they can do this, and can distinguish up to three different speakers simultaneously. Wow!

By the way, it’s radar, so it works through walls!

The paper gives lots of details, going far beyond my own paltry understanding of Wi-Fi [1].

This is really cool, and it is interesting to think about the implications.

I don’t think this is necessarily a serious threat to privacy, if only because it is already so easy to overhear conversations in so many other ways. If you can secretly hack the Wi-Fi, then you can probably plant bugs, hack local mobile devices, and so on.

If snooping turns out to be an issue, I guess you could cover your mouth with a Faraday cage (à la “stealthwear”). That would block out the radio waves (and other radiation), making your vocal communication detectable only aurally.

On the other hand, this is an interesting possibility for a ubiquitous interface, one that works across the whole Wi-Fi network without installing (and securing) sensors. You have to install and secure the wireless network anyway, but you don’t need additional sensor nets. I could imagine a system that lets you walk up and introduce yourself verbally, to establish your presence and start the connection. No extra equipment would be needed; you could even have your devices turned off and stowed.

Thinking about it, I wonder how well this system can identify individual people. If I understand their learning algorithms, the system should be able to identify someone it has “met” before, shouldn’t it?

But just how foolproof is it? Could a skilled mimic imitate a person well enough to fool the system? That would be an interesting experiment.

Another possible use case might be speech therapy or language training. The radar could measure the person’s mouth as they speak, cross-correlate those measurements with the sounds they make, and generate feedback to help the person learn to make the desired sounds. This would be based on a detailed understanding of just how they are shaping their mouth, which could make the diagnosis and suggestions much more precise than heuristics based on the sounds alone.
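
To make the cross-correlation idea concrete, here is a toy sketch in Python. It is my own illustration, not anything from the paper: the function name best_lag, the two synthetic traces, and the sample counts are all made up. It simply finds the time shift at which a "mouth opening" trace lines up best with an "audio loudness" trace.

```python
import math


def best_lag(mouth, audio, max_lag):
    """Return the shift (in samples) at which `audio` best matches `mouth`.

    A positive result means the audio trails the mouth movement.
    Hypothetical helper for illustration only.
    """
    def centered(xs):
        mean = sum(xs) / len(xs)
        return [x - mean for x in xs]

    a = centered(mouth)
    b = centered(audio)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Sum of products over the samples where the shifted traces overlap.
        score = sum(
            a[i] * b[i + lag]
            for i in range(len(a))
            if 0 <= i + lag < len(b)
        )
        if score > best_score:
            best, best_score = lag, score
    return best


# Synthetic data (assumed, not measured): one bump of mouth opening,
# and the same bump in loudness arriving 5 samples later.
n = 200
mouth = [math.exp(-((i - 80) / 10.0) ** 2) for i in range(n)]
audio = [math.exp(-((i - 85) / 10.0) ** 2) for i in range(n)]

lag = best_lag(mouth, audio, max_lag=20)
print(lag)
```

A real system would presumably compare much richer features than a single bump, but the principle is the same: once the two streams are aligned, you can ask how a given mouth shape relates to the sound that came out.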

For that matter, one could learn to silently speak, mouthing the words but not speaking them. The radar could pick it up (in the dark, through the wall!), but microphones would hear very little.

I wonder if one could also use this to add yet another “voice” to an artistic performance. In addition to the sound generated by the singer, the shape of the mouth itself could be used as digital input to the synthesized sound, lights, or other effects.

Very interesting stuff!

  1. Guanhua Wang, Yongpan Zou, Zimu Zhou, Kaishun Wu, and Lionel M. Ni, We Can Hear You with Wi-Fi! IEEE Transactions on Mobile Computing, 15(11):2907-2920, 2016.

