Category Archives: science and technology

Inaudible Speech Commands Hack Your Home

I’m not a huge fan of speech interfaces, or of Internet connected home assistants a la Alexa in general.

I have already complained that these devices are potentially nasty invaders of privacy and likely to have complicated failure modes, not least due to a plethora of security issues. (Possibly a plethora of plethoras.) I’ve also complained about the psychology of surveillance and instant gratification inherent in the design of these things. Especially for children.

Pretty much exactly what you don’t want in your home.

This fall a group at Zhejiang University report on yet another potential issue: Inaudible Voice Commands [1].

Contemporary mobile devices have pretty good speakers and microphones, good enough to be dangerous. Advertising agencies and other attackers have begun using inaudible sound beacons to detect the location of otherwise cloaked devices. It is also possible to monkey with the motion sensors on a mobile device, via inaudible sounds.

Basically, these devices are sensitive to sound frequencies that the human user can’t hear, which can be used to secretly communicate with and subvert the device.

Guoming Zhang and colleagues turn this idea onto speech activated assistants, such as Alexa or Siri [1]. They describe a method to encode voice commands into innocent sounds. The humans can’t hear the words, but the computer decodes them and takes them as a voice command.

These devices are capable of almost any operation on the Internet. Sending messages, transferring money, downloading software. The works.

In other words, if this attack succeeds, the hacker can secretly load malware or steal information, unbeknownst to the user.


Combine this with ultrasound beacons, and the world becomes a dangerous place for speech commanded devices.

The researchers argue that

“The root cause of inaudible voice commands is that microphones can sense acoustic sounds with a frequency higher than 20 kHz while an ideal microphone should not.”

This could be dealt with by deploying better microphones or by software that filters out ultrasound, or detects the difference between voiced commands and the injected commands.
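The software-filter option can be sketched roughly as follows. This is only an illustration, not the researchers' defense: it assumes the software can see raw samples at a rate high enough to represent ultrasound (96 kHz here, which many devices may not provide), and the function names, the frame length, and the 20 kHz cutoff parameter are my own choices. It measures what fraction of a short frame's energy sits at or above the cutoff, using a deliberately naive DFT.

```python
import cmath
import math


def ultrasonic_energy_ratio(samples, sample_rate, cutoff_hz=20_000.0):
    """Fraction of spectral energy at or above cutoff_hz (naive O(n^2) DFT)."""
    n = len(samples)
    total = 0.0
    high = 0.0
    # For a real-valued signal only the bins up to Nyquist are needed;
    # bin 0 (the DC offset) is skipped.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n >= cutoff_hz:
            high += energy
    return high / total if total > 0 else 0.0


def tone(freq_hz, sample_rate, n):
    """Hypothetical test signal: n samples of a pure sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    rate = 96_000  # assumed capture rate, high enough to represent 30 kHz
    print(ultrasonic_energy_ratio(tone(1_000, rate, 96), rate))   # audible tone
    print(ultrasonic_energy_ratio(tone(30_000, rate, 96), rate))  # ultrasonic tone
```

A device could decline to treat a frame as a voice command when this ratio is high. The threshold would need tuning against real recordings, and a check like this does nothing if the hardware has already folded the ultrasound down into the audible band.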

I would add that a second root cause is the number of functions of these devices, and the essentially monolithic design of the system. Recent voice activated assistants are installed as programs on general purpose computers with a complete operating system, and multiple input and output channels, including connections to the Internet. In general, any program may access any channel and perform any computation.

This is a cheap and convenient architecture, but is arguably overpowered for most individual applications. The general purpose monolithic device requires that the software implement complicated security checks, in an attempt to limit privileges. Worse, it requires ordinary users to manage complex configurations, usually without adequate understanding or even awareness.

One approach would be to create smaller, specialized hardware modules, and require explicit communication between modules. I’m thinking of little hardware modules, essentially one to one with apps. Shared resources such as I/O channels will have monitors to mediate access. (This is kind of “the IoT inside every device”.)
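As a software analogy for the monitor idea (the proposal above is about hardware, so this is only a sketch, and every name in it is hypothetical), a monitor can be modeled as the sole gateway to a channel, with an explicit allow-list and an audit trail:

```python
class ChannelMonitor:
    """Mediates access to one shared I/O channel via an explicit allow-list."""

    def __init__(self, channel_name, allowed_modules):
        self.channel_name = channel_name
        self._allowed = frozenset(allowed_modules)
        self.log = []  # audit trail of (module_name, granted) decisions

    def request(self, module_name, payload):
        """Forward payload to the channel only if the module is allowed."""
        granted = module_name in self._allowed
        self.log.append((module_name, granted))
        if not granted:
            raise PermissionError(
                f"{module_name} may not use {self.channel_name}"
            )
        # A real monitor would drive the hardware here; this sketch just
        # returns a description of what would be sent.
        return f"{self.channel_name} <- {payload}"


if __name__ == "__main__":
    network = ChannelMonitor("network", {"voice_assistant"})
    print(network.request("voice_assistant", "weather query"))
    try:
        network.request("flashlight_app", "upload contacts")
    except PermissionError as err:
        print("blocked:", err)
```

The point of the design is that a module with no entry on the list has no path to the channel at all, so there is nothing for the user to configure and nothing for malware in that module to exploit.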

This kind of architecture is difficult to build, and painful in the extreme to program. (I’m describing what is generally called a secure operating system.) It might well reduce the number of apps in the world (which is probably a good thing) and increase the cost of devices and apps (which isn’t so good). But it would make personal devices much harder to hack, and a whole lot easier to trust.

  1. Guoming Zhang, Chen Yan, Xiaoyu Ji, Taimin Zhang, Tianchen Zhang, and Wenyuan Xu, DolphinAttack: Inaudible Voice Commands. arXiv, 2017.


IOTA’s Cart Is Way, Way Before the Horse

Earlier I commented on SatoshiPay microtransactions switching from Bitcoin to IOTA. Contrary to early hopes, Bitcoin has not been successful as a medium for microtransactions because transaction fees are too high and latency may be too long.

IOTA is designed for the Internet of Things, so it uses a different design than Nakamoto’s, one that is said to be capable of much lower latency and fees. SatoshiPay and other companies are looking at adopting IOTA for payment systems.

The big story is that IOTA is reinventing Bitcoin from the ground up, with its own home grown software and protocols. I described it (IOTA) as “funky” in my earlier post.

It is now clear that this funkiness extended to the implementation, including the cryptographic hashes used [1,2]. This is not a good idea, because you generally want to use really well tested crypto algorithms.

So when we noticed that the IOTA developers had written their own hash function, it was a huge red flag.

Unsurprisingly, Neha Narula reports that their home grown hash function is vulnerable to a very basic attack, with potentially very serious consequences.

The specific problems have been patched, but the fact remains that IOTA seems to be a home made mess of a system.

Narula also notes other funkiness. For some reason they use super-funky trinary code which, last time I checked, isn’t used by very many computers. Everything has to be interpreted by their custom software, which is slow and bulky. More important, this means that their code is completely incompatible with any other system, precluding the use of standard libraries and tools, such as well tried crypto libraries and software analysis tools.
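To give a feel for what working in base three involves on binary hardware, here is a minimal sketch of integer conversion. The post says only “trinary”; the sketch assumes balanced ternary (digits -1, 0, +1), which is my understanding of IOTA’s encoding but is not stated above, and the function names are mine.

```python
def to_balanced_ternary(n):
    """Return the balanced-ternary digits of n, least significant first."""
    if n == 0:
        return [0]
    trits = []
    while n != 0:
        r = n % 3          # Python's % yields 0, 1, or 2, also for negative n
        if r == 2:
            r = -1         # write 2 as (3 - 1): digit -1, carry into next place
        trits.append(r)
        n = (n - r) // 3
    return trits


def from_balanced_ternary(trits):
    """Inverse of to_balanced_ternary."""
    return sum(t * 3 ** i for i, t in enumerate(trits))


if __name__ == "__main__":
    for value in (5, -5, 42):
        trits = to_balanced_ternary(value)
        print(value, trits, from_balanced_ternary(trits))
```

Every value has to pass through a software layer like this before ordinary binary hardware and libraries can touch it, which is the overhead and incompatibility complained about above.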

I have no idea why you would do this, especially in a system that you want to be secure and trusted.

The amazing thing is not the funkiness of the software. There is plenty of funky software out there. The amazing thing is that lots of supposedly competent companies have invested money and adopted the software. As Narula says, “It should probably have been a huge red flag for anyone involved with IOTA.”

How could they get so much funding, when only now are people noticing these really basic problems?

It is possible that these critiques are finally having some effect. Daniel Palmer reports that the exchange rate of IOTA’s tokens (naturally, they have their own cryptocurrency, too) has been dropping like a woozy pigeon [3]. Perhaps some of their partners have finally noticed the red flags.

The part I find really hard to understand is how people could toss millions of dollars at this technology without noticing that it has so many problems. Aren’t there any grown ups supervising this playground?

I assume IOTA have a heck of a sales pitch.

Judging from what I’ve seen, they are selling IOTA as “the same thing as Bitcoin, only better”. IOTA certainly isn’t the same design as Bitcoin, and it also does not use the same well-tested code.  I note that a key selling point is “free” transactions, which sounds suspiciously like a free lunch. Which there ain’t no.

IOTA’s claims are so amazingly good, I fear that they are too good to be true.

Which is the biggest red flag of all.

  1. Neha Narula, Cryptographic vulnerabilities in IOTA, in Medium. 2017.
  2. Neha Narula, IOTA Vulnerability Report: Cryptanalysis of the Curl Hash Function Enabling Practical Signature Forgery Attacks on the IOTA Cryptocurrency. 2017.
  3. Daniel Palmer, Broken Hash Crash? IOTA’s Price Keeps Dropping on Tech Critique. Coindesk, September 8, 2017.
  4. Dominik Schiener, A Primer on IOTA (with Presentation), in IOTA Blog. 2017.


Cryptocurrency Thursday

Citizen Science: NoiseCapture App

Contemporary digital technology offers many opportunities for collecting scientific data. Millions of people are carrying highly capable networked computers (mobile phones), with cameras, microphones, and motion sensors. Most personal devices have capabilities available only in a few laboratories twenty years ago.

Furthermore, these devices are in the hands of “civilians”. It is now possible to do “citizen science” for real, using personal devices to collect data and aggregate it through network services.

This has been used for environmental sensing (microbe populations, microbe assays, weather, air pollution, particulates, odors), earthquake detection, food quality, detecting poachers, and wildlife observations (pollinators, bird watching, bird song, insect song).

As I have remarked before, simply collecting data is not actually that useful scientifically. It also invites misguided pseudoscience, if data is not carefully analyzed or is misinterpreted.

What is needed is the rest of the picture, including data cleaning, careful models and analysis, and useful, valid visualization and reports. You know, the “science” part.

This summer, a team from several French research institutions are releasing the NoiseCapture app, which allows anyone to “measure and share the noise environnement [sic]”.

Specifically, this app measures noise in a city, as the user moves through ordinary activities. The microphone records the sounds, and GPS tracks the location of the device. (There are plenty of tricky details, see their papers [1, 2].)

The collected data is transmitted to the project’s server, where it is analyzed and cross-calibrated with other data. Any given measurement isn’t terribly meaningful, but many data points from many phones combine to create a valid estimate of a noise event. They incorporate these data into a spatial model of the city, which creates an estimate of noise exposure throughout the area [1].
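One of the tricky details is easy to show: sound levels in decibels are logarithmic, so combining readings means averaging the underlying energies, not the decibel numbers. The sketch below is only that one step, under the assumption that the readings are already calibrated and comparable. It is not the project’s actual pipeline (see the papers for that), and the function name is mine.

```python
import math


def mean_level_db(levels_db):
    """Energy-average a list of sound levels given in decibels."""
    if not levels_db:
        raise ValueError("need at least one reading")
    # Convert each level to relative energy, average, then convert back.
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


if __name__ == "__main__":
    readings = [50.0, 70.0]  # hypothetical readings from two phones
    print(f"arithmetic mean: {sum(readings) / len(readings):.1f} dB")
    print(f"energy mean:     {mean_level_db(readings):.1f} dB")
```

The loud reading dominates: the energy mean of 50 dB and 70 dB is about 67 dB, not 60 dB, which is one reason a naive average of phone readings would understate exposure.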

It is very important to note that estimating noise exposure from a mobile phone microphone is pretty complicated (see the papers). Crowdsourcing the data collection is vital, but the actual “science” part of the “citizen science” is done by experts.

I’m pleased to see that the researchers have done some careful development to make the “citizen” part work well. The system is designed to record readings along a path as you walk. The app gives visual indications of the readings and the rated hazard level that is being observed. The data is plotted on interactive digital maps so that many such paths can be seen for each city. The project also suggests organizing a “NoiseCapture Party” in a neighborhood, to gather a lot of data at the same time.

Overall, this is a well thought out, nicely implemented system, with a lot of attention to making the data collection easy for ordinary people, and making high quality results available to the public and policy makers.

This research is primarily motivated by a desire to implement noise control policies, which are written with detailed technical standards. Much of the work has been aimed to show that this crowdsourced consumer device approach can collect data that meets these technical standards.

That said, it should be noted that technical noise standards are not the same thing as the subjective comfort or nuisance value of an environment. One person’s dance party is another person’s aural torture. A moderately loud conversation might be unnoticed on a loud Saturday night, but the same chat might be very annoying on the following quiet Sunday morning.

I also have to say that I was a little disappointed that the “environment” in question is the urban streetscape. For instance, the app is not useful for indoor noise (where we spend a lot of time).

Also, I would love to have something like this to monitor the natural soundscape in town and country. When the machines and people aren’t making so much noise, there is still plenty to hear, and I would love to be able to chart that. These voices reveal the health of the wildlife, and it would be really cool to have a phone app for that.

This is what “dawn chorus” folks are doing, but they don’t have nearly as nice data analysis (and non Brits can’t get the app).

Finally, I’ll note that simply detecting and recording noise is only a first step.  In the event that the neighborhood is plagued by serious noise pollution, you’re going to need more than a mobile phone app to do something about it. You are going to need responsive and effective local and regional government.  There isn’t an app for that.

  1. Erwan Bocher, Gwendall Petit, Nicolas Fortin, Judicaël Picaut, Gwenaël Guillaume, and Sylvain Palominos, OnoM@p : a Spatial Data Infrastructure dedicated to noise monitoring based on volunteers measurements. PeerJ Preprints, 4:e2273v2, 2016/09/28 2016.
  2. Gwenaël Guillaume, Arnaud Can, Gwendall Petit, Nicolas Fortin, Sylvain Palominos, Benoit Gauvreau, Erwan Bocher, and Judicaël Picaut, Noise mapping based on participative measurements, in Noise Mapping. 2016.


Robot Funeral Rituals? Augmenting Religious Practice

One of the most prominent aspects of human life that has been little affected by the internet and robots is religion, especially formal religious practices. Church, temple, or mosque, religious practice is a bastion of unaugmented humans.

There are obvious reasons for this to be the case. Religion is conservative with a small “C”, embodying as it does cultural heritage in the present day. Traditional ideas and practices are at the psychological core of religious practice. Religious practice is not generally about “disruption” or “move fast and break things” (at least not in the thoughtless way Silicon Valley disrupts things.)

Another obvious reason is that much of religious teaching is about human behavior and human relations. Emphasis on the “human”. From this perspective, augmenting humans or virtualizing human relations is at best irrelevant and at worst damaging to proper human conduct.

But this will surely change. Religious traditions are living cultures which adopt new technology. It will be interesting to watch how human augmentation is incorporated into religious practices, not least because it may create some interesting, humane modes of augmented living.

Obviously, many people have already adopted digital communications and social media in spiritual and religious life. Heck, even the Pope is on Twitter. But this is the tip of the iceberg, little more than the twenty-first century version of pamphlets and sermons.

What else might be coming?

For one thing, virtual worlds will surely need to be converted.

I recall some science fiction story (quite possibly by William Gibson, but I don’t remember) that had a brief vignette about a devout Catholic who loaded his personality into an avatar in a virtual world. This splinter of his consciousness (soul?) kneels in a virtual chapel and prays 24/7. In the story, this practice is approved by the church. I think the notion is that he receives indirect credit for this pious exercise, which is sort of analogous to other practices such as hiring a mass for a deceased parent.

For another, robots and cyborgs need to be incorporated into both theology and practice.

Along these lines, Evan Ackerman reports this month on a service in Japan that offers a robot to perform Buddhist funeral rites [1].  The “humanoid robot, suitably attired in the robe of a Buddhist monk” reads the sutras and bows at the appropriate moments.

The robot is much cheaper than a human, is programmed for alternative versions of the ritual, and can live stream the affair to remote mourners. (It can probably work much longer and faster than puny Carbon-based priests, too.)

It isn’t clear how this will be accepted or how popular it may be. To the degree that the funeral is for the comfort of the living, much will depend on how the mourners like it. A robot is not as sympathetic and soothing as a person, so I don’t really know.

There are, of course, theological questions in play. Do the words count if they are said by a machine? (Would they count if a parrot recited them and bowed?) There are certain to be differences of opinion on this question.

Thinking about this, I note another interesting possibility: a robot can also be remotely operated. A human priest could very well supervise the ceremony from a distance, with various levels of control. The robot could, in principle, be anywhere on Earth, in orbit, or on Mars; extending the reach of the holy man. Would this remote augmentation of the priest’s capabilities be “more authentic” than an autonomous robot programmed to do the ceremony?

Such a remote operation would have advantages. The robot would add a level of precision to the fallible priest—the robot could check and correct the performance. The robot can operate in hazardous conditions, such as a disaster area or war zone (imagine remote chaplains for isolated military posts). The remote avatar might bring a measure of comfort to people otherwise out of reach of conventional pastoral care.

Human priests would not have to travel, and could perform more work. For that matter, a single priest could operate multiple remote robot avatars simultaneously, significantly augmenting the sacred productivity.

Taking this idea of a priestly “remote interface” seriously for a moment, we can speculate on what other rituals might be automated this way. Something like Christian traditions such as baptism or communion certainly could be done by robots, especially supervised robots. Would this be theologically legitimate? Would it be psychologically acceptable? I don’t know.

I haven’t heard of anyone doing it, and I’m not endorsing such a thing, I’m just thinking about the possibility.

To the degree that autonomous or supervised robots are accepted into spiritual practice, there will be interesting questions about the design and certification of such robots. It might well be the case that the robot should meet specific standards, and have only approved programming. Robots could be extremely doctrinaire, or dogma could be loaded as a certified library or patch. I have no idea what these software standards might need to be, but it will be yet another frontier in software quality assurance.

There are other interesting possibilities. What if a robot is programmed for multiple religious practices, coming from competing traditions? At any one moment, it may be operating completely validly for one set of rules, and later it might switch and follow another set of rules. This is how robots work. But this is certainly not how human religions work. Carbon-based units generally cannot be certified clergy for more than one sect at a time. Will robots have to be locked in to a single liturgical version? Or, like TVs or Web browsers, would a tele-priest be a generic device, configured with approved content as needed?

While we’re on the question of software, what about hacking? What if malicious parties hack into the sacred software, and replace the prayers with a competing version of the rite? Or defile the words or actions? Or simply destroy the religion they dislike? Yoiks! I have no idea what the theological implications of a corrupted or enslaved robot would be, but I imagine they could be dire.

  1. Evan Ackerman, Pepper Now Available at Funerals as a More Affordable Alternative to Human Priests, in IEEE Spectrum – Automation. 2017.


Moving Beyond the Turing Test

Alan Turing is the great founding wizard of computer science. He dreamed up the field, and established its mathematical foundations. He did so much awesome, amazing stuff.

And then there’s The Turing Test, which is his most famous invention. Which is also one of the dumbest ideas ever.

At the dawn of the computer age, everyone including Turing was fascinated by all the things that might come. And everyone is fascinated by the idea of computers mimicking human behavior. Nerdy egghead Turing was particularly interested in cognitive abilities, which many considered to be the crux of “intelligence”. Could computers “think”? How could we know?

His thought experiment proposed that a machine could be considered “intelligent” if it could hold a conversation with a human, and the human couldn’t tell if it were a machine or another person. This has become “The Turing Test”, and enshrined as a definition of Artificial Intelligence.

To an American, this test appears to be modeled after the British class (and academic) system. When I talk to you, if I can’t tell you aren’t one of us, then you will be accepted as one of us.

While this idea is certainly an intriguing psychological comment on human communication, society, and identity, it is utterly useless as an actual test of anything. For one thing, conversation is hardly the only “intelligent” activity. And for another, who said that the puny Carbon based unit is “intelligent” in the first place?

Over the years, the Turing Test has been side stepped by many domain specific assessments of vision, locomotion, and so on. In the area of language, great technical advances have enabled computers to become roughly as capable as humans for some activities, such as speech recognition and spell checking. These cool applications often beat the heck out of puny humans, though the internal details are known to have little to do with how H. sap. does the same thing.

As for the original domain Sensei Alan was thinking about—language understanding, reasoning, and common sense—there are many ways to assess this. Winning at games is flashy, but not particularly useful.

Carissa Schoenick and colleagues of the Allen Institute for Artificial Intelligence describe another idea, the Allen AI Challenge [1]. This challenge asks computers to complete eighth grade science exams, to compare them to the human teenagers upon whom these tests are regularly inflicted.

“Allen AI Science Challenge, a competition to test machines on an ostensibly difficult task—answering eighth-grade science questions.” ([1], p. 60)

One interesting point is that the format of the test was substantially different for the computers than for the humans. There were zillions more questions, both practice (training) and testing (validation). The test was bloated with a vast number of dummy questions, to defeat simple cheating by the non-human participants.  Honestly, humans would have difficulty wading through 20,000 multiple choice questions! But computers don’t get bored, and they have no bladders.

The computers’ results were compared to chance, to simple online look ups, and to aggregate results from humans. The best efforts achieved around 60% correct answers. This was better than guessing (expected 25% correct), but not that much better than efficient searching (which could get 40-50% correct, just by looking things up).

These results are hardly an “A”, and many teen-aged humans do a lot better. Some of the wrong answers indicate the weaknesses of the AI. We all know that computers are pretty good at look up, but they had difficulty relating facts to each other.

“[These AIs were] unable to ace an eighth-grade science exam because they do not currently have AI systems able to go beyond surface text to a deeper understanding of the meaning underlying each question, then use reasoning to find the appropriate answer” ([1], p. 63)

Schoenick et al. note that there are many question-answering systems around, but none of them are good at reasoning, especially “common sense” reasoning.

I have to say that this was pretty much the state of affairs when I was a lad sixty years ago. Sigh. Despite all the hoopla, Hollywood, and hand-wringing, common sense reasoning is still the mountain that has to be climbed.

I’ll add that it is slightly encouraging to see humans, even teens, are still undeniably better than computers at this task. AI supremacists have some ‘splainin’ to do.

One also wonders what exactly the Singularity Crowd makes of this. I mean, if you “upload” your “mind” to Silicon, are you going to be able to even graduate from middle school?

  1. Carissa Schoenick, Peter Clark, Oyvind Tafjord, Peter Turney, and Oren Etzioni, Moving beyond the Turing Test with the Allen AI Science Challenge. Commun. ACM, 60 (9):60-64, 2017.


Interplanetary Networking

Space travel faces inevitable communication challenges. Even within the solar system, distances are light minutes to hours, which means round trip latencies that preclude easy conversation. In addition, signal strength falls off quadratically with distance, so there is a brutal power-to-distance relationship, and power is precious in space.
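Both constraints are simple arithmetic, sketched below. The speed of light and the astronomical unit are standard constants; the Mars and Neptune distances are rough figures I am assuming for illustration, and the function names are mine.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
AU_KM = 149_597_870.7  # one astronomical unit (mean Earth-Sun distance)


def one_way_light_time_s(distance_km):
    """Seconds for a signal to cover distance_km at the speed of light."""
    return distance_km / SPEED_OF_LIGHT_KM_S


def relative_received_power(distance_ratio):
    """Inverse-square law: received power relative to a reference distance."""
    return 1.0 / distance_ratio ** 2


if __name__ == "__main__":
    # Distances below are rough, assumed figures.
    targets = [
        ("Mars near closest approach (~55 million km)", 55e6),
        ("Mars near farthest (~400 million km)", 400e6),
        ("Neptune (~30 AU)", 30 * AU_KM),
    ]
    for label, km in targets:
        minutes = one_way_light_time_s(km) / 60
        print(f"{label}: {minutes:.1f} light-minutes one way")
    print("power at 10x the distance:", relative_received_power(10))
```

A round trip doubles the one-way figure, so even a short question and answer with Mars takes several minutes at best. Ten times the distance leaves one hundredth of the received power for the same transmitter and antenna.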

Can we do better than radio signals?

Gregory Mone reports in CACM that the answer is, lasers, man! [2]

Lasers have higher frequencies and narrower beams, so they can transmit more data for the same power. This can’t eliminate the latency, but can push more data in a given time. As much as 10-100 times the data, which is worth a lot of effort to make happen.

A laser is much more directional than radio, and the receiver is a telescope. The narrow beam is a challenge, because the signal has to be aimed precisely. Given that everything is moving rapidly relative to everything else, it isn’t trivial to keep a signaling laser pointed at a very distant target.

“If you were to aim a beam of radio waves back at Earth from Mars, the beam would spread out so much that the footprint would be much larger than the size of our planet. ‘If you did the same thing with a laser,’ Biswas [of NASA JPL] says, ‘the beam footprint would be about the size of California.’” ([2], p. 18)
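The footprint comparison can be checked to an order of magnitude with the usual diffraction rule of thumb, divergence of roughly wavelength divided by aperture. Every number below is my assumption for illustration (an X-band radio link with a 3 m dish, a 1550 nm laser with a 22 cm telescope, a typical Earth-Mars distance); none of them comes from the article.

```python
EARTH_DIAMETER_KM = 12_742
MARS_DISTANCE_KM = 225e6  # assumed typical Earth-Mars distance


def footprint_km(wavelength_m, aperture_m, distance_km):
    """Rough beam width at the receiver: distance times divergence,
    with divergence approximated as wavelength / aperture (radians)."""
    divergence_rad = wavelength_m / aperture_m
    return distance_km * divergence_rad


# Assumed radio link: about 8.4 GHz (3.6 cm wavelength), 3 m dish.
radio_km = footprint_km(0.036, 3.0, MARS_DISTANCE_KM)
# Assumed optical link: 1550 nm laser, 22 cm telescope.
laser_km = footprint_km(1.55e-6, 0.22, MARS_DISTANCE_KM)

if __name__ == "__main__":
    print(f"radio footprint: {radio_km:,.0f} km")
    print(f"laser footprint: {laser_km:,.0f} km")
    print(f"Earth diameter:  {EARTH_DIAMETER_KM:,} km")
```

With these assumed values the radio footprint comes out in the millions of kilometers, far wider than Earth, while the laser footprint is on the order of a thousand kilometers or so, which is consistent with the quoted comparison.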

Experiments have demonstrated space laser communication, utilizing error correcting codes (to mitigate lost signals) and advanced nanoactuators to precisely aim the laser. At very large distances, power will be at a premium, so there will be no bits to spare for error correction.

The receivers are essentially telescopes, which are a very well known technology. Receiving weaker signals from farther away means bigger telescopes. Mone says that signals from the solar system need a 10-15 meter scope.

These links will still have extremely long latencies compared to terrestrial networks. This means that our Earth bound protocols need to be redesigned for the Interplanetary Internet. (Hint: timeouts don’t work well if the round trip time for an ACK is variable and measured in hours.) This work is well underway [1].


  1. InterPlanetary Networking Special Interest Group (IPNSIG). 2017.
  2. Gregory Mone, Broadband to Mars. Commun. ACM, 60 (9):16-17, 2017.


Robogami: “Democratizing” Robot Building?

In a recent paper, Cynthia Sung and colleagues at MIT describe their automated design system, which addresses a “long-held goal in the robotics field has been to see our technologies enter the hands of the everyman [sic].” [1]

Well, I don’t know about that. Every nerd, maybe.

The idea is a high level design system that generates simple “fold up” robotic vehicles, suitable for fabrication with ubiquitous laser cutters and other shop tools. The computer system helps the designer create the “geometry”, the 3D shape of the vehicle, and the “gait”, how it moves. The system shows the results in a simulator, so the designer can rapidly iterate. The prototype is then sent to a printer, and snapped together with appropriate motors and wires.

“One of the main challenges in robot design is the interdependence of the geometry and motion.”


As the paper makes clear, this idea was influenced by a number of current trends which I’m sure are bouncing around MIT CSAIL and everywhere else: computer-aided iterative design, rapid prototyping with personal fabrication, and, of course, Origami.

The system also reports performance metrics (e.g., speed of locomotion), and helps optimize the design.

Of course, this isn’t really a general purpose robot design system. Aside from the fact that the hard part in any design is figuring out what to design (and diving into iterative prototyping often distracts from careful thought and research), useful robots have sensors and manipulators, as well as machine learning or domain knowledge or both, which is not part of this design.

This system is really only about the body and the movement: essentially, the basic shell of the robot.  Important, but really only the foundation of a working, useful robot.

“The system enables users to explore the space of geometries and gaits”

It’s cool, but not the whole story.

And, let us not forget, the appearance and sociability of the robot is increasingly important. These cute little robogamis look like toys, and are little more use than a toy. These are certainly not social robots!

Now, if you sold this as a “toy factory”, perhaps with some stickers and funny voices, you’d have a bang up product. Don’t give Suzie a doll, give her a machine to make as many dolls as she wants!  And the dolls move and talk!

Now that would be cool!

  1. Adriana Schulz, Cynthia Sung, Andrew Spielberg, Wei Zhao, Robin Cheng, Eitan Grinspun, Daniela Rus, and Wojciech Matusik, Interactive robogami: An end-to-end system for design of robots with ground locomotion. The International Journal of Robotics Research:0278364917723465, 2017.


Robot Wednesday