
McGraw on Software vs Security

I enjoyed Gary McGraw’s comments in IEEE Computer about “Six Tech Trends Impacting Software Security” [1].

His main point is that software development (and I would say runtime environments, too) has changed rapidly in the last couple of decades, obsoleting many software security assurance techniques (which I would say were iffy even in their heyday).

“The past few years have seen radical shifts in the way software is developed, in terms of both process and the technology stack. We must actively track these changes to ensure that software security solutions remain relevant.” ([1], p. 20)

His list includes:

  • Continuous integration and continuous development
  • “The Cloud”
  • The Internet of Things—software is in everything
  • Software containers, dynamic composition
  • AI
  • Software security leaders are newbs

These are some of the trendiest trends!

Interestingly, McGraw does not see “the cloud” as particularly troubling in itself, and he has a point. If anything, deploying software in standardized server farms is a good thing for security, compared to installing everything on a zillion platforms out in the wild world. (But see “Internet of Things”.)

As he says, continuous development is a hazard not only for security but for quality and everything else. To me, continuous development is hard to distinguish from just plain hacking, and that’s not good for quality or security or anything except speed to market.

McGraw doesn’t mention documentation, but please spare a moment to have a kind thought for the poor technical writer, who is tasked with explaining the software, even as it changes from hour to hour.

I myself have already beefed about the IoT many times; it is a hazard from almost every angle. But I have to say that I don’t think it is even theoretically possible to write good code for the IoT, secure or not. And it is deployed out in the world with no one actually in charge. How can this be anything but a catastrophe?

As McGraw suggests, AI cuts both ways. It creates vast possibilities for bugs and breaches beyond human understanding, but also enables tools and processes that can greatly improve software (again, beyond human capabilities). As he says, a lot of this isn’t so much new, but with so many cycles and gazoogabytes available to anyone, even old tricks can yield amazing results, for better or worse.

The unifying theme in all this is that systems are bigger, faster, and way, way more complicated than ever. Including the Internet, “the system” extends to every corner of the globe, encompassing zillions of nodes and links, under the control of everyone and no one. No human can understand what’s going on, what the software does, or even how the software is configured. If you can’t understand it, you can’t make it secure.

McGraw’s last point is interesting. Security professionals are not stupid, but many of them are young. From my point of view, the question is, “are they paranoid enough?” Probably not.

There are plenty of other tech trends that create security hazards. I’ll just mention my own favorite bugaboo, virtualization. Over my umpty-ump decades of software development, everything has moved to be more and more virtualized. Information hiding, standardization, and emulation are powerful technologies and, honestly, without them we’d never be able to produce software fast enough to even keep up.

But virtualization has led to the situation where even the smallest program depends on an unknowable stack of software. “Unknowable” because even if you intend to freeze the configuration, you really can’t.
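To make the point concrete, here is a trivial illustration (mine, in Python, though every modern stack behaves the same way): before a one-import program does anything at all, it is already standing on a pile of modules, each with its own version and its own churn.

    # A one-import program already rides on a sizable stack of modules.
    import sys
    import json  # one ordinary standard-library import

    print(len(sys.modules), "modules loaded before my program does anything")
    # Typically dozens; add one third-party package and it becomes hundreds,
    # each a moving target that a "frozen" configuration does not really freeze.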

Like everyone, I have seen cases where developers don’t (and can’t) fix a bug, so they just roll back a magic VM to the magical last safe point where it worked, and restart. Tell me that isn’t a security problem.

The fact that software works at all is a tribute to the skill of us, the programmers. But it is difficult to be optimistic that it won’t all come tumbling down.

“If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.” (Gerald Weinberg’s Second Law)

And if the woodpeckers are out to get us, just how long will civilization last?


  1. Gary McGraw, Six Tech Trends Impacting Software Security. Computer, 50(5):100-102, 2017. http://ieeexplore.ieee.org/document/7924264/

 

Four Colors Still Suffice

This fall marks the 40th anniversary of the publication of the first proof of the Four Color Map Problem.

Two professors at the University of Illinois used the relative abundance of computer power at UIUC to produce a groundbreaking computer-assisted proof of this perennial question.

I remember very well getting my issue of Scientific American, and there it was.

I knew immediately what it must mean. (As any Illini can tell you, there is a postal substation in the Math building. They arranged a special postal cancellation to mark the announcement.)

The essence of their 1977 proof is to reduce all possible maps to a finite set of configurations, and systematically crunch through them all [1, 2]. For their proof they dealt with some 1400 configurations, which took months to process on the IBM 360. Today, you can probably do it in minutes on your phone. Then, it took a special allocation of time on “the computer”.
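For flavor only, here is a toy Python sketch of brute-force four-coloring by backtracking. This is nothing like the Appel-Haken discharging and reducibility machinery, and the little example graph is made up, but it conveys the “systematically crunch through the cases” spirit.

    # Toy backtracking four-colorer (illustrative only, not the 1977 method).
    def four_color(graph):
        """Assign one of 4 colors to each region so neighbors differ, or return None."""
        nodes = list(graph)
        colors = {}

        def backtrack(i):
            if i == len(nodes):
                return True
            node = nodes[i]
            for c in range(4):
                if all(colors.get(nb) != c for nb in graph[node]):
                    colors[node] = c
                    if backtrack(i + 1):
                        return True
                    del colors[node]
            return False

        return colors if backtrack(0) else None

    # A small made-up planar "map": regions A-E and their neighbors.
    example = {
        "A": ["B", "C", "D"],
        "B": ["A", "C", "E"],
        "C": ["A", "B", "D", "E"],
        "D": ["A", "C", "E"],
        "E": ["B", "C", "D"],
    }
    print(four_color(example))  # e.g. {'A': 0, 'B': 1, 'C': 2, 'D': 1, 'E': 0}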

The result was not without controversy. Is it a valid mathematical proof if it has not been, and cannot be, confirmed by a human? (As an Illinois alum, I say it’s a valid proof!)

This theorem has been proved many times since 1977, so there isn’t much doubt about the result. But the first time was a heroic milestone in human knowledge.

Unfortunately, much about this major human accomplishment is effectively lost to historical and intellectual analysis.

It was written in IBM assembler, and punched on Hollerith cards. (You youngsters can look those things up.) I know that Prof. Appel still had the deck twenty-five years ago, because he showed it to me, in a box propping open the door. Even back then there was no way to run the program (no card readers left, nor any IBM 360s).

So there are many questions that cannot be answered. Was the original program correct? What was their coding style like? Is the code pretty, or clever? And so on.

We don’t know, and it would be difficult to find out.

Still. All hail Haken and Appel, computational heroes! We are not worthy!

Giants still walk among us!

Alma Mater


  1. K. Appel and W. Haken, Every planar map is four colorable. Part I: Discharging. Illinois J. Math., 21(3):429-490, 1977. https://projecteuclid.org:443/euclid.ijm/1256049011
  2. K. Appel, W. Haken, and J. Koch, Every planar map is four colorable. Part II: Reducibility. Illinois J. Math., 21(3):491-567, 1977. https://projecteuclid.org:443/euclid.ijm/1256049012
  3. Samantha Jones, Celebrating the Four Color Theorem, in College of Liberal Arts – News. 2017, University of Illinois Urbana. http://www.las.illinois.edu/news/article/?id=23831&/news//news/2017/fourcolortheorem17/

 

Listening for Mosquitos

The ubiquitous mobile phone has opened many possibilities for citizen science. With most citizens equipped with a phone, and many with small supercomputers in the purse or pocket, it is easier than ever to collect data from wherever humans may be.

These devices are increasing the range of field studies, enabling the identification of plants and animals by sight and sound.

One key, of course, is the microphones and cameras. Sold for deals and dating, not to mention selfies, these instruments outstrip anything scientists could afford to build on their own.

The other key is that mobile devices are connected to the Internet, so data uploads are trivial. This technology is sold for commerce and dating and for sharing selfies, but it is perfect for collecting time and location stamped data.

In short, the vanity of youngsters has funded infrastructure that is better than scientists have ever built. Sigh.


Anyway.

This fall the Stanford citizen science folks are talking about yet another crowd-sourced data collection: a project that identifies mosquitos by their buzz.

According to the project information, Abuzz works on most phones, including older flip phones (AKA, non-smart phones).

It took me a while to figure out that Abuzz isn’t an app at all. It is a manual process. Old style.

You use the digital recording feature on your phone to record a mosquito. Then you upload that file to their web site. This seems to be a manual process, and I guess that we’re supposed to know how to save and upload sound files.

The uploaded files are analyzed to identify the species of mosquito. There are thousands of species, but the training data emphasized the important, disease-bearing species we most want to know about.

A recent paper reports the details of the analysis techniques [2]. First of all, mobile phone microphones pick up mosquito sounds just fine. As we all know, the whiny buzz of those varmints is right there in the range of human hearing, so it’s logical that telephones tuned to human speech would hear mosquitos just fine.

The research indicates that the microphone picks up mosquitos at a range of up to 100 mm. This is pretty much what you would expect for a hand-held phone. So, you are going to have to hold the phone up to the mosquito, just like you would pass it to a friend to say hello.

At the crux of the matter, they were able to distinguish different mosquitos from recordings made by phone. Different species of mosquito have distinct sounds from their wing beats, and the research showed that they can detect the differences from these recordings.
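As a back-of-the-envelope illustration of the general idea (my own sketch, not the Abuzz pipeline), the dominant wingbeat frequency can be pulled out of a recording with a plain FFT; species differ in their characteristic frequency ranges. The file name and the frequency band here are assumptions for the example.

    # Illustrative sketch only -- not the Abuzz analysis pipeline.
    # Estimate the dominant wingbeat frequency in a mosquito recording.
    import numpy as np
    from scipy.io import wavfile

    def dominant_frequency(wav_path, band=(200.0, 1000.0)):
        """Return the loudest frequency (Hz) within the wingbeat band."""
        rate, samples = wavfile.read(wav_path)
        if samples.ndim > 1:                 # mix stereo down to mono
            samples = samples.mean(axis=1)
        samples = samples - samples.mean()   # remove DC offset
        spectrum = np.abs(np.fft.rfft(samples))
        freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
        in_band = (freqs >= band[0]) & (freqs <= band[1])
        return freqs[in_band][np.argmax(spectrum[in_band])]

    # print(dominant_frequency("mosquito.wav"))  # hypothetical file; e.g. ~600 Hz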

They also use the time and location metadata to help identify the species. For example, the geographic region narrows down the species that are likely to be encountered.

The overall result is that it should be possible to get information about mosquito distributions from cell phone recordings provided by anyone who participates. This may contribute to preventing disease, or at least alerting the public to the current risks.


This project is pretty conservative, which is an advantage and a disadvantage. The low tech data collection is great, especially since the most interesting targets for surveillance are likely to be out in the bush, where the latest iPhones will be thin on the ground.

On the other hand, the lack of an app or a plug-in to popular social platforms means that the citizen scientists have to invest more work, and get less instant gratification. This may reduce participation. Obviously, it would be possible to make a simple app, so that those with smart phones have an even simpler way to capture and upload data.

Anyway, it is clear that the researchers understand this issue. The web site is mostly instructions and video tutorials, featuring encouraging invitations from nice scientists. (OK, I thought the comment that “I would love to see is people really thinking hard about the biology of these complex animals” was a bit much.)

I haven’t actually tried to submit data yet. (It’s winter here; the skeeters are gone until spring.) I’m not really sure what kind of feedback you get. It would be really cool to email back a rapid report (i.e., within 24 hours). It should give the initial identification from your data (or possibly “there were problems, we’ll have to look at it”), along with overall statistics to put your data in context (e.g., “we’re getting a lot of reports of Aedes aegypti in your part of Africa”).

To do this, you’d need to automate the data analysis, which would be a lot of work, but certainly is doable.


I’ll note that this particular data collection is something that cannot be done by UAVs. Drones are, well, too droney. Even if you could chase mosquitos, it would be difficult to record them over the darn propellers. (I won’t say impossible—sound processing can do amazing things).

I’ll also note that this research method wins points for being non-invasive. No mosquitos were harmed in this experiment. (Well, they were probably swatted, but the experiment itself was harmless.) This is actually important, because you don’t want mosquitos to selectively adapt to evade the surveillance.


  1. Taylor Kubota, Stanford researchers seek citizen scientists to contribute to worldwide mosquito tracking, in Stanford – News. 2017. https://news.stanford.edu/2017/10/31/tracking-mosquitoes-cellphone/
  2. Haripriya Mukundarajan, Felix Jan Hein Hol, Erica Araceli Castillo, and Cooper Newby, Using mobile phones as acoustic sensors for high-throughput mosquito surveillance. eLife, October 11, 2017. doi: 10.7554/eLife.27854. https://elifesciences.org/articles/27854#info

Semantics-Aware Framework for 3D Tele-Immersion

One of the latest products from Sensei Klara Nahrstedt’s teleimmersion lab is Shannon Chen’s prize-winning thesis, “Semantics-Aware Content Delivery Framework for 3D Tele-Immersion” [1].

Nahrstedt’s group has been developing 3D Tele-Immersion (3DTI) technology, which allows “full-body, multimodal interaction among geographically dispersed users,” for a decade and more now.

Chen’s dissertation is about optimizing the trade-offs that are inherent in the end-to-end transmission of 3DTI. This is a recent refinement of the quality of service concepts this group has developed over many years.

The basic challenge is that 3DTI sucks CPU, memory, and bandwidth like crazy, and user experience suffers badly from latency or inadequate bit rates. Managing the network is critical, and very difficult.

 

Chen’s contribution is to introduce semantic information into the system to manage resource usage and trade-offs, “to bridge the gap between high-level semantics and low-level data delivery,” specifically “by injecting environmental and user-activity semantic information.”

The thesis considers several phases of 3DTI: capture, dissemination, and receiving. In each phase, resource limitations challenge the ability to deliver a satisfactory user experience.

The semantics to be considered are computing environment, activity, and user role. From a high level understanding of these, the system can tune performance at many levels.

The overall design is a set of modules that use the elements of the semantics to adjust the parameters of the 3DTI phases.

I’ll refer the reader to the dissertation for full details. Briefly,

  • Activity semantics are used to optimize data capture, helping identify the most important data based on the user’s task and behavior.
  • User semantics are used to optimize dissemination: the user’s role helps identify the most important flows of data and the QoS they require.
  • Activity + environment semantics are used to optimize receiving: the user’s environment determines his or her viewpoint and also the capabilities of the local device.

The thesis reports on analysis of three prototypes for different use cases that emphasize these three types of optimization.
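To make the module idea concrete, here is a toy sketch (mine, not Chen’s implementation) of how role, activity, and environment semantics might be mapped to per-stream delivery parameters; all the names and numbers are made up for illustration.

    # Toy mapping from high-level semantics to delivery parameters (hypothetical).
    from dataclasses import dataclass

    @dataclass
    class StreamPolicy:
        resolution: str       # "full" or "reduced"
        max_latency_ms: int
        priority: int         # higher = scheduled first

    def plan_stream(role, activity, device_class):
        """Pick delivery parameters from (role, activity, environment) semantics."""
        # User semantics: an instructor watching a trainee needs those streams first.
        priority = 10 if role == "instructor" else 5
        # Activity semantics: fast physical interaction tolerates less latency.
        max_latency = 50 if activity == "physical-interaction" else 150
        # Environment semantics: weak devices get reduced representations.
        resolution = "full" if device_class == "workstation" else "reduced"
        return StreamPolicy(resolution, max_latency, priority)

    print(plan_stream("instructor", "physical-interaction", "tablet"))
    # StreamPolicy(resolution='reduced', max_latency_ms=50, priority=10)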

I note that the systems he tackled present very difficult technical problems. For example, the 3DTI is not only all-around (i.e., multiple simultaneous video streams), it may include on-body and other sensors (which must be synchronized with the video). 3DTI can be synchronous or asynchronous, and might need to be archived for analysis and replay.

In short, the data is diverse and voluminous, and generally needs to be synchronized. The trick, of course, is that there are slews of data that might be needed at any moment, but only some of it is actually needed at any particular time in a particular part of the system. The idea is to use semantics to deliver what is needed when and where it is needed, to improve the experience.

This is a nice piece of work. He hits on a lot of important themes.

For one thing, it shows again the importance of end-to-end design. In this case, his “semantics” come from the requirements and constraints on the whole system, from human to human, through many systems and links. In my view, he could have called it “An end-to-end framework….”

I also endorse his call that:

“we need a formalized scripting language to describe the dynamics in the cyber-physical regime to the digital computing entities” (p. 98)

Absolutely. (See McGrath (2005) [2], which is woefully out of date, but outlines the general idea. : – )

More generally, I think there is a lot of use for logical description languages which can combine both manual assertions (e.g., this user is a patient or is a doctor) and automated inferences (e.g., a doctor is likely to need to access archives at full resolution if possible). These systems can also (in principle) reason and produce inferences, e.g., suggestions about optimization based on perceived similarity to previous sessions.

Dr. Chen is reportedly now working with a large social media company, so I’m sure his future systems could have access to slews of interesting metadata, including social networks, and histories of digital behavior.

Nice work.

  1. Chien-nan Chen, Semantics-Aware Content Delivery Framework for 3D Tele-Immersion, in Computer Science. 2016, University of Illinois, Urbana-Champaign: Urbana. http://cairo.cs.uiuc.edu/publications/papers/Shannon_Thesis.pdf
  2. Robert E. McGrath, Semantic Infrastructure for a Ubiquitous Computing Environment, in Computer Science. 2005, University of Illinois, Urbana-Champaign: Urbana. http://hdl.handle.net/2142/11057
  3. August Schiess, CS alumnus Shannon Chen receives SIGMM Outstanding PhD Thesis Award, in CS@Illinois – News. 2017. https://cs.illinois.edu/news/cs-alumnus-shannon-chen-receives-sigmm-outstanding-phd-thesis-award

 

Bitcoin is More Evil Than Ever

From the beginning, Nakamoto-style cryptocurrency was intended to enable unimpeded flows of funds [2]. Cryptocurrencies are specifically designed to be the perfect mechanism for grey and black markets, for tax evasion, and for money laundering of all kinds. While crypto-enthusiasts see this as a feature, most of civilized society views this as a serious bug.

In the short history of Bitcoin, we have seen it become a medium for illicit commerce and ransomware. (Even more-or-less legitimate uses, such as digital commerce, are being hijacked by a flood of scams, including preposterous “initial coin offerings”, which might as well be called “tulipware”.)

It is now evident that Bitcoin has also become a favorite tool for human smuggling and human trafficking: the modern-day slave trade. I’m not seeing this as a good thing in any way at all.

As reported in Coindesk [1], this issue was highlighted by Joseph Mari of the Bank of Montreal at the Pontifical Academy of Social Sciences’ Workshop on Assisting Victims of Human Trafficking: Best Practices in Legal Aid, Compensation and Resettlement [4]. (It’s not often that I cite something “Pontifical” : – )) Mari reports that, as conventional financial services move to block illicit commerce, including human trafficking, criminals have moved to Bitcoin to collect their illicit money.

Cryptocurrency enthusiasts are quick to point out that this is pretty much exactly how Bitcoin was designed to work: it is supposed to be immune to “censorship”. Other cynics like me would also point out that the wealthy get away with this stuff without resorting to frippery like Bitcoin. (See perhaps: England, Queen of, offshore accounts of.)

Of course, the original Nakamoto design was more than a little hacky, and it isn’t completely immune to interference by determined authorities. Companies make good money selling analytics that spot suspicious transactions and, with favorable winds and some luck, might nab some bad guys.
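To give the flavor of what such analytics look for (a toy example of my own, not any vendor’s product), one classic pattern is the “peeling chain”: a large balance hops through a series of fresh addresses, shedding a small payment at each step.

    # Toy heuristic: flag "peeling chains" (my simplification, not a real product).
    def looks_like_peeling_chain(hops, min_length=5, peel_fraction=0.1):
        """hops: list of (amount_kept, amount_peeled) pairs along a chain."""
        if len(hops) < min_length:
            return False
        return all(peeled < peel_fraction * kept for kept, peeled in hops)

    chain = [(100.0, 2.0), (98.0, 1.5), (96.5, 2.2), (94.3, 1.0), (93.3, 0.7)]
    print(looks_like_peeling_chain(chain))  # True -> worth a closer look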

However, this mostly retroactive data mining is hardly adequate. Detecting this stuff after the fact doesn’t stop, prevent, or deter it.

Worse, the tiny successes so loudly touted are technically obsolete, as the dark web moves to far more opaque cryptocurrencies.

Mari is right to be concerned, and it is good to educate conventional banks and other authorities about this technology. But I’m really not sure that there is anything that can be done, at least until quantum computing takes it all down.


  1. Michael del Castillo, Vatican Address to Highlight Bitcoin Use in Slave Trade. Coindesk, November 2, 2017. https://www.coindesk.com/vatican-address-highlight-bitcoin-use-human-slave-trade/
  2. Satoshi Nakamoto, Bitcoin: A Peer-to-Peer Electronic Cash System. 2009. http://bitcoin.org/bitcoin.pdf
  3. Darryn Pollock, Jamaican Police Take Aim at Human Traffickers’ Bitcoin Pockets, in Cointelegraph. 2017. https://cointelegraph.com/news/jamaican-police-take-aim-at-human-traffickers-bitcoin-pockets
  4. The Pontifical Academy of Social Sciences, Workshop on Assisting Victims of Human Trafficking: Best Practices in Legal Aid, Compensation and Resettlement. 2017: Vatican City. http://www.pass.va/content/scienzesociali/en/events/2014-18/resettlement.html

 

Cryptocurrency Thursday

 

BeePi: Open Source Hardware

OK, I have my reservations about the Internet of Things (AKA the Internet of Way Too Many Things, or the Internet of Things That Don’t Work Right).  And I have also expressed concerns about DIY environmental sensing, which is usually long on sensing and short on validity.

But let’s combine IoT concepts with useful environmental monitoring, and validate the measurements, and I’m all for it.

Plus, I’m really worried about the bees.

So I am very interested in Vladimir Kulyukin’s BeePi, a Raspberry Pi-based bee hive monitor. Over the past decade, his team has developed low-cost sensors and in situ data analysis that measure the sound, sight, and temperature of a bee hive. The sensors are minimally invasive, and collect data more or less continuously.

[Photo caption: Vladimir Kulyukin downloads data from a BeePi system at a honey bee hive in Logan on Monday afternoon. The USU computer science professor started a Kickstarter campaign for the device and surpassed his goal within the first two weeks. John Zsiray/Herald Journal]

Unlike bogus “Pigeon backpack” projects, this group has actually developed, validated, and published analytics that turn the sensor traces into potentially useful data about the behavior of bees. (E.g. see [1].)

The sound recordings can, in principle, give clues about the number and activity of the bees. At the coarsest level, they have easily documented the daily cycle of activity. I.e., they have confirmed day and night.

The visual imagery is used to detect bees entering and leaving the hive. This is an important indicator of foraging activity and overall health of the colony, and might give early warning of trouble in the hive.

The temperature measures correlate with overall activity, and aberrant readings would indicate serious problems inside the colony.

The researchers aim to publish their hardware and software designs, so others can build and improve the idea. (It isn’t immediately clear what kind of licensing is intended, other than it is open source.)

In a sad sign of the times, they are doing a Kickstarter to raise money ($1,000!?) to build some more prototypes. In a sane world, funding agencies and companies would be beating down their doors trying to give them research support. And it would be many tens of thousands.

Another sign of the times is that the kickstarter is the most complete information about the project. Get a web page, guys!!


This project is pretty cool, and made me think.

As a distributed systems guy, I find the need for manual downloads just too crude. A future version should have some kind of low-power networking that, ideally, will automatically upload data to archives, e.g., in a cloud. A concomitant upgrade would be to beef up the data formats (they need to be documented, and would be better with standard metadata). It would be nice to have standard APIs for pushing and grabbing the data.
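The upload I’m imagining is nothing exotic; a minimal sketch in Python might look like the following. (The endpoint URL and the JSON fields are my inventions, not anything BeePi actually exposes.)

    # Minimal sketch of automated sensor upload (hypothetical endpoint and schema).
    import json
    import time
    import urllib.request

    ARCHIVE_URL = "https://example.org/beepi/api/readings"  # hypothetical

    def upload_reading(hive_id, temperature_c):
        reading = {
            "hive_id": hive_id,
            "timestamp": time.time(),    # standard time metadata
            "temperature_c": temperature_c,
        }
        req = urllib.request.Request(
            ARCHIVE_URL,
            data=json.dumps(reading).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return resp.status

    # upload_reading("hive-07", 34.2)  # bees hold the brood nest near 34-35 C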

Bee hives tend to be scattered and far from networks, though. But perhaps a small UAV “data harvester” might fly around, hover a couple of meters away to suck out the data through a short-range link, and return to base after its rounds. Sort of “sneaker net” in the age of ubiquitous drones. Such a drone might be useful for many environmental sensing tasks.

On the sensor front, I would think that humidity sensors would be a simple and important addition to the system. I think (but I’m not sure) that humidity is linked to some possible colony problems.

And what about lidar or sonar? The cost of lidar and sonar is crashing, so you might be able to add these to the sensors. Combined with the imagery, this would give even better bee counts (and in all weather, assuming the bees are active in all weather, which I’m not sure about).

Finally, I would suggest that the creators define how they want to share their system and data from it. Creative commons would be a place to look for ideas. <<link>> I would think that the plans and software might be shared through some existing maker community archive. E.g., Instructables, SparkFun, or AdaFruit would be plausible possibilities.  (Call me.)


This is a good example of low cost environmental sensing.  They are doing the hard work of validating the measurements.

There is a lot of work that could be done to make this a slicker and easier to use open source project. Documentation, publishing the design, and setting up a data archive are pretty straightforward, but would make a huge difference.  (Call me.)


  1. Vladimir Kulyukin and Sai Kiran Reka, A Computer Vision Algorithm for Omnidirectional Bee Counting at Langstroth Beehive Entrance, in International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV’16). 2016: Las Vegas. p. 229-235. http://worldcomp-proceedings.com/proc/p2016/IPC3835.pdf
  2. John Zsiray, USU professor hopes ‘BeePi’ hive sensors will help honeybees, in The Herald Journal – HJNews.com. 2017. https://news.hjnews.com/agriculture/usu-professor-hopes-beepi-hive-sensors-will-help-honeybees/article_7c205ecb-1d64-51d3-96c1-b1754de27d6f.html

Ad Servers Are—Wait For It—Evil

The contemporary Internet never ceases to serve up jaw-dropping technocapitalist assaults on humanity. From dating apps through vicious anti-social media, the commercial Internet is predatory, amoral, and sickening.

This month, Paul Vines and colleagues at the University of Washington report on yet another travesty: ADINT, “Using Ad Targeting for Surveillance” [1].

Online advertising is already evil (you can tell by their outrage at people who block out their trash), but advertisers are also completely careless of the welfare of their helpless prey. Seeking more and more “targeted” advertising, these parasites utilize tracking IDs on every mobile device to track every one of us. There is no opt-in or opt-out; we are not even informed.

The business model is to sell this information to advertisers who want to target people with certain interests. The more specific the information, the higher the bids from the advertiser. Individual IDs are combined with location information to serve up advertisements in particular physical locations. The “smart city” is thus converted into “the spam city”.

Vines and company have demonstrated that it is not especially difficult to purchase advertising aimed at exactly one person (device). Coupled with location specific information, the ad essentially reports the location and activity of the target person.

Without knowledge or permission.

As they point out, setting up a grid of these ads can track a person’s movement throughout a city.
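To see why, consider a toy sketch (my illustration, not the paper’s code): each ad in the grid is geofenced to one cell and targeted at a single advertising ID, so every impression reports “this device was in this cell at this time,” and the impressions sort into a movement trace.

    # Toy illustration of ADINT-style tracking via a grid of targeted ads.
    from collections import namedtuple

    # Each served ad impression leaks (device ID, grid cell, time).
    Impression = namedtuple("Impression", "maid cell timestamp")

    def reconstruct_track(impressions, target_maid):
        """Order one device's ad impressions into a movement trace."""
        hits = [i for i in impressions if i.maid == target_maid]
        return [(i.timestamp, i.cell) for i in sorted(hits, key=lambda i: i.timestamp)]

    # Hypothetical impression log from an ad-network dashboard:
    log = [
        Impression("maid-42", "cafe-district", 1000),
        Impression("maid-42", "office-park", 1900),
        Impression("maid-99", "stadium", 1200),
        Impression("maid-42", "gym", 2800),
    ]
    print(reconstruct_track(log, "maid-42"))
    # [(1000, 'cafe-district'), (1900, 'office-park'), (2800, 'gym')]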

This is not some secret spyware, or really clever data mining. The service is provided to anyone for a fee (they estimate $1,000). Thieves, predators, disgruntled exes, trolls, the teens next door. Anyone can stalk you.

The researchers suggest some countermeasures, though they aren’t terribly reassuring to me.

Obviously, advertisers shouldn’t do this. I.e., they should not sell ads that are so specific they identify a single person. At the very least, it should be difficult and expensive to filter down to one device. Personally, I wouldn’t rely on industry self-regulation; I think we need good old-fashioned government intervention here.

Second, they suggest turning off location tracking (if you are foolish enough to still have it on), and zapping your MAID (the mobile advertising ID). It’s not clear to me that either of these steps actually works, since advertisers track location without permission, and I simply don’t believe that denying permission will have any effect on these amoral bloodsuckers. They’ll ignore the settings or create new IDs not covered by the settings.

Sigh.

I guess the next step is a letter to the State’s Attorney and my representatives. I’m sure public officials will understand why it’s not so cool to have stalkers able to track them or their families through online adverts.


  1. Paul Vines, Franziska Roesner, and Tadayoshi Kohno, Exploring ADINT: Using Ad Targeting for Surveillance on a Budget — or — How Alice Can Buy Ads to Track Bob. Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, 2017. http://adint.cs.washington.edu/ADINT.pdf