Category Archives: Computer Programming

McGraw on Software vs Security

I enjoyed Gary McGraw’s comments in IEEE Computer about “Six Tech Trends Impacting Software Security” [1].

His main point is that software development (and, I would say, runtime environments, too) has changed rapidly in the last couple of decades, obsoleting many software security assurance techniques (which, I would say, were iffy even in their heyday).

“The past few years have seen radical shifts in the way software is developed, in terms of both process and the technology stack. We must actively track these changes to ensure that software security solutions remain relevant.” ([1], p. 20)

His list includes:

  • Continuous integration and continuous development
  • “The Cloud”
  • The Internet of Things—software is in everything
  • Software containers, dynamic composition
  • AI
  • Software security leaders are newbs

These are some of the trendiest trends!

Interestingly, McGraw does not see “the cloud” as particularly troubling in itself, and he has a point. If anything, deploying software in standardized server farms is a good thing for security, compared to installing everything on a zillion platforms out in the wild world. (But see “Internet of Things”.)

As he says, continuous development is a hazard not only for security but for quality and everything else. To me, continuous development is hard to distinguish from just plain hacking, and that’s not good for quality or security or anything except speed to market.

McGraw doesn’t mention documentation, but please spare a moment to have a kind thought for the poor technical writer, who is tasked with explaining the software, even as it changes from hour to hour.

I myself have already beefed many times about the IoT, which is a hazard from almost every angle. But I have to say that I don’t think it is even theoretically possible to write good code for the IoT, secure or not. And it is deployed out in the world with no one actually in charge. How can this be anything but a catastrophe?

As McGraw suggests, AI cuts both ways. It creates vast possibilities for bugs and breaches beyond human understanding, but also enables tools and processes that can greatly improve software (again, beyond human capabilities). As he says, a lot of this isn’t so much new, but there are so many cycles and gazoogabytes available to anyone, even old tricks can yield amazing results, for better or worse.

The unifying theme in all this is that systems are bigger, faster, and way, way more complicated than ever. Including the Internet, “the system” extends to every corner of the globe, encompassing zillions of nodes and links, under the control of everyone and no one. No human can understand what’s going on, what the software does, or even how the software is configured. If you can’t understand it, you can’t make it secure.

McGraw’s last point is interesting. Security professionals are not stupid, but many of them are young. From my point of view, the question is, “are they paranoid enough?” Probably not.

There are plenty of other tech trends that create security hazards. I’ll just mention my own favorite bugaboo, virtualization. Over my umpty-ump decades of software development, everything has moved to be more and more virtualized. Information hiding, standardization, and emulation are powerful technologies and, honestly, without them we’d never be able to produce software fast enough to even keep up.

But virtualization has led to the situation where even the smallest program depends on an unknowable stack of software. “Unknowable” because even if you intend to freeze the configuration, you really can’t.

Like everyone, I have seen cases where developers don’t (and can’t) fix a bug, so they just roll back a magic VM to the magical last safe point where it worked, and restart. Tell me that isn’t a security problem.

The fact that software works at all is a tribute to the skill of us, the programmers. But it is difficult to be optimistic that it won’t all come tumbling down.

“If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.” (Gerald Weinberg’s Second Law)

And if the woodpeckers are out to get us, just how long will civilization last?


  1. Gary McGraw, Six Tech Trends Impacting Software Security. Computer, 50 (5):100-102, 2017. http://ieeexplore.ieee.org/document/7924264/

 

Four Colors Still Suffice

This fall marks the 40th anniversary of the publication of the first proof of the Four Color Theorem.

Two professors at the University of Illinois used the relative abundance of computing power at UIUC to produce a groundbreaking computer-assisted proof of this perennial question.

I remember very well getting my issue of Scientific American, and there it was.

I knew immediately what it must mean. (As any Illini can tell you, there is a postal substation in the Math building. They arranged a special postal cancellation to mark the announcement.)

The essence of their 1977 proof is to enumerate all possible configurations and systematically crunch through them all [1, 2]. Their proof dealt with some 1,400 configurations, which took months to process on the IBM 360. Today, you could probably do it in minutes on your phone; back then, it took a special allocation of time on “the computer”.
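
Just to give a flavor of “systematically crunch through them all,” here is a minimal Python sketch of brute-force four-coloring by backtracking. To be clear, this is my own toy illustration with a made-up map, not the Appel–Haken method, which relied on far subtler discharging and reducibility arguments.

```python
# Toy illustration only: brute-force backtracking four-coloring of a small map,
# given as an adjacency list. This just shows the "enumerate and check" flavor;
# it is NOT the Appel-Haken proof technique.

def four_color(adjacency, colors=("red", "green", "blue", "yellow")):
    regions = list(adjacency)
    assignment = {}

    def backtrack(i):
        if i == len(regions):
            return True
        region = regions[i]
        for c in colors:
            # A color is legal if no already-colored neighbor uses it.
            if all(assignment.get(nbr) != c for nbr in adjacency[region]):
                assignment[region] = c
                if backtrack(i + 1):
                    return True
                del assignment[region]
        return False

    return assignment if backtrack(0) else None

# A small made-up map: four mutually adjacent regions plus one more.
example = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["A", "B", "C", "E"],
    "E": ["D"],
}
print(four_color(example))
```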

The result was not without controversy. Is it a valid mathematical proof if it has not and cannot be confirmed by a human? (As an Illinois alum, I say it’s a valid proof!)

This theorem has been proved many times since 1977, so there isn’t much doubt about the result. But the first time was a heroic milestone in human knowledge.

Unfortunately, much about this major human accomplishment is effectively lost to historical and intellectual analysis.

It was written in IBM assembler, and punched on Hollerith cards. (You youngsters can look those things up.) I know that Prof. Appel still had the deck twenty-five years ago because he showed it to me, in a box propping open the door. Even back then there was no way to run the program (no card readers left, nor any IBM 360s).

So there are many questions that cannot be answered. Was the original program correct? What was their coding style like? Is the code pretty or clever? And so on.

We don’t know, and it would be difficult to find out.

Still. All hail Haken and Appel, computational heroes! We are not worthy!

Giants still walk among us!

Alma Mater


  1. K. Appel and W. Haken, Every planar map is four colorable. Part I: Discharging. Illinois J. Math., 21 (3):429-490, 1977. https://projecteuclid.org:443/euclid.ijm/1256049011
  2. K. Appel, W. Haken, and J. Koch, Every planar map is four colorable. Part II: Reducibility. Illinois J. Math., 21 (3):491-567, 1977. https://projecteuclid.org:443/euclid.ijm/1256049012
  3. Samantha Jones, Celebrating the Four Color Theorem, in College of Liberal Arts – News. 2017, University of Illinois Urbana. http://www.las.illinois.edu/news/article/?id=23831&/news//news/2017/fourcolortheorem17/

 

HFOSS – Humanitarian Free and Open Source Software

Open source software is a good thing, and humanitarian applications are a good thing, too.

So Humanitarian Free and Open Source Software should be a really good thing, no? It’s even got an acronym, HFOSS.

This fall, Gregory W. Hislop and Heidi J. C. Ellis discuss a related point, the potential value of Humanitarian Open Source Software in Computing Education [1].

For one thing, any open source project is a potential arena for students to learn about real life software development. By definition, FOSS projects are open and accessible to anyone, including students. An active and successful FOSS project will have a community of people contributing in a variety of roles, and usually will have open tasks that students might well take up. In addition, the decision making process is visible, and, as Hislop and Ellis note, the history of the project is available. A sufficiently motivated student could learn a lot.

(We may skip over the question of whether FOSS projects represent best or even common practices for all software projects. I.e., FOSS isn’t necessarily a “real world” example for many kinds of software.)

Humanitarian projects are interesting for other reasons. For one thing, by definition, a successful humanitarian project of any kind is focused on solving problems for people other than programmers or college students. Simply figuring out how, and even whether, technical solutions actually help the intended beneficiaries is a valuable exercise, in my opinion.

In addition, real life humanitarian software generally addresses large scale, long term problems, with non-trivial constraints. They are excellent challenge problems, all the more so because the price point is zero dollars and the IP must be robustly open to everyone.

Hislop and Ellis make some interesting observations about ways in which these projects can be used in computing education.

They encourage thinking about all the roles in a technology project, not just coding or testing. (Hear, hear!) Documentation, planning, and, above all, maintenance not only consume most of the work effort, but are usually the difference between success and failure of a software project. Get good at it, kids!

(I’ll also point out that designing a solution involves so much more than whacking out software–you need to understand the problem from the user’s point of view.)

They also point out the value of connecting digital problem solving with an understanding of the actual, on-the-ground problems and customers. Technological glitz generally does not survive contact with the customer, especially if the customer is an impoverished mission-oriented organization. Good intentions are only the starting point for actually solving real world humanitarian problems.

This last point is actually the main distinction between FOSS and HFOSS. There is just as much practical value in participating in most FOSS projects. And, for that matter, there is a long tradition of service learning, much of it “humanitarian”. HFOSS is the intersection of these educational opportunities, and it is actually pretty tiny. Most FOSS isn’t “humanitarian”, and most human service or humanitarian problems don’t need software.

In fact, engagement with actual community organizations and initiatives is highly likely to teach students that humanitarian problems don’t have technological solutions, especially software solutions. Digital technology may be able to help, at least a little. But humanitarianism is really a human-to-human thing.

If I were supervising a HFOSS class, I would probably want to try to get the students to think about a number of philosophical points relevant to their potential careers.

First of all, students should observe the personal motivations of participants in an HFOSS project, and compare them to the motivations of people doing the same kind of work—the exact same kind of work—in other contexts (e.g., a large corporation, a personal start-up, a government agency). Working on something with the goal of making someone else’s life better is kinda not the same thing as angling for a big FU payout.

The second thing that students will need to learn is just how problematic it can be to try to help “them” solve “their” problems. However great your Buck Rogers tech might be, swooping in from on high to “fix the world” isn’t likely to garner a lot of enthusiasm from the people you mean to help. In fact, “they” may not think they need whiz-bang new software at all.

Working with real people to understand and solve real problems is rewarding. And in some cases, a bit of HFOSS might be a home run. But HFOSS for the sake of HFOSS cannot possibly succeed. And that is a lesson worth learning.


  1. Gregory W. Hislop and Heidi J. C. Ellis, Humanitarian Open Source Software in Computing Education. Computer, 50 (10):98-101, 2017. http://ieeexplore.ieee.org/document/805731

Survey of Quantum Computing Software

If we are going to build Quantum Computers (and we definitely are going to build them as soon as possible), what kind of software will we need?

I admit that my math skills aren’t really up to grokking the details of QC, but software and software tools I do understand.

There are several important QC projects already in progress at Google [4], IBM [1], Microsoft [3], and probably others. These companies even aspire to sell time on their QCs, which means they have to have some way for people to actually use them, no?

So how do you program these marvels?

This month Frederic T. Chong, Diana Franklin, and Margaret Martonosi survey the challenges of Programming languages and compiler design for realistic quantum hardware [2].

Kewwl!

They point out that QC is in a state similar to classical computing in the 1950s: hardware is rare and expensive, and every iota of performance counts. However, unlike ENIAC days, we have deep and broad “toolchains” with the concomitant abstractions and optimizing technology. The trick, then, is to work out the best ways to let humans specify programs, and compilers generate code for QC.

“To make quantum programming manageable and quantum systems practical, we must expose the appropriate set of low-level details to upper software layers. The trick is to pick the best ones, those that will allow programmers to express programs while producing software that gets the most computation out of physical machines.” ([2], p. 180)

This is what we spent several decades doing for classical computers, and then again for multicomputer systems, and along the way, for problem specific languages, such as logic design. With QC, it’s back to basics, except we have a whole lot of experience at this game now.

One interesting factor is that QC (at least today) resembles both general purpose computing (i.e., the programmer describes an algorithm) and hardware definition languages (the programmer needs to specify strict resource constraints). This is, so far, done in a “quantum co-processor” model, and most systems support QASM assembler code (about which I know nothing at all).

(Note that these early systems run on conventional computers so they, by definition, cannot simulate or debug realistic quantum programs. Eventually, the compiler and debugger will need to run on a quantum machine.)
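
To see why conventional machines run out of steam so quickly, here is a minimal numpy sketch. This is my own illustration, not from the paper: the state of n qubits is a vector of 2^n complex amplitudes, so memory explodes long before the machine sizes get interesting.

```python
import numpy as np

# Minimal state-vector sketch: build a Bell state on two qubits, then note
# how the memory cost scales. An n-qubit state needs 2**n complex amplitudes.

H = (1 / np.sqrt(2)) * np.array([[1, 1],
                                 [1, -1]])       # Hadamard gate
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])                  # controlled-NOT

state = np.zeros(4, dtype=complex)
state[0] = 1.0                                   # start in |00>

state = np.kron(H, np.eye(2)) @ state            # H on the first qubit
state = CNOT @ state                             # entangle the two qubits
print(state)   # amplitudes ~0.707 on |00> and |11>: a Bell state

# The catch: memory is 2**n amplitudes. At 16 bytes per complex number,
# fifty qubits already need about 18 petabytes.
print(2**50 * 16 / 1e15, "PB for 50 qubits")
```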

One of the interesting requirements is that the quantum physics of the computer must be enforced by the compiler. (This kind of physical constraint is true of classic computing tools as well, but we just don’t notice any more. For example, the compiler “knows” that time is discrete and runs forward, that registers have only one value at a time which does not change without external input. Etc.) In the case of QC, there is quantum weirdness that must be respected.
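
What might “the compiler enforces the physics” look like in practice? Here is a deliberately tiny sketch, with a made-up instruction format (not any real toolchain): a pass that rejects copying a qubit (no-cloning) and rejects using a qubit after it has been measured.

```python
# Toy sketch over a hypothetical straight-line IR: enforce two physical rules.
# 1) qubits cannot be copied (no-cloning);
# 2) a measured qubit cannot be used in later gates without a reset.

def check_quantum_ir(ops):
    measured = set()
    for op, *args in ops:
        if op == "copy":
            raise ValueError(f"no-cloning violated: cannot copy qubit {args[0]}")
        if op == "measure":
            measured.add(args[0])
        elif op == "gate":
            for q in args[1:]:        # args[0] is the gate name
                if q in measured:
                    raise ValueError(f"qubit {q} used after measurement")
        elif op == "reset":
            measured.discard(args[0])

program = [
    ("gate", "H", "q0"),
    ("gate", "CNOT", "q0", "q1"),
    ("measure", "q0"),
    ("gate", "X", "q0"),              # rejected: q0 was already measured
]
check_quantum_ir(program)             # raises ValueError
```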

These physical details of the hardware are not yet “virtualizable” in the way that classical computing provides a simplified, abstract virtual machine.  Perhaps that will come someday, which will be interesting to see, and also will be yet another significant contribution to human knowledge.

The Nature paper describes some mind-blowing challenges, such as Qubit “garbage collection”, and runtime checking that operations on a Qubit are kosher with respect to other Qubits that are entangled. Again, it is theoretically impossible to fully (or even close to fully) simulate the computation, so these optimizations are done through heuristics and run time checks.

The paper describes quantum programming languages that have been developed, both functional and imperative. The authors note that current languages do not support verification (e.g., specification of correctness conditions). In QC there is also a need for knobs to specify error tolerance and precision, at the level of each individual output!

Compiling quantum programs is a bit different. For one thing, QC programs are not as general as classical programs, and it is often the practice to compile the program separately for each input value. This means that the compiler knows a lot about the program path, and can, for instance, aggressively unroll loops. (This stuff is certainly a juicy problem for the compiler jocks out there!)
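
As a purely classical sketch of the “compile once per input value” idea (again, my own illustration with a made-up instruction list, not the paper’s toolchain): when the input size is known at compile time, the loops simply vanish into straight-line code.

```python
import math

# Sketch of per-input compilation: because n is known at "compile time", the
# loops disappear and the compiler emits a flat, fully unrolled list of gate
# applications (here, a QFT-like pattern of H plus controlled phase rotations).

def compile_qft_like(n):
    """Emit an unrolled, straight-line instruction list for a known n."""
    instructions = []
    for target in range(n):
        instructions.append(("H", target))
        for control in range(target + 1, n):
            # The rotation angle is a compile-time constant for this input.
            angle = math.pi / 2 ** (control - target)
            instructions.append(("CPHASE", control, target, angle))
    return instructions

# Compiling for n=3 yields a fixed sequence with no loops or branches left.
for instr in compile_qft_like(3):
    print(instr)
```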

Other interesting tasks are managing run time compilation for certain operations, and managing the interactions between a classical computer controller and the quantum coprocessor. (This sounds pretty hairy to me.)

Finally, bugs. Bugs, bugs, bugs. Whole new categories of bugs!

“[W]hen a new computer is built and tested, it will have errors. How will we distinguish between hardware and software bugs?” ([2], p. 186)

Bearing in mind that conventional computers cannot simulate quantum calculations in any meaningful sense, how do we detect bugs? I note that some of my favorite debugging techniques, such as single stepping and repeatedly running a program with slight variations will not be very useful or even possible on a QC.  Heck, you can’t even copy a value or otherwise snoop on the execution. What’s a poor programmer to do, ma?

We are going to need new and probably weirdly magical debugging techniques.

Awesome.

This is an amazing and inspiring survey. It also makes my brain hurt. :-)


  1. Davide Castelvecchi, IBM’s quantum cloud computer goes commercial. Nature, 543:159, March 6 2017.
  2. Frederic T. Chong, Diana Franklin, and Margaret Martonosi, Programming languages and compiler design for realistic quantum hardware. Nature, 549 (7671):180-187, 2017. http://dx.doi.org/10.1038/nature23459
  3. Steve Dent, Microsoft’s new coding language is made for quantum computers. Engadget, September 28, 2017. https://www.engadget.com/2017/09/26/microsoft-new-coding-language-is-made-for-quantum-computers/
  4. Masoud Mohseni, Peter Read, Hartmut Neven, Sergio Boixo, Vasil Denchev, Ryan Babbush, Austin Fowler, Vadim Smelyanskiy, and John Martinis, Commercialize quantum technologies in five years. Nature, 543:171–174, March 9 2017.

Videla On “Metaphors We Compute By”

Alvaro Videla writes this summer about “Metaphors We Compute By” [1]. His title is a nod to Lakoff and Johnson’s influential “Metaphors We Live By” (1980). This isn’t exactly a new topic; every programmer is taught the importance of metaphors, and of abstraction in general.

“Programmers must be able to tell a story with their code, explaining how they solved a particular problem. Like writers, programmers must know their metaphors.” ([1], p. 45)

Much of the article is about readability, about explaining your code to another programmer. It’s not obvious to me that this is the only or even the most important use of metaphors by computer programmers, but clarity is certainly a good thing.

My own view is that a good metaphor is probably just as important to help understand your own code. In addition, as Videla points out, using a good abstraction (metaphor) can activate existing knowledge to solve the problem. For example, treating a problem as a “graph” opens the way to apply the vast mathematical theory of graphs, as well as high quality code libraries to manipulate graphs.
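
A tiny, self-contained Python sketch of that point (the module names are made up): once “who depends on whom” is framed as a graph, a stock algorithm like breadth-first search answers useful questions with no new theory at all.

```python
from collections import deque

# Frame "who depends on whom" as a graph; then a textbook BFS answers
# "which modules are affected if X changes?" for free.

dependencies = {                      # edge A -> B means "B depends on A"
    "parser": ["typechecker"],
    "typechecker": ["codegen"],
    "codegen": [],
    "logger": ["parser", "codegen"],
}

def affected_by(module, graph):
    """Everything reachable from `module`, i.e. impacted by a change to it."""
    seen, queue = set(), deque([module])
    while queue:
        node = queue.popleft()
        for dependent in graph.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

print(affected_by("logger", dependencies))   # {'parser', 'typechecker', 'codegen'}
```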


I’m not sure I really like his “programming is storytelling” metaphor very much. It is true that “a program is just another succession of bits” and a programmer “gives meaning to those bits”. But metaphor is only one of many forms of abstraction programmers may use, unless you define all meaning as “metaphor”. (That’s a useful, if circular, concept.)

But, is programming really just another form of storytelling? I dunno about that. And Videla takes this metaphor pretty far when he suggests, for example,

“Types are the characters that tell the story of your program; without types, you just have operations on streams of bytes.” ([1], p. 45)

Well, maybe, but these characters are literally inhuman, and often “one dimensional”. And, I might add, great mischief can ensue from imputing human emotions or motives to code.


Also, I think Videla overlooks one really interesting point. Computer programs are a very different kind of story. While you may think about a program as conveying the story of how to solve the problem to other programmers (and to yourself), it is more than that. It is executable. And it is expressed in code, which is more-or-less a formal mathematical representation of the story.

Thus, we programmers think and talk about “queues” or “lists”, but these are converted into an unambiguous sequence of logical operations. The former may help the puny Carbon-based life forms grok the story, but the latter is not only executable in a way that mere words never can be, it is a whole new level of meaning.

An executable program lends all sorts of fascinating properties to the “story”. A program can be measured for size, speed, and efficiency. Two programs can be compared in interesting quantitative ways.

And a program can be found to be correct or incorrect! Try to do that with human language!

In short, if computer programs are explained by metaphors, they also explain the metaphor in a unique way. Not only does the metaphor of “a queue” help us understand a program, the program itself is a detailed, formal explanation of what that metaphor means.

Cool!  No wonder I find programming much more interesting than storytelling.


Videla finishes off with a great exhibition of how deeply embedded (yet another metaphor) and almost unnoticed metaphors are in computer systems.

“Whenever nodes need to agree on a common value, a consensus algorithm is used to decide on a value. There’s usually a leader process that takes care of making the final decision based on the votes it has received from its peers. Nodes communicate by sending messages over a channel, which might become congested because of too much traffic. This could create an information bottleneck, with queues at each end of the channels backing up. These bottlenecks might render one or more nodes unresponsive, causing network partitions. Is the process that’s taking too long to respond dead? Why didn’t it acknowledge the heartbeat and trigger a timeout?” ([1], p. 45)


  1. Alvaro Videla, Metaphors we compute by. Communications of the ACM, 60 (10):42-45, 2017. http://queue.acm.org/detail.cfm?id=3127495

 

Side effects – Xtreme End-to-End thinking!

I think that one of the most important habits of good design and engineering is end-to-end thinking. Solving problems is fine, but they need to be the right problems, and that means the solutions need to reach all the way to the actual ends of the system.

Pat Helland writes this month about an even funkier, more extreme version of this principle: Side-Effect Thinking.

He focuses on digital transactions, which are the backbone of the digital world.

The basic observation is that digital systems are composed of many pieces and layers which interact through APIs that are specifically designed to hide the TMI. These systems are also connected to real world systems, which may act in response to a transaction.

The main point is, of course,

“One system’s side effect is another’s meat and potatoes.” ([1], p. 39)

The whole idea of APIs is to hide how the system works. By design, we have no idea exactly what happens when we place an order, or change an order, or whatever. Lots of stuff goes on behind the scenes, and we neither know nor need to know about it.

Helland points out that a side effect of this design strategy is that a transaction may have knock-on effects much wider than we might guess. He describes a scenario of reserving a hotel room. This may trigger the hotel to increase staff for that period, to order additional supplies, to reschedule maintenance, and so on. Suppliers may respond with new orders and reserve capacity. And so on. All I asked for is a room!

It becomes more interesting when you cancel the room request. In one sense, the transaction is reversed, and all effects undone—in the reservation systems. But the side effects are not and some cannot be undone.

“Cancelled” transactions don’t undo side effects.

He also points out that “idempotent” operations are idempotent only from the viewpoint of the transaction system. Each execution may produce the same result, but each will have side effects. If nothing else, the traffic will be logged (many times and at many levels of the system).
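
A toy sketch of Helland’s point (hypothetical names, not his code): the visible result of an “idempotent” call is stable no matter how many times you retry it, but every execution still leaves side effects behind.

```python
import time

audit_log = []        # stands in for the "hidden" logging subsystem
reservations = {}     # the visible, transactional state

def reserve_room(reservation_id, guest):
    """Idempotent from the caller's viewpoint: repeating the call returns the
    same reservation. But every call still appends to the audit log."""
    audit_log.append((time.time(), "reserve", reservation_id))   # side effect
    if reservation_id not in reservations:
        reservations[reservation_id] = {"guest": guest, "status": "booked"}
    return reservations[reservation_id]

# Retry the "same" request three times (say, after network timeouts).
for _ in range(3):
    reserve_room("R-42", "Alice")

print(len(reservations))   # 1 -- the visible state is unchanged
print(len(audit_log))      # 3 -- the side effects piled up anyway
```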

(Historic note: in the early days of the web, it wasn’t uncommon for web servers to swamp their logging and other “hidden” systems, even when there were only a few pages that never changed. The information delivered was trivial; all the work was side effects.)

As I said, all this stems from the completely sound and reasonable design principle of information hiding. It would be impossible to build distributed systems any other way. On the other hand, designers implementing the system would do well to think about side effects that the callers might cause.

I would note that side-effects like those described by Helland have been the source of break-ins and denial of service attacks. Any of the scenarios in the article could, if repeated many times, become a denial of service attack. Some of them would be weird and unprecedented attacks, such as creating local petrol shortages by spamming hotel reservation systems.

Side effects and their ramifications will be even more problematic as the Internet of Things deploys. I don’t really know what the IoT will actually be, but most visions of it imagine that everyday actions will trigger waves of automatic transactions among the “Things”, which will also cause real world actions. TMI to the TMth power!

Following the spirit of Helland’s scenarios, imagine that when you cancel your hotel reservation (triggering mischief in France and memory fragmentation in some database), your home infers that you will be home that day instead of away. The “smart home” had planned to stand down many systems and conduct maintenance, but now it anticipates what you will need, and proactively purchases extra electricity and utilities, cancels a maintenance visit, and orders food. These orders in turn trigger a wave of transactions and just-in-time orders by the services. And so on.

If this sounds like a chaotic system, it might very well be.

Remind me again why the IoT is a good idea?


  1. Pat Helland, Side effects, front and center. Communications of the ACM, 60 (7):36-39, 2017. http://queue.acm.org/detail.cfm?id=3099561

Orchestrating Internet of Things Services

Zhenyu Wen and colleagues write in IEEE Internet Computing about “Fog Orchestration for Internet of Things Services” [1].

Don’t you think “Fog Orchestra” is a great name for a band?

After laughing at the unintentionally funny title, I felt obliged to read the article.

The basic topic is the “Internet of Things”, i.e., “sensors, devices, and compute resources within fog computing infrastructures” ([1], p. 16). As Arieff quipped, this might be called “The Internet of Too Many Things”.

Whether this is a distinct or new technology or architecture is debatable, but the current term of art, “fog computing”, is, for once, apt. It’s kind of like Cloud Computing, only more dispersed and less organized.

Wen and colleagues are interested in how to coordinate this decentralized fog, especially how to get things done by combining lots of these little pieces of mist. Their approach is to create a virtual (i.e., imaginary) centralized control, and use it to indirectly control pieces of the fog. Basically, the fog and its challenges are hidden by their system, giving people and applications a simpler view and straightforward ways to make things happen. Ideally, this gives the best of both worlds: the flexibility and adaptability of fog, and the pragmatic usability of a monolithic application.

(Pedantic aside: almost anything that is called “virtual” something, such as “virtual memory” or a “virtual machine” or a “virtual private network”, is usually solving this general problem. The “virtual” something is creating a simpler, apparently centralized, view for programmers and people, a view that hides the messy complexity of the underlying system.

Pedantic aside aside: An exception to this rule is “Virtual Reality”, which is “virtual” in a totally different way.)

The authors summarize the key challenges, which include:

  1. scale and complexity
  2. security
  3. dynamicity
  4. fault detection and handling

This list is pretty much the list of engineering challenges for all computing systems, but solving them in “the fog” is especially challenging because it is loosely connected and decentralized. I.e., it’s so darn foggy.

On the other hand, the fog has some interesting properties. The components of the system can be sprinkled around wherever you want them, and interconnected in many ways. In fact, the configuration can change and adapt, to optimize or recover from problems. The trick, of course, is to be able to effectively use this flexibility.

The researchers refer to this process as “orchestration”, which uses feedback on performance to optimize the placement and communication of components. They envision various forms of machine learning to automatically optimize the huge number of variables and to advise human operators. This isn’t trivial, because the system is running and the world is changing even as the optimization is computed.
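
Here is a deliberately tiny sketch of what such a feedback loop might look like. This is my own guess at the shape of it, with made-up node and component names, not the authors’ system: measure load, re-place components on the least-loaded nodes, repeat.

```python
import random

# Toy orchestration loop: components are greedily re-placed onto the
# least-loaded fog node each time fresh load measurements arrive. Real
# orchestrators add constraints, prediction, and machine learning, but the
# feedback shape is similar.

nodes = ["gateway-1", "gateway-2", "edge-server"]
components = {"sensor-ingest": 0.4, "filter": 0.2, "aggregator": 0.7}
placement = {}

def measure_load():
    """Pretend telemetry: background noise plus the load of placed components."""
    load = {n: random.uniform(0.0, 0.3) for n in nodes}
    for comp, node in placement.items():
        load[node] += components[comp]
    return load

for step in range(5):                                    # the feedback loop
    load = measure_load()
    for comp, cost in sorted(components.items(), key=lambda kv: -kv[1]):
        target = min(nodes, key=lambda n: load[n])       # least-loaded node
        placement[comp] = target
        load[target] += cost                             # account for the move
    print(f"step {step}: {placement}")
```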

I note that this general approach has been applied to optimizing large scale systems for a long time. Designing networks and chips, optimizing large databases, and scheduling multiprocessors use these kinds of optimization. The “fog” brings the additional challenges of a leap in scale, and a need for continuous optimization of a running system.

This is a useful article, and has a great title!


  1. Zhenyu Wen, Renyu Yang, Peter Garraghan, Tao Lin, Jie Xu, and Michael Rovatsos, Fog Orchestration for Internet of Things Services. IEEE Internet Computing, 21 (2):16-24, 2017. https://www.computer.org/internet-computing/2017/05/05/fog-orchestration-for-internet-of-things-services/