Category Archives: Digital Humanities

Four Colors Still Suffice

This fall marks the 40th anniversary of the publication of the first proof of the Four Color Map Problem.

Two professors at the University of Illinois used the relative abundance of computing power at UIUC to produce a groundbreaking computer-assisted proof of this perennial question.

I remember very well getting my issue of Scientific American, and there it was:

I knew immediately what it must mean. (As any Illini can tell you, there is a postal substation in the Math building. They arranged a special postal cancellation to mark the announcement.)

The essence of their 1977 proof is to enumerate all possible layouts, and systematically crunch through them all [1, 2]. For their proof they dealt with some 1400 configurations, which took months to process on the IBM 360. Today, you can probably do it in minutes on your phone. Then it took a special allocation of time on “the computer”.
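The flavor of that brute-force approach can be conveyed with a toy sketch. To be clear, this is not the Appel–Haken technique, and the map is invented; it just shows what "systematically crunch through the possibilities" looks like in code.

```python
# A toy illustration of exhaustive search for a four-coloring.
# This is NOT the Appel-Haken proof method, and the map below is
# made up; it only conveys the flavor of "crunching through" cases.

def four_color(adjacency):
    """Backtracking search: assign colors 0-3 so that no two
    bordering regions share a color. Returns a dict or None."""
    regions = list(adjacency)
    colors = {}

    def assign(i):
        if i == len(regions):
            return True
        region = regions[i]
        for color in range(4):
            # Try this color only if no already-colored neighbor has it.
            if all(colors.get(n) != color for n in adjacency[region]):
                colors[region] = color
                if assign(i + 1):
                    return True
                del colors[region]  # backtrack
        return False

    return colors if assign(0) else None

# Four mutually bordering regions: the worst case, forcing all four colors.
toy_map = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C"},
}
coloring = four_color(toy_map)
```

Of course, a search like this over one small map proves nothing general; Appel and Haken's insight was reducing *all* planar maps to a finite set of configurations that could each be checked this way.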

The result was not without controversy. Is it a valid mathematical proof if it has not and cannot be confirmed by a human? (As an Illinois alum, I say it’s a valid proof!)

This theorem has been proved many times since 1977, so there isn’t much doubt about the result. But the first time was a heroic milestone in human knowledge.

Unfortunately, much about this major human accomplishment is effectively lost to historical and intellectual analysis.

It was written in IBM assembler, and punched on Hollerith cards. (You youngsters can look those things up.) I know that Prof. Appel still had the deck twenty-five years ago, because he showed it to me in a box propping open a door. Even back then there was no way to run the program (no card readers left, nor any IBM 360s).

So there are many questions that cannot be answered. Was the original program correct? What was their coding style like? Was it pretty or clever code? And so on.

We don’t know, and it would be difficult to find out.

Still. All hail Haken and Appel, computational heroes! We are not worthy!

Giants still walk among us!

Alma Mater

  1. K. Appel and W. Haken, Every planar map is four colorable. Part I: Discharging. Illinois J. Math., 21 (3):429-490, 1977.
  2. K. Appel, W. Haken, and J. Koch, Every planar map is four colorable. Part II: Reducibility. Illinois J. Math., 21 (3):491-567, 1977.
  3. Samantha Jones, Celebrating the Four Color Theorem, in College of Liberal Arts – News. 2017, University of Illinois Urbana.


“Games For Change” 2017 Student Challenge

And speaking of mobile apps with a social purpose….

The upcoming annual Games For Change (G4C) meeting has a lot of interesting stuff, on the theme “Catalyzing Social Impact Through Digital Games”. At the very least, this gang is coming out of the ivory tower and up off their futons, to try to do something, not just talk about it.

Part of this year’s activities is the Student Challenge, which is a competition that

“invites students to make digital games about issues impacting their communities, combining digital storytelling with civic engagement.”

This year’s winners were announced last month, from local schools and game jams in NYC, Dallas, and Pittsburgh. (Silicon Valley, where were you?) Students were asked to invent games on three topics,

  • Climate Change (with NOAA),
  • Future Communities (with Current by GE), and
  • Local Stories & Immigrant Voices (with National Endowment for the Humanities).

Eighteen winners were highlighted.

The “Future Communities” games are mostly lessons on the wonders of “smart cities”, and admonitions to clean up trash. One of them has a rather compelling “heartbeat” of carbon emissions, though the game mechanics are pretty obscure: doing anything, or doing nothing at all, increases carbon. How do I win?

The “Climate Change” games also advocate picking up trash, as well as planting trees. There is also a quiz, and an Antarctic Adventure (though nothing even close to “Never Alone”).

The “local stories” and “immigrant voices” games tell stories about immigrants, past and present. (These kids are from the US, land of immigration.) There are two alarming “adventures” that sketch how to illegally enter the US, which is a dangerous undertaking with serious consequences. Not something I like to see “gamified”.

Overall, the games are very heavy on straight story telling, with minimal game-like features. Very much like the “educational games” the kids no doubt have suffered through for years. And not much like the games everyone really likes to play. One suspects that there were teachers and other adults behind the scenes shaping what was appropriate.

The games themselves are pretty simple technically, which is inevitable given the short development time and low budgets. The games mostly made the best of what they had in the time available.

I worry that these rather limited experiences will give the students a false impression of both technology and story telling. The technology used is primitive, they did not have realistic market or user testing, and the general game designs are unoriginal. That’s fine for student projects, but not really a formula for real world success, and has little to do with real game or software development.

Worse, the entire enterprise is just talking about the problems, not solving them. One game, or 10,000 games, that tell you (again) to pick up trash doesn’t get the trash picked up. If you want to gamify neighborhood clean-up, you will need to tie it to the actual physical world, e.g., a “trashure hunt”, with points for cleaning up and preventing litter.

These kids did a super job on their projects, but I think the bar was set far too low. Let’s challenge kids to actually do something, not just make a digital story about it. How would you use game technology to do it? I don’t know. That’s what the challenge is.

  1. Games for Change, Announcing the winners of the 2017 G4C Student Challenge, in Games For Change Blog. 2017.


“The technology of touch”

I have frequently blogged about haptics (notably prematurely declaring 2014 “the year of remote haptics”), which is certainly a coming thing, though I don’t think anyone really knows what to do with it yet.

A recent BBC report  “From yoga pants to smart shoes: The technology of touch”  brought my attention to a new product from down under, “Nadi X”, “fitness tights designed to correct your form”. Evidently, these yoga pants are programmed to monitor your pose, and offer subtle guidance toward ideal position via vibrations in the “smart pants”.

(I can’t help but recall a very early study on activity tracking, with the enchanting title, “What shall we teach our pants?” [2]  Apparently, this year the answer is, “yoga”.)

Source: Wearable Experiments Inc.

It’s not totally clear how this works, but it is easy to imagine that the garment can detect your pose, compute corrections, and issue guidance in the form of vibrations from the garment. Given the static nature of yoga, detecting and training for the target pose will probably work, at least for beginners. I’d be surprised if even moderately experienced practitioners would find this much help, because I don’t know just how refined the sensing and feedback really will be.  (I’m prepared to be surprised, should they choose to publish solid evidence about how well this actually works.)

Beyond the “surface” use as a tutor, the company suggests a deeper effect: it may be that this clothing not only guides posture but can create “a deeper connection with yourself”. I would interpret this idea to mean, at least in part, that the active garment can promote self-awareness, especially awareness of your body.

I wonder about this claim. For one thing, there will certainly be individual differences in perception and experience. Some people will get more out of a few tickles in their trousers than others do. Other people may be distracted or pulled away from sensing their body by the awareness of their garment groping them.

Inevitably, touch is sensual, and quickly leads to, well, sex. I’m too old not to be creeped out by the idea of my clothing actively touching me, especially under computer control. Even worse, when the computer (your phone) is connected to the Internet, so we can remotely touch each other via the Internet.

Indeed, the same company that created Nadi X created a product called “fundawear” which they say is, “the future of foreplay” (as of 2013).  Sigh. (This app is probably even more distracting than texting while driving….)

Connecting your underwear to the Internet—what could possibly go wrong? I mean, everything is private on your phone, right?  No one can see, or will ever know. Sure.

I’m pretty sure fundawear will “work”, though I’m less certain of the psychological effects of this kind of “remote intimacy”. Clearly, this touching is to physical touching as video chat is to face-to-face conversation. Better than nothing, perhaps, but most people will prefer to be in person.

Looking at the videos, it is apparent that the haptics have pretty limited variations. Only a few areas can buzz you, and the interface is pretty limited, so there are only so many “tunes” you can play. The stimulation will no doubt feel mechanical and repetitive, and probably won’t wear very well. Sex can be many things, but it shouldn’t become boring.

(As a historical note, I’ll point out that, despite their advertising claims, this is scarcely the first time this idea has ever been done. The same basic idea was demonstrated by MIT students no later than 2009 [1], and I’ll bet there have been many variations on this theme.  And the technology is improving rapidly.)

This is a very challenging and interesting area to explore. After following developments for the last decade and more, I remain skeptical about how well any sensor system can really communicate body movement beyond the most trivial aspects of posture.

My own observation is that an interesting source of ideas comes from the intersection of art and wearable technology. In this case, I argue that, if you want to learn about “embodied” computing, you really should work with trained dancers.

For example, you could do far worse than considering the works of Sensei Thecla Schiphorst, a trained computer scientist and dancer, whose experiments are extremely creative and very well documented [4].

One of the interesting points that I have learned from Sensei Thecla and other dancers and choreographers is how much of the experience of movement is “inside”, and not easily visible to the computer (or an observer). I.e., the “right” movement is defined by how it feels, not by the pose or path of the body. Designers of “embodied” systems need to think “from the inside out”, to quote Schiphorst.

In her work, Schiphorst has explored various “smart garments” which reveal and augment the body and movement of one person, or connect to the body of another person.

Since those early days, these concepts have appeared in many forms, some interesting, and many not as well thought out as Sensei Thecla’s work.

  1. Keywon Chung, Carnaven Chiu, Xiao Xiao, and Pei-Yu Chi, Stress outsourced: a haptic social network via crowdsourcing, in CHI ’09 Extended Abstracts on Human Factors in Computing Systems. 2009, ACM: Boston, MA, USA. p. 2439-2448.
  2. Kristof Van Laerhoven and Ozan Cakmakci. What shall we teach our pants? In Digest of Papers. Fourth International Symposium on Wearable Computers, 2000, 77-83.
  3. Thecla Schiphorst, soft(n): toward a somaesthetics of touch, in Proceedings of the 27th international conference extended abstracts on Human factors in computing systems. 2009, ACM: Boston, MA, USA.
  4. Thecla Henrietta Helena Maria Schiphorst, The Varieties of User Experience: Bridging Embodied Methodologies from Somatics and Performance to Human Computer Interaction, in Center for Advanced Inquiry in the Integrative Arts (CAiiA). 2009, University of Plymouth: Plymouth.

Bonus video: Sensei Thecla’s ‘soft(n)’ [3].  Exceptionally cool!


D-Lib Magazine on Software Preservation

The July issue of D-Lib Magazine (“The Magazine of Digital Library Research”) is dedicated to software preservation, with several useful (if depressing) articles.

As I have frequently pointed out, almost all contemporary scholarship and science depends on digital software and data. If we wish to continue the traditional mandate to publish and preserve our studies for others to replicate, critique, and build on, then we must (somehow) publish the digital artifacts they are built on.

The special issue of DLib has good coverage of these challenges, but not much in the way of solutions. This is very hard for many reasons, and resources are scarce. Academic libraries are the front lines of this quixotic mission.

Fernando Rios of Johns Hopkins University describes a “planning” tool, which describes strategies for preserving software. This article is short on solutions, but is a pretty good survey of the things you need to think about. It also has a useful collection of references at the end.

He identifies some of the key challenges, including “identifying and capturing metadata, dependencies, support for attribution and citation, infrastructure development, and developing appropriate workflows to enable service provision.” He references other works that discuss, for example, “How to cite and describe software” and “Minimal information for reusable scientific software” and so on.

Think about it. If you needed to record exactly how you did your work, whatever it is, what would you need to write down? In the case of digital software and data, what exactly would you need to tell people in order to give them enough to really check your work?

My own view is that this problem is predicated on the thorny issue that we don’t know how to describe software, let alone explain which parts are “important”. Software is one of the most complex artifacts humans have ever created, and every bit of software is interconnected with and depends on other software. (Even isolated systems depend on the software that was used to create them.)

Where should the circle be drawn to delineate “my software” from all the rest?

Furthermore, the dependencies are difficult to describe, if they are even known. Ideally, I should know that my software used a particular library, and made specific calls to it. But it is difficult to know what the library did, or what software it depended on, or even where it may have executed. It is also unlikely that we can say how the dependent software was created, and we may not know exactly how it works.
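To make the point concrete, here is a minimal sketch (my own illustration, not a tool from the D-Lib articles) of recording the one layer of dependencies Python will readily tell you about: the packages installed in the current environment. Everything below that layer (the interpreter's own build, the OS libraries, the hardware) remains unrecorded, which is exactly the problem.

```python
# A minimal provenance sketch (illustration only): record the
# packages visible to this Python environment. Note that it captures
# only the topmost layer of the software stack.

import sys
from importlib import metadata

def environment_snapshot():
    """Return the interpreter version and installed packages, pinned."""
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
    )
    return {"python": sys.version, "packages": packages}

snapshot = environment_snapshot()
```

Even this trivial record goes stale the next time anything is upgraded, and it says nothing about which calls were actually made, which is the deeper question raised above.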

Given the complexity and “size” of software, it isn’t even reasonable to expect any person to know these things.

Yet, a full description of my research results requires a description of these crucial tools, which I can’t really do.

Everyone knows that digital artifacts have the maddening properties of being evanescent, yet impossible to delete. Even if I describe the digital tools I used today, they will no longer exist tomorrow in precisely the same form. In many cases, it isn’t even meaningful to talk about “replicating” digital work, because it is essentially a flow of events that will never be repeated exactly.

Archivists tackle these problems by recording snapshots of digital systems, and attempting to store away a “working copy”. As the world changes out from under it, this requires virtualization and emulation, creating replayable copies of whole systems.

Virtualization/emulation is a heroic enterprise, no doubt. But is this really achieving the goal of preservation? Yes, you can recreate some form of the original work, but can you reuse it? Can you even understand it? (I have seen far too many cases where a piece of software only works on one peculiar virtual machine that no one can figure out or recreate. This quickly degenerates into a magic black box, which is intellectually unsatisfactory and probably wrong, too.)

Other articles give yet more challenges, including the difficulty of getting the people who understand (more or less) the objects to do the work to describe and record them. In addition, various legal licenses place roadblocks in this path, as do persnickety editors and reviewers. And, of course, none of this hard work is covered in the “deliverables” that sponsors are willing to pay for.

The life of an archivist is difficult and poverty stricken. The life of a digital archivist is impossible.

Enough ranting for now.

  1. Neil Chue Hong, Minimal information for reusable scientific software. 2014.
  2. Mike Jackson, How to cite and describe software. The Software Sustainability Institute, Edinburgh, 2012.
  3. Fernando Rios (2016) The Pathways of Research Software Preservation: An Educational and Planning Resource for Service Development. D-Lib Magazine, 10.1045/july2016-rios.

Digital Preservation is Hard

Actually, any kind of historical or archival preservation is hard: in general, the “significance” of an entity lies in a complex social context.

As I have often remarked, preserving Henry Aaron’s uniform, however carefully done, cannot really portray the significance of his career, his significance in American race relations, what it meant for a black man to break Babe Ruth’s record—in Atlanta!—in the 1970s, and so on.

These problems are even more acute in the case of “digital preservation”. Preserving a collection of bits is difficult enough, and rarely sufficient to be particularly meaningful.

I often point out that my own academic work from the 1990s is mostly “preserved” somewhere, including images of official thesis papers. Many of the files are difficult to actually process (Postscript Version 1.x is not supported by contemporary tools!), and the software is effectively junk: the technical context and software stack no longer exist. (How many of you even know what “Xenix” was?) These details were not documented at the time, so who knows what would be needed to recreate the context? If you even wanted to, which no one does.

So the work is basically gone, except for whatever words and pictures I may have left behind.

David P. Anderson writes in Communications of the ACM about these same challenges when trying to “preserve” digital art works, which he correctly terms “Hybrid Objects” [1]. By this term, he means that the overall “object” of interest has both digital and other components. As he says, “artworks are not ‘only’ digital…and any preservation approach that does not acknowledge this is doomed to failure.” (p. 45)

To back up a bit, why would we even care about “preserving” digital works of any kind?

The primary reason must surely be a desire to make the culture and knowledge of the past (and present) available to future generations. It is impossible to preserve everything, and equally impossible to know what people in the future will wish we had preserved. But surely it is worth trying to preserve “significant” things, so that they might be available in the future.

In the case of digital artworks, or, I would say, most digital “objects”, the entire point is how the user (?) experiences and interacts with it. Art projects may now employ hardware and software which can, in principle, be preserved and recreated in the future. But they also may employ data from live streams, from the Internet (e.g., search results) or social media. These works may visualize contemporary life in ways that are extremely interesting or beautiful, but totally ephemeral and irreproducible. Furthermore, digital artworks can be interactive, enabling the “user” or users to participate in the expression.

Anderson’s main point is that “any inclination one may have to believe that preserving artworks is primarily a matter of developing an appropriate set of software tools and workflows, is quite mistaken.” (p. 46) He is also right on target to say that “This is a lesson that is well worth extending into other areas of preservation.” (p.46)

Actually, I would go farther to say that these problems are at the core of all efforts at cultural preservation, digital or otherwise.

Anderson’s thoughtful article also illustrates the value of cross discipline collaborations. As he indicates, his understanding of the field of digital preservation was enlightened by “Working with contemporary artists”, and “spend[ing] time exploring preservation issues” with artists.

My own experience reflects the value of these collaborations, not only for understanding preservation. If you want to understand human computer interaction, you would do well to learn from performing artists (as Brenda Laurel did so long ago [2]). And if you want to build embodied systems, then you really need to work with expert dancers [3].

  1. David P. Anderson, Preserving hybrid objects. Commun. ACM, 59 (5):44-46, 2016.
  2. Brenda Laurel, Computers as Theatre, Reading, MA, Addison Wesley, 1991.
  3. Mary Pietrowicz, Robert E. McGrath, Guy Garnett, and John Toenjes, Multimodal Gestural Interaction in Performance, in Whole Body Interfaces Workshop at CHI 2010. 2010: Atlanta.


“Sonnet Signatures” by Nicholas Rougeux

Last month Nicholas Rougeux released a series of posters titled “Sonnet Signatures”, which are computer-generated visualizations of William Shakespeare’s 154 sonnets. Each visualization is unique, and they all look like abstract calligraphic symbols. Overall, they have a remarkably “Chinese” or “Japanese” look to them, which is a marvelous, if ahistorical, association for Will’s poetry.

Let’s be clear: these glyphs are very abstract, and have no obvious semantic relation to the texts, their oral performance, or their emotional significance. If anything, I would say that the stark, pristine strokes are the antithesis of the deep, passionate emotional content of the sonnets.

I’m no expert on poetry or these sonnets, but I do know this: these works were created to be oral recitations. The ink on paper, and now the pixels on a screen, are not in any way the intended representation of the work, they are just mnemonics for oral interpretations.

By extension, the visualizations by Rougeux are yet another irrelevant representation; in fact, being based on the written text, they are a second-order irrelevancy. It would make more sense to visualize a digital recording of an oral reading of the sonnets, no?

In any case, curiosity drove me to wonder what techniques were used to generate these striking representations, however misguided. I was not surprised that the technique is simple and shallow. (“Simple and shallow” is not necessarily a bad thing for an algorithm, especially if it yields interesting results.)

The actual technique does simple arithmetic, coding each letter as a number (a =1, b =2, etc.), and computing the average numeric value for each line of the sonnet. So, each sonnet should have 14 numbers, representing each line. These numbers are then visualized by plotting them and tracing a line through the points in order of the poem. The line is rendered to suggest a calligraphic stroke, fat at the start and steadily thinning to disappear at the last.

It’s just that simple.

Looking at the method, we can see immediately that each visualization will be different, though this represents only the effect of the encoding scheme, not anything particular about the words themselves, or the overall poem. His comment that “No two are the same—or even similar” is vacuous.

It is also clear that this encoding has nothing to do with meaning, in any sense of that word. The specific letters used to spell out the words are arbitrary in the first place (English spelling is not phonetic and generally chaotic), and the chosen encoding (“a” = “1”) is a further level of arbitrariness.  (“a” could equal any number you want, right?) Taking the average of these meaningless numbers yields a meaningless number, and one that tosses away a lot of information contained in the encoding.

The word “any” has the “average” (1+14+25)/3 ≈ 13.33, which equals “m” and a third. In turn, the word “some” has an average of (19+15+13+5)/4 = 13.0, which equals “m”. These numbers are pure nonsense.
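The scheme as described takes only a few lines of code. (This is my reconstruction from the description above; Rougeux's actual implementation may differ in details, such as how punctuation is handled.)

```python
# Reconstruction of the letter-average encoding described above
# (a=1, b=2, ..., z=26; non-letters ignored). Details of Rougeux's
# actual implementation are assumed.

def line_value(line):
    """Average numeric value of the letters in one line of text."""
    values = [ord(c) - ord("a") + 1 for c in line.lower() if c.isalpha()]
    return sum(values) / len(values)

def sonnet_signature(sonnet_text):
    """One number per line: the 14 points traced by the glyph."""
    return [line_value(line)
            for line in sonnet_text.splitlines() if line.strip()]

avg_any = line_value("any")    # (1 + 14 + 25) / 3
avg_some = line_value("some")  # (19 + 15 + 13 + 5) / 4
```

Seeing it written out makes the information loss obvious: an entire line of iambic pentameter is collapsed to a single mean, and any permutation of the same letters yields the same point.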

Now, there are ways that this idea could be done with much more respect for the actual poetry. First of all, sonnets are composed with great attention to syllables (not letters). Perhaps one could assign numeric values to syllables, and compute a representative number for each line.

Furthermore, the lines are rhythmic, with different emphasis on the syllables, and with rhymes. I note that people have been diagramming these structures for centuries, so there are definitely ways to encode these artifacts. Wouldn’t it make more sense to apply the visualization to the poetic pattern, rather than the meaningless atomic letters?

Even better, shouldn’t the input be a digitized voice, reciting the poems? Speech analysis is much more difficult than trivial text analysis, but I’m sure that a reading could be represented by a handful of interesting features, which could then be visualized. (It would be even cooler to generate the visualization in real time, as the recording plays. Watch the glyph unfold as the reader speaks the lines.)

Am I being unfair to Rougeux? Am I taking his work out of context, applying inappropriate technical demands, and generally being a pest?

I don’t think so.

He admits that this technique is shallow, but he thinks the results are provocative and interesting.

“Connections between the shape and the meaning of a sonnet is coincidental but a welcome interpretation. The signatures are not meant to assign meaning but to inspire others to think about them differently than before.”

Well, he inspired me to think about this stuff more carefully, so he has succeeded, and I am following his intentions.

He also suggests that,

“What’s more interesting to consider is the hidden shapes revealed by looking at centuries-old poetry through a different lens.”

I don’t know about that. These shapes aren’t so much “hidden” and “revealed” as just plain made up.

One more point.

These visualizations are actually an interesting device to illustrate the way that simple arithmetic on digital text can generate “signatures” that distinguish texts from each other. This, of course, is the principle underlying checksums (used for error checking), digital hashing (used in cryptography and digital signatures), compression and encryption.

For instance, you can immediately see how these visualizations form a digital fingerprint for each sonnet. We can immediately tell one from another by comparing the glyphs (the 14 numbers in each), and we can reject a fake sonnet or a modified version, because its signature will not match any of the known glyphs.

This principle underlies important technologies including passwords and passphrases, document verification, and even cryptocurrencies such as Bitcoin. These visualizations are quite attractive, and they show the concept quite nicely. In other words, this is a nice approach, though not necessarily for the reasons Rougeux suggests.
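For comparison, a cryptographic hash plays the same fingerprinting role far more robustly than a 14-number average: change a single word and the digest bears no resemblance to the original. A quick sketch (the "sonnet" is just a sample line):

```python
import hashlib

# Fingerprinting text with a cryptographic hash: even a one-word
# change produces a completely different digest, so a "fake sonnet"
# or a modified version is trivially detected.

original = "Shall I compare thee to a summer's day?"
tampered = "Shall I compare thee to a winter's day?"

digest_original = hashlib.sha256(original.encode("utf-8")).hexdigest()
digest_tampered = hashlib.sha256(tampered.encode("utf-8")).hexdigest()
```

The difference from the letter-average glyphs is that a hash is deliberately designed so that nothing about the text can be recovered or approximated from the fingerprint, whereas the glyphs at least preserve the per-line averages.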

“Booklet Builder”: A Good Idea for Teaching Language and Heritage

Friend and Sensei Biagio Arobba sent the news that his Booklet Builder is now available for download. (I know he’s been working to get to this stage for quite a while.)

The Booklet Builder helps with Native American language education. It is a system designed for Native American colleges, tribes, schools, and community-serving organizations, to help them create and share bilingual content.

Built on Drupal and other open software, with extensive support for multiple writing systems, BB is the current result of Arobba’s many years of work on ways to use Web tools to help communities preserve, teach, and learn their endangered languages. BB is a twenty-first-century answer to the significant challenges of publishing materials in Native American or other endangered languages.

A key feature of BB is that it is designed for “community driven content”, to let people build their own materials for formal or informal education. The flexible framework has been used in a number of projects with many collaborators, including:

  • A Living, Growing Textbook, Héċet̄u Weló Student Manual (Oglala Lakota College)
  • Content Standards (Oceti Sakowin Education Consortium)
  • Children’s Readers (Ilisagvik College in Alaska)

The same platform can be used to create games, story and song libraries, in multiple languages–lots of things.

I know that Sensei Biagio has worked for many years and through many versions of this concept, making him one of the outstanding experts in “crowdsourcing endangered languages“.

An earlier incarnation was LiveAndTell, which was a really neat social site for sharing (mainly) Lakota language multimedia.  LiveAndTell is described in some detail in the report and paper cited below.

Archived screen shot of LiveAndTell

Booklet Builder is a unique and interesting web toolkit.  Check it out.

áta čhó (I got that from the web: I hope that is an appropriate translation for ‘nice job’!)


  1. Arobba, Biagio, Robert E. McGrath, Joe Futrelle, and Alan B. Craig, A Community-Based Social Media Approach for Preserving Endangered Languages and Culture. 2010.
  2. Arobba, Biagio, Robert E. McGrath, Joe Futrelle, and Alan B. Craig, A Community-Based Social Media Approach for Preserving Endangered Languages and Culture, in “The Changing Dynamics of Scientific Collaborations” workshop at 44th Hawaii International Conference on System Sciences. 2011.