For many years I’ve advocated multidiscipline collaborations to explore digitally enhanced performing arts. For example, I have enjoyed collaborating on (or at least on the fringe of) live music and dance performances that are intimately connected with computers and networks, using sensors, signal processing, and wearable computing (e.g., [1, 7, 11]). In these collaborations, each new technical capability demands additional esthetic understanding of its artistic affordances, and the imaginative vision of artists usually pushes technology beyond current engineering (e.g., [8, 13]). In addition, this work leads to a deeper understanding of digitally mediated perception and interpersonal interaction.
As a computer scientist, I particularly value the insights I gain that lead to new and better digital systems, insights which could never be gained through conventional engineering studies. For example, if you want to build “embodied computing” interfaces, you really must work with dancers and musicians because they are trained experts in controlled, precise, and repeatable body motion. Of course, such an undertaking requires significant effort by everyone to understand divergent approaches and vocabularies, but the results can be worthwhile for both artistic and technical understandings. From my own collaborations I have developed a number of ideas about computer mediated human interactions.
This essay presents one such idea: a general framework for describing networked collaborative interaction. The idea grew out of multiplayer games and live music performance, but the key insight came to me from (non-digital) theatrical works. The next section introduces a technical problem which is encountered, among other places, in networked musical collaborations. The problem has deep connections to other fields, including multiplayer video games and, surprisingly, live theater. An understanding of these links leads to a general theory with implications for both artistic and engineering design.
The General Problem
I’m particularly interested in live, multiperson artistic expressions connected with computers and networks. Clearly, making music, dance, and theater together is one of the most human and humane activities, and therefore one of the most important potential uses of digital technology.
An artistic performance can be seen as a “single” creation of the artist(s), which is realized through performance (mediated by the performers’ bodies and technology), and perceived by an audience (through various delivery means). Digital media enable the performance and the audience to be collocated or separated by large distances (or times), and there can be many performances and audiences, all connected simultaneously.
There are many interlaced technical and esthetic challenges raised by these efforts. These include interface design (how do you control the piece with body and instrument?) and system performance (can the sensors, network, and processing keep up with the flow?). Given the complexity of the systems that can be built, it is challenging for anyone, let alone a creative artist, to understand what is possible and what is actually occurring. It is safe to say that software tools to make such things are in their infancy.
But what I want to consider here is more structural, and perhaps psychological. The use of digital media turns out to expose some very fundamental facts about performances, facts that can be ignored in normal, face-to-face presentations. Essentially, digital media exaggerate the logical gap between the abstract “work”, the multiple simultaneous artistic performances that realize it, and the audiences’ reception of the performances.
At the bottom of it all are the pervasive effects of latency between sender and receiver, and ultimately between two people using a network. (There is always latency, even without digital networks.) Even the “fastest” network has end-to-end delays due to data processing, switching, and transmission, and no signal can travel faster than the speed of light. Typical delays range from much less than a second to several seconds (at least between people located near the surface of the Earth). Such a delay may be fine for some purposes (e.g., email), tolerable for others (e.g., voice telephony), but totally disastrous for others (e.g., singing harmony).
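To put a floor under these numbers, here is a minimal sketch of the delay due to propagation alone, ignoring processing and switching entirely. The example distances and the figure of roughly two thirds of light speed for optical fiber are my own illustrative assumptions, not measurements from any of the projects mentioned here.

```python
# Lower bound on one-way network delay from propagation alone.
# Assumptions (not from the essay): the example distances, and signals in
# optical fiber travelling at roughly 2/3 of the vacuum speed of light.

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
FIBER_FACTOR = 2 / 3                  # assumed typical slowdown in glass fiber


def min_one_way_delay_ms(distance_km: float, factor: float = 1.0) -> float:
    """Smallest possible one-way delay in milliseconds over distance_km."""
    speed_m_per_s = SPEED_OF_LIGHT_M_PER_S * factor
    return distance_km * 1000 / speed_m_per_s * 1000


if __name__ == "__main__":
    examples = [("same city", 50), ("cross-continent", 4_000), ("antipodes", 20_000)]
    for label, km in examples:
        vacuum = min_one_way_delay_ms(km)
        fiber = min_one_way_delay_ms(km, FIBER_FACTOR)
        print(f"{label:16s} {km:>6} km  vacuum {vacuum:6.2f} ms  fiber {fiber:6.2f} ms")
```

Whatever the engineering, real networks can only add processing and switching time on top of this physical floor.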
A Huge Challenge For Live Music Ensembles
Because of network latency, live music usually doesn’t work well except within a local space. If you try to play music together with someone across a digital network, there will be big problems almost immediately, because different sites will experience the music skewed in time.
Consider the scenario sketched in the diagram below. The overall picture represents a performance of a single “piece”, performed by two groups of artists, with an audience. In this picture, the piece is performed via a digital network, which transmits audio and video of the performance among the locations.
Musicians at location 1 play part of a piece and transmit over the network. Musicians at location 2 play their own part and transmit. Location 1 hears their own music nearly immediately, and the music from location 2 perhaps a few milliseconds after it is played. Location 2 has a mirror experience, with part 1 delayed. Thus, each group of performers experiences a different overall piece, and neither can really synchronize with the other.
An audience at Location 3 hears both part 1 and part 2, with each slightly delayed (probably by a different amount). What the audience hears is different from what any musician played (or heard) and could be a miserable mess. If there were more than one audience on the network, each local group would receive a different experience.
Setting aside whether any of the locations experiences the “correct” performance, it should be clear that they all get a different performance. Furthermore, this scenario clearly shows the logical separation of (a) the theoretical “whole piece”, (b) the separate local performance of parts, and (c) what the audience experiences.
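The scenario above can be sketched in a few lines of code. This is only a toy model under stated assumptions: the delay values are invented, local sound is treated as instantaneous, and both groups are imagined to play the same beat at the same instant on a shared clock.

```python
# Toy model of the three-location scenario: each site hears every part shifted
# by the one-way delay from the site where that part is played.
# All delay values are made-up illustrations, in milliseconds.

# ONE_WAY_DELAY_MS[source][listener]; local sound is treated as instantaneous.
ONE_WAY_DELAY_MS = {
    "location1": {"location1": 0, "location2": 40, "location3": 25},
    "location2": {"location1": 40, "location2": 0, "location3": 70},
}


def heard_times(played_at_ms: dict[str, float], listener: str) -> dict[str, float]:
    """Arrival time at `listener` of a note from each performing site."""
    return {
        source: t + ONE_WAY_DELAY_MS[source][listener]
        for source, t in played_at_ms.items()
    }


def skew_ms(played_at_ms: dict[str, float], listener: str) -> float:
    """How much later part 2 arrives than part 1 (negative means earlier)."""
    heard = heard_times(played_at_ms, listener)
    return heard["location2"] - heard["location1"]


if __name__ == "__main__":
    # Both groups play the same beat at time zero on a shared clock.
    beat = {"location1": 0.0, "location2": 0.0}
    for site in ("location1", "location2", "location3"):
        print(site, "hears part 2 offset from part 1 by", skew_ms(beat, site), "ms")
```

With these assumed delays, the same simultaneous beat is heard with part 2 late by 40 ms at location 1, early by 40 ms at location 2, and late by 45 ms at the audience's location 3. No two sites agree on what was played "together".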
In short, if you are trying to play together in such a case, what does “together” mean? You can’t answer without considering where everyone is, and how they are connected. This problem is well known to electronic musicians (though many people rediscover the issue every day). This has led to experiments that attempt to work around or take advantage of the phenomenon. You might consult Ben Smith’s dissertation (“Telematic Composition”, University of Illinois, 2011 [12]), which reviews some recent approaches. Smith, a talented musician and now a professor at IUPUI, has experimented with a variety of techniques that use digital networks to connect widely separated musicians and audiences. These experiments lead me to conclude that network latency is a fatal constraint on most artistic performances, though artists continue to search for approaches that may work.
Connections to Past Practice and Deep Issues
Latency has always been an issue for musical collaboration, and even ordinary interpersonal interaction. In everyday life we are accustomed to the tiny latencies in speech and vision, and unconsciously work around them. When we play, sing, and dance together in the same space there are latencies, but they are small. One of the skills of an expert musician or dancer is the ability to play in time with others, i.e., extremely precise management of latencies. (This skill is made readily apparent by comparing performances of the same work by students versus masters—the masters will be ever so much better coordinated with each other.)
Still, a large orchestral performance, or any performance in a large venue, must deal with the speed of sound. I am told that it is standard procedure for percussion in the back row of an orchestra to play a little ahead of the beat, to ensure that the sound will arrive in the front rows properly synchronized with the violins. However the orchestra plays, an audience in a large hall may detect slight distortions in different seats, as the sounds from different sections of the orchestra arrive out of time with each other. Expert composers and ensembles try to work around these challenges to provide enjoyable results, and picky listeners try to get “optimum” seating.
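As a rough illustration of how large "a little ahead of the beat" might be, the sketch below divides the extra distance by the speed of sound. It assumes sound travels at about 343 m/s (dry air at roughly 20 °C), a listener on a straight line with the two players, and invented stage depths. It is not a description of actual orchestral practice.

```python
# How far ahead of the beat a back-row player would need to play so that the
# sound lines up with the front row for a listener out in the hall.
# Assumptions: ~343 m/s speed of sound and invented stage depths.

SPEED_OF_SOUND_M_PER_S = 343.0


def lead_time_ms(extra_distance_m: float) -> float:
    """Milliseconds of head start needed to cover extra_distance_m of air."""
    return extra_distance_m / SPEED_OF_SOUND_M_PER_S * 1000


if __name__ == "__main__":
    for depth_m in (5, 10, 15):
        lead = lead_time_ms(depth_m)
        print(f"{depth_m:>2} m behind the violins -> play {lead:.1f} ms early")
```

A listener off to one side sits at different distances from each section, so a single lead time cannot be right for every seat, which is the seat-to-seat distortion described above.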
This problem is actually very fundamental. Albert Einstein was pondering a similar problem during the period when he framed relativity theory. In this case, his thought problem was how to synchronize clocks at railway stations via telegraph: the inherent limit in the speed of the telegraph network introduces latency in all messages intended to coordinate the clocks. This thought problem is one of the challenges that led him to propose interesting theories of time and space in the face of speed limits.
One could say that Telematic art deals with multiple local frames of reference, and is seeking to create a single (non local) piece of art that makes sense from many (local) points of view. (I expect theoretical physicists will cringe at this terminology. I hope the idea is clear enough, regardless.)
An Unexpected Connection to Theatrical Performance
I found a different take on this concept from the unexpected direction of live theater performance, which uses yet a different set of techniques to communicate an imaginative artistic vision to an audience.
In the theater there is a story, aspects of which are presented to an audience through one or more viewpoints. The playwright uses a compressed and limited presentation to imply enough for the audience to understand, perhaps unconsciously, the rest of the story. Humans make this sort of inference every day in the course of normal life: we personally can witness only fragments of the world, from which we constantly infer a wider picture. Theater makes artistic use of this native human social-cognitive skill.
A live drama is presented to an audience at a particular time and space (the stage). Beyond the stage (and “behind the scenes”), a story occurs, with possibly many simultaneous actions happening. The theatrical stage funnels a single view of the story to the audience at a given time. There could be many such views, but it is not possible to portray them all at once. Therefore, devices such as multiple story lines and flashbacks may be used to present a story from different perspectives, organized into a sequence of views. After centuries of artistic practice, these tricks are so familiar that we may not even consciously notice them. (Cinema and television allow technical mediation to manipulate the view, but essentially present a series of views to the audience. Written texts use similar techniques, though the audience must participate in the performance.)
A deep connection of live theater to “telematic” art came to me when I recalled two remarkable plays by Alan Ayckbourn. Ayckbourn is clearly one of the most successful and active English playwrights of the late 20th century. (“Sir” Alan has been granted a knighthood, but this blog does not traffic in such fripperies.) He has created more than 80 plays, though I want to consider just two of them here, where he tried something different, and it is really cool.
The first work of interest here is The Norman Conquests (1973). The protagonist, Norman, has a complex comic weekend with several of his female “conquests” (though, who is conquering who is less clear in the end). In the course of a couple of days, six people arrive and leave, talk, fight, and make up, and recall past affairs. The events occur in several parts of one house over a weekend. A conventional enough story.
This overall story is presented in three separate plays, which you might see on three successive nights (Table Manners, Living Together, and Round and Round the Garden). In a single play, we see only one part of the dialog and action, the part of the overall story that takes place in one of the spaces. One play takes place in the dining room, as the guests arrive, talk, and leave. Another play presents the events in the garden. And a third shows what happened in the living room.
Each play can be viewed separately, and they can be done in any order. When you see all three plays (e.g., on successive nights), you discover the whole story. I remember seeing these plays in the television version (produced by Thames Television, broadcast on public television in the US). It was an awesome and memorable experience. By the third play, I was forced to rethink some of the things I thought I understood in the first play. The exercise in different perspectives is amazing, and was justly acclaimed.
Ayckbourn returned to this concept in 1999, taking it to the logical extreme. Since the story is taking place simultaneously in multiple locations, why not have one play with multiple audiences at the same time? This is what is done in House & Garden (1999). The story involves a family and its confusing affairs, on the day of a formal garden party. As in The Norman Conquests, this is presented through independent plays, here two of them, House and Garden, which are “intended to be performed simultaneously by the same cast in two adjacent auditoria. They can be seen singly and in no particular order.” (Author’s Note).
An audience sees one play per performance. One stage is set for the “House”, the other for the “Garden”. When an actor steps out of the house into the garden, he or she exits one play and enters the other, and vice versa. Each audience sees only one of the two plays in one performance, and may see the other play in a second performance. But the actors act out the whole story each performance.
I have never seen H&G myself, and I’m sure it will rarely be produced for lack of suitable facilities. Obviously, fame and fortune enabled Ayckbourn to command an unusual level of resources for this production, which will be difficult for many companies to replicate. Never mind: just imagining this play has influenced my thinking about performing arts and computer interfaces.
I realized that these astonishing plays manipulate what the audience sees of the (hidden) narrative using conventional techniques. The innovation that caught my imagination is staging multiple independent versions of the overall play, enabling the audience to glue it together in their own head.
As I learned about telematic performances, I perceived a connection to the technique Ayckbourn uses. I imagine a single overall musical piece, designed to be experienced in a controlled way through multiple viewpoints. You might, perhaps, listen to the same piece several times in different venues, catching different “views” of the piece, possibly inferring the hidden global structure. Phew! It is difficult for me to imagine, but I am not a composer or a playwright.
And, of Course, A Connection to Multiplayer Computer Games
I should note that video games, especially multiplayer games, give us another example of these concepts. Like the theater, a game is a story that plays out in time and space, of which the audience (player) experiences only part of the story at one time, from one viewpoint at a time. A multiplayer game is also like a musical collaboration: people perceive the actions of other players and perform actions that affect the overall “story”.
The game world resembles the backstory of a play, with the same elements (setting, characters, props, plots, etc.). The digital technology enables the play to be presented through virtual stages projected onto computer screens. A game doesn’t just read through a script; the audience participates, and the action may be generated as it runs, just as in a musical performance with some improvisation. The computer helps keep the overall story consistent even though the entire world may be imaginary, unencumbered by physical reality.
It is probably no coincidence that many musicians are also game designers (e.g., Smith created a gamelike environment called musiverse). A multiplayer game is similar to a musical collaboration across a network in that the actions of each player must be synchronized with the events in the story and actions of other players, so that everything makes sense from each player’s point of view. The latency issues discussed above in the context of musical synchronization are just as critical for actions in a game: a fight or a conversation between two players over the network has to make sense to everyone. Indeed, game engineers have developed techniques for managing latency and coordinating many simultaneous activities across a network [3, 4].
Just as in the case of the Ayckbourn plays, each individual player sees the overall story through one point of view, and each may well experience a different local story. In the theater, the playwright and actors make sure the dialog is consistent in each scene; in the game this is managed by software.
Bringing these different examples together, we can see a general model underlying these cases. In each case there is a single artistic “story” which is performed by multiple artists for reception by multiple audiences. The participants are distributed spatially, to the degree that some can perceive the others only via digital networks. Each participant perceives his or her local space plus digitally transmitted information from one or more other locations.
The roles of creator (author/composer/choreographer/etc.), performer, and audience have been studied for millennia. But in the scenario discussed here, these roles are drastically separated, drawing attention to some aspects of their practice. The separation caused by multiple local times and spaces forces us to consider the demands of the different roles.
The basic conclusion is: live networked performance places difficult demands on everyone involved.
The audience: As an observer, one can, of course, choose to take it as one gets it, and be satisfied with a single enjoyable performance. But if one wishes to comprehend the whole work, then one must take in as many of the perspectives as possible, and attempt to infer the underlying story. This inference process is not necessarily difficult for human-oriented stories because we do it all the time whenever we interact with people for a sustained period, such as when we “get to know” someone.
It is less clear whether similar understandings of other kinds of performance, such as music or dance, will be as easy for everyone. Probably not, at least without training or experience.
The performers: The demands on performers are not as radical as for the audience. While the performers may or may not know the whole story, they (as usual) must act out their own local part of it. As always, performers must hew to the timing of their part, even when “normal” cues are not available for synchronization. For example, musicians must keep time and perform in the face of missing or, worse, delayed cues from other performances.
Improvisation may be extremely difficult in this kind of scenario because the improviser would need to take into consideration not only his or her own local perspective, but also the points of view of the audience and the other performers.
The creator(s): The greatest demand of all is placed on the creator, who must devise the story, and also the perspectives, and assure that everything works from all angles. A playwright must make each stage a coherent play in itself, while part of the larger, unseen play. A composer must make the music sensible and enjoyable for each local audience, as part of a larger whole, somehow accounting for the network’s distortions.
I also draw some lessons for computer science.
First of all, I reiterate my belief that computer system designers can still learn a lot from creative and performing artists. Computer mediated communication and human interfaces in general can benefit from, as Laurel classically put it, considering Computers as Theatre. I also maintain that developing embodied and gestural interfaces will never be successful without incorporating the deep understandings that may be gained from performing artists. For example, we can learn much from how performers manage space [5] and how dancers understand motion [9].
Finally, though improving network performance is always important, it is not possible to eliminate latency issues. As a consequence, some kinds of activities are simply not feasible over a network. A major case in point is live musical collaboration, but it applies to many kinds of “intimate” communication. We will never see, for instance, tactile interfaces that work at a great distance—latency will make them untenable for most uses.
Attempting to simulate or develop human interfaces and interaction across a network is very far from easy, in no small part because interface and interaction design tools are so primitive.
This is no surprise to developers of parallel and distributed systems, a field where analysis, simulation, and development tools are primitive. However, this is an area where progress can and surely will occur.
I thank Ben Smith and Mary Pietrowicz for stimulating discussions. Professors Guy Garnett, John Toenjes, and Thecla Schiphorst have provided inspirational leadership in collaborative projects. The eDream Institute provided a wonderful environment for these explorations.
1. Astral Convertible, Welcome to Astral Convertible Reimagined, in Astral Convertible@UIUC. 2009. http://astralconvertible.wordpress.com/
3. Brun, Jeremy, Farzad Safaei, and Paul Boustead, Managing Latency and Fairness in Networked Games. Communications of the ACM, 49 (11):46-51, November 2006. http://ro.uow.edu.au/cgi/viewcontent.cgi?article=1544&context=infopapers
4. Claypool, Mark and Kajal Claypool, Latency and Player Actions in Online Games. Communications of the ACM, 49 (11):40-45, November 2006. http://web.cs.wpi.edu/~claypool/papers/precision-deadline/final.pdf
5. Gardair, Colombine, Patrick G.T. Healey, and Martin Welton, Performing places, in Proceedings of the 8th ACM conference on Creativity and cognition. 2011, ACM: Atlanta, Georgia, USA. p. 51-60. http://dl.acm.org/citation.cfm?id=2069629
9. Schiphorst, Thecla Henrietta Helena Maria, The Varieties of User Experience: Bridging Embodied Methodologies from Somatics and Performance to Human Computer Interaction, in Center for Advanced Inquiry in the Integrative Arts (CAiiA). 2009, University of Plymouth: Plymouth.
12. Smith, Benjamin D., Telematic Composition, in Music. 2011, University of Illinois, Urbana-Champaign: Urbana. http://hdl.handle.net/2142/29843