Category Archives: Interface Design

Telepresence Robot – At the zoo

These days we see a lot of exciting stories about telepresence—specifically, live, remote operation of robots. From the grimly factual reports from the battlefields of South Asia, through science fiction novels, to endless videos from drone-racing gamers, we see people conquering the world from their living rooms.

One of the emerging technologies is telepresence via a remote robot that resembles ‘an iPad on a Segway’. These are intended for remote meetings and the like. There is two-way video, but the screen is mobile and under the command of the person on the other end, so you can move around, talk to people, and look at things.

On the face of it, this technology is both amazing (how does it balance like that?) and ridiculous (who would want to interact with an iPad on wheels?). And, of course, many of the more expansive claims are dubious. It isn’t, and is never going to be, “just like being there”.

But we are learning that these systems can be fun and useful. They may be a reasonable augmentation for remote workers: not as good as being there, but better than just telecons. And, as Emily Dreyfuss comments, a non-representational body is sometimes an advantage.

Last year Sensei Evan Ackerman reported on an extensive field test of one of these telepresence sticks, the Double 2 [1]. It was an interesting test drive because he deliberately took the robot out of its intended environment, which stressed the technology in many ways. The experience is a reminder of the limitations of telepresence, but it also gives insights into when telepresence might work well.

First of all, he played with it across the continental US (from Maryland to Oregon), thousands of kilometers apart. Second, he took it outdoors, which it isn’t designed for at all. And he necessarily relied on whatever networks were available, which varied and often had weak signals.

As part of the test, he went to the zoo and to the beach!

Walking the dog was impossible.

Overall, the system worked amazingly well, considering that it wasn’t designed for outdoor terrain and needs networking. He found it pretty good for standing still and chatting with people, but moving was difficult and stressful at times. Network latency and dropouts meant a loss of control, with possibly harmful results.

Initially skeptical, Sensei Evan recognized that the remote control has advantages.

I’m starting to see how a remote controlled robot can be totally different [than a laptop running Skype] . . . You don’t have to rely on others, or be the focus of attention. It’s not like a phone call or a meeting: you can just exist, remotely, and interact with people when you or they choose.

Whether or not it is “just like being there”, when it works well there is a sense of agency and ease of use, at least compared to conventional video conferencing.

This is an interesting observation. Not only does everybody need to get past the novelty, but it works best when you are cohabitating for considerable periods of time. Walking the dog, visiting the zoo—not so good. Hanging out with distant family—not so bad.

I note that the most advertised use case—a remote meeting—may be the weakest experience. A meeting has constrained movement, a relatively short time period, and often is tightly orchestrated.  This takes little advantage of the mobility and remote-control capabilities. You may as well just do a video conference.

The better use is for extended collaboration and conversation. E.g., Dreyfuss and others have used it for whole working days, with multiple meetings, conversations in the hall, and so on.  Once people get used to it, this might be the right use case.

I might note that this is also an interesting observation to apply to the growing interest in Virtual Reality, including shared and remote VR environments.  If a key benefit of the telepresence robot is moving naturally through the environment, then what is the VR experience going to be like?  It might be “natural” interactions, but it will be within a virtual environment.  And if everyone is coming in virtually, then there is no “natural” interaction at all (or rather, the digital is overlaid on the (to be ignored) physical environments). There will be lots of control, but will there be “ease”?  We’ll have to see.

  1. Evan Ackerman, Double 2 Review: Trying Stuff You Maybe Shouldn’t With a Telepresence Robot, in IEEE Spectrum – Automation. 2016.


Robot Wednesday

Facebook’s AI Led Astray By Human Behavior

I don’t follow the roiling waters of online advertising giant Facebook. Having moved fast and broken things, they are now thrashing around trying to fix the stuff they broke.

This month a team of researchers at Facebook released some findings from yet another study [3]. Specifically, the experiments (which don’t seem to have been reviewed by an Institutional Review Board) are trying to build simple AIs that can “bargain” with humans. This task requires good-enough natural language to communicate with the carbon-based life form, and enough of a model of the situation to effectively reach a deal.

Their technical approach uses machine learning so that bots can learn by example. Specifically, they took a collection of human-human negotiations and analyzed the behavior to discover algorithms that replicate human-like interactions.

With preposterous amounts of computing power, who knows? It might work.

Unfortunately, the results were less than stunning.

Glancing at the conclusions in the paper, the good news is that the method was able to learn “goal maximizing” instead of “likelihood maximizing” behaviors. This is neat, though given the constrained context (we know that the parties are negotiating) it’s less than miraculous.
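To make the distinction concrete, here is a toy sketch of my own (not the paper’s actual model): a likelihood-maximizing agent picks the reply a language model finds most probable, while a goal-maximizing agent picks the reply with the highest estimated deal value. The candidate utterances and their scores are entirely hypothetical.

```python
# Toy illustration of "likelihood maximizing" vs. "goal maximizing"
# response selection in a negotiation dialogue. Each candidate reply
# carries a made-up language-model probability and a made-up expected
# deal value (as if estimated by rolling the dialogue forward).

candidates = [
    # (utterance, lm_probability, expected_deal_value)
    ("I'll take the ball, you get the hats.", 0.50, 4.0),
    ("Give me the ball and one hat.",         0.30, 6.0),
    ("You can have everything.",              0.20, 0.0),
]

def likelihood_maximizing(cands):
    """Pick the reply the language model finds most probable."""
    return max(cands, key=lambda c: c[1])[0]

def goal_maximizing(cands):
    """Pick the reply with the highest estimated negotiation payoff."""
    return max(cands, key=lambda c: c[2])[0]

print(likelihood_maximizing(candidates))  # imitates typical human phrasing
print(goal_maximizing(candidates))        # "negotiates harder"
```

The two policies diverge exactly when the most human-typical reply is not the most profitable one, which is why a goal-maximizing bot can come across as pushier than its training data.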

The resulting bots aren’t completely satisfactory, though. For one thing, these machine intelligences are, well, pretty mechanical. Specifically, they are obsessive and aggressive, “negotiating harder” than other bots. Also, the conversation generated by the bots made sense at the sentence level, but consecutive sentences did not necessarily make sense. (The examples sound rather “Presidential” to me.)

But the headline finding was that the silicon-based entities picked up some evil, deceptive tactics from their carbon-based role models. Sigh. It’s not necessarily “lying” (despite Wired magazine [1]), but, in line with “negotiating harder”, the bots learned questionable tactics that probably really are used by their human exemplars. (Again, this rhetoric certainly sounds Presidential to me.)

The hazards of trying to model human behavior–you might succeed too well!

I’m not surprised that this turned out to be a difficult task.

People have been trying to make bots to negotiate since the dawn of computing. The fact that we are not up to our eyeballs in NegotiBots™ suggests that this ain’t easy to do. And the versions we have seen in online markets are, well, peculiar.

One question raised by this study is, what is a good dataset to learn from? This study used a reasonably sized sample, but it was a convenience sample: people* recruited from Amazon Mechanical Turk. Easy to get, but are they representative? And what is the target population that you’d like to emulate?

(* We assume they were all people, but how would you know?)

I don’t really know.

But at least some of the results (e.g., learning aggressive and borderline dishonest tactics) may reflect the natural behavior of Mechanical Turk workers more than normal humans. This is a critical question if this technology is ever to be deployed. It will be necessary to make sure that it is learning culturally correct behavior for the cultures that it is to be deployed in.

I will add a personal note. I really don’t want to have to ‘negotiate’ with bots (or humans), thank you very much. The deployment of fixed prices was a great advance in retail marketing [2], and it is a mistake to go backwards from this approach.

  1. Liat Clark, Facebook teaches bots how to negotiate. They learn to lie instead. Wired, 2017.
  2. Steven Johnson, Wonderland: How Play Made the Modern World, New York, Riverhead Books, 2016.
  3. Mike Lewis, Denis Yarats, Yann N. Dauphin, Devi Parikh, and Dhruv Batra, Deal or No Deal? End-to-End Learning for Negotiation Dialogues. Preprint, 2017.


Animastage: Tangible Interactive Display

From MIT Media Lab Tangible Media folks, an odd little interactive display: Animastage [2].

The idea is “Hands-on Animated Craft on Pin-based Shape Displays”, i.e., a system that lets you create animated 3D puppet like scenes that move.

The underlying technology comes from earlier TML projects [1], which were inspired by player pianos and other pre-digital technologies. This particular system is heavily influenced by puppetry.

The creator makes scenes and puppets, and places them on the pin surface. The vertical movement of the pins pushes the figure like a puppet. Programming the actuators animates the scene.

The neatest effect is the “Invisible String Control”, in which the animator wiggles his or her fingers and the animation responds as if they were connected by a string. This effect uses hand tracking via a Leap Motion camera, which is mapped to the actuators.
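As a rough sketch of how such an “invisible string” mapping might work (my own illustration, not the AnimaStage implementation; all names and ranges are hypothetical), each tracked fingertip height could be linearly mapped to the extension of the pin under the puppet:

```python
# Hypothetical sketch of an "invisible string" mapping: a tracked
# fingertip's height above the table drives the pin nearest the puppet's
# attachment point, as if a string ran from finger to figure.
# Constants are illustrative, not from the AnimaStage paper.

PIN_MIN_MM, PIN_MAX_MM = 0.0, 50.0           # pin travel range
FINGER_MIN_MM, FINGER_MAX_MM = 100.0, 300.0  # tracked hand height range

def finger_to_pin_height(finger_height_mm):
    """Linearly map a fingertip height (from the hand tracker) to pin extension."""
    t = (finger_height_mm - FINGER_MIN_MM) / (FINGER_MAX_MM - FINGER_MIN_MM)
    t = max(0.0, min(1.0, t))  # clamp when the hand leaves the tracked range
    return PIN_MIN_MM + t * (PIN_MAX_MM - PIN_MIN_MM)
```

Even this crude linear coupling would let the figure rise and fall with the fingers, which is all the illusion seems to need.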

This effect is far, far more interesting for a viewer than the other more complicated animations.

I was immediately struck by the fact that this effect mainly works—and draws our attention—because the viewer fills in the story from imagination.

The finger motions aren’t necessarily related to the animation in an obvious way (which is also true of marionettes), and we can’t really tell if the hand movements are leading or following the animation. But what we clearly see is the hands controlling the puppets.

This is a general principle of visual interaction: humans unconsciously construct stories and fill in ambiguity from their own imagination. In this case, the design makes good use of this principle, creating a very compelling and entertaining illusion from the simplest parts.

  1. Hiroshi Ishii, Daniel Leithinger, Sean Follmer, Amit Zoran, Philipp Schoessler, and Jared Counts, TRANSFORM: Embodiment of “Radical Atoms” at Milano Design Week, in Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. 2015, ACM: Seoul, Republic of Korea. p. 687-694.
  2. Ken Nakagaki, Udayan Umapathi, Daniel Leithinger, and Hiroshi Ishii, AnimaStage: Hands-on Animated Craft on Pin-based Shape Displays, in Proceedings of the 2017 Conference on Designing Interactive Systems (DIS). 2017: Edinburgh.

CuddleBits: Much More Than Meets The Eye

Paul Bucci and colleagues from the University of British Columbia report this month on CuddleBits, “simple 1-DOF robots” that “can express affect” [1]. As Evan Ackerman says, “build your own tribble!” (Why haven’t there been a zillion Tribble analogs on the market???)

This caught my eye just because they are cute. Then I looked at the paper presented this month at CHI [1]. Whoa! There’s a lot of interesting stuff here.

First of all, this is a minimalist, “how low can we go” challenge. Many social robots have focused on adding many, many degrees of freedom, for example, to simulate human facial expressions as faithfully as possible. This project goes the other way, trying to create social bonds with only one DOF.

“This seems plausible: humans have a powerful ability to anthropomorphize, easily constructing narratives and ascribing complex emotions to non-human entities.” (p. 3681)

In this case, the robot has programmable “breathing” motions (highly salient in emotional relationships among humans and other species). The challenge is, of course, that emotion is a multidimensional phenomenon, so how can different emotions be expressed with just breathing? And, assuming they can be created, will these patterns be “read” correctly by a human?

This is a great piece of work. They developed theoretical understanding of “relationships between robot behaviour control parameters, and robot-expressed emotion”, which makes possible a DIY “kit” for creating the robots – a theory of Tribbleology, and a factory for fabbing Tribbles!

I mark their grade card with the comment, “Shows mastery of subject”.

As already noted, the design is “naturalistic”, but not patterned after any specific animal. That said, the results are, of course, Tribbleoids, a fictional life form (with notorious psychological attraction).

The paper discusses their design methods and design patterns. They make it all sound so simple, “We iterated on mechanical form until satisfied with the prototypes’ tactility and expressive possibilities of movement.” This statement understates the immense skill of the designers to be able to quickly “iterate” these physical designs.

The team fiddled with design tools that were not originally intended for programming robots. The goal was to be able to generate patterns of “breathing”, basically sine waves, that could drive the robots. This isn’t the kind of motion needed for most robots, but it is what haptics and vocal mapping tools do.

Several studies were done to investigate the expressiveness of the robots, and how people perceived them. The results are complicated, and did not yield any completely clear cut design principles. This isn’t terribly surprising, considering the limited repertoire of the robots. Clearly, the ability to iterate is the key to creating satisfying robots. I don’t think there is going to be a general theory of emotion.

I have to say that the authors are extremely hung up on trying to represent human emotions in these simple robots. I guess that might be useful, but I’m not interested in that per se. I just want to create attractive robots that people like.

One of the interesting things to think about is the psychological process that assigns emotion to these inanimate objects at all. As they say, humans anthropomorphize, and create their own implicit story. It’s no wonder that limited and ambiguous behavior of the robots isn’t clearly read by the humans: they each have their own imaginary story, and there are lots of other factors.

For example, they noted that variables other than the mechanics and motion mattered. While people recognized the same general emotions, “we were much more inclined to baby a small FlexiBit over the larger one.” That is, the size of the robot elicited different behaviors from the humans, even with the same design and behavior from the robot.

The researchers are tempted to add more DOF, or perhaps “layer” several 1-DOF systems. This might be an interesting experiment to do, and it might lead to some kind of additive “behavior blocks”. Who knows?

Also, if you are adding one more DOF, I would suggest adding simple vocalizations: purring and squealing. This is not original; it is what was done in “The Trouble With Tribbles” (1967) [2].

  1. Paul Bucci, Xi Laura Cang, Anasazi Valair, David Marino, Lucia Tseng, Merel Jung, Jussi Rantala, Oliver S. Schneider, and Karon E. MacLean, Sketching CuddleBits: Coupled Prototyping of Body and Behaviour for an Affective Robot Pet, in Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 2017, ACM: Denver, Colorado, USA. p. 3681-3692.
  2. Joseph Pevney, The Trouble With Tribbles, in Star Trek. 1967.


Robot Wednesday

MistForm Display

Reported this week at CHI, MistForm is “a shape changing fog display that can support one or two users interacting with either 2D or 3D content.” ([1], p. 4383)  Cool!

The basic idea of this kind of display is to generate a “fog” of water droplets in front of the person, and project information from the back. With clever geometry, the projection is seen by the eye as 3D objects hanging in mid-air. The cool thing is that the user can reach into the fog to touch the objects hanging there.

This version, from Yutaka Tokuda and colleagues at the University of Sussex, adds the wrinkle that the shape of the fog can be manipulated to create a curved “screen” [1]. This calls for clever geometric computation, to account not only for the fog and the eye, but also for the curvature of the fog. The latter is computed from the position of the pipes that generate the mist.

The projection is, in principle, “mere geometry”. Working from the eye position (via head tracking), the color and brightness of each pixel is computed. Working backwards, the pixel is mapped to a region of the fog, and then back to the projector. Voila.
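The core of that backward mapping can be sketched in a few lines. This is a deliberately simplified version of my own, assuming a flat fog plane at z = 0 (the real MistForm also corrects for the screen’s curvature, which this toy ignores): intersect the ray from the tracked eye through the virtual 3D point with the fog, and that intersection is where the projector must place the pixel.

```python
# Simplified sketch of the per-pixel geometry for a fog display.
# Assumes a flat fog plane at z = 0 and that the eye and the virtual
# point lie on opposite sides of it (so the ray actually crosses the fog).

def fog_intersection(eye, point):
    """Intersect the ray from the eye through a virtual point with z = 0."""
    ex, ey, ez = eye
    px, py, pz = point
    t = ez / (ez - pz)  # ray parameter where z reaches 0
    return (ex + t * (px - ex), ey + t * (py - ey), 0.0)

# Viewer half a meter in front of the fog, virtual point 10 cm behind it:
spot = fog_intersection(eye=(0.0, 0.0, 0.5), point=(0.1, 0.0, -0.1))
```

From that fog coordinate, a second (projector-side) mapping of the same form sends the spot back to a projector pixel; head tracking just means recomputing this for every frame as the eye moves.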

Interacting with the display uses hand tracking with a Kinect. The fog is segmented into regions that can be touched (“actuators”). This is coordinated with the projected objects, so the user can reach into the fog and “touch” an object in a natural motion.


This is a very nice piece of work indeed. The paper [1] gives lots of details.

This is a great example of the potential of projective interfaces, which will replace the ubiquitous screen in the coming decade or two. (If you have any doubts, take a gander at this wizardry from some Illinois alums.)

Of course, the mountain we have to climb is to make one big enough and clever enough that we can walk into it. This will also combine with haptics so the objects ‘push back’ when you touch them. Now that will be cool.

  1. Yutaka Tokuda, Mohd Adili Norasikin, Sriram Subramanian, and Diego Martinez Plasencia, MistForm: Adaptive Shape Changing Fog Screens, in Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. 2017, ACM: Denver, Colorado, USA. p. 4383-4395.
  2. University of Sussex. MistForm: adaptive shape changing fog screens. 2017.

PS. Wouldn’t “Shape Changing Fog Screen” be a great name for a band?
Or how about,  “The Fog and the Eye“.

RoboThespian: Uncanny or Just Plain Unpleasant?

RoboThespian is disturbing.

I think this particular humanoid robot has climbed out of the uncanny valley of discomfort, and ambled out onto the plain of the extremely annoying coworker. Disney animatronics gone walkabout.

RoboThespian is a life sized humanoid robot designed for human interaction in a public environment. It is fully interactive, multilingual, and user-friendly, making it a perfect device with which to communicate and entertain.

Clearly, these guys have done a ton of clever work, integrating human-like locomotion, speech synthesis, projection, face tracking, and serious chatbot software.

The standard RoboThespian design offers over 30 degrees of freedom, a plethora of sensors, and embedded software incorporating text-to-speech synthesis in 20 languages, facial tracking and expression recognition. The newly developed RoboThespian 4.0 will offer a substantial upgrade, adding additional motion range in the upper body and the option of highly adept manipulative hands.”

What can you do with all this? I think the key clue is that the programming is done via a GUI environment that looks just like Blender, which means that you basically create a computer-generated scene, which is then “rendered” in the physical robot.

Much of the spectacular effect is due to well-coordinated facial expressions, head movement, and speech. The robot also has sensors to detect people, and especially faces, and to orient to them. It also has facial expression recognition, which lets it “reproduce” facial expressions. All these effects are “uncanny”, and make the beast appear to be talking to you (or singing at you). Ick!

All this is in the pursuit of…I’m not sure what.

I grant you that this is a great effect, at least on video. But what is it for?

The title and demos suggest that it replaces human thespians (live onstage), which seems far-fetched. If you want mechanized theater, you always have computer-generated movies. As far as I can tell, the main use case is advertising, e.g., trade show demos. It either replaces human presenters (demo babes) or it replaces video billboards.

They also suggest that this is a good device for telepresence. It “can inhabit an environment in a more human manner; it’s the next best thing to being there.”  I’m not at all sure about that. Humanoid appearance is not really important for effective telepresence in most cases, and there is no reason to think this humanoid is well suited for any given telepresence situation.

Let me be clear: this product is really nicely done. I do appreciate a well-crafted system, integrating lots of good ideas.

But I really don’t see that RoboThespian is anything other than a flashy gimmick. (Human actors are way, way cheaper, and probably better.)

On the other hand, when I saw the first computer mouse on campus, I declared that it was a useless (and stupid) interface and that no one would ever use it. I was wrong about mice (boy, was I wrong!), so my intuitions about humanoid chatterbots may be wildly off.

Update May 4, 2017:  Corrected to indicate that Engineering Arts does not use Blender, as the original post said. I must have seen some out-of-date information. They have their own environment which, if not built from Blender, is built to look just like it. Thanks to Joe Wollaston for the correction.


Robot Wednesday