Multitouch remote for a robot: 200 lines of JavaScript & Python


Personal robotics is here.

Components are now cheap, flexible, standardized and easy to program.

In this article, I'll show how to use a multitouch interface to control a robot.



Not far off?

I'll discuss the Scribbler platform extended with a Fluke card; for multitouch, I assume iOS. (Of course, you could do what I describe with another platform like LEGO Mindstorms NXT + Android too.)

A Scribbler+Fluke brings a mobile robot with:

  • an ARM processor (on the Fluke);
  • a BASIC Stamp processor (on the Scribbler);
  • wireless bluetooth connectivity;
  • a digital camera;
  • two bidirectional drive servos (with stall detectors);
  • front and back infrared emitters;
  • front and back infrared receivers;
  • bottom-mounted line-detection sensors;
  • three light sensors;
  • an onboard speaker;
  • an open port in the middle;
  • a serial port for communication (used by the Fluke); and
  • several controllable LEDs.

This system allows the programmer to use higher-level languages like Python, Scheme, Java and C++. Support for Mono/.NET is on the way.

The code in this article uses multitouch gestures like swiping and twisting on an iPhone, iPad or iPod Touch to get the robot to mimic your fingers.

It also sends images back to the phone, so you can drive it out of sight.

On the iOS side, it's 100% JavaScript and HTML--no native programming!

The Python code sets up a simple web server to receive commands, so in principle, any web-connected device could control the robot.

Read on to see how it's done in less than 200 lines of code.

Video

This is what it looks like:

Components

I picked up robotics when I attended the Institute for Personal Robots in Education (IPRE) workshop at Georgia Tech.

The IPRE workshop meta-teaches: it teaches educators how to teach programming using robotics.

The IPRE platform has two components:

  • the Parallax Scribbler robot itself; and
  • the Fluke, an expansion card that plugs into the Scribbler's serial port and provides the Bluetooth link and camera.

(If you don't have Bluetooth, you'll want a Bluetooth USB adapter too.)

Instructions for setup and installation under Mac, Linux or Windows are available on the IPRE wiki.
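
Once everything is installed, a first session with the robot takes only a few lines. Here's a minimal sketch (the serial device name is a placeholder; use whatever port your Fluke's Bluetooth connection shows up as):

from myro import *

# Placeholder device name; substitute your own Fluke serial port:
init("/dev/tty.scribbler")

forward(0.5, 1)       # half speed for one second
turnLeft(0.5, 0.5)    # a brief turn to the left
beep(0.5, 440)        # half-second beep at 440 Hz
stop()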

The principles of the iPhone multitouch remote in this article are general: with some modification, the code should adapt to other robotics platforms like LEGO Mindstorms NXT, and the web app should also work on many Android phones.

Basics: A higher-order Cha Cha

The warm-up task from the workshop gives a sense of how easy it is to program a Scribbler+Fluke system.

The first task was to teach the robot to dance.

After half an hour of playing with it, mine was doing this:

The code is simple, and it even shows off higher-orderness:

def curtsey():
    # A little bow: wiggle left, right, right, left.
    turnLeft(1,0.05)
    turnRight(1,0.05)
    turnRight(1,0.05)
    turnLeft(1,0.05)

def chachacha(action):
    # Three steps of the given action to a cha-cha-cha beat,
    # with a beep marking each step:
    action(0.7)
    beep(0.05,800)
    stop()
    
    action(0.7)
    beep(0.05,900)
    stop()
    
    action(0.9)
    beep(0.2,800)
    stop()

curtsey()
chachacha(forward)
curtsey()
chachacha(backward)
turnLeft(0.5,0.5)
chachacha(forward)

The movement primitives for actions like rotation and translation are uniform: each takes an amount (a speed) and an optional duration, and if the duration is omitted, the motion continues until stop() is called.

As a result, the procedure chachacha, which encodes a beat, can "chachachaify" any movement, like turnRight or backward.
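
Because chachacha assumes only that its argument is a one-argument action, you can hand it any built-in movement, or one you define yourself. For example (shimmy is just an illustration, built from the same primitives used above):

def shimmy(speed):
    # Wiggle in place with a quick left-right twitch:
    turnLeft(speed, 0.1)
    turnRight(speed, 0.1)

chachacha(turnRight)   # cha-cha while spinning right
chachacha(shimmy)      # cha-cha a custom move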

Multitouch iOS and web-based remotes

In the workshop, we each created an assignment we could give to students.

Since childhood, I've been fascinated by the prospect of having a home full of semi-autonomous minions. (No, children don't count.)

Controlling a robot with multitouch gestures seemed like a fun assignment.

I broke it into two parts:

  • a small Python web server (httpd) that controls the robot; and
  • an iOS web app, written in JavaScript, that talks to that httpd.

Here's the interface for the iPhone app when it starts up:

To go forward or back, stick your finger down and drag it forward or back.

To rotate, touch two fingers and twist in the direction you want it to rotate.

When you lift your fingers, it stops moving.

To view the camera, rotate the screen.

(It wouldn't be hard to make it a continuously updated image stream.)

Web server

The web server uses Python's BaseHTTPServer to handle HTTP GET requests.

The httpd responds to requests for /move, /spin and /stop by driving the robot. The speed query parameter says how fast: for example, /move?speed=0.5 means forward at half speed, and a negative speed reverses the direction.

The file /cam.gif contains whatever is in front of the robot at that moment.


import sys

from myro import *
from os import curdir, sep
from urlparse import urlparse, parse_qs
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

# Configuration
SerialPort = "/dev/tty.IPRE6-365906-DevB"

# Determine the mode: Test or Real 
Mode = "test" if len(sys.argv) > 1 and \
                 sys.argv[1] == "--test" else "real"

if Mode == "real":
  print "Mode: Real"
  init(SerialPort)

# Handlers for commands:
def handle_move(query):
  speed = float(query['speed'][0])
  print "move(%s)" % speed
  if Mode == "real":
    if speed < 0.0:
      backward(-speed)
    else:  
      forward(speed)

def handle_spin(query):
  speed = float(query['speed'][0])
  print "spin(%s)" % speed
  if Mode == "real":
    if speed < 0.0:
      turnLeft(-speed)
    else:
      turnRight(speed)

def handle_stop(query):
  print "stop()"
  if Mode == "real":
    stop()

# Custom HTTP request handler:
class ActionHandler(BaseHTTPRequestHandler):
  
  def do_GET(self):
    if self.path == "/":
      self.path = "/index.html"

    url = urlparse(self.path)
    action = url.path[1:]
    print "action: ", action

    query = parse_qs(url.query)
    print "query: ", query

    # If the camera was requested, snap a fresh picture first; the saved
    # file is then served by the static-file branch below:
    if url.path == "/cam.gif":
      if Mode == "real":
        p = takePicture()
        savePicture(p,"cam.gif")

    if action == "move":
      handle_move(query)
    elif action == "spin":
      handle_spin(query)
    elif action == "stop":
      handle_stop(query) 
    else: # grab a file
      try:
        print "sending file: ", url.path
        # Open in binary mode so images are served intact:
        f = open(curdir + sep + url.path, 'rb')
        self.send_response(200)
        if url.path.endswith('.html'):
          self.send_header('Content-type', 'text/html')
        elif url.path.endswith('.js'):
          self.send_header('Content-type', 'text/javascript')
        elif url.path.endswith('.gif'):
          self.send_header('Content-type', 'image/gif')
        self.end_headers()
        self.wfile.write(f.read())
        f.close()
        return
      except IOError:
        self.send_error(404,'File Not Found: %s' % self.path)
        return

    self.send_response(200)
    self.send_header('Content-type', 'text/html')
    self.end_headers()
    self.wfile.write('OK: ' + self.path)

try:
  server = HTTPServer(('0.0.0.0', 1701), ActionHandler)
  print 'Awaiting commands...'
  server.serve_forever()
except KeyboardInterrupt:
  print 'User terminated server'
  server.socket.close()
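
Because the protocol is nothing but HTTP GETs, any web-connected device can drive the robot, and you can sanity-check the server without a phone at all. Here's a minimal sketch using Python's urllib2, assuming the server above is running locally (the --test flag works fine for this):

import urllib2, time

base = "http://localhost:1701"

# Drive forward at half speed for a second, then stop:
urllib2.urlopen(base + "/move?speed=0.5")
time.sleep(1)
urllib2.urlopen(base + "/stop")

# Negative speeds reverse the direction; spin left slowly:
urllib2.urlopen(base + "/spin?speed=-0.25")
time.sleep(1)
urllib2.urlopen(base + "/stop")

The iOS remote below does essentially the same thing, triggered by touches instead of a script.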

iOS remote app

The iOS app watches for dragging and twisting gestures.

It turns those multitouch gestures into asynchronous web requests that tell the robot what to do.


var UpdatePeriod = 300 ; // milliseconds
var LastTouch = 0 ;

// The Y coordinate when the finger goes down:
var InitY = 0 ;

//  Is the user gesturing or touching?
var InGesture = false ;

// Last and current command:
var LastCommand = {"action": "stop"} ;
var Command     = LastCommand ;

// The pad element (id="pad" in the HTML) that receives touches:
var Pad = document.getElementById("pad") ;

function HttpSend(url) {
  var req = new XMLHttpRequest() ;
  req.open("GET", url, true) ; 
  req.send(null) ;
}

// Called periodically to issue commands:
function UpdateRobot() {
  if ((LastCommand.action == Command.action) &&
      (LastCommand.speed  == Command.speed))
    return ;
 
  Pad.innerHTML = Command.action + ": " + Command.speed ;

  switch (Command.action) {
    case "move":
    case "spin":
      HttpSend("/"+Command.action+"?speed=" +
               encodeURIComponent(Command.speed)) ;
      break ;

    case "stop":
      HttpSend("/stop") ;
      break ;
  }

  LastCommand = Command ;
}

// Check for new actions at regular intervals:
setInterval(UpdateRobot,UpdatePeriod) ;

// Watch for touches and gestures:
Pad.ontouchstart = function (event) {
  var finger = event.touches[0] ;
  InitY = finger.clientY ;
  var newTouch = (new Date()).getTime() ;
  LastTouch = newTouch ;
} ;

// Refresh the camera image when the screen rotates; the random
// query string defeats caching:
document.body.onorientationchange = function (event) {
  Pad.style.backgroundImage = "url(./cam.gif?r="+Math.random()+")" ;
} ;

// Lifting the finger stops the robot:
Pad.ontouchend = function (event) {
  Command = {"action": "stop"} ;
} ;

// Dragging a single finger up or down sets forward/backward speed:
Pad.ontouchmove = function (event) {
  if (InGesture) return ;
  event.preventDefault() ;  // keep the drag from scrolling the page
  var finger = event.touches[0] ;
  var speed = (InitY - finger.clientY) / window.innerHeight ; 
  Command = {"action": "move", "speed": speed} ;
} ;

Pad.ongesturestart = function (event) {
  InGesture = true ;
} ;

Pad.ongestureend = function (event) {
  InGesture = false ;
  Command = {"action": "stop"} ;
} ;

// Twisting two fingers sets the spin speed:
Pad.ongesturechange = function (event) {
  var rotation = event.rotation ;
  // Normalize the rotation to [-180, 180], then scale to a speed in [-1, 1]:
  if (rotation < -180.0) rotation += 360.0 ;
  if (rotation >  180.0) rotation -= 360.0 ;
  Command = {"action": "spin", "speed": rotation / 180.0} ;
} ;

HTML interface

The HTML page is little more than an element with id pad for the JavaScript to grab, plus a script tag that loads the remote code above.

Grab the code.

Exercises

If you'd like some exercises, try these:

  • Extend the code to send a continuously updated stream of images back to the phone.
  • Have the app switch to an alternate control mode in landscape orientation: each thumb could control one wheel independently, for fine-grained control over the robot's movement.
  • Currently, the iOS app syncs with the robot roughly three times a second. Raising the frequency improves responsiveness, but raising it too high floods the robot. Modulate the frequency dynamically, based on the response times of the HTTP requests, so that it settles around the best possible update rate.

More resources