Scripting Language Design and Implementation

Instructor    Matt Might
Time Tues & Thurs, 2:00pm - 3:20pm
Location WEB 1250
Appts. Office hours after each lecture, or when my door is open.
Group utah-sldi-spring-2012
Blog blog.might.net (I'll post lecture notes here.)

Navigation

Requesting permission

I'm happy to grant permission to take the course if you read the course description and send me an email.

(I didn't want anyone to sign up by accident, since this course requires more work than normal.)

Description

This course has two major co-themes:

The design theme will include a hands-on survey of scripting languages.

New languages will be introduced at a rate of one to two weeks, and assignments will allow students to experience their strengths and weaknesses by programming in them.

For the implementation theme, students will be required to implement an interpreter for a scripting language from scratch.

An understanding of models of computation is a critical prerequisite.

This course is much more difficult than the average computer science course, and may require double or triple the effort.

The time invested will be well worth it: the course will impart a deep, unified understanding of programming languages, and a fundamental appreciation for the nature of computation.

Students will learn how to quickly teach themselves and exploit new programming lanuages.

Languages covered

Topics

Schedule

Grading

Your final grade is the maximum of your projects grade and your grade on the exams.

You do not have to take the final if you are satisfied with your grade on the projects.

Projects

The first five projects count for 10% each and the final project counts for 50%.

Midterm exam

There will be a midterm exam covering concepts thus far in the course.

Everyone is required to take it.

It counts for 50% of the exam grade.

Final exam

The final exam will be impossible to finish in the alloted time.

It will be divided into two portions, each scored separately.

The first portion will cover the same material as the midterm. If you were satisfied with your midterm grade, skip this portion. If you score higher on this portion than on your midterm, it replaces your midterm grade.

The second portion will be on post-midterm material, and it will account for 50% of your final grade.

You should answer those questions which you know best.

The weight of each question is proportional to the percentage of students that failed to answer it or answered it incorrectly.

The result will be curved.

Material

I'm morally opposed to the way textbooks are priced.

There is no required textbook for the class.

A textbook also cuts against the spirit of this class, which is ultimately about teaching you how to teach yourself a new programming language.

When it comes to learning a programming language, there is a wealth of material available through Google. I will teach you how to find it!

If you'd like recommended (but not required) textbooks, try the following:

I'll link to applicable online material from here, as well as my notes, the slides and any code we use in class.

Preliminaries and prerequisites

Unix

JavaScript

Lambda calculus

Racket

Python

Lexical analysis

Parsing

Transformation

Collaboration policy

I encourage the formation of study groups to discuss the theory and algorithms behind the projects.

You may not share code.

If you want to learn it, you have to implement it.

Violators of this rule will automatically fail the course and be dealt with under the School and University academic misconduct policy.

Projects

These projects will be fleshed out in more detail before they're formally assigned. Take them as draft "project ideas" for now.

Depending on how the course evolves and student interest, I may change a later project to a different one entirely.

Project 1: Unix scripting

Due: Friday, Feb 3 2012, before midnight Utah time

In this project, you will construct scripts to do the following:

  1. Find the five folders in the current directory consuming the most space.
  2. Report duplicate MP3s (by file contents, not file name) on a computer.
  3. Take a list of names whose first and last names have been lower-cased, and properly recapitalize them.
  4. Find all words in English that have x as their second letter, and n as their second-to-last.
  5. Apply rot13 encoding/decoding to standard in.
  6. Replace all spaces in a filename with underscore for a given directory.
  7. Computes the lines common to two files.
  8. Generates a password hash file from a dictionary.
  9. Attempts to crack a password database whose passwords were hashed but not salted using the output of the previous program.
  10. Validates a username and password against a database whose passwords have been properly salted and hashed.

Recommended tools: bash or zsh, md5sum, diff, sort, uniq, cut, sed, awk, du.

To get the assignment spec, ssh into winterfell.ucombinator.org.

Click here request an account.

Once logged in, run get-p1 to git clone the stub project into $HOME/p1.

Complete the scripts provided in the scripts.

Run make test to run a simple batch of tests.

To turn in your project, have the most recent version of your project in $HOME/p1 when the due date hits.

My turn-in script will pull all repositories at that time.

Project 2: Implementing regex

Due: Friday, 2 March 2012, 11:59:59pm.

In this project, you will parse and evaluate regular expressions.

The recommended technique for this is Brzozowski's derivative.

You will be implementing this project in JavaScript.

There is a template/demo for this project available.

The EBNF grammar of regular expressions is:

    
    regex  ::= term { termop term }
    termop ::= '|' | '&'
    term   ::= { factor }
    factor ::= base { '*' }
    base   ::= [ '~' ] atom
    atom   ::= '\' char
            |  char
            |  '(' regex ')'
   

In this dialect, the operator ~ is complement (or negation) and the operator & is intersection (or and).

Requirements

To complete the assignment, modify the template to use your own regular expression implementation instead of the internal JavaScript implementation.

Matt will grade the assignment with a manual read of your code plus a suite of test inputs.

To aid the suite testing, your page should export the function DoesMatch(regex,string), returning true for an exact match and false otherwise.

Turnin

To turn in this project, submit a URL from your public html directory on winterfell to this file.

To prevent your assignment from being copied, place it it the subdirectory ~/public_html/p2/

Then, disable public reads of this directory, while allowing public execution:

    chmod o-r $HOME/public_html/p2
    chmod o+x $HOME/public_html
   

Directories with incorrect permissions set will receive an automatic 0.

Rename template.html to a random filename (ending in .html), e.g., 1701.html

The filename must have at least 20 characters.

Then, email Matt the link:

    http://winterfell.ucombinator.org/~username/p2/num.html
   

Pointers

For an implementation of regular expressions with derivatives in Java, see the slides on lexing.

For an implementation in Scheme, with an explanation of the theory, see this post.

The recursive descent parsing examples from class are also available.

Project 3: Web scripting

In this project, you will write an email analytics and automation web application that will:

After the analysis, your application will then perform the following on each item in the inbox:

Recommended tools: cron, Perl, apache, MySQL, PHP, JavaScript, CSS, HTML.

Project 4: A web-based calculator

In this project, you will create an online calculator capable of arithmetic, as well as simple operations from calculus like symbolic integration and differentiation.

Project 5: An IRC bot

In this project, you will create an IRC bot that connects to an IRC server, joins a room and then interacts with users according to a simple natural-language command set.

Final Project: Implementing an interpreter

In this project, you will implement a web-based interpreter for μScheme, a tiny subset of the Scheme language.

You can do this entirely client-side in JavaScript, or you can implement it server-side in the language of your choice.