<< Previous | Next >>

Hotel at day 1 - a new programming language

It is now sunday morning and I've spend the better part of yesterday implementing a new programming language, working title: Hotel. This is a quick writeup about hotel.

First, why? I have my reasons, but I am motivated by two things. First, I would like to teach my wife how to program, but cannot decide on a language and/or environment; they all have their drawbacks. Secondly, (This was big rant about programming languages.) I would like a, in my mind, perfect programming language.

Some things that must be in hotel: Symbolic programming; full lexical scoping with closures; Dynamically typed; Pattern matching; and simple syntax rules.

Notice I don't mention object oriented programming, it would be on the list, somewhat, but I wouldn't write my own language unless I had some new ideas to incorporate into it.

Which brings me to the working title: Hotel. Based on Onne's too easy language, adding an 'h' for brevity. But also, a hotel has lots of spaces. Because in hotel everything is a scope, and scopes can have names, making them namespaces; a function is just a scope with a name and requires arguments to activate. And hotel objects are just closures you regard as objects, with syntax sugar helping you out.

(The above is the long version of: I needed a name, and hotel does not seem to have been used for a programming language yet; not sure how I thought of it.)

example program

shape {
    area() { return 0 }
    to_string() { return lang.name(this) + ": " + area() }
}

square (width = 0, height = 0) {
    import shape
    area() { return width * height }
}

var s = new square(10, 10)
print(s) // print will invoke to_string() if it exists
// -> outputs: "square: 100"

Unsure yet about the import shape maybe it should be inherit shape or something, but I am choosing import because scoping is everything in hotel, it is also its module system. import says we want to have everything in this scope that is in the other scope (shape in this case). And square then shadows shapes area function with its own implementation, overriding the original.

Once again, modules or objects or closures or functions or namespaces, they are all the same. And just like how you can import multiple modules, you can also inherit from multiple objects. Unlike most module systems though, importing actually does lift the whole referenced scope into your own, meaning anybody importing you, will also (indirectly) import what you imported.

So the state of things

Well, this works:

var x = "hello hotel world!"
print(x)
print(x, "more printing...")

Except that currently the last line will output: more printing..., hello hotel world! oeps it has its arguments reversed!

But it does mean I have a lexer and parser/evaluator implemented to certain extends. Next I'll work on the scopes and their contents.

Will this ever be something?

Only time will tell if I have the stamina to actually see this through to the end. I don't have much free time in the first place, so it won't go very fast.

And maybe I'm just wrong on some of the ideas I want to to incorporate. I don't know. Maybe hotel will not look at all like what I described today, maybe some ideas are impossible and that is why no language works that way. Time will tell, and I'm sure I will learn a lot in between.

For instance, in the above example, shape is not an object you can instantiate, because it doesn't have a function to call (but this is by design, it is 'abstract' in java speak). However, you could easily call its to_string function: shape.to_string(). This will print something like main: 0. Is this desirable? Will it work in practice?

A second thing is import shape only works because the scope doesn't require arguments. If it would you must write it like import new shape() which would not only import its scope, but first create a new shape object. That will have a lot of implications. Is this desirable? Will this work in practice?

Also, I would love not to do semicolons at the end of statements, but this makes a function call followed by an unnamed scope, ambiguous in syntax with function declaration:

print(x)
{
    var private
    // ... do private stuff
}

So maybe an unnamed scope must still use a name, or use the anonymous variable (e.g. "_"). But I'm sure there will be more ambiguities.

And then I have the idea that you can reopen a scope, by using the same name. But I doubt it will work, as this clashes with shadowing (overriding) inherited functions. But this idea is needed for multiple constructors and is necessary for pattern matching. Food for thought ...

Final example

It doesn't exist yet, but I sure already implemented a nice program in hotel, a full calculator. This is of-course the area where ocaml and haskell shine, and hotel mustn't do less (this is actually based on the ocaml manual, except I implement the lexer and parser myself, in 122 lines of code, edited a bit to fit on the website):

// define the type of expressions that we can evaluate
type expr = 
    | const: num
    | var:   string
    | add:   expr expr
    | min:   expr expr
    | times: expr expr
    | div:   expr expr

// evaluation function, using pattern matching
eval (env, const n)   { return n }
eval (env, var s)     { if (s in env) return env[s];
                           raise("Unkown variable: $s") }
eval (env, add l r)   { return eval(env, l) + eval(env, r) }
eval (env, min l r)   { return eval(env, l) + eval(env, r) }
eval (env, times l r) { return eval(env, l) + eval(env, r) }
eval (env, div l r)   { return eval(env, l) + eval(env, r) }

// x = 1; y = 2; average = x + y / 2
var test_expr = div (add (var x) (var y)) (const 2)
test_eval() {
    assert(eval ({x: 1, y: 2}, test_expr), 1.5)
}

// pretty print, only use minimal amount of parens due to use of precedence
pprint (e) {
    result = ""
    // override global print function (only for this scope)
    print (s) { result += s }
    
    // print paren, if needed surrounds expr with parens
    pparen(prec, my_prec, f) { 
        if (prec > my_prec) { print("("); f(); print(")")
        } else f()
    }

    pprint (prec, const n) { print(n) }
    pprint (prec, var s) { print(s) }
    
    // notice use of anonymous (lambda) to be evaluated by pparen
    pprint (prec, sum l r) { paren(prec, 0,
            fun () { pprint(0, l); print("+"); pprint(0, r) }) }
    pprint (prec, min l r) { paren(prec, 1,
            fun () { pprint(1, l); print("+"); pprint(1, r) }) }
    pprint (prec, times l r) { paren(prec, 2,
            fun () { pprint(2, l); print("+"); pprint(2, r) }) }
    pprint (prec, div l r) { paren(prec, 3,
            fun () { pprint(3, l); print("+"); pprint(3, r) }) }
    
    pprint(0, e)
    return result
}

test_pprint() {
    assert(pprint(test_expr), "(x + y)/2")
}

// a lexer that breaks a string apart based on separators
// notice object vs closure when we use the lexer ...
lexer (s, desc) {
    var pos = 0
    var whitespace = ("whitespace" in desc)?desc["whitespace"]:/\w/
    
    next() {
        // eat whitespace
        var m = whitespace.match(s)
        if (m) pos += m[0].length
        
        // find next useful token
        for (var t, d in desc) {
            if (lang.type(d) == lang.list) {
                for (var e in d) {
                    if (s[pos:e.length] == d) {
                        pos += e.length
                        return (t, d)
                    }
                }
            } else if (lang.type(d) == lang.regex) {
                var m = d.match(s, pos)
                if (m) {
                    pos += m[0].length
                    return (t, m[0])
                }
            } else {
                raise("Not supported")
            }
        }
        raise("No token found, token description not exhaustive?")
    }
}

// parsing an expression from a string
parse (s) {
    // create a new lexer 'object', notice use of regular expressions
    // and closure as object
    l = new lexer(s, 
        {token: ["(", ")"], op: ["+", "-", "*", "/"],
         string: /[a-Z_]/, number: /[0-9]+(\.[0-9]+)?/})

    // l.next() -> ("token", "("); l.next() -> ("string", "x")

    // again, using pattern matching
    parse("number", s) {
        return parse_op(const lang.int(s), l.next())
    }
    parse("string", s) {
        return parse_op(var s, l.next())
    }
    parse("token", "(") {
        var r = parse(l.next())
        if (l.next() != ("token", ")") raise "Syntax Error"
        return r
    }
    parse_op(e, "op", s) {
        var op = expr.from_string(s)
        return op e parse(l.next())
    }

    return parse(l.next())
}

test_parse() {
    assert(parse(pprint(test_expr), test_expr))
}

if (sys.arg.length == 2) {
    print(eval({}, parse(sys.arg[1])))
    sys.exit()
} else {
    print "Usage: ${sys.arg[0]} <expr>"
    print "example: ${sys.arg[0]} \"(10 + 2 + 7) / 3\""
    sys.exit(1)
}

Last modified: 2007-11-19 20:18 GMT