The Case For the ARM 16" MacBook Pro

This isn’t* a prediction, it’s a justification: the 16” MacBook Pro should be the first Apple notebook with an ARM-based processor.

Apple’s latest ARM-based SOC (the A13 Bionic) has a higher single-core Geekbench score than any Mac ever made. If Apple is going to release a notebook-class ARM SOC, they will want to blow away Intel on benchmarks. The next-generation SOC (we’ll assume it’s the A14) is guaranteed to out-perform Intel chips on single-core performance, and all but guaranteed to out-perform Intel on multi-core performance in notebook processors.

Imagine putting the best-performing Mac notebook processor ever made in a MacBook Air. It would make an amazing MacBook Air, no doubt, but it would ruin the rest of the product line. Why would anyone spend more money on a 13” MacBook Pro? At least with a 16” MacBook Pro you get a larger screen, but every “pro” person buying the 16” is going to cringe when they think about how much faster a MacBook Air could compile their project, render their video, or de-noise their podcast.

If it’s not the MacBook Air, it has to be the 13” MacBook Pro or the 16” MacBook Pro.

The 13” MacBook Pro was updated May 2020. This doesn’t prevent it from getting another update soon, but the rumors suggest a late 2020, or early 2021 release of the first ARM notebook. The 16” was last updated November 2019 and has been (including the 15” version) getting updates roughly once per year. One year after November 2019 lines up nicely with the current rumors.

I’m a bettin’ man, so I’ll wager we’ll see ARM Pro notebooks before we see an ARM MacBook Air.

*If the 16” is the first ARM MacBook, I reserve the right to retroactively call this a prediction.

Case Expressions

Case matching in Elixir gives you a few different ways to check the type of a value. Here we check if val is a map or list.

def foo(val) do
  case val do
    %{} = map ->
      map |> Map.keys() |> Enum.join(",")

    list when is_list(list) ->
      Enum.join(list, ",")

    _ ->
      ""
  end
end

Erie needs some syntactic way to allow matching in a case expression like this. One option is to compel Elixir’s match operator (=) to work in prefix notation.

(def foo [val]
  (case val
    [(= %{} map) (-> map (Map.keys) (Enum.join ","))
     (= [] list) (Enum.join list ",")
     _ ""]))

I don’t like this (and this applies to Elixir as well) because %{} matches on all maps, regardless of if they’re empty or not, but [] only matches on empty lists. I’d rather match on explicit Erie type names to accomplish this.

(def foo [val]
  (case val
    [(= Map map) …
     (= List list) …
     _ …]))

This could potentially work, but Erie’s union types can make it impossible for the compiler to handle. In this next example, how will the compiler know if list is a (List String) or a (List Integer)? toUpper can only operate on Strings, so the compiler should fail on the last line.

(deftype EitherList []
  (union [(List String) (List Integer)]))

(sig foo [EitherList] (List String))
(def foo [val]
  (case val
    [(= List list) (Enum.map list toUpper)]))

Matching on the entire type name should fix that problem.

(deftype EitherList []
  (union [(List String) (List Integer)]))

(sig foo [EitherList] (List String))
(def foo [val]
  (case val
    [(= (List String) list) (Enum.map list toUpper)
     (= (List Integer) list) (Enum.map list toString)]))

What about when you want undefined polymorphic type parameters? Should this type of matching be allowed? Exploring this in the next example, the first match, on (List String), seems like it should be compilable because it returns the exact same type it operates on. But the second match, on (List Integer) doesn’t seem like it should compile. It returns a different type than it matches on, and how can the compiler know if that fits the EitherList pattern?

(deftype EitherList [a b]
  (union [(List a) (List b)]))

(sig foo [(EitherList a b)] (EitherList a b))
(def foo [val]
  (case val
    [(= (List String) list) (Enum.map list toUpper)
     (= (List Integer) list) (Enum.map list toString)
     (= (List a) list) list]))

For now I’ll disallow matching on the type parameter. I’m not sure it’s even technically possible to ever accomplish so I’m happy to hold off.


That should cover matching on types, but what about values? Elixir’s ability to match on values is also worth copying.

def foo(val) do
  case val do
    [] -> "empty"
    [x] -> "one element"
    [x|_] -> "multiple elements"
  end
end

I don’t really like the [x|_] pattern because it feels too infix-y for a lisp, but I’ll leave it in the examples. It’s likely to change before making its way into Erie for good.

(sig foo [(List String)] String)
(def foo [val]
  (case val
    [[] "empty"
     [x] "one element"
     [x|_] "multiple elements"]))

This should combine well with union type matching. It’s a little dense, but I think it’s still readable.

(deftype EitherList []
  (union [(List String) (List Integer)]))

(sig foo [EitherList] String)
(def foo [val]
  (case val
    [(= (List String) [x]) "one element string list"
     (= (List Integer) []) "empty integer list"
     (= (List String) list) "empty or mulitple value string list"
     _ "anything else"]))

The one problem that still needs exploring is how to get the code to match on the right branch.

(deftype EitherList []
  (union [(List String) (List Integer)]))

(sig foo [EitherList] String)
(def foo [val]
  (case val
    [(= (List String) []) "empty string list"
     (= (List Integer) []) "empty integer list"
     _ "anything else"])

Once this type checks, the compiled Erlang Abstract Form will know that val is a list, but if it’s empty, there won’t be any other type information to glean from it and help with matching. I’m not sure exactly how to fix this, but it’s something to keep thinking about.

Type Expansion

Code for this post is available at 651548ffa8.

To confirm that a function type-checks correctly requires inferring the type of the return value and then checking it matches with the type signature. A return value can either be a function parameter, a return value of an inner expression, or a literal.

(sig return_value_as_parameter [Integer] Integer)
(def return_value_as_parameter [x]
  x)

(sig return_value_from_inner_expression [] Integer)
(def return_value_from_inner_expression []
  (* 3 (+ 1 1)))

(sig return_value_as_literal [] Integer)
(def return_value_as_literal [] Integer
  0)

In each of these cases, we can easily infer the type of the return value is Integer and that it matches with the signature. Things get a bit more complicated when you introduce algebraic data types.

(deftype IntOrString []
  (union [Integer String]))

(sig foo [] IntOrString)
(def foo []
  1)

Now it’s not as simple to compare the inferred type of Integer with the signature’s type of IntOrString. Instead of an exact Integer to Integer comparision, we need to expand1 IntOrString into a list of its possible types, and then compare Integer to each of those until we find a match. If the return type is composed of several other union types, we need to expand recursively to get all possible types.

Recursive expansion does expose the possiblity of an infitie recursive type, like those that define lists and trees. The current version of the Erie compiler will infitie loop when it finds a type like this; fixing that will come later.

(deftype RecursiveList [a]
  (union [{a nil} {a (RecursiveList a)}]))

; instance of (RecursiveList Integer) might look like
; {4 {3 {2 {1 nil}}}}

(deftype RecursiveTree [a]
  (union [{'leaf a} {'left (RecursiveTree a) 'right (RecursiveTree a)}]))

; instance of (RecursiveTree Integer) might look like
; {'left {'leaf 1} 'right {'left {'leaf 2} 'right {'leaf 3}}}

As of today, the Erie compiler is about as naive as it gets when it comes to type expansion. It all happens just-in-time and nothing is cached. This means every time the compiler encounters a signature with a return value of IntOrString, it will completely expand it and check, then throw everything away. This is no doubt the slowest way to do type checking, but it was the simplest way for now.

  1. I'm using the phrase "type expansion", but I don't know if that is the phrase favored by compiler authors. My googling efforts have been fruitless. It makes sense to me so I'm sticking with it.

Algebraic Data Types

Code for this post is available at 651548ffa8.

The builtin types are all well and good, but Erie isn’t much without a few additions: maps/dictionaries, structs, and especially alegbraic data types (ADT). Erie will eventually support all of these, but I focused on implementing ADTs first. I refer to them as “union” types throughout the compiler, but they are one in the same.

I want all Erie types to be declared with deftype, regardless of the ultimate shape of the type (i.e. ADT, struct, or type alias). By making the signature of deftype look like a function, it makes adding type parameters straight forward for supporting polymorphic types.

(deftype Maybe [a]
  (union [a 'nil]))

One big difference between Erie’s ADTs and those of other langauges that I’m familiar with is Erie doesn’t required “tags” for each option. For example, in Elm, representing Maybe looks like:

type Maybe a
  = Just a
  | Nothing

Note how the non-Nothing case requires the Just tag. This means a value of type String does not conform to the Maybe String type in Elm. In Erie, String will conform to (Maybe String) because of the lack of tags. I’m not aware (at least not yet) of why tags are necessary. I may run into the reason later in development, but for now, it seems to be working. If you, as a developer, like the tagged approach you can mimic it with tuples in the definition.

(deftype Result [error value]
  (union [{'ok value} {'error error}]))

Erie treats a quoted symbol as a type with a single value of itself. It allows for the above which will integrate nicely with Erlang and Elixir.

In the future I want deftype to be a macro that generates useful functions regarding that type. Specifically allowing matching in a case expression. For now deftype is treated as a special form by the compiler. As I continue to expand the functionality of macros, a large amount of that work should shift out of the compiler and into Erie macros.

Type Checker

Code for this post is available at ed4ee0b78.

Erie now has a very basic type checker. It supports simple builtin types like Integer and String, and the slightly more complex polymorphic list and tuple types. User-defined types are not supported yet.

Testing

The type checker has provided a pretty amazing opportunity for test-driven development. The type checker code has so many paths and branches that having a complete test suite helps prevent me from introducing regression bugs.

I have also found assert_raise to be indespensible. A proper type checker would collect errors and print them all at the end of compilation, but for now, it raises errors when the type checker finds a mismatch.

test "literal list type fails with mismatched types" do
  code = """
  (sig one [] [Integer])
  (def one []
    [1 2 "3" 4])
  """

  {:ok, forms} = Parser.parse(code)
  translator = Translator.from_parsed(forms, {:Core, 1}, false)

  assert_raise RuntimeError, fn ->
    TypeChecker.check(translator)
  end
end

Error Messages

When writing a type checker, one should aspire to the level of detail and helpfulness in Elm type checker messages. The errors raised by the Erie type checker right now are just good enough to help me debug any example Erie code I write, but definitely not good enough to live in the compiler long-term.

def expression_type({:var, _, name}, bindings, _signatures) do
  bindings
  |> Enum.find(fn {n, _} -> n == name end)
  |> case do
    nil -> raise "Returning a binding `#{name}` that isn't a function parameter"
    {_, type} -> type
  end
end

As much as I’d like to add great error messages now, that’s a major project in and of itself.

REPL Start

Code for this post is available at 2734fc099d.

A basic REPL will go a long way toward making Erie feel like a true Lisp. It doesn’t need to be fancy and support helper functions, tab-completion, or even the “up arrow” to redo the last expression. All we want (for now) is to read an expression, evaluate it on the fly, print the result, and repeat.

I was able to put together something that reads input and starts a new Task to compile and evaluate it. But this isn’t terribly useful because Erie is missing a standard library. The only functions that can be called are Elixir or Erlang functions (but they work!).

Core(1)> (Elixir.String.split "abc def ghi")
["abc", "def", "ghi"]
Core(2)> ('erlang.atom_to_binary 'derp 'unicode)
"derp"

This is a nice start but defining our own functions in the REPL and calling them would be better.

In Elixir’s REPL, you can do this, but not using def … do … end (unless it’s wrapped in defmodule … do … end). That’s a bit obnoxious, so I want my REPL to handle (def …) gracefully. I do that by stealing a feature of Clojure’s REPL where you’re effectively “in” a module while in the REPL, so any function defined is a part of that module.

I want to make as much use of my Translator module as possible, but as it’s currently written, it requires all function definitions to be inside a module (now I see why Eilixr may have made that decision). To get around this, I wrap each call to (def …) in a new module thats name is based on a counter. For example, this defintion of x in the REPL

Core(1)> (def x [] 1)

would be compiled as if I wrote

(defmodule Core1) (def x [] 1)

And this defintion of y

Core(2)> (def y [] 2)

would be compiled as if I wrote

(defmodule Core2) (def y [] 2)

This works great, but now calling x the intended way produces an error because it’s not defined in the Core module.

Core(1)> (def x [] 1)
Core(2)> (Core.x)

20:27:31.177 [error] Task #PID<0.143.0> started from Erie.Repl terminating
** (UndefinedFunctionError) function :Core.x/0 is undefined (module :Core is not available)

To solve this, we can keep track of all functions defined in the REPL and their associated modules. If one of them is invoked under the Core module, we can intercept the call and translate it to the correct module.

The REPL is by no means complete, but it feels good to be able to write some Erie and see it execute in realtime.

Macro and Other Updates

Code for this post is available at 0646b91480.

Erie can now handle some basic macros. Very basic. It only does macro expansion once, and it doesn’t correctly call Erie definied functions from within a macro.

Despite those limitations it can correctly compile this module. However, cons and + aren’t properly defined as functions, so executing either one_plus_two or the_first will fail.

(defmodule Core)

(defmacro infix [ast]
  ((lambda [list]
    [(Elixir.Enum.at list 1) (Elixir.Enum.at list 0) (Elixir.Enum.at list 2)])
    (Elixir.Enum.at ast 0)))

(def one_plus_two []
  (infix (1 + 2)))

(defmacro a_list [ast]
  ['cons 1 nil])

(def the_first []
  (Elixir.List.first (a_list nil)))

I’m pretty happy with the progress I’ve made so far on Erie, but there is obviously a lot more to do. Some possible next steps are:

  • REPL
  • Types
  • Standard library
  • More advanced macros
  • Syntax highlighting

More advanced macros can definitely wait. A standard library will be very handy, but it’s not preventing me from writing Erie right now. Types are going to be very important soon, but I think a REPL will have the biggest impact. The ability to write small bits of Erie and execute it immediately will help shape the ergonomics and dictate what should be included in the standard library.

Macro Plan

No Lisp is complete without a macro system. If only it were so easy to implement.

Having never written a language before, let alone a macro system, I’m going to try to break it down into manageable pieces. I have no idea if this is the right approach, but we’ll start with it and see where it takes us.

The steps I’m envisioning are

  1. Parse the code.
  2. Find instances of defmacro.
  3. Find invocations of each of those macros.
  4. Convert the output of the parser into an Erie abstract syntax tree.
  5. Execute each macro with the AST as the argument.
  6. Support “unquoting” somehow.
  7. Replace the macros in the AST with their results.
  8. Repeat until all macros have been expanded.

The simplest macro I can think of is comment: replace the inner forms with a single nil.

(defmacro comment [ast]
  nil)

(def loop []
  (comment
    (Enum.map [1 2 3]
      (lambda [x]
        (+ 1 x)))))

Step One

After parsing, we have this.

[
  [
    {:atom, 1, :defmacro},
    {:atom, 1, :comment},
    {:list, 1, [{:atom, 1, :ast}]},
    {:atom, 2, nil}
  ],
  [
    {:atom, 4, :def},
    {:atom, 4, :loop},
    {:list, 4, []},
    [
      {:atom, 5, :comment},
      [
        {:symbol, 6, :"Enum.map"},
        {:list, 6, [{:integer, 6, 1}, {:integer, 6, 2}, {:integer, 6, 3}]},
        [
          {:atom, 7, :lambda},
          {:list, 7, [{:atom, 7, :x}]},
          [{:atom, 8, :+}, {:integer, 8, 1}, {:atom, 8, :x}]
        ]
      ]
    ]
  ]
]

Step Two

We can find instances of defmacro just like we find instances of def. In this case we only find one: comment.

Step Three

We find inovcations of comment by looking through the results of the parser for anything that looks like a function call to comment.

Step Four

Converting the output of the parser into an Erie abstract syntax tree will look something like this.

[
  [:defmacro, :comment, [:ast], nil],
  [:def, :loop, [],
    [:comment, [:"Enum.map", [1, 2, 3], [:lambda, [:x], [:+, 1, :x]]]]
  ]
]

Step Five

To execute the comment macro, we need to complie it as if it were a normal function. That is possible with :compile.forms/1, but as far as I know, it needs to be contained in a module.

Current strategy is to define all macros as functions in their “macro” module which will be prefixed with MACRO. E.g. the module Erie.Core’s macros will be defined as functions in MACRO.Erie.Core.

Step Six

TBD.

Step Seven

The result of the call to comment is nil, so we can replace that list in the parser output with nil. We’ll also remove the macro definition which should leave us with this.

[
  [
    {:atom, 4, :def},
    {:atom, 4, :loop},
    {:list, 4, []},
    {:atom, 5, nil}
  ]
]

Step Eight

In this case nothing. comment doesn’t need any further expansion, so we can now continue the pipeline and pass this to the translator stage which will produce Erlang Abstract Form for final compilation.

Proper let Syntax

Code for this post is available at e6c19d930f.

Most of the basic features of Erie are working, but I am still missing let bindings. I looked for inspiration in the syntax of other Lisps and found a couple of options. Common Lisp and Racket are techincally the same because Racket treats [] as (), but I included both because Racket often uses [] to make certain forms stand out.

None of the styles strike me as particularly better. Clojure has fewer characters, but I like the way CL and Racket group the binding and the value more explicitly.

Clojure

(let [a 1 b 2]
  (+ a b))

Common Lisp

(let ((a 1) (b 2))
  (+ a b))

Racket

(let ([a 1]
      [b 2])
  (+ a b))

Regardless of the syntax, implmenting let should be straight forward. Erlang Abstract Format supports match which is very similar. In Elixir

x = 42

is represented as

{:match, 1, {:var, 1, :x}, {:integer, 1, 42}}

in EAF and could be used to represent let bindings.

But before implementing that, I found this answer on StackOverflow and it revealed a new way to look at let. To sum it up, let doesn’t need to be implemented in the compiler. It can be a macro that converts its forms into a tree of nested lambdas. (Samples are using the currently proposed Erie let syntax.)

(let
  [a 1]
  [b 2]
  (+ a b))

is equivalent to

((lambda [a]
  ((lambda [b]
    (+ a b)) 2)) 1)

So instead of adding let as a feature in the compiler, I’ll save the syntax decision for when macros are implemented.

Erie's Function Signatures

Code for this post is available at 9e7d81b810.

I had written previously that a function in Erie might look like this. After the function name is the list of arguments and their types, followed by the return type, a doc comment, and finally the function body.

(def to_string [result: (Result String Integer)] String
  """
  Convert to result into a printable string to
  be used in interfaces.
  """
  (case result
    [{'ok int} (Integer.to_string int)]
    [{'error str} str]))

After further deliberation, I realized including the argument and return types in the definition makes things a little bit awkward if I want to support pattern matching on the arguments. Not only does it make the types redundant, but the colon syntax gets weird.

(def to_string [{'ok int}: (Result String Integer)] String
  …)

(def to_string [{'error ""}: (Result String Integer)] String
  …)

(def to_string [fallthrough: (Result String Integer)] String
  …)

Including the comment in the function also gets strange when there are multiple clauses for this function. Which of the clauses should the comment be written under?

My latest thinking is to move the types and doc comment out of the definition and into their own lines. Perhaps something like this.

(doc to_string
  """
  Convert a result type into a printable
  string to be used in interfaces.
  """)
(sig to_string (Result String Integer) String)
(def to_string [{'ok int}]
  …)

(def to_string [{'error ""}]
  …)

(def to_string [fallthrough]
  …)

Now the doc and sig lines could be technically written anywhere in the module, but the formatter can move them to the correct location.

Archive