Source code to AST diagram

Making a Programming Language

A step-by-step guide to building your own programming language in TypeScript — covering lexing, parsing, evaluation, variables, math, arrays, and more.

May 18, 2026#typescript#compilers

Have you ever looked at a piece of code and wondered: how does the computer actually know what this means? I don't mean how it compiles or optimizes. I mean the very first step. How does text become instructions?

The core ideas are surprisingly simple. There are only a few pieces: a lexer that turns text into tokens, a parser that builds structure out of those tokens, and an evaluator that runs that structure. Everything else (variables, functions, conditionals, even types) is just patterns built on top.

In this series, we'll build a working programming language called Spark using TypeScript. By the end, you'll have an actual interpreter you can run code in, right here in your browser.

Choosing a Syntax

Before writing any code, we need to decide what our language looks like. I wanted something that feels familiar but has its own character, not just another JavaScript clone.

Here's what Spark looks like:

1val name = "Spark"
2val version = 1
3val features = ["lexer", "parser", "eval"]
4
5when (version > 0) {
6  say("ready to go")
7}
8
9func add(a, b) {
10  return a + b
11}
12
13say(add(3, 4))

The design choices:

  • val instead of let or const (borrowed from Kotlin and Swift)
  • func instead of fn or function (explicit and readable)
  • say instead of print (Spark speaks to you)
  • when instead of if (reads more like natural language)
  • yes and no instead of true and false (plain English)
  • Curly braces for blocks, no semicolons, newlines separate statements

The Pipeline

Every language interpreter follows the same pipeline:

1source code → lexer → tokens → parser → AST → evaluator → output

An analogy: imagine explaining a recipe to someone. The lexer is like recognizing individual words. The parser is understanding the grammar ("chop the onions" is a verb+noun pair). The evaluator is actually performing the actions.

Token Types

The lexer and parser communicate through tokens. Each token has a type, a string value, and a source position for error reporting:

types.ts
1export enum TokenType {
2  Number, String, Identifier,
3  Let, If, Else, Fn, Return, Print,
4  True, False,
5  Plus, Minus, Star, Slash,
6  Eq, EqEq, Bang, BangEq,
7  Lt, Gt, LtEq, GtEq,
8  LParen, RParen, LBrace, RBrace,
9  LBracket, RBracket,
10  Comma, Arrow, DotDot,
11  EOF,
12}
13
14export interface Token {
15  type: TokenType;
16  value: string;
17  line: number;
18  col: number;
19}

Every language construct becomes a TokenType enum member. The line and col fields let us report exactly where errors occur.

Error Classes

When something goes wrong, we need to point the programmer to the exact location:

types.ts
1export class SparkError extends Error {
2  line: number;
3  col: number;
4
5  constructor(message: string, line: number, col: number) {
6    super(message);
7    this.name = "SparkError";
8    this.line = line;
9    this.col = col;
10  }
11
12  format(source: string): string {
13    const lines = source.split("\n");
14    const lineStr = lines[this.line - 1] ?? "";
15    const pointer = " ".repeat(Math.max(0, this.col - 1)) + "^";
16    return `${this.message}\n  at ${this.line}:${this.col}\n  ${lineStr}\n  ${pointer}`;
17  }
18}
19
20export class ParseError extends SparkError {}
21export class RuntimeError extends SparkError {}

SparkError stores the position and can render a formatted message with a caret pointing at the problem. ParseError and RuntimeError extend it with no extra logic. They just carry different names so callers can distinguish syntax errors from runtime errors.

Statement Nodes

A Spark program is a list of statements. Each statement node is a discriminated union with a kind field:

types.ts
1export type Statement =
2  | Program | LetStatement | IfStatement
3  | FunctionDeclaration | ReturnStatement
4  | ExpressionStatement;
5
6export interface Program {
7  kind: "Program";
8  body: Statement[];
9}
10
11export interface LetStatement {
12  kind: "LetStatement";
13  name: string;
14  value: Expression;
15}
16
17export interface IfStatement {
18  kind: "IfStatement";
19  condition: Expression;
20  consequent: Statement[];
21  alternate: Statement[] | null;
22}
23
24export interface FunctionDeclaration {
25  kind: "FunctionDeclaration";
26  name: string;
27  params: string[];
28  body: Statement[];
29}
30
31export interface ReturnStatement {
32  kind: "ReturnStatement";
33  value: Expression | null;
34}
35
36export interface ExpressionStatement {
37  kind: "ExpressionStatement";
38  expression: Expression;
39}

Program is the root node containing all top-level statements. IfStatement carries both branches, and FunctionDeclaration stores its parameters and body separately.

Expression Nodes

Expressions produce values. They follow the same discriminated union pattern:

types.ts
1export type Expression =
2  | BinaryExpression | Identifier
3  | NumberLiteral | StringLiteral | BoolLiteral
4  | ArrayLiteral | IndexExpression
5  | CallExpression | Assignment;
6
7export interface BinaryExpression {
8  kind: "BinaryExpression";
9  left: Expression;
10  operator: string;
11  right: Expression;
12}
13
14export interface Identifier {
15  kind: "Identifier";
16  name: string;
17}
18
19export interface NumberLiteral {
20  kind: "NumberLiteral";
21  value: number;
22}
23
24export interface StringLiteral {
25  kind: "StringLiteral";
26  value: string;
27}
28
29export interface BoolLiteral {
30  kind: "BoolLiteral";
31  value: boolean;
32}
33
34export interface ArrayLiteral {
35  kind: "ArrayLiteral";
36  elements: Expression[];
37}

The operator field on BinaryExpression is a string like "+", "-", "==", or ".." for ranges. This keeps the parser simple: it just records the operator text and lets the evaluator decide what each operator means.

Indexing, Calls, and Assignment

After an expression, you can index into it, call it, or assign to it:

types.ts
1export interface IndexExpression {
2  kind: "IndexExpression";
3  array: Expression;
4  index: Expression;
5}
6
7export interface CallExpression {
8  kind: "CallExpression";
9  callee: Expression;
10  args: Expression[];
11}
12
13export interface Assignment {
14  kind: "Assignment";
15  name: string;
16  value: Expression;
17}

These three node types handle arr[0], foo(), and x = 5 respectively. They're parsed as suffixes to an existing expression, which naturally handles chaining like getArr()[0].

Runtime Values

The evaluator doesn't work with AST nodes. It works with runtime values. Every value in Spark is one of these:

types.ts
1export type RuntimeValue =
2  | { type: "number";  value: number }
3  | { type: "string";  value: string }
4  | { type: "boolean"; value: boolean }
5  | { type: "array";   value: RuntimeValue[] }
6  | { type: "function"; name: string; params: string[];
7      body: Statement[]; closure: Environment }
8  | { type: "native";  name: string;
9      fn: (args: RuntimeValue[]) => RuntimeValue }
10  | { type: "null" };

Functions are first-class values. A user-defined function stores its AST (body), parameters (params), and the scope where it was defined (closure). Native functions are TypeScript callbacks. say is implemented this way.

Environments

Variables live in environments, which form a linked list for lexical scoping:

types.ts
1export interface Environment {
2  variables: Map<string, RuntimeValue>;
3  parent: Environment | null;
4}

When you reference a variable, the evaluator checks the current scope, then walks up through parents. A function's closure is the environment at the point of definition. That is what makes closures work.

With the types in place, we're ready to build the first stage of the pipeline. In the next part, we'll write the lexer, the component that turns raw text into a stream of tokens.