This homework covers material from Lecture 5: Foundational Theories of Computation and Lecture 6: Classifying Programming Languages. You will explore the four pillars of computer science, discover new computational models, reason about the limits of computation, and analyze how pragmatic design choices shape programming languages.
All questions require written responses. Show your work where applicable, and provide clear justifications for your answers. Several questions require independent research โ cite your sources. Partial credit will be awarded based on the quality of your reasoning.
The following four code snippets all compute the factorial of a number. Each is written in a language you likely have not used before.
APL:
factorial โ {ร/โณโต}
Forth:
: factorial ( n -- n! ) 1 swap 1+ 1 do i * loop ;
Prolog:
factorial(0, 1).
factorial(N, F) :- N > 0, N1 is N - 1, factorial(N1, F1), F is N * F1.
Brainfuck:
+++++ Set cell 0 to 5 (input)
[>+>+++<<-]>>+>+<<< Setup for multiplication loop
[>[>+>+<<-]>>[<<+>>-]<<<-] Compute factorial
>>>. Output result
Choose two of these snippets. For each:
Natural language is famously ambiguous. Consider these two English sentences:
For each sentence:
The Fundamental Hypothesis of Language Theory (LN5) claims that any piece of human-usable information can be represented as a string of symbols โ but this hypothesis is unprovable.
Consider the following example of information that seems to resist symbolic capture:
A chess grandmaster's "positional intuition" โ the ability to glance at a board and immediately sense which side has the advantage, without calculating any specific lines of play.
Programming languages are not the only formal languages. Many domains use their own formal notations to represent structured information as strings.
Research one of the following formal languages (or propose your own, subject to instructor approval):
For your chosen formal language:
In LN5, we studied Turing Machines โ the most powerful computational model we know. But Turing Machines are complex: an infinite tape, a read/write head, arbitrary movement. What happens when we strip away that complexity?
In Part 2, you will discover two simpler classes of automata by researching them independently, building examples, and exploring their capabilities and limits. By the end, you will construct the complete hierarchy of computational power โ connecting machines, languages, and decidability.
In LN5, we defined Turing Machines as tape + head + states + transition function. But what if we removed the tape entirely? What if all a machine had was a finite set of states and transitions between them? In this problem, you will research the simplest class of automata, build one, discover what it cannot do, and connect it to regular expressions.
Research: What is a Finite Automaton?
Look up Deterministic Finite Automata (DFAs). A DFA has no tape and no stack โ just a finite set of states, transitions between them based on input symbols, a start state, and a set of accepting states. It reads input one symbol at a time, left to right, with no backtracking.
In your own words:
Build and Trace a DFA
Design a DFA that accepts binary strings containing an even number of 1s and rejects those with an odd number of 1s. (The empty string has zero 1s, which is even.)
10110, showing the state after each symbol is consumedDiscover a Limitation
Now consider the language of balanced parentheses: the set of strings like (), (()), (()(())), etc.
Attempt to design a DFA for this language. You will find that you cannot.
Regular Expressions and the Chomsky Connection
The regular expression (a|b)*abb describes a language โ the set of all strings over the alphabet abb.
You have just discovered that DFAs are elegant but limited โ they cannot count beyond a fixed bound. What if we gave our finite automaton a single additional data structure? In this problem, you will research Pushdown Automata, see how they overcome the DFA's limitation, discover their own limits, and build the complete hierarchy of computational power.
Research: What is a Pushdown Automaton?
Look up Pushdown Automata (PDAs). A PDA is a finite automaton augmented with a stack โ an unbounded last-in, first-out (LIFO) data structure.
In your own words:
Solving What DFAs Cannot
In Q2c, you showed that DFAs cannot recognize the language of balanced parentheses because they cannot count unbounded nesting depth.
(? What does it do when it reads )? How does it know the string is balanced when input ends?(()()), showing the stack contents after each symbol is consumed.Discovering PDA Limitations
Consider the language of strings consisting of equal numbers of a's, b's, and c's, in that order โ for example, abc, aabbcc, aaabbbccc, and so on.
Building the Complete Hierarchy
You have now encountered three machines: DFAs (Q2), PDAs (Q3), and Turing Machines (LN5). Each has strictly more computational power than the last.
In LN5, we learned that most functions are not computable โ there are more functions than programs. The proof uses the same diagonal argument from LN3's set theory. In Part 3, you will work through this argument yourself and connect the machine hierarchy you built in Part 2 to the fundamental limits of computation.
This is a guided proof. Each sub-question builds on the previous one. By the end, you will have constructed a formal argument for why most functions have no program โ using the set theory tools from LN3.
Step 1: Programs Are Countable
Using the set theory from LN3:
Step 2: Functions Are Uncountable
Now consider the set of all functions
Apply Cantor's diagonal argument (from the cardinality section of LN3):
Step 3: The Implication
Combine your results from Q4a and Q4b:
Step 4: Connection to the Halting Problem
The diagonal argument shows that uncomputable functions exist. The Halting Problem (LN5) gives us a specific, famous one.
fox(f) that does the opposite of what halts predicts โ eerily similar to how the diagonal function In Q2 you built a DFA for binary parity, and in Q3 you built a PDA for balanced parentheses. Consider these three languages:
For each language, state the weakest machine from the hierarchy you built in Q3d that can recognize it. Then explain why the next weaker machine cannot โ what specific capability does it lack?
The Halting Problem (LN5) defines a language: the set of all (program, input) pairs where the program eventually halts.
In LN4, we showed that the Simply Typed Lambda Calculus always terminates โ every well-typed expression has a normal form (strong normalization). This means its set of expressible computations is a proper subset of what a Turing Machine can do.
Defining Reduction
In complexity theory, a polynomial-time reduction from problem
Using the set theory and logic from LN2 and LN3:
A Guided Reduction: Building Structural Transformation Intuition
Consider two classic NP-complete problems on graphs:
Consider the following graph with 5 vertices and 6 edges:
Loading graph...
Looking ahead: Notice how a systematic structural transformation showed that two problems that look different are actually the same in disguise. This is the same skill you will use throughout this course: parsing transforms strings into trees, trimming a Concrete Syntax Tree to an Abstract Syntax Tree removes redundant structure while preserving meaning, and evaluating an AST into a semantic value transforms structure into computation. Each of these is a reduction โ a systematic mapping between representations that preserves what matters.
Formalizing Complexity with Our Tools
The formal tools from LN2 (logic) and LN3 (set theory) are exactly what we use to define complexity classes precisely. In this problem, we give you the formal components โ your job is to explain what they mean and assemble them correctly.
1. The formal definition of Big-O is:
This single definition uses three of our formal tools simultaneously. Identify each one and explain its role:
2. Now define the complexity class P. We give you the building blocks โ assemble them into a single formal definition using set-builder notation:
Fill in each blank and explain your choices. (Hint: you need to say there exists a machine and a polynomial such that for every input, the machine decides correctly within the polynomial bound.)
3. Define NP similarly. The key difference is that NP uses a verifier instead of a decider:
Fill in the blanks. Then explain: why does NP have an extra existential quantifier that P does not? What is being "guessed" versus "checked"?
Complexity Analysis
Consider this Python function:
def find_pair(items, target):
for i in range(len(items)):
for j in range(i + 1, len(items)):
if items[i] + items[j] == target:
return (i, j)
return None
1. Formally prove the worst-case time complexity of find_pair using the Big-O definition from Q6c. We break this into guided steps:
2. Is the problem solved by find_pair in P? Use your definition of P from Q6c. (Hint: you just proved its time complexity โ does it fit within a polynomial bound? What would the polynomial
3. Is it in NP? Describe a polynomial-time verifier: given a proposed solution (a pair of indices
In LN6, we explored how programming languages force programmers to think at different levels of abstraction โ and how choosing the right tool is a pragmatic decision about matching the level of concern to the problem domain. In Part 5, you will research real languages and apply the pragmatic principles from lecture.
For each language pair below, research both languages and perform a comparative pragmatic analysis. You should look at documentation, tutorials, and example programs in each language. Cite your sources.
Functional Pair: Erlang vs. Haskell
Both Erlang and Haskell are functional programming languages, but they were designed for very different problem domains and make very different pragmatic trade-offs.
Systems Pair: Rust vs. C++
Both Rust and C++ are systems programming languages that give the programmer fine-grained control over memory and performance. But they take fundamentally different approaches to safety and correctness.
Perform the same analysis as Q7a:
Pit of Success or Pit of Despair?
LN6 introduced the concept of pragmatic gravity โ some languages make the easiest path the correct path ("pit of success"), while others make the easiest path the dangerous path ("pit of despair").
Choose a language from the LN6 abstraction spectrum that was not already analyzed in lecture (do not choose Rust, Elm, C, JavaScript, or PHP). Argue whether your chosen language is a pit of success or a pit of despair for its primary problem domain.
Provide a specific code example that demonstrates the pragmatic gravity โ show how the language's defaults either guide the programmer toward correct code or toward error-prone code.
The Blub Paradox
Paul Graham's Blub Paradox (LN6) argues that programmers can look down the abstraction spectrum and recognize inferior tools, but cannot look up and recognize what they are missing.
Error Messages as a Pragmatic Design Dimension
LN6 argued that error messages are a kind of meta-language with their own pragmatic properties.
Find a single type error or syntax error that can occur in two different languages (or the same error reported by two different compilers/tools). Show the error message produced by each.
Compare:
| Section | Question | Points |
|---|---|---|
| Part 1: Language Theory | Q1: Exploring Forms of Expression | 15 |
| Part 2: Automata Theory | Q2: Guided Discovery โ Finite Automata | 12 |
| Q3: Guided Discovery โ PDAs + Hierarchy | 13 | |
| Part 3: Computability Theory | Q4: The Diagonal Argument (Guided Proof) | 15 |
| Q5: Machine Capabilities Across the Hierarchy | 10 | |
| Part 4: Complexity Theory | Q6: Reducibility and Formalizing Complexity | 15 |
| Part 5: Language Pragmatics | Q7: Paradigmatic Friction (Research) | 12 |
| Q8: Pragmatic Principles in the Wild | 8 | |
| Total | 100 |
This is Stage 3 of the WebAssembly project. In HW3, you reasoned about what machines can and cannot decide โ the halting problem, the Chomsky hierarchy, and the trade-off between expressiveness and guarantees. In this optional, you will implement a decidable validation algorithm for WebAssembly and understand why it is decidable.
Wasm was deliberately designed so that validation (type-checking an entire module) runs in linear time. This is not an accident โ it is a consequence of design decisions that restrict Wasm's control flow in ways that map directly to the Chomsky hierarchy you studied. You will implement the validator and then reason about what makes it work.
If you completed the HW1 optional (the type checker), you may use your own implementation here. Otherwise, the scaffolding below includes a reference type checker you can build on.
Validation โ Modules โ How entire modules are validated: checking function types, verifying imports/exports, validating function bodies.
Appendix โ Validation Algorithm โ This is the key reading. The spec provides a complete validation algorithm and proves it is equivalent to the declarative typing rules from Stage 1. The algorithm runs in linear time on the instruction sequence. Read this and understand why linear-time validation is possible.
Introduction โ Overview โ Semantic Phases โ Re-read the "Validation" paragraph: "Validation checks a number of well-formedness conditions to guarantee that the module is meaningful and safe."
Implement a function validate(module) that takes a Wasm module (as a structured object) and returns either valid or a list of validation errors. Your validator should check:
local.get x references an index that exists in the function's local declarations; every call x references a valid function index.block and if must have a matching end; the operand stack must be balanced at every join point.You may use any programming language.
Module structure โ the input to your validator:
Module = {
types: functype[] -- type declarations
funcs: Function[] -- function definitions
}
Function = {
typeIndex: int -- index into Module.types
locals: valtype[] -- declared local variables (params + locals)
body: Instruction[] -- instruction sequence
}
Reference type checker (if you did not complete Stage 1):
typecheck(instructions, context) -> instrtype | error
You may assume this function exists and works correctly. Implement it yourself, or use the pseudocode from the HW1 optional scaffolding.
Test modules:
Test 1 โ Valid module with two functions:
Module {
types: [{ params: ["i32", "i32"], results: ["i32"] },
{ params: ["i32"], results: ["i32"] }],
funcs: [
{ typeIndex: 0, locals: ["i32", "i32"],
body: [local.get 0, local.get 1, i32.add] },
{ typeIndex: 1, locals: ["i32"],
body: [local.get 0, local.get 0, i32.mul] }
]
}
Expected: valid.
Test 2 โ Invalid: function body returns wrong type:
Module {
types: [{ params: ["i32"], results: ["i32"] }],
funcs: [
{ typeIndex: 0, locals: ["i32"],
body: [local.get 0, drop] }
]
}
Expected: error โ function body produces [] but signature expects ["i32"].
Test 3 โ Invalid: out-of-bounds local index:
Module {
types: [{ params: [], results: ["i32"] }],
funcs: [
{ typeIndex: 0, locals: [],
body: [local.get 5] }
]
}
Expected: error โ local index 5 does not exist.
Test 4 โ Valid module with control flow:
Module {
types: [{ params: ["i32"], results: ["i32"] }],
funcs: [
{ typeIndex: 0, locals: ["i32"],
body: [local.get 0, if (result i32) [i32.const 1] else [i32.const 0] end] }
]
}
Expected: valid.
After implementing the validator, answer these questions (include them as comments in your code or in a separate REFLECTION.md):
Wasm's control flow is structured โ no arbitrary goto, only block/loop/if with br targeting labels. How does this restriction relate to the Chomsky hierarchy from HW3? At what level of the hierarchy does Wasm's control flow sit? What would change if Wasm allowed arbitrary jumps?
The validation algorithm runs in
SQL, regex, and CSS were discussed in HW3 as intentionally sub-Turing-complete. Is Wasm Turing-complete? What does Wasm sacrifice (compared to native x86 code) and what does it gain?
Submit your validator source code, verify it handles all 4 test modules correctly, and include your reflection answers.