parsers 1. Hand-writtern parsers 1.1 Recursive-descent parser 2. Automatically generated parsers 2.1 LL, LR, LALR, GLR, PEG
Languages can be: 1. Interpreted Language - Implements semantics themselves - Executes the code - There are two types: - AST-based (recursive) - tree like structure - Bytecode-int (VM) - plain array of encoded intruction - closer to real machine -- that's why they are virtual machine 2. Compiled Language - Delegates semantic to a target language - hoping that target language has a interpreter - Just translate, doesn't execute - There are three types: - Ahead-of-time (AOT) compilers- converted to other language - Before code execution - AST -> Code Generator -> Intermediate Representation -> Native Code (x64.x86. ARM, web-assembly, etc) -> CPU - Alternative, there is LLVM (Low-level Virtual Machine), which can generate intructions for many runtime at once. - AST -> LLVM IR Generator -> LLVM IR -> LLVM Blackbox -> x64/x84/wasm -> CPU - Just-in-time (JIT) compilers - code generation execute at runtime - Byte code is passed to the interpreter (runtime semantics) - It can decide to compiler some heavy weight methods using code generator -> x64 -> CPU -> output. Instead of interpreting them. - These CPU instruction can be cached and used during the runtime - AST-transformers (transpiler) - high level compiler - AST -> AST transformer -> AST (same/different language) -> Code generator -> source code -> Compiler -> ....
Note: ulimately every language needs to be interpreted. In case of compiled language it is often compiles to CPU instruction which are directly interpreted by the CPU.
Is JS/Python/C++ interpreted or compiled language?
Wrong Question. What interpreted or compiled is not the language but the implementations. we can easily have an interpreter for C++. we can allocate a virtual heap in JS and implement C++ Semantic
AST-Interpreter - high level semantic AST is dirtectly passed to the interpreters
Bytecode Interpreter - has a intermediate step After AST - Bytecode Emitter, which produces byte code which is further passed to interpreter takes less space. Instruction for precise and closer to machine instruction.
Virtual Machines - stack based machine - Register based machine in most of the cases the virtual machine and real machines are of mixed type. So if we take for example actual intel architecture, it's a register machine hwich also has a stack, so its mixed machine. You need stack to implement recursive function
Frontend - Till AST Backend - After AST
Runtime semantic should be preserved regardless of implementation
AST interpreter. S-expression (Symbolic expression) -> syntax of programming languages such a Scheme, lisp & Racket // Source code total = current + 10
//AST ["set", "total", ["+", "current", 150]]
// S-expression (set total (+ current 150))
IILE - immediately invoked lambda expression
Design Goal for Eva - Simple Syntax - S-expression - Everything is an expresion - No explicit return - First class function - Static scope - closure - Lambda functions, IILE - Functional programming - Imperative programming - Namespace and module - OOP: class-based, prototype-based
- variable declaration
- Assignment expression
- Variable access
Environment: a repository of variables and functions defined in a scope Environment Structure: - Environment record - Optional reference to Parent Environment Environment interface: - Define a variable (var x 10) - Assign a new value to a variable (set x 20) - Lookup a variable (x)
Block: group of expressions Block scope: entering creates new environment