The runtime debugging feature
There is a runtime debugging mode which can be enabled by the
--debug flag for
ahc-link. When enabled, the compiler inserts "tracing" instructions in the following places:
- The start of a function/basic block
SetLocalwhen the local type is
- Memory load/stores, when the value type is
The tracing messages are quite helpful in observing control flow transfers and memory operations. Remember to also use the
--output-link-report flag to dump the linking report, which contains mapping from data/function symbols to addresses.
The runtime debugging mode also enables a "memory trap" which intercepts every memory load/store instruction and checks if the address is null pointer or other uninitialized regions of the linear memory. The program immediately aborts if an invalid address is encountered. (When debugging mode is switched off, program continues execution and the rest of control flow is all undefined behavior!)
Virtual address spaces
Remember that we're compiling to
wasm32 which has a 32-bit address space, but the host GHC is actually 64-bits, so all pointers in asterius are 64-bits, and upon
call_indirect, we truncate the 64-bit pointer, using only the lower 32-bits for indexing.
The higher 32-bits of pointers are idle tag bits at our disposal, so, we implemented simple virtual address spaces. The linker/runtime is aware of the distinction between:
- The physical address, which is either an
i32index of the linear memory for data, or an
i32index of the table for functions.
- The logical address, which is the
i64pointer value we're passing around.
All access to the memory/table is achieved by using the logical address. The access operations are accompanied by a mapping operation which translates a logical address to a physical one. Currently it's just a truncate, but in the future we may get a more feature-complete
munmap implementation, and some additional computation may occur when address translation is done.
We chose two magic numbers (in
When we emit debug log entries, we may encounter various
i64 values. We examine the higher 32-bits, and if it matches the pointer tag bits, we do a lookup in the data/function symbol table, and if there's a hit, we output the symbol along the value. This spares us the pain to keep a lot of symbol/address mappings in our working memory when examining the debug logs. Some false positives (e.g. some random intermediate
i64 value in a Haskell computation accidently collides with a logical address) may exist in theory, but the probability should be very low.
Note that for consistency between vanilla/debug mode, the virtual address spaces are in effect even in vanilla mode. This won't add extra overhead, since the truncate instruction for 64-bit addresses has been present since the beginning.
Complete list of emitted debugging log entries
- Assertions: some hand-written WebAssembly functions in
Asterius.Builtinscontain assertions which are only active in debugging mode. Failure of an assertion causes a string error message to be printed, and the whole execution flow aborted.
- Memory traps: In
Asterius.MemoryTrap, we implement a rewriting pass which rewrites all load/store instructions into invocations of load/store wrapper functions. The wrapper functions are defined in
Asterius.Builtins, which checks the address and traps if it's an invalid one (null pointer, uninitialized region, etc).
- Control-flow: In
Asterius.Tracing, we implement a rewriting pass on functions (which are later invoked at link-time in
Asterius.Resolve), which emits messages when:
- Entering a Cmm function.
- Entering a basic block. To make sense of block ids, you need to dump pre-linking IRs (which isn't processed by the relooper yet, and preserves the control-flow graph structure)
- Assigning a value to an i64 local. To make sense of local ids, dump IRs. Also note that the local ids here doesn't match the actual local ids in wasm binary code (there is a re-mapping upon serialization), but it shouldn't be a problem since we are debugging the higher level IR here.
There are multiple ways to dump IRs:
- Via GHC flags: GHC flags like
-ddump-to-file -ddump-cmm-rawdump pretty-printed GHC IRs to files.
- Via environment variable: Set the
ASTERIUS_DEBUGenvironment variable, then during booting, a number of IRs (mainly raw Cmm in its AST form, instead of pretty-printed form) will be dumped.
ahc-link --output-irto dump IRs when compiling user code.