llvm-mc
Assemble and disassemble LLVM machine code
TLDR
Assemble assembly code file into object file with machine code
Disassemble object file with machine code into assembly code file
Compile LLVM bit code file into assembly code
Assemble assembly code from standard input stream and show encoding to standard output stream
Disassemble machine code from standard input stream for specified triple
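The commands for three of these tasks can be sketched as follows. This is a minimal sketch, assuming an `llvm-mc` built with the X86 target is on `PATH`; the triple `x86_64-unknown-linux-gnu`, the file names `input.s` and `output.o`, and the sample instruction are illustrative assumptions. Note that `--disassemble` reads hexadecimal bytes as text, so the last command feeds it bytes rather than an object file.

```shell
#!/bin/sh
# Sketch only: skip quietly when llvm-mc is not installed.
command -v llvm-mc >/dev/null 2>&1 || exit 0

TRIPLE=x86_64-unknown-linux-gnu   # assumed triple, for illustration
workdir=$(mktemp -d)

# A one-instruction assembly file (hypothetical input).
printf 'addl %%eax, %%ebx\n' > "$workdir/input.s"

# Assemble an assembly file into an object file.
llvm-mc -triple="$TRIPLE" -filetype=obj -o "$workdir/output.o" "$workdir/input.s"

# Assemble from standard input and print each instruction with its encoding.
encoded=$(echo 'addl %eax, %ebx' | llvm-mc -triple="$TRIPLE" -show-encoding)
echo "$encoded"

# Disassemble hexadecimal bytes from standard input for the given triple.
decoded=$(echo '0x01 0xc3' | llvm-mc --disassemble -triple="$TRIPLE")
echo "$decoded"
```
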
SYNOPSIS
llvm-mc [options] [filename]
PARAMETERS
-arch=
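Specify the target architecture by its LLVM target name, for example `x86-64`, `arm64`, or `riscv32` (run `llvm-mc --version` to list the registered targets). If not specified, the architecture is taken from the target triple.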
-asm-instrumentation
Enable assembly instrumentation (for testing purposes).
-disassemble
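Disassemble instead of assembling. The input is read as text containing hexadecimal byte values (e.g., `0xcd 0x21`), not as an object file; use `llvm-objdump` to disassemble object files.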
-filetype=
Specify the output file type. Values: `asm` (textual assembly), `obj` (object file), `null` (discard output). Default is `asm`.
-o
Specify the output filename. If not specified, output is written to standard output (stdout).
-show-encoding
Show the machine code encoding for each instruction (when assembling to textual assembly output).
-triple=
Specify the target triple (e.g., `x86_64-linux-gnu`). A triple gives a more complete target description than `-arch`, covering vendor, operating system, and environment as well as the architecture.
-verify-debug-info
Verify the debug info that is produced from assembly.
-x
Treat the input as having the specified type (e.g., `assembly`, `ir`).
--help
Display available options.
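How `-filetype` and `-o` interact can be illustrated with a short script. This is a sketch under assumptions: an X86-enabled `llvm-mc` on `PATH`, an illustrative triple, and temporary files created with `mktemp`. It relies on the output type being textual assembly on standard output when neither option is given, and on `-filetype=null` discarding output so that only the exit status reports whether the input parsed.

```shell
#!/bin/sh
# Sketch only: skip quietly when llvm-mc is not installed.
command -v llvm-mc >/dev/null 2>&1 || exit 0

TRIPLE=x86_64-unknown-linux-gnu   # assumed triple, for illustration
good=$(mktemp)
bad=$(mktemp)
printf 'movq %%rdi, %%rax\nret\n' > "$good"
printf 'bogus %%eax\n' > "$bad"    # not a real mnemonic

# No -filetype and no -o: textual assembly is written to stdout.
asm_out=$(llvm-mc -triple="$TRIPLE" "$good")
echo "$asm_out"

# -filetype=null discards the output; the exit status alone tells
# whether the file assembled cleanly.
if llvm-mc -triple="$TRIPLE" -filetype=null "$good" 2>/dev/null; then
  good_status=ok
else
  good_status=error
fi

if llvm-mc -triple="$TRIPLE" -filetype=null "$bad" 2>/dev/null; then
  bad_status=ok
else
  bad_status=error
fi

echo "valid input: $good_status, invalid input: $bad_status"
```
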
DESCRIPTION
`llvm-mc` is a standalone tool from the LLVM project that drives LLVM's machine code (MC) layer directly: it assembles textual assembly into machine code and, with `-disassemble`, decodes instruction bytes back into assembly. Unlike a full assembler/linker toolchain, it deals only with the machine code representation, which gives fine-grained control over instruction encoding and decoding across the target architectures LLVM supports.
It is used mainly by compiler developers and reverse engineers. Typical tasks are verifying compiler output, checking how a hand-written instruction is encoded, and debugging backend issues by feeding assembly straight to the assembler to test one instruction or directive at a time.
CAVEATS
Error messages from `llvm-mc` can sometimes be cryptic, requiring familiarity with the target architecture's instruction set and assembly syntax. The exact assembly syntax supported depends on the target architecture.
ASSEMBLER SYNTAX
The assembly syntax accepted by `llvm-mc` closely follows the conventions of the target architecture. Consult the target architecture's documentation for specific instruction mnemonics, operand encodings, and directive usage. Be mindful of different assembly dialects (e.g., AT&T vs. Intel syntax for x86) and ensure consistency within the input file.
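For x86, the dialect difference can be seen by assembling the same instruction both ways. The sketch below assumes an X86-enabled `llvm-mc` on `PATH` and an illustrative triple. It uses the `.intel_syntax noprefix` directive to switch the input to Intel syntax and `-output-asm-variant=1` to print Intel syntax; AT&T is the default for both input and output.

```shell
#!/bin/sh
# Sketch only: skip quietly when llvm-mc is not installed.
command -v llvm-mc >/dev/null 2>&1 || exit 0

TRIPLE=x86_64-unknown-linux-gnu   # assumed triple, for illustration

# Intel-syntax input (destination first), printed in the default
# AT&T syntax (source first, registers prefixed with %).
att_out=$(printf '.intel_syntax noprefix\nmov eax, ebx\n' \
  | llvm-mc -triple="$TRIPLE")
echo "$att_out"

# AT&T-syntax input, printed in Intel syntax.
intel_out=$(printf 'movl %%ebx, %%eax\n' \
  | llvm-mc -triple="$TRIPLE" -output-asm-variant=1)
echo "$intel_out"
```

Mixing the two dialects in one file without a switching directive is a common source of the cryptic operand errors mentioned under CAVEATS.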
COMMON USE CASES
Common use cases include:
- Testing and validating compiler code generation.
- Hand-crafting optimized assembly routines.
- Decoding raw instruction bytes during reverse engineering (for whole object files, `llvm-objdump` is the usual tool).
- Creating minimal test cases for LLVM backend bugs.
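A round trip through the assembler and disassembler is one way to build such a minimal test case. The sketch below assumes an X86-enabled `llvm-mc` on `PATH`; the triple, the sample instruction, and the `sed`/`tr` text processing are illustrative choices, not part of the tool. It assembles one instruction, extracts the bytes from the `-show-encoding` comment, disassembles them, and compares the result with the original text.

```shell
#!/bin/sh
# Sketch only: skip quietly when llvm-mc is not installed.
command -v llvm-mc >/dev/null 2>&1 || exit 0

TRIPLE=x86_64-unknown-linux-gnu   # assumed triple, for illustration
insn='addl %eax, %ebx'

# Assemble and pull the byte list out of the "encoding: [...]" comment,
# turning the commas into spaces for the disassembler.
bytes=$(echo "$insn" \
  | llvm-mc -triple="$TRIPLE" -show-encoding \
  | sed -n 's/.*encoding: \[\(.*\)\]/\1/p' \
  | tr ',' ' ')

# Disassemble the bytes, drop directive lines such as .text,
# and normalize whitespace so the text can be compared.
roundtrip=$(echo "$bytes" \
  | llvm-mc --disassemble -triple="$TRIPLE" \
  | grep -v '^[[:space:]]*\.' \
  | tr -s '[:space:]' ' ' \
  | sed 's/^ //; s/ $//')

if [ "$roundtrip" = "$insn" ]; then
  echo "round trip OK: $bytes"
else
  echo "round trip mismatch: '$insn' -> '$roundtrip'" >&2
fi
```
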
HISTORY
`llvm-mc` was developed as part of the LLVM project, a collection of modular and reusable compiler and toolchain technologies. It was created to provide a standalone assembler and disassembler within the LLVM ecosystem, giving finer control over machine code manipulation. Its usage grew alongside LLVM's adoption as a compiler infrastructure, particularly where low-level code analysis and generation are required.
SEE ALSO
llvm-objdump(1), llvm-as(1), clang(1)