pypcode
#
Pythonic interface to SLEIGH
- class pypcode.Address#
Bases:
object
Low level machine byte address.
- property offset#
The offset within the space.
- property space#
The address space.
- class pypcode.AddrSpace#
Bases:
object
A region where processor data is stored.
- property name#
The name of this address space.
- class pypcode.Arch(name, ldefpath)[source]#
Bases:
object
Main class representing an architecture describing available languages.
- Parameters:
name (str) –
ldefpath (str) –
- archpath: str#
- archname: str#
- ldefpath: str#
- ldef: ElementTree#
- languages: Sequence[ArchLanguage]#
- class pypcode.ArchLanguage(archdir, ldef)[source]#
Bases:
object
A specific language for an architecture. Provides access to language, pspec, and cspecs.
- Parameters:
archdir (str) –
ldef (Element) –
- archdir: str#
- ldef: Element#
- property pspec_path: str#
- property slafile_path: str#
- property description: str#
- property pspec: Element | None#
- property cspecs: Mapping[Tuple[str, str], Element]#
- classmethod from_id(langid)[source]#
Return language with given id, or None if the language could not be found.
- Return type:
Optional[ArchLanguage]
- Parameters:
langid (str) –
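Because from_id returns None for an unknown id, callers may prefer to fail early rather than pass None along. The following is a minimal sketch of that pattern; require_language is a hypothetical helper (not part of pypcode), and the lookup function is passed in so the helper works with any callable that behaves like ArchLanguage.from_id.

```python
def require_language(langid, lookup):
    """Return the language for langid, raising instead of returning None.

    `lookup` is any callable with the documented from_id behaviour:
    it returns a language object, or None when the id is unknown.
    """
    lang = lookup(langid)
    if lang is None:
        raise LookupError(f"unknown SLEIGH language id: {langid!r}")
    return lang


# Usage with pypcode installed (not executed here):
#   import pypcode
#   lang = require_language("x86:LE:64:default", pypcode.ArchLanguage.from_id)
#   ctx = pypcode.Context(lang)
```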
- exception pypcode.BadDataError#
Bases:
Exception
- args#
- with_traceback()#
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class pypcode.Context(language)[source]#
Bases:
Context
Context for translation.
- Parameters:
language (ArchLanguage) –
- language: ArchLanguage#
- disassemble#
Disassemble and format machine code as assembly code.
In [1]: import pypcode
   ...: ctx = pypcode.Context("x86:LE:64:default")
   ...: dx = ctx.disassemble(b"\x48\x35\x78\x56\x34\x12\xc3")
   ...: for ins in dx.instructions:
   ...:     print(f"{ins.addr.offset:#x}/{ins.length}: {ins.mnem} {ins.body}")
   ...:
0x0/6: XOR RAX,0x12345678
0x6/1: RET
- Instructions are decoded from buf and formatted in Instructions until: the end of the buffer is reached, max_bytes or max_instructions is reached, or an exception occurs.
If an exception occurs following successful disassembly of at least one instruction, the exception is discarded and the successful disassembly is returned. If the exception occurs at disassembly of the first instruction, it will be raised. See below for possible exceptions.
- Parameters:
buf (bytes) – Machine code to disassemble.
base_address (int) – Base address of the code at offset being decoded, 0 by default.
offset (int) – Offset into bytes to begin disassembly, 0 by default.
max_bytes (int) – Maximum number of bytes to disassemble, or 0 for no limit (default).
max_instructions (int) – Maximum number of instructions to disassemble, or 0 for no limit (default).
- Returns:
The disassembled machine code. Instructions are accessible through Disassembly.instructions.
- Return type:
Disassembly
- Raises:
BadDataError – The instruction at base_address could not be decoded.
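The exception behaviour described above (partial results are returned, and an error is raised only when the first instruction fails) lends itself to a resume loop. The sketch below is one possible way to walk a whole buffer; disassemble_resilient is a hypothetical helper, the one-byte skip on failure is an arbitrary policy chosen for illustration, and the exception class is passed in so the helper does not import pypcode itself.

```python
def disassemble_resilient(ctx, buf, base_address=0, bad_data_error=Exception):
    """Yield (address, text) pairs for every decodable instruction in buf.

    Relies on the documented behaviour of disassemble: it returns whatever
    was decoded before a failure, and raises only when the first instruction
    cannot be decoded. On such a failure one byte is skipped (an assumption
    made for this sketch, not a pypcode policy) and decoding resumes.
    """
    off = 0
    while off < len(buf):
        try:
            # Decode the remaining bytes; base_address is passed positionally
            # as the second argument, following the Parameters list order.
            dx = ctx.disassemble(buf[off:], base_address + off)
        except bad_data_error:
            yield base_address + off, f"<undecodable byte {buf[off]:#04x}>"
            off += 1
            continue
        consumed = 0
        for ins in dx.instructions:
            yield ins.addr.offset, f"{ins.mnem} {ins.body}".strip()
            consumed += ins.length
        if consumed == 0:
            break  # defensive: nothing decoded and nothing raised
        off += consumed


# Usage with pypcode installed (not executed here):
#   import pypcode
#   ctx = pypcode.Context("x86:LE:64:default")
#   for addr, text in disassemble_resilient(
#           ctx, code, bad_data_error=pypcode.BadDataError):
#       print(f"{addr:#x}: {text}")
```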
- getAllRegisters#
Get a mapping of all register locations to their corresponding names.
- getRegisterName#
Get the name of a register.
- Parameters:
space (AddrSpace) – The address space.
offset (int) – Offset within the address space.
size (int) – Size of the register, in bytes.
- Returns:
The register name, or the empty string if the register could not be identified.
- Return type:
str
- reset#
Reset the context.
- setVariableDefault#
Provide a default value for a context variable.
- translate#
Translate machine code to P-Code.
In [1]: import pypcode
   ...: ctx = pypcode.Context("x86:LE:64:default")
   ...: tx = ctx.translate(b"\x48\x35\x78\x56\x34\x12\xc3")  # xor rax, 0x12345678; ret
   ...: for op in tx.ops:
   ...:     print(pypcode.PcodePrettyPrinter.fmt_op(op))
   ...:
IMARK ram[0:6]
CF = 0x0
OF = 0x0
RAX = RAX ^ 0x12345678
SF = RAX s< 0x0
ZF = RAX == 0x0
unique[13180:8] = RAX & 0xff
unique[13200:1] = popcount(unique[13180:8])
unique[13280:1] = unique[13200:1] & 0x1
PF = unique[13280:1] == 0x0
IMARK ram[6:1]
RIP = *[ram]RSP
RSP = RSP + 0x8
return RIP
- Instructions are decoded from buf and translated to a sequence of PcodeOps until: the end of the buffer is reached, max_bytes or max_instructions is reached, if the BB_TERMINATING flag is set, an instruction which performs a branch is encountered, or an exception occurs.
A PcodeOp with opcode OpCode.IMARK is used to identify machine instructions corresponding to a translation. OpCode.IMARK ops precede the corresponding P-Code translation, and will have one or more input Varnodes identifying the address and length in bytes of the source machine instruction(s). The number of input Varnodes depends on the number of instructions that were decoded for the translation of the particular instruction.
On architectures with branch delay slots, the effects of the delay slot instructions will be included in the translation of the branch instruction. For this reason, it is possible that more instructions than specified in max_instructions may be translated. The OpCode.IMARK op identifying the branch instruction will contain an input Varnode corresponding to the branch instruction, with additional input Varnodes identifying corresponding delay slot instructions.
If an exception occurs following successful translation of at least one instruction, the exception is discarded and the successful translation is returned. If the exception occurs during translation of the first instruction, the exception will be raised. See below for possible exceptions.
- Parameters:
buf (bytes) – Machine code to translate.
base_address (int) – Base address of the code at offset being decoded.
offset (int) – Offset into bytes to begin translation.
max_bytes (int) – Maximum number of bytes to translate.
max_instructions (int) – Maximum number of instructions to translate.
flags (int) – Flags controlling translation. See TranslateFlags.
- Returns:
The P-Code translation of the input machine code. P-Code ops are accessible through Translation.ops.
- Return type:
Translation
- Raises:
BadDataError – The instruction at base_address could not be decoded.
UnimplError – The P-Code for instruction at base_address is not yet implemented.
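Since each OpCode.IMARK op precedes the P-Code for its machine instruction, a flat op sequence can be split back into per-instruction groups. The sketch below shows one way to do that; group_by_imark is a hypothetical helper, and the IMARK opcode is passed in as an argument so the function only depends on the documented opcode, inputs, offset, and size properties.

```python
def group_by_imark(ops, imark):
    """Split a flat P-Code op sequence into per-instruction groups.

    Returns a list of (sources, body) tuples, where `sources` lists the
    (address, length) pairs taken from the IMARK op's input varnodes and
    `body` lists the ops that follow it up to the next IMARK.
    """
    groups = []
    for op in ops:
        if op.opcode == imark:
            # One input varnode per source machine instruction; more than
            # one appears when delay slot instructions are folded in.
            sources = [(vn.offset, vn.size) for vn in op.inputs]
            groups.append((sources, []))
        elif groups:
            groups[-1][1].append(op)
        else:
            raise ValueError("P-Code op found before the first IMARK")
    return groups


# Usage with pypcode installed (not executed here):
#   import pypcode
#   ctx = pypcode.Context("x86:LE:64:default")
#   tx = ctx.translate(b"\x48\x35\x78\x56\x34\x12\xc3")
#   for sources, body in group_by_imark(tx.ops, pypcode.OpCode.IMARK):
#       print(sources, len(body))
```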
- exception pypcode.DecoderError#
Bases:
Exception
- args#
- with_traceback()#
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class pypcode.Disassembly#
Bases:
object
Machine Code Disassembly.
- property instructions#
The disassembled instructions.
- class pypcode.Instruction#
Bases:
object
Disassembled machine code instruction.
- property addr#
Address of this instruction.
- property body#
Operand string of this instruction.
- property length#
Length, in bytes, of this instruction.
- property mnem#
Mnemonic string of this instruction.
- class pypcode.OpCode#
Bases:
object
- BOOL_AND = pypcode.pypcode_native.OpCode.BOOL_AND#
- BOOL_NEGATE = pypcode.pypcode_native.OpCode.BOOL_NEGATE#
- BOOL_OR = pypcode.pypcode_native.OpCode.BOOL_OR#
- BOOL_XOR = pypcode.pypcode_native.OpCode.BOOL_XOR#
- BRANCH = pypcode.pypcode_native.OpCode.BRANCH#
- BRANCHIND = pypcode.pypcode_native.OpCode.BRANCHIND#
- CALL = pypcode.pypcode_native.OpCode.CALL#
- CALLIND = pypcode.pypcode_native.OpCode.CALLIND#
- CALLOTHER = pypcode.pypcode_native.OpCode.CALLOTHER#
- CAST = pypcode.pypcode_native.OpCode.CAST#
- CBRANCH = pypcode.pypcode_native.OpCode.CBRANCH#
- COPY = pypcode.pypcode_native.OpCode.COPY#
- CPOOLREF = pypcode.pypcode_native.OpCode.CPOOLREF#
- EXTRACT = pypcode.pypcode_native.OpCode.EXTRACT#
- FLOAT_ABS = pypcode.pypcode_native.OpCode.FLOAT_ABS#
- FLOAT_ADD = pypcode.pypcode_native.OpCode.FLOAT_ADD#
- FLOAT_CEIL = pypcode.pypcode_native.OpCode.FLOAT_CEIL#
- FLOAT_DIV = pypcode.pypcode_native.OpCode.FLOAT_DIV#
- FLOAT_EQUAL = pypcode.pypcode_native.OpCode.FLOAT_EQUAL#
- FLOAT_FLOAT2FLOAT = pypcode.pypcode_native.OpCode.FLOAT_FLOAT2FLOAT#
- FLOAT_FLOOR = pypcode.pypcode_native.OpCode.FLOAT_FLOOR#
- FLOAT_INT2FLOAT = pypcode.pypcode_native.OpCode.FLOAT_INT2FLOAT#
- FLOAT_LESS = pypcode.pypcode_native.OpCode.FLOAT_LESS#
- FLOAT_LESSEQUAL = pypcode.pypcode_native.OpCode.FLOAT_LESSEQUAL#
- FLOAT_MULT = pypcode.pypcode_native.OpCode.FLOAT_MULT#
- FLOAT_NAN = pypcode.pypcode_native.OpCode.FLOAT_NAN#
- FLOAT_NEG = pypcode.pypcode_native.OpCode.FLOAT_NEG#
- FLOAT_NOTEQUAL = pypcode.pypcode_native.OpCode.FLOAT_NOTEQUAL#
- FLOAT_ROUND = pypcode.pypcode_native.OpCode.FLOAT_ROUND#
- FLOAT_SQRT = pypcode.pypcode_native.OpCode.FLOAT_SQRT#
- FLOAT_SUB = pypcode.pypcode_native.OpCode.FLOAT_SUB#
- FLOAT_TRUNC = pypcode.pypcode_native.OpCode.FLOAT_TRUNC#
- IMARK = pypcode.pypcode_native.OpCode.IMARK#
- INDIRECT = pypcode.pypcode_native.OpCode.INDIRECT#
- INSERT = pypcode.pypcode_native.OpCode.INSERT#
- INT_2COMP = pypcode.pypcode_native.OpCode.INT_2COMP#
- INT_ADD = pypcode.pypcode_native.OpCode.INT_ADD#
- INT_AND = pypcode.pypcode_native.OpCode.INT_AND#
- INT_CARRY = pypcode.pypcode_native.OpCode.INT_CARRY#
- INT_DIV = pypcode.pypcode_native.OpCode.INT_DIV#
- INT_EQUAL = pypcode.pypcode_native.OpCode.INT_EQUAL#
- INT_LEFT = pypcode.pypcode_native.OpCode.INT_LEFT#
- INT_LESS = pypcode.pypcode_native.OpCode.INT_LESS#
- INT_LESSEQUAL = pypcode.pypcode_native.OpCode.INT_LESSEQUAL#
- INT_MULT = pypcode.pypcode_native.OpCode.INT_MULT#
- INT_NEGATE = pypcode.pypcode_native.OpCode.INT_NEGATE#
- INT_NOTEQUAL = pypcode.pypcode_native.OpCode.INT_NOTEQUAL#
- INT_OR = pypcode.pypcode_native.OpCode.INT_OR#
- INT_REM = pypcode.pypcode_native.OpCode.INT_REM#
- INT_RIGHT = pypcode.pypcode_native.OpCode.INT_RIGHT#
- INT_SBORROW = pypcode.pypcode_native.OpCode.INT_SBORROW#
- INT_SCARRY = pypcode.pypcode_native.OpCode.INT_SCARRY#
- INT_SDIV = pypcode.pypcode_native.OpCode.INT_SDIV#
- INT_SEXT = pypcode.pypcode_native.OpCode.INT_SEXT#
- INT_SLESS = pypcode.pypcode_native.OpCode.INT_SLESS#
- INT_SLESSEQUAL = pypcode.pypcode_native.OpCode.INT_SLESSEQUAL#
- INT_SREM = pypcode.pypcode_native.OpCode.INT_SREM#
- INT_SRIGHT = pypcode.pypcode_native.OpCode.INT_SRIGHT#
- INT_SUB = pypcode.pypcode_native.OpCode.INT_SUB#
- INT_XOR = pypcode.pypcode_native.OpCode.INT_XOR#
- INT_ZEXT = pypcode.pypcode_native.OpCode.INT_ZEXT#
- LOAD = pypcode.pypcode_native.OpCode.LOAD#
- MULTIEQUAL = pypcode.pypcode_native.OpCode.MULTIEQUAL#
- NEW = pypcode.pypcode_native.OpCode.NEW#
- PIECE = pypcode.pypcode_native.OpCode.PIECE#
- POPCOUNT = pypcode.pypcode_native.OpCode.POPCOUNT#
- PTRADD = pypcode.pypcode_native.OpCode.PTRADD#
- PTRSUB = pypcode.pypcode_native.OpCode.PTRSUB#
- RETURN = pypcode.pypcode_native.OpCode.RETURN#
- SEGMENTOP = pypcode.pypcode_native.OpCode.SEGMENTOP#
- STORE = pypcode.pypcode_native.OpCode.STORE#
- SUBPIECE = pypcode.pypcode_native.OpCode.SUBPIECE#
- class pypcode.OpFormatBinary(operator)[source]#
Bases:
OpFormat
General binary op pretty-printer.
- Parameters:
operator (str) –
- operator#
- class pypcode.OpFormatFunc(operator)[source]#
Bases:
OpFormat
Function-call style op pretty-printer.
- Parameters:
operator (str) –
- operator#
- class pypcode.OpFormatUnary(operator)[source]#
Bases:
OpFormat
General unary op pretty-printer.
- Parameters:
operator (str) –
- operator#
- class pypcode.PcodeOp#
Bases:
object
Low-level representation of a single P-Code operation.
- property inputs#
Input varnodes for this operation.
- property opcode#
Opcode for this operation.
- property output#
Output varnode for this operation.
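As a small illustration of reading the opcode property, the sketch below counts how often each opcode occurs in a sequence of ops. opcode_histogram is a hypothetical helper; it assumes only that opcode values are hashable, which is consistent with their use as keys in PcodePrettyPrinter.OP_FORMATS.

```python
from collections import Counter


def opcode_histogram(ops):
    """Count occurrences of each opcode in an iterable of P-Code ops."""
    return Counter(op.opcode for op in ops)


# Usage with pypcode installed (not executed here):
#   tx = ctx.translate(code)
#   for opcode, count in opcode_histogram(tx.ops).most_common():
#       print(opcode, count)
```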
- class pypcode.PcodePrettyPrinter[source]#
Bases:
object
P-code pretty-printer.
- DEFAULT_OP_FORMAT = <pypcode.OpFormat object>#
- OP_FORMATS = {pypcode.pypcode_native.OpCode.COPY: <pypcode.OpFormatUnary object>, pypcode.pypcode_native.OpCode.LOAD: <pypcode.OpFormatSpecial object>, pypcode.pypcode_native.OpCode.STORE: <pypcode.OpFormatSpecial object>, pypcode.pypcode_native.OpCode.BRANCH: <pypcode.OpFormatSpecial object>, pypcode.pypcode_native.OpCode.CBRANCH: <pypcode.OpFormatSpecial object>, pypcode.pypcode_native.OpCode.BRANCHIND: <pypcode.OpFormatSpecial object>, pypcode.pypcode_native.OpCode.CALL: <pypcode.OpFormatSpecial object>, pypcode.pypcode_native.OpCode.CALLIND: <pypcode.OpFormatSpecial object>, pypcode.pypcode_native.OpCode.RETURN: <pypcode.OpFormatSpecial object>, pypcode.pypcode_native.OpCode.INT_EQUAL: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_NOTEQUAL: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_SLESS: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_SLESSEQUAL: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_LESS: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_LESSEQUAL: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_ZEXT: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.INT_SEXT: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.INT_ADD: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_SUB: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_CARRY: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.INT_SCARRY: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.INT_SBORROW: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.INT_2COMP: <pypcode.OpFormatUnary object>, pypcode.pypcode_native.OpCode.INT_NEGATE: <pypcode.OpFormatUnary object>, pypcode.pypcode_native.OpCode.INT_XOR: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_AND: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_OR: <pypcode.OpFormatBinary object>, 
pypcode.pypcode_native.OpCode.INT_LEFT: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_RIGHT: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_SRIGHT: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_MULT: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_DIV: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_SDIV: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_REM: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.INT_SREM: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.BOOL_NEGATE: <pypcode.OpFormatUnary object>, pypcode.pypcode_native.OpCode.BOOL_XOR: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.BOOL_AND: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.BOOL_OR: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.FLOAT_EQUAL: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.FLOAT_NOTEQUAL: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.FLOAT_LESS: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.FLOAT_LESSEQUAL: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.FLOAT_NAN: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.FLOAT_ADD: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.FLOAT_DIV: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.FLOAT_MULT: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.FLOAT_SUB: <pypcode.OpFormatBinary object>, pypcode.pypcode_native.OpCode.FLOAT_NEG: <pypcode.OpFormatUnary object>, pypcode.pypcode_native.OpCode.FLOAT_ABS: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.FLOAT_SQRT: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.FLOAT_INT2FLOAT: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.FLOAT_FLOAT2FLOAT: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.FLOAT_TRUNC: <pypcode.OpFormatFunc 
object>, pypcode.pypcode_native.OpCode.FLOAT_CEIL: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.FLOAT_FLOOR: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.FLOAT_ROUND: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.CPOOLREF: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.NEW: <pypcode.OpFormatFunc object>, pypcode.pypcode_native.OpCode.POPCOUNT: <pypcode.OpFormatFunc object>}#
- class pypcode.TranslateFlags(value)[source]#
Bases:
IntEnum
Flags that can be passed to Context.translate.
- BB_TERMINATING = 1#
- class pypcode.Translation#
Bases:
object
P-Code translation.
- property ops#
The translated sequence of P-Code ops.
- exception pypcode.UnimplError#
Bases:
Exception
- args#
- with_traceback()#
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class pypcode.Varnode#
Bases:
object
Data defining a specific memory location.
- getRegisterName#
Return the register name if this Varnode references a register, otherwise return the empty string.
- getSpaceFromConst#
Recover encoded address space from constant value.
- property offset#
The offset within the space.
- property size#
The number of bytes in the location.
- property space#
The address space.
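The Varnode members above can be combined into a short human-readable label. The sketch below prefers the register name and falls back to a space[offset:size] form when getRegisterName returns the empty string; describe_varnode is a hypothetical helper, the output format is chosen for illustration only, and calling getRegisterName with no arguments is an assumption based on the entry above listing no parameters.

```python
def describe_varnode(vn):
    """Return a register name if available, else 'space[offset:size]'."""
    name = vn.getRegisterName()
    if name:
        return name
    # Fall back to the raw location: space name, hex offset, size in bytes.
    return f"{vn.space.name}[{vn.offset:x}:{vn.size}]"


# Usage with pypcode installed (not executed here):
#   for op in tx.ops:
#       print([describe_varnode(vn) for vn in op.inputs])
```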