angr.analyses.reassembler

exception angr.analyses.reassembler.BinaryError

Bases: Exception

exception angr.analyses.reassembler.InstructionError

Bases: BinaryError

exception angr.analyses.reassembler.ReassemblerFailureNotice

Bases: BinaryError

angr.analyses.reassembler.string_escape(s)
angr.analyses.reassembler.fill_reg_map()
angr.analyses.reassembler.split_operands(s)
angr.analyses.reassembler.is_hex(s)
class angr.analyses.reassembler.Label

Bases: object

g_label_ctr = count(0)
__init__(binary, name, original_addr=None)
property operand_str
property offset
static new_label(binary, name=None, function_name=None, original_addr=None, data_label=False)
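The class attribute `g_label_ctr = count(0)` suggests that auto-generated label names draw from one shared `itertools.count`. The following is a minimal sketch of that pattern, not angr's implementation: `SketchLabel`, the `label_%d` naming scheme, and the trailing-colon rendering are all assumptions made for illustration.

```python
from itertools import count


class SketchLabel:
    """Hypothetical stand-in for a label class; not angr's Label."""

    # Shared by every instance, so each auto-generated name is distinct.
    _counter = count(0)

    def __init__(self, name=None, original_addr=None):
        # Assumed naming scheme: fall back to "label_<n>" when no name is given.
        self.name = name if name is not None else "label_%d" % next(self._counter)
        self.original_addr = original_addr

    @property
    def operand_str(self):
        # Form used when the label appears as an instruction operand.
        return self.name

    def __str__(self):
        # Form used where the label is defined in the assembly output.
        return self.name + ":"


a = SketchLabel(original_addr=0x400000)
b = SketchLabel(original_addr=0x400010)
named = SketchLabel(name="main", original_addr=0x400020)
```

Because the counter lives on the class, an explicitly named label such as `named` does not consume a number.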
class angr.analyses.reassembler.DataLabel

Bases: Label

__init__(binary, original_addr, name=None)
property operand_str
class angr.analyses.reassembler.FunctionLabel

Bases: Label

__init__(binary, function_name, original_addr, plt=False)
property function_name
property operand_str
class angr.analyses.reassembler.ObjectLabel

Bases: Label

__init__(binary, symbol_name, original_addr, plt=False)
property symbol_name
property operand_str
class angr.analyses.reassembler.NotypeLabel

Bases: Label

__init__(binary, symbol_name, original_addr, plt=False)
property symbol_name
property operand_str
class angr.analyses.reassembler.SymbolManager

Bases: object

SymbolManager manages all symbols in the binary.

__init__(binary, cfg)

Constructor.

Parameters:
  • binary (Reassembler) – The Binary analysis instance.

  • cfg (angr.analyses.CFG) – The CFG analysis instance.

Returns:

None

get_unique_symbol_name(symbol_name)
new_label(addr, name=None, is_function=None, force=False)
label_got(addr, label)

Mark a certain label as assigned (to an instruction or a block of data).

Parameters:
  • addr (int) – The address of the label.

  • label (Label) – The label that has just been assigned.

Returns:

None
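A symbol manager has to hand out names that never collide, which is what `get_unique_symbol_name` is named for. One way such a method could work is sketched below; the class `SketchSymbolNames` and the `_<n>` suffix format are assumptions for illustration, not a description of angr's code.

```python
class SketchSymbolNames:
    """Hypothetical name registry; the suffix format is an assumption."""

    def __init__(self):
        self._seen = set()

    def get_unique_symbol_name(self, symbol_name):
        # The first request for a name keeps it unchanged.
        if symbol_name not in self._seen:
            self._seen.add(symbol_name)
            return symbol_name
        # Later requests receive the smallest unused numeric suffix.
        i = 0
        while True:
            candidate = "%s_%d" % (symbol_name, i)
            if candidate not in self._seen:
                self._seen.add(candidate)
                return candidate
            i += 1


names = SketchSymbolNames()
first = names.get_unique_symbol_name("buf")
second = names.get_unique_symbol_name("buf")
third = names.get_unique_symbol_name("buf")
```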

class angr.analyses.reassembler.Operand

Bases: object

__init__(binary, insn_addr, insn_size, capstone_operand, operand_str, mnemonic, operand_offset, syntax=None)

Constructor.

Parameters:
  • binary (Reassembler) – The Binary analysis.

  • insn_addr (int) – Address of the instruction.

  • capstone_operand

  • operand_str (str) – The string representation of this operand.

  • mnemonic (str) – Mnemonic of the instruction that this operand belongs to.

  • operand_offset (int) – Offset of the operand within the instruction.

  • syntax (str) – Overrides the default syntax taken from binary.

Returns:

None

assembly()
property is_immediate
property symbolized
class angr.analyses.reassembler.Instruction

Bases: object

High-level representation of an instruction in the binary

__init__(binary, addr, size, insn_bytes, capstone_instr)
Parameters:
  • binary (Reassembler) – The Binary analysis

  • addr (int) – Address of the instruction

  • size (int) – Size of the instruction

  • insn_bytes (str) – Instruction bytes

  • capstone_instr – Capstone Instr object.

Returns:

None

assign_labels()
dbg_comments()
assembly(comments=False, symbolized=True)

class angr.analyses.reassembler.BasicBlock

Bases: object

BasicBlock represents a basic block in the binary.

__init__(binary, addr, size, x86_getpc_retsite=False)

Constructor.

Parameters:
  • binary (Reassembler) – The Binary analysis.

  • addr (int) – Address of the block

  • size (int) – Size of the block

  • x86_getpc_retsite (bool)

Returns:

None

assign_labels()
assembly(comments=False, symbolized=True)
instruction_addresses()
class angr.analyses.reassembler.Procedure

Bases: object

Procedure in the binary.

__init__(binary, function=None, addr=None, size=None, name=None, section='.text', asm_code=None)

Constructor.

Parameters:
  • binary (Reassembler) – The Binary analysis.

  • function (angr.knowledge.Function) – The function it represents

  • addr (int) – Address of the function. Not required if function is provided.

  • size (int) – Size of the function. Not required if function is provided.

  • section (str) – Which section this function comes from.

Returns:

None

property name

Get the function name from the labels of the very first block.

Returns:

The function name if there is one, None otherwise.

Return type:

str

property is_plt

Whether this function is a PLT entry.

Returns:

True if this function is a PLT entry, False otherwise.

Return type:

bool

assign_labels()
assembly(comments=False, symbolized=True)

Get the assembly manifest of the procedure.

Parameters:
  • comments

  • symbolized

Returns:

A list of tuples (address, basic block assembly), ordered by basic block addresses

Return type:

list

instruction_addresses()

Get all instruction addresses in the binary.

Returns:

A list of sorted instruction addresses.

Return type:

list
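The two return shapes described above (a list of `(address, basic block assembly)` tuples ordered by block address, and a sorted list of instruction addresses) can be sketched with plain data. The `blocks` mapping and both function bodies below are hypothetical stand-ins, not angr's data structures.

```python
def procedure_assembly(blocks):
    """Return [(block_addr, block_text), ...] ordered by block address.

    `blocks` is a hypothetical mapping: block address -> list of
    (instruction address, assembly line) pairs.
    """
    manifest = []
    for block_addr in sorted(blocks):
        lines = [line for _, line in blocks[block_addr]]
        manifest.append((block_addr, "\n".join(lines)))
    return manifest


def instruction_addresses(blocks):
    """Return every instruction address across all blocks, sorted."""
    return sorted(addr for insns in blocks.values() for addr, _ in insns)


# Blocks are deliberately listed out of order to show the sorting.
blocks = {
    0x401010: [(0x401010, "xor eax, eax"), (0x401012, "ret")],
    0x401000: [(0x401000, "push ebp"), (0x401001, "mov ebp, esp")],
}
manifest = procedure_assembly(blocks)
addresses = instruction_addresses(blocks)
```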

class angr.analyses.reassembler.ProcedureChunk

Bases: Procedure

Procedure chunk.

__init__(project, addr, size)

Constructor.

Parameters:
  • project

  • addr

  • size

class angr.analyses.reassembler.Data

Bases: object

__init__(binary, memory_data=None, section=None, section_name=None, name=None, size=None, sort=None, addr=None, initial_content=None)
property content
shrink(new_size)

Reduce the size of this block

Parameters:

new_size (int) – The new size

Returns:

None

desymbolize()

We believe this was a pointer and symbolized it before. Now we want to desymbolize it.

The following actions are performed:
  • Reload content from memory

  • Mark the sort as ‘unknown’

Returns:

None
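The two desymbolization steps listed above can be sketched on a toy data entry. Everything here is hypothetical: `SketchData`, its flat `image`/`base` memory model, and the example sort value are assumptions chosen to keep the example self-contained.

```python
class SketchData:
    """Hypothetical data entry backed by a flat bytes image; not angr's Data."""

    def __init__(self, image, base, addr, size, sort, content):
        self._image = image  # raw bytes of the loaded region
        self._base = base    # address that image[0] corresponds to
        self.addr = addr
        self.size = size
        self.sort = sort
        self.content = content

    def desymbolize(self):
        # Step 1: reload the raw bytes from memory, discarding any label.
        start = self.addr - self._base
        self.content = [self._image[start:start + self.size]]
        # Step 2: stop treating the entry as a pointer.
        self.sort = "unknown"


image = bytes(range(16))
entry = SketchData(image, base=0x1000, addr=0x1004, size=4,
                   sort="pointer-array", content=["label_3"])
entry.desymbolize()
```

After the call, `entry.content` holds the four raw bytes at offset 4 of the image instead of a label reference.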

assign_labels()
assembly(comments=False, symbolized=True)
class angr.analyses.reassembler.Relocation

Bases: object

__init__(addr, ref_addr, sort)
class angr.analyses.reassembler.Reassembler

Bases: Analysis

High-level representation of a binary as a linear sequence of all instructions and data regions. After symbolize() is called, it essentially acts as a binary reassembler.

Tested on CGC, x86 and x86-64 binaries.

Disclaimer: The reassembler is an empirical solution. Don’t be surprised if it does not work on some binaries.
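
A typical session, using only the methods documented on this page, might look like the sketch below. It assumes an angr release that ships this analysis and a binary of a supported type; the helper name `reassemble_to_asm` and the choice of options are illustrative, and `angr` is imported lazily so that defining the helper does not itself require angr.

```python
def reassemble_to_asm(binary_path, syntax="intel"):
    """Run the Reassembler on one binary and return the result of assembly()."""
    # Imported lazily so that merely defining this helper does not need angr.
    import angr

    project = angr.Project(binary_path, auto_load_libs=False)
    reassembler = project.analyses.Reassembler(syntax=syntax)
    # Symbolize first; only then does the analysis act as a reassembler.
    reassembler.symbolize()
    # Remove unnecessary functions and data before emitting assembly.
    reassembler.remove_unnecessary_stuff()
    return reassembler.assembly(comments=True, symbolized=True)
```

Given the disclaimer above, it is worth checking that the emitted assembly actually assembles and runs before relying on it.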

__init__(syntax='intel', remove_cgc_attachments=True, log_relocations=True)
property instructions

Get a list of all instructions in the binary

Returns:

A list of (address, instruction)

Return type:

tuple

property relocations
property inserted_asm_before_label
property inserted_asm_after_label
property main_executable_regions

property main_nonexecutable_regions

section_alignment(section_name)

Get the alignment of the specified section. If the section is not found, a default of 16 is used.

Parameters:

section_name (str) – The section.

Returns:

The alignment in bytes.

Return type:

int
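
The lookup-with-default behaviour described here amounts to the following sketch. The `alignments` dictionary and its values are made up for the example; only the default of 16 comes from the description above.

```python
DEFAULT_SECTION_ALIGNMENT = 16


def section_alignment(alignments, section_name):
    """Look up a section's alignment in bytes, defaulting to 16."""
    return alignments.get(section_name, DEFAULT_SECTION_ALIGNMENT)


# Hypothetical per-section alignments, in bytes.
alignments = {".text": 16, ".rodata": 32, ".data": 8}
known = section_alignment(alignments, ".rodata")
unknown = section_alignment(alignments, ".no_such_section")
```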

main_executable_regions_contain(addr)
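Parameters:

addr (int) – The address to check.

Returns:

True if the address is inside an executable region, False otherwise.

Return type:

bool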

main_executable_region_limbos_contain(addr)

Sometimes a pointer points to a few bytes before the beginning of a section, or a few bytes after the end of the section. We take care of that here.

Parameters:

addr (int) – The address to check.

Returns:

A 2-tuple of (bool, the closest base address)

Return type:

tuple

main_nonexecutable_regions_contain(addr)
Parameters:

addr (int) – The address to check.

Returns:

True if the address is inside a non-executable region, False otherwise.

Return type:

bool

main_nonexecutable_region_limbos_contain(addr, tolerance_before=64, tolerance_after=64)

Sometimes a pointer points to a few bytes before the beginning of a section, or a few bytes after the end of the section. We take care of that here.

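Parameters:
  • addr (int) – The address to check.

  • tolerance_before (int) – Number of bytes before the start of a region that still count as limbo.

  • tolerance_after (int) – Number of bytes after the end of a region that still count as limbo.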

Returns:

A 2-tuple of (bool, the closest base address)

Return type:

tuple
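
The limbo checks can be pictured as a tolerance test around each region. The sketch below assumes that "limbo" means the short stretch just before a region's start or just past its end, that regions are half-open `(start, end)` pairs, and that the nearest region boundary wins; none of this is taken from angr's source.

```python
def region_limbos_contain(regions, addr, tolerance_before=64, tolerance_after=64):
    """Return (in_limbo, closest_base) for addr against (start, end) regions."""
    closest_base = None
    least_distance = None
    for start, end in regions:
        if start - tolerance_before <= addr < start:
            # Slightly before the region: report its start as the base.
            distance, base = start - addr, start
        elif end <= addr < end + tolerance_after:
            # Slightly past the region: report its end as the base.
            distance, base = addr - end, end
        else:
            continue
        if least_distance is None or distance < least_distance:
            least_distance, closest_base = distance, base
    return closest_base is not None, closest_base


regions = [(0x1000, 0x2000), (0x3000, 0x4000)]
just_before = region_limbos_contain(regions, 0x0FF0)
just_after = region_limbos_contain(regions, 0x2010)
inside = region_limbos_contain(regions, 0x1800)
far_away = region_limbos_contain(regions, 0x2800)
```

An address that lies inside a region is not in limbo, so it is reported as `(False, None)` just like one that is far from every region.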

register_instruction_reference(insn_addr, ref_addr, sort, operand_offset)
register_data_reference(data_addr, ref_addr)
add_label(name, addr)

Add a new label to the symbol manager.

Parameters:
  • name (str) – Name of the label.

  • addr (int) – Address of the label.

Returns:

None

insert_asm(addr, asm_code, before_label=False)

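Insert some assembly code at the specified address. There must be an instruction starting at that address.

Parameters:
  • addr (int) – Address of insertion

  • asm_code (str) – The assembly code to insert

  • before_label (bool) – Whether to insert the code before the label at that address rather than after it.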

Returns:

None
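
The pairing of `insert_asm` with the `inserted_asm_before_label` and `inserted_asm_after_label` properties suggests bookkeeping along the following lines. `SketchInsertions`, the per-address lists, and the `ValueError` on an unknown address are assumptions made for this sketch.

```python
from collections import defaultdict


class SketchInsertions:
    """Hypothetical bookkeeping for inserted assembly; not angr's code."""

    def __init__(self, instruction_addrs):
        self._instruction_addrs = set(instruction_addrs)
        self.inserted_asm_before_label = defaultdict(list)
        self.inserted_asm_after_label = defaultdict(list)

    def insert_asm(self, addr, asm_code, before_label=False):
        # An instruction must start at addr (the error type is an assumption).
        if addr not in self._instruction_addrs:
            raise ValueError("no instruction starts at %#x" % addr)
        if before_label:
            self.inserted_asm_before_label[addr].append(asm_code)
        else:
            self.inserted_asm_after_label[addr].append(asm_code)


ins = SketchInsertions([0x401000, 0x401005])
ins.insert_asm(0x401000, "nop")
ins.insert_asm(0x401000, "push eax", before_label=True)
```

Keeping a list per address lets several snippets be attached to the same instruction in insertion order.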

append_procedure(name, asm_code)

Add a new procedure with the given name and assembly code.

Parameters:
  • name (str) – The name of the new procedure.

  • asm_code (str) – The assembly code of the procedure

Returns:

None

append_data(name, initial_content, size, readonly=False, sort='unknown')

Append a new data entry to the binary with the given name, content, and size.

Parameters:
  • name (str) – Name of the data entry. Will be used as the label.

  • initial_content (bytes) – The initial content of the data entry.

  • size (int) – Size of the data entry.

  • readonly (bool) – Whether the data entry belongs to the read-only region.

  • sort (str) – Type of the data.

Returns:

None

remove_instruction(ins_addr)
Parameters:

ins_addr

randomize_procedures()
symbolize()
assembly(comments=False, symbolized=True)
remove_cgc_attachments()

Remove CGC attachments.

Returns:

True if CGC attachments are found and removed, False otherwise

Return type:

bool

remove_unnecessary_stuff()

Remove unnecessary functions and data

Returns:

None

remove_unnecessary_stuff_glibc()
fast_memory_load(addr, size, data_type, endness='Iend_LE')

Load memory bytes from loader’s memory backend.

Parameters:
  • addr (int) – The address to begin memory loading.

  • size (int) – Size in bytes.

  • data_type – Type of the data.

  • endness (str) – Endianness of this memory load.

Returns:

Data read out of the memory.

Return type:

int or bytes or str or None
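
A load of this kind can be sketched against a flat byte buffer. The version below is an assumption-laden illustration: the extra `memory` and `base` parameters, passing Python types as `data_type`, and returning `None` for out-of-range reads are choices made for the sketch, not documented behaviour.

```python
def fast_memory_load(memory, base, addr, size, data_type, endness="Iend_LE"):
    """Read `size` bytes at `addr` from a flat buffer that starts at `base`."""
    offset = addr - base
    if offset < 0 or offset + size > len(memory):
        # Assumed behaviour: an unmapped read yields None.
        return None
    raw = memory[offset:offset + size]
    if data_type is int:
        byteorder = "little" if endness == "Iend_LE" else "big"
        return int.from_bytes(raw, byteorder)
    if data_type is bytes:
        return raw
    if data_type is str:
        # latin-1 maps every byte value to exactly one character.
        return raw.decode("latin-1")
    return None


memory = bytes([0x78, 0x56, 0x34, 0x12])
little = fast_memory_load(memory, 0x1000, 0x1000, 4, int)
big = fast_memory_load(memory, 0x1000, 0x1000, 4, int, endness="Iend_BE")
raw = fast_memory_load(memory, 0x1000, 0x1002, 2, bytes)
missing = fast_memory_load(memory, 0x1000, 0x1002, 4, int)
```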