Backend Interface

class cle.backends.backend.FunctionHintSource[source]

Bases: object

Enums that describe the source of function hints.

EH_FRAME = 0
EXTERNAL_EH_FRAME = 1
class cle.backends.backend.FunctionHint[source]

Bases: object

Describes a function hint.

Variables:
  • addr (int) – Address of the function.

  • size (int) – Size of the function.

  • source (int) – Source of this hint.

__init__(addr, size, source)[source]
addr
size
source
class cle.backends.backend.ExceptionHandling[source]

Bases: object

Describes an exception handler.

Exception handlers are usually language-specific. In C++, they are usually implemented as try {} catch {} blocks.

Variables:
  • start_addr (int) – The beginning of the try block.

  • size (int) – Size of the try block.

  • handler_addr (Optional[int]) – Address of the exception handler code.

  • type – Type of the exception handler. Optional.

  • func_addr (Optional[int]) – Address of the function. Optional.

__init__(start_addr, size, handler_addr=None, type_=None, func_addr=None)[source]
start_addr
size
handler_addr
type
func_addr
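
To make the relationship between start_addr, size and handler_addr concrete, here is a minimal stand-alone sketch of looking up the handler that covers an address. It does not use CLE itself: ToyExceptionHandling and handler_for are hypothetical names, and treating the try block as the half-open range [start_addr, start_addr + size) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyExceptionHandling:
    """Hypothetical stand-in carrying the fields documented above."""
    start_addr: int
    size: int
    handler_addr: Optional[int] = None
    func_addr: Optional[int] = None


def handler_for(addr: int, records: list[ToyExceptionHandling]) -> Optional[int]:
    # Assumption: a try block covers [start_addr, start_addr + size).
    for rec in records:
        if rec.start_addr <= addr < rec.start_addr + rec.size:
            return rec.handler_addr
    return None


records = [
    ToyExceptionHandling(start_addr=0x1000, size=0x20, handler_addr=0x1100),
    ToyExceptionHandling(start_addr=0x2000, size=0x10, handler_addr=0x2100),
]
print(handler_for(0x1010, records))
```

A linear scan is enough for a handful of records; a sorted structure would be the natural choice for larger tables.
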
class cle.backends.backend.Backend[source]

Bases: object

Main base class for CLE binary objects.

An alternate interface to this constructor exists as the static method cle.loader.Loader.load_object().

Variables:
  • binary – The path to the file this object is loaded from

  • binary_basename – The basename of the filepath, or a short representation of the stream it was loaded from

  • is_main_bin – Whether this binary is loaded as the main executable

  • segments – A listing of all the loaded segments in this file

  • sections – A listing of all the demarked sections in the file

  • sections_map – A dict mapping from section name to section

  • imports – A mapping from symbol name to import relocation

  • resolved_imports – A list of all the import symbols that are successfully resolved

  • relocs – A list of all the relocations in this binary

  • irelatives – A list of tuples representing all the irelative relocations that need to be performed. The first item in the tuple is the address of the resolver function, and the second item is the address of where to write the result. The destination address is an RVA.

  • jmprel – A mapping from symbol name to the address of its jump slot relocation, i.e. its GOT entry.

  • arch (archinfo.arch.Arch) – The architecture of this binary

  • os (str) – The operating system this binary is meant to run under

  • mapped_base (int) – The base address of this object in virtual memory

  • deps – A list of names of shared libraries this binary depends on

  • linking – ‘dynamic’ or ‘static’

  • linked_base – The base address this object requests to be loaded at

  • pic (bool) – Whether this object is position-independent

  • execstack (bool) – Whether this executable has an executable stack

  • provides (str) – The name of the shared library dependency that this object resolves

  • symbols (list) – A list of symbols provided by this object, sorted by address

  • has_memory – Whether this backend is backed by a Clemory or not. As it stands now, a backend should still define min_addr and max_addr even if has_memory is False.

is_default = False
is_outer = False
__init__(binary, binary_stream, loader=None, is_main_bin=False, entry_point=None, arch=None, base_addr=None, force_rebase=False, has_memory=True, **kwargs)[source]
Parameters:
  • binary – The path to the binary to load

  • binary_stream – The open stream to this binary. The reference to this will be held until you call close.

  • is_main_bin – Whether this binary should be loaded as the main executable

property arch: Arch
property loader: Loader
close() → None[source]
Return type:

None

set_arch(arch)[source]
set_load_args(**kwargs) → None[source]
Return type:

None

property image_base_delta
property entry
property segments: Regions[Segment]
property sections: Regions[Section]
property symbols_by_addr
rebase(new_base)[source]

Rebase backend’s regions to the new base where they were mapped by the loader
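
As a rough illustration of what rebasing means for an individual address, the sketch below translates an address from the layout the object requested (linked_base) to the layout it was actually given (mapped_base). It is a stand-alone helper rather than CLE code: the name rebase_addr is hypothetical, and the arithmetic assumes the whole object moves by one constant delta.

```python
def rebase_addr(linked_addr: int, linked_base: int, mapped_base: int) -> int:
    """Translate an address from the link-time layout to the load-time layout.

    Hypothetical helper: assumes every region of the object is shifted by the
    same delta, mapped_base - linked_base.
    """
    delta = mapped_base - linked_base
    return linked_addr + delta


# An object that asked for 0x400000 but was mapped at 0x7F0000000000.
print(hex(rebase_addr(0x401000, 0x400000, 0x7F0000000000)))
```
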

relocate()[source]

Apply all resolved relocations to memory.

The meaning of “resolved relocations” is somewhat subtle - there is a linking step which attempts to resolve each relocation, currently only present in the main internal loading function since the calculation of which objects should be available

contains_addr(addr)[source]

Is addr in one of the binary’s loaded segments or sections (i.e. is it mapped into memory)?

find_loadable_containing(addr)[source]
find_segment_containing(addr: int) → Segment | None[source]

Returns the segment that contains addr, or None.

Return type:

Segment | None

Parameters:

addr (int)

find_section_containing(addr: int) → Section | None[source]

Returns the section that contains addr or None.

Return type:

Section | None

Parameters:

addr (int)

addr_to_offset(addr: int) → int | None[source]
Return type:

int | None

Parameters:

addr (int)

offset_to_addr(offset: int) → int | None[source]
Return type:

int | None

Parameters:

offset (int)

property min_addr: int

This returns the lowest virtual address contained in any loaded segment of the binary.

property max_addr: int

This returns the highest virtual address contained in any loaded segment of the binary.

property initializers: list[int]

Stub function. Should be overridden by backends that can provide initializer functions that ought to be run before execution reaches the entry point. Addresses should be rebased.

property finalizers: list[int]

Stub function. Like initializers, but with finalizers.

property threads: list

If this backend represents a dump of a running program, it may contain one or more thread contexts, i.e. register files. This property should contain a list of names for these threads, which should be unique.

thread_registers(thread=None) → dict[str, Any][source]

If this backend represents a dump of a running program, it may contain one or more thread contexts, i.e. register files. This method should return the register file for a given thread (as named in Backend.threads) as a dict mapping register names (as seen in archinfo) to numbers. If the thread is not specified, it should return the context for a “default” thread. If there are no threads, it should return an empty dict.

Return type:

dict[str, Any]

initial_register_values()[source]

Deprecated

get_symbol(name: str) → Symbol | None[source]

Stub function. Implement to find the symbol with name name.

Return type:

Symbol | None

Parameters:

name (str)

static extract_soname(path) → str | None[source]

Extracts the shared object identifier from the path, or returns None if it cannot.

Return type:

str | None

classmethod is_compatible(stream) → bool[source]

Determine quickly whether this backend can load an object from this stream

Return type:

bool

classmethod check_compatibility(spec, obj) → bool[source]

Performs a minimal static load of spec and returns whether it’s compatible with obj

Return type:

bool

classmethod check_magic_compatibility(stream: BinaryIO) → bool[source]

Check if a stream of bytes contains the same magic number as the main object

Return type:

bool

Parameters:

stream (BinaryIO)

cle.backends.backend.register_backend(name, cls)[source]
class cle.backends.symbol.SymbolType[source]

Bases: Enum

ABI-agnostic symbol types

TYPE_OTHER = 0
TYPE_NONE = 1
TYPE_FUNCTION = 2
TYPE_OBJECT = 3
TYPE_SECTION = 4
TYPE_TLS_OBJECT = 5
class cle.backends.symbol.SymbolSubType[source]

Bases: Enum

Abstract base class for ABI-specific symbol types

to_base_type() → SymbolType[source]

A subclass’ ABI-specific mapping to SymbolType.

Return type:

SymbolType
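
A subtype mapping of this kind might look like the sketch below. It is self-contained and does not import CLE: SymbolType is a local copy of the members listed above, while ToySymbolSubType and its members are invented purely for illustration and do not correspond to any real ABI.

```python
from enum import Enum


class SymbolType(Enum):
    """Local copy of the ABI-agnostic types listed above, for illustration only."""
    TYPE_OTHER = 0
    TYPE_NONE = 1
    TYPE_FUNCTION = 2
    TYPE_OBJECT = 3
    TYPE_SECTION = 4
    TYPE_TLS_OBJECT = 5


class ToySymbolSubType(Enum):
    """Hypothetical ABI-specific subtype; member names and values are made up."""
    TOY_FUNC = 10
    TOY_DATA = 11
    TOY_WEIRD = 12

    def to_base_type(self) -> SymbolType:
        # Map each ABI-specific member to the closest ABI-agnostic type,
        # falling back to TYPE_OTHER when there is no clear equivalent.
        mapping = {
            ToySymbolSubType.TOY_FUNC: SymbolType.TYPE_FUNCTION,
            ToySymbolSubType.TOY_DATA: SymbolType.TYPE_OBJECT,
        }
        return mapping.get(self, SymbolType.TYPE_OTHER)


print(ToySymbolSubType.TOY_FUNC.to_base_type())
```
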

class cle.backends.symbol.Symbol[source]

Bases: object

Representation of a symbol from a binary file. Smart enough to rebase itself.

There should never be more than one Symbol instance representing a single symbol. To ensure this, create new symbols only through cle.backends.Backend.get_symbol().

Variables:
  • owner (cle.backends.Backend) – The object that contains this symbol

  • name (str) – The name of this symbol

  • addr (int) – The un-based address of this symbol, an RVA

  • size (int) – The size of this symbol

  • _type – The ABI-agnostic type of this symbol

  • resolved (bool) – Whether this import symbol has been resolved to a real symbol

  • resolvedby (None or cle.backends.Symbol) – The real symbol this import symbol has been resolved to

  • resolvewith (str) – The name of the library we must use to resolve this symbol, or None if none is required.

__init__(owner: Backend, name: str, relative_addr: int, size: int, sym_type: SymbolType)[source]

Not documenting this since if you try calling it, you’re wrong.

resolve(obj)[source]
property type: SymbolType

The ABI-agnostic SymbolType. Must be overridden by derived types.

property subtype: SymbolSubType

A subclass’ ABI-specific types

property rebased_addr

The address of this symbol in the global memory space

property linked_addr
property is_function

Whether this symbol is a function

is_static = False
is_common = False
is_import = False
is_export = False
is_local = False
is_weak = False
is_extern = False
is_forward = False
resolve_forwarder()[source]

If this symbol is a forwarding export, return the symbol the forwarding refers to, or None if it cannot be found

property owner_obj
class cle.backends.regions.Regions[source]

Bases: Generic[R]

A container class acting as a list of regions (sections or segments). Additionally, it keeps a sorted list of all regions that are mapped into memory to allow fast lookups.

We assume none of the regions overlap with others.

__init__(lst: list[R] | None = None)[source]
Parameters:

lst (list[R] | None)

property raw_list: list[R]

Get the internal list. Changes to it are not tracked, so _sorted_list will not be updated. You therefore probably do not want to modify the list.

Returns:

The internal list container.

Return type:

list

property max_addr: int | None

Get the highest address of all regions.

Returns:

The highest address of all regions, or None if there is no region available.

Return type:

int or None

append(region: R)[source]

Append a new Region instance into the list.

Parameters:

region (TypeVar(R, bound= Region)) – The region to append.

remove(region: R) → None[source]

Remove an existing Region instance from the list.

Parameters:

region (TypeVar(R, bound= Region)) – The region to remove.

Return type:

None

find_region_containing(addr: int) → R | None[source]

Find the region that contains a specific address. Returns None if none of the regions covers the address.

Parameters:

addr (int) – The address.

Return type:

Optional[TypeVar(R, bound= Region)]

Returns:

The region that covers the specific address, or None if no such region is found.

find_region_next_to(addr: int) → R | None[source]

Find the next region after the given address.

Parameters:

addr (int) – The address to test.

Return type:

Optional[TypeVar(R, bound= Region)]

Returns:

The next region that goes after the given address, or None if there is no region after the address.
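
The two lookups above can be sketched with a sorted list and binary search. This is only an illustration of the idea, not CLE's implementation: ToyRegion and ToyRegions are hypothetical stand-ins, regions are assumed not to overlap, and "next to" is interpreted here as the first region starting strictly after the address.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyRegion:
    """Hypothetical stand-in for a region covering [vaddr, vaddr + memsize)."""
    vaddr: int
    memsize: int

    def contains_addr(self, addr: int) -> bool:
        return self.vaddr <= addr < self.vaddr + self.memsize


class ToyRegions:
    """Keeps regions sorted by start address; assumes they never overlap."""

    def __init__(self, regions: list[ToyRegion]):
        self._sorted = sorted(regions, key=lambda r: r.vaddr)
        self._starts = [r.vaddr for r in self._sorted]

    def find_region_containing(self, addr: int) -> Optional[ToyRegion]:
        # The only candidate is the last region starting at or before addr.
        idx = bisect_right(self._starts, addr) - 1
        if idx >= 0 and self._sorted[idx].contains_addr(addr):
            return self._sorted[idx]
        return None

    def find_region_next_to(self, addr: int) -> Optional[ToyRegion]:
        # First region whose start address is strictly greater than addr.
        idx = bisect_right(self._starts, addr)
        return self._sorted[idx] if idx < len(self._sorted) else None


regions = ToyRegions([ToyRegion(0x2000, 0x100), ToyRegion(0x1000, 0x200)])
print(regions.find_region_containing(0x1100))
print(regions.find_region_next_to(0x1200))
```

Because the regions do not overlap, a single bisect per query is sufficient, which is what makes the sorted list worth maintaining.
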

class cle.backends.region.Region[source]

Bases: object

A region of memory that is mapped in the object’s file.

Variables:
  • offset – The offset into the file the region starts.

  • vaddr – The virtual address.

  • filesize – The size of the region in the file.

  • memsize – The size of the region when loaded into memory.

The prefix v- on a variable or parameter name indicates that it refers to the virtual, loaded memory space, while a corresponding variable without the v- refers to the flat zero-based memory of the file.

When used next to each other, addr and offset refer to virtual memory address and file offset, respectively.

__init__(offset, vaddr, filesize, memsize)[source]
vaddr: int
memsize: int
filesize: int
contains_addr(addr)[source]

Does this region contain this virtual address?

contains_offset(offset)[source]

Does this region contain this offset into the file?

addr_to_offset(addr)[source]

Convert a virtual memory address into a file offset

offset_to_addr(offset)[source]

Convert a file offset into a virtual memory address
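
The conversions between virtual addresses and file offsets can be illustrated with a small stand-alone class. SketchRegion is a hypothetical stand-in that mirrors the four documented variables; returning None for values outside the region is an assumption made here, not a statement about CLE's behaviour.

```python
from typing import Optional


class SketchRegion:
    """Hypothetical stand-in mirroring offset, vaddr, filesize and memsize."""

    def __init__(self, offset: int, vaddr: int, filesize: int, memsize: int):
        self.offset = offset
        self.vaddr = vaddr
        self.filesize = filesize
        self.memsize = memsize

    def contains_addr(self, addr: int) -> bool:
        return self.vaddr <= addr < self.vaddr + self.memsize

    def contains_offset(self, offset: int) -> bool:
        return self.offset <= offset < self.offset + self.filesize

    def addr_to_offset(self, addr: int) -> Optional[int]:
        # Addresses beyond filesize (e.g. a zero-filled tail) have no file backing.
        offset = addr - self.vaddr + self.offset
        return offset if self.contains_offset(offset) else None

    def offset_to_addr(self, offset: int) -> Optional[int]:
        addr = offset - self.offset + self.vaddr
        return addr if self.contains_addr(addr) else None


# memsize > filesize: the last 0x80 bytes exist only in memory.
region = SketchRegion(offset=0x200, vaddr=0x401000, filesize=0x100, memsize=0x180)
print(region.addr_to_offset(0x401010))
print(region.addr_to_offset(0x401150))
```
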

property max_addr

The maximum virtual address of this region

property min_addr

The minimum virtual address of this region

property max_offset

The maximum file offset of this region

min_offset()[source]

The minimum file offset of this region

property is_readable: bool
property is_writable: bool
property is_executable: bool
class cle.backends.region.Segment[source]

Bases: Region

class cle.backends.region.EmptySegment[source]

Bases: Segment

A segment with permissions but no static content.

__init__(vaddr, memsize, is_readable=True, is_writable=True, is_executable=False)[source]
property is_executable
property is_writable
property is_readable
property only_contains_uninitialized_data

Whether this segment is initialized to zero after the executable is loaded.

class cle.backends.region.Section[source]

Bases: Region

Simple representation of a loaded section.

Variables:

name (str) – The name of the section

__init__(name, offset, vaddr, size)[source]
Parameters:
  • name (str) – The name of the section

  • offset (int) – The offset into the binary file this section begins

  • vaddr (int) – The address in virtual memory this section begins

  • size (int) – How large this section is

property is_readable

Whether this section has read permissions

property is_writable

Whether this section has write permissions

property is_executable

Whether this section has execute permissions

property only_contains_uninitialized_data

Whether this section is initialized to zero after the executable is loaded.

class cle.backends.named_region.NamedRegion[source]

Bases: Backend

A NamedRegion represents a region of memory that has a name and a location, but no static content.

This region also has permissions; with no memory, these obviously don’t do anything on their own, but they help inform any other code that relies on CLE (e.g., angr).

This can be used as a placeholder for memory that should exist in CLE’s view, but for which it does not need data, like RAM, MMIO, etc.

is_default = False
__init__(name, start, end, is_readable=True, is_writable=True, is_executable=False, **kwargs)[source]

Create a NamedRegion.

Parameters:
  • name – The name of the region

  • start – The start address of the region

  • end – The end address (exclusive) of the region

  • is_readable – Whether the region is readable

  • is_writable – Whether the region is writable

  • is_executable – Whether the region is executable

  • kwargs

has_memory = False
static is_compatible(stream)[source]

Determine quickly whether this backend can load an object from this stream

property min_addr

This returns the lowest virtual address contained in any loaded segment of the binary.

property max_addr

This returns the highest virtual address contained in any loaded segment of the binary.

function_name(addr)[source]

NamedRegions don’t support function names.

contains_addr(addr)[source]

Is addr in one of the binary’s loaded segments or sections (i.e. is it mapped into memory)?

classmethod check_compatibility(spec, obj)[source]

Performs a minimal static load of spec and returns whether it’s compatible with obj

class cle.backends.externs.ExternSegment[source]

Bases: Segment

__init__(map_size)[source]
addr_to_offset(addr)[source]

Convert a virtual memory address into a file offset

offset_to_addr(offset)[source]

Convert a file offset into a virtual memory address

contains_offset(offset)[source]

Does this region contain this offset into the file?

is_readable = True
is_writable = True
is_executable = True
class cle.backends.externs.TOCRelocation[source]

Bases: Relocation

property value
class cle.backends.externs.ExternObject[source]

Bases: Backend

__init__(loader, map_size=0, tls_size=0)[source]

rebase(new_base)[source]

Rebase backend’s regions to the new base where they were mapped by the loader

make_extern(name, size=0, alignment=None, thumb=False, sym_type=SymbolType.TYPE_FUNCTION, point_to=None, libname=None) → Symbol[source]
Return type:

Symbol

get_pseudo_addr(name) → int[source]
Return type:

int

allocate(size=1, alignment=8, thumb=False, tls=False) → int[source]
Return type:

int

property max_addr

This returns the highest virtual address contained in any loaded segment of the binary.

make_import(name, sym_type)[source]
class cle.backends.externs.KernelObject[source]

Bases: Backend

__init__(loader, map_size=32768)[source]

add_name(name, addr)[source]
property max_addr

This returns the highest virtual address contained in any loaded segment of the binary.

class cle.backends.externs.PointToPrecise[source]

Bases: PointTo

pointto_precise = None
relocations()[source]

Maybe implement me: If you like, return a list of relocation objects to apply. To create new import symbols, use self.owner.make_import.

class cle.backends.externs.simdata.SimData[source]

Bases: Symbol

A SimData class is used to provide data when there is an unresolved data import symbol.

To use it, subclass this class and implement the below attributes and methods.

Variables:
  • name – The name of the symbol to provide

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • type – The type of the symbol, usually SymbolType.TYPE_OBJECT.

Use the below register method to register SimData subclasses with CLE.

NOTE: SimData.type hides the Symbol.type instance property

name: str = NotImplemented
type: SymbolType = NotImplemented
libname: str = NotImplemented
classmethod static_size(owner) → int[source]

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

Return type:

int

value() → bytes[source]

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)

Return type:

bytes

relocations() → list[Relocation][source]

Maybe implement me: If you like, return a list of relocation objects to apply. To create new import symbols, use self.owner.make_import.

Return type:

list[Relocation]

cle.backends.externs.simdata.lookup(name: str, libname) → type[SimData] | None[source]
Return type:

type[SimData] | None

Parameters:

name (str)

cle.backends.externs.simdata.register(simdata_cls: type[SimData])[source]

Register the given SimData class with CLE so it may be used during loading

Parameters:

simdata_cls (type[SimData])
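
The subclass-then-register workflow described above can be sketched without CLE as a plain class registry. Everything below is a simplified stand-in: ToySimData, ToyErrno and the module-level dictionary are hypothetical, and this lookup ignores libname for brevity, which the real function may not do.

```python
from typing import Optional

# Hypothetical registry keyed by the symbol name each class provides.
_registry: dict[str, type] = {}


class ToySimData:
    """Hypothetical base: subclasses set `name` and implement both methods."""
    name: str = NotImplemented

    @classmethod
    def static_size(cls, owner) -> int:
        raise NotImplementedError

    def value(self) -> bytes:
        raise NotImplementedError


def register(simdata_cls: type) -> None:
    # Index the class by the symbol name it provides.
    _registry[simdata_cls.name] = simdata_cls


def lookup(name: str, libname=None) -> Optional[type]:
    # Simplification: libname is accepted but not consulted in this sketch.
    return _registry.get(name)


class ToyErrno(ToySimData):
    """Provides four zero bytes for an invented symbol called toy_errno."""
    name = "toy_errno"

    @classmethod
    def static_size(cls, owner) -> int:
        return 4

    def value(self) -> bytes:
        return b"\x00" * 4


register(ToyErrno)
print(lookup("toy_errno"))
```

Note that value() returns exactly as many bytes as static_size() reports, which is the contract the documentation above asks implementers to honour.
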

class cle.backends.externs.simdata.simdata.SimData[source]

Bases: Symbol

A SimData class is used to provide data when there is an unresolved data import symbol.

To use it, subclass this class and implement the below attributes and methods.

Variables:
  • name – The name of the symbol to provide

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • type – The type of the symbol, usually SymbolType.TYPE_OBJECT.

Use the below register method to register SimData subclasses with CLE.

NOTE: SimData.type hides the Symbol.type instance property

name: str = NotImplemented
type: SymbolType = NotImplemented
libname: str = NotImplemented
classmethod static_size(owner) → int[source]

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

Return type:

int

value() → bytes[source]

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)

Return type:

bytes

relocations() → list[Relocation][source]

Maybe implement me: If you like, return a list of relocation objects to apply. To create new import symbols, use self.owner.make_import.

Return type:

list[Relocation]

owner: Backend
cle.backends.externs.simdata.simdata.register(simdata_cls: type[SimData])[source]

Register the given SimData class with CLE so it may be used during loading

Parameters:

simdata_cls (type[SimData])

cle.backends.externs.simdata.simdata.lookup(name: str, libname) → type[SimData] | None[source]
Return type:

type[SimData] | None

Parameters:

name (str)

class cle.backends.externs.simdata.common.StaticData[source]

Bases: SimData

A simple SimData utility class to use when you have a SimData which should provide just a static set of bytes. To use, implement the following:

Variables:
  • name – The name of the symbol to provide.

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • data – The bytes to provide

type: SymbolType = 3
data: bytes = NotImplemented
classmethod static_size(owner)[source]

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

value()[source]

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)

class cle.backends.externs.simdata.common.StaticWord[source]

Bases: SimData

A simple SimData utility class to use when you have a SimData which should provide just a static integer. To use, implement the following:

Variables:
  • name – The name of the symbol to provide.

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • word – The value to provide

  • wordsize – (optional) The size of the value in bytes; defaults to the CPU word size

type: SymbolType = 3
word: int = NotImplemented
wordsize: int = None
classmethod static_size(owner)[source]

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

value()[source]

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)
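
Turning a word and a wordsize into the bytes that value() is expected to return comes down to fixed-width integer encoding. The helper below is a hypothetical, stand-alone sketch: pack_word is not a CLE function, and byte order is exposed as a parameter here because this example has no architecture object to take it from.

```python
def pack_word(word: int, wordsize: int, little_endian: bool = True) -> bytes:
    """Encode an unsigned integer as exactly `wordsize` bytes.

    Hypothetical helper; a real loader would presumably derive the byte order
    and default size from the target architecture.
    """
    return word.to_bytes(wordsize, "little" if little_endian else "big")


print(pack_word(0x1234, 4))
print(pack_word(0x1234, 4, little_endian=False))
```

int.to_bytes raises OverflowError if the value does not fit in wordsize bytes, which is a reasonable failure mode for a misdeclared word.
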

class cle.backends.externs.simdata.common.PointTo[source]

Bases: SimData

A simple SimData utility class to use when you have a SimData which should provide just a pointer to some other symbol. To use, implement the following:

Variables:
  • name – The name of the symbol to provide.

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • pointto_name – The name of the symbol to point to

  • pointto_type – The type of the symbol to point to (usually SymbolType.TYPE_FUNCTION or SymbolType.TYPE_OBJECT)

  • addend – (optional) an integer to be added to the symbol’s address before storage

pointto_name: str = NotImplemented
pointto_type: SymbolType = NotImplemented
type: SymbolType = 3
addend: int = 0
classmethod static_size(owner)[source]

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

value()[source]

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)

relocations()[source]

Maybe implement me: If you like, return a list of relocation objects to apply. To create new import symbols, use self.owner.make_import.

class cle.backends.externs.simdata.common.SimDataSimpleRelocation[source]

Bases: Relocation

A relocation used to implement PointTo. Pretty simple.

__init__(owner, symbol, addr, addend, preresolved=False)[source]
resolve_symbol(solist, **kwargs)[source]
property value