Backend Interface#

class cle.backends.backend.FunctionHintSource[source]#

Bases: object

Enums that describe the source of function hints.

EH_FRAME = 0#
EXTERNAL_EH_FRAME = 1#
class cle.backends.backend.FunctionHint[source]#

Bases: object

Describes a function hint.

Variables:
  • addr (int) – Address of the function.

  • size (int) – Size of the function.

  • source (int) – Source of this hint.

__init__(addr, size, source)[source]#
addr#
size#
source#
class cle.backends.backend.ExceptionHandling[source]#

Bases: object

Describes an exception handling record.

Exception handlers are usually language-specific. In C++, they are usually implemented as try {} catch {} blocks.

Variables:
  • start_addr (int) – The beginning of the try block.

  • size (int) – Size of the try block.

  • handler_addr (Optional[int]) – Address of the exception handler code.

  • type – Type of the exception handler. Optional.

  • func_addr (Optional[int]) – Address of the function. Optional.

__init__(start_addr, size, handler_addr=None, type_=None, func_addr=None)[source]#
start_addr#
size#
handler_addr#
type#
func_addr#
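
To illustrate how start_addr, size, and handler_addr relate, here is a minimal sketch. It uses a hypothetical stand-in record (TryRange) rather than the CLE class, and the helper find_handler is invented for this example. It also assumes that a try block covers the half-open range [start_addr, start_addr + size), which the text above does not state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TryRange:
    """Hypothetical stand-in mirroring the documented ExceptionHandling fields."""
    start_addr: int
    size: int
    handler_addr: Optional[int] = None


def find_handler(records: List[TryRange], addr: int) -> Optional[int]:
    """Return the handler address of the first try range covering addr, if any."""
    for rec in records:
        # Assumption: the range is half-open, [start_addr, start_addr + size).
        if rec.start_addr <= addr < rec.start_addr + rec.size:
            return rec.handler_addr
    return None


records = [
    TryRange(0x1000, 0x20, handler_addr=0x1800),
    TryRange(0x1040, 0x10),  # a record without a known handler
]
print(find_handler(records, 0x1010))
```

A record whose handler_addr is None yields None here, so a caller cannot tell "no handler" from "no covering range" without a separate check.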
class cle.backends.backend.Backend[source]#

Bases: object

Main base class for CLE binary objects.

An alternate interface to this constructor exists as the static method cle.loader.Loader.load_object()

Variables:
  • binary – The path to the file this object is loaded from

  • binary_basename – The basename of the filepath, or a short representation of the stream it was loaded from

  • is_main_bin – Whether this binary is loaded as the main executable

  • segments – A listing of all the loaded segments in this file

  • sections – A listing of all the demarcated sections in the file

  • sections_map – A dict mapping from section name to section

  • imports – A mapping from symbol name to import relocation

  • resolved_imports – A list of all the import symbols that are successfully resolved

  • relocs – A list of all the relocations in this binary

  • irelatives – A list of tuples representing all the irelative relocations that need to be performed. The first item in the tuple is the address of the resolver function, and the second item is the address of where to write the result. The destination address is an RVA.

  • jmprel – A mapping from symbol name to the address of its jump slot relocation, i.e. its GOT entry.

  • arch (archinfo.arch.Arch) – The architecture of this binary

  • os (str) – The operating system this binary is meant to run under

  • mapped_base (int) – The base address of this object in virtual memory

  • deps – A list of names of shared libraries this binary depends on

  • linking – ‘dynamic’ or ‘static’

  • linked_base – The base address this object requests to be loaded at

  • pic (bool) – Whether this object is position-independent

  • execstack (bool) – Whether this executable has an executable stack

  • provides (str) – The name of the shared library dependency that this object resolves

  • symbols (list) – A list of symbols provided by this object, sorted by address

  • has_memory – Whether this backend is backed by a Clemory or not. As it stands now, a backend should still define min_addr and max_addr even if has_memory is False.

is_default = False#
__init__(binary, binary_stream, loader=None, is_main_bin=False, entry_point=None, arch=None, base_addr=None, force_rebase=False, has_memory=True, **kwargs)[source]#
Parameters:
  • binary – The path to the binary to load

  • binary_stream – The open stream to this binary. The reference to this will be held until you call close.

  • is_main_bin – Whether this binary should be loaded as the main executable

property arch: Arch#
property loader: Loader#
close() None[source]#
Return type:

None

set_arch(arch)[source]#
property image_base_delta#
property entry#
property segments: Regions[Segment]#
property sections: Regions[Section]#
property symbols_by_addr#
rebase(new_base)[source]#

Rebase backend’s regions to the new base where they were mapped by the loader

relocate()[source]#

Apply all resolved relocations to memory.

The meaning of “resolved relocations” is somewhat subtle - there is a linking step which attempts to resolve each relocation, currently only present in the main internal loading function since the calculation of which objects should be available

contains_addr(addr)[source]#

Is addr in one of the binary’s segments/sections we have loaded? (i.e., is it mapped into memory?)

find_loadable_containing(addr)[source]#
find_segment_containing(addr: int) Segment | None[source]#

Returns the segment that contains addr, or None.

Return type:

Optional[Segment]

Parameters:

addr (int) –

find_section_containing(addr: int) Section | None[source]#

Returns the section that contains addr or None.

Return type:

Optional[Section]

Parameters:

addr (int) –

addr_to_offset(addr: int) int | None[source]#
Return type:

Optional[int]

Parameters:

addr (int) –

offset_to_addr(offset: int) int | None[source]#
Return type:

Optional[int]

Parameters:

offset (int) –

property min_addr: int#

This returns the lowest virtual address contained in any loaded segment of the binary.

property max_addr: int#

This returns the highest virtual address contained in any loaded segment of the binary.

property initializers: List[int]#

Stub function. Should be overridden by backends that can provide initializer functions that ought to be run before execution reaches the entry point. Addresses should be rebased.

property finalizers: List[int]#

Stub function. Like initializers, but with finalizers.

property threads: List#

If this backend represents a dump of a running program, it may contain one or more thread contexts, i.e. register files. This property should contain a list of names for these threads, which should be unique.

thread_registers(thread=None) Dict[str, Any][source]#

If this backend represents a dump of a running program, it may contain one or more thread contexts, i.e. register files. This method should return the register file for a given thread (as named in Backend.threads) as a dict mapping register names (as seen in archinfo) to numbers. If the thread is not specified, it should return the context for a “default” thread. If there are no threads, it should return an empty dict.

Return type:

Dict[str, Any]
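
The contract described for threads and thread_registers can be sketched as follows. CoreDumpSketch is a hypothetical class that does not derive from Backend, the thread names and register values are made up, and treating the first recorded thread as the "default" one is an assumption of this sketch.

```python
from typing import Any, Dict, List, Optional


class CoreDumpSketch:
    """Hypothetical backend-like class holding per-thread register files."""

    def __init__(self, contexts: Dict[str, Dict[str, Any]]):
        self._contexts = contexts

    @property
    def threads(self) -> List[str]:
        # Dict keys are unique, which matches the uniqueness requirement.
        return list(self._contexts)

    def thread_registers(self, thread: Optional[str] = None) -> Dict[str, Any]:
        if not self._contexts:
            return {}  # no threads: empty register file
        if thread is None:
            # Assumption: the first recorded thread acts as the default.
            thread = self.threads[0]
        return dict(self._contexts[thread])


dump = CoreDumpSketch({
    "thread-0": {"rip": 0x401000, "rsp": 0x7FFF0000},
    "thread-1": {"rip": 0x402000, "rsp": 0x7FFE0000},
})
print(dump.threads, dump.thread_registers())
```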

initial_register_values()[source]#

Deprecated

get_symbol(name: str) Symbol | None[source]#

Stub function. Implement to find the symbol with name name.

Return type:

Optional[Symbol]

Parameters:

name (str) –

static extract_soname(path) str | None[source]#

Extracts the shared object identifier from the path, or returns None if it cannot.

Return type:

Optional[str]

classmethod is_compatible(stream) bool[source]#

Determine quickly whether this backend can load an object from this stream

Return type:

bool
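
One way a backend might implement a quick compatibility check is to compare the first bytes of the stream against a magic number. The sketch below is not taken from CLE: MagicBackendSketch is a hypothetical class, and restoring the stream position after peeking is a design choice of this example. The ELF magic bytes themselves are standard.

```python
import io


class MagicBackendSketch:
    """Hypothetical backend stand-in; it does not derive from cle's Backend."""

    MAGIC = b"\x7fELF"  # ELF files start with these four bytes

    @classmethod
    def is_compatible(cls, stream) -> bool:
        # Peek at the header, then restore the position so later readers
        # see the stream unchanged.
        pos = stream.tell()
        try:
            return stream.read(len(cls.MAGIC)) == cls.MAGIC
        finally:
            stream.seek(pos)


elf_like = io.BytesIO(b"\x7fELF\x02\x01\x01" + b"\x00" * 9)
other = io.BytesIO(b"MZ\x90\x00")
print(MagicBackendSketch.is_compatible(elf_like), MagicBackendSketch.is_compatible(other))
```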

classmethod check_compatibility(spec, obj) bool[source]#

Performs a minimal static load of spec and returns whether it’s compatible with obj

Return type:

bool

classmethod check_magic_compatibility(stream: BinaryIO) bool[source]#

Check if a stream of bytes contains the same magic number as the main object

Return type:

bool

Parameters:

stream (BinaryIO) –

cle.backends.backend.register_backend(name, cls)[source]#
class cle.backends.symbol.SymbolType[source]#

Bases: Enum

ABI-agnostic symbol types

TYPE_OTHER = 0#
TYPE_NONE = 1#
TYPE_FUNCTION = 2#
TYPE_OBJECT = 3#
TYPE_SECTION = 4#
TYPE_TLS_OBJECT = 5#
class cle.backends.symbol.SymbolSubType[source]#

Bases: Enum

Abstract base class for ABI-specific symbol types

to_base_type() SymbolType[source]#

A subclass’s ABI-specific mapping to SymbolType

Return type:

SymbolType

class cle.backends.symbol.Symbol[source]#

Bases: object

Representation of a symbol from a binary file. Smart enough to rebase itself.

There should never be more than one Symbol instance representing a single symbol. To make sure of this, only use cle.backends.Backend.get_symbol() to create new symbols.

Variables:
  • owner (cle.backends.Backend) – The object that contains this symbol

  • name (str) – The name of this symbol

  • addr (int) – The un-based address of this symbol, an RVA

  • size (int) – The size of this symbol

  • _type – The ABI-agnostic type of this symbol

  • resolved (bool) – Whether this import symbol has been resolved to a real symbol

  • resolvedby (None or cle.backends.Symbol) – The real symbol this import symbol has been resolved to

  • resolvewith (str) – The name of the library we must use to resolve this symbol, or None if none is required.

__init__(owner: Backend, name: str, relative_addr: int, size: int, sym_type: SymbolType)[source]#

Not documenting this since if you try calling it, you’re wrong.

resolve(obj)[source]#
property type: SymbolType#

The ABI-agnostic SymbolType. Must be overridden by derived types.

property subtype: SymbolSubType#

A subclass’ ABI-specific types

property rebased_addr#

The address of this symbol in the global memory space

property linked_addr#
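
The relationship between a symbol's un-based address and its two based addresses can be sketched as below. The classes are hypothetical stand-ins, and the arithmetic (RVA plus linked_base, RVA plus mapped_base) is an assumption inferred from the field descriptions on this page, not a quotation of CLE's code.

```python
from dataclasses import dataclass


@dataclass
class ObjectSketch:
    """Hypothetical stand-in for the two base addresses documented on Backend."""
    linked_base: int  # base address the object requests to be loaded at
    mapped_base: int  # base address of the object in virtual memory


@dataclass
class SymbolSketch:
    owner: ObjectSketch
    relative_addr: int  # un-based address (RVA)

    @property
    def linked_addr(self) -> int:
        # Assumption: RVA plus the requested base.
        return self.relative_addr + self.owner.linked_base

    @property
    def rebased_addr(self) -> int:
        # Assumption: RVA plus the base the object was actually mapped at.
        return self.relative_addr + self.owner.mapped_base


obj = ObjectSketch(linked_base=0x400000, mapped_base=0x7F0000000000)
sym = SymbolSketch(obj, relative_addr=0x1234)
print(hex(sym.linked_addr), hex(sym.rebased_addr))
```

Under these assumptions, the difference between the two addresses equals mapped_base minus linked_base for every symbol in the same object.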
property is_function#

Whether this symbol is a function

is_static = False#
is_common = False#
is_import = False#
is_export = False#
is_local = False#
is_weak = False#
is_extern = False#
is_forward = False#
resolve_forwarder()[source]#

If this symbol is a forwarding export, return the symbol the forwarding refers to, or None if it cannot be found

property owner_obj#
class cle.backends.regions.Regions[source]#

Bases: Generic[R]

A container class acting as a list of regions (sections or segments). Additionally, it keeps a sorted list of all regions that are mapped into memory to allow fast lookups.

We assume none of the regions overlap with others.

__init__(lst: List[R] | None = None)[source]#
Parameters:

lst (List[R] | None) –

property raw_list: List[R]#

Get the internal list. Changes to it are not tracked, so _sorted_list will not be updated. You therefore probably do not want to modify the list.

Returns:

The internal list container.

Return type:

list

property max_addr: int | None#

Get the highest address of all regions.

Returns:

The highest address of all regions, or None if there is no region available.

Return type:

int or None

append(region: R)[source]#

Append a new Region instance into the list.

Parameters:

region (TypeVar(R, bound= Region)) – The region to append.

remove(region: R) None[source]#

Remove an existing Region instance from the list.

Parameters:

region (TypeVar(R, bound= Region)) – The region to remove.

Return type:

None

find_region_containing(addr: int) R | None[source]#

Find the region that contains a specific address. Returns None if none of the regions covers the address.

Parameters:

addr (int) – The address.

Return type:

Optional[TypeVar(R, bound= Region)]

Returns:

The region that covers the specific address, or None if no such region is found.

find_region_next_to(addr: int) R | None[source]#

Find the next region after the given address.

Parameters:

addr (int) – The address to test.

Return type:

Optional[TypeVar(R, bound= Region)]

Returns:

The next region that goes after the given address, or None if there is no region after the address.
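
A sorted list of non-overlapping regions allows both lookups to be done by binary search. The sketch below is an illustration, not CLE's implementation: regions are modelled as plain (start, size) tuples, RegionsSketch is a hypothetical class, and "next to" is interpreted as the first region that starts strictly after the address.

```python
import bisect
from typing import List, Optional, Tuple

# Stand-in for a region: (start address, size in bytes).
RegionT = Tuple[int, int]


class RegionsSketch:
    def __init__(self, regions: Optional[List[RegionT]] = None):
        self._sorted: List[RegionT] = sorted(regions or [])
        self._starts: List[int] = [start for start, _ in self._sorted]

    def find_region_containing(self, addr: int) -> Optional[RegionT]:
        # Rightmost region whose start is <= addr, if there is one.
        idx = bisect.bisect_right(self._starts, addr) - 1
        if idx >= 0:
            start, size = self._sorted[idx]
            if start <= addr < start + size:
                return self._sorted[idx]
        return None

    def find_region_next_to(self, addr: int) -> Optional[RegionT]:
        # Assumption: the first region starting strictly after addr.
        idx = bisect.bisect_right(self._starts, addr)
        return self._sorted[idx] if idx < len(self._sorted) else None


regions = RegionsSketch([(0x3000, 0x100), (0x1000, 0x200)])
print(regions.find_region_containing(0x1100), regions.find_region_next_to(0x1200))
```

The binary search is only valid because the regions do not overlap, which matches the assumption stated for this class.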

class cle.backends.region.Region[source]#

Bases: object

A region of memory that is mapped in the object’s file.

Variables:
  • offset – The offset into the file at which the region starts.

  • vaddr – The virtual address.

  • filesize – The size of the region in the file.

  • memsize – The size of the region when loaded into memory.

The prefix v- on a variable or parameter name indicates that it refers to the virtual, loaded memory space, while a corresponding variable without the v- refers to the flat zero-based memory of the file.

When used next to each other, addr and offset refer to virtual memory address and file offset, respectively.

__init__(offset, vaddr, filesize, memsize)[source]#
vaddr: int#
memsize: int#
filesize: int#
contains_addr(addr)[source]#

Does this region contain this virtual address?

contains_offset(offset)[source]#

Does this region contain this offset into the file?

addr_to_offset(addr)[source]#

Convert a virtual memory address into a file offset

offset_to_addr(offset)[source]#

Convert a file offset into a virtual memory address
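
The two conversions are simple translations between the file range and the virtual range of a region. The following sketch uses a hypothetical RegionSketch class with the four documented fields. Returning None for values outside the region is an assumption of this example.

```python
from typing import Optional


class RegionSketch:
    """Hypothetical stand-in with the four documented Region fields."""

    def __init__(self, offset: int, vaddr: int, filesize: int, memsize: int):
        self.offset = offset
        self.vaddr = vaddr
        self.filesize = filesize
        self.memsize = memsize

    def contains_addr(self, addr: int) -> bool:
        return self.vaddr <= addr < self.vaddr + self.memsize

    def contains_offset(self, offset: int) -> bool:
        return self.offset <= offset < self.offset + self.filesize

    def addr_to_offset(self, addr: int) -> Optional[int]:
        offset = addr - self.vaddr + self.offset
        # Assumption: addresses with no backing file bytes map to None.
        return offset if self.contains_offset(offset) else None

    def offset_to_addr(self, offset: int) -> Optional[int]:
        addr = offset - self.offset + self.vaddr
        return addr if self.contains_addr(addr) else None


# memsize > filesize models a region with trailing data not present in the file.
seg = RegionSketch(offset=0x200, vaddr=0x401000, filesize=0x100, memsize=0x180)
print(hex(seg.addr_to_offset(0x401010)), seg.addr_to_offset(0x401150))
```

An address can lie inside the region in memory and still have no file offset when memsize exceeds filesize, as the second call shows.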

property max_addr#

The maximum virtual address of this region

property min_addr#

The minimum virtual address of this region

property max_offset#

The maximum file offset of this region

min_offset()[source]#

The minimum file offset of this region

property is_readable: bool#
property is_writable: bool#
property is_executable: bool#
class cle.backends.region.Segment[source]#

Bases: Region

vaddr: int#
memsize: int#
filesize: int#
class cle.backends.region.EmptySegment[source]#

Bases: Segment

A segment with permissions but no static content

__init__(vaddr, memsize, is_readable=True, is_writable=True, is_executable=False)[source]#
property is_executable#
property is_writable#
property is_readable#
property only_contains_uninitialized_data#

Whether this section is initialized to zero after the executable is loaded.

vaddr: int#
memsize: int#
filesize: int#
class cle.backends.region.Section[source]#

Bases: Region

Simple representation of a loaded section.

Variables:

name (str) – The name of the section

__init__(name, offset, vaddr, size)[source]#
Parameters:
  • name (str) – The name of the section

  • offset (int) – The offset into the binary file this section begins

  • vaddr (int) – The address in virtual memory this section begins

  • size (int) – How large this section is

property is_readable#

Whether this section has read permissions

property is_writable#

Whether this section has write permissions

property is_executable#

Whether this section has execute permissions

vaddr: int#
memsize: int#
filesize: int#
property only_contains_uninitialized_data#

Whether this section is initialized to zero after the executable is loaded.

class cle.backends.named_region.NamedRegion[source]#

Bases: Backend

A NamedRegion represents a region of memory that has a name and a location, but no static content.

This region also has permissions; with no memory, these obviously don’t do anything on their own, but they help inform any other code that relies on CLE (e.g., angr)

This can be used as a placeholder for memory that should exist in CLE’s view, but for which it does not need data, such as RAM or MMIO.

is_default = False#
__init__(name, start, end, is_readable=True, is_writable=True, is_executable=False, **kwargs)[source]#

Create a NamedRegion.

Parameters:
  • name – The name of the region

  • start – The start address of the region

  • end – The end address (exclusive) of the region

  • is_readable – Whether the region is readable

  • is_writable – Whether the region is writable

  • is_executable – Whether the region is executable

  • kwargs

has_memory = False#
static is_compatible(stream)[source]#

Determine quickly whether this backend can load an object from this stream

property min_addr#

This returns the lowest virtual address contained in any loaded segment of the binary.

property max_addr#

This returns the highest virtual address contained in any loaded segment of the binary.

function_name(addr)[source]#

NamedRegions don’t support function names.

contains_addr(addr)[source]#

Is addr in one of the binary’s segments/sections we have loaded? (i.e., is it mapped into memory?)

classmethod check_compatibility(spec, obj)[source]#

Performs a minimal static load of spec and returns whether it’s compatible with obj

imports: typing.Dict[str, 'Relocation']#
relocs: List[Relocation]#
child_objects: List['Backend']#
exception_handlings: List[ExceptionHandling]#
function_hints: List[FunctionHint]#
memory: Clemory#
cached_content: Optional[bytes]#
class cle.backends.externs.ExternSegment[source]#

Bases: Segment

__init__(map_size)[source]#
addr_to_offset(addr)[source]#

Convert a virtual memory address into a file offset

offset_to_addr(offset)[source]#

Convert a file offset into a virtual memory address

contains_offset(offset)[source]#

Does this region contain this offset into the file?

is_readable = True#
is_writable = True#
is_executable = True#
vaddr: int#
memsize: int#
filesize: int#
class cle.backends.externs.TOCRelocation[source]#

Bases: Relocation

property value#
class cle.backends.externs.ExternObject[source]#

Bases: Backend

__init__(loader, map_size=0, tls_size=0)[source]#
Parameters:
  • binary – The path to the binary to load

  • binary_stream – The open stream to this binary. The reference to this will be held until you call close.

  • is_main_bin – Whether this binary should be loaded as the main executable

rebase(new_base)[source]#

Rebase backend’s regions to the new base where they were mapped by the loader

make_extern(name, size=0, alignment=None, thumb=False, sym_type=SymbolType.TYPE_FUNCTION, point_to=None, libname=None) Symbol[source]#
Return type:

Symbol

get_pseudo_addr(name) int[source]#
Return type:

int

allocate(size=1, alignment=8, thumb=False, tls=False) int[source]#
Return type:

int

property max_addr#

This returns the highest virtual address contained in any loaded segment of the binary.

make_import(name, sym_type)[source]#
imports: typing.Dict[str, 'Relocation']#
relocs: List[Relocation]#
child_objects: List['Backend']#
exception_handlings: List[ExceptionHandling]#
function_hints: List[FunctionHint]#
memory: Clemory#
cached_content: Optional[bytes]#
class cle.backends.externs.KernelObject[source]#

Bases: Backend

__init__(loader, map_size=32768)[source]#
Parameters:
  • binary – The path to the binary to load

  • binary_stream – The open stream to this binary. The reference to this will be held until you call close.

  • is_main_bin – Whether this binary should be loaded as the main executable

add_name(name, addr)[source]#
property max_addr#

This returns the highest virtual address contained in any loaded segment of the binary.

imports: typing.Dict[str, 'Relocation']#
relocs: List[Relocation]#
child_objects: List['Backend']#
exception_handlings: List[ExceptionHandling]#
function_hints: List[FunctionHint]#
memory: Clemory#
cached_content: Optional[bytes]#
class cle.backends.externs.PointToPrecise[source]#

Bases: PointTo

pointto_precise = None#
relocations()[source]#

Maybe implement me: If you like, return a list of relocation objects to apply. To create new import symbols, use self.owner.make_import.

class cle.backends.externs.simdata.SimData[source]#

Bases: Symbol

A SimData class is used to provide data when there is an unresolved data import symbol.

To use it, subclass this class and implement the below attributes and methods.

Variables:
  • name – The name of the symbol to provide

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • type – The type of the symbol, usually SymbolType.TYPE_OBJECT.

Use the below register method to register SimData subclasses with CLE.

NOTE: SimData.type hides the Symbol.type instance property

name: str = NotImplemented#
type: SymbolType = NotImplemented#
libname: str = NotImplemented#
classmethod static_size(owner) int[source]#

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

Return type:

int

value() bytes[source]#

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)

Return type:

bytes

relocations() List[Relocation][source]#

Maybe implement me: If you like, return a list of relocation objects to apply. To create new import symbols, use self.owner.make_import.

Return type:

List[Relocation]

cle.backends.externs.simdata.lookup(name: str, libname) Type[SimData] | None[source]#
Return type:

Optional[Type[SimData]]

Parameters:

name (str) –

cle.backends.externs.simdata.register(simdata_cls: Type[SimData])[source]#

Register the given SimData class with CLE so it may be used during loading

Parameters:

simdata_cls (Type[SimData]) –
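
The subclass-and-register workflow can be pictured with the stand-alone sketch below. Nothing in it is CLE code: SimDataSketch, the dictionary registry, and the symbol and library names are hypothetical, and this simplified lookup ignores libname entirely.

```python
from typing import Dict, Optional, Type


class SimDataSketch:
    """Hypothetical stand-in for the SimData subclass contract."""
    name: str = NotImplemented
    libname: str = NotImplemented

    @classmethod
    def static_size(cls, owner) -> int:
        raise NotImplementedError

    def value(self) -> bytes:
        raise NotImplementedError


# Assumption: one class per symbol name, stored in a plain dict.
_registry: Dict[str, Type[SimDataSketch]] = {}


def register(simdata_cls: Type[SimDataSketch]) -> None:
    _registry[simdata_cls.name] = simdata_cls


def lookup(name: str, libname) -> Optional[Type[SimDataSketch]]:
    # libname is accepted to mirror the documented signature but is unused here.
    return _registry.get(name)


class ErrnoSketch(SimDataSketch):
    name = "errno_sketch"      # made-up symbol name
    libname = "libexample.so"  # made-up library name

    @classmethod
    def static_size(cls, owner) -> int:
        return 4

    def value(self) -> bytes:
        # Must be exactly static_size bytes long.
        return b"\x00" * 4


register(ErrnoSketch)
print(lookup("errno_sketch", "libexample.so"))
```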

class cle.backends.externs.simdata.simdata.SimData[source]#

Bases: Symbol

A SimData class is used to provide data when there is an unresolved data import symbol.

To use it, subclass this class and implement the below attributes and methods.

Variables:
  • name – The name of the symbol to provide

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • type – The type of the symbol, usually SymbolType.TYPE_OBJECT.

Use the below register method to register SimData subclasses with CLE.

NOTE: SimData.type hides the Symbol.type instance property

name: str = NotImplemented#
type: SymbolType = NotImplemented#
libname: str = NotImplemented#
classmethod static_size(owner) int[source]#

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

Return type:

int

value() bytes[source]#

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)

Return type:

bytes

relocations() List[Relocation][source]#

Maybe implement me: If you like, return a list of relocation objects to apply. To create new import symbols, use self.owner.make_import.

Return type:

List[Relocation]

owner: Backend#
cle.backends.externs.simdata.simdata.register(simdata_cls: Type[SimData])[source]#

Register the given SimData class with CLE so it may be used during loading

Parameters:

simdata_cls (Type[SimData]) –

cle.backends.externs.simdata.simdata.lookup(name: str, libname) Type[SimData] | None[source]#
Return type:

Optional[Type[SimData]]

Parameters:

name (str) –

class cle.backends.externs.simdata.common.StaticData[source]#

Bases: SimData

A simple SimData utility class to use when you have a SimData which should provide just a static set of bytes. To use, implement the following:

Variables:
  • name – The name of the symbol to provide.

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • data – The bytes to provide

type: SymbolType = 3#
data: bytes = NotImplemented#
classmethod static_size(owner)[source]#

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

value()[source]#

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)

class cle.backends.externs.simdata.common.StaticWord[source]#

Bases: SimData

A simple SimData utility class to use when you have a SimData which should provide just a static integer. To use, implement the following:

Variables:
  • name – The name of the symbol to provide.

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • word – The value to provide

  • wordsize – (optional) The size of the value in bytes; defaults to the CPU word size

type: SymbolType = 3#
word: int = NotImplemented#
wordsize: int = None#
classmethod static_size(owner)[source]#

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

value()[source]#

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)
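
Providing a static integer amounts to encoding it as a byte string of the declared size. The helper below is a hypothetical sketch of that encoding, not CLE's code. It takes the byte order as an explicit parameter, whereas a real implementation would presumably derive it from the owner's architecture.

```python
def pack_word(word: int, wordsize: int, little_endian: bool = True) -> bytes:
    """Encode word as exactly wordsize bytes (hypothetical helper)."""
    byteorder = "little" if little_endian else "big"
    # Mask so that negative or oversized values wrap to the word width.
    mask = (1 << (8 * wordsize)) - 1
    return (word & mask).to_bytes(wordsize, byteorder)


print(pack_word(0x1234, 4), pack_word(0x1234, 4, little_endian=False))
```

The result always has length wordsize, which matches the requirement that value() return as many bytes as static_size reported.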

class cle.backends.externs.simdata.common.PointTo[source]#

Bases: SimData

A simple SimData utility class to use when you have a SimData which should provide just a pointer to some other symbol. To use, implement the following:

Variables:
  • name – The name of the symbol to provide.

  • libname – The name of the library from which the symbol originally comes (currently unused).

  • pointto_name – The name of the symbol to point to

  • pointto_type – The type of the symbol to point to (usually SymbolType.TYPE_FUNCTION or SymbolType.TYPE_OBJECT)

  • addend – (optional) an integer to be added to the symbol’s address before storage

pointto_name: str = NotImplemented#
pointto_type: SymbolType = NotImplemented#
type: SymbolType = 3#
addend: int = 0#
classmethod static_size(owner)[source]#

Implement me: return the size of the symbol in bytes before it gets constructed

Parameters:

owner – The ExternObject owning the symbol-to-be. Useful to get at owner.arch.

value()[source]#

Implement me: the initial value of the bytes in memory for the symbol. Should return a bytestring of the same length as static_size returned. (owner is self.owner now)

relocations()[source]#

Maybe implement me: If you like, return a list of relocation objects to apply. To create new import symbols, use self.owner.make_import.

class cle.backends.externs.simdata.common.SimDataSimpleRelocation[source]#

Bases: Relocation

A relocation used to implement PointTo. Pretty simple.

__init__(owner, symbol, addr, addend, preresolved=False)[source]#
resolve_symbol(solist, **kwargs)[source]#
property value#