angr.knowledge_plugins.functions

class angr.knowledge_plugins.functions.Function

Bases: Serializable

A representation of a function and various information about it.

Variables:
  • meta_only (bool) – Whether this function only contains meta-information and is in read-only mode.

  • _dirty (bool) – Whether this function has been modified since last serialization.

  • evicted (bool) – Whether this function has been evicted from FunctionManager to external storage.

__init__(function_manager, addr, name=None, syscall=None, is_simprocedure=None, binary_name=None, is_plt=None, returning=None, alignment=False, calling_convention=None, prototype=None, prototype_libname=None, prototype_source=None, is_prototype_guessed=True)

Function constructor. If the optional parameters are not provided, they will be automatically determined upon the creation of a Function object.

Parameters:
  • addr (int) – The address of the function.

  • name (str) – The name of the function.

  • syscall (bool) – Whether this function is a syscall or not.

  • is_simprocedure (bool | None) – Whether this function is a SimProcedure or not.

  • binary_name (str) – Name of the binary where this function is.

  • is_plt (bool | None) – If this function is a PLT entry.

  • returning (bool) – If this function returns.

  • alignment (bool) – If this function acts as an alignment filler. Such functions usually only contain nops.

  • function_manager (FunctionManager | None)

  • calling_convention (SimCC | None)

  • prototype (SimTypeFunction | None)

  • prototype_libname (str | None)

  • prototype_source (PrototypeSource | None)

  • is_prototype_guessed (bool)

transition_graph
normalized
addr
startpoint
bp_on_stack
retaddr_on_stack
sp_delta
tags
ran_cca
meta_only: bool
evicted: bool
is_default_name
previous_names
from_signature: str | None
binary_name
property name
property project: Project | None
property returning
property calling_convention: SimCC | None
property prototype: SimTypeFunction | None
property prototype_libname
property is_prototype_guessed: bool
property prototype_source: PrototypeSource
property info: FunctionInfo
property is_plt: bool
property is_simprocedure: bool
property is_syscall: bool
property is_alignment: bool
property blocks

An iterator of all local blocks in the current function.

Returns:

angr.lifter.Block instances.

property code_nodes: dict[int, CodeNode]
property cyclomatic_complexity

The cyclomatic complexity of the function.

Cyclomatic complexity is a software metric used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program’s source code. It is computed using the formula: M = E - N + 2P, where E = the number of edges in the graph, N = the number of nodes in the graph, P = the number of connected components.

The cyclomatic complexity value is lazily computed and cached for future use. The value is None until it is computed for the first time.

Returns:

The cyclomatic complexity of the function.

Return type:

int
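As a worked instance of the formula M = E - N + 2P, the sketch below computes the metric for a small hand-written graph. It is plain Python and does not reflect how angr computes the property; the node addresses and the helper name `cyclomatic_complexity` are made up for illustration.

```python
from collections import defaultdict


def cyclomatic_complexity(nodes, edges):
    """Compute M = E - N + 2P for a directed graph given as node and edge lists."""
    # Build an undirected adjacency map so that (weakly) connected components
    # can be counted with a simple traversal.
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbors[node] - seen)

    return len(edges) - len(nodes) + 2 * components


# A hypothetical if/else diamond: entry -> (then | else) -> exit.
nodes = [0x1000, 0x1010, 0x1020, 0x1030]
edges = [(0x1000, 0x1010), (0x1000, 0x1020), (0x1010, 0x1030), (0x1020, 0x1030)]
diamond_complexity = cyclomatic_complexity(nodes, edges)  # E=4, N=4, P=1
```

For the diamond, E = 4, N = 4 and P = 1, so M = 4 - 4 + 2 = 2, which matches the two independent paths through the branch.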

property xrefs: Iterator[XRef]

An iterator of all xrefs of the current function.

Returns:

angr.knowledge_plugins.xrefs.xref.XRef instances.

property block_addrs

An iterator of all local block addresses in the current function.

Returns:

block addresses.

property block_addrs_set

Return a set of block addresses, which makes inclusion tests faster.

Returns:

A set of block addresses.

Return type:

set

get_block(addr, size=None, byte_string=None)

Get a block from the current function.

Parameters:
  • addr (int) – The address of the block.

  • size (int | None) – The size of the block. This is optional. If not provided, angr will load

  • byte_string (bytes | None)

Returns:

get_block_size(addr)
Return type:

int | None

Parameters:

addr (int)

property nodes: Iterable[CodeNode]
get_node(addr)
Return type:

BlockNode | None

property has_unresolved_jumps
property has_unresolved_calls
property operations

All of the operations that are done by this function.

property code_constants

All of the constants that are used by this function’s code.

classmethod parse_from_cmessage(cmsg, **kwargs)
Parameters:

cmsg

Return Function:

The function instantiated out of the cmsg data.

string_references(minimum_length=2)

All of the constant string references used by this function.

Parameters:

minimum_length – The minimum length of strings to find (default is 2)

Returns:

A generator yielding tuples of (address, string), where address is the location of the string in memory.

property local_runtime_values

Tries to find all runtime values of this function which do not come from inputs. These values are generated by starting from a blank state and reanalyzing the basic blocks once each. Function calls are skipped, and back edges are never taken, so these values are often unreliable. This function is good at finding simple constant addresses which the function will use or calculate.

Returns:

a set of constants

property num_arguments
property endpoints
property endpoints_with_type
property ret_sites
property jumpout_sites
property retout_sites
property callout_sites
property size
property binary

Get the object this function belongs to.

Returns:

The object this function belongs to.

property offset: int

Returns:

The function’s binary offset (i.e., non-rebased address).

property symbol: None | Symbol

Returns:

The function’s Symbol, if any.

property pseudocode: str | None

Returns:

The function’s pseudocode.

property dirty: bool
mark_dirty()
Return type:

None

add_jumpout_site(node)

Add a custom jumpout site.

Parameters:

node (CodeNode) – The address of the basic block that control flow leaves during this transition.

Returns:

None

add_retout_site(node)

Add a custom retout site.

Retout (returning to outside of the function) sites are very rare. They mostly occur during CFG recovery when we incorrectly identify the beginning of a function in the first iteration, and then correctly identify that function later in the same iteration (function alignments can lead to this bizarre case). We will mark all edges going out of the header of that function as outside edges, because all successors now belong to the incorrectly identified function. This identification error will be fixed in the second iteration of CFG recovery. However, we still want to keep track of jumpouts/retouts during the first iteration so that other logic in CFG recovery still works.

Parameters:

node (CodeNode) – The address of the basic block from which control flow leaves the current function after a call.

Returns:

None

update_func_block_count()

Update the cached block count of this function in the function manager.

Return type:

None

mark_nonreturning_calls_endpoints()

Iterate through all call edges in the transition graph. For each call to a non-returning function, mark the source basic block as an endpoint.

This method should only be executed once all functions are recovered and analyzed by CFG recovery, so we know whether each function returns or not.

Returns:

None

get_call_sites()

Gets a list of all the basic blocks that end in calls.

Return type:

Iterable[int]

Returns:

A view of the addresses of the blocks that end in calls.

get_call_target(callsite_addr)

Get the target of a call.

Parameters:

callsite_addr – The address of a basic block that ends in a call.

Returns:

The target of said call, or None if callsite_addr is not a callsite.

get_call_return(callsite_addr)

Get the hypothetical return address of a call.

Parameters:

callsite_addr – The address of the basic block that ends in a call.

Returns:

The likely return target of said call, or None if callsite_addr is not a callsite.
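The three call-site methods above can be combined into a single lookup table. The sketch below is duck-typed: `call_table` only calls `get_call_sites()`, `get_call_target()` and `get_call_return()` as documented here, while `FakeFunction` is a hypothetical stand-in so the snippet runs without a loaded binary. With angr, one would pass a real Function object instead.

```python
def call_table(func):
    """Map each call-site block address to (call target, likely return address)."""
    table = {}
    for callsite_addr in func.get_call_sites():
        table[callsite_addr] = (
            func.get_call_target(callsite_addr),
            func.get_call_return(callsite_addr),
        )
    return table


class FakeFunction:
    """Hypothetical stand-in exposing only the three methods used above."""

    def __init__(self, calls):
        self._calls = calls  # callsite_addr -> (target, return_addr)

    def get_call_sites(self):
        return self._calls.keys()

    def get_call_target(self, callsite_addr):
        entry = self._calls.get(callsite_addr)
        return entry[0] if entry else None  # None when not a call site

    def get_call_return(self, callsite_addr):
        entry = self._calls.get(callsite_addr)
        return entry[1] if entry else None  # None when not a call site


fake = FakeFunction({0x401000: (0x401200, 0x401005)})
table = call_table(fake)
```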

property graph

Get a local transition graph. A local transition graph is a transition graph that only contains nodes that belong to the current function. All edges, except for the edges going out from the current function or coming from outside the current function, are included.

The generated graph is cached in self._local_transition_graph.

Returns:

A local transition graph.

Return type:

networkx.DiGraph

graph_ex(exception_edges=True)

Get a local transition graph with a custom configuration. A local transition graph is a transition graph that only contains nodes that belong to the current function. This method allows the user to exclude certain types of edges, such as exception edges, together with the nodes that are only reachable through such edges.

The generated graph is not cached.

Parameters:

exception_edges (bool) – Whether exception edges, and the nodes that are only reachable through exception edges, should be kept.

Returns:

A local transition graph with a special configuration.

Return type:

networkx.DiGraph
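To make "edges together with the nodes that are only reachable through such edges" concrete, here is a minimal sketch in plain Python. It is not angr's implementation: the edge representation (`(src, dst, kind)` tuples) and the `"exception"` label are assumptions chosen for this example.

```python
from collections import deque


def drop_exception_edges(entry, edges):
    """Return (nodes, edges) reachable from entry without using exception edges.

    edges is a list of (src, dst, kind) tuples; kind == "exception" marks an
    exception edge. These labels are hypothetical and used only in this sketch.
    """
    successors = {}
    for src, dst, kind in edges:
        if kind != "exception":
            successors.setdefault(src, []).append((dst, kind))

    # Breadth-first walk over the remaining edges, starting at the entry node.
    reachable = {entry}
    queue = deque([entry])
    kept_edges = []
    while queue:
        node = queue.popleft()
        for dst, kind in successors.get(node, []):
            kept_edges.append((node, dst, kind))
            if dst not in reachable:
                reachable.add(dst)
                queue.append(dst)
    return reachable, kept_edges


edges = [
    ("entry", "body", "transition"),
    ("body", "exit", "transition"),
    ("body", "handler", "exception"),
    ("handler", "exit", "transition"),
]
nodes, kept = drop_exception_edges("entry", edges)
```

The `handler` node disappears because its only incoming edge is an exception edge, and the `handler -> exit` edge goes with it.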

transition_graph_ex(exception_edges=True)

Get a transition graph with a custom configuration. This method allows the user to exclude certain types of edges, such as exception edges, together with the nodes that are only reachable through such edges.

The generated graph is not cached.

Parameters:

exception_edges (bool) – Whether exception edges, and the nodes that are only reachable through exception edges, should be kept.

Returns:

A transition graph with a special configuration.

Return type:

networkx.DiGraph

subgraph(ins_addrs)

Generate a sub control flow graph of instruction addresses based on self.graph.

Parameters:

ins_addrs (iterable) – A collection of instruction addresses that should be included in the subgraph.

Return networkx.DiGraph:

A subgraph.

instruction_size(insn_addr)

Get the size of the instruction specified by insn_addr.

Parameters:

insn_addr (int) – Address of the instruction

Return int:

Size of the instruction in bytes, or None if the instruction is not found.

addr_to_instruction_addr(addr)

Obtain the address of the instruction that covers @addr.

Parameters:

addr (int) – An address.

Returns:

Address of the instruction that covers @addr, or None if this addr is not covered by any instruction of this function.

Return type:

int or None
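The "covers" relation used by addr_to_instruction_addr can be sketched with a binary search over instruction start addresses. This is an illustration of the idea only, not angr's code; the input format (a dict of start address to size) and the helper name `covering_instruction` are assumptions.

```python
import bisect


def covering_instruction(insn_sizes, addr):
    """Return the start address of the instruction that covers addr, or None.

    insn_sizes maps instruction start address -> size in bytes (hypothetical input).
    """
    starts = sorted(insn_sizes)
    # Index of the last instruction that starts at or before addr.
    idx = bisect.bisect_right(starts, addr) - 1
    if idx < 0:
        return None
    start = starts[idx]
    # addr is covered only if it falls inside [start, start + size).
    if addr < start + insn_sizes[start]:
        return start
    return None


insns = {0x1000: 1, 0x1001: 3, 0x1004: 5}
```

With these sizes, address 0x1002 falls inside the 3-byte instruction at 0x1001, while 0x1009 is one byte past the end of the last instruction and is therefore not covered.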

dbg_print()

Returns a representation of the list of basic blocks in this function.

dbg_draw(filename)

Draw the graph and save it to a PNG file.

property arguments
property has_return
property callable
normalize()

Make sure all basic blocks in the transition graph of this function do not overlap. The result is similar to the CFG that IDA Pro generates.

This method does not touch the CFG result. Call CFG{Emulated, Fast}.normalize() for that purpose.

Returns:

None

find_declaration(ignore_binary_name=False, binary_name_hint=None)

Find the most likely function declaration from the embedded collection of prototypes, set it to self.prototype, and update self.calling_convention with the declaration.

Parameters:
  • ignore_binary_name (bool) – Do not rely on the executable or library where the function belongs to determine its source library. This is useful when working on statically linked binaries (because all functions will belong to the main executable). We will search for all libraries in angr to find the first declaration match.

  • binary_name_hint (str | None) – Substring of the library name where this function might be originally coming from. Useful for FLIRT-identified functions in statically linked binaries.

Return type:

bool

Returns:

True if a declaration is found and self.prototype and self.calling_convention are updated. False if no matching function declaration is found, in which case self.prototype and self.calling_convention are left untouched.

is_rust_function()

Determines if the function name follows Rust mangling conventions.

property demangled_name
property short_name
get_unambiguous_name(display_name=None)

Get a disambiguated function name.

Parameters:

display_name (str | None) – Name to display, otherwise the function name.

Return type:

str

Returns:

The function name in one of the following forms:

  • ::<name> when the function binary is the main object.

  • ::<obj>::<name> when the function binary is not the main object.

  • ::<addr>::<name> when the function binary is an unnamed non-main object, or when multiple functions with the same name are defined in the function binary.

apply_definition(definition, calling_convention=None)
Return type:

None

Parameters:
functions_reachable()
Return type:

set[Function]

Returns:

The set of all functions that can be reached from the function represented by self.

holes(min_size=8)

Find the number of non-consecutive areas in the function that are at least min_size bytes large.

Return type:

int

Parameters:

min_size (int)
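One plausible reading of holes() is "count the gaps between consecutive blocks that are at least min_size bytes wide". The sketch below implements that reading in plain Python; it is an assumption about the semantics, not a copy of angr's logic, and the `(addr, size)` input format is hypothetical.

```python
def count_holes(blocks, min_size=8):
    """Count gaps of at least min_size bytes between consecutive blocks.

    blocks is an iterable of (addr, size) pairs (hypothetical input format).
    """
    holes = 0
    covered_end = None  # first address past everything seen so far
    for addr, size in sorted(blocks):
        if covered_end is not None and addr - covered_end >= min_size:
            holes += 1
        block_end = addr + size
        covered_end = block_end if covered_end is None else max(covered_end, block_end)
    return holes


# Gaps: none between the first two blocks, 0x18 bytes before 0x1030, 2 bytes before 0x1036.
blocks = [(0x1000, 0x10), (0x1010, 0x8), (0x1030, 0x4), (0x1036, 0x2)]
```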

copy()
pp(**kwargs)

Pretty-print the function disassembly.

class angr.knowledge_plugins.functions.FunctionManager

Bases: KnowledgeBasePlugin, Mapping[K, Function], Generic

When cache_limit is set, the FunctionManager uses a SpillingFunctionDict that implements an LRU cache keeping only the most recently accessed N functions in memory, spilling others to an LMDB database on disk. This allows working with binaries that have more functions than can fit in memory.

Parameters:

cache_limit (int | None) – Maximum number of functions to keep in memory. None means unlimited (no eviction). Default is None.
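The caching behavior described above can be sketched as an LRU dictionary with a spill area. This is a simplified illustration, not the SpillingFunctionDict implementation: the class name `SpillingDict` is hypothetical, and an ordinary dict stands in for the LMDB database.

```python
from collections import OrderedDict


class SpillingDict:
    """Keep at most cache_limit entries in memory; spill the rest to a backing store."""

    def __init__(self, cache_limit=None):
        self.cache_limit = cache_limit  # None means unlimited (no eviction)
        self._memory = OrderedDict()    # most recently used entries sit at the end
        self._spilled = {}              # stand-in for an on-disk database

    def __setitem__(self, key, value):
        self._spilled.pop(key, None)
        self._memory[key] = value
        self._memory.move_to_end(key)
        self._evict()

    def __getitem__(self, key):
        if key in self._memory:
            self._memory.move_to_end(key)
            return self._memory[key]
        value = self._spilled.pop(key)  # raises KeyError for unknown keys
        self._memory[key] = value       # reload into memory as most recently used
        self._evict()
        return value

    def _evict(self):
        if self.cache_limit is None:
            return
        while len(self._memory) > self.cache_limit:
            old_key, old_value = self._memory.popitem(last=False)  # least recently used
            self._spilled[old_key] = old_value

    @property
    def cached_count(self):
        return len(self._memory)

    @property
    def spilled_count(self):
        return len(self._spilled)


cache = SpillingDict(cache_limit=2)
for addr in (0x1000, 0x2000, 0x3000):
    cache[addr] = f"func_{addr:x}"
```

After the loop, the two most recently stored entries are in memory and the oldest one has been spilled. Reading the spilled entry brings it back and pushes out whichever entry was used least recently.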

__init__(kb, cache_limit=None)
Parameters:
copy()
clear()
get_default_cache_limit(max_limit=5000)

Get the default function cache limit based on the size of the binary.

Return type:

int | None

Returns:

The default cache limit; None means unlimited.

Parameters:

max_limit (int)

is_plt_cached(addr)
Return type:

bool

Parameters:

addr (int)

get_binary_name_cached(addr)
Return type:

str | None

Parameters:

addr (int)

function_name_changed(addr, old_name, new_name)

Notify the FunctionManager that a function’s name has changed.

Parameters:
  • addr (TypeVar(K, int, SootMethodDescriptor)) – Address of the function.

  • old_name (str | None) – Old name of the function, or None if there is no old name.

  • new_name (str) – New name of the function.

Return type:

None

get_by_addr(addr, meta_only=False)
Return type:

Function

Parameters:

meta_only (bool)

get_by_name(name, check_previous_names=False)
Return type:

Generator[Function]

Parameters:
  • name (str)

  • check_previous_names (bool)

get_addrs_by_name(name, check_previous_names=False)
Return type:

set[int]

Parameters:
  • name (str)

  • check_previous_names (bool)

contains_addr(addr)

Decide if an address is handled by the function manager.

Note: this function does not conform to Python programming idioms, but it is needed for performance reasons.

Parameters:

addr (int) – Address of the function.

ceiling_addr(addr)

Return the least function address that is greater than or equal to addr.

Parameters:

addr (TypeVar(K, int, SootMethodDescriptor)) – The address to query.

Return type:

Optional[TypeVar(K, int, SootMethodDescriptor)]

Returns:

An address, or None if there is no function at or after addr.

ceiling_func(addr)

Return the function that has the least address that is greater than or equal to addr.

Parameters:

addr (int) – The address to query.

Return type:

Function | None

Returns:

A Function instance, or None if there is no other function after addr.

floor_addr(addr)

Return the greatest function address that is less than or equal to addr.

Parameters:

addr (TypeVar(K, int, SootMethodDescriptor)) – The address to query.

Return type:

Optional[TypeVar(K, int, SootMethodDescriptor)]

Returns:

An address, or None if there is no other function before addr.

floor_func(addr)

Return the function that has the greatest address that is less than or equal to addr.

Parameters:

addr (int) – The address to query.

Returns:

A Function instance, or None if there is no other function before addr.

Return type:

Function or None
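The floor and ceiling lookups amount to a binary search over sorted function addresses. The sketch below shows the idea with the standard `bisect` module; it assumes the addresses are available as a sorted list and is not taken from angr's source. The free functions here are illustrative stand-ins for the methods of the same name.

```python
import bisect


def floor_addr(sorted_addrs, addr):
    """Greatest address in sorted_addrs that is <= addr, or None."""
    idx = bisect.bisect_right(sorted_addrs, addr)
    return sorted_addrs[idx - 1] if idx > 0 else None


def ceiling_addr(sorted_addrs, addr):
    """Least address in sorted_addrs that is >= addr, or None."""
    idx = bisect.bisect_left(sorted_addrs, addr)
    return sorted_addrs[idx] if idx < len(sorted_addrs) else None


# Hypothetical function start addresses, already sorted.
func_addrs = [0x401000, 0x401200, 0x401800]
```

An address in the middle of a function maps to that function's start through `floor_addr`, and to the next function's start through `ceiling_addr`. An exact match is returned by both.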

query(query, check_previous_names=False)

Query for a function using selectors to disambiguate. Supported variations:

  • ::<name> – Function <name> in the main object.

  • ::<addr>::<name> – Function <name> at <addr>.

  • ::<obj>::<name> – Function <name> in <obj>.

Return type:

Function | None

Parameters:
  • query (str)

  • check_previous_names (bool)
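To illustrate how the three selector forms differ, the following sketch splits a query string into its parts. It is a guess at a reasonable parsing strategy, not angr's parser: it assumes an address is written as an integer literal such as 0x401000, and it does not handle names that themselves contain `::`. The helper name `parse_query` is hypothetical.

```python
def parse_query(query):
    """Split a selector string into (addr, obj, name); unused parts are None."""
    if not query.startswith("::"):
        raise ValueError("selector must start with '::'")
    parts = query[2:].split("::")
    if len(parts) == 1:
        # ::<name>
        return None, None, parts[0]
    if len(parts) == 2:
        qualifier, name = parts
        try:
            # Assumption: an address is an integer literal, e.g. 0x401000.
            return int(qualifier, 0), None, name
        except ValueError:
            # Otherwise treat the qualifier as an object name.
            return None, qualifier, name
    raise ValueError(f"unsupported selector: {query!r}")
```

For example, `parse_query("::libc.so.6::puts")` yields an object name and a function name, while `parse_query("::0x401000::main")` yields an integer address and a function name.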

function(addr=None, name=None, check_previous_names=False, create=False, syscall=False, plt=None)

Get a function object from the function manager.

Pass either addr or name with the appropriate values.

Parameters:
  • addr (Optional[TypeVar(K, int, SootMethodDescriptor)]) – Address of the function.

  • name (str | None) – Name of the function.

  • create (bool) – Whether to create the function or not if the function does not exist.

  • syscall (bool) – True to create the function as a syscall, False otherwise.

  • plt (bool | None) – True to find the PLT stub, False to find a non-PLT stub, None to disable this restriction.

  • check_previous_names (bool)

Returns:

The Function instance, or None if the function is not found and create is False.

Return type:

Function | None

dbg_draw(prefix='dbg_function_')
rebuild_callgraph()
set_function_returning(addr, v)
Return type:

None

Parameters:
  • addr (K)

  • v (bool | None)

nonreturning_func_addrs()

Yield all non-returning function addresses.

Return type:

Generator[int]

unknown_returning_func_addrs()

Yield all function addresses with unknown returning status.

Return type:

Generator[int]

is_func_nonreturning(addr)

Check if a function is non-returning.

Parameters:

addr (TypeVar(K, int, SootMethodDescriptor)) – Address of the function.

Return type:

bool

Returns:

True if non-returning, False if returning or unknown.

is_func_returning_unknown(addr)

Check if a function’s returning status is unknown.

Parameters:

addr (TypeVar(K, int, SootMethodDescriptor)) – Address of the function.

Return type:

bool

Returns:

True if returning status is unknown, False otherwise.
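The returning status is effectively three-valued: returning, non-returning, or unknown. The sketch below models it with True, False and None, mirroring the `bool | None` type accepted by set_function_returning. That None encodes "unknown" is an assumption of this example, and the helper names are hypothetical.

```python
def is_nonreturning(returning):
    """True only when the function is known not to return."""
    return returning is False


def is_returning_unknown(returning):
    """True only when the returning status has not been determined."""
    return returning is None


# Hypothetical status table: address -> True (returns), False (does not), None (unknown).
status = {0x401000: True, 0x401200: False, 0x401800: None}
nonreturning = sorted(a for a, r in status.items() if is_nonreturning(r))
unknown = sorted(a for a, r in status.items() if is_returning_unknown(r))
```

Using identity checks (`is False`, `is None`) keeps the unknown case from being mistaken for non-returning, which a plain truthiness test would do.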

get_func_block_count(addr)

Get the number of blocks in a function.

Parameters:

addr (TypeVar(K, int, SootMethodDescriptor)) – Address of the function.

Return type:

int | None

Returns:

Number of blocks, or None if unknown.

set_func_block_count(addr, count)

Set the number of blocks in a function.

Parameters:
Return type:

None

Returns:

None

get_key_func_addrs(func_type)
Return type:

set[TypeVar(K, int, SootMethodDescriptor)]

Parameters:

func_type (str)

add_key_func_addr(func_type, addr)
Return type:

None

Parameters:
  • func_type (str)

  • addr (K)

property cache_limit: int | None

Get the maximum number of functions to keep in memory. None means unlimited (no eviction).

property cached_function_count: int

Return the number of functions currently in memory.

property spilled_function_count: int

Return the number of functions currently spilled to LMDB.

property total_function_count: int

Return the total number of functions (in memory + spilled).

Submodules