angr.knowledge_plugins.functions
- class angr.knowledge_plugins.functions.Function
Bases:
Serializable
A representation of a function and various information about it.
- Variables:
- __init__(function_manager, addr, name=None, syscall=None, is_simprocedure=None, binary_name=None, is_plt=None, returning=None, alignment=False, calling_convention=None, prototype=None, prototype_libname=None, prototype_source=None, is_prototype_guessed=True)
Function constructor. If the optional parameters are not provided, they will be automatically determined upon the creation of a Function object.
- Parameters:
addr (int) – The address of the function.
name (str) – The name of the function.
syscall (bool) – Whether this function is a syscall or not.
is_simprocedure (bool | None) – Whether this function is a SimProcedure or not.
binary_name (str) – Name of the binary where this function is.
returning (bool) – If this function returns.
alignment (bool) – If this function acts as an alignment filler. Such functions usually only contain nops.
function_manager (FunctionManager | None)
calling_convention (SimCC | None)
prototype (SimTypeFunction | None)
prototype_libname (str | None)
prototype_source (PrototypeSource | None)
is_prototype_guessed (bool)
- transition_graph
- normalized
- addr
- startpoint
- bp_on_stack
- retaddr_on_stack
- sp_delta
- tags
- ran_cca
- meta_only: bool
- evicted: bool
- is_default_name
- previous_names
- binary_name
- property name
- property returning
- property prototype: SimTypeFunction | None
- property prototype_libname
- property is_prototype_guessed: bool
- property prototype_source: PrototypeSource
- property info: FunctionInfo
- property is_plt: bool
- property is_simprocedure: bool
- property is_syscall: bool
- property is_alignment: bool
- property blocks
An iterator of all local blocks in the current function.
- Returns:
angr.lifter.Block instances.
- property cyclomatic_complexity
The cyclomatic complexity of the function.
Cyclomatic complexity is a software metric used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program’s source code. It is computed using the formula: M = E - N + 2P, where E = the number of edges in the graph, N = the number of nodes in the graph, P = the number of connected components.
The cyclomatic complexity value is lazily computed and cached for future use. It is None until it is computed for the first time.
- Returns:
The cyclomatic complexity of the function.
- Return type:
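The formula M = E - N + 2P can be checked by hand on a small graph. The following is a minimal standalone sketch that does not use angr; the helper name cyclomatic_complexity and the plain edge-list input are illustrative assumptions, not the property's actual implementation.

```python
def cyclomatic_complexity(nodes, edges):
    """Return M = E - N + 2P for a graph given as a node list and an edge list."""
    # Union-find over the nodes to count (weakly) connected components, P.
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for src, dst in edges:
        parent[find(src)] = find(dst)

    components = len({find(n) for n in nodes})
    return len(edges) - len(nodes) + 2 * components


# One if/else diamond: entry -> (then | else) -> exit
blocks = [0x1000, 0x1010, 0x1020, 0x1030]
transitions = [
    (0x1000, 0x1010),
    (0x1000, 0x1020),
    (0x1010, 0x1030),
    (0x1020, 0x1030),
]
complexity = cyclomatic_complexity(blocks, transitions)  # E=4, N=4, P=1
```

A straight-line function with no branches has a complexity of 1 under the same formula; each additional independent branch adds 1.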
- property xrefs: Iterator[XRef]
An iterator of all xrefs of the current function.
- Returns:
angr.knowledge_plugins.xrefs.xref.XRef instances.
- property block_addrs
An iterator of all local block addresses in the current function.
- Returns:
block addresses.
- property block_addrs_set
Return a set of block addresses for better performance of inclusion tests.
- Returns:
A set of block addresses.
- Return type:
- get_block(addr, size=None, byte_string=None)
Get a block from the current function.
- property has_unresolved_jumps
- property has_unresolved_calls
- property operations
All of the operations that are done by this function.
- property code_constants
All of the constants that are used by this function’s code.
- classmethod parse_from_cmessage(cmsg, **kwargs)
- Parameters:
cmsg
- Return Function:
The function instantiated out of the cmsg data.
- string_references(minimum_length=2)
All of the constant string references used by this function.
- Parameters:
minimum_length – The minimum length of strings to find (default is 2)
- Returns:
A generator yielding tuples of (address, string) where address is the location of the string in memory.
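The shape of the result, a generator of (address, string) tuples filtered by minimum_length, can be illustrated without angr. This is a hedged sketch: filter_string_references and the candidate list are hypothetical, and the real method discovers the candidates itself from the function's code.

```python
def filter_string_references(candidates, minimum_length=2):
    """Yield (address, string) pairs whose string has at least minimum_length characters."""
    for address, string in candidates:
        if len(string) >= minimum_length:
            yield address, string


# Hypothetical candidates: (address of the string in memory, decoded string)
found = [(0x4000, "a"), (0x4010, "OK"), (0x4020, "usage: %s")]
refs = list(filter_string_references(found))
```

Because the result is a generator, it has to be consumed (for example with list()) before it can be indexed or measured.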
- property local_runtime_values
Tries to find all runtime values of this function which do not come from inputs. These values are generated by starting from a blank state and reanalyzing the basic blocks once each. Function calls are skipped and back edges are never taken, so these values are often unreliable. This property is good at finding simple constant addresses that the function will use or calculate.
- Returns:
a set of constants
- property num_arguments
- property endpoints
- property endpoints_with_type
- property ret_sites
- property jumpout_sites
- property retout_sites
- property callout_sites
- property size
- property binary
Get the object this function belongs to.
- Returns:
The object this function belongs to.
- property offset: int
- Returns:
The function’s binary offset (i.e., non-rebased address).
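As a worked example of a non-rebased address, the sketch below assumes the offset is the function address minus the base address at which the containing object is mapped. Both that definition and the helper name binary_offset are assumptions made for illustration.

```python
def binary_offset(func_addr, mapped_base):
    """Address of the function relative to the object's mapped base (assumed definition)."""
    return func_addr - mapped_base


# Hypothetical values: an object mapped at 0x400000 with a function at 0x401136
offset = binary_offset(0x401136, 0x400000)
```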
- property dirty: bool
- mark_dirty()
- Return type:
- add_jumpout_site(node)
Add a custom jumpout site.
- Parameters:
node (CodeNode) – The address of the basic block that control flow leaves during this transition.
- Returns:
None
- add_retout_site(node)
Add a custom retout site.
Retout (returning to outside of the function) sites are very rare. They mostly occur during CFG recovery when we incorrectly identify the beginning of a function in the first iteration, and then correctly identify that function later in the same iteration (function alignments can lead to this bizarre case). We will mark all edges going out of the header of that function as outside edges, because all successors now belong to the incorrectly-identified function. This identification error will be fixed in the second iteration of CFG recovery. However, we still want to keep track of jumpouts/retouts during the first iteration so other logic in CFG recovery still works.
- Parameters:
node (CodeNode) – The address of the basic block that control flow leaves the current function after a call.
- Returns:
None
- update_func_block_count()
Update the cached block count of this function in the function manager.
- Return type:
- mark_nonreturning_calls_endpoints()
Iterate through all call edges in the transition graph. For each call to a non-returning function, mark the source basic block as an endpoint.
This method should only be executed once all functions are recovered and analyzed by CFG recovery, so we know whether each function returns or not.
- Returns:
None
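The idea can be sketched on plain data: walk the call edges and collect the source block of every call whose callee is known not to return. The function name, the (source, callee) edge tuples, and the returning dictionary below are illustrative assumptions, not angr's internal representation.

```python
def nonreturning_call_endpoints(call_edges, returning):
    """
    call_edges: iterable of (source_block_addr, callee_addr) pairs.
    returning:  dict mapping callee_addr -> True, False, or None (unknown).
    Returns the set of source blocks that call a function known not to return.
    """
    endpoints = set()
    for src, callee in call_edges:
        # Only an explicit False counts; unknown (None or missing) is left alone.
        if returning.get(callee) is False:
            endpoints.add(src)
    return endpoints


call_edges = [(0x1000, 0x2000), (0x1020, 0x3000), (0x1040, 0x4000)]
returning = {0x2000: True, 0x3000: False}  # 0x4000 has unknown status
endpoints = nonreturning_call_endpoints(call_edges, returning)
```

This also shows why the method should run only after CFG recovery has finished: a callee whose status is still unknown contributes no endpoint.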
- get_call_sites()
Gets a list of all the basic blocks that end in calls.
- get_call_target(callsite_addr)
Get the target of a call.
- Parameters:
callsite_addr – The address of a basic block that ends in a call.
- Returns:
The target of said call, or None if callsite_addr is not a callsite.
- get_call_return(callsite_addr)
Get the hypothetical return address of a call.
- Parameters:
callsite_addr – The address of the basic block that ends in a call.
- Returns:
The likely return target of said call, or None if callsite_addr is not a callsite.
- property graph
Get a local transition graph. A local transition graph is a transition graph that only contains nodes that belong to the current function. All edges, except for the edges going out from the current function or coming from outside the current function, are included.
The generated graph is cached in self._local_transition_graph.
- Returns:
A local transition graph.
- Return type:
networkx.DiGraph
- graph_ex(exception_edges=True)
Get a local transition graph with a custom configuration. A local transition graph is a transition graph that only contains nodes that belong to the current function. This method allows the user to exclude certain types of edges, such as exception edges, together with the nodes that are only reachable through such edges.
The generated graph is not cached.
- Parameters:
exception_edges (bool) – Should exception edges and the nodes that are only reachable through exception edges be kept.
- Returns:
A local transition graph with a special configuration.
- Return type:
networkx.DiGraph
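Excluding exception edges together with the nodes that are only reachable through them amounts to a reachability pass that ignores those edges. Below is a minimal sketch over a plain edge list; the function name, the "exception" and "transition" labels, and the assumption that reachability is measured from the function entry are all illustrative.

```python
def drop_exception_edges(entry, edges):
    """
    edges: list of (src, dst, kind) with kind "transition" or "exception".
    Returns (nodes, edges) that remain reachable from entry without
    following any exception edge.
    """
    successors = {}
    for src, dst, kind in edges:
        if kind != "exception":
            successors.setdefault(src, []).append(dst)

    reachable = {entry}
    worklist = [entry]
    while worklist:
        node = worklist.pop()
        for nxt in successors.get(node, []):
            if nxt not in reachable:
                reachable.add(nxt)
                worklist.append(nxt)

    kept = [
        (s, d, k)
        for s, d, k in edges
        if k != "exception" and s in reachable and d in reachable
    ]
    return reachable, kept


edges = [
    (1, 2, "transition"),
    (1, 3, "exception"),   # node 3 is an exception handler
    (3, 4, "transition"),
    (2, 4, "transition"),
]
nodes, kept_edges = drop_exception_edges(1, edges)
```

Node 3 disappears because its only incoming edge is an exception edge, while node 4 stays because it is still reachable through node 2.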
- transition_graph_ex(exception_edges=True)
Get a transition graph with a custom configuration. This method allows the user to exclude certain types of edges, such as exception edges, together with the nodes that are only reachable through such edges.
The generated graph is not cached.
- Parameters:
exception_edges (bool) – Should exception edges and the nodes that are only reachable through exception edges be kept.
- Returns:
A transition graph with a special configuration.
- Return type:
networkx.DiGraph
- subgraph(ins_addrs)
Generate a sub control flow graph of instruction addresses based on self.graph.
- Parameters:
ins_addrs (iterable) – A collection of instruction addresses that should be included in the subgraph.
- Return networkx.DiGraph:
A subgraph.
- instruction_size(insn_addr)
Get the size of the instruction specified by insn_addr.
- Parameters:
insn_addr (int) – Address of the instruction
- Return int:
Size of the instruction in bytes, or None if the instruction is not found.
- addr_to_instruction_addr(addr)
Obtain the address of the instruction that covers addr.
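Finding the instruction that covers an arbitrary address is a floor lookup followed by a range check. The sketch below uses the standard-library bisect module on hypothetical data; covering_instruction and its inputs are illustrative and say nothing about how angr stores instruction addresses.

```python
from bisect import bisect_right


def covering_instruction(insn_addrs, insn_sizes, addr):
    """
    insn_addrs: sorted list of instruction start addresses.
    insn_sizes: dict mapping start address -> size in bytes.
    Returns the start address of the instruction whose bytes include addr, or None.
    """
    idx = bisect_right(insn_addrs, addr) - 1  # last instruction starting at or before addr
    if idx < 0:
        return None
    start = insn_addrs[idx]
    if addr < start + insn_sizes[start]:
        return start
    return None


insn_addrs = [0x1000, 0x1001, 0x1004, 0x1009]
insn_sizes = {0x1000: 1, 0x1001: 3, 0x1004: 5, 0x1009: 1}
covering = covering_instruction(insn_addrs, insn_sizes, 0x1006)
```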
- dbg_print()
Returns a representation of the list of basic blocks in this function.
- dbg_draw(filename)
Draw the graph and save it to a PNG file.
- property arguments
- property has_return
- property callable
- normalize()
Make sure all basic blocks in the transition graph of this function do not overlap. You will end up with a CFG similar to the one IDA Pro generates.
This method does not touch the CFG result. You may call CFG{Emulated, Fast}.normalize() for that purpose.
- Returns:
None
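The core of removing overlaps can be sketched as truncating any block that runs past the start of the next one. This is a simplified illustration under stated assumptions: blocks are given as a hypothetical address-to-size mapping, and the edge rewiring that a real normalization pass also needs is omitted.

```python
def split_overlapping_blocks(blocks):
    """
    blocks: dict mapping block start address -> size in bytes.
    Returns a new dict in which no block extends past the start of the next block.
    """
    normalized = {}
    starts = sorted(blocks)
    for i, start in enumerate(starts):
        end = start + blocks[start]
        if i + 1 < len(starts) and starts[i + 1] < end:
            end = starts[i + 1]  # truncate at the next block's start
        normalized[start] = end - start
    return normalized


# A jump into the middle of the block at 0x1000 created a second block at 0x1010.
blocks = {0x1000: 0x20, 0x1010: 0x10, 0x1020: 0x8}
normalized = split_overlapping_blocks(blocks)
```

After truncation, the first block ends where the second begins, so each byte of code belongs to exactly one block.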
- find_declaration(ignore_binary_name=False, binary_name_hint=None)
Find the most likely function declaration from the embedded collection of prototypes, set it to self.prototype, and update self.calling_convention with the declaration.
- Parameters:
ignore_binary_name (bool) – Do not rely on the executable or library where the function belongs to determine its source library. This is useful when working on statically linked binaries (because all functions will belong to the main executable). We will search for all libraries in angr to find the first declaration match.
binary_name_hint (str | None) – Substring of the library name where this function might be originally coming from. Useful for FLIRT-identified functions in statically linked binaries.
- Return type:
- Returns:
True if a declaration is found and self.prototype and self.calling_convention are updated. False if we fail to find a matching function declaration, in which case self.prototype or self.calling_convention will be kept untouched.
- is_rust_function()
Determines if the function name follows Rust mangling conventions.
- property demangled_name
- property short_name
- get_unambiguous_name(display_name=None)
Get a disambiguated function name.
- Parameters:
display_name (str | None) – Name to display, otherwise the function name.
- Return type:
- Returns:
The function name in one of the following forms:
::<name> when the function binary is the main object.
::<obj>::<name> when the function binary is not the main object.
::<addr>::<name> when the function binary is an unnamed non-main object, or when multiple functions with the same name are defined in the function binary.
- apply_definition(definition, calling_convention=None)
- functions_reachable()
- holes(min_size=8)
Find the number of non-consecutive areas in the function that are at least min_size bytes large.
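Counting holes comes down to sorting the blocks and measuring the gap between the end of one block and the start of the next. The following sketch works on hypothetical (start, size) pairs; count_holes is an illustrative name and the real method may treat edge cases differently.

```python
def count_holes(blocks, min_size=8):
    """
    blocks: iterable of (start_address, size) pairs for a function's basic blocks.
    Counts gaps between consecutive blocks that are at least min_size bytes long.
    """
    holes = 0
    prev_end = None
    for start, size in sorted(blocks):
        if prev_end is not None and start - prev_end >= min_size:
            holes += 1
        end = start + size
        prev_end = end if prev_end is None else max(prev_end, end)
    return holes


# Gap of 0x10 bytes after 0x1020, gap of 4 bytes after 0x1038
blocks = [(0x1000, 0x10), (0x1010, 0x10), (0x1030, 0x8), (0x103c, 0x4)]
hole_count = count_holes(blocks)
```

With the default threshold only the 16-byte gap counts; lowering min_size to 4 would count both gaps.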
- copy()
- pp(**kwargs)
Pretty-print the function disassembly.
- class angr.knowledge_plugins.functions.FunctionManager
Bases:
KnowledgeBasePlugin, Mapping[K,Function], Generic
When cache_limit is set, the FunctionManager uses a SpillingFunctionDict that implements an LRU cache keeping only the most recently accessed N functions in memory, spilling others to an LMDB database on disk. This allows working with binaries that have more functions than can fit in memory.
- Parameters:
cache_limit (int | None) – Maximum number of functions to keep in memory. None means unlimited (no eviction). Default is None.
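The LRU-with-spilling behavior described above can be illustrated with a small standalone class. This is a sketch under stated assumptions, not the SpillingFunctionDict implementation: SpillingDictSketch is a hypothetical name, and a plain dict stands in for the on-disk LMDB database.

```python
from collections import OrderedDict


class SpillingDictSketch:
    """Keep at most cache_limit entries in memory; move the least recently used elsewhere."""

    def __init__(self, cache_limit=None):
        self.cache_limit = cache_limit
        self._memory = OrderedDict()  # most recently used entries sit at the end
        self._spilled = {}            # stand-in for the on-disk store

    def __getitem__(self, key):
        if key in self._memory:
            self._memory.move_to_end(key)
            return self._memory[key]
        value = self._spilled.pop(key)  # raises KeyError if the key is unknown
        self._store(key, value)         # loading counts as a recent access
        return value

    def __setitem__(self, key, value):
        self._spilled.pop(key, None)
        self._store(key, value)

    def _store(self, key, value):
        self._memory[key] = value
        self._memory.move_to_end(key)
        # None means unlimited, so nothing is ever evicted.
        while self.cache_limit is not None and len(self._memory) > self.cache_limit:
            old_key, old_value = self._memory.popitem(last=False)
            self._spilled[old_key] = old_value

    def __len__(self):
        return len(self._memory) + len(self._spilled)


cache = SpillingDictSketch(cache_limit=2)
cache[0x1000] = "main"
cache[0x2000] = "helper"
cache[0x3000] = "exit"  # evicts 0x1000, the least recently used entry
```

Reading a spilled key brings it back into memory and pushes out whichever entry is now the least recently used, so the total count stays constant while the in-memory count never exceeds the limit.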
- __init__(kb, cache_limit=None)
- Parameters:
kb (KnowledgeBase)
cache_limit (int | None)
- copy()
- clear()
- get_default_cache_limit(max_limit=5000)
Get the default function cache limit based on the size of the binary.
- function_name_changed(addr, old_name, new_name)
Notify the FunctionManager that a function’s name has changed.
- get_by_name(name, check_previous_names=False)
- get_addrs_by_name(name, check_previous_names=False)
- contains_addr(addr)
Decide if an address is handled by the function manager.
Note: this function is non-conformant with Python programming idioms, but it is needed for performance reasons.
- Parameters:
addr (int) – Address of the function.
- ceiling_addr(addr)
Return the function that has the least address that is greater than or equal to addr.
- Parameters:
addr (TypeVar(K,int,SootMethodDescriptor)) – The address to query.
- Return type:
- Returns:
A Function instance, or None if there is no other function after addr.
- ceiling_func(addr)
Return the function that has the least address that is greater than or equal to addr.
- floor_addr(addr)
Return the function that has the greatest address that is less than or equal to addr.
- Parameters:
addr (TypeVar(K,int,SootMethodDescriptor)) – The address to query.
- Return type:
- Returns:
An address, or None if there is no other function before addr.
- floor_func(addr)
Return the function that has the greatest address that is less than or equal to addr.
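Floor and ceiling lookups over a sorted list of function addresses can be sketched with the standard-library bisect module. The helper names below mirror the methods above only for readability; the sorted-list representation is an assumption and not a description of the manager's internal data structure.

```python
from bisect import bisect_left, bisect_right


def floor_addr(sorted_addrs, addr):
    """Greatest address in sorted_addrs that is <= addr, or None."""
    idx = bisect_right(sorted_addrs, addr) - 1
    return sorted_addrs[idx] if idx >= 0 else None


def ceiling_addr(sorted_addrs, addr):
    """Least address in sorted_addrs that is >= addr, or None."""
    idx = bisect_left(sorted_addrs, addr)
    return sorted_addrs[idx] if idx < len(sorted_addrs) else None


# Hypothetical function start addresses
func_addrs = [0x1000, 0x1400, 0x2000]
below = floor_addr(func_addrs, 0x13FF)
above = ceiling_addr(func_addrs, 0x1401)
```

An address that is itself a function start is returned by both lookups, which matches the "less than or equal to" and "greater than or equal to" wording.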
- query(query, check_previous_names=False)
Query for a function using selectors to disambiguate. Supported variations:
::<name> Function <name> in the main object
::<addr>::<name> Function <name> at <addr>
::<obj>::<name> Function <name> in <obj>
- function(addr=None, name=None, check_previous_names=False, create=False, syscall=False, plt=None)
Get a function object from the function manager.
Pass either addr or name with the appropriate values.
- Parameters:
addr (Optional[TypeVar(K,int,SootMethodDescriptor)]) – Address of the function.
create (bool) – Whether to create the function or not if the function does not exist.
syscall (bool) – True to create the function as a syscall, False otherwise.
plt (bool | None) – True to find the PLT stub, False to find a non-PLT stub, None to disable this restriction.
check_previous_names (bool)
- Returns:
The Function instance, or None if the function is not found and create is False.
- Return type:
- dbg_draw(prefix='dbg_function_')
- rebuild_callgraph()
- unknown_returning_func_addrs()
Yield all function addresses with unknown returning status.
- is_func_nonreturning(addr)
Check if a function is non-returning.
- Parameters:
addr (TypeVar(K,int,SootMethodDescriptor)) – Address of the function.
- Return type:
- Returns:
True if non-returning, False if returning or unknown.
- is_func_returning_unknown(addr)
Check if a function’s returning status is unknown.
- Parameters:
addr (TypeVar(K,int,SootMethodDescriptor)) – Address of the function.
- Return type:
- Returns:
True if returning status is unknown, False otherwise.
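The two checks above distinguish three states: returning, non-returning, and unknown. A minimal sketch, assuming the status is stored as True, False, or None per function address (the dictionary and helper names are hypothetical):

```python
# Returning status per function address: True, False, or None (unknown)
returning_status = {0x1000: True, 0x2000: False, 0x3000: None}


def is_nonreturning(status, addr):
    """True only when the function is known not to return."""
    return status.get(addr) is False


def is_returning_unknown(status, addr):
    """True when no returning status has been determined (None or missing)."""
    return status.get(addr) is None


known_nonreturning = [a for a in returning_status if is_nonreturning(returning_status, a)]
still_unknown = [a for a in returning_status if is_returning_unknown(returning_status, a)]
```

Testing with `is False` rather than truthiness matters here: `not None` is also True, so a plain `not status` check would wrongly report unknown functions as non-returning.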
- get_func_block_count(addr)
Get the number of blocks in a function.
- Parameters:
addr (TypeVar(K,int,SootMethodDescriptor)) – Address of the function.
- Return type:
- Returns:
Number of blocks, or None if unknown.
- set_func_block_count(addr, count)
Set the number of blocks in a function.
- Parameters:
addr (TypeVar(K,int,SootMethodDescriptor)) – Address of the function.
count (int) – Number of blocks.
- Return type:
- Returns:
None
- get_key_func_addrs(func_type)
- Return type:
set[TypeVar(K,int,SootMethodDescriptor)]
- Parameters:
func_type (str)
- property cache_limit: int | None
Get the maximum number of functions to keep in memory. None means unlimited (no eviction).
- property cached_function_count: int
Return the number of functions currently in memory.
- property spilled_function_count: int
Return the number of functions currently spilled to LMDB.
- property total_function_count: int
Return the total number of functions (in memory + spilled).
Submodules