angr.knowledge_plugins.cfg.cfg_model

class angr.knowledge_plugins.cfg.cfg_model.CFGModel

Bases: Serializable

This class describes a Control Flow Graph for a specific range of code.

__init__(ident, cfg_manager=None, is_arm=False, cache_limit=None, db_batch_size=800, edge_cache_limit=None, edge_db_batch_size=800, addr_type='int')
Parameters:
  • cache_limit (int | None)

  • db_batch_size (int)

  • edge_cache_limit (int | None)

  • edge_db_batch_size (int)

  • addr_type (Literal['int', 'block_id', 'soot'])

ident
is_arm
graph: SpillingCFG
jump_tables: dict[int, IndirectJump]
memory_data: SortedDict[int, MemoryData]
insn_addr_to_memory_data: dict[int, MemoryData]
normalized
edges_to_repair
property addr_type: Literal['int', 'block_id', 'soot']
property project
property node_addrs: SortedList[int]
nodes_by_addr(addr)
Return type:

Iterator[CFGNode]

Parameters:

addr (int)

has_node_addr(addr)
Return type:

bool

Parameters:

addr (int)

mark_node_addr_has_return(node_addr, has_return=True)
Return type:

None

Parameters:
node_addr_has_return(node_addr)
Return type:

bool

Parameters:

node_addr (int | SootAddressDescriptor)

copy()
add_node(block_id, node)
Return type:

None

Parameters:
remove_node(block_id, node)

Remove the given CFGNode instance. Note that this method does not remove the node from the graph.

Parameters:
  • block_id (int) – The unique ID of the CFGNode.

  • node (CFGNode) – The CFGNode instance to remove.

Return type:

None

Returns:

None

has_node_id(node_id)
Return type:

bool

get_node(block_id)

Get a single node from Block ID.

Parameters:

block_id (BlockID) – Block ID of the node.

Return type:

CFGNode | None

Returns:

The CFGNode, or None if the node does not exist.

get_any_node(addr, is_syscall=None, anyaddr=False, force_fastpath=False)

Get an arbitrary CFGNode (without considering its context) from our graph.

Parameters:
  • addr (int) – Address of the beginning of the basic block. Set anyaddr to True to support arbitrary addresses.

  • is_syscall (bool | None) – Whether you want to get the syscall node or any other node. This is needed because syscall SimProcedures have the same address as the target they return to. None means get either, True means get a syscall node, False means get something that isn’t a syscall node.

  • anyaddr (bool) – If anyaddr is True, then addr doesn’t have to be the beginning address of a basic block. In that case the entire graph.nodes() is iterated by default and the first node containing the given address is returned, which can be slow.

  • force_fastpath (bool) – If force_fastpath is True, it will only perform a dict lookup in the graph._keys_by_addr dict.

Return type:

CFGNode | None

Returns:

A CFGNode if any node satisfies the given conditions, or None otherwise.
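
The lookup order described above (exact start address first, full scan only when anyaddr is set) can be illustrated with a small stand-alone sketch. It is not angr's implementation: Node, find_any_node and by_addr are hypothetical names used only for illustration, and the sketch assumes each node covers the bytes [addr, addr + size).

```python
from collections import defaultdict
from typing import NamedTuple


class Node(NamedTuple):
    """Hypothetical stand-in for a CFGNode."""
    addr: int
    size: int
    is_syscall: bool = False


def _matches(node, is_syscall):
    # None accepts either kind; True/False must match the node's flag.
    return is_syscall is None or node.is_syscall == is_syscall


def find_any_node(nodes, by_addr, addr, is_syscall=None, anyaddr=False):
    # Fast path: dictionary lookup keyed by block start address.
    for node in by_addr.get(addr, ()):
        if _matches(node, is_syscall):
            return node
    # Slow path: scan every node for one that contains addr.
    if anyaddr:
        for node in nodes:
            if node.addr <= addr < node.addr + node.size and _matches(node, is_syscall):
                return node
    return None


nodes = [Node(0x1000, 0x10), Node(0x2000, 1, is_syscall=True), Node(0x2000, 4)]
by_addr = defaultdict(list)
for n in nodes:
    by_addr[n.addr].append(n)
```

In this model, looking up 0x1004 finds nothing unless anyaddr=True, and is_syscall selects between the two nodes that share address 0x2000.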

get_all_nodes(addr, is_syscall=None, anyaddr=False)

Get all CFGNodes whose address is the specified one.

Parameters:
  • addr (int) – Address of the node

  • is_syscall (bool | None) – True returns the syscall node, False returns the normal CFGNode, None returns both

  • anyaddr (bool)

Return type:

list[CFGNode]

Returns:

All matching CFGNodes.

get_all_nodes_intersecting_region(addr, size=1)

Get all CFGNodes that intersect the given region.

Parameters:
  • addr (int) – Minimum address of target region.

  • size (int) – Size of region, in bytes.

Return type:

set[CFGNode]
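
As an illustration of what "intersect" means here, the following stand-alone sketch tests overlap between a block and a region. It assumes the region is the half-open range [addr, addr + size), matching the notation used for get_intersecting_functions; Block and intersecting are hypothetical names, not part of angr.

```python
from typing import NamedTuple


class Block(NamedTuple):
    """Hypothetical stand-in for a CFGNode: start address and size in bytes."""
    addr: int
    size: int


def intersecting(blocks, addr, size=1):
    """Return the blocks that overlap the half-open region [addr, addr + size)."""
    end = addr + size
    # Two half-open ranges overlap when each one starts before the other ends.
    return {b for b in blocks if b.addr < end and addr < b.addr + b.size}


blocks = [Block(0x1000, 0x10), Block(0x1010, 0x8), Block(0x1020, 0x4)]
```

With the default size of 1, the query degenerates to "which blocks contain this byte"; a larger size can span several adjacent blocks.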

floor_addr(addr)

Get the largest address that is less than or equal to the given address and has a CFGNode.

Parameters:

addr (int) – The address to floor.

Return type:

int | None

Returns:

The largest address that is less than or equal to the given address and has a CFGNode, or None if no such address exists.

ceil_addr(addr)

Get the smallest address that is greater than or equal to the given address and has a CFGNode.

Parameters:

addr (int) – The address to ceil.

Return type:

int | None

Returns:

The smallest address that is greater than or equal to the given address and has a CFGNode, or None if no such address exists.
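
The floor and ceiling semantics of floor_addr and ceil_addr can be modeled with the standard bisect module over a sorted list of addresses. This is a sketch of the semantics only, not angr's code; the free functions below and the sample node_addrs list are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from typing import Optional, Sequence


def floor_addr(sorted_addrs: Sequence[int], addr: int) -> Optional[int]:
    """Largest element <= addr, or None if every element is greater."""
    idx = bisect_right(sorted_addrs, addr)
    return sorted_addrs[idx - 1] if idx > 0 else None


def ceil_addr(sorted_addrs: Sequence[int], addr: int) -> Optional[int]:
    """Smallest element >= addr, or None if every element is smaller."""
    idx = bisect_left(sorted_addrs, addr)
    return sorted_addrs[idx] if idx < len(sorted_addrs) else None


# Hypothetical node start addresses, kept in ascending order.
node_addrs = [0x1000, 0x1010, 0x1040]
```

An address that is itself a node address is returned unchanged by both functions; an address below the first node has no floor, and one above the last node has no ceiling.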

nodes()

An iterator of all nodes in the graph.

Returns:

The iterator.

Return type:

iterator

get_predecessors(cfgnode, excluding_fakeret=True, jumpkind=None)

Get predecessors of a node in the control flow graph.

Parameters:
  • cfgnode (CFGNode) – The node.

  • excluding_fakeret (bool) – True if you want to exclude all predecessors that are connected to the node with a fakeret edge.

  • jumpkind (str | None) – Only return predecessors with the specified jumpkind. This argument will be ignored if set to None.

Return type:

list[CFGNode]

Returns:

A list of predecessors

get_successors(node, excluding_fakeret=True, jumpkind=None)

Get successors of a node in the control flow graph.

Parameters:
  • node (CFGNode) – The node.

  • excluding_fakeret (bool) – True if you want to exclude all successors that are connected to the node with a fakeret edge.

  • jumpkind (str | None) – Only return successors with the specified jumpkind. This argument will be ignored if set to None.

Returns:

A list of successors

Return type:

list[CFGNode]
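
The interaction of excluding_fakeret and jumpkind can be shown on a toy edge list. This is a minimal sketch under stated assumptions, not angr's implementation: the successors function, the dict-based edges structure, and the node names are hypothetical, and it assumes fakeret edges carry the jumpkind string "Ijk_FakeRet".

```python
def successors(edges, node, excluding_fakeret=True, jumpkind=None):
    """edges maps a source node to a list of (destination, jumpkind) pairs."""
    result = []
    for dst, jk in edges.get(node, []):
        # Drop the fall-through edge that follows a call site.
        if excluding_fakeret and jk == "Ijk_FakeRet":
            continue
        # jumpkind=None disables this filter.
        if jumpkind is not None and jk != jumpkind:
            continue
        if dst not in result:
            result.append(dst)
    return result


# Hypothetical edges: a call site and a two-way branch.
edges = {
    "call_site": [("callee", "Ijk_Call"), ("after_call", "Ijk_FakeRet")],
    "branch": [("taken", "Ijk_Boring"), ("fallthrough", "Ijk_Boring")],
}
```

Note that in this model, asking for jumpkind="Ijk_FakeRet" while leaving excluding_fakeret=True yields an empty list, because the first filter removes those edges before the second one is applied.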

get_successors_and_jumpkinds(node, excluding_fakeret=True)

Get a list of tuples where the first element is the successor of the CFG node and the second element is the jumpkind of the successor.

Parameters:
  • node (CFGNode) – The node.

  • excluding_fakeret (bool) – True if you want to exclude all successors that are fall-through successors.

Returns:

A list of successors and their corresponding jumpkinds.

Return type:

list[tuple[CFGNode, str]]

get_successors_and_jumpkind(node, excluding_fakeret=True)

Get a list of tuples where the first element is the successor of the CFG node and the second element is the jumpkind of the successor.

Parameters:
  • node (CFGNode) – The node.

  • excluding_fakeret (bool) – True if you want to exclude all successors that are fall-through successors.

Returns:

A list of successors and their corresponding jumpkinds.

Return type:

list[tuple[CFGNode, str]]

get_predecessors_and_jumpkinds(node, excluding_fakeret=True)

Get a list of tuples where the first element is the predecessor of the CFG node and the second element is the jumpkind of the predecessor.

Parameters:
  • node (CFGNode) – The node.

  • excluding_fakeret (bool) – True if you want to exclude all predecessors that are fall-through predecessors.

Return type:

list[tuple[CFGNode, str]]

Returns:

A list of predecessors and their corresponding jumpkinds.

get_predecessors_and_jumpkind(node, excluding_fakeret=True)

Get a list of tuples where the first element is the predecessor of the CFG node and the second element is the jumpkind of the predecessor.

Parameters:
  • node (CFGNode) – The node.

  • excluding_fakeret (bool) – True if you want to exclude all predecessors that are fall-through predecessors.

Return type:

list[tuple[CFGNode, str]]

Returns:

A list of predecessors and their corresponding jumpkinds.

get_all_predecessors(cfgnode, depth_limit=None)

Get all predecessors of a specific node on the control flow graph.

Parameters:
  • cfgnode (CFGNode) – The CFGNode object

  • depth_limit (int) – Optional depth limit for the depth-first search

Returns:

A list of predecessors in the CFG

Return type:

list

get_all_successors(cfgnode, depth_limit=None)

Get all successors of a specific node on the control flow graph.

Parameters:
  • cfgnode (CFGNode) – The CFGNode object

  • depth_limit (int) – Optional depth limit for the depth-first search

Returns:

A list of successors in the CFG

Return type:

list
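
To make the effect of depth_limit concrete, here is a stand-alone sketch of a bounded traversal over a plain adjacency dict. It deliberately swaps in a breadth-first traversal rather than the depth-first search the method describes, so that depth_limit reads simply as "at most this many edges from the start node". The function name and the sample graph are hypothetical.

```python
from collections import deque


def reachable_successors(graph, start, depth_limit=None):
    """graph maps each node to a list of its direct successors."""
    visited = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        # Do not expand nodes that already sit at the depth limit.
        if depth_limit is not None and depth >= depth_limit:
            continue
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order


# Hypothetical graph with a diamond (a -> b, c -> d) and a back edge d -> a.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a", "e"]}
```

The start node is never reported in this sketch, even when a cycle leads back to it, and a depth_limit of None traverses everything reachable.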

get_branching_nodes()

Returns all nodes that have an out-degree of at least 2.

get_exit_stmt_idx(src_block, dst_block)

Get the corresponding exit statement ID for control flow to reach destination block from source block. The exit statement ID was put on the edge when creating the CFG. Note that there must be a direct edge between the two blocks, otherwise an exception will be raised.

Returns:

The exit statement ID

add_memory_data(data_addr, data_type, data_size=None)

Add a MemoryData entry to self.memory_data.

Parameters:
  • data_addr (int) – Address of the data

  • data_type (MemoryDataSort | None) – Type of the memory data

  • data_size (int | None) – Size of the memory data, or None if unknown for now.

Return type:

bool

Returns:

True if a new memory data entry is added, False otherwise.

tidy_data_references(memory_data_addrs=None, exec_mem_regions=None, xrefs=None, seg_list=None, data_type_guessing_handlers=None, fill_gaps=True, new_mem_data_addrs=None)

Go through all data references (or only those specified by memory_data_addrs) and determine their sizes and types if possible.

Parameters:
  • memory_data_addrs (list[int] | None) – A list of addresses of memory data, or None to tidy all known memory data entries.

  • exec_mem_regions (list[tuple[int, int]] | None) – A list of start and end addresses of executable memory regions.

  • seg_list (SegmentList | None) – The segment list that CFGFast uses during CFG recovery.

  • data_type_guessing_handlers (list[Callable] | None) – A list of Python functions that will guess data types. They will be called in sequence to determine data types for memory data whose type is unknown.

  • fill_gaps (bool) – If True, when a memory data entry is found to have a gap between its end and the next data entry, a new memory data entry will be created to fill the gap. fill_gaps should only be set to True at the end of CFG recovery when traversing the entire memory_data dict for the last time.

  • xrefs (XRefManager | None)

  • new_mem_data_addrs (set[int] | None)

Return type:

bool

Returns:

True if new data entries are found, False otherwise.
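
The gap-filling behavior described for fill_gaps can be pictured with a simplified model in which each entry is just an address mapped to a size. This is only a sketch of the idea; the real method works on MemoryData objects and may apply further conditions that are not modeled here. The fill_gaps function and the entries dict below are hypothetical.

```python
def fill_gaps(entries):
    """entries maps a start address to a size in bytes; it is updated in place.

    Returns the start addresses of the entries that were added.
    """
    new_addrs = []
    addrs = sorted(entries)
    for cur, nxt in zip(addrs, addrs[1:]):
        end = cur + entries[cur]
        # A gap exists when this entry stops before the next one starts.
        if end < nxt:
            entries[end] = nxt - end
            new_addrs.append(end)
    return new_addrs


# Hypothetical entries: 4 bytes at 0x100, then 8 bytes at 0x108, then 2 at 0x110.
entries = {0x100: 4, 0x108: 8, 0x110: 2}
added = fill_gaps(entries)
```

Running the function a second time on the same dict adds nothing, which mirrors the documented return value of True only when new data entries are found.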

remove_node_and_graph_node(node)

Like remove_node, but also removes node from the graph.

Parameters:

node (CFGNode) – The node to remove.

Return type:

None

get_intersecting_functions(addr, size=1, kb=None)

Find all functions with nodes intersecting [addr, addr + size).

Parameters:
  • addr (int) – Minimum address of target region.

  • size (int) – Size of region, in bytes.

  • kb (KnowledgeBase | None) – Knowledge base to search for functions in.

Return type:

set[Function]

find_function_for_reflow_into_addr(addr, kb=None)

Look for a function that flows into a new node at addr.

Parameters:
  • addr (int) – Address of new block.

  • kb (KnowledgeBase | None) – Knowledge base to search for functions in.

Return type:

Function | None

clear_region_for_reflow(addr, size=1, kb=None)

Remove nodes in the graph intersecting region [addr, addr + size).

Any functions that intersect the range, and their associated nodes in the CFG, will also be removed from the knowledge base for analysis.

Parameters:
  • addr (int) – Minimum address of target region.

  • size (int) – Size of the region, in bytes.

  • kb (KnowledgeBase | None) – Knowledge base to search for functions in.

Return type:

None