angr.analyses.decompiler.optimization_passes
- class angr.analyses.decompiler.optimization_passes.BasePointerSaveSimplifier
Bases: OptimizationPass
Removes the effects of base pointer stack storage at function invocation and restoring at function return.
- ARCHES = ['X86', 'AMD64', 'ARMEL', 'ARMHF', 'ARMCortexM', 'MIPS32', 'MIPS64']
- PLATFORMS = None
- STAGE = 6
- NAME = 'Simplify base pointer saving'
- DESCRIPTION = 'Removes the effects of base pointer stack storage at function invocation and restoring at function return.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.CallStatementRewriter
Bases: OptimizationPass
Rewrite call statements to assignments if needed.
- ARCHES = None
- PLATFORMS = None
- STAGE = 5
- NAME = 'Unify call statements on demand.'
- DESCRIPTION = 'Rewrite call statements to assignments if needed.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.CodeMotionOptimization
Bases: OptimizationPass
Moves common statements out of blocks that share the same predecessors or the same successors. This is done to reduce the number of statements in a block and to make the blocks more similar to each other.
As an example:
    if (x) {
        b = 2;
        a = 1;
        c = 3;
    } else {
        b = 2;
        c = 3;
    }
Will be turned into:
    if (x) {
        a = 1;
    }
    b = 2;
    c = 3;
Current limitations (for very conservative operations):
    - moving statements above conditional jumps is not supported
    - only immediate children and parents are considered for moving statements
    - when moving statements down, a block is only considered if it already has a matching statement at the end
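The example above can be mimicked with a toy sketch. This is not angr's implementation: statements are plain strings, every statement is assumed to be independent of the others (so reordering is always safe), and `hoist_common` is a hypothetical helper name.

```python
def hoist_common(then_stmts, else_stmts):
    """Split two branch bodies into (then_rest, else_rest, common).

    Toy assumption: all statements are independent, so any statement that
    appears in both branches can be moved after the if/else construct.
    """
    else_set = set(else_stmts)
    # Statements shared by both branches, in the order of the "then" branch.
    common = [s for s in then_stmts if s in else_set]
    common_set = set(common)
    then_rest = [s for s in then_stmts if s not in common_set]
    else_rest = [s for s in else_stmts if s not in common_set]
    return then_rest, else_rest, common


then_rest, else_rest, common = hoist_common(
    ["b = 2;", "a = 1;", "c = 3;"],
    ["b = 2;", "c = 3;"],
)
print("if (x) {", *then_rest, "}", *common)
```

The real pass works on AIL blocks and has to prove that a move is safe, which is why the limitations listed above exist.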
- ARCHES = None
- PLATFORMS = None
- NAME = 'Merge common statements in sub-scopes'
- STAGE = 6
- DESCRIPTION = '\n Moves common statements out of blocks that share the same predecessors or the same\n successors. This is done to reduce the number of statements in a block and to make the\n blocks more similar to each other.\n\n As an example::\n\n if (x) {\n b = 2;\n a = 1;\n c = 3;\n } else {\n b = 2;\n c = 3;\n }\n\n Will be turned into::\n\n if (x) {\n a = 1;\n }\n b = 2;\n c = 3;\n\n Current limitations (for very conservative operations):\n\n - moving statements above conditional jumps is not supported\n - only immediate children and parents are considered for moving statements\n - when moving statements down, a block is only considered if already has a matching statement at the end\n '
- __init__(*args, max_iters=10, node_idx_start=0, **kwargs)
- Parameters:
node_idx_start (int)
- static update_graph_with_super_edits(original_graph, super_graph, updated_blocks)
This function updates a graph when doing block edits on a supergraph version of that same graph. The updated blocks must be provided as a dictionary where the keys are the original blocks in the supergraph and the values are the new blocks that should replace them.
The supergraph MUST be generated using the to_ail_supergraph function, since it stores the original nodes each super node represents. This is necessary to update the original graph with the new super nodes.
- class angr.analyses.decompiler.optimization_passes.ConditionConstantPropagation
Bases: OptimizationPass
Reason about constant propagation opportunities from conditionals and propagate constants in the graph accordingly.
- ARCHES = None
- PLATFORMS = None
- STAGE = 2
- NAME = 'Propagate constants using information deduced from conditionals.'
- DESCRIPTION = 'Reason about constant propagation opportunities from conditionals and propagate constants in the graph accordingly.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.ConstPropOptReverter
Bases: OptimizationPass
This optimization reverts the effects of constant propagation done by the compiler as discussed in the USENIX 2024 paper SAILR. This optimization’s main goal is to enable later optimizations that rely on symbolic variables to be more effective. This optimization pass will convert two statements that differ only in that one uses a constant and the other a symbolic variable into two statements that both use the symbolic variable.
As an example:
    x = 75
    puts(x)
    puts(75)
will be converted to:
    x = 75
    puts(x)
    puts(x)
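A minimal sketch of the idea, assuming straight-line code in which each variable is assigned at most once. The tuple-based statement encoding and the name `revert_const_prop` are illustrative only and are not part of angr.

```python
def revert_const_prop(stmts):
    """Replace constant call arguments with a variable known to hold that constant.

    Each statement is a tuple:
      ("assign", var, const)  e.g. x = 75
      ("call", func, arg)     e.g. puts(75) or puts("x")
    Integer arguments are constants; string arguments are variable names.
    """
    const_to_var = {}
    out = []
    for stmt in stmts:
        if stmt[0] == "assign":
            _, var, const = stmt
            # Remember the first variable seen holding this constant.
            const_to_var.setdefault(const, var)
            out.append(stmt)
        elif stmt[0] == "call":
            _, func, arg = stmt
            if isinstance(arg, int) and arg in const_to_var:
                arg = const_to_var[arg]
            out.append(("call", func, arg))
        else:
            out.append(stmt)
    return out


before = [("assign", "x", 75), ("call", "puts", "x"), ("call", "puts", 75)]
after = revert_const_prop(before)
```

After the rewrite both `puts` calls take `x`, which is what lets later passes treat the two statements as identical.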
- ARCHES = None
- PLATFORMS = None
- STRUCTURING = ['sailr', 'dream']
- STAGE = 10
- NAME = 'Revert Constant Propagation Optimizations'
- DESCRIPTION = "This optimization reverts the effects of constant propagation done by the compiler as discussed in the\n USENIX 2024 paper SAILR. This optimization's main goal is to enable later optimizations that rely on\n symbolic variables to be more effective. This optimization pass will convert two statements with a difference of\n a const and a symbolic variable into two statements with the symbolic variables.\n\n As an example:\n x = 75\n puts(x)\n puts(75)\n\n will be converted to:\n x = 75\n puts(x)\n puts(x)"
- __init__(*args, region_identifier=None, reaching_definitions=None, **kwargs)
- class angr.analyses.decompiler.optimization_passes.ConstantDereferencesSimplifier
Bases: OptimizationPass
Makes the following simplifications:
    *(*(const_addr)) ==> *(value) iff *const_addr == value
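The rule can be sketched on a tiny expression tree. The `Const` and `Load` classes and the `readonly_memory` dictionary below are stand-ins invented for this illustration. They are not AIL classes, and the sketch assumes the memory at `const_addr` never changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Load:
    addr: object  # a Const or another Load


def simplify(expr, readonly_memory):
    """Rewrite *(*(const_addr)) to *(value) when *const_addr is known."""
    if isinstance(expr, Load):
        inner = simplify(expr.addr, readonly_memory)
        if isinstance(inner, Load) and isinstance(inner.addr, Const):
            value = readonly_memory.get(inner.addr.value)
            if value is not None:
                # The inner dereference has a known result; fold it.
                return Load(Const(value))
        return Load(inner)
    return expr


memory = {0x1000: 0x2000}
folded = simplify(Load(Load(Const(0x1000))), memory)
untouched = simplify(Load(Load(Const(0x5000))), memory)
```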
- ARCHES = None
- PLATFORMS = None
- STAGE = 2
- NAME = 'Simplify constant dereferences'
- DESCRIPTION = 'Makes the following simplifications::\n\n *(*(const_addr)) ==> *(value) iff *const_addr == value'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.CrossJumpReverter
Bases: StructuringOptimizationPass
This is an implementation to revert the compiler optimization Cross Jumping, an ISC optimization discussed in the USENIX 2024 paper SAILR. This optimization is somewhat aggressive and as such should be run last in your decompiler deoptimization chain. This deoptimization will take any goto it finds and attempt to duplicate its target block if its target only has one outgoing edge.
There are some heuristics in place to prevent duplication everywhere. First, this deoptimization will only run a max of max_opt_iters times. Second, it will not duplicate a block with too many calls.
- STAGE = 10
- NAME = 'Duplicate linear blocks with gotos'
- DESCRIPTION = 'This is an implementation to revert the compiler optimization Cross Jumping, an ISC optimization discussed\nin the USENIX 2024 paper SAILR. This optimization is somewhat aggressive and as such should be run last in your\ndecompiler deoptimization chain. This deoptimization will take any goto it finds and attempt to duplicate its\ntarget block if its target only has one outgoing edge.\n\nThere are some heuristics in place to prevent duplication everywhere. First, this deoptimization will only run\na max of max_opt_iters times. Second, it will not duplicate a block with too many calls.'
- class angr.analyses.decompiler.optimization_passes.DeadblockRemover
Bases: OptimizationPass
Removes condition-unreachable blocks from the graph.
- ARCHES = None
- PLATFORMS = None
- STAGE = 9
- NAME = 'Remove blocks with unsatisfiable conditions'
- DESCRIPTION = 'Removes condition-unreachable blocks from the graph.'
- __init__(*args, node_cutoff=200, **kwargs)
- Parameters:
node_cutoff (int)
- class angr.analyses.decompiler.optimization_passes.DivSimplifier
Bases: OptimizationPass
Simplifies various division optimizations back to “div”.
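One well-known division optimization is multiplication by a "magic" constant followed by a shift. The snippet below only checks that arithmetic identity for unsigned 32-bit division by 3. It does not show how this pass recognizes the pattern, and `div3_optimized` is a hypothetical name.

```python
MAGIC = 0xAAAAAAAB  # ceil(2**33 / 3)
SHIFT = 33


def div3_optimized(x):
    """Multiply-and-shift form a compiler may emit for unsigned 32-bit x / 3."""
    return (x * MAGIC) >> SHIFT


samples = [0, 1, 2, 3, 4, 100, 2**31, 2**32 - 1]
# The optimized form agrees with plain integer division on every sample.
matches = all(div3_optimized(x) == x // 3 for x in samples)
```

A decompiler that spots the multiply-and-shift pattern can therefore print `x / 3` instead of the constant and the shift.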
- ARCHES = ['X86', 'AMD64', 'ARMCortexM', 'ARMHF', 'ARMEL']
- PLATFORMS = None
- STAGE = 6
- NAME = 'Simplify arithmetic division'
- DESCRIPTION = 'Simplifies various division optimizations back to "div".'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.DuplicationReverter
Bases: StructuringOptimizationPass
This (de)optimization reverts the effects of many compiler optimizations that cause code duplication in the decompilation. This deoptimization is the implementation of the USENIX 2024 paper SAILR’s ISD deoptimization. As such, the main goal of this optimization is to remove code duplication by merging semantically similar blocks in the AIL graph.
- NAME = 'Revert Statement Duplication Optimizations'
- DESCRIPTION = "This (de)optimization reverts the effects of many compiler optimizations that cause code duplication in\n the decompilation. This deoptimization is the implementation of the USENIX 2024 paper SAILR's ISD\n doptimization. As such, the main goal of this optimization is to remove code duplication by merging\n semantically similar blocks in the AIL graph."
- __init__(*args, max_guarding_conditions=4, **kwargs)
- static boolean_operators_in_condition(condition)
TODO: This entire boolean-checking semantic needs to be removed (see how it is used for other deletions needed). We need to replace it with a boolean variable insertion on both branches that lead to the new block.
Say we have:
    if (A()) {
        do_thing();
    }
    if (B()) {
        do_thing();
    }
We want to translate it to:
    int should_do_thing = 0;
    if (A())
        should_do_thing = 1;
    if (B())
        should_do_thing = 1;
    if (should_do_thing)
        do_thing();
Although longer, this code can be optimized to look like:
    int should_do_thing = A() || B();
    if (should_do_thing)
        do_thing();
- Parameters:
condition (Expression)
- stmt_can_move_to(stmt, block, new_idx, io_finder=None)
- maximize_similarity_of_blocks(block1, block2, graph)
This attempts to rearrange the order of statements in block1 and block2 to maximize the similarity between them. This implementation is a little outdated since CodeMotion optimization was implemented, but it should be disabled until we have a good SSA implementation.
TODO: reimplement me when we have better SSA
- create_merged_subgraph(blocks, graph, maximize_similarity=False)
- Return type:
- Parameters:
graph (DiGraph)
- similar_conditional_when_single_corrected(block1, block2, graph)
- collect_conditions_between_nodes(graph, source, sinks, max_depth=15)
- shared_common_conditional_dom(nodes, graph)
Takes n nodes and returns True only if all the nodes are dominated by the same node, which must be a ConditionalJump
- Parameters:
graph (DiGraph)
- class angr.analyses.decompiler.optimization_passes.EagerStdStringConcatenationPass
Bases: OptimizationPass
Concatenate multiple constant std::string creation calls into one when possible.
- ARCHES = None
- PLATFORMS = None
- STAGE = 7
- NAME = 'Condense multiple constant std::string creation calls into one when possible'
- DESCRIPTION = 'Concatenate multiple constant std::string creation calls into one when possible.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.ExprOpSwapper
Bases: SequenceOptimizationPass
Swap operands (and the operator accordingly) in a BinOp expression.
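As a rough illustration of swapping operands together with the operator, the sketch below mirrors comparison operators so that the expression keeps its meaning. Operators are plain strings here, and `SWAPPED_OPS` and `swap_operands` are hypothetical names that say nothing about the `OpDescriptor` type or the `binop_operators` format used by the pass.

```python
# Comparison operators and their mirror images; symmetric ones map to themselves.
SWAPPED_OPS = {"<": ">", ">": "<", "<=": ">=", ">=": "<=", "==": "==", "!=": "!="}


def swap_operands(op, lhs, rhs):
    """Return (op', rhs, lhs) such that `rhs op' lhs` means the same as `lhs op rhs`."""
    return SWAPPED_OPS[op], rhs, lhs


swapped = swap_operands("<", "a", "10")
```

Such a swap is often done for readability, for example to keep the variable on the left and the constant on the right.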
- ARCHES = ['X86', 'AMD64', 'ARMEL', 'ARMHF', 'ARMCortexM', 'MIPS32', 'MIPS64']
- PLATFORMS = ['windows', 'linux', 'cgc']
- STAGE = 11
- NAME = 'Swap operands of expressions as requested'
- DESCRIPTION = 'Swap operands (and the operator accordingly) in a BinOp expression.'
- __init__(*args, binop_operators=None, **kwargs)
- Parameters:
binop_operators (dict[OpDescriptor, str] | None)
- class angr.analyses.decompiler.optimization_passes.FlipBooleanCmp
Bases: SequenceOptimizationPass
In the scenario in which a false node has no apparent successors, flip the condition on that if-stmt. This is only useful when StructuredCodeGenerator has simplify_else_scopes enabled, as this will allow the flipped if-stmt to remove the redundant else.
- ARCHES = None
- PLATFORMS = None
- STAGE = 11
- NAME = 'Flip small ret booleans'
- DESCRIPTION = 'When false node has no successors, flip condition so else scope can be simplified later'
- __init__(*args, flip_size=9, **kwargs)
- class angr.analyses.decompiler.optimization_passes.ITEExprConverter
Bases: OptimizationPass
Transform specific expressions into If-Then-Else expressions, or ternary expressions in C when given a single-use expression address. Requires outside analysis to provide the target expressions.
- ARCHES = ['X86', 'AMD64', 'ARMEL', 'ARMHF', 'ARMCortexM', 'MIPS32', 'MIPS64']
- PLATFORMS = ['windows', 'linux', 'cgc']
- STAGE = 10
- NAME = 'Transform single-use expressions that were assigned to in different If-Else branches into ternary expressions'
- DESCRIPTION = 'Transform specific expressions into If-Then-Else expressions, or tertiary expressions in C when\n given a single-use expression address. Requires outside analysis to provide the target expressions.'
- __init__(*args, ite_exprs=None, **kwargs)
- class angr.analyses.decompiler.optimization_passes.ITERegionConverter
Bases: OptimizationPass
Transform regions of the form if (c) {x = a} else {x = b} into x = c ? a : b.
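The shape of this rewrite can be sketched on a tuple-encoded statement. The encoding and the name `to_ternary` are assumptions made for this example. The actual pass operates on regions of the AIL graph.

```python
def to_ternary(node):
    """Turn ("if", c, [("assign", x, a)], [("assign", x, b)]) into
    ("assign", x, ("ite", c, a, b)); return any other node unchanged."""
    if node[0] != "if":
        return node
    _, cond, then_body, else_body = node
    if len(then_body) == 1 and len(else_body) == 1:
        t, e = then_body[0], else_body[0]
        # Both branches must assign to the same destination.
        if t[0] == "assign" and e[0] == "assign" and t[1] == e[1]:
            return ("assign", t[1], ("ite", cond, t[2], e[2]))
    return node


region = ("if", "c", [("assign", "x", "a")], [("assign", "x", "b")])
converted = to_ternary(region)
```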
- ARCHES = ['X86', 'AMD64', 'ARMEL', 'ARMHF', 'ARMCortexM', 'MIPS32', 'MIPS64']
- PLATFORMS = ['windows', 'linux', 'cgc']
- STAGE = 6
- NAME = 'Transform ITE-assignment regions into ternary expression assignments'
- DESCRIPTION = 'Transform regions of the form `if (c) {x = a} else {x = b}` into `x = c ? a : b`.'
- __init__(*args, max_updates=10, **kwargs)
- class angr.analyses.decompiler.optimization_passes.InlinedMemcpySimplifier
Bases: OptimizationPass
Simplifies inlined data copying logic into calls to memcpy.
- ARCHES = None
- PLATFORMS = None
- STAGE = 3
- NAME = 'Simplify inlined memcpy'
- DESCRIPTION = 'Simplify inlined memcpy patterns into memcpy calls'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.InlinedMemcpySimplifierLate
Bases: InlinedMemcpySimplifier
Same as InlinedMemcpySimplifier but runs after SSA level 1 transformation.
- STAGE = 4
- NAME = 'Simplify inlined memcpy (late)'
- class angr.analyses.decompiler.optimization_passes.InlinedMemsetSimplifier
Bases: OptimizationPass
Simplifies inlined memory setting logic into calls to memset.
- ARCHES = None
- PLATFORMS = None
- STAGE = 3
- NAME = 'Simplify inlined memset'
- DESCRIPTION = 'Simplify inlined memset patterns into memset calls'
- MIN_ASSIGNMENTS = 2
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.InlinedMemsetSimplifierLate
Bases: InlinedMemsetSimplifier
Same as InlinedMemsetSimplifier but runs after SSA level 1 transformation.
- STAGE = 4
- NAME = 'Simplify inlined memset (late)'
- class angr.analyses.decompiler.optimization_passes.InlinedStrcpySimplifier
Bases: OptimizationPass
Simplifies inlined string copying logic into calls to strcpy/strncpy, and consolidates multiple consecutive inlined strcpy calls.
- ARCHES = None
- PLATFORMS = None
- STAGE = 3
- NAME = 'Simplify inlined strcpy'
- DESCRIPTION = 'Simplify inlined strcpy patterns and consolidate multiple inlined strcpy calls'
- __init__(*args, **kwargs)
- static is_integer_likely_a_string(v, size, endness, min_length=4)
- static is_inlined_strcpy(stmt)
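An inlined strcpy typically shows up as stores of integer constants whose bytes spell out text. The sketch below illustrates that idea only. It is not the implementation of `is_integer_likely_a_string`, whose exact heuristics and return value are not documented here. `decode_if_string` and its `little_endian` flag are hypothetical.

```python
import string

# Byte values considered printable ASCII for this sketch.
PRINTABLE = set(string.printable.encode("ascii"))


def decode_if_string(v, size, little_endian=True, min_length=4):
    """Return the text if the size-byte integer v looks like ASCII, else None."""
    data = v.to_bytes(size, "little" if little_endian else "big")
    # Ignore trailing NUL bytes (terminator or padding).
    data = data.rstrip(b"\x00")
    if len(data) < min_length or any(b not in PRINTABLE for b in data):
        return None
    return data.decode("ascii")


text = decode_if_string(0x6F6C6C6568, 8)    # bytes 68 65 6c 6c 6f 00 00 00
binary = decode_if_string(0xDEADBEEF, 4)    # not printable
short = decode_if_string(0x6968, 4)         # "hi" is below min_length
```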
- class angr.analyses.decompiler.optimization_passes.InlinedStrcpySimplifierLate
Bases: InlinedStrcpySimplifier
Same as InlinedStrcpySimplifier but runs after SSA level 1 transformation.
- STAGE = 4
- NAME = 'Simplify inlined strcpy (late)'
- class angr.analyses.decompiler.optimization_passes.InlinedStringTransformationSimplifier
Bases: OptimizationPass
Simplifies inlined string transformation routines.
- ARCHES = None
- PLATFORMS = None
- STAGE = 3
- NAME = 'Simplify string transformations'
- DESCRIPTION = 'Simplify string transformations that are commonly used in obfuscated functions.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.InlinedStrlenSimplifier
Bases: OptimizationPass
Abstracts inlined strlen functions into strlen() calls.
- ARCHES = None
- PLATFORMS = None
- STAGE = 6
- NAME = 'Identify and simplify inlined strlen() functions'
- DESCRIPTION = 'Identify and simplify inlined strlen() functions'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.InlinedWcscpySimplifier
Bases: OptimizationPass
Simplifies inlined wide string copying logic into calls to wcsncpy, and consolidates multiple consecutive inlined wcsncpy calls.
- ARCHES = None
- PLATFORMS = None
- STAGE = 3
- NAME = 'Simplify inlined wcscpy'
- DESCRIPTION = 'Simplify inlined wcscpy patterns and consolidate multiple inlined wcsncpy calls'
- __init__(*args, **kwargs)
- static even_offsets_are_zero(lst)
- static odd_offsets_are_zero(lst)
- static is_integer_likely_a_wide_string(v, size, endness, min_length=4)
- static is_inlined_wcsncpy(stmt)
- class angr.analyses.decompiler.optimization_passes.InlinedWcscpySimplifierLate
Bases: InlinedWcscpySimplifier
Same as InlinedWcscpySimplifier but runs after SSA level 1 transformation.
- STAGE = 4
- NAME = 'Simplify inlined wcscpy (late)'
- class angr.analyses.decompiler.optimization_passes.LoweredSwitchSimplifier
Bases: StructuringOptimizationPass
This optimization recognizes and reverts switch cases that have been lowered and possibly split into multiple if-else statements. This optimization, discussed in the USENIX 2024 paper SAILR, aims to undo the compiler optimization known as “Switch Lowering”, present in both GCC and Clang. An in-depth discussion of this optimization can be found in the paper or in our documentation of the optimization: https://github.com/mahaloz/sailr-eval/issues/14#issue-2232616411
Note, this optimization does not occur in MSVC, which uses a different optimization strategy for switch cases. As a hack for now, we only run this deoptimization on Linux binaries.
- PLATFORMS = ['linux']
- NAME = 'Convert lowered switch-cases (if-else) to switch-cases'
- DESCRIPTION = 'Convert lowered switch-cases (if-else) to switch-cases. Only works when the Phoenix structuring algorithm is in use.'
- __init__(*args, min_distinct_cases=2, **kwargs)
- static restore_graph(node, last_stmt, graph, full_graph)
- Parameters:
last_stmt (IncompleteSwitchCaseHeadStatement)
graph (DiGraph)
full_graph (DiGraph)
- class angr.analyses.decompiler.optimization_passes.MipsGpSettingSimplifier
Bases: OptimizationPass
Removes $gp-setting statements at the beginning of MIPS functions.
- ARCHES = ['MIPS32', 'MIPS64']
- PLATFORMS = ['linux']
- STAGE = 6
- NAME = 'Remove MIPS $gp-setting statements'
- DESCRIPTION = 'Removes $gp-setting statements at the beginning of MIPS functions.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.ModSimplifier
Bases: OptimizationPass
Simplifies optimized forms of modulo computation back to “mod”.
- ARCHES = ['X86', 'AMD64', 'ARMCortexM', 'ARMHF', 'ARMEL']
- PLATFORMS = ['linux', 'windows']
- STAGE = 6
- NAME = 'Simplify optimized mod forms'
- DESCRIPTION = 'Simplifies optimized forms of modulo computation back to "mod".'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.OptimizationPassStage
Bases: Enum
Enums about optimization pass stages.
Note that the region identification pass (RegionIdentifier) may modify existing AIL blocks without updating the topology of the original AIL graph. For example, loop successor refinement may create a new AIL block with an artificial address, and alter existing jump targets of jump statements and conditional jump statements to point to this new block. However, loop successor refinement does not update the topology of the original AIL graph, which means this new AIL block does not exist in the original AIL graph. As a result, until this behavior of RegionIdentifier changes in the future, DURING_REGION_IDENTIFICATION optimization passes should not modify existing jump targets.
- AFTER_AIL_GRAPH_CREATION = 0
- BEFORE_SSA_LEVEL0_TRANSFORMATION = 1
- AFTER_SINGLE_BLOCK_SIMPLIFICATION = 2
- BEFORE_SSA_LEVEL1_TRANSFORMATION = 3
- AFTER_SSA_LEVEL1_TRANSFORMATION = 4
- AFTER_MAKING_CALLSITES = 5
- AFTER_GLOBAL_SIMPLIFICATION = 6
- BEFORE_VARIABLE_RECOVERY = 7
- AFTER_VARIABLE_RECOVERY = 8
- BEFORE_REGION_IDENTIFICATION = 9
- DURING_REGION_IDENTIFICATION = 10
- AFTER_STRUCTURING = 11
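The numeric STAGE values listed for the passes on this page map onto these members. The sketch below uses a local stand-in enum that copies the documented values (it does not import angr) and assumes, as the member names suggest, that lower-numbered stages come earlier in decompilation.

```python
from enum import Enum


class Stage(Enum):
    """Local mirror of the documented OptimizationPassStage values."""
    AFTER_AIL_GRAPH_CREATION = 0
    BEFORE_SSA_LEVEL0_TRANSFORMATION = 1
    AFTER_SINGLE_BLOCK_SIMPLIFICATION = 2
    BEFORE_SSA_LEVEL1_TRANSFORMATION = 3
    AFTER_SSA_LEVEL1_TRANSFORMATION = 4
    AFTER_MAKING_CALLSITES = 5
    AFTER_GLOBAL_SIMPLIFICATION = 6
    BEFORE_VARIABLE_RECOVERY = 7
    AFTER_VARIABLE_RECOVERY = 8
    BEFORE_REGION_IDENTIFICATION = 9
    DURING_REGION_IDENTIFICATION = 10
    AFTER_STRUCTURING = 11


# STAGE values as documented for a few passes on this page.
documented_stage = {
    "ExprOpSwapper": 11,
    "StackCanarySimplifier": 6,
    "X86GccGetPcSimplifier": 1,
    "WinStackCanarySimplifier": 2,
    "SwitchDefaultCaseDuplicator": 0,
}

ordered = sorted(documented_stage, key=documented_stage.get)
named = {name: Stage(value).name for name, value in documented_stage.items()}
```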
- class angr.analyses.decompiler.optimization_passes.RegisterSaveAreaSimplifier
Bases: OptimizationPass
Optimizes away register spilling effects, including callee-saved registers.
This optimization runs between SSA-level0 and SSA-level1, which means registers are converted to vvars but stack accesses stay unchanged.
- ARCHES = None
- PLATFORMS = None
- STAGE = 2
- NAME = 'Simplify register save areas'
- DESCRIPTION = 'Optimizes away register spilling effects, including callee-saved registers.\n\n This optimization runs between SSA-level0 and SSA-level1, which means registers are converted to vvars but stack\n accesses stay unchanged.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.RegisterSaveAreaSimplifierAdvanced
Bases: OptimizationPass
Optimizes away registers that are stored to or restored from the stack.
This analysis is more complex than RegisterSaveAreaSimplifier because it handles:
(1) Registers that are stored in the stack shadow space (sp+N) according to the Windows x64 calling convention.
(2) Registers that are aliases of sp.
- ARCHES = None
- PLATFORMS = None
- STAGE = 5
- NAME = 'Simplify register save areas (advanced)'
- DESCRIPTION = 'Optimizes away registers that are stored to or restored on the stack space.\n\n This analysis is more complex than RegisterSaveAreaSimplifier because it handles:\n (1) Registers that are stored in the stack shadow space (sp+N) according to the Windows x64 calling convention.\n (2) Registers that are aliases of sp.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.RetAddrSaveSimplifier
Bases: OptimizationPass
Removes code in function prologues and epilogues for saving and restoring return address registers (ra, lr, etc.), generally seen in non-leaf functions.
- ARCHES = ['MIPS32', 'MIPS64']
- PLATFORMS = ['linux']
- STAGE = 6
- NAME = 'Simplify return address storage'
- DESCRIPTION = 'Removes code in function prologues and epilogues for saving and restoring return address registers (ra, lr, etc.),\n generally seen in non-leaf functions.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.ReturnDeduplicator
Bases: OptimizationPass
Transforms:
    if (cond) { … return x; } return x;
into:
    if (cond) { … } return x;
TODO: It's possible that this can be expanded to all rets that are equivalent. Testing needed.
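A toy version of this transformation, assuming statements are plain strings and that two returns are duplicates only when their text is identical. `dedup_return` is a hypothetical helper, not part of angr.

```python
def dedup_return(if_body, following):
    """Drop a trailing return in the if-body when the same return follows the if.

    if_body:   statements inside `if (cond) { ... }`
    following: statements that come right after the if
    """
    if (
        if_body
        and following
        and if_body[-1].startswith("return")
        and if_body[-1] == following[0]
    ):
        # Falling out of the if reaches the identical return anyway.
        return if_body[:-1], following
    return if_body, following


body, rest = dedup_return(["foo();", "return x;"], ["return x;"])
```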
- ARCHES = ['X86', 'AMD64', 'ARMEL', 'ARMHF', 'ARMCortexM', 'MIPS32', 'MIPS64']
- PLATFORMS = ['windows', 'linux', 'cgc']
- STAGE = 10
- NAME = 'Deduplicates return statements that may have been duplicated'
- DESCRIPTION = 'Transforms:\n - if (cond) { ... return x; } return x;\n\n into:\n - if (cond) { ... } return x;\n\n TODO: its possible that this can be expanded to all rets that are equivalent. Testing needed.'
- STRUCTURING = ['sailr', 'dream']
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.ReturnDuplicatorHigh
Bases: OptimizationPass, ReturnDuplicatorBase
This is a light-level goto-less version of the ReturnDuplicator optimization pass. It will only duplicate return-only blocks.
- ARCHES = None
- PLATFORMS = None
- STAGE = 6
- NAME = 'Duplicate return-only blocks (high)'
- DESCRIPTION = '\n This is a light-level goto-less version of the ReturnDuplicator optimization pass. It will only\n duplicate return-only blocks.\n '
- STRUCTURING = ['sailr', 'dream']
- class angr.analyses.decompiler.optimization_passes.ReturnDuplicatorLow
Bases: StructuringOptimizationPass, ReturnDuplicatorBase
An optimization pass that reverts a subset of Irreducible Statement Condensing (ISC) optimizations, as described in the USENIX 2024 paper SAILR. This is the heavy/goto version of the ReturnDuplicator optimization pass.
Some compilers, including GCC, Clang, and MSVC, apply various optimizations to reduce the number of statements in code. These optimizations will take equivalent statements, or a subset of them, and replace them with a single copy that is jumped to by gotos – optimizing for space and sometimes speed.
This optimization pass will revert those gotos by re-duplicating the condensed blocks. Since Return statements are the most common, we use this optimization pass to revert only gotos to return statements. Additionally, we perform some additional readability fixups, like not re-duplicating returns to shared components.
- Parameters:
func – The function to optimize.
node_idx_start – The index to start at when creating new nodes. This is used by Clinic to ensure that node indices are unique across multiple passes.
max_opt_iters (int) – The maximum number of optimization iterations to perform.
max_calls_in_regions (int) – The maximum number of calls that can be in a region. This is used to prevent duplicating too much code.
prevent_new_gotos (bool) – If True, this optimization pass will prevent new gotos from being created.
minimize_copies_for_regions (bool) – If True, this optimization pass will minimize the number of copies by doing only a single copy for connected in_edges that form a region.
- ARCHES = None
- PLATFORMS = None
- NAME = 'Duplicate returns connect with gotos (low)'
- DESCRIPTION = 'An optimization pass that reverts a subset of Irreducible Statement Condensing (ISC) optimizations, as described\nin the USENIX 2024 paper SAILR. This is the heavy/goto version of the ReturnDuplicator optimization pass.\n\nSome compilers, including GCC, Clang, and MSVC, apply various optimizations to reduce the number of statements in\ncode. These optimizations will take equivalent statements, or a subset of them, and replace them with a single\ncopy that is jumped to by gotos -- optimizing for space and sometimes speed.\n\nThis optimization pass will revert those gotos by re-duplicating the condensed blocks. Since Return statements\nare the most common, we use this optimization pass to revert only gotos to return statements. Additionally, we\nperform some additional readability fixups, like not re-duplicating returns to shared components.'
- __init__(*args, max_opt_iters=4, max_calls_in_regions=2, prevent_new_gotos=True, minimize_copies_for_regions=True, region_identifier=None, vvar_id_start=0, scratch=None, max_func_blocks=500, **kwargs)
- class angr.analyses.decompiler.optimization_passes.StackCanarySimplifier
Bases: OptimizationPass
Removes stack canary checks from decompilation results.
- ARCHES = ['X86', 'AMD64']
- PLATFORMS = ['cgc', 'linux']
- STAGE = 6
- NAME = 'Simplify stack canaries'
- DESCRIPTION = 'Removes stack canary checks from decompilation results.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.StaticVVarRewriter
Bases: OptimizationPass
Rewrite user-specified vvars as static values or fixed-size buffers. Also rewrites reads from pointers derived from such vvars.
- ARCHES = None
- PLATFORMS = None
- STAGE = 7
- NAME = 'Static virtual variable rewriter'
- DESCRIPTION = 'Rewrite user-specified vvars as static values or fix-sized buffers. Also rewrites reads from pointers derived off\n of such vvars.'
- __init__(*args, static_buffers=None, static_vvars=None, **kwargs)
- Parameters:
static_buffers (dict[str, FixedBuffer] | None)
static_vvars (dict[int, FixedBufferPtr | Const] | None)
- class angr.analyses.decompiler.optimization_passes.SwitchDefaultCaseDuplicator
Bases: OptimizationPass
For each switch-case construct (identified by jump tables), duplicate the default-case node when we detect situations where the default-case node is seemingly reused by edges outside the switch-case construct. This code reuse is usually caused by compiler code deduplication.
Ideally this pass should be implemented as an ISC optimization reversion.
- ARCHES = None
- PLATFORMS = None
- STAGE = 0
- NAME = 'Duplicate default-case nodes to undo default-case node reuse caused by compiler code deduplication'
- DESCRIPTION = 'For each switch-case construct (identified by jump tables), duplicate the default-case node when we detect\n situations where the default-case node is seemingly reused by edges outside the switch-case construct. This code\n reuse is usually caused by compiler code deduplication.\n\n Ideally this pass should be implemented as an ISC optimization reversion.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.SwitchReusedEntryRewriter
Bases: OptimizationPass
For each switch-case construct (identified by jump tables), rewrite the entry into a goto block when we detect situations where an entry node is reused by edges in switch-case constructs that are not the current one. This code reuse is usually caused by compiler code deduplication.
- ARCHES = None
- PLATFORMS = None
- STAGE = 0
- NAME = 'Rewrite switch-case entry nodes with multiple predecessors into goto statements.'
- DESCRIPTION = 'For each switch-case construct (identified by jump tables), rewrite the entry into a goto block when we detect\n situations where an entry node is reused by edges in switch-case constructs that are not the current one. This code\n reuse is usually caused by compiler code deduplication.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.TagSlicer
Bases: OptimizationPass
Removes unmarked statements from the graph.
- ARCHES = None
- PLATFORMS = None
- STAGE = 8
- NAME = 'Remove unmarked statements from the graph.'
- DESCRIPTION = 'Removes unmarked statements from the graph.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.WinStackCanarySimplifier
Bases: OptimizationPass
Removes stack canary checks from decompilation results for Windows PE files.
We need to run this pass before performing any full-function simplification. Otherwise the effects of _security_cookie will be propagated.
- ARCHES = ['X86', 'AMD64']
- PLATFORMS = ['windows']
- STAGE = 2
- NAME = 'Simplify stack canaries in Windows PE files'
- DESCRIPTION = 'Removes stack canary checks from decompilation results for Windows PE files.\n\n we need to run this pass before performing any full-function simplification. Otherwise the effects of\n _security_cookie will be propagated.'
- __init__(*args, **kwargs)
- class angr.analyses.decompiler.optimization_passes.X86GccGetPcSimplifier
Bases: OptimizationPass
Simplifies __x86.get_pc_thunk calls.
- ARCHES = ['X86']
- PLATFORMS = ['linux']
- STAGE = 1
- NAME = 'Simplify getpc()'
- DESCRIPTION = 'Simplifies __x86.get_pc_thunk calls.'
- __init__(*args, **kwargs)
- angr.analyses.decompiler.optimization_passes.get_optimization_passes(arch, platform)
- angr.analyses.decompiler.optimization_passes.register_optimization_pass(opt_pass, *, presets=None)
- Parameters:
presets (list[str | DecompilationPreset] | None)
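The ARCHES and PLATFORMS attributes listed for each pass suggest how get_optimization_passes(arch, platform) might narrow down the candidates. The sketch below is a stand-alone approximation, not the library's code: the classes are local stand-ins carrying values copied from this page, `select_passes` is a hypothetical name, and treating `None` as "no restriction" is an assumption.

```python
class PassStub:
    """Stand-in base class; None is assumed to mean 'no restriction'."""
    ARCHES = None
    PLATFORMS = None


class StackCanarySimplifier(PassStub):
    ARCHES = ["X86", "AMD64"]
    PLATFORMS = ["cgc", "linux"]


class WinStackCanarySimplifier(PassStub):
    ARCHES = ["X86", "AMD64"]
    PLATFORMS = ["windows"]


class MipsGpSettingSimplifier(PassStub):
    ARCHES = ["MIPS32", "MIPS64"]
    PLATFORMS = ["linux"]


class DeadblockRemover(PassStub):
    pass  # ARCHES = None, PLATFORMS = None


ALL_PASSES = [
    StackCanarySimplifier,
    WinStackCanarySimplifier,
    MipsGpSettingSimplifier,
    DeadblockRemover,
]


def select_passes(arch, platform):
    """Keep the passes whose ARCHES and PLATFORMS admit the given target."""
    return [
        p
        for p in ALL_PASSES
        if (p.ARCHES is None or arch in p.ARCHES)
        and (p.PLATFORMS is None or platform in p.PLATFORMS)
    ]


linux_amd64 = [p.__name__ for p in select_passes("AMD64", "linux")]
windows_x86 = [p.__name__ for p in select_passes("X86", "windows")]
```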
Submodules