Symbolic Execution¶
Symbolic execution is a program analysis technique used to explore multiple execution paths of a program simultaneously. Unlike normal execution, which runs the program with specific inputs, symbolic execution treats inputs as symbolic variables rather than concrete values. This means the execution can represent a wide range of inputs with symbolic expressions. Symbolic execution allows, at a time in emulation, to determine for a branch all conditions necessary to take a branch or not. Every variable is represented as a symbolic value, and each branch as a constraint. Thus, symbolic execution allows us to see which conditions allow the program to go from point A to point B by resolving these constraints. The execution paths are then analyzed by solving constraints generated by these symbolic expressions, allowing the discovery of bugs and vulnerabilities that might be missed in standard testing.
Example:¶
Consider the following simple program :
const char* check_value(int x) {
if (x > 10) {
return "Greater";
} else {
return "Lesser or Equal";
}
}
In normal execution, if x
is set to 5, the program will follow the path
where x <= 10
and return “Lesser or Equal”. In symbolic execution, x
is treated as a symbolic variable, X
. The execution engine explores both
paths:
Path 1:
X > 10
leading to the result “Greater”Path 2:
X <= 10
leading to the result “Lesser or Equal”
Constraints for both paths are generated and solved to understand all possible behaviors of the program.
In software verification, it helps ensure that the code behaves as expected across all possible inputs and states. For security analysis, symbolic execution can uncover vulnerabilities such as input validation errors, which could be exploited by attackers. Additionally, in automated testing, it aids in generating comprehensive test cases that cover edge cases and rare execution paths, enhancing the robustness and security of software systems. Overall, symbolic execution provides a powerful means to rigorously analyze and improve software and firmware reliability.
Basic Execution¶
Now let’s see an example use case of symbolic execution with angr. Consider the following example code,
void helloWorld() {
printf("Hello, World!\n");
}
void firstCall(uint32_t num) {
if (num > 50 && num <100)
HelloWorld();
}
The firstCall
function will accept a 32-bit number as an input and will call
helloWorld
function if the number is between 50 and 100.
You can perform a symbolic execution to find a correct and valid input to reach
the final helloWorld
function call with angr using the following sample code.
import angr, claripy
# Load the binary
project = angr.Project('./3func', auto_load_libs=False)
# Define the address of the firstCall function
firstCall_addr = project.loader.main_object.get_symbol("firstCall")
# Define the address of the helloWorld function
helloWorld_addr = project.loader.main_object.get_symbol("helloWorld")
# Create a symbolic variable for the firstCall arg
input_arg = claripy.BVS('input_arg', 32)
# Create a blank state at the address of the firstCall function
init_state = project.factory.blank_state(addr=firstCall_addr.rebased_addr)
# Assuming the calling convention passes the argument in a register
# (e.g., x86 uses edi for the argument)
init_state.regs.edi = input_arg
# Create a simulation manager
simgr = project.factory.simulation_manager(init_state)
# Explore the binary, looking for the address of helloWorld
simgr.explore(find=helloWorld_addr.rebased_addr)
# Check if we found a state that reached the target
if simgr.found:
input_value = simgr.found[0].solver.eval(input_arg)
print(f"Value of input_arg that reaches HelloWorld: {input_value}")
# Get the constraints for reaching the helloWorld function
constraints = simgr.found[0].solver.constraints
# Create a solver with the constraints
solver = claripy.Solver()
solver.add(constraints)
min_val = solver.min(input_arg)
max_val = solver.max(input_arg)
print(f"Function arg: min = {min_val}, max = {max_val}")
else:
print("Did not find a state that reaches HelloWorld.")
It will produce the output like the below with a valid example function arg that can reach the
function helloWorld
that you can use as a test case.
Value of input_arg that reaches HelloWorld: 71
Function arg: min = 51, max = 99