Define your run()
function with however many arguments you like, and the SimProcedure runtime will automatically extract those arguments from the program state for you, via a calling convention, and call your run function with them. Similarly, when you return a value from the run function, it is placed into the state (again, according to the calling convention), and the actual control-flow action of returning from a function is performed, which, depending on the architecture, may involve jumping to the link register or jumping to the result of a stack pop.

In the Project
class, the dict project._sim_procedures
is a mapping from address to SimProcedure
instances. When the execution pipeline reaches an address that is present in that dict, that is, an address that is hooked, it will execute project._sim_procedures[address].execute(state)
. This will consult the calling convention to extract the arguments, make a copy of itself in order to preserve thread safety, and run the run()
method. It is important to produce a new instance of the SimProcedure each time it is run, since running a SimProcedure necessarily involves mutating state on the SimProcedure instance; we need a separate instance for each step, lest we run into race conditions in multithreaded environments.

run()
method. Pretty cool!

run()
function, they came out as a weird <SAO <BV64 0xSTUFF>>
class. This is a SimActionObject
. Basically, you don't need to worry about it too much; it's just a thin wrapper over a normal bitvector. It does a bit of tracking of what exactly you do with it inside the SimProcedure, which is helpful for static analysis.

0
from the procedure. This will automatically be promoted to a word-sized bitvector! You can return a native number, a bitvector, or a SimActionObject.

cc = project.factory.cc_from_arg_kinds((True, True), ret_fp=True)
and project.hook(address, ProcedureClass(cc=cc))
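
To see why the argument kinds matter, here is a toy model in plain Python. It is not angr code: the register tables follow the x86-64 System V convention, and extract_args and return_register are hypothetical helpers invented for this sketch. It assumes that each entry in the arg-kinds tuple marks whether the corresponding argument is floating point.

```python
# Toy model only: these names are illustrative and are not part of angr.
# Register order follows the x86-64 System V calling convention.
INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
FP_ARG_REGS = ["xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"]


def extract_args(regs, arg_is_fp):
    """Read each argument from the next unused integer or floating-point register."""
    int_regs = iter(INT_ARG_REGS)
    fp_regs = iter(FP_ARG_REGS)
    return [regs[next(fp_regs if is_fp else int_regs)] for is_fp in arg_is_fp]


def return_register(ret_fp):
    """Name the register that holds the return value."""
    return "xmm0" if ret_fp else "rax"


# A made-up register file at the moment the hooked function is entered.
regs = {"rdi": 7, "rsi": 9, "xmm0": 1.5, "xmm1": 2.5}

as_ints = extract_args(regs, (False, False))   # treats both arguments as integers
as_floats = extract_args(regs, (True, True))   # treats both arguments as floats
print(as_ints, as_floats, return_register(True))
```

The same register file yields different arguments depending on the kinds supplied, which is why a procedure taking floating-point arguments needs the right calling convention. In angr itself none of this is written by hand; the cc object passed to the hook carries this information.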
This method for passing in a calling convention works for all calling conventions, so if angr's autodetected one isn't right, you can fix that.

run()
. This is actually shorthand for calling self.ret(value)
. self.ret()
is the function which knows how to perform the specific action of returning from a function.

ret(expr)
: Return from a function
jump(addr)
: Jump to an address in the binary
exit(code)
: Terminate the program
call(addr, args, continue_at)
: Call a function in the binary
inline_call(procedure, *args)
: Call another SimProcedure in-line and return the results

self.successors.add_successor(state, addr, guard, jumpkind)
. All of these parameters should have an obvious meaning if you've followed along so far. Keep in mind that the state you pass in will NOT be copied and WILL be mutated, so be sure to make a copy beforehand if there will be more work to do!

When you use self.call(addr, args, continue_at)
, addr
is expected to be the address you'd like to call, args
is the tuple of arguments you'd like to call it with, and continue_at
is the name of another method in your SimProcedure class that you'd like execution to continue at when it returns. This method must have the same signature as the run()
method. Furthermore, you can pass the keyword argument cc
as the calling convention that ought to be used to communicate with the callee.

On return, execution enters the continue_at
function instead of run()
, with the same args and kwargs as the first time.

Set the class variable IS_FUNCTION = True
Set the class variable local_vars
to a tuple of strings, where each string is the name of an instance variable on your SimProcedure whose value you would like to persist when you return. Local variables can be any type so long as you don't mutate their instances.

state.callstack
has an entry called .procedure_data
which is used by the SimProcedure runtime to store information local to the current call frame. angr tracks the stack pointer in order to make the current top of the state.callstack
a meaningful local data store. It's stuff that ought to be stored in memory in a stack frame, but the data can't be serialized and/or memory allocation is hard.

full_init_state
for a Linux program:

run_initializer
function again. When we run out of initializers, we set up the entry state and jump to the program entry point.

state.globals
. This is a dictionary that just gets shallow-copied from state to successor state. Because it's only a shallow copy, its members are the same instances, so the same rules as local variables in SimProcedure continuations apply. You need to be careful not to mutate any item that is used as a global variable unless you know exactly what you're doing.

IS_FUNCTION
, which allows you to use the SimProcedure continuation. There are a few more class variables you can set, though these have no direct benefit to you; they merely mark attributes of your function so that static analysis knows what it's doing.

NO_RET
: Set this to true if control flow will never return from this function
ADDS_EXITS
: Set this to true if you do any control flow other than returning
IS_SYSCALL
: Self-explanatory

If you set ADDS_EXITS
, you may also want to define the method static_exits()
. This function takes a single parameter, a list of IRSBs that would be executed in the run-up to your function, and asks you to return a list of all the exits that you know would be produced by your function in that case. The return value is expected to be a list of tuples of (address (int), jumpkind (str)). This is meant to be a quick, best-effort analysis, and you shouldn't try to do anything crazy or intensive to get your answer.

Ijk_NoHook
jumpkind allows this to happen.

state.regs.ip
, state.scratch.guard
, and state.scratch.jumpkind
set. The IP is the target instruction pointer, the guard is a symbolic boolean representing a constraint to add to the state related to it being taken as opposed to the others, and the jumpkind is a VEX enum string, like Ijk_Boring
, representing the nature of the branch.

Use the Project.hook_symbol
API to hook the address referred to by a symbol!

For example, you can replace rand()
with a function that always returns a consistent sequence of values:

Whenever the program calls rand()
, it'll return the integers from the return_values
array in a loop.
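
The looping behaviour can be sketched without angr at all. In the toy model below, a plain dict stands in for state.globals, and not_very_rand is a hypothetical stand-in for the body of such a procedure's run() method; neither name is angr API. The sample values are arbitrary.

```python
import copy


def not_very_rand(state_globals, return_values):
    """Return the next value from return_values, wrapping around at the end."""
    # The position in the sequence lives in the globals dict, not on the procedure.
    rand_idx = state_globals.get("rand_idx", 0) % len(return_values)
    out = return_values[rand_idx]
    # Store a new int rather than mutating a shared object in place.
    state_globals["rand_idx"] = rand_idx + 1
    return out


state_globals = {}  # stand-in for state.globals
return_values = [413, 612, 1025, 1111]

sequence = [not_very_rand(state_globals, return_values) for _ in range(6)]
print(sequence)  # [413, 612, 1025, 1111, 413, 612]

# A successor state receives a shallow copy of the globals dict.
successor_globals = copy.copy(state_globals)
next_value = not_very_rand(successor_globals, return_values)
print(next_value, state_globals["rand_idx"], successor_globals["rand_idx"])
```

Because the stored index is an immutable int that gets replaced rather than mutated, advancing the sequence in a successor leaves the parent's copy untouched, which fits the shallow-copy rule for globals.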