run() function with however many arguments you like, and the SimProcedure runtime will automatically extract those arguments from the program state for you, via a calling convention, and call your run function with them. Similarly, when you return a value from the run function, it is placed into the state (again, according to the calling convention), and the actual control-flow action of returning from a function is performed, which, depending on the architecture, may involve jumping to the link register or jumping to the result of a stack pop.
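The argument plumbing described above can be illustrated with a toy model in plain Python. This is not angr code: `ToyProcedure`, the register names, and the dict-and-list "state" are all hypothetical stand-ins, chosen only to show how a runtime could count the parameters of `run()`, fetch that many values according to a convention, and store the result.

```python
import inspect

# Hypothetical calling convention: arguments arrive in these registers,
# and the return value goes into "rax".
ARG_REGS = ["rdi", "rsi", "rdx", "rcx"]
RET_REG = "rax"


class ToyProcedure:
    """Stand-in for a SimProcedure; only the argument plumbing is modeled."""

    def run(self, a, b):
        return a + b

    def execute(self, regs, stack):
        # Count the parameters run() declares (bound method, so self is excluded).
        nargs = len(inspect.signature(self.run).parameters)
        # Pull that many arguments out of the toy state.
        args = [regs[name] for name in ARG_REGS[:nargs]]
        # Store the return value where the convention expects it.
        regs[RET_REG] = self.run(*args)
        # Model the return itself as a jump to an address popped off the stack.
        return stack.pop()


regs = {"rdi": 2, "rsi": 40, "rdx": 0, "rcx": 0, "rax": 0}
stack = [0x401000]
next_addr = ToyProcedure().execute(regs, stack)
```

On a link-register architecture the last step would read the return address from a register instead of popping it from the stack; the rest of the sketch would stay the same.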
Project class, the dict project._sim_procedures is a mapping from address to SimProcedure instances. When the execution pipeline reaches an address that is present in that dict, that is, an address that is hooked, it will execute project._sim_procedures[address].execute(state). This will consult the calling convention to extract the arguments, make a copy of itself in order to preserve thread safety, and run the run() method. It is important to produce a new instance of the SimProcedure each time it is run, since running a SimProcedure necessarily involves mutating state on the SimProcedure instance, so we need a separate one for each step, lest we run into race conditions in multithreaded environments.
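As a rough illustration of the dispatch and copy-per-run behavior just described, here is a toy model in plain Python. It is a sketch only: `hooks`, `step`, and `ToyProcedure` are hypothetical names, and a dict stands in for the program state.

```python
import copy


class ToyProcedure:
    """Stand-in for a SimProcedure instance stored in the hook table."""

    def __init__(self):
        self.state = None

    def execute(self, state):
        # Work on a copy so the instance in the hook table is never mutated.
        inst = copy.copy(self)
        inst.state = state
        inst.result = inst.run()
        return inst

    def run(self):
        return self.state["arg"] * 2


# Toy equivalent of the address -> procedure mapping.
hooks = {0x400800: ToyProcedure()}


def step(addr, state):
    # A hooked address runs the procedure; anything else is not modeled here.
    if addr in hooks:
        return hooks[addr].execute(state)
    return None


inst = step(0x400800, {"arg": 21})
```

After the call, the instance registered in `hooks` still has `state` set to `None`; all per-run mutation happened on the copy.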
run() method. Pretty cool!
run() function, they came out as a weird <SAO <BV64 0xSTUFF>> class. This is a SimActionObject. Basically, you don't need to worry about it too much; it's just a thin wrapper over a normal bitvector. It does a bit of tracking of what exactly you do with it inside the SimProcedure, which is helpful for static analysis.
0 from the procedure. This will automatically be promoted to a word-sized bitvector! You can return a native number, a bitvector, or a SimActionObject.
cc = project.factory.cc_from_arg_kinds((True, True), ret_fp=True) and project.hook(address, ProcedureClass(cc=cc)). This method for passing in a calling convention works for all calling conventions, so if angr's autodetected one isn't right, you can fix that.
run(). This is actually shorthand for calling self.ret(value). self.ret() is the function which knows how to perform the specific action of returning from a function.
ret(expr): Return from a function
jump(addr): Jump to an address in the binary
exit(code): Terminate the program
call(addr, args, continue_at): Call a function in the binary
inline_call(procedure, *args): Call another SimProcedure in-line and return the results
self.successors.add_successor(state, addr, guard, jumpkind). All of these parameters should have an obvious meaning if you've followed along so far. Keep in mind that the state you pass in will NOT be copied and WILL be mutated, so be sure to make a copy beforehand if there will be more work to do!
self.call(addr, args, continue_at), addr is expected to be the address you'd like to call, args is the tuple of arguments you'd like to call it with, and continue_at is the name of another method in your SimProcedure class that you'd like execution to continue at when it returns. This method must have the same signature as the run() method. Furthermore, you can pass the keyword argument cc as the calling convention that ought to be used to communicate with the callee.
continue_at function instead of run(), with the same args and kwargs as the first time.
IS_FUNCTION = True
local_vars to a tuple of strings, where each string is the name of an instance variable on your SimProcedure whose value you would like to persist when you return.
state.callstack has an entry called .procedure_data which is used by the SimProcedure runtime to store information local to the current call frame. angr tracks the stack pointer in order to make the current top of the state.callstack a meaningful local data store. It's stuff that ought to be stored in memory in a stack frame, but the data can't be serialized and/or memory allocation is hard.
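The interplay between continuations and persisted local variables can be sketched with a toy driver in plain Python. Everything here is hypothetical and greatly simplified: `drive` stands in for the runtime, the `frame` dict stands in for the per-frame data store, and the callee is only pretended to run. The point is that every entry into the procedure gets a fresh instance, so only the names listed in `local_vars` survive from one entry to the next.

```python
class ToyProcedure:
    local_vars = ()

    def __init__(self):
        self.pending = None

    def call(self, addr, args, continue_at):
        # Record the request; the toy driver below acts on it.
        self.pending = (addr, args, continue_at)


class RunInitializers(ToyProcedure):
    local_vars = ("initializers",)

    def run(self):
        self.initializers = [0x1000, 0x2000, 0x3000]
        self.run_initializer()

    def run_initializer(self):
        if not self.initializers:
            return  # nothing left to call
        addr = self.initializers[0]
        # Rebind instead of mutating the list in place.
        self.initializers = self.initializers[1:]
        self.call(addr, (), "run_initializer")


def drive(proc_cls, *args):
    called = []
    frame = {}  # toy per-call-frame store for persisted local variables
    method = "run"
    while method is not None:
        inst = proc_cls()  # fresh instance on every entry
        for name in proc_cls.local_vars:
            if name in frame:
                setattr(inst, name, frame[name])  # restore persisted values
        getattr(inst, method)(*args)  # same args on every entry
        for name in proc_cls.local_vars:
            frame[name] = getattr(inst, name)  # persist for the next entry
        if inst.pending is None:
            method = None
        else:
            addr, _call_args, method = inst.pending
            called.append(addr)  # pretend the callee ran and returned
    return called


order = drive(RunInitializers)
```

If `initializers` were not listed in `local_vars`, the second entry would start from a fresh instance with no such attribute and the sketch would fail, which is the situation `local_vars` exists to prevent.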
full_init_state for a Linux program:
run_initializer function again. When we run out of initializers, we set up the entry state and jump to the program entry point.
state.globals. This is a dictionary that just gets shallow-copied from state to successor state. Because it's only a shallow copy, its members are the same instances, so the same rules as local variables in SimProcedure continuations apply. You need to be careful not to mutate any item that is used as a global variable unless you know exactly what you're doing.
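The shallow-copy hazard is ordinary Python behavior and can be shown without angr. In this sketch the two dicts are hypothetical stand-ins for the globals of a state and of its successor; the key names are arbitrary.

```python
import copy

parent_globals = {"counter": 0, "seen": []}
# A shallow copy: new dict, same member objects.
child_globals = copy.copy(parent_globals)

# Rebinding a key only affects the child.
child_globals["counter"] = 1

# Mutating a shared member in place leaks into the parent.
child_globals["seen"].append(0x400800)

# Building a new object and rebinding keeps the parent untouched from here on.
child_globals["seen"] = child_globals["seen"] + [0x400900]
```

The in-place `append` is the kind of mutation to avoid; building a new list and assigning it back is the safe pattern.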
IS_FUNCTION, which allows you to use the SimProcedure continuation. There are a few more class variables you can set, though these have no direct benefit to you; they merely mark attributes of your function so that static analysis knows what it's doing.
NO_RET: Set this to true if control flow will never return from this function
ADDS_EXITS: Set this to true if you do any control flow other than returning
ADDS_EXITS, you may also want to define the method
static_exits(). This function takes a single parameter, a list of IRSBs that would be executed in the run-up to your function, and asks you to return a list of all the exits that you know would be produced by your function in that case. The return value is expected to be a list of tuples of (address (int), jumpkind (str)). This is meant to be a quick, best-effort analysis, and you shouldn't try to do anything crazy or intensive to get your answer.
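To make the expected return shape concrete, here is a minimal sketch in plain Python. `ToyJumper` is a hypothetical class that does not inherit from anything in angr, and the target address is made up; only the list-of-tuples shape follows the description above.

```python
class ToyJumper:
    """Hypothetical procedure that always jumps to one fixed address."""

    ADDS_EXITS = True
    TARGET = 0x401000  # made-up address, for illustration only

    def static_exits(self, blocks):
        # Best-effort answer: one exit, expressed as (address, jumpkind).
        # The blocks argument is ignored because the target is fixed.
        return [(self.TARGET, "Ijk_Boring")]


exits = ToyJumper().static_exits([])
```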
Ijk_NoHook jumpkind allows this to happen.
state.scratch.jumpkind set. The IP is the target instruction pointer, the guard is a symbolic boolean representing a constraint to add to the state related to it being taken as opposed to the others, and the jumpkind is a VEX enum string, like Ijk_Boring, representing the nature of the branch.
Project.hook_symbol API to hook the address referred to by a symbol!
rand() with a function that always returns a consistent sequence of values:
rand(), it'll return the integers from the return_values array in a loop.
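The looping behavior can be modeled in plain Python as follows. This is a sketch of the cycling logic only, not the hook itself: `ToyNotVeryRand` is a hypothetical class, the `state_globals` dict stands in for per-state storage, the `rand_idx` key is an assumed name, and the values are arbitrary.

```python
class ToyNotVeryRand:
    """Pure-Python stand-in for a rand() replacement with a fixed sequence."""

    def __init__(self, return_values):
        self.return_values = return_values

    def run(self, state_globals):
        # The position is kept in the per-state dict, not on the instance,
        # so separate states advance through the sequence independently.
        idx = state_globals.get("rand_idx", 0) % len(self.return_values)
        state_globals["rand_idx"] = idx + 1
        return self.return_values[idx]


hook = ToyNotVeryRand([413, 612, 1025, 1111])
state_globals = {}
outputs = [hook.run(state_globals) for _ in range(6)]
```

Six calls on the same state walk through the four values and then wrap around to the start.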