procedures
directory corresponds to some sort of standard, or a body that specifies the interface part of an API and its semantics. We call each folder a catalog of procedures. For example, we have libc
which contains the functions defined by the C standard library, and a separate folder posix
which contains the functions defined by the POSIX standard. There is some magic which automatically scrapes these folders in the procedures
directory and organizes them into the angr.SIM_PROCEDURES
dict. For example, angr/procedures/libc/printf.py
contains both class printf
and class __printf_chk
, so there exists both angr.SIM_PROCEDURES['libc']['printf']
and angr.SIM_PROCEDURES['libc']['__printf_chk']
.SimLibraries
which represent an actual shared library file, its functions, and their metadata. Take a look at the API reference for SimLibrary along with the code for setting up glibc to learn how to use it.procedures/definitions
. Files in here should contain an instance, not a subclass, of SimLibrary
. The same magic that scrapes up SimProcedures will also scrape up SimLibraries and put them in angr.SIM_LIBRARIES
, keyed on each of their common names. For example, angr/procedures/definitions/linux_loader.py
contains lib = SimLibrary(); lib.set_library_names('ld.so', 'ld-linux.so', 'ld.so.2', 'ld-linux.so.2', 'ld-linux-x86_64.so.2')
, so you can access it via angr.SIM_LIBRARIES['ld.so']
or angr.SIM_LIBRARIES['ld-linux.so']
or any of the other names.SIM_LIBRARIES
and their procedures (or stubs!) are hooked into the project's address space to summarize any functions it can. The code for this process is found here.angr.SIM_LIBRARIES[libname].add(name, proc_cls)
to do the registration.angr.Project
. Note also that adding the procedure to angr.SIM_PROCEDURES
, i.e. adding it directly to a catalog, will not work, since these catalogs are used to construct the SimLibraries only at import and are used by value, not by reference.hook_symbol
.Ijk_Sys
. This will cause the next step to be handled by the SimOS associated with the project, which will extract the syscall number from the state and query a specialized SimLibrary with that.linux_kernel
and adds them to the library, then adds several syscall number mappings, including separate mappings for mips-o32
, mips-n32
, and mips-n64
.SimUserland
, itself a SimOS subclass. This requires the class to call SimUserland's constructor with a super() call that includes the syscall_library
keyword argument, specifying the specific SimSyscallLibrary that contains the appropriate procedures and mappings for the operating system. Additionally, the class's configure_project
must perform a super() call including the abi_list
keyword argument, which contains the list of ABIs that are valid for the current architecture. If the ABI for the syscall can't be determined from the syscall number alone, for example because amd64 Linux programs can use either int 0x80
or syscall
to invoke a syscall and these two ABIs use overlapping numbers, the SimOS can override syscall_abi()
, which takes a SimState and returns the name of the current syscall ABI. This is determined for int80/syscall by examining the most recent jumpkind, since libVEX will produce different syscall jumpkinds for the different instructions.angr.SYSCALL_CC
be a map of maps {arch_name: {os_name: cc_cls}}
, where os_name
is the value of project.simos.name, and each of the calling convention classes must include an extra method called syscall_number
which takes a state and returns the current syscall number. Look at the bottom of calling_conventions.py
to learn more about it. Not very object-oriented at all...project.simos.is_syscall_addr(addr)
and the syscall corresponding to the address can be retrieved with project.simos.syscall_from_addr(addr)
.angr/procedures/definitions
. These libraries don't have to specify any common name, but they can if they'd like to show up in SIM_LIBRARIES
for easy access.angr/procedures/linux_kernel
. As long as the class name matches one of the names in the number-to-name mapping of the SimLibrary (all the linux syscall numbers are included with recent releases of angr), it will be used.simos
directory, but this is not a magic directory like procedures
. Instead, you should add a line to angr/simos/__init__.py
calling register_simos()
with the OS name as it appears in project.loader.main_object.os
and the SimOS class. Your class should do everything described above.angr.SIM_LIBRARIES
. If you're doing this for Linux, you want angr.SIM_LIBRARIES['linux'].add(name, proc_cls)
.register_simos
method is just sitting there waiting for you as angr.simos.register_simos(name, simos_cls)
.project.simos.syscall_library
to manipulate an individual project's syscalls.Project
constructor via the simos
keyword argument, so you can specify the SimOS for a project explicitly if you like.auto_load_libs=False
, or perhaps some dependency is simply missing), things get tricky. It is not possible to guess in most cases what the value should be, or even what its size should be, so if the guest program ever dereferences a pointer to such a symbol, emulation will go off the rails.
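Returning to the earlier note that catalogs are used "by value, not by reference" when the SimLibraries are constructed: the toy model below may help illustrate why registering a procedure on the library works while adding it to a catalog afterwards does not. It is only a sketch. ToyLibrary, my_func, and other_func are hypothetical names, and copying with dict() is an assumption about what "by value" amounts to, not angr's actual implementation.

```python
# Toy model only: hypothetical stand-ins, not angr's real classes or data.

class ToyLibrary:
    """Stand-in for a SimLibrary: keeps its own name -> procedure mapping."""

    def __init__(self, catalog):
        # Assumption: the catalog is copied once, when the library is built.
        self.procedures = dict(catalog)

    def add(self, name, proc_cls):
        # Registering on the library itself updates what lookups will see.
        self.procedures[name] = proc_cls

    def has(self, name):
        return name in self.procedures


class printf: ...
class my_func: ...
class other_func: ...


catalog = {"printf": printf}    # plays the role of one SIM_PROCEDURES catalog
library = ToyLibrary(catalog)   # plays the role of one SIM_LIBRARIES entry

catalog["my_func"] = my_func            # too late: the copy was already taken
library.add("other_func", other_func)   # visible to the library

print(library.has("my_func"))
print(library.has("other_func"))
```

In this model the late catalog addition is invisible to the library, which mirrors the advice above to register through the library's add method rather than through the catalog.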