# purelibc
A glibc overlay library for process self-virtualization

This is PURE\_LIBC: an overlay library for glibc that allows system call capturing.

(C) 2006,2008 Renzo Davoli University of Bologna (ITALY)
(C) 2006 Andrea Gasparini University of Bologna (ITALY)
 
This is LIBRE software: this work has been released under the LGPLv2.1+
license (see the file COPYING and the header note in the source files).

Pure\_libc converts glibc from a libc+system interfacing library into a 
libc-only library.
With purelibc, a process can trace the system calls it generates itself.
Pure\_libc is not complete yet. Stdio has been implemented on top of the
fopencookie call. 
Due to current limitations of fopencookie, freopen may not work
properly when reopening files other than std{in,out,err}.
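
As an illustration of the mechanism mentioned above, here is a minimal sketch of how `fopencookie` lets a stdio stream route its output through a user-supplied hook instead of writing to a file descriptor directly. This is not purelibc's code: the `struct capture` type and the `capture_write` and `capture_open` names are hypothetical and exist only for this example.

```C
#define _GNU_SOURCE           /* fopencookie is a GNU extension */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical cookie: an in-memory buffer standing in for a traced stream. */
struct capture {
	char data[256];
	size_t len;
	int write_calls;      /* how many times stdio invoked the write hook */
};

/* Write hook: stdio calls this function when it flushes the stream. */
static ssize_t capture_write(void *cookie, const char *buf, size_t size) {
	struct capture *cap = cookie;
	size_t room = sizeof(cap->data) - 1 - cap->len;
	size_t n = size < room ? size : room;
	memcpy(cap->data + cap->len, buf, n);
	cap->len += n;
	cap->data[cap->len] = '\0';
	cap->write_calls++;
	return (ssize_t) n;
}

/* Open a write-only stream whose output ends up in *cap. */
static FILE *capture_open(struct capture *cap) {
	cookie_io_functions_t hooks = {
		.read = NULL,
		.write = capture_write,
		.seek = NULL,
		.close = NULL,
	};
	memset(cap, 0, sizeof(*cap));
	return fopencookie(cap, "w", hooks);
}
```

After `fprintf(f, ...)` and `fflush(f)` on a stream returned by `capture_open`, the formatted text is found in `cap.data`. A tracing library can use the same hooks to forward the data to a function of its choice.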

This function:
```C
sfun _pure_start(sfun pure_syscall,int flags);
```
starts syscall tracing.
All the system calls of the program are converted into calls to the
`pure_syscall` function.
`pure_socketcall` is meaningful only on architectures where 
all the Berkeley socket calls are sent to the kernel through one shared
system call (`__NR_socketcall`).
If `pure_socketcall` is not NULL, purelibc calls it for each 
Berkeley socket call.
If `pure_socketcall` is NULL and `__NR_socketcall` is defined, purelibc calls 
```C
	pure_syscall(__NR_socketcall,socketcall_id,argv)
```
(purelibc mimics the same call received by the kernel).
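
A handler that receives such a multiplexed call may want to decode it. The following is a minimal sketch, assuming the kernel convention of socketcall(2): the identifier is one of the `SYS_*` constants from `<linux/net.h>` and `argv` points to an array of `unsigned long` arguments. The names `socketcall_info`, `socketcall_lookup` and `socketcall_describe` are hypothetical helpers, not part of purelibc.

```C
#include <stdio.h>
#include <string.h>
#include <linux/net.h>   /* SYS_SOCKET, SYS_BIND, ... socketcall identifiers */

struct socketcall_info {
	const char *name;    /* name of the Berkeley socket call */
	int nargs;           /* number of entries used in the argument vector */
};

/* Table indexed by socketcall identifier; index 0 is unused. */
static const struct socketcall_info socketcall_table[] = {
	[SYS_SOCKET]      = {"socket", 3},      [SYS_BIND]        = {"bind", 3},
	[SYS_CONNECT]     = {"connect", 3},     [SYS_LISTEN]      = {"listen", 2},
	[SYS_ACCEPT]      = {"accept", 3},      [SYS_GETSOCKNAME] = {"getsockname", 3},
	[SYS_GETPEERNAME] = {"getpeername", 3}, [SYS_SOCKETPAIR]  = {"socketpair", 4},
	[SYS_SEND]        = {"send", 4},        [SYS_RECV]        = {"recv", 4},
	[SYS_SENDTO]      = {"sendto", 6},      [SYS_RECVFROM]    = {"recvfrom", 6},
	[SYS_SHUTDOWN]    = {"shutdown", 2},    [SYS_SETSOCKOPT]  = {"setsockopt", 5},
	[SYS_GETSOCKOPT]  = {"getsockopt", 5},  [SYS_SENDMSG]     = {"sendmsg", 3},
	[SYS_RECVMSG]     = {"recvmsg", 3},
};

/* Return the table entry for id, or NULL if this sketch does not list it. */
static const struct socketcall_info *socketcall_lookup(long id) {
	size_t n = sizeof(socketcall_table) / sizeof(socketcall_table[0]);
	if (id <= 0 || (size_t) id >= n || socketcall_table[id].name == NULL)
		return NULL;
	return &socketcall_table[id];
}

/* Format "name(a, b, c)" into out (outlen > 0); returns -1 for an unknown id.
 * Pointer arguments are printed as plain numbers. */
static int socketcall_describe(char *out, size_t outlen, long id,
		const unsigned long *args) {
	const struct socketcall_info *info = socketcall_lookup(id);
	size_t used;
	int i;
	if (info == NULL)
		return -1;
	used = snprintf(out, outlen, "%s(", info->name);
	for (i = 0; i < info->nargs && used < outlen; i++)
		used += snprintf(out + used, outlen - used, "%s%lu",
				i ? ", " : "", args[i]);
	if (used < outlen)
		snprintf(out + used, outlen - used, ")");
	return 0;
}
```

For example, describing `SYS_SOCKET` with the argument vector `{2, 1, 0}` yields the string `socket(2, 1, 0)`.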

### FLAGS
* `PUREFLAG_STDIN`, `PUREFLAG_STDOUT`, `PUREFLAG_STDERR`: 
The standard streams are opened by libc before purelibc starts.
Without these flags, stdio calls on the standard streams
(e.g. getchar, printf) are not traced.
These flags force `_pure_start` to reopen the stdio standard streams so that
the calls on them are traced.
`PUREFLAG_STDALL` is a shortcut for 
`(PUREFLAG_STDIN|PUREFLAG_STDOUT|PUREFLAG_STDERR)`.

### RETURN VALUE
`_pure_start` returns a pointer to the original libc syscall function.
This pointer must be stored in a global variable and must be used to 
bypass purelibc and send a system call to the kernel.

WARNING: the libc `syscall(2)` call itself is diverted to the `pure_syscall`
function, too.
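
The pattern described above (keep the returned pointer in a global and use it for every real system call made inside the handler) can be sketched as follows. So that the sketch builds without purelibc, it makes two assumptions: `sfun_t` is a local typedef with the same shape as the `sfun` type used in this README, and glibc's own `syscall` stands in for the pointer that `_pure_start` would return. The `counting_handler` name and the `traced_calls` counter are hypothetical.

```C
#define _GNU_SOURCE
#include <stdarg.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: local stand-in for the sfun type, so purelibc.h is not needed. */
typedef long int (*sfun_t)(long int sysno, ...);

/* Assumption: in a real program this global would hold the value returned
 * by _pure_start; here glibc's syscall() takes its place. */
static sfun_t native_syscall = syscall;

static unsigned long traced_calls;

/* Count the call, then forward it through the saved pointer.
 * Under purelibc, calling syscall() here instead would be diverted
 * back to the handler. */
static long int counting_handler(long int sysno, ...) {
	va_list ap;
	long int a[6];
	int i;
	/* Like the examples in this README, always fetch six arguments;
	 * this relies on the calling convention tolerating callers that
	 * pass fewer. */
	va_start(ap, sysno);
	for (i = 0; i < 6; i++)
		a[i] = va_arg(ap, long int);
	va_end(ap);
	traced_calls++;
	return native_syscall(sysno, a[0], a[1], a[2], a[3], a[4], a[5]);
}
```

Calling `counting_handler(SYS_getpid, 0L, 0L, 0L, 0L, 0L, 0L)` returns the same value as `getpid()` and increments `traced_calls`.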

## Installation

purelibc uses CMake, so the standard procedure to build and install the library is:
```sh
$ mkdir build
$ cd build
$ cmake ..
$ make
$ sudo make install
```

## Uninstallation

From the build directory run:
```sh
$ sudo make uninstall
```

## Examples
The following test program prints the number of each system call before actually performing it. It is a 'cat'-like copy from stdin to stdout; when EOF is reached, it prints "hello world":
```C
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <stdarg.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdlib.h>
#include <purelibc.h>

static sfun _native_syscall;

static char buf[128];
static long int mysc(long int sysno, ...){
	va_list ap;
	long int a1,a2,a3,a4,a5,a6;
	va_start (ap, sysno);
	snprintf(buf,128,"SC=%ld\n",sysno);
	_native_syscall(__NR_write,2,buf,strlen(buf));
	a1=va_arg(ap,long int);
	a2=va_arg(ap,long int);
	a3=va_arg(ap,long int);
	a4=va_arg(ap,long int);
	a5=va_arg(ap,long int);
	a6=va_arg(ap,long int);
	va_end(ap);
	return _native_syscall(sysno,a1,a2,a3,a4,a5,a6);
}

int main() {
	int c;
	_native_syscall=_pure_start(mysc,PUREFLAG_STDALL);
	while ((c=getchar()) != EOF)
		putchar(c);
	printf("hello world\n");
	return 0;
}
```

To run this example, compile it and link it with the library
in this way:
```sh
$ gcc -o puretest puretest.c -lpurelibc
```
If you installed the purelibc library in /usr/local/lib, you need to add this 
directory to the dynamic linker's library search path,

with CSH:
```sh
$ setenv LD_LIBRARY_PATH /usr/local/lib
```
or with BASH:
```sh
$ export LD_LIBRARY_PATH="/usr/local/lib"
```
Unfortunately, purelibc does not work if you load it as a dynamic library
with dlopen.

The following example solves the problem.
More specifically:

* It is possible to use purelibc to track the calling process and all
the dynamic libraries loaded at run time.
* The code does not depend on purelibc. If you run it on a host without
purelibc, it still works but cannot track its system calls.

```C
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <stdarg.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdlib.h>
#include <dlfcn.h>
#include <purelibc.h>

static sfun _native_syscall;

static char buf[128];
static long int mysc(long int sysno, ...){
	va_list ap;
	long int a1,a2,a3,a4,a5,a6;
	va_start (ap, sysno);
	snprintf(buf,128,"SC=%ld\n",sysno);
	_native_syscall(__NR_write,2,buf,strlen(buf));
	a1=va_arg(ap,long int);
	a2=va_arg(ap,long int);
	a3=va_arg(ap,long int);
	a4=va_arg(ap,long int);
	a5=va_arg(ap,long int);
	a6=va_arg(ap,long int);
	va_end(ap);
	return _native_syscall(sysno,a1,a2,a3,a4,a5,a6);
}

int main(int argc,char *argv[]) {
	int c;
	sfun (*_pure_start_p)();
	void *handle;
	/* does pure_libc exist ? */
	if ((_pure_start_p=dlsym(RTLD_DEFAULT,"_pure_start")) == NULL &&
			(handle=dlopen("libpurelibc.so",RTLD_LAZY))!=NULL) {
		char *path;
		dlclose(handle);
		/* get the executable from /proc */
		asprintf(&path,"/proc/%d/exe",getpid());
		/* preload the pure_libc library */
		setenv("LD_PRELOAD","libpurelibc.so",1);
		printf("pure_libc dynamically loaded, exec again\n");
		/* reload the executable */
		execv(path,argv);
		/* useless cleanup */
		free(path);
	}
	if ((_pure_start_p=dlsym(RTLD_DEFAULT,"_pure_start")) != NULL) {
		printf("pure_libc library found: syscall tracing allowed\n");
		_native_syscall=_pure_start_p(mysc,NULL,PUREFLAG_STDALL);
	}
	while ((c=getchar()) != EOF)
		putchar(c);
	printf("hello world\n");
	return 0;
}
```

To run this example, compile it and link it with the dl library
in this way:
```sh
$ gcc -o puretest2 puretest2.c -ldl
```
