amboar.github.io

Software development notes

View My GitHub Profile

27 December 2019

KASAN for ARM

by Andrew

KASAN (Kernel Address SANitizer, the kernel implementation of an existing userspace aid) for 32-bit ARM is not yet in mainline Linux, but v6 of a series adding support was posted in mid-2019. As part of tracking down squashfs decompression errors in OpenBMC I took v6 for a test drive. Ultimately it didn’t help my cause, but I had some fun debugging KASAN along the way.

The immediately obvious issue from the first run of the patches was that enabling KASAN caused a failure to boot, before any useful output had made it to the console. Failures like this are sometimes masked by the fact that the kernel won’t produce output until the UART driver is bound to the console device, but this is a well recognised issue and we have the EARLY_PRINTK Kconfig option to help deal with it. Turning on DEBUG_LL and EARLY_PRINTK restrict the kernel to a single UART definition and the resulting kernel isn’t portable, so the options are sometimes avoided in defconfigs.

The following config snippet will build support for the early console on ASPEED BMCs into the kernel:

CONFIG_DEBUG_LL=y
CONFIG_DEBUG_LL_UART_8250=y
CONFIG_DEBUG_UART_PHYS=0x1e784000
CONFIG_DEBUG_UART_VIRT=0xf8184000

However, configuring the kernel this way didn’t change the result - there was still no output with KASAN enabled and so it was time for another approach.

As KASAN is largely hardware independent (though in some cases it can take advantage of hardware assistance) I was testing the builds using the witherspoon-bmc machine model from QEMU. Helpfully, QEMU can provide assistance to gdb through a couple of commandline switches:

$ qemu-system-arm -h | grep -- '-[sS] '
-S              freeze CPU at startup (use 'c' to start execution)
-s              shorthand for -gdb tcp::1234
$ qemu-system-arm \
	-M witherspoon-bmc \
	-m 512 \
	-kernel arch/arm/boot/zImage \
	-dtb arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dtb \
	-initrd rootfs.cpio.xz \
	-nographic \
	-s -S 
...

Using the two switches in concert we can debug the kernel by catching the CPU before the first instruction, attaching a remote gdb session for the kernel and setting some appropriate breakpoints. We know that we’re not getting any output from the kernel on the console, so the issue is before the first printk() succeeds. The first line of output for ARM kernels looks like:

[    0.000000] Booting Linux on physical CPU 0x0

And this line is output from smp_setup_processor_id() in arch/arm/kernel/setup.c.

Returning to gdb, we invoke it from inside the kernel source tree against the vmlinux binary and set it up to connect to the running QEMU instance:

$ gdb-multiarch vmlinux
...
(gdb) target remote :1234
Remote debugging using :1234
0x80000000 in ?? ()
(gdb) break smp_setup_processor_id
Breakpoint 1 at 0x80900d40: smp_setup_processor_id. (2 locations)
(gdb) continue
...

Setting a breakpoint in smp_setup_processor_id() eventually revealed that the printk() itself was the problem, and it was a problem because we were blowing the stack due to memcpy() calling itself recursively as a result of creative KASAN implementation details.

With a patch to fix the recursive memcpy() (1) and another to fix false-positive KASAN warnings in both __copy_to_user_memcpy() and __clear_user_memset() (2), I got QEMU to cleanly boot the kernel and start userspace. KASAN didn’t end up providing the smoking gun I was hoping for to root-cause the problem of squashfs decompression failures, but it did at least rule out that bad kernel memory accesses were the source of the problem.

(1)

diff --git a/arch/arm/include/asm/string.h b/arch/arm/include/asm/string.h
index 1f9016bbf153..3e282d751a08 100644
--- a/arch/arm/include/asm/string.h
+++ b/arch/arm/include/asm/string.h
@@ -54,6 +54,11 @@ static inline void *memset64(uint64_t *p, uint64_t v, __kernel_size_t n)
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
 #define memmove(dst, src, len) __memmove(dst, src, len)
 #define memset(s, c, n) __memset(s, c, n)
+
+#ifndef __NO_FORTIFY
+#define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
+#endif
+
 #endif
 
 #endif

(2)

diff --git a/arch/arm/lib/uaccess_with_memcpy.c b/arch/arm/lib/uaccess_with_memcpy.c
index c9450982a155..3c1e512aff6c 100644
--- a/arch/arm/lib/uaccess_with_memcpy.c
+++ b/arch/arm/lib/uaccess_with_memcpy.c
@@ -116,7 +116,7 @@ __copy_to_user_memcpy(void __user *to, const void *from, unsigned long n)
                        tocopy = n;
 
                ua_flags = uaccess_save_and_enable();
-               memcpy((void *)to, from, tocopy);
+               __memcpy((void *)to, from, tocopy);
                uaccess_restore(ua_flags);
                to += tocopy;
                from += tocopy;
@@ -183,7 +183,7 @@ __clear_user_memset(void __user *addr, unsigned long n)
                        tocopy = n;
 
                ua_flags = uaccess_save_and_enable();
-               memset((void *)addr, 0, tocopy);
+               __memset((void *)addr, 0, tocopy);
                uaccess_restore(ua_flags);
                addr += tocopy;
                n -= tocopy;

tags: