binfmt_misc: The magic behind Linux/Windows interop
2024-01-03
I was running something in WSL, as you do, then I thought about it for a second. When I'm doing this in WSL:
$ clip.exe < file.txt
How does that actually work? It turns out this is done using /init, which is two things:
PID 1, it's the init system, the parent of all processes in WSL.
An "interpreter" for Windows executables. When you run clip.exe, that's the actual Windows binary you're running directly. This works via the binfmt_misc mechanism of Linux, which allows you to register runners for any binary with specific magic bytes.
/init is a bit hard to get at since it's a closed source component of WSL. We can get some idea of how it might work by looking at (1) a Microsoft blog post describing how this works at a high level and (2) cbwin, an open source implementation of this.
We can also do fun things, like make Java jars directly executable without needing to run them with java -jar. But beware: if you have "fully executable" jars with scripts embedded at the start (like the ones Spring Boot makes), binfmt_misc can't possibly tell that they're jars. But java -jar still works on them! Weird. Here are the questions we want to answer:
What happens when you run a "normal" Linux executable? What about a shell script?
How does Linux tell that clip.exe is a Windows executable, and how does it run from inside Linux?
How can Java tell that a shell script with some binary junk at the bottom is really a jar, but the Linux kernel (via binfmt_misc) can't?!
Answers are below.
What happens when you run a normal executable?
Let's take cat as an example. From inside your shell, you execute:
$ cat file.txt
Your shell will probably find cat in the PATH and use the execve system call to execute that file.
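To make that concrete, here is a minimal Python sketch of roughly what a shell does for a command like this: look the name up in PATH, fork, and call execve in the child. The run function is a made-up name for illustration, and real shells do a lot more (job control, redirections, builtins), so treat it as a sketch rather than how any particular shell is implemented.

```python
import os
import shutil


def run(argv: list[str]) -> int:
    """Find argv[0] in PATH, execve it in a child process, return its exit code."""
    path = shutil.which(argv[0])
    if path is None:
        raise FileNotFoundError(argv[0])

    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the target program.
        try:
            os.execve(path, argv, os.environ)
        except OSError:
            # execve only returns (raises) on failure; bail out of the child.
            os._exit(127)

    # Parent: wait for the child and translate its wait status.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


exit_code = run(["cat", "/dev/null"])
print(exit_code)
```

This assumes a Unix-like system with cat on the PATH and Python 3.9 or newer (for os.waitstatus_to_exitcode).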
This is not mysterious at all. You can see the source code of execve here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/exec.c?id=HEAD#n2030. This blog post isn't supposed to be a deep dive into execve; the point is that execve executes executables.
What about a shell script?
Believe it or not, also execve! execve reads the first two bytes of the given file. If they're #!, the file gets executed in the way we're all familiar with.
If file.txt is given to execve with these contents:
#!/bin/sh
echo Hello
then execve will run /bin/sh file.txt, and we go back to the first case: a normal executable.
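As a rough illustration of that "look at the first bytes" step, here is a small Python sketch that classifies a file header. The sniff_format name and the labels it returns are mine, and the \x7fELF check is the standard magic for native Linux (ELF) executables; the kernel's real logic is more involved than this.

```python
def sniff_format(header: bytes) -> str:
    """Guess how a file would be executed from its leading bytes (illustrative only)."""
    if header.startswith(b"#!"):
        # Shebang: the rest of the first line names the interpreter
        # (plus an optional argument, which this sketch leaves attached).
        first_line = header.split(b"\n", 1)[0]
        interpreter = first_line[2:].strip().decode()
        return f"script run by {interpreter}"
    if header.startswith(b"\x7fELF"):
        return "native ELF executable"
    return "unknown"


print(sniff_format(b"#!/bin/sh\necho Hello\n"))  # script run by /bin/sh
print(sniff_format(b"\x7fELF\x02\x01\x01"))      # native ELF executable
print(sniff_format(b"this is my file\n"))        # unknown
```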
So far, so good, everyone should be familiar with this. The interesting part comes next.
What is binfmt_misc?
binfmt_misc is documented very well here: https://docs.kernel.org/admin-guide/binfmt-misc.html. At a high level, binfmt_misc is a feature of the Linux kernel that allows you to specify a rule matching either a filename suffix or magic bytes at an offset in the file, and an executable to use to run that file, similar to how a shell script is run.
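The kernel documentation linked above gives the rule format as :name:type:offset:magic:mask:interpreter:flags. Here's a small Python sketch (mine, not anything the kernel ships) that splits such a string into named fields, using the same registration string as the .txt example in this section. The BinfmtRule and parse_rule names are made up for illustration.

```python
from typing import NamedTuple


class BinfmtRule(NamedTuple):
    name: str
    kind: str         # "E" = match on extension, "M" = match on magic bytes
    offset: str       # empty means the default, offset 0
    magic: str        # the extension (for E) or the byte sequence (for M)
    mask: str
    interpreter: str
    flags: str


def parse_rule(line: str) -> BinfmtRule:
    # The first character is the field separator (":" by convention).
    sep = line[0]
    fields = line[1:].split(sep)
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    return BinfmtRule(*fields)


rule = parse_rule(":cattxt:E::txt::/bin/cat:")
print(rule.kind, rule.magic, rule.interpreter)  # E txt /bin/cat
```

Empty fields (offset, mask and flags here) just mean "use the default".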
For example, to match the .txt extension and cat the text file when "run", you could run:
$ sudo sh -c 'echo ":cattxt:E::txt::/bin/cat:" > /proc/sys/fs/binfmt_misc/register'
$ vim file.txt
$ chmod +x file.txt
$ ./file.txt
this is my file
it has content
hello
This isn't very useful. The next part is more interesting.
How does WSL tell clip.exe is a Windows executable?
Let's look at clip.exe
:
$ vim /mnt/c/Windows/system32/clip.exe
Right at the start, you'll see the characters "MZ" - these are the first two bytes of any .exe file on DOS or Windows (and the initials of Mark Zbikowski).
MZ<90>^@^C^@^@^@^D^@^@^@ÿÿ^@^@¸^@^@^@^@^@^@^@@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@è^@^@^@^N^_º^N^@´ Í!¸^ALÍ!This program cannot be run in DOS mode.^M^M
...
Let's look at the binfmt_misc registrations (this example only works in WSL, of course):
$ ls /proc/sys/fs/binfmt_misc/
WSLInterop register status
It's too easy!
$ cat /proc/sys/fs/binfmt_misc/WSLInterop
enabled
interpreter /init
flags: PF
offset 0
magic 4d5a
And 4d5a is hex for "MZ". So when you execve a Windows executable like clip.exe, Linux will invoke /init to run clip.exe. The magic is thus inside /init.
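If you want to convince yourself of the hex-to-"MZ" part, here is a quick Python sketch of a magic-bytes check at a fixed offset. The matches_magic helper is a made-up name, and it ignores the optional mask that binfmt_misc also supports, so it only approximates the kernel's matching.

```python
def matches_magic(data: bytes, magic_hex: str, offset: int = 0) -> bool:
    """Return True if data contains the given magic bytes at the given offset."""
    magic = bytes.fromhex(magic_hex)
    return data[offset:offset + len(magic)] == magic


print(bytes.fromhex("4d5a"))                   # b'MZ'

exe_header = b"MZ\x90\x00\x03\x00"             # first bytes of a Windows executable
print(matches_magic(exe_header, "4d5a"))       # True
print(matches_magic(b"#!/bin/sh\n", "4d5a"))   # False
```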
/init is not open source. The blog post linked above has some hints and I encourage you to read it.
There's also https://github.com/ionescu007/lxss which contains some interesting proofs of concept for interacting across the Windows/Linux boundary.
How do fully executable jars work?
The interesting part about these is that they don't involve binfmt_misc at all. Instead, they use a different trick.
Go to https://start.spring.io/ and generate the example project. Add this section to the build.gradle to generate the "fully executable" jar:
bootJar {
launchScript()
}
Run ./gradlew build to build the project. You get two jars:
$ ls build/libs/
demo-0.0.1-SNAPSHOT-plain.jar demo-0.0.1-SNAPSHOT.jar
The first jar is not executable and has no main class. The second jar is executable, with either java -jar or directly:
$ java -jar build/libs/demo-0.0.1-SNAPSHOT.jar
. ____ _ __ _ _
/\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
\\/ ___)| |_)| | | | | || (_| | ) ) ) )
' |____| .__|_| |_|_| |_\__, | / / / /
=========|_|==============|___/=/_/_/_/
:: Spring Boot :: (v3.2.1)
...
^C
$ build/libs/demo-0.0.1-SNAPSHOT.jar
. ____ _ __ _ _
/\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
\\/ ___)| |_)| | | | | || (_| | ) ) ) )
' |____| .__|_| |_|_| |_\__, | / / / /
=========|_|==============|___/=/_/_/_/
:: Spring Boot :: (v3.2.1)
But what gives? There is no binfmt_misc registration for Java jars! The trick here is that the jar isn't just a jar, it's a shell script:
$ less build/libs/demo-0.0.1-SNAPSHOT.jar
#!/bin/bash
...
<shell script>
...
exit 0
<what looks like binary data>
The binary data after the exit 0 is the jar. This is clever: when run directly, the shell script re-invokes itself (the very same file!) with java -jar.
You can verify the binary data is a jar by looking at the magic bytes:
...
*)
echo "Usage: $0 {start|stop|force-stop|restart|force-reload|status|run}"; exit 1;
esac
exit 0
PK^C^D^T^@^H^H^H^@
...
PK^C^D is exactly the magic byte string for a zip archive. A jar file is just a zip file with special contents.
This explains how directly invoking the jar executes it without involving binfmt_misc.
How does java -jar execute a jar with text at the start?
java isn't doing anything clever here, it just treats the jar as any other zip file. We can even extract the "fully executable" jar with unzip:
$ unzip build/libs/demo-0.0.1-SNAPSHOT.jar
Archive: build/libs/demo-0.0.1-SNAPSHOT.jar
creating: META-INF/
inflating: META-INF/MANIFEST.MF
...
The cleverness here is in the zip file format itself, see https://en.wikipedia.org/wiki/ZIP_(file_format). A zip file's index (the central directory) lives at the end of the file, not the start. A tool that reads a zip file scans backwards from the end for the "end of central directory" record signature (some magic bytes) and reads the list of entries from there. This means that we are allowed to have whatever preamble we want at the start of the file, including executable code. That's commonly used for self-extracting archives (e.g. an .exe you can run or open with your archive viewer).
This jar isn't self-extracting, but it is kind of self-running. I think it's a neat trick.
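You can reproduce the trick without Spring Boot. Here's a minimal Python sketch that builds a tiny zip in memory as a stand-in for a jar, sticks a shell script in front of it, and then reads it back with the standard zipfile module. The manifest contents and the script text are placeholders I made up, not what Spring Boot actually generates.

```python
import io
import zipfile

# Build a tiny zip archive in memory (a stand-in for a real jar).
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")

# Prepend a shell script, the way a "fully executable" jar is laid out.
preamble = b"#!/bin/bash\necho 'launcher script would go here'\nexit 0\n"
combined = preamble + archive.getvalue()

# The file starts with "#!", yet the zip reader still finds the entries,
# because it looks for the archive's index from the end of the file.
print(combined[:2])  # b'#!'
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    names = zf.namelist()
    manifest = zf.read("META-INF/MANIFEST.MF").decode()

print(names)     # ['META-INF/MANIFEST.MF']
print(manifest)  # Manifest-Version: 1.0
```

The same layout is what lets unzip and java -jar open the Spring Boot jar even though the first bytes are a shell script.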
Conclusion: why we can't use binfmt_misc for jars
It's pretty common for fully executable jars to not have a .jar extension, since the whole point of being fully executable is that it's like a "normal" executable. This means we can't use binfmt_misc's extension matching.
We can't use the magic byte matching either since:
Jars are just zip files, they don't have any unique magic bytes!
#! is at the start (which we can't and shouldn't hijack), and PK appears later, but we can't hijack that either: those are the zip file magic bytes, and not all zip files are jars.
Even if there were jar-specific magic bytes, we don't know the offset! The shell script at the start can be any length.
So binfmt_misc is useful for running files with a specific extension or with magic bytes at a specific offset (e.g. Windows executables!), but fully executable jars have neither.
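The offset problem is easy to see with a quick Python sketch. It builds two fake "fully executable jars" with launcher scripts of different lengths and prints where the zip magic bytes end up. The make_executable_jar helper and both scripts are invented for this example.

```python
import io
import zipfile


def make_executable_jar(script: bytes) -> bytes:
    """Return script + a tiny zip archive, mimicking a fully executable jar."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    return script + buf.getvalue()


short_script = b"#!/bin/bash\nexit 0\n"
long_script = b"#!/bin/bash\n# more launcher logic here\nexit 0\n"

short_jar = make_executable_jar(short_script)
long_jar = make_executable_jar(long_script)

for blob in (short_jar, long_jar):
    # Both files start with "#!", and the zip magic lands wherever
    # the script happens to end.
    print(blob[:2], blob.find(b"PK\x03\x04"))
```

Both files begin with the same two bytes, and the PK offset is simply the length of whatever script was prepended, so there is no fixed offset a binfmt_misc rule could match on.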
Final verdict on binfmt_misc
binfmt_misc doesn't really seem incredibly useful if you ask me. One cool use case is registering QEMU as a handler for ARM executables while on an x86 machine, so you can run those binaries as if they were native. That doesn't seem like a real use case to me.
The WSL interop use case actually seems the most compelling to me, but is that a reason to have a whole kernel thing? I don't know.