Over The Air - Vol. 2, Pt. 2: Exploiting The Wi-Fi Stack on Apple Devices ~ Ally Fashion

Posted by Gal Beniamini, Project Zero

In this blog post we’ll continue our journey towards over-the-air exploitation of the iPhone, by means of Wi-Fi communication alone. This part of the research will focus on the firmware running on Broadcom’s Wi-Fi SoC present on the iPhone 7.

We’ll begin by performing a deep dive into the firmware itself; discovering new attack surfaces along the way. After auditing these attack surfaces, we’ll uncover several vulnerabilities. Finally, we’ll develop a fully functional exploit against one of the aforementioned vulnerabilities, thereby gaining code execution on the iPhone 7’s Wi-Fi chip. In addition to gaining code execution, we’ll also develop a covert backdoor, allowing us to remotely control the chip over-the-air.

Along the way, we’ll come across several new security mechanisms developed by Broadcom. While these mechanisms carry the potential to make exploitation harder, they remained rather ineffective in this particular case. By exploring the mechanisms themselves, we were able to discover methods to bypass their intended protections. Nonetheless, we remain hopeful that the issues highlighted in this blog post will help inspire stronger mitigations in the future.

All the vulnerabilities presented in this blog post (#1, #2, #3, #4, #5) were reported to Broadcom and subsequently fixed. I’d like to thank Broadcom for being highly responsive and for handling the issues in a timely manner. While we did not perform a full analysis on the breadth of these issues, a minimal analysis is available in the introduction to the previous blog post.

And now, without further ado, let’s get to it!

Exploring The Firmware

Combining the extracted ROM image we had just acquired with the resident RAM image, we can finally piece together the complete firmware image. With that, all that remains is to load the image into a disassembler and begin exploring.

While the ROM image on the BCM4355C0 is slightly larger than that of previously analysed Android-resident Wi-Fi chips, it’s still rather small (spanning only 896KB). Consequently, Broadcom has once again employed the same tricks in order to conserve as much memory as possible; including compiling the bulk of the code using the Thumb-2 instruction set and stripping away most of the symbols.

As for the ROM’s layout, it follows the same basic structure as that of its Android counterparts; beginning with a code chunk, followed by a blob of constant data (including strings and CRC polynomials), and ending with “trampolines” into detection points in the Wi-Fi firmware’s RAM (and some more spare data).

The same cannot be said of the RAM image; while some similarities exist between the current image and the previously analysed ones, their internal layouts are substantially different. Whereas Android-resident firmwares contained interspersed heap chunks between code and data blobs, this quirk is no longer present in the current RAM image. Instead, the heap chunks are mostly placed in a linear fashion. One exception to this rule is the initialisation code present in the RAM -- once the firmware’s bootup process completes, this blob is reclaimed, and is thereafter converted into an additional heap chunk.

Curiously, the stack is no longer located after the heap, but rather precedes it. This modification has the advantage of preventing potential collisions between the stack and the heap (which were possible in previous firmware versions).

To further our understanding of the firmware, let’s attempt to identify the set of supported high-level features. While ROM images typically contain a wealth of features, not all OEMs choose to utilise every single feature. Instead, the supported features are governed by the RAM’s contents, which selectively adds support for the capabilities chosen by the OEM.

In the context of Android-resident firmware images, identifying the supported features was made easier due to the inclusion of “feature tags” within the version string embedded in the firmware’s RAM image. Each tag indicated support for a corresponding feature within the firmware image. Unfortunately, the iPhone’s Wi-Fi firmware images made away with the detailed version strings, and instead opted for a generic string containing the build type and the chip’s revision:

Nevertheless, we can still gain some insight into the firmware’s feature set, by reverse-engineering the firmware itself. Let’s take a look inside and see what we can find!

It’s Bigger On The Inside

Although most symbols have been stripped from the firmware’s ROM image, whatever symbols remain hint at the features supported by the image. Indeed, going over the strings in the combined image (mostly the ROM, the RAM is nearly devoid of strings), we come across many of the features we’ve identified in the past. Surprisingly, however, we also find a great deal of new features.

While adding features can (sometimes) result in better user experience, it’s important to remember that the Wi-Fi chip is a highly privileged component.

First, as a network interface, the chip has access to all of the host’s Wi-Fi traffic (both inbound and outbound). Therefore, attackers controlling the Wi-Fi firmware can leverage this vantage point to inject or manipulate data viewed by the host. One avenue of attack would therefore be to manipulate the host’s unencrypted web traffic and insert a browser exploit, allowing attackers to gain control over the corresponding process on the host. Other applications on the host which rely on unencrypted communications may be similarly attacked.

In addition to the aforementioned attack surface, the Wi-Fi firmware itself can carry out attacks against the host. As we’ve seen in previous blog posts, the chip communicates with the host using a variety of control messages and through a privileged physical interface (e.g., PCIe); if we were to find errors in the host’s processing of any of those, we might be able to take over the host itself (indeed, we’ll carry out such attacks in the next blog post!).

Due to the above risks, it’s important that the TCB constituted by the Wi-Fi chip remain relatively small. Any components added to the Wi-Fi firmware carry the risk of vulnerabilities being introduced into the firmware which would subsequently allow attackers to assume control over the chip (and perhaps even the entire system).

This risk is compounded by the fact that the firmware employs far fewer defence mechanisms than modern operating systems. Most notably, it does not employ ASLR, does not have stack cookies and fails to implement safe heap unlinking. Therefore, even relatively weak primitives may be exploitable within the Wi-Fi firmware (in fact, we’ll see just such an example later on!).

With that in mind, let’s take a look at the features present in the Wi-Fi firmware. It’s important to note that the mere presence of these code paths in the firmware does not imply that they are enabled by default. Rather, in most cases the host chooses whether to enable specific features, depending on user or network related configurations.

Apple-Specific Features

After some preliminary exploration, we come across a group of unfamiliar strings referencing a feature called “AWDL”. This acronym refers to “Apple Wireless Direct Link”; an Apple-specific protocol designed to provide peer-to-peer connectivity, notably used by AirDrop and AirPlay. The presence of Apple-specific functionality within the ROM affirms the suspicion that these Wi-Fi chips are used exclusively by Apple devices.

From a security perspective, it appears that the attack surface exposed by this functionality within the Wi-Fi firmware is rather limited. The firmware contains mechanisms for the configuration of AWDL-related features, but such operations are driven primarily by host-side logic (via AppleBCMWLANCore and IO80211Family).

Moving right along, we come across another group of unexpected strings:

This code originates from mDNSResponder, Apple’s open-source implementation of Multicast DNS (a standardised zero-configuration service commonly used within the Apple ecosystem).

Reverse-engineering the above fragments, we come to the realisation that it is a stripped-down version of mDNSResponder, mostly responsible for performing wake on demand via mDNS (for networks that include a Bonjour Sleep Proxy). Consequently, it does not offer all the functionality provided by a fully-fledged mDNS client. Nonetheless, embedding code from complex libraries such as mDNSResponder could carry undesired side effects.

For starters, mDNSResponder itself has been affected by several security issues in the past. More subtly, non-security bugs in libraries can become security-relevant when migrating between systems whose characteristics differ so widely from one another. Concretely, on the Wi-Fi chip address zero points to the firmware’s interrupt vectors -- a mapped, writable address. Being able to modify this address would allow attackers to gain code-execution on the chip, thereby converting a class of “benign” bugs, such as a null-pointer accesses, to RCEs.

Offloading Mechanisms

Since the Wi-Fi SoC’s ARM core is less power-hungry than the application processor, it stands to reason that some network related functionality be relegated to the firmware, when possible. This concept is neither new, nor is it unique to the mobile settings; offloading to the NIC occurs in desktop environments as well, primarily in the form of TCP offloading (via TOE) .

Regardless, while the advantages of offloading are clear, it’s important to be aware of the potential downsides as well. For starters, as we’ve mentioned before, the more features are added into the Wi-Fi firmware, the less auditable it becomes. Additionally, offloading features often require parsing of high-level protocols. Since the Wi-Fi firmware does not contain its own TCP/IP stack, it must resort to parsing all layers in the stack manually, up to the layer at which the offloading occurs.

Inspecting the firmware reveals at least two features which allow offloading of high-level protocols to the Wi-Fi firmware: ICMPv6 offloading and TCP KeepAlive offloading. Apple’s host-side drivers contain controls for enabling and disabling these offloading features (see AppleBCMWLANCore), subsequently handing over control over these packets to the Wi-Fi firmware rather than the host.

While beyond the scope of this blog post, auditing both of the aforementioned offloading features revealed two security bugs allowing attackers to either leak constrained data from the firmware or to crash the firmware running on the Wi-Fi SoC (for more information see the bug tracker entries linked above).

Generic Attack Surfaces

While the aforementioned attack surfaces may be interesting from an exploratory point of view, each of the outlined features was rather constrained in scope. Perhaps, we could find a more fruitful attack surface within the Wi-Fi firmware?

...This is where some familiarity with the Wi-Fi standards comes in handy!

Wi-Fi frames can be split into three distinct categories. Each frame is assigned a category by inspecting the “type” and “subtype” fields in its MAC header:

The categories are as follows:

Data Frames - Carry data (and QoS data) over the Wi-Fi network.
Control Frames - Body-less frames assisting with the delivery of other frames (using ACKs, RTS/CTS, Block ACKs and more).
Management Frames - Perform complex operations, including connecting to a network, modifying the state of individual stations (STAs), authenticating and more.

Let’s take a second to consider the categories above from a security PoV.

While data frames contain interesting attack surfaces (such as frame aggregation via A-MSDU/A-MPDU), they hide little complexity otherwise. Conversely, features present directly on the RX-path, such as the aforementioned offloading mechanisms, are generally also accessible through data frames, thereby increasing the exposed attack surface. Control frames, on the other hand, offer limited complexity and therefore do not significantly contribute to the attack surface.

Unlike the first two categories, management frames are where the majority of the firmware’s complexity lays. Many of the mechanisms encapsulated by management frames make for interesting targets in their own right; including authentication and association. However, we’ll choose to focus on one subtype offering the largest attack surface of all -- Action Frames.

Most of the logic behind “advanced” Wi-Fi features (such as roaming, radio measurements, etc.) is implemented by means of Action Frames. As their name implies, these frames trigger “actions” in stations within the network, potentially modifying their state.

Curiously, unlike data frames, management frames (and therefore action frames as well) are normally unencrypted, even if networks employ security protocols such as WPA/WPA2. Instead, certain management frames can be encrypted by enabling 802.11w Protected Management Frames. When enabled, 802.11w allows for confidentiality of those frames’ contents, as well as a form of replay protection.

Summing up, action frames constitute a large portion of the attack surface -- they are mostly unprotected frames, using which state-altering functionality is carried out. To discover the exact extent of their exposed attack surface, let’s explore those frames in more depth.

Action Frames

Consulting IEEE 802.11-2016 (9-47), full list of action frame categories is quite formidable:

To illustrate the amount of complexity encapsulated by action frames, let’s return to the iPhone’s Wi-Fi firmware. Tracing our way through the RX-path, we quickly reach the function at which action frames are handled within the firmware (referred to as “wlc_recv_mgmtact” in the ROM):

wlc_recv_mgmtact - 0x1A79F4

As we can see, the function performs some preliminary operations, before handing off processing to one of the numerous handlers within the firmware. Each action frame category is delegated to a single handler. Counting the action frame handlers and corresponding frame types supported by the iPhone’s firmware, we find 13 different supported categories, resulting in 34 different supported frame types. This is a substantial attack surface to explore!

To assess the handlers’ security, we’ll reverse-engineer each of the above functions. While this is a slow and rather tedious process, recall that each vulnerability found in the above handlers implies a triggerable over-the-air vulnerability in the Wi-Fi chip.

Before attempting a manual audit, we also “fuzzed” the action frame handlers. To do so, we developed an on-chip Wi-Fi fuzzer, allowing injection of frames directly into the aforementioned handler functions (without transmitting the frames over-the-air). While the fuzzer allowed for high-speed injection of frames (namely, thousands of frames per second), running it using a small corpus of action frames and inducing bit-flips in them was unfruitful... One possible explanation for this approach’s failure is due to the strict structure mandated by many action frames. Perhaps these results could be improved by fuzzing based on a grammar derived from the Wi-Fi standard, or enforcing structure constraints on the fuzzed content.

Regardless, we can employ some tricks to speed up our manual exploration. Recall that Wi-Fi primarily relies on Information Elements (IEs), tagged bundles of data, to convey information. The same principle applies to action frames -- their payloads typically consist of multiple IEs, each encapsulating different pieces of information relating to the handled frame. As the IE tags are (mostly) unique within the Wi-Fi standard, we can simply lookup the tag value corresponding to each processed IE, allowing us to quickly familiarise ourselves with the surrounding code.

After going through the handlers outlined above, we identified a number of vulnerabilities.

First, we discovered a vulnerability in 802.11v Wireless Network Managements (WNM). WNM is a set of standards allowing clients to configure themselves within a wireless network and to exchange information about the network’s topology. Within the WNM category, the “WNM Sleep Mode Response” frame serves to update the Group Temporal Key (GTK) when the set of peers in the network changes. As it happens, reverse-engineering the WNM handler revealed that the corresponding function failed to verify the length of the encapsulated GTK, thereby triggering a controlled heap overflow (see the bug tracker for more information).

By cross-referencing the GTK handling method, we were able to identify a similar vulnerability in 802.11r Fast BSS Transition (FBT). Once again, the firmware failed to verify the GTK’s length, resulting in a heap overflow.

While both of the above vulnerabilities are interesting in their own right, we will not discuss them any further in this blog post. Instead, we’ll focus on a different vulnerability altogether; one with a weaker primitive. By demonstrating how even relatively “weak” primitives can be exploited on the Wi-Fi firmware, we’ll showcase the need for stronger exploit mitigations.

To make matters more interesting, we’ll construct our entire exploit using nothing but action frames. These frames are so feature-rich, that by leveraging them we will be able to perform heap shaping, create allocation primitives, and of course, trigger the vulnerability itself.

802.11k Radio Resource Management

802.11k is an amendment to the Wi-Fi standard aiming to bring Radio Resource Management (RRM) capabilities to Wi-Fi networks. RRM-capable stations in the network are able to perform radio measurements (and receive them), allowing access points to reduce the congestion and improve traffic utilisation in the network. The concept itself is not new in the mobile sphere; in fact, it’s been around in cellular networks for over two centuries.

Within the Wi-Fi ecosystem, RRM is commonly utilised in tandem with 802.11r FBT (or the proprietary CCKM) to enable seamless access point assisted roaming. As stations decide to “handover” to different access points (based on their radio measurements), they can consult the access points within the network in order to obtain a list of potential neighbours with which they may reassociate.

To implement all of the above, a set of action frames (and a new action category) have been added to the Wi-Fi standard. Consequently, clients can perform radio and link measurement requests, receive the corresponding reports, and even process reports containing their neighbouring access points (should they decide to roam).

All of the above functionality is also present in the Wi-Fi firmware on the iPhone:

Auditing the handlers above, we come across one function of particular note; the handler for Neighbor Report Response frames.

The Vulnerability

Neighbor Report Response (NRREP) frames are reports delivered from the access point to stations in the network, informing stations of neighbouring access points in their vicinity. Upon roaming, the stations may use these parameters to reassociate with the aforementioned neighbours. Providing this information spares the stations the need to perform extensive scans on their own -- a rather time consuming operation. Instead, they may simply rely on the report, informing it of the specific channels and operating classes inhabited by each neighbour.

Like many action frames, NRREPs also contain a “dialog token” (9.6.7.7). This 1-byte field is used to correlate between requests issued by a client, and their corresponding responses. As their name implies, Neighbour Report Responses are typically transmitted in response to a corresponding request made by the client earlier on (commonly as a result of a radio measurement indicating that a roam may be imminent). As we’d expect, upon sending a Neighbor Report Request, the client generates and embeds a dialog token, which in later verified by it when processing the corresponding NRREP returned by the access point.

However, reading more carefully through the specification reveals another interesting scenario! It appears that NRREPs may also be entirely unsolicited. In such a case, the dialog token is simply set to zero, indicating that no matching request exists.

IEEE 802.11-2016, 9.6.7.7

Consequently, NRREPs may be transmitted over the network to any client at any time, so long as it supports 802.11k RRM. Upon reception of such a report (with a zeroed dialog token), the client will simply parse the request and handle the data therein.

Continuing to read through the standard, we can piece together the frame’s overall structure; starting from the action frame header, all the way the encapsulated IE:

As we can see above, the bulk of the data in the NRREP frame is conveyed through the “Neighbour Report” IE. NRREPs may contain one or more such IEs, each indicating the presence of a single neighbour.

Now that we have a firm understanding of the frame’s structure, let’s take a look at the firmware’s implementation of the functionality described above. Following along from the initial NRREP handler, we quickly come to a ROM function responsible for handling the reports. Reverse-engineering the function, we arrive at the following high-level logic:

1. int wlc_rrm_recv_nrrep(void* ctx, ..., uint8_t* body, uint32_t bodylen) {

3. //Ensuring the request is valid

3. if (bodylen <= 2 || !g_rrm_enabled || body[2] != stored_dialog_token)

4. ... //Handle error

6. //Freeing all the previously stored reports

7. free_previous_nrreps(ctx, ...);

9. //Stripping the action Header

10 uint8_t* report_ie = body + 3;

11. bodylen -= 3;

12.

13. //Searching for the report IE

14. do {

15. ... //Verify the IE is valid

16. if (report_ie[0] == 52 && report_ie[1] > 0xC) //Tag, Length

17. break; //Found a matching IE!

18. } while (report_ie = bcm_next_tlv(report_ie, &bodylen));

19. if (!report_ie)

20. ... //Handle error

21.

22. //Handle the report

23. uint8_t* nrrep_data = malloc(28);

24. if (!nrrep_data)

25. ... //Handle error

26.

27. memcpy(nrrep_data + 6, report_ie + 2, 6); //Copying the BSSID

28. ... //Copying other elements...

29. nrrep_data[16] = report_ie[12]; //Operational Class

30. nrrep_data[17] = report_ie[13]; //Channel Number

31.

32. //Processing the report

33. void* elem = wlc_rrm_regclass_neighbor_count(ctx, nrrep_data, ...);

34. ...

35. }

As we can see above, the function begins by performing some cursory validation of the received request. Namely, it ensures that RRM is enabled within the network, that the report is sufficiently long, and that the received dialog token matches the stored one (if a solicited request was initiated by the client, otherwise the stored token is set to zero).

After performing the necessary validations and locating the report IE, the function proceeds to extract the encoded report information and store it within a structure of its own. Finally, the newly created structure is passed on for processing within wlc_rrm_regclass_neighbor_count. Let’s take a closer look:

1. void* wlc_rrm_regclass_neighbor_count(void* ctx, uint8_t* nrrep_data, ...) {

3. //Searching for previous stored elements with the same Operational

4. //Class and Channel Number

5. if (find_nrrep_buffer_and_inc_channel_idx(ctx, nrrep_data, ...))

6. return NULL;

8. //Creating a new element to hold the NRREP data

9. uint8_t* elem = zalloc(456);

10. if (!elem)

11. ... //Handle error

12. elem[4] = nrrep_data[16]; //Operational Class

13. ((uint16_t*)(elem + 6))[nrrep_data[17]]++; //Channel Number

14.

15. //Adding the element to the linked list of stored NRREPs

16. *((uint8_t**)elem) = ctx->previous_elem;

17. ctx->previous_elem = elem;

18. return elem;

19. }

As shown in the snippet above, the firmware keeps a linked list of buffers, one per “Operational Class”. Each buffer is 456 bytes long, and contains the operational class, an array holding the number of neighbours per channel, and a pointer to the next buffer in the list.

While not shown above, find_nrrep_buffer_and_inc_channel_idx performs a similar task - it goes over each element in the list, looking for an entry matching the current operational class. Upon finding a matching element, it increments the neighbour count at the index corresponding to the given channel number, and returns 1, indicating success.

So why are these handlers interesting? Consider that valid 802.11 channel numbers range from 1-14 in the 2.4GHz spectrum, all the way up to 196 in the 5GHz spectrum. Since each neighbour count field in the array above is 16-bits wide, we can deduce that the neighbour count array can be used to reference channel numbers up to 224 ((456 - 6)/sizeof(uint16_t) == 224).

However, looking a little closer it appears that the functions above make no attempt to validate the Channel Number field! Therefore, malicious attackers can encode whatever value they desire in that field (up to 255). Encoding a value larger than 224 will therefore trigger a 16-bit increment to be performed out-of-bounds (see line 13), thereby corrupting memory after the NRREP buffer!

Understanding The Primitive

Before we move on, let’s take a second to understand the exploit primitive -- as mentioned above, we are able to perform 16-bit increments (which are also 16-bit aligned), spanning up to 60 bytes beyond our allocated buffer.

Oddly, while the standards specify that each NRREP may contain several encoded reports (each of which should be handled by the receiving station), it appears that the handler functions above only processes a single IE at a time. Therefore, each NRREP we send will be able to trigger a single OOB increment.

9.6.7.7

This last fact ties in rather annoyingly with another quirk in the firmware’s code -- namely, upon reception of each NRREP, the list of stored NRREP elements is freed before proceeding to process the current element (see line 7, where free_previous_nrreps is invoked). It remains unclear whether this is intended behaviour or a bug, but the immediate consequence of this oddity is that following each OOB increment, the buffers are subsequently freed, allowing other objects to take their place.

Lastly, the reception of each NRREP triggers two allocations of distinct sizes; one for the linked list element (456 bytes), and another to store the report’s data (28 bytes). As a result, any heap shaping or grooming we’ll perform will have to take both allocations into consideration.

Triggering The Vulnerability

Configuring The Network

To begin developing our exploit, we’ll use the same test network environment we described in the previous blog post, using the following topology:

As we’re going to leverage NRREPs, it’s important to set up our test network to support neighbour reports. Like many auxiliary Wi-Fi features, support for NRREPs is indicated by setting the corresponding bit in the capability IEs broadcast in the network’s beacon. RRM-related functionality is encoded using the “RM Enabled Capabilities” information element.

Since we’re using hostapd to broadcast our network, we’ll enable the rrm_neighbor_report setting in our network’s configuration. Enabling this feature should set the corresponding field in the “RM Enabled Capabilities” IE to indicate support for neighbour reports. Let’s inspect a beacon frame to make sure:

Alright, seems like our network configuration is valid! Next, we’ll want to construct an interface allowing us to send arbitrary neighbour reports to peers in the network.

To do so, we’ll extend hostapd by adding new commands to its control interface. Each new command will correspond to a single frame type we’d like to inject. After adding our code to hostapd, we can simply connect to the control interface and issue the corresponding commands, thereby triggering the transmission of the requested frames from the access point to the selected peer. You can find our patches to hostapd in the exploit bundle on the bug tracker.

It should be noted that this approach is not infallible. Since we’re utilising a SoftMAC dongle to transmit our internal network, the SoftMAC layer of the Linux Kernel is responsible for some of the MLME processing done on the host. Therefore, it’s possible that processing done by this layer will interfere with the frames we wish to send (or receive) during the exploit’s flow. To get around this limitation, we’ve taken care to construct the frames in a manner that does not clash with Linux’s SoftMAC stack.

Sending NRREPs

After configuring and broadcasting our network, we can finally attempt to trigger the vulnerability itself. This brings us to a rather important question; how will we know whether the vulnerability was triggered successfully or not? After all, a single 16-bit increment may be insufficient to cause significant corruption of the firmware’s memory. Therefore it’s entirely possible that while the OOB access will occur, the firmware will happily chug along without crashing, leaving no observable effects indicating the vulnerability was triggered.

Remembering our Wi-Fi debugger from the previous blog post, one course of action immediately springs to mind -- why not simply hook the NRREP processing function with our own handler, and see whether our handler is invoked upon transmitting a malicious NRREP? This is easier said than done; it turns out most of the NRREP handling functionality (especially the actual vulnerability trigger, which we’re interested in) is located within the ROM, preventing us from inserting a hook.

As luck would have it, a new feature developed by Broadcom can be leveraged to solve this issue. To allow tracing different parts of the firmware’s logic, including the ROM, Broadcom have introduced a set of logging functions embedded throughout the firmware. Curiously, this mechanism was not present in the Android-resident firmwares we had analysed in the past.

Reverse-engineering this mechanism, it appears to operate in the following manner: each trace is assigned an identifier, ranging from 0x0 to 0x50. When a trace is requested, the firmware inspects an internal array of the same size stored in the firmware’s RAM, to gage whether the trace with the given identifier has been enabled or not. Each identifier has a corresponding 8-bit mask representing the types of traces enabled for it. As we are able to access the firmware’s RAM, we can simply enable any trace we like by setting the corresponding bits in the trace array. Subsequently, traces with the same ID will be outputted to the firmware’s console, allowing us to handily dump them using our Wi-Fi firmware debugger.

This functionality has also been incorporated into our Wi-Fi debugger, which exposes functions to read and modify the log status array as well as API to read out the firmware’s console.

Using the above API, we can now enable the traces referenced in the NRREP’s ROM handlers. Taking a closer look at the NRREP handling function in the firmware, we come across the following traces:

Alright, so we’ll need to enable log identifier 0x16 to observe these traces. After enabling the trace, sending an NRREP and reading out the firmware’s console, we are greeted with the following result:

Great! Our traces are being hit, indicating that the NRREP is successfully received by the station. With that, let’s move on to the next step - devising an exploit strategy.

An Exploit Strategy

Understanding The Heap

Since the vulnerability in question is a heap memory corruption, it’s important that we take a second to familiarise ourselves with the allocator’s implementation. In short, it is a “best-fit” allocator, which performs forward and backward coalescing, and keeps a singly linked list of free chunks. When chunks are allocated, they are carved from the end (highest address) of the best-fitting free chunk (smallest chunk that is large enough).

Free chunks consist of a 32-bit size field and a 32-bit “next” pointer, followed by the chunk’s contents. In-use chunks contain a single 32-bit size field, of which the top 30 bits denote the chunk’s size, and the bottom 2 bits indicate status bits. Putting it together, we arrive at the following layout:

Sketching An Exploit Strategy

Before we rush ahead, let’s begin by devising a strategy. We already know that our exploit primitive allows us to perform 16-bit increments, spanning up to 60 bytes beyond our allocated buffer.

It’s important to note that the heap’s state, perhaps surprisingly, is incredibly stable -- little to no allocations are performed. What little allocations are made, are immediately freed thereafter. As for frames carrying traffic (received or transmitted); they are not carved from the heap, but rather drawn from a special “pool”. As such, the presence of traffic should not affect the heap’s state.

The heap’s stability is a double-edged sword; on the one hand, we are guaranteed relative convenience when shaping and modifying the heap’s state, as no allocations other than our own will interfere with the heap’s structure. On the other hand, the set of allocations that can be made (and subsequently, targeted by us using the vulnerability primitive) is limited.

Indeed, going over the action frame handlers and searching for objects which may serve as viable targets for modification, we come up empty handed. The only data types that may be allocated either store their “interesting” data farther than 56 bytes away from their origin (accounting for the in-use chunk’s header), or simply do not contain “interesting” data for modification.

Perhaps, instead, we could leverage the heap itself to hijack the control flow? If we were able to hijack the “next” pointer of a free chunk and subsequently point it at a location of our choosing, we could overwrite the target address with a subsequent allocation’s contents. This prospect sounds rather alluring, so let’s try and pursue this route.

Writing An Exploit

Hijacking A Free Chunk

To hijack a free chunk, we’ll need to commandeer a chunk’s “next” pointer. Recall that our exploit primitive allows us some degree of control over neighbouring data structures. As such, let’s consider the following placement in which a free chunk is within range of the NRREP buffer:

Leveraging our OOB increment, we can directly modify the chunk’s “next” pointer by sending an NRREP request with the corresponding channel number. Naively, this would allow us to gain control over a free chunk in the heap, by simply directing the “next” at a location of our choosing.

However, this approach turns out to be infeasible.

In order to direct the “next” pointer at a meaningful address, we’d have to either know its value in advance (in order to calculate the number of increments required to convert the pointer from its current value to the target value), or we’d have to know the relative offset between its current value and the desired target.

As we do not know the exact addresses of heap chunks (nor would we want to resort to guessing them), the former option is ruled out. What about the latter approach? Recall that our primitive allows for 16-bit increments. Therefore, we can either increase the pointer’s value by 1 (by increment the bottom half word), or by 65536 (by incrementing the top half word).

Incrementing the pointer by 1 will result in an unaligned chunk address in the freelist. Recall, however, that our vulnerability primitive triggers deallocations on every invocation. As it happens, the allocator’s “free” function validates that each chunk in the freelist is aligned. When an unaligned block in encountered, it generates a fault and halts the firmware. Thus, an increment on of bottom half-word will result in the firmware crashing.

Incrementing the top half-word similarly fails; since all the heap’s chunks are less than 65536 away from the RAM’s end address, incrementing the top half-word will result in the “free” function attempting to access memory beyond the RAM, triggering an access violation and halting the firmware.

So how can we commandeer a free chunk nevertheless?

To do so we’ll need to use a more subtle approach - instead of modifying the free chunk’s contents directly, we’ll aim to achieve a layout in which two free chunks overlap one another, thereby causing allocations carved from one chunk to overwrite the metadata of the other (leading to control over the latter’s “next” pointer).

Heap Shaping

Achieving a predictable heap layout is key for a reliable exploit. As our current goal is to create a specific layout (namely, two overlapping heap chunks), we require setting up the heap in a manner which would allow us to achieve such a layout.

Classically, heap shaping is performed by leveraging primitives allowing for control either over an allocation’s lifetime, or optionally over the allocation’s size. Triggering allocations within the heap without immediately freeing them, allows us to fill “holes” in the heap, leading to a more predictable layout.

The allocator used in the firmware is a “best-fit” allocator which allocates from high addresses to lower ones. Consequently, if all “holes” in the heap of a certain size are filled, subsequent allocations of the same size (or larger) would be carved from the best-fitting chunk, proceeding from top to bottom, thus creating a linear allocation pattern.

To understand the Wi-Fi firmware’s heap layout, let’s take a snapshot of the heap’s state using our Wi-Fi debugger (repeating the process multiple times to account for any variability in the state):

As we can see, several small chunks are strewn across the heap, alongside a single large chunk. From the structure above, we can deduce that in order to create a predictable allocation pattern for our NRREP buffer, we’d simply need a shaping primitive allowing us to fill all the “holes” whose sizes match that of the NRREP buffer.

However, this is easier said than done. As we’ve mentioned before, little allocations occur during routine operations, and those that do are immediately freed thereafter. Combing through all the action frame handlers, we fail to find even a single instance of a memory leak (i.e., an allocation with infinite lifetime), or even an allocation that persists beyond the scope of the handlers themselves. Be that as it may, we do know of one mechanism, governed by action frames, which could offer a solution.

Normally, each Wi-Fi frame received by a station is individually acknowledged by transmitting a corresponding acknowledgement frame in response. However, many use-cases exist in which multiple frames are expected to be sent at the same time; requiring an acknowledgement for each individual frame in those cases would be rather inefficient. Instead, the 802.11n standard (expanding on 802.11e) introduced “Block Acknowledgements” (BA). Under the new scheme, stations may acknowledge multiple frames at once, by transmitting a single BA frame.

To utilise BAs, a corresponding session must first be constructed. This is done by transmitting an ADDBA Request (IEEE 802.11-2016, 9.6.5.2) from the originating peer to the responder, resulting in an ADDBA Response (9.6.5.3) being sent in the opposite direction, acknowledging a successful setup. Similarly, BAs can be torn down by sending a DELBA frame, indicating the BA should no longer be active. Each BA is identified by a unique Traffic Identifier (TID). While the standard specifies up to 16 supported TIDs, the firmware only supports the first 8, restricting the number of BAs possible in firmware to the same limit.

Since the lifetime of BAs is explicitly controlled by the construction of the corresponding BA sessions, they may constitute good heap shaping candidates. Indeed, going over the corresponding action frame handler in the firmware, it appears that every allocated BA results in a 164-byte allocation being made, holding the BA’s contents. The allocation persists until the corresponding DELBA is received, upon which the BA structure corresponding to the given TID is freed.

To use BAs in our network, we’ll add a new command to hostapd, allowing injection of both ADDBA and DELBA requests with crafted TIDs. Furthermore, we’ll take care to compile hostapd with support for 802.11n (CONFIG_IEEE80211N) and to enable it in our network (ieee80211n).

Putting the above together, we arrive at a pretty powerful heap shaping primitive! By sending ADDBA Requests, we can trigger the allocation of up to eight distinct 164-byte allocations. Better yet, we can selectively delete the allocations corresponding to each BA by sending a DELBA frame with the corresponding TID.

Having said that, two immediate downsides also spring to mind. First, the allocation size is fixed, therefore we cannot use the primitive to shape the heap for allocations smaller than 164 bytes. Secondly the contents of the BA buffers are uncontrolled by us (they mostly contain bit-fields used for reordering frames in the BA).

Attempting Overlapping Chunks

Using our shiny new shaping primitive, we can now proceed to shape the heap in a manner allowing the creation of overlapping chunks. To do so, let’s begin by allocating all the BAs, from 0 through 7. The first few allocations will fill in whatever holes can accommodate them within the heap. Subsequently, the rest of the allocations will be carved for the main heap chunk, advancing linearly from high addresses to lower ones.

(Grey blocks indicate free chunks)

Quite conveniently, as the allocation primitive is much larger than the “small buffer” allocated during the NRREP request, it allows the smaller holes in the heap, those large enough to hold the 28 byte allocation, to persist. Consequently, the “smaller buffer” is simply carved from one of the remaining holes, allowing us to safely ignore it.

Getting back to the issue at hand - in order to create an overlapping allocation, all we’d need to do is use the vulnerability primitive to increment the size field of one of the BAs. After growing the size by whichever amount we desire, we can proceed to delete the newly expanded BA, along with its neighbouring BAs, causing an overlapping allocation.

Unfortunately, running through the above scenario results in a resounding failure -- the firmware crashes upon any attempt to free a block causing an overlapping allocation…

To get down to the bottom of this odd behaviour, we’ll need to locate the source of the crash. Inspecting the AppleBCMWLANBusInterfacePCIe driver, it appears that whenever a trap is generated by the firmware, the driver simply collects the aforementioned crash data, and outputs it to the device’s syslog. Therefore, to inspect the crash report, we’ll simply dump the syslog using idevicesyslog. After generating a crash we are presenting with the following output:

Inspecting the source address of the crash in the firmware’s image, we come across an unfamiliar block of code in the “free” function, which was not present in prior firmware versions. In fact, the entire function seems to have many of these blocks… To understand this new code, let’s dig a little deeper.

New Mitigations

Going over the allocator’s “free” function, we find that in addition to freeing the blocks themselves, the function now performs several additional verifications on the heap’s structure, meant to ensure that it is not corrupted in any way. If any violations are detected, the firmware calls an “abort” function, subsequently causing the firmware to crash.

After reverse-engineering all the above validations, we arrive at the following list of mitigations:

The chunk’s bounds are compared against a pre-populated list of “allowed” regions.
The chunk is compared against all other chunks in the freelist, searching for overlaps.
The chunk is checked against a list of “disallowed” regions.
The chunk is ensured to be 4-byte aligned.

If any violation is detected, the firmware triggers the “abort” function, thereby halting execution.

It appears that Broadcom has done some hardening on the allocator! This is great from a security perspective, but rather bleak news for our exploit, as it appears that any attempt to create an overlapping pair of chunks will result in a crash. Perhaps we’re out of luck…

Bypassing Mitigation #1

...Or are we?

Instead of first increasing a heap block’s size, then freeing it to create overlapping chunks, let’s opt for different approach. We’ll arrange for the following layout; first, we’ll create two free chunks, which are not immediately adjacent to one another (to prevent the allocator from coalescing them). Then we’ll use the NRREP primitive to slowly increment the size of one block, until it overlaps the other.

However, as the NRREP primitive only allows us to modify data extending up to 60 bytes after the buffer, and each BA buffer is much larger in size (164 bytes), we’ll first need to devise a plan to get our NRREP buffer closer to a free chunk, without it actually impeding on the chunk (and thereby coalescing with it).

We’ll do so by leveraging a little trick. After allocating all the BAs, we’ll proceed to slightly increment the last BA’s size using the vulnerability primitive. Once that chunk is freed, a free chunk is subsequently created in its place, spanning the new expanded size instead of the original allocation’s size. Since the new free chunk extends into neighbouring BAs, the next BA allocation will therefore overlap a previously allocated BA. This allows us to effectively “sink” an allocation into neighbouring blocks, advancing the scope of influence of our NRREP buffer to previously unreachable objects!

As the allocator’s “malloc” function zeroes every chunk upon allocation, following the plan above will lead to BA6’s size being set to zero. However, there’s no need to fret, we can simply increase it using our NRREP primitive (as we’re now within range of BA6).

Next, we’ll increase BA6’s size slightly until it nearly overlaps with BA5. Then, we can free both BAs, and proceed to use the NRREP buffer to increase BA6’s free chunk until it overlaps with BA5’s. It’s important to note that since both “holes” are much smaller than the NRREP buffer, it won’t be placed within them, leaving us to utilise them as we please.

Bypassing Mitigation #2

Having created a pair of overlapping free-chunks, our first instinct is to carve an allocation from the encompassing chunk, thereby overwriting the other chunk’s metadata. To do so, we’ll need to find an allocation primitive allowing for control over its contents.

Recall that we have already searched for (and failed to locate) allocations with a controlled lifetime. Therefore, any allocation primitive we do find would be one with a limited lifespan. But alas, freeing an allocation carved from any of the overlapping chunks will lead us once again to the “free” function’s overlapping chunk mitigation, subsequently halting the firmware (and thwarting our attempt). Let’s take a closer look at the mitigation and see whether we can find a way around it.

Going through the code, it appears to have the following high-level logic:

1. //Calculating the current chunk’s bounds

2. uint8_t* start = (uint8_t*)cur + sizeof(uint32_t);

3. uint8_t* end = start + (cur->size & 0xFFFFFFFC);

5. //Checking for intersection between the current chunk and each free-chunk

6. for (freechunk_t* p = get_freelist_head(); p != NULL; p = p->next) {

7. uint8_t* p_start = (uint8_t*)p;

8. uint8_t* p_end = p_start + (p->size & 0xFFFFFFFC) + 2 * sizeof(uint32_t);

9. if (end > p_start && p_end > start)

10. CRASH();

11. }

As we can see above, the code snippet above lacks checks for integer overflows! Therefore, by storing a sufficiently large size in a free chunk, the calculation of p_end will result in an integer overflow, leading the value stored to become a low address. Consequently, the expression at line 9 will always evaluate to “false”, allowing us to bypass the mitigation.

Great, so all we need to do is ensure that when overwriting BA5’s free chunk, we also set its size to an exorbitantly large value. Moreover, as we’re dealing with a “best-fit” allocator, such a chunk will never be the best fitting (as smaller chunks will always exist), therefore there’s no need to worry about the allocator using our malformed chunk in the interim.

Creating Overlapping Chunks

To proceed, we’ll need to locate an allocation primitive allowing control over its contents, and preferably also offering a controlled size. Using such a primitive, we’ll be able to create an allocation for which BA6’s free chunk is the best fitting, subsequently overwriting BA5’s free-chunk header.

Going through the action frame handlers once again, we find a near-fit; Spectrum Measurement Requests (SPECMEAS). In short, SPECMEAS frames (9.6.2.2) are action frames belonging to the Spectrum Management category. These requests are used by access points to instruct stations to perform various measurements and report the results back to the network.

Broadcom’s Wi-Fi firmware supports two different types of measurements; a “basic” measurement, and a “Clear Channel Assessment” (CCA) measurement. Upon receiving a SPECMEAS request, the firmware allocates a buffer in order to store the report’s data. For every “CCA” measurement received, 5 bytes are added to the buffer’s size. However, for every “CCA” measurement encountered, 17 bytes are added to the buffer, of which many contain attacker-controlled data!

Using this primitive we can therefore trigger allocations of sizes that are linear combinations of 5 and 17. For every 17-byte block corresponding to a “basic” measurement, we can control several of the embedded bytes (namely, those at indices [5,15], 2).

While not a perfect allocation primitive, it’ll have to do. Since there are more than eight subsequent controlled bytes for each “basic” measurement request, we can use them in order to overwrite BA5’s free chunk header (the 32-bit size and pointer fields). By using a linear combination of the sizes above, we’ll guarantee that the controlled bytes are aligned with BA5’s free chunk header. Lastly, the size of the allocation performed must also be chosen so that BA6’s free chunk is the best fitting (therefore forcing the allocation to be carved from it, rather than other free chunks). Putting it all together, we arrive at the following layout:

Overwrite Candidates

Now that we’re able to commandeer free chunks, we just need to find some overwrite candidates in order to hijack the control flow.

Whereas in the previous firmware versions we researched, in-use chunks contained the same fields as a free chunks (namely, 32-bit size and next fields), the current chunks’ formats make them incompatible with free chunks. Therefore, in-use chunks normally do not constitute valid targets to impersonate free chunks.

Nonetheless, it is not entirely impossible that such objects exists. For example, any in-use allocation starting with a 32-bit zero word would be a valid free chunk. Similarly, chunks could begin with 32-bit pointers to other data types, which themselves may constitute a valid chain of free chunks. Even better yet, any data structure in the firmware’s RAM (not only heap chunks) could conceivably masquerade as a free chunk, so long as it follows the above format.

To get to the bottom of this, we’ve written a short script that goes over the firmware’s contents, searching for blocks that match the aforementioned description. Each block we discover constitutes a potential overwrite target by directing our fake free chunk at it. Running the script on a RAM dump of the firmware, we are greeted with the following result:

Great, there appear to be several candidates for overwrite!

In our previous exploration of the Wi-Fi firmware, we identified a certain class of objects that made good targets for hijacking control flow -- timers. These structures hold function pointers denoting periodically invoked timer functions. While many such timers exist in the current firmware as well, they are rather hard to overwrite using the above primitive. First, they do not start with a 32-bit zero field (but rather with the magic value “MITA”). Second, each timer is a link in a doubly-linked list, whose contents are constantly manipulated. To overwrite a timer, we’d need to insert a valid element into the list.

Instead, going over the list of candidates above, we come across a structure within the “persist” segment, containing a block of function pointers. Using our firmware debugger, we can indeed verify that several of the function pointers within this structure are periodically invoked. Therefore, by finding a free chunk candidate within this block, we should be able to commandeer one of the aforementioned function pointers, directing it at a location of our choice.

Unfortunately, attempting to do so results in a resounding failure.

Bypassing Mitigation #3

Each attempt to allocate data on top of the aforementioned block of function pointers using SPECMEAS frames, immediately causes the firmware to halt. Inspecting the source of the crash leads us back to one of the mitigations we mentioned earlier on; the “disallowed ranges” list.

Apparently, the entire “persist” block is contained in the list of regions within which “free” operations must not occur. Consequently, without bypassing this mitigation, we will not be able to overwrite data within the aforementioned range.

Thinking about this mitigation for a moment, we come up with an interesting proposition: perhaps we could use our commandeered free chunk in order to overwrite the “disallowed ranges” list itself?

While the list’s contents lays within one of the disallowed zones, recall that this validation is only performed by the “free” function, whereas “malloc” will happily carve allocations at any address, without consulting the above list. Therefore, by pointing our free chunk to a location overlapping the “disallowed ranges” list, we can use a SPECMEAS frame to overwrite its contents (thereby nullifying its effect). While SPECMEAS frames are immediately freed after they are allocated, this is no longer a concern, as by the time the “free” occurs, the “disallowed ranges” will have already been overwritten!

Putting It All Together

Using the steps above, we can disable the “disallowed ranges” list, allowing us to subsequently use the commandeered free chunk in order to hijack one of the function pointers in the persist block. Finally, we simply require a means of stashing some shellcode in a predictable location within the firmware’s RAM. By doing so, we will be able to direct the aforementioned function pointer at our shellcode, leading to arbitrary code execution.

Since the addresses within the “persist” block are fixed, they make for prime candidates to store our shellcode. Searching through the block, we come across several potential overwrite candidates, any of which can be hijacked with our “fake” free chunk.

However, there’s one more hurdle to overcome -- the code we’re about to store must not be overwritten at any point. If the code is inadvertently overwritten, the firmware will attempt to execute a corrupted chunk of code, possibly leading it to crash.

To get around this limitation, we’ll use one more action frame: Radio Measurement Requests (RMREQ). These frames are part of the 802.11k RRM standard, and allow for periodic measurements to be performed (and reported) by the firmware. Similarly to SPECMEAS frames, their handler allocates several bytes of data for each measurement IE encoded in the request.

Most importantly, RMREQ frames include a field denoting the number of repetitions that stations should perform when receiving the scan request. Going through the specification reveals that this field also has a “special” value, allowing scans to continue indefinitely:

IEEE 802.11-2016, 9.6.7.3

By encoding this value in an RMREQ frame, we can guarantee that the corresponding allocated buffer will not be subsequently freed, therefore allowing safe storage of our code.

Lastly, we need to consider the problem of the shellcode’s internal structure. Unlike SPECMEAS frames which allowed us to control multiple bytes in each chunk of the allocated buffer, RMREQ frames only provide control over four subsequent bytes out of every 20-bytes allocated. Luckily, as Thumb is a dense instruction set, it allows us to cram two instruction into each 32-bit controlled word. Therefore, we can break up our shellcode using the following pattern: the first 16-bit word will encode an instruction of our choosing, whereas the second word will contain a relative branch to the next controlled chunk. Formatting our shellcode in this manner allows us to construct arbitrarily large chunks of shellcode:

Building a Backdoor

Combining all the primitives above, we can finally stash and execute a chunk of shellcode on the Wi-Fi firmware!

To allow for easier exploration of the firmware, it’s worth taking a moment to convert this rudimentary form of access to a more refined toolset. Otherwise, we’d have to resort to encoding all the post-exploitation logic using segmented chunks of shellcode -- not an alluring prospect.

We’ll begin by using the shellcode above to write a small “backdoor” into the firmware, which we’ll call the “initial payload”. This payload constitutes the most minimal backdoor imaginable; it simply intercepts the NRREP handler, and reads two 32-bit words from it, storing the value of one word into the address denoted by the other. The initial payload therefore allows us to perform arbitrary 32-bit writes to the firmware’s RAM, by sending crafted NRREP frames over-the-air.

Next, we’ll use the initial payload in order to write a more sophisticated one, which we’ll refer to as the “secondary payload”. This payload also intercepts the NRREP handler (replacing the previous hook), but allows for a far richer set of commands, namely:

Reading data from the firmware RAM
Writing to the firmware’s RAM
Executing a shellcode stub
Performing a CRC32 calculation on a block of data

The capabilities above allow us to fully control the firmware’s over-the-air, from the safety of a python script. Indeed, not unlike our research platform, we’ve implemented the protocols for communicating with the backdoor in python, allowing for APIs implementing all of the functionality above.

In fact, the two are so similar, that several of the research framework’s modules can be directly executed using the secondary payload, by simply replace the memory access APIs in the research framework with those offered by the secondary payload.

The Exploit

Summing up all the work above, we’ve finally written a complete exploit, allowing code execution on the Wi-Fi chip of the iPhone 7. You can find the complete exploit here.

The exploit has been tested against the Wi-Fi firmware present in iOS 10.2 (14C92). The vulnerability is present in versions of iOS up to (and including) iOS 10.3.3. Researchers wishing to utilise the exploit on different iDevices or different versions, would be required to adjust the necessary symbols used by the exploit (see “exploit/symbols.py”).

Note that the exploit continuously attempts to install the backdoor into the Wi-Fi firmware, until it is successful. For any unsuccessful attempt, the firmware simply silently reboots, allowing the exploit to continue along. Moreover, due to a clever feat of engineering by Apple, rebooting the firmware does not interrupt ongoing connections; instead, they are continued as the chip reboots, allowing for a rather stealthy attack.

Wrapping Up

Over the course of this blog post we performed a deep dive into the Wi-Fi firmware present on the iPhone 7. Our exploration led us to discover new attack surfaces, several added mitigations, and multiple vulnerabilities.

By exploiting one of the aforementioned vulnerabilities, we were able to gain control over the Wi-Fi SoC, allowing us to gain a foothold on the device itself, directly over-the-air. In doing so, we also bypassed several of the firmware’s exploit mitigations, demonstrating how they can be reinforced in future versions.

In the next blog post, we’ll complete our journey towards full control over the target device, by devising a full exploit chain allowing us to leverage our newly acquired control over the Wi-Fi chip in order to launch an attack against iOS itself. Ultimately, we’ll construct an over-the-air exploit allowing complete control over the iOS kernel.

Ally Fashion

Tuesday, October 3, 2017

Over The Air - Vol. 2, Pt. 2: Exploiting The Wi-Fi Stack on Apple Devices