Fixed
Comments
mj...@google.com <mj...@google.com> #3
Fixed in the January 2023 Patch Tuesday.
CVE-2023-21748 - Creation of stable subkeys under volatile keys
CVE-2023-21772 - Creation of keys with invalid security descriptors
CVE-2023-21773 - Stale KCBs after partial key replication success
CVE-2023-21774 - Stale KCBs of symbolic links and predefined keys
ti...@google.com <ti...@google.com> #5
Derestricting 30 days after patch.
Description
Registry virtualization is an internal Windows feature present in the OS since Windows Vista. It is responsible for redirecting accesses that reference global, admin-writable keys (such as most of HKEY_LOCAL_MACHINE) and transparently pointing them to user-accessible locations, so that legacy applications that expect to always run with administrative privileges can continue to work under more restricted user accounts. For example, with registry virtualization enabled, writes to HKLM\Software are silently translated to HKCU\Software\Classes\VirtualStore\Machine\Software by the kernel.
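The path translation in the example above can be sketched as a plain string mapping. This is only an illustrative model: the function name `to_virtual_store` is hypothetical, it covers only the HKLM\Software prefix mentioned above, and the real redirection is performed inside the kernel with additional eligibility checks that are not modelled here.

```python
# Illustrative model only; the real redirection happens in the kernel.
VIRTUALIZED_PREFIX = "HKLM\\Software"
VIRTUAL_STORE_ROOT = "HKCU\\Software\\Classes\\VirtualStore\\Machine\\Software"


def to_virtual_store(path: str) -> str:
    """Map a path under HKLM\\Software to its virtual store counterpart.

    Paths outside the virtualized prefix are returned unchanged.
    """
    prefix_len = len(VIRTUALIZED_PREFIX)
    head, tail = path[:prefix_len], path[prefix_len:]
    # Key names are case-insensitive, so compare case-insensitively, and
    # require the prefix to end on a key boundary.
    on_boundary = tail == "" or tail.startswith("\\")
    if head.lower() == VIRTUALIZED_PREFIX.lower() and on_boundary:
        return VIRTUAL_STORE_ROOT + tail
    return path
```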
An important part of registry virtualization is key replication - it is the exact process of replicating the subkey structure from a virtualizable hive (e.g. HKLM\Software) to the virtual store (HKCU\Software\Classes\VirtualStore), whenever an operation is performed that requires the virtual key to exist. The two top-level kernel routines related to this functionality are CmKeyBodyReplicateToVirtual and CmpReplicateKeyToVirtual (the latter called by the former), and they are used in the following cases:
Key replication is overall a complex procedure consisting of several logical steps:
a) Freeing the SD of the leaf key,
b) Allocating a new SD with the old settings of Owner/Group/DACL, and the new SACL part.
It is important to note that the virtual store is fully under the user's control and can be set to an adversarial state. However, the key replication code doesn't seem to take that into account, and mostly assumes that the user classes hive is in a pristine state. This may lead to several general types of issues:
When replicating the key tree, the kernel uses internal functions to allocate new keys and to attach them to existing ones - such as CmpCreateEmptyKey and CmpAddSubKeyEx. This is a very direct way to create new keys, as it bypasses the entire logic in CmpParseKey/CmpDoParseKey/CmpCreateChild etc. that is normally involved when creating keys through standard means, including bypassing various sanity checks that are implemented there.
A single path replication operation is made up of several smaller steps, and each of these steps can potentially fail (due to memory/hive space exhaustion or other conditions). It is imperative that the logic is implemented in an error-proof manner, such that any failure along the way causes the previously completed stages to be reverted, or not get applied at all until the entire process succeeds.
Programs running in the system may have open handles to keys in the virtual store being operated on by registry virtualization. This means that any changes made to such keys in the hive must also be reflected in the Key Control Block (KCB) structures corresponding to them (even if replication ends with just partial success).
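Based on the above ideas, we have identified four specific bugs/vulnerabilities, which make it possible to: create stable subkeys under volatile keys, create keys with invalid security descriptors, leave stale KCBs behind after a partially successful key replication, and leave stale KCBs behind for symbolic links and predefined keys.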
All of these problems are mostly distinct, but they share a significant amount of context, so they are reported collectively here. Each of them is discussed in a separate section below.
========== Creation of stable subkeys under volatile keys ==========
In Windows Registry, 'stable' keys are the default, persistent keys that are written to disk and accessible across reboots, while the 'volatile' ones are well... volatile, i.e. only existent in memory, and only for as long as the hive is loaded in the OS. Stable keys may have volatile subkeys, but for obvious reasons, the opposite wouldn't make sense so it is disallowed by the internal CmpCreateChild function, which returns STATUS_CHILD_MUST_BE_VOLATILE (0xC0000181) if such an attempt is made.
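As a minimal sketch, the storage-type rule described above can be modelled as follows. The function name `check_child_storage` is hypothetical and is not the kernel's code. Only the status value 0xC0000181 is taken from the text.

```python
STATUS_SUCCESS = 0x00000000
STATUS_CHILD_MUST_BE_VOLATILE = 0xC0000181  # value quoted in the report


def check_child_storage(parent_is_volatile: bool, child_is_volatile: bool) -> int:
    """Model of the rule: a volatile parent may only have volatile children."""
    if parent_is_volatile and not child_is_volatile:
        return STATUS_CHILD_MUST_BE_VOLATILE
    # Stable parents accept both stable and volatile children.
    return STATUS_SUCCESS
```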
This limitation can be bypassed because registry virtualization tries to preserve the volatility of keys being virtualized, but it doesn't check the volatility of existing keys in the virtual store. Let's assume we first create the following registry path:
HKCU\Software\Classes\VirtualStore\Machine\Software
\________________________________________/ \______/
                    |                         |
                 stable                    volatile
And then perform an operation that requires the following path to be replicated:
HKLM\Software\Microsoft
\_____________________/
           |
        stable
This will prompt the kernel to create a stable "Microsoft" subkey under "Software" in the virtual store, resulting in the following structure:
HKCU\Software\Classes\VirtualStore\Machine\Software\Microsoft
\________________________________________/ \______/ \_______/
                    |                         |         |
                 stable                    volatile  stable
This behavior breaks the canonical rule of registry key storage types, but what are the security implications here? This is less clear, as at the time of writing, we haven't found a direct way to convert this mismatch to a memory corruption primitive. The stable keys with volatile parents don't survive reboots and unloading of the hive in general, since as soon as the volatile keys are lost, there is no direct connection from the root of the hive to the dangling stable keys (even though they remain present as allocated cells in the hive file).
Furthermore, this behavior also has the potential to corrupt the linked list of security descriptors. In normal circumstances, all stable SDs (associated with stable keys) are connected in a single linked list via the _CM_KEY_SECURITY.Flink/Blink fields. On the other hand, volatile SDs are each in their own single-entry lists (Flink/Blink pointing at itself), since they are not persistent and don't have to be tracked. When a stable subkey with a unique security descriptor is created under a volatile key, the volatile SD of the parent is erroneously used to add the subkey's SD to the linked list in CmpInsertSecurityCellList. This creates a linked list of mixed stable/volatile descriptors that is disconnected from the main descriptor list (pointed to by the root cell), and which will be discarded on the next reloading of the hive - thus resetting the security of any other stable keys in the hive that might have started to share the descriptor at some point.
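The list corruption described above can be illustrated with a toy model of circular Flink/Blink lists. The class `SecurityCell` and the helper names are hypothetical stand-ins, not the kernel structures, and the model ignores reference counting and cell storage entirely.

```python
class SecurityCell:
    """Toy stand-in for _CM_KEY_SECURITY with Flink/Blink links."""

    def __init__(self, name: str, volatile: bool):
        self.name = name
        self.volatile = volatile
        # A fresh descriptor starts as a single-entry circular list.
        self.flink = self
        self.blink = self


def insert_after(anchor: SecurityCell, new: SecurityCell) -> None:
    """Link `new` into the circular list right after `anchor`."""
    new.flink = anchor.flink
    new.blink = anchor
    anchor.flink.blink = new
    anchor.flink = new


def walk(start: SecurityCell) -> list:
    """Return the names of all descriptors reachable from `start`."""
    names, cell = [start.name], start.flink
    while cell is not start:
        names.append(cell.name)
        cell = cell.flink
    return names


def build_example():
    """Build the main stable list plus the disconnected mixed list."""
    root_sd = SecurityCell("root-stable", volatile=False)
    other_sd = SecurityCell("other-stable", volatile=False)
    insert_after(root_sd, other_sd)  # the main list of stable descriptors
    parent_sd = SecurityCell("parent-volatile", volatile=True)  # self-linked
    child_sd = SecurityCell("child-stable", volatile=False)
    # The problematic step: the volatile parent's SD is used as the anchor.
    insert_after(parent_sd, child_sd)
    return root_sd, parent_sd
```

In this model, walking from the root descriptor never reaches "child-stable", which corresponds to the disconnected mixed list that is lost when the hive is reloaded.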
Overall, we believe there is some potential for this issue to lead to memory corruption either in code related to handling subkey lists or security descriptors, but we haven't investigated further. In addition to addressing the stable/volatile problem, we also recommend analyzing what other checks executed during normal key creation are skipped during key replication, and either adding them to the virtualization code, or redesigning the feature entirely to achieve key replication through the more standardized interface (NT Object Manager).
Attached is a proof-of-concept program that triggers the behavior described above. We have tested it on Windows 11 (October 2022 update). It is easiest to examine the resulting state of the registry with WinDbg attached as a kernel debugger, by using the !reg extension to confirm the stable/volatile types of the keys and the corrupted security descriptor list.
========== Creation of keys with invalid security descriptors ==========
As mentioned earlier in the report, a part of the key replication process is to replicate the SACL portion of the replicated key's security descriptor, to the descriptor of the virtual key. This entire task is performed by the CmpCopySaclToVirtualKey function, which is called as the last step in CmpReplicateKeyToVirtual (after creating the desired key structure). It does so by:
Now the problem here is that if step 3 fails, then step 2 is not reverted and the key remains with an invalid index of the security cell. And the step can indeed fail, because the new descriptor may be larger than the previous one (because of the new SACL part), so in a case where the hive is full, the allocation may fail.
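The failure mode can be sketched with a toy allocator. Everything here is a simplified model under stated assumptions: `ToyHive`, `copy_sacl_buggy` and `copy_sacl_safe` are hypothetical names, the key is a plain dictionary, and the allocate-first variant is only one possible ordering that avoids the invalid state, not necessarily how the issue was fixed.

```python
INVALID_CELL = 0xFFFFFFFF  # the "-1" security cell index


class ToyHive:
    """Tiny allocator model: a byte budget plus a table of live cells."""

    def __init__(self, capacity: int):
        self.free_bytes = capacity
        self.cells = {}
        self.next_index = 0x20

    def allocate(self, size: int):
        if size > self.free_bytes:
            return None  # the hive is full
        self.free_bytes -= size
        index = self.next_index
        self.next_index += 8
        self.cells[index] = size
        return index

    def free(self, index: int) -> None:
        self.free_bytes += self.cells.pop(index)


def copy_sacl_buggy(hive: ToyHive, key: dict, new_size: int) -> bool:
    """Free first, allocate second: a failed allocation leaves an invalid index."""
    hive.free(key["security"])
    key["security"] = INVALID_CELL
    new_index = hive.allocate(new_size)
    if new_index is None:
        return False  # nothing is rolled back
    key["security"] = new_index
    return True


def copy_sacl_safe(hive: ToyHive, key: dict, new_size: int) -> bool:
    """Allocate first, then swap: a failure leaves the key untouched."""
    new_index = hive.allocate(new_size)
    if new_index is None:
        return False
    hive.free(key["security"])
    key["security"] = new_index
    return True
```

The allocate-first ordering needs room for both descriptors at once, so in a nearly full hive it can fail where free-then-allocate would have succeeded. That is the trade-off for never leaving the key without a valid descriptor.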
What we need to trigger the issue is a key in a virtualizable hive that grants read access to regular users, has a SACL component in its security descriptor, and doesn't have the REG_KEY_DONT_VIRTUALIZE bit set. There don't seem to be many such keys in a default Windows installation, but one that meets all these requirements is:
HKLM\Software\Microsoft\Windows Advanced Threat Protection
The second part of the attack is to fill up the hive as much as possible, so that even after freeing one SD, allocating a slightly bigger one will fail because there is no space left. Since each of the stable and volatile storage types is limited to 2 GiB, this can be achieved mostly reliably and with limited CPU time and memory overhead. In our proof-of-concept, we achieve this by creating a series of values with descending lengths starting from 1 MiB, in order to pack the hive structure as tightly as possible and allocate every last free chunk. We can choose whether we want to perform the attack in the stable or volatile space. In our demonstration, we have chosen volatile because it's more consistent (we start with an empty storage and are independent of any previous state of the hive) and is in-memory only, so it doesn't generate excessive writes to disk and doesn't persistently bloat the size of the UsrClass.dat hive file.
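The descending-length filling strategy can be sketched as a greedy plan over a byte budget. This is a rough model with assumptions that are not in the report: sizes are halved at each stage, and cell headers, bin boundaries and alignment are ignored. The function name `fill_plan` is hypothetical.

```python
MIB = 1024 * 1024


def fill_plan(free_bytes: int, largest: int = MIB, smallest: int = 1):
    """Return (sizes, leftover): value sizes to allocate, largest first.

    Each size is repeated for as long as it still fits, then halved
    (the halving schedule is an assumption). `smallest` must be >= 1.
    """
    sizes = []
    size = largest
    while size >= smallest:
        count = free_bytes // size
        sizes.extend([size] * count)
        free_bytes -= count * size
        size //= 2
    return sizes, free_bytes
```

The leftover is always smaller than the smallest size tried, which is the property the attack relies on: no remaining free chunk is large enough for the slightly bigger descriptor.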
Once we trigger the bug and obtain a key with _CM_KEY_NODE.Security set to -1, the last piece of the puzzle is how to exploit it for some kind of memory corruption. One way we have found is through the very same CmpCopySaclToVirtualKey function, which references the virtual key's security cell to get its Owner/Group/DACL. So once the VirtualStore key structure and the OOM condition are set up, our exploit boils down to two consecutive RegRenameKey API calls, first to trigger the bug, and then to trigger a kernel panic.
An example crash log, generated on Windows 11 (October 2022 update), is shown below:
--- cut ---
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
SYSTEM_SERVICE_EXCEPTION (3b)
An exception happened while executing a system service routine.
Arguments:
Arg1: 00000000c0000005, Exception code that caused the bugcheck
Arg2: fffff8076207e89c, Address of the instruction which caused the bugcheck
Arg3: ffff800ce12109a0, Address of the context record for the exception that caused the bugcheck
Arg4: 0000000000000000, zero.
[...]
CONTEXT: ffff800ce12109a0 -- (.cxr 0xffff800ce12109a0)
rax=0000000000000000 rbx=00000000ffffffff rcx=0000000000000fff
rdx=ffffe580108cbfe8 rsi=ffff800ce1211420 rdi=ffffe58fe8018000
rip=fffff8076207e89c rsp=ffff800ce12113c0 rbp=ffff800ce1211461
r8=ffff800ce1211424 r9=00000000000001ff r10=ffffe58fe8018000
r11=ffff800ce1211398 r12=00000291d2145914 r13=00000000ffffffff
r14=ffffe58fe5092000 r15=0000000000000000
iopl=0 nv up ei pl nz na po nc
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00050206
nt!HvpGetCellPaged+0x7c:
fffff807`6207e89c 8b01 mov eax,dword ptr [rcx] ds:002b:00000000`00000fff=????????
Resetting default scope
PROCESS_NAME: Registry
STACK_TEXT:
ffff800c`e12113c0 fffff807`62315320 : ffffe58f`e8018000 00000000`00000000 ffffe58f`e8018000 fffff807`6207e8ca : nt!HvpGetCellPaged+0x7c
ffff800c`e12113f0 fffff807`623145bd : 00000000`00000000 00000000`00000001 00000000`00000000 ffffe58f`ea4fe464 : nt!CmpCopySaclToVirtualKey+0xd4
ffff800c`e12114c0 fffff807`62313282 : ffff800c`e12116c0 ffff800c`e1211a10 00000000`00000000 ffffe58f`f0dc0390 : nt!CmpReplicateKeyToVirtual+0x281
ffff800c`e12115c0 fffff807`6230be52 : 00000000`00000000 00000000`00000000 00000000`000000bc fffff807`621c5533 : nt!CmKeyBodyReplicateToVirtual+0x1ba
ffff800c`e1211970 fffff807`61e2d275 : ffffc484`bc89e080 00000000`00004001 ffffc484`bc89e080 00000000`00004001 : nt!NtRenameKey+0x362
ffff800c`e1211ae0 00007ffc`d1e46ad4 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x25
000000e2`1eeffaf8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffc`d1e46ad4
--- cut ---
========== Stale KCBs after partial key replication success ==========
In Windows Registry, an important part of making any changes to the hive data is to make sure that these changes are reflected in the in-memory cache, i.e. the KCB structures allocated for all currently opened keys. This also applies to the key replication process, and especially so given that it may modify a number of keys in the virtual store within the scope of a single replication operation. Currently, this synchronization is achieved through the following function call in either CmKeyBodyReplicateToVirtual or CmpDoParseKey:
CmpSearchKeyControlBlockTreeEx(CmpSyncKcbCacheForHive, VirtualStoreHive, ...);
Since replication touches multiple keys - all identified by their names/nodes and not KCBs - then instead of trying to find and update the relevant KCBs, all of them (associated with the relevant hive) are iterated and synchronized with their corresponding key nodes. In the above line of code, CmpSearchKeyControlBlockTreeEx is the function that iterates over the whole hive, and CmpSyncKcbCacheForHive is a callback routine that performs the synchronization of a particular KCB.
The vulnerability here is that the above call is made only if all of the following functions fully succeed (i.e. return STATUS_SUCCESS):
That means that if the CmpReplicateKeyToVirtual call succeeds only partially (having already made some changes to the hive), or if it succeeds fully but one of the subsequent three calls fails, CmpSearchKeyControlBlockTreeEx is not invoked and the KCBs associated with any keys modified in CmpReplicateKeyToVirtual will become inconsistent with their corresponding key nodes. This includes information about subkeys and key security, both of which are modified in CmpReplicateKeyToVirtual.
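The effect of synchronizing only on full success can be illustrated with a small model in which key nodes and their cached KCBs are plain dictionaries. The names `replicate` and `sync_kcbs` and the `fail_after` and `sync_always` parameters are hypothetical. The `sync_always=True` path only shows what an unconditional refresh would look like in this model.

```python
def sync_kcbs(nodes: dict, kcbs: dict) -> None:
    """Refresh every cached KCB from its key node."""
    for name in kcbs:
        kcbs[name] = dict(nodes[name])


def replicate(nodes, kcbs, changes, fail_after=None, sync_always=False):
    """Apply `changes` to key nodes one by one, optionally failing part-way.

    With sync_always=False the cache is refreshed only on full success,
    which mirrors the behaviour described in the report.
    """
    ok = True
    for step, (name, field, value) in enumerate(changes):
        if fail_after is not None and step >= fail_after:
            ok = False  # simulated failure after some changes were applied
            break
        nodes[name][field] = value
    if ok or sync_always:
        sync_kcbs(nodes, kcbs)
    return ok
```

After a partial failure with `sync_always=False`, the cached copy of a modified key still holds the old value, which is the stale-KCB state discussed in this section.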
One example (but not the only one) of how such a situation can arise is the bug discussed in the previous section ("Creation of keys with invalid security descriptors"). If we trigger a hive OOM condition that results in the freeing of the leaf key's security descriptor without assigning a new one, not only does _CM_KEY_NODE.Security become -1, but CmpReplicateKeyToVirtual also returns STATUS_INSUFFICIENT_RESOURCES, so CmpSearchKeyControlBlockTreeEx is never called to update the KCBs. If we held an open handle to the leaf key during the replication, and its previous security descriptor was unique (and thus freed), then the _CM_KEY_CONTROL_BLOCK.CachedSecurity pointer is left pointing at a freed pool allocation. The use of the dangling pointer can be triggered in a variety of ways, e.g. by querying the key's descriptor via RegGetKeySecurity.
The proof-of-concept for this issue is very similar to the previous one, with the only two differences being:
The bug is easiest to reproduce with Special Pools enabled for ntoskrnl.exe. An example crash log, generated on Windows 11 (October 2022 update), is shown below:
--- cut ---
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced. This cannot be protected by try-except.
Typically the address is just plain bad or it is pointing at freed memory.
Arguments:
Arg1: ffffd509a1326f90, memory referenced.
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation.
Arg3: fffff8042b8d1efd, If non-zero, the instruction address which referenced the bad memory
address.
Arg4: 0000000000000000, (reserved)
[...]
TRAP_FRAME: ffffa603a31bf5f0 -- (.trap 0xffffa603a31bf5f0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=ffffd509a1326f90 rbx=0000000000000000 rcx=0000000000000000
rdx=000002b473348ea0 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8042b8d1efd rsp=ffffa603a31bf780 rbp=ffffa603a31bf880
r8=000002b473348ea0 r9=ffffa603a31bf8d0 r10=ffffa603a31bfa88
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei ng nz na po nc
nt!SeQuerySecurityDescriptorInfo+0x4d:
fffff804`2b8d1efd 0f1000 movups xmm0,xmmword ptr [rax] ds:ffffd509`a1326f90=????????????????????????????????
Resetting default scope
STACK_TEXT:
ffffa603`a31beb38 fffff804`2b768ee2 : ffffa603`a31beca0 fffff804`2b5665c0 ffffc281`4c4c6180 00000000`00000000 : nt!DbgBreakPointWithStatus
ffffa603`a31beb40 fffff804`2b768721 : ffffc281`00000003 ffffa603`a31beca0 fffff804`2b636900 00000000`00000050 : nt!KiBugCheckDebugBreak+0x12
ffffa603`a31beba0 fffff804`2b620d47 : 00000000`00000000 00000000`00000000 0000bb58`00000010 ffffd509`a1326f90 : nt!KeBugCheck2+0xa71
ffffa603`a31bf310 fffff804`2b6a351d : 00000000`00000050 ffffd509`a1326f90 00000000`00000000 ffffa603`a31bf5f0 : nt!KeBugCheckEx+0x107
ffffa603`a31bf350 fffff804`2b46af96 : ffffa603`00000000 00000000`00000000 ffffa603`a31bf550 00000000`00000000 : nt!MiSystemFault+0x1c117d
ffffa603`a31bf450 fffff804`2b62f8f5 : 00000002`1f7cc025 00000000`00000000 ffffa603`a31bfa78 fffff804`2b43fe5a : nt!MmAccessFault+0x2a6
ffffa603`a31bf5f0 fffff804`2b8d1efd : 00000000`00000000 00000000`00000000 00000000`00000000 fffff804`2b8a6bbb : nt!KiPageFault+0x335
ffffa603`a31bf780 fffff804`2b8d1e26 : ffffa603`a31bfa78 000002b4`73348ea0 ffffa603`a31bfa88 fffff804`2b47047f : nt!SeQuerySecurityDescriptorInfo+0x4d
ffffa603`a31bf840 fffff804`2b8d1c66 : ffffd509`992ccf58 ffffa603`a31bfa78 000002b4`73348e01 00000000`00000001 : nt!CmpQueryKeySecurity+0xc2
ffffa603`a31bf8b0 fffff804`2b987894 : 00000000`ffffffff 00000000`000000d8 00000000`000a0008 00007ff7`f978f810 : nt!CmpSecurityMethod+0x146
ffffa603`a31bf9e0 fffff804`2b633275 : ffffe50c`eb2b4600 00000000`00000004 000000ea`33aff9e8 00000000`00000000 : nt!NtQuerySecurityObject+0x144
ffffa603`a31bfa70 00007ffb`fbe267b4 : 00007ffb`f957c55e 00000000`00000000 00000000`000000b4 00000000`00020019 : nt!KiSystemServiceCopyEnd+0x25
000000ea`33aff9c8 00007ffb`f957c55e : 00000000`00000000 00000000`000000b4 00000000`00020019 000000ea`33af0000 : ntdll!NtQuerySecurityObject+0x14
--- cut ---
========== Stale KCBs of symbolic links and predefined keys ==========
The final vulnerability described in this report is also related to KCB synchronization. If CmpSearchKeyControlBlockTreeEx(CmpSyncKcbCacheForHive) is called as it should be, the control flow goes through the following functions: CmpSyncKcbCacheForHive, CmpRebuildKcbCache and CmpRebuildKcbCacheFromNode.
The actual KCB synchronization takes place in CmpRebuildKcbCacheFromNode and the lower-level functions, while CmpSyncKcbCacheForHive and CmpRebuildKcbCache are thin wrappers that only check for a few special conditions and bail out early if needed. Two of the conditions being checked are whether the key is a symbolic link or a predefined key.
If either condition is true, the KCB is not refreshed for that key. This is a problem, because both symlinks and predefined keys can have subkeys and do have security descriptors that can be operated on, so that information must be kept in sync with the hive for those special types of keys as well.
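The early bail-out can be modelled as below. This is a simplified sketch: `sync_kcb`, the dictionary fields and the `skip_special` switch are hypothetical, and the real wrappers check further conditions that are not modelled here.

```python
def sync_kcb(node: dict, kcb: dict, skip_special: bool = True) -> bool:
    """Refresh one KCB from its key node; return True if it was refreshed."""
    if skip_special and (node["is_symlink"] or node["is_predefined"]):
        return False  # early bail-out: the cached data is left as-is
    kcb["security"] = node["security"]
    kcb["subkey_count"] = node["subkey_count"]
    return True
```

With `skip_special=True`, a symbolic link whose security descriptor was just replaced keeps its old cached value, while `skip_special=False` shows the refreshed state for comparison.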
This bug is even easier to trigger than the previous one: instead of spraying the hive to achieve an OOM condition, we create the virtual store leaf key as a symbolic link (flag REG_OPTION_CREATE_LINK for RegCreateKeyExW). Then, once we try to rename the HKLM key under registry virtualization and the replication process is triggered, the security descriptor of the leaf key is replaced with a new one, CmpReplicateKeyToVirtual succeeds and CmpSearchKeyControlBlockTreeEx is called. However, due to the logic in CmpSyncKcbCacheForHive, the KCB refresh is omitted for the leaf key, again leading to a stale pointer in _CM_KEY_CONTROL_BLOCK.CachedSecurity and a system crash when it is subsequently accessed in SeQuerySecurityDescriptorInfo via RegGetKeySecurity.
Attached is a proof-of-concept exploit for this issue that has been successfully tested on Windows 11 (October 2022 update). The observable kernel bugcheck is identical to the one in the previous section, so we won't re-paste it here.
This bug is subject to a 90-day disclosure deadline. If a fix for this issue is made available to users before the end of the 90-day deadline, this bug report will become public 30 days after the fix was made available. Otherwise, this bug report will become public at the deadline. The scheduled deadline is 2023-01-23.