Li, Shaohu and Wang, Yubo and Zhou, Jin and Li, Xinxin and Meng, Weizhi and Gong, Bei and Wang, Yong (2026) ROMA: Enhancing Container OOM Resilience via Reinforced Isolation and Adaptive Shared Resource Reclamation. IEEE Transactions on Services Computing. ISSN 1939-1374
Full text not available from this repository.

Abstract
Container-based virtualization is a cornerstone of modern cloud orchestration, but the shared-kernel architecture also introduces subtle risks to memory isolation. Our study shows that Linux cgroups and the default Out-of-Memory (OOM) mechanism lack sufficient container context when selecting victim processes. As a result, a malicious container may disrupt critical co-located services and leave behind unreclaimed shared resources, such as POSIX/SysV shared memory, message queues, semaphores, and tmpfs files. These residual resources can accumulate over time and eventually lead to denial-of-service conditions. To address this problem, we propose ROMA, an adaptive memory-governance framework for containerized environments. ROMA introduces container awareness into the OOM handling path while maintaining low runtime overhead. It combines eBPF-based monitoring with two lightweight LSM hooks to confine OOM victim selection to the offending container and to proactively reclaim shared resources left behind after OOM events. Extensive experiments show that ROMA incurs only a 6.94% throughput overhead across eight workloads. Under up to eight concurrent attackers, ROMA preserves isolation, avoids collateral kills, reclaims all leaked resources, and keeps recovery time within 6.9 seconds. In 24-hour runs with up to 64 containers, ROMA remains stable with low CPU and memory overhead, negligible event loss, and limited impact on benign services.
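
The idea of confining OOM victim selection to the offending container can be illustrated with a small toy model. This is only a sketch and not ROMA's implementation: the abstract does not describe the selection logic in detail, so the `Proc` record, the `container` label, the use of resident memory as a stand-in badness score, and both function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Proc:
    pid: int
    container: str  # hypothetical container identifier (assumption)
    rss_kib: int    # resident memory, used here as a stand-in badness score


def pick_victim_global(procs: Iterable[Proc]) -> Optional[Proc]:
    """Toy container-unaware baseline: the largest process on the host."""
    return max(procs, key=lambda p: p.rss_kib, default=None)


def pick_victim_confined(procs: Iterable[Proc], offending: str) -> Optional[Proc]:
    """Toy container-aware variant: only the offending container's processes qualify."""
    candidates = [p for p in procs if p.container == offending]
    return max(candidates, key=lambda p: p.rss_kib, default=None)


# Hypothetical host state: one benign database and two attacker processes.
PROCS = [
    Proc(pid=101, container="db", rss_kib=900_000),
    Proc(pid=202, container="attacker", rss_kib=300_000),
    Proc(pid=203, container="attacker", rss_kib=450_000),
]
```

In this toy model the container-unaware baseline selects the benign database process because it is the largest, which is a collateral kill, whereas the confined variant can only ever select a process from the container named as offending.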