Improving Network and Middlebox Resilience with Virtualisation

Hill, Lyn and Rotsos, Charalampos and Edwards, Christopher and Hutchison, David (2025) Improving Network and Middlebox Resilience with Virtualisation. PhD thesis, Lancaster University.

Text (2025HillPhD)
2025HillPhD.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Modern networks strive to balance performance and resilience in their design and operation, the former to maintain a competitive edge and the latter to ensure continued service during periods of disruption. These goals are not diametrically opposed but are difficult to satisfy simultaneously, a problem made harder by the use of high-performance hardware solutions known as "middleboxes". Middleboxes limit the applicability and effectiveness of established resilient design practices in the networks where they are deployed, especially with regard to the preservation of state. State is the contents of memory retained by hardware to aid its operation, and its loss causes observable disruption to end users. Middleboxes are popular with network operators for their high performance and ease of use in enacting network policy, but their blackbox design and widespread use have created a distinct vulnerability to disruption. Prior research in this domain has proposed replacing them with network virtualisation/softwarisation, both to enable greater network elasticity and to allow for more complex resilience techniques. These proposals have seen limited adoption because industry prioritises performance scalability over resilience in the name of competitiveness and guaranteeing SLAs; hardware middleboxes are orders of magnitude faster than current virtualisation solutions and are unlikely to be replaced in the near future. The popularity of software-defined networking (SDN) and network function virtualisation (NFV) will continue to rise in industry, but certain network applications will still require hardware solutions that cannot be replaced through virtualisation. This thesis takes the position that SDN and NFV can instead be used to enhance the resilience of this existing infrastructure rather than to replace it, so that the flexibility of software can be exploited without sacrificing the performance of hardware.
These blackbox middleboxes represent a key issue for research: if internal state cannot be observed or extracted, it must be captured or recreated externally through novel means that are sufficiently quick, accurate and reliable for real-world use. To address this problem, this thesis presents Remediate (REsilient MiddlEbox Defence Infrastructure ARchiTEcture), a state recovery framework that explores multiple approaches to preserving state for middlebox devices with differing degrees of accessibility. This proof-of-concept implementation is divided into two major publications, "Katoptron" and "Middlebox Minions" (MiMi), each of which explores different techniques for recreating or transferring state. The first contribution, Katoptron, targets blackbox hardware by recreating state using traffic filtering and packet sampling. The second contribution, MiMi, targets whitebox and greybox software using inserted drivers and log interpretation respectively. Remediate incorporates these two contributions as its mechanisms for enabling stateful failover in multiple kinds of middleboxes, distributing state in a platform-agnostic and scalable manner using message streaming and datastores. Overall, this framework allows state to be recreated and retained across failovers for both hardware and software, in any combination or direction. This is demonstrated by its viability across multiple popular stateful mechanisms for networking and security, and by a 95% reduction in the traffic needed to ensure accurate failover and provide continuation of service without visible disruption.

Item Type:
Thesis (PhD)
ID Code:
227647
Deposited On:
20 Feb 2025 09:40
Refereed?:
No
Published?:
Published
Last Modified:
13 Mar 2025 00:47