Hardware failures occur more often than is commonly assumed. Environmental hazards such as electrical fires and liquid spills are easy to detect and measure. Programming errors and defective hardware components (such as hard disk spindle defects) often lead to invalid operations, and their causes and mechanisms are well understood. However, less predictable forms of environmental stress, such as radiation, thermal influence, and energy fluctuations, can also induce hardware faults. We explore what happens when a soft error, such as a word modified by a bit flip, still appears valid to the operating system. We propose a priority-based scheduling solution that detects these faults at the software level with minimal overhead.
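To make the notion of a soft error that "remains valid" concrete, the following minimal sketch simulates a single bit flip in a 32-bit word. It is an illustration only and not the scheduling method proposed in the paper; the helper names `flip_bit` and `parity`, the 32-bit width, and the sample value are assumptions chosen for the example.

```python
def flip_bit(word: int, bit: int, width: int = 32) -> int:
    """Return `word` with one bit inverted, kept within `width` bits.

    Hypothetical helper that models a single-event upset in memory.
    """
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    mask = (1 << width) - 1
    return (word ^ (1 << bit)) & mask


def parity(word: int) -> int:
    """Even-parity bit: 0 if the number of set bits is even, else 1."""
    return bin(word).count("1") % 2


original = 1000                      # e.g. a counter or buffer length
corrupted = flip_bit(original, 3)    # one bit inverted by a simulated fault

# The corrupted word is still a well-formed unsigned integer, so nothing
# about its type or range signals an error on its own.
still_in_range = 0 <= corrupted < 2**32

# A redundancy check stored alongside the word (here, a parity bit) is one
# simple way to notice that a single bit has changed.
fault_detected = parity(original) != parity(corrupted)
```

In this sketch the corrupted value is a different but perfectly legal integer, which is why such faults can pass unnoticed unless software adds some form of redundancy or re-checking.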

Authors: Alon Hillel-Tuch, Aspen Olmsted

Published in: International Conference for Internet Technology and Secured Transactions (ICITST-2021)

  • Date of Conference: 7-9 December 2021
  • DOI: 10.20533/ICITST.2021.0022
  • ISBN: 978-1-913572-39-6
  • Conference Location: Virtual (London, UK)