    Operating System Support to Detect Application Hangs

    Second International Workshop on Verification and Evaluation of Computer and Communication Systems (VECoS 2008)

    Leeds, UK, 2 - 3 July 2008

    AUTHORS

    Gabriella Carrozza, Marcello Cinque, Domenico Cotroneo and Roberto Natella

    ABSTRACT

    On-line failure detection is an essential means to control and assess the dependability of complex and critical software systems. In such a context, effective detection strategies are required in order to minimize the possibility of catastrophic consequences. This objective is, however, difficult to achieve in complex systems, especially due to the several sources of non-determinism (e.g., multi-threading and distributed interaction) which may lead to software hangs, i.e., the system is active but no longer capable of delivering its services. The paper proposes a detection approach to uncover application hangs. It exploits multiple indirect data gathered at the operating system level to monitor the system and to trigger alarms if the observed behavior deviates from the expected one. By means of fault injection experiments conducted on a research prototype, it is shown how the combination of several operating system monitors actually leads to a high quality of detection, at an acceptable overhead.

    PAPER FORMATS

    PDF Version of this Paper (345kb)