Doctoral thesis

Predicting failures in complex multi-tier systems


77 p

Doctoral thesis: Università della Svizzera italiana, 2020

Complex multi-tier systems are composed of many distributed machines, feature multi-layer architectures, and offer different types of services. Shared complex multi-tier systems, such as cloud systems, reduce costs and improve resource utilization efficiency, but they introduce a considerable amount of complexity and dynamics that challenge the reliability of the system. These new challenges motivate a new holistic self-healing approach, which must be accurate, lightweight, and proactive, to ensure reliable cloud applications. Self-healing techniques work at runtime, and thus offer automatic and flexible ways to increase reliability by detecting errors, diagnosing them, and either fixing them or mitigating their effects. Self-healing systems leverage the time between the activation of a fault and the resulting failure by taking actions to avoid the failure: they shall predict failures, localize the faults, and fix or mask them before the failure occurs. In my Ph.D., I focused on predicting failures and localizing faults. In this thesis I present an approach, DyFAULT, that predicts failures by detecting anomalous system states early enough to diagnose the causing errors and fix them before the failure occurs, and that localizes faults by leveraging the collected data to pinpoint the location of the error and possibly the type of the fault. The contributions of my Ph.D. work include: (i) an approach to accurately predict failures and localize faults that requires training with fault seeding; (ii) an approach to predict failures and localize faults that requires training with data from normal execution only; (iii) a prototype implementation of the two approaches; and (iv) a set of experimental results that evaluate the proposed approaches of DyFAULT.
  • English
Computer science and technology
