Dependable Systems Research Group

Updated 19-Sep-2005


Willow Survivability Architecture


Introduction

The Willow system is designed to support the survivability of large distributed information systems. As part of its approach, Willow deals broadly with faults in such systems, applying:

  • fault avoidance by disabling vulnerable network elements intentionally when a threat is detected or predicted,
  • fault elimination by replacing system software elements when faults are discovered, and
  • fault tolerance by reconfiguring the system if non-maskable damage occurs.

The reactive component of Willow supplements the usual information system fabric with a comprehensive fault-tolerance mechanism referred to as a survivability architecture or information survivability control system (paper). The key to the architecture is a powerful reconfiguration mechanism combined with a general control loop structure in which network state is sensed and analyzed, and the required changes are then effected.
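As an illustration only, the control loop described above can be sketched in Java as a single sense, analyze, and respond pass. Every name here (ControlLoopSketch, NodeState, Action, step) is hypothetical and is not taken from the Willow software; the mapping from sensed state to response simply mirrors the three strategies listed in the introduction.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a sense-analyze-respond control loop.
// None of these names come from Willow itself.
class ControlLoopSketch {

    // Assumed node conditions that a sensor might report.
    enum NodeState { HEALTHY, THREATENED, FAULTY, DAMAGED }

    // Assumed responses, one per strategy listed in the introduction.
    enum Action { NONE, DISABLE_ELEMENT, REPLACE_SOFTWARE, RECONFIGURE }

    // Analyze: decide which change a sensed state calls for.
    static Action analyze(NodeState state) {
        switch (state) {
            case THREATENED: return Action.DISABLE_ELEMENT;   // fault avoidance
            case FAULTY:     return Action.REPLACE_SOFTWARE;  // fault elimination
            case DAMAGED:    return Action.RECONFIGURE;       // fault tolerance
            default:         return Action.NONE;
        }
    }

    // One pass of the loop: take the sensed state of each node,
    // analyze it, and return the changes that should be effected.
    static Map<String, Action> step(Map<String, NodeState> sensed) {
        Map<String, Action> changes = new TreeMap<>();
        for (Map.Entry<String, NodeState> entry : sensed.entrySet()) {
            Action action = analyze(entry.getValue());
            if (action != Action.NONE) {
                changes.put(entry.getKey(), action);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, NodeState> sensed = new TreeMap<>();
        sensed.put("node-a", NodeState.HEALTHY);
        sensed.put("node-b", NodeState.THREATENED);
        sensed.put("node-c", NodeState.DAMAGED);
        System.out.println(step(sensed));
    }
}
```

A real deployment would run such a pass continuously and hierarchically over very many nodes; this sketch shows only the shape of one iteration.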

The main challenges Willow overcomes are those of scalability and complexity. The information systems of interest (1) are composed of complex arrangements of very large numbers of computing nodes (hundreds of thousands to millions), (2) use extensive communication facilities, and (3) operate across many domains of ownership and responsibility. As a result, the damage to such systems may be complex and widespread, and it in turn necessitates extensive, complex error recovery. Our implementation strategy explicitly deals with these issues of scalability and complexity in both detection and response.

Willow is based on declarative specification of large-scale fault-tolerance programming. These specifications of fault detection and reaction are used directly to generate a specific instantiation of Willow. Willow provides demonstrably efficient, automated fault tolerance over large and complex systems with an innovative distributed architecture. The Willow system is fully implemented, allowing us to investigate its application and efficiency with respect to survivability problems.
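To make the idea of a declarative specification concrete, here is a minimal Java sketch in which detection and reaction are written as data and a small generic routine interprets them. It is not Willow's specification language or generator: the Rule class, the event names, and the reaction names are all invented for illustration, and the assumption that a rule fires when every one of its required events has been observed is ours.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: detection and reaction declared as data.
// Rule, the event names, and the reaction names are invented.
class DeclarativeSpecSketch {

    // One declarative rule: if all required events were observed,
    // the named reaction is called for.
    static final class Rule {
        final String reaction;
        final Set<String> requiredEvents;

        Rule(String reaction, String... requiredEvents) {
            this.reaction = reaction;
            this.requiredEvents = new HashSet<>(Arrays.asList(requiredEvents));
        }
    }

    // The specification itself: a list of rules, with no control flow.
    static final List<Rule> SPEC = Arrays.asList(
        new Rule("reroute-traffic", "link-down:east", "link-down:west"),
        new Rule("isolate-subnet", "intrusion-alert", "link-down:east"));

    // Generic interpreter: return the reactions of every rule whose
    // required events are all present in the observed set.
    static List<String> reactionsFor(Set<String> observed) {
        List<String> reactions = new ArrayList<>();
        for (Rule rule : SPEC) {
            if (observed.containsAll(rule.requiredEvents)) {
                reactions.add(rule.reaction);
            }
        }
        return reactions;
    }

    public static void main(String[] args) {
        Set<String> observed = new HashSet<>(
            Arrays.asList("intrusion-alert", "link-down:east"));
        System.out.println(reactionsFor(observed));
    }
}
```

The point of the design is that changing the fault-tolerance policy means editing SPEC, not the interpreting code.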

The major components of Willow are:

  • Siena - a content-based networking infrastructure that allows scalable and efficient event notification services
  • Selective Notification - a unified notification paradigm where notifications can be targeted based on formally specified sender qualifications, notification contents, or receiver attributes
  • Laminarflow - a workflow system and language that allows for both process and constraint tasks with enumerated, bounded-time conflict resolution via reasons and intentions
  • ANDREA - a command system that uses Selective Notification to project workflows to appropriate receivers and provide scalable feedback
  • SPARTAN - a hierarchical, adaptive control system to detect and respond to network sensor events
  • TEDL - an object-oriented specification language that uses sets of events to detect abstract events that require response
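The three targeting dimensions named for Selective Notification (sender qualifications, notification contents, and receiver attributes) can be illustrated with a small Java sketch. This is our own reading, not the Selective Notification or Siena API: we assume that the sender supplies a filter over receiver attributes, that each receiver supplies filters over sender attributes and over content, and that a notification is delivered only when all three match. All class, method, and attribute names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of three-way notification targeting.
// Not the Selective Notification or Siena API.
class SelectiveNotificationSketch {

    // Helper: build an attribute map from alternating keys and values.
    static Map<String, String> attrs(String... keyValues) {
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            map.put(keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    static final class Notification {
        final Map<String, String> sender;   // attributes describing the sender
        final Map<String, String> content;  // attributes of the message itself
        final Predicate<Map<String, String>> receiverFilter; // sender's choice of receivers

        Notification(Map<String, String> sender, Map<String, String> content,
                     Predicate<Map<String, String>> receiverFilter) {
            this.sender = sender;
            this.content = content;
            this.receiverFilter = receiverFilter;
        }
    }

    static final class Receiver {
        final String name;
        final Map<String, String> attributes;
        final Predicate<Map<String, String>> senderFilter;  // which senders it accepts
        final Predicate<Map<String, String>> contentFilter; // which contents it wants

        Receiver(String name, Map<String, String> attributes,
                 Predicate<Map<String, String>> senderFilter,
                 Predicate<Map<String, String>> contentFilter) {
            this.name = name;
            this.attributes = attributes;
            this.senderFilter = senderFilter;
            this.contentFilter = contentFilter;
        }
    }

    // Deliver only where all three conditions hold at once.
    static List<String> deliver(Notification n, List<Receiver> receivers) {
        List<String> delivered = new ArrayList<>();
        for (Receiver r : receivers) {
            boolean matches = n.receiverFilter.test(r.attributes)
                && r.senderFilter.test(n.sender)
                && r.contentFilter.test(n.content);
            if (matches) {
                delivered.add(r.name);
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        Predicate<Map<String, String>> fromOperator = s -> "operator".equals(s.get("role"));
        Predicate<Map<String, String>> fromAdmin = s -> "admin".equals(s.get("role"));
        Predicate<Map<String, String>> alerts = c -> "alert".equals(c.get("type"));

        List<Receiver> receivers = new ArrayList<>();
        receivers.add(new Receiver("east-router", attrs("region", "east"), fromOperator, alerts));
        receivers.add(new Receiver("west-router", attrs("region", "west"), fromOperator, alerts));
        receivers.add(new Receiver("east-vault", attrs("region", "east"), fromAdmin, alerts));

        Notification n = new Notification(
            attrs("role", "operator"),
            attrs("type", "alert"),
            a -> "east".equals(a.get("region")));
        System.out.println(deliver(n, receivers));
    }
}
```

This sketch tests every receiver against every notification, which would not scale to the system sizes discussed above; it is meant only to show how the three kinds of addressing combine.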

We are also applying the paradigms of Willow to the domain of emergency response in the STILT project.

Selected Papers

  • Rowanhill, Jonathan C., Philip E. Varner and John C. Knight.
    Efficient Hierarchic Management For Reconfiguration of Networked Information Systems
    Accepted to DSN 2004 (PDF)
  • Hill, Jonathan C., John C. Knight
    Selective Notification: Combining Forms of Decoupled Addressing for Internet-Scale Command and Alert Dissemination
    Technical Report CS-2003-14, University of Virginia, Department of Computer Science (April 2003) (PDF)
  • Varner, Philip E.
    Policy Specification for Non-Local Fault Tolerance in Large Distributed Information Systems
    M.S. Thesis, May 2003 (PDF)
  • Knight, John C., Dennis Heimbigner, Alexander Wolf, Antonio Carzaniga, Jonathan Hill, Premkumar Devanbu, Michael Gertz
    The Willow Architecture: Comprehensive Survivability for Large-Scale Distributed Applications
    Submitted to DSN-2002, the International Conference on Dependable Systems and Networks, Washington DC (June 2002) (PDF)

Software

Willow is fully implemented in Java as a distributed architecture. If you are interested in the software, please contact us.

Related Projects


University of Virginia Computer Science © 2003-2005 Dependability Research Group