Fault-Tolerant Systems, 2nd Edition
- Length: 416 pages
- Edition: 2
- Language: English
- Publisher: Morgan Kaufmann
- Publication Date: 2020-10-15
- ISBN-10: 0128181052
- ISBN-13: 9780128181058
Fault-Tolerant Systems, Second Edition, is the first book on fault tolerance design utilizing a systems approach to both hardware and software. No other text takes this approach or offers the comprehensive and up-to-date treatment that Koren and Krishna provide. The book covers the design of fault-tolerant hardware and software, the use of fault-tolerance techniques to improve manufacturing yields, and the design and analysis of networks. It incorporates case studies that highlight more than ten different computer systems with fault-tolerance techniques implemented in their design, and it includes critical material on methods to protect against threats to encryption subsystems used for security purposes.
The text’s updated content will help students and practitioners in electrical and computer engineering and computer science learn how to design reliable computing systems, and how to analyze fault-tolerant computing systems.
Front matter: Cover image, Title page, Table of Contents, Copyright, Preface to the Second Edition, Acknowledgments
Chapter 1: Preliminaries
- 1.1. Fault Classification
- 1.2. Types of Redundancy
- 1.3. Basic Measures of Fault Tolerance
- 1.4. Outline of This Book
- 1.5. Further Reading
- References
Chapter 2: Hardware Fault Tolerance
- 2.1. The Rate of Hardware Failure
- 2.2. Failure Rate, Reliability, and Mean Time to Failure
- 2.3. Hardware Failure Mechanisms
- 2.4. Common-Mode Failures
- 2.5. Canonical and Resilient Structures
- 2.6. Other Reliability Evaluation Techniques
- 2.7. Fault-Tolerance Processor-Level Techniques
- 2.8. Timing Fault Tolerance
- 2.9. Tolerance of Byzantine Failures
- 2.10. Further Reading
- 2.11. Exercises
- References
Chapter 3: Information Redundancy
- 3.1. Coding
- 3.2. Resilient Disk Systems
- 3.3. Data Replication
- 3.4. Algorithm-Based Fault Tolerance
- 3.5. Further Reading
- 3.6. Exercises
- References
Chapter 4: Fault-Tolerant Networks
- 4.1. Measures of Resilience
- 4.2. Common Network Topologies and Their Resilience
- 4.3. Fault-Tolerant Routing
- 4.4. Networks on a Chip
- 4.5. Wireless Sensor Networks
- 4.6. Further Reading
- 4.7. Exercises
- References
Chapter 5: Software Fault Tolerance
- 5.1. Acceptance Tests
- 5.2. Single-Version Fault Tolerance
- 5.3. N-Version Programming
- 5.4. Recovery Block Approach
- 5.5. Preconditions, Postconditions, and Assertions
- 5.6. Exception Handling
- 5.7. Software Reliability Models
- 5.8. Fault-Tolerant Remote Procedure Calls
- 5.9. Further Reading
- 5.10. Exercises
- References
Chapter 6: Checkpointing
- 6.1. What Is Checkpointing?
- 6.2. Checkpoint Level
- 6.3. Optimal Checkpointing: an Analytical Model
- 6.4. Cache-Aided Rollback Error Recovery (CARER)
- 6.5. Checkpointing in Distributed Systems
- 6.6. Checkpointing in Shared-Memory Systems
- 6.7. Checkpointing in Real-Time Systems
- 6.8. Checkpointing While Using Cloud Computing Utilities
- 6.9. Emerging Challenges: Petascale and Exascale Computing
- 6.10. Other Uses of Checkpointing
- 6.11. Further Reading
- 6.12. Exercises
- References
Chapter 7: Cyber-Physical Systems
- 7.1. Structure of a Cyber-Physical System
- 7.2. The Controlled Plant State Space
- 7.3. Sensors
- 7.4. The Cyber Platform
- 7.5. Actuators
- 7.6. Further Reading
- 7.7. Exercises
- References
Chapter 8: Case Studies
- 8.1. Aerospace Systems
- 8.2. NonStop Systems
- 8.3. Stratus Systems
- 8.4. Cassini Command and Data Subsystem
- 8.5. IBM POWER8
- 8.6. IBM G5
- 8.7. IBM Sysplex
- 8.8. Intel Servers
- 8.9. Oracle SPARC M8 Server
- 8.10. Cloud Computing
- 8.11. Further Reading
- References
Chapter 9: Simulation Techniques
- 9.1. Writing a Simulation Program
- 9.2. Parameter Estimation
- 9.3. Variance Reduction Methods
- 9.4. Splitting
- 9.5. Random Number Generation
- 9.6. Fault Injection
- 9.7. Further Reading
- 9.8. Exercises
- References
Chapter 10: Defect Tolerance in VLSI Circuits
- 10.1. Manufacturing Defects and Circuit Faults
- 10.2. Probability of Failure and Critical Area
- 10.3. Basic Yield Models
- 10.4. Yield Enhancement Through Redundancy
- 10.5. Further Reading
- 10.6. Exercises
- References
Chapter 11: Fault Detection in Cryptographic Systems
- Abstract
- 11.1. Overview of Ciphers
- 11.2. Security Attacks Through Fault Injection
- 11.3. Countermeasures
- 11.4. Further Reading
- 11.5. Exercises
- References
Index