Modern System Administration: Building and Maintaining Reliable Systems
As development has accelerated over the past 20 years, the role of operations has changed drastically. How does any one individual stay relevant as systems and services grow more complex? This practical guide helps anyone in operations (sysadmins, automation engineers, IT professionals, and site reliability engineers) understand the essential concepts of the role today.
The collaboration, automation, and evolution of the systems in use have changed how we do operations work. Author Jennifer Davis, senior cloud advocate at Microsoft, provides examples to help you advance your current skills toward modern practices. You’ll understand the operations path from version control to production and identify areas of work where you need to upgrade your skills.
- Development and testing: Version control, fundamentals of virtualization and containers, testing, and architecture review
- Deploying and configuring services: Infrastructure management, networks, security, storage, serverless, and release management
- Scaling administration: Monitoring and observability, capacity planning, log management and analysis, and security and compliance
Table of Contents

I. Foundations
- 1. Introduction: Principles; Modernization of Compute, Network and Storage; Compute; Network; Storage; Infrastructure Management; Scaling Production Readiness; A Role by any Other Name; DevOps; Site Reliability Engineering (SRE); How do Devops and SRE Differ?; System Administrator; Finding Your Next Opportunity
- 2. Infrastructure Strategy: Understanding Infrastructure Lifecycle; Lifecycle of Physical Hardware; Lifecycle of Cloud Services; Challenges to Planning Infrastructure Strategy; Infrastructure Stacks; Infrastructure as Code; Wrapping Up

II. Principles
- 3. Version Control: Fundamentals of Git; Branching; Working with Remote Git Repositories; Resolving Conflicts; Fixing Your Local Repository; Advancing Collaboration with Version Control; Wrapping Up
- 4. Local Development Environments: Choosing an Editor; Minimizing required mouse usage; Integrated Static Code Analysis; Easing editing through auto completion; Indenting code to match team conventions; Collaborating while editing; Integrating workflow with git; Extending the development environment; Selecting Languages to Install; Installing and Configuring Applications; Wrapping Up
- 5. Testing: Why should Sysadmins Write Tests?; Differentiating the Types of Testing; Linting; Unit Tests; Integration Tests; End-to-End Tests; Examining the Shape of Testing Strategy; Existing Sysadmin Testing Activities; When Tests Fail; Environment Problem; Flawed Test Logic; Assumptions Changed; Code Defects; Failures in Test Strategy; Flaky Tests; Wrapping Up
- 6. Security: Collaboration in Security; Borrow the Attacker Lens; Design for Security Operability; Qualifying Issues; Wrapping Up

III. Principles in Practice
- 7. Infracode: Building Machine Images; Building with Packer; Building With Docker; Provisioning Infrastructure Resources; Provisioning with Terraform; Configuring Infrastructure Resources; Configuring with Chef; Getting Started with Infracode; Wrapping Up
- 8. Testing in Practice: Writing Unit Tests for Infracode; Writing Unit Tests with Chefspec; Writing Unit Tests for Datadog Install Recipe; Writing Integration Tests for Infracode; Writing Integration Tests for Datadog Install Recipe; Linting Chef Code with Rubocop and Foodcritic; Wrapping Up
- 9. Security and Infracode: Managing Identity and Access; How should you control access to your system?; Who should have access to your system?; Managing Secrets; Password Managers and Secret Management Software; Defending Secrets and Monitoring Usage; Securing Compute Infrastructure; Managing Networking; Recommendations for your Security Infracode

IV. Scaling Production Readiness
- 10. Monitoring Theory: Why Monitor?; How Monitoring and Observability Differ?; Monitoring Building Blocks; Events; Monitors; Data: Metrics, Logs, and Tracing; What does Monitoring look like?; Event Detection; Data Collection; Data Reduction; Data Analysis; Data Presentation; Monitoring for Sustainable Work
- 11. Presenting Information: Know your audience; Choosing your channel; Choose your story type; Presenting Data in Action; Charts are Worth A Thousand Words; Telling the Same Story With a Different Audience; The Key Takeaway; Know your visuals; Visual Cues; Chart types; Recommended Visualization Practices
- 12. Building Resilient On-Call Teams: What Is On-Call?; Adding Developers to the On-call Rotation; Updating On-Call Processes; Monitor the On-Call Experience; Adopt a Whole Team(s) Approach; On-Call in Practice; Preparing for On-Call; Wrapping Up
- 13. Managing Incidents: What is an Incident?; What is Incident Management?; Roles and Responsibilities; Pre-emptive Planning; Handling the Incident; Post-incident meeting; Practice Failure