Definition
DevOps is the integration of people, practices, and tools to optimize the use of resources and the flow of work in the delivery and management of software and services. It accomplishes this by aligning authority and responsibility between operations and development engineers and continuously improving the automation and standardization of the entire service life-cycle, while maintaining the confidentiality, integrity and availability of the systems.
Values
At UCSB we are adopting the CALMS framework to define the values that guide our DevOps initiatives:
- Culture: The goal of DevOps is to change and improve the relationship between development and operations teams by advocating better communication and collaboration, and by distributing the responsibility for system stability.
- Automation: A primary approach to DevOps is to automate critical functions in the service delivery pipeline and to utilize APIs when available.
- Lean: Keep everything to a minimum - tools, meetings, sprints and teams.
- Measurement: A successful DevOps implementation collects performance metrics, process metrics, and even people metrics. Collecting measures helps to improve the processes that are being optimized.
- Sharing: Create a culture where people share ideas and problems.
Principles
The following principles have been adopted to guide our DevOps initiatives. The Phoenix Project introduced “the three ways” as the core principles that guide DevOps methodology.
System thinking (The First Way): Emphasize understanding the whole system and its overall outcomes so that it can be optimized as a whole. Use system thinking when defining success metrics and evaluating the outcome of changes. Approach the flow of work (from development to operations to the customer) in “small batches” while optimizing the global performance of the system.
Amplifying feedback loops (The Second Way): Implement effective feedback loops to enable faster detection and resolution of issues early in the process. This DevOps principle includes the necessary and disruptive practice of stopping production when there are deployment failures. This principle champions automation whenever it is available to ensure that code is always optimized.
Culture of continuous experimentation and learning (The Third Way): Create a culture that embraces experimentation and understands that daily practice and repetition lead to mastery. This principle embodies the “fail fast” mentality popular among the titans of tech (including Facebook and Google). A culture of doing instead of over-analyzing fosters a learning environment where success and failure occur at regular intervals. Failure and problem solving lead to more secure, reliable, and innovative systems. The Third Way requires a high-trust leadership environment that reinforces improvements through risk-taking.
Methodologies
- People over process over tools: Define who is responsible for a job function, then define the process that happens around them, and then select the tool to help perform that process.
- Continuous delivery: The practice of coding, testing and delivering functionality in very small batches so we can improve the overall quality and velocity.
- Lean management: An approach to running an organization that supports the concept of continuous improvement, a long-term approach to work that systematically seeks to achieve small, incremental changes in processes in order to improve efficiency and quality. It leads to better organizational outputs and greater employee satisfaction.
- Change management: Simplified verification and notification processes that are automated to ensure visibility, awareness, and approval to impacted parties and stakeholders.
- Infrastructure as code: The process of treating systems like code - check systems specifications into source control, do code reviews, build, test, and manage programmatically.
- Resiliency: The ability to recover to a stable state in a timely manner. This includes monitoring systems for indicators of an impending failure, detecting failures as they occur, and returning the systems to a stable state through notification and manual intervention or automated self-healing.
- Observability: The ability to observe events within the systems that have real business value for operators, developers, and business users. The events should be queryable so that reporting, monitoring, and alerting can be built on critical events or important metrics.
- Build security in instead of bolting it on: Built-in threat modeling, defensive design, secure coding, and risk-based security testing.
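The resiliency loop described above (watch for indicators of impending failure, detect failures, then notify or self-heal) can be sketched roughly as follows. This is a minimal illustration, not a production monitor: the `Service` class, the `check` function, the error-rate thresholds, and the assumption that a restart clears the fault are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Service:
    """Hypothetical stand-in for a monitored system."""
    name: str
    error_rate: float = 0.0  # fraction of failed requests, 0.0 to 1.0
    restarts: int = 0

    def restart(self) -> None:
        # Assumed self-healing action: a restart clears the fault.
        self.restarts += 1
        self.error_rate = 0.0


def check(service: Service, notify: Callable[[str], None],
          warn_at: float = 0.05, fail_at: float = 0.25) -> str:
    """Run one monitoring pass and return 'stable', 'warning', or 'recovered'."""
    if service.error_rate >= fail_at:
        # Failure detected: self-heal first, then tell a human what happened.
        service.restart()
        notify(f"{service.name}: failure detected, restarted automatically")
        return "recovered"
    if service.error_rate >= warn_at:
        # Indicator of impending failure: notify so someone can intervene.
        notify(f"{service.name}: error rate {service.error_rate:.0%} is above the warning threshold")
        return "warning"
    return "stable"


messages = []
api = Service("api", error_rate=0.10)
print(check(api, messages.append))  # warning
api.error_rate = 0.40
print(check(api, messages.append))  # recovered
print(check(api, messages.append))  # stable
```

In a real system the thresholds and the healing action would come from the service's own metrics and runbooks; the point here is only the separation between an early warning and an automated recovery.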
Common Practices
- Incident Command System - for small and large incidents.
- Developers on Call - helps to resolve issues very fast.
- Status Pages - improve communication, increase customer satisfaction, and retain trust.
- Blameless Postmortems.
- Embedded Teams - reorganize the teams to have an operations engineer within the development team.
- The Cloud - provides APIs to create and control infrastructure.
- Andon Cords - allow anyone to stop the production line at any time.
- Dependency Injection (inversion of control) - passing external dependencies at runtime. A very important pattern in infrastructure-as-code environments.
- Blue/Green Deployment - use two identical systems, blue and green. One is live, and the other isn't. You upgrade the offline system, test it and bring it to production if all goes well.
- Chaos Monkey - bring the system down on purpose to test its resiliency.
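As an illustration of two of the practices above, the sketch below models a blue/green deployment in which the smoke test is passed in at runtime, which is a small instance of dependency injection. The `BlueGreenRouter` class, its methods, and the version strings are hypothetical names chosen for this example; a real setup would switch a load balancer or DNS record rather than a Python attribute.

```python
from typing import Callable, Dict


class BlueGreenRouter:
    """Hypothetical router that tracks which of two environments is live."""

    def __init__(self, versions: Dict[str, str], live: str = "blue") -> None:
        self.versions = dict(versions)  # e.g. {"blue": "1.0", "green": "1.0"}
        self.live = live

    @property
    def idle(self) -> str:
        # The environment that is not currently serving traffic.
        return "green" if self.live == "blue" else "blue"

    def deploy(self, new_version: str,
               smoke_test: Callable[[str, str], bool]) -> bool:
        """Upgrade the idle environment, test it, and switch traffic on success."""
        target = self.idle
        self.versions[target] = new_version
        if not smoke_test(target, new_version):
            # The live environment was never touched, so users are unaffected.
            return False
        self.live = target
        return True


router = BlueGreenRouter({"blue": "1.0", "green": "1.0"})

# The test is injected, so the same router works with any verification step.
ok = router.deploy("1.1", smoke_test=lambda env, version: True)
print(ok, router.live, router.versions[router.live])   # True green 1.1

bad = router.deploy("1.2", smoke_test=lambda env, version: False)
print(bad, router.live, router.versions[router.live])  # False green 1.1
```

Because the smoke test is supplied by the caller, it can be swapped for a stub in tests or for a real health check in production without changing the deployment logic.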
Building Blocks of DevOps
Agile
Agile software development describes a set of values and principles for software development under which requirements and solutions evolve through the collaborative effort of self-organizing cross-functional teams. It advocates adaptive planning, evolutionary development, empirical knowledge, and continual improvement, and it encourages rapid and flexible response to change.
Lean
Lean is achieved by removing “Waste,” which is activity not required to complete a process. It is about empowering everyone involved in the process to identify and eliminate areas of waste. There are 8 defined "wastes":
- Defects – Products or services that are out of specification that require resources to correct.
- Overproduction – Producing too much of a product before it is ready to be sold.
- Waiting – Waiting for the previous step in the process to complete.
- Non-Utilized Talent – Employees that are not effectively engaged in the process.
- Transportation – Moving items or information from one location to another when the move is not required to perform the process.
- Inventory – Inventory or information that is sitting idle (not being processed).
- Motion – People, information or equipment making unnecessary motion due to workspace layout, ergonomic issues or searching for misplaced items.
- Extra Processing – Performing any activity that is not necessary to produce a functioning product or service.
Six Sigma
Six Sigma is named after a statistical concept in which a process produces only 3.4 defects per million opportunities (DPMO). Six Sigma seeks to improve the quality of process outputs by identifying and removing the causes of defects (errors) and minimizing variability in (manufacturing and business) processes.
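To make the DPMO figure concrete, here is a small sketch of the standard calculation: defects divided by the total number of opportunities, scaled to one million. The function name `dpmo` and the sample numbers are illustrative assumptions, not figures from any real process.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities for a batch of inspected units."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("units and opportunities_per_unit must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    return defects * 1_000_000 / total_opportunities


# Illustrative numbers: 1,000 deployments, 5 checks each, 17 failed checks.
print(dpmo(defects=17, units=1000, opportunities_per_unit=5))  # 3400.0
```

A process at the Six Sigma level would need to bring a figure like the one above down to 3.4.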
DMAIC
DMAIC is a data driven improvement cycle used for improving, optimizing and stabilizing business processes and designs.
- Define: The purpose of this step is to clearly articulate the business problem, goal, potential resources, project scope and high-level project timeline.
- Measure: The purpose of this step is to objectively establish current baselines as the basis for improvement.
- Analyse: The purpose of this step is to identify, validate and select the root cause for elimination.
- Improve: The purpose of this step is to identify, test and implement a solution to the problem, in part or in whole.
- Control: The purpose of this step is to embed the changes and ensure sustainability. This is sometimes referred to as making the change 'stick'.
In summary, Lean exposes sources of process variation and Six Sigma aims to reduce that variation, enabling a virtuous cycle of iterative improvements towards the goal of continuous flow.
Areas for DevOps Tools
- Configuration management
- Test system
- Build server
- Application deployment
- Version control
- Monitoring tools