DevOps and SRE: Where do you draw the line?

Customer expectations for application performance are high, yet in most cases development teams spend most of their time and effort building valuable new systems. At the same time, delivery teams are under pressure to maintain the stability and reliability of business systems. This tension has led to the adoption of DevOps and SRE as formal practices in the development and quality assurance community. Let’s discuss what DevOps and SRE actually mean and how they can partner within an organization.

 

DevOps Engineer:

DevOps is the practice of operations and development engineers participating together in the entire service lifecycle, from design through development to production support. It suggests that the best approach is to hire engineers who are excellent coders and can also handle the Ops functions.

Skills required for a DevOps Engineer:

  • Proficiency in scripting languages
  • Knowledge of and proficiency with Ops and automation tools
  • Good understanding of Ops challenges and how to address them during design and development
  • Comfort with frequent testing and incremental releases
  • Soft skills for better collaboration across teams

More about how to become a great DevOps Engineer:

https://blog.shippable.com/how-to-be-a-great-devops-engineer

 

 

Site Reliability Engineer (SRE):

As Ben Treynor (the founding father of SRE) puts it, SRE is fundamentally “what happens when you ask a software engineer to design an operations function”. The main goal is to create ultra-scalable and highly reliable software systems.

SRE is a job function that focuses on the reliability and maintainability of systems. It is also a mindset and a set of engineering practices for running better production services. An SRE has to be able to engineer creative solutions to problems, strike the right balance between reliability and feature velocity, and target appropriate levels of service quality.

Skills required for an SRE:

  • Ability to run post-mortems on unexpected incidents to prevent future hazards
  • Skill in evaluating new possibilities and in capacity planning
  • Comfort with handling operations, monitoring, and alerting
  • Knowledge and experience in building processes and automation to support other teams
  • Ability to persuade organizations to do what needs to be done

More about how to become an SRE:

https://hackernoon.com/so-you-want-to-be-an-sre-34e832357a8c

What differentiates SRE (Site Reliability Engineering) from DevOps? Aren’t they the same?

  1. SRE is to DevOps what Scrum is to Agile. DevOps is not a role; it is a culture that cannot be assigned to an individual and has to be practiced as a team. SRE, by contrast, is the practice of creating and maintaining highly available infrastructure and services, and it is a role assigned to a software professional.
  2. SREs at times practice DevOps. Whereas DevOps, as organizations tend to interpret it, focuses more on automation, SRE focuses more on system availability, observability, instrumentation, and scale.
  3. Both SRE and DevOps are concerned with managing the production environment. SREs primarily find production problems and solve some of them themselves. The purpose of DevOps, however, is to find problems and then dispatch them to the dev team for solutions.
  4. Although DevOps is about handling issues before they turn into failures, failure is unfortunately something we can’t avoid. DevOps embraces this by accepting failure as bound to happen and as something that helps the team learn and grow. In the SRE world, this objective is met with a formula for balancing accidents and failures against new releases. In other words, SREs want to make sure there aren’t too many errors or failures, even if we can learn from them. The formula rests on two key measures: Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
  5. SRE works closely with the monitoring of applications and services in production, where automation is crucial to improving the system’s health and availability. DevOps primarily focuses on empowering developers to build and manage services by giving them measurable metrics to prioritize tasks. There seem to be very few people who can handle a senior SRE or DevOps role, as it calls for someone who combines the skills of a software engineer, architect, and system engineer with deep experience.
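
To make the SLI and SLO idea from point 4 concrete, here is a minimal sketch of how a team might compare a measured SLI against an SLO and see how much room for failure is left. The gap between the SLO and 100% is commonly called an error budget; that term, the function name evaluate_slo, the request counts, and the 99.9% target are all illustrative assumptions, not something prescribed by this post.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float               # measured fraction of good requests
    budget_remaining: float  # fraction of the error budget still unspent
    can_release: bool        # True while some budget is left


def evaluate_slo(good_requests: int, total_requests: int, slo_target: float) -> SloReport:
    """Compare a measured SLI against an SLO target (hypothetical helper)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    # SLI: what we actually measured.
    sli = good_requests / total_requests

    # Error budget: the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests
    actual_failures = total_requests - good_requests

    # 1.0 means nothing spent; 0.0 or below means the budget is used up.
    budget_remaining = 1.0 - actual_failures / allowed_failures
    return SloReport(sli, budget_remaining, budget_remaining > 0.0)


# Illustrative numbers only: 500 failed requests out of one million
# against an assumed 99.9% availability SLO.
report = evaluate_slo(good_requests=999_500, total_requests=1_000_000, slo_target=0.999)
print(report)
```

A team could use a check like this to decide whether to keep shipping new releases or to pause and spend time on reliability once the budget runs out.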

Even though SRE has only recently come into the spotlight, the key is to partner DevOps and SRE to get systems into production effectively and efficiently by thinking about how a system will run in production before it is built. One way to achieve this is to integrate the quality team into the end-to-end DevOps workflow so that everyone agrees on metrics like code coverage and downtime thresholds.
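
As a rough illustration of what agreeing on a downtime threshold could look like, the sketch below converts an availability target into the downtime it allows over a window. The function names, the 99.9% target, and the 30-day window are hypothetical values chosen for the example.

```python
from datetime import timedelta


def allowed_downtime(availability_target: float, window: timedelta) -> timedelta:
    """Downtime that an availability target tolerates over a window."""
    if not 0.0 < availability_target <= 1.0:
        raise ValueError("availability_target must be in (0, 1]")
    # The share of the window that is allowed to be unavailable.
    return window * (1.0 - availability_target)


def within_threshold(observed: timedelta, availability_target: float,
                     window: timedelta) -> bool:
    """True if the observed downtime stays inside the agreed threshold."""
    return observed <= allowed_downtime(availability_target, window)


# Assumed example: a 99.9% target over a 30-day window.
threshold = allowed_downtime(0.999, timedelta(days=30))
print(threshold)
print(within_threshold(timedelta(minutes=20), 0.999, timedelta(days=30)))
```

Writing the threshold down as a number gives the development, quality, and operations teams the same yardstick when they review an incident.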

The SRE and DevOps concepts can still cause confusion at some level, but it all depends on how a company interprets the job profile. The roles and names might vary, but at the end of the day the whole world needs solutions, and technology is becoming a more dynamic and enriching experience.

References:

  1. https://landing.google.com/sre/
  2. https://en.wikipedia.org/wiki/Site_Reliability_Engineering
  3. https://hackernoon.com/so-you-want-to-be-an-sre-34e832357a8c
  4. https://blog.shippable.com/how-to-be-a-great-devops-engineer

 

Hope you enjoyed my blog post!

Regards// Ravikiran Talekar