


In the next edition of our interview series focusing on senior players in the global tech arena, we welcome Harsha Patil, an Engineering Manager overseeing cloud compute infrastructure at an e-commerce company in California, USA. In this feature, we explore his career development, his tips for aspiring engineers, and the key lessons he has learned along the way.


Can you share a little bit about yourself?

I am Harsha Patil, an Engineering Manager overseeing cloud compute infrastructure at an e-commerce company in California, USA. My background is in architecting infrastructure, particularly Kubernetes and cloud compute. I hold a master's degree in Electrical and Computer Engineering. I am passionate about technology, particularly understanding the why and the how behind things. I also have experience in team leadership and strategic planning.

What is a project you've worked on that you are proud of?

I have had opportunities to work on some pretty exciting projects over the last six years. These range from building Kubernetes clusters to support cloud migration initiatives to building automations that improve application performance and scalability. I have also built a service from scratch that provides on-demand scaling for Kubernetes applications. It has not only reduced application latencies during datacenter outages but has also brought substantial cost savings.

What did the career path to your current role look like?

I first became interested in DevOps, SRE and Platform Engineering during my internship at a startup incubator, where I was a technical developer for several companies that my company was funding. I saw that although these companies worked with different technologies, they shared the same challenge of understanding infrastructure and platform. I started looking into solving this challenge for them, and that is how I learned that this is a mindset more than a skill of its own.

Can you talk about a challenge you've faced as an engineer?

The most important and frequent challenge I have experienced personally, and observed in engineers at almost all levels, is that keeping up with fast-changing technologies is not enough; you also need to understand the business value of the work being done. I have met, spoken with and worked alongside some incredibly smart and highly technical individuals who can solve the most complex problems, but one key skill that gets overlooked is understanding the needs of the business, the organization or your customers. This skill becomes even more crucial as you grow in your role. Thanks to the guidance of some really great mentors along the way, I began to understand its importance. I am still working to master it and to coach others in my current role.

Could you give us the best example you have seen of DevOps benefiting a business?

One of the best examples I have seen and read about of DevOps benefiting a business is Netflix! Their adoption of DevOps practices has helped them scale, innovate and deliver features seamlessly to millions of users worldwide. Some key practices, which I think most companies can adopt as well, are:

  • Continuous integration, continuous deployment and testing (CI/CD) - Robust testing and automated deployments increase developer velocity and make releases faster and safer.
  • Microservice Architecture - The monolith was broken down into microservices that can be deployed, scaled and maintained independently, resulting in faster development cycles. Each service can also be optimized on its own, which gives teams more autonomy.
  • Infrastructure as Code (IaC) - Netflix uses tools like Terraform and AWS CloudFormation to create and manage their infrastructure, which streamlines the process and reduces room for human error.
  • Monitoring and Incident Management - Monitoring and incident management is yet another important skill set for any engineer, especially in DevOps. Netflix uses tools like Chaos Monkey that test the resiliency of their infrastructure by randomly but intentionally causing failures.
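
To make the failure-injection idea in the last bullet concrete, here is a minimal Python sketch. It is not Chaos Monkey's implementation and does not touch any real infrastructure: the in-memory `fleet` dictionary, the instance names and the `min_healthy` threshold are all assumptions made up for illustration. It only shows the core loop of terminating one random instance and then checking whether enough healthy capacity remains.

```python
import random


def pick_victim(instances, rng):
    """Choose one running instance at random, or None if nothing is running."""
    running = [name for name, state in instances.items() if state == "running"]
    return rng.choice(running) if running else None


def run_chaos_round(instances, min_healthy, rng):
    """Terminate one random running instance, then report whether the
    number of instances still running meets the required minimum."""
    victim = pick_victim(instances, rng)
    if victim is not None:
        instances[victim] = "terminated"
    healthy = sum(1 for state in instances.values() if state == "running")
    return victim, healthy >= min_healthy


if __name__ == "__main__":
    # Hypothetical three-instance fleet that needs two healthy instances.
    fleet = {"web-1": "running", "web-2": "running", "web-3": "running"}
    rng = random.Random()
    victim, still_ok = run_chaos_round(fleet, min_healthy=2, rng=rng)
    print(f"terminated {victim}; capacity still sufficient: {still_ok}")
```

In a real setup the termination step would call a cloud or orchestrator API and the health check would come from monitoring, so that an experiment which drops capacity below the threshold raises an alert rather than going unnoticed.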

What are the biggest challenges for adopting and scaling DevOps in the enterprise?

DevOps can mean different things in different companies, as it is not really a job role but more of a mindset or practice, and adopting and scaling it comes with challenges. These include the shift in culture or mindset, organizational buy-in, legacy services that might not support certain advanced DevOps tools, skills gaps, and defining measurements and metrics. It is important to thoroughly assess the business and technical need for DevOps at any organization.

Between Azure, GCP & AWS which of these cloud offerings suits you best in your role within DevOps? Why would you choose one over the other?

This again really depends on organizational needs and other factors like cost, negotiations with different cloud vendors, availability etc. All the major cloud providers offer similar services built on the same core concepts, such as compute, Redis, and DNS, though the names differ between providers, like AWS EC2 vs GCE. It is also really important to understand what the cloud deployment is needed for, as there is no one-size-fits-all solution.

What is the most rewarding aspect of your role?

The most rewarding aspect of my role is the opportunity to drive significant impact through innovative solutions and collaboration. Whether it's architecting resilient infrastructure, optimizing Kubernetes deployments, or developing tools that save costs, I find immense satisfaction in solving complex problems and seeing tangible results that directly benefit the business. Additionally, mentoring and coaching others, contributing to open-source projects, and engaging with the community allow me to share knowledge and foster growth, which is incredibly fulfilling.

What advice would you give to someone wishing to start their career as a DevOps Engineer?

My advice for someone starting their career in DevOps/SRE roles would be to learn a programming language, get your basics strong, especially in networking and Linux, and learn the cloud technologies of one of the major cloud providers thoroughly. Understanding key automation concepts and best practices like CI/CD and Infrastructure as Code (IaC), thinking about reliability and high availability, and becoming familiar with monitoring and container tools like Thanos, Prometheus, Kubernetes, Docker etc. would also promote growth in these roles. As stated before, DevOps/SRE is more of a mindset and practice. Staying curious and learning new things every day definitely helps in the long run.

How would you differentiate between DevOps, SRE and Platform engineering?

Again, this is specific to each company, as these are not standardized skill sets or roles, but each of them serves a different purpose.

DevOps - The goal is to bridge the gap between development and operations teams, bringing together skill sets like configuration management, CI/CD, monitoring, logging, and automation to improve the software delivery lifecycle.

SRE (Site Reliability Engineering) - This was introduced by Google. Though most of the tools and technologies used are similar to DevOps, SRE emphasizes the reliability, availability, and performance of services. SREs apply software engineering principles to system administration tasks. The goal of SRE is to create highly available, scalable and reliable systems through automation, monitoring and proactive measures.

Platform Engineering - This focuses on building and maintaining platforms that enable developers to manage and deploy their services faster and more effectively. Platform Engineering also focuses on building self-service tools to reduce the operational toil on development teams.

What is your experience of using metrics analysis or log management tools and how does analysing log files assist you within your role?

Monitoring and logging are another key aspect to think about as a DevOps/SRE/Platform engineer. I have had a chance to explore, build and analyze various monitoring and logging setups for the platform using Prometheus, Thanos, Elastic Stack, Grafana, Tremor etc. I have used these for cases like capacity planning, troubleshooting, performance fine-tuning, alerting and, believe it or not, supporting cost savings.
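
As a small illustration of how log analysis feeds troubleshooting and capacity planning, the Python sketch below turns raw request logs into an error rate and a 95th-percentile latency. The log format (`timestamp method path status latency`), the function names and the sample lines are assumptions invented for this example; in practice tools such as the ones named above compute these figures at scale.

```python
import math


def parse_line(line):
    """Parse one line of the assumed format
    '<timestamp> <method> <path> <status> <latency>ms'.
    Returns (path, status, latency_ms), or None if the line is malformed."""
    parts = line.split()
    if len(parts) != 5 or not parts[4].endswith("ms"):
        return None
    try:
        return parts[2], int(parts[3]), float(parts[4][:-2])
    except ValueError:
        return None


def summarize(lines):
    """Return request count, share of 5xx responses and p95 latency
    (nearest-rank method) over all well-formed lines."""
    records = [r for r in map(parse_line, lines) if r is not None]
    if not records:
        return {"requests": 0, "error_rate": 0.0, "p95_latency_ms": None}
    latencies = sorted(latency for _, _, latency in records)
    errors = sum(1 for _, status, _ in records if status >= 500)
    rank = math.ceil(0.95 * len(latencies))  # 1-based nearest rank
    return {
        "requests": len(records),
        "error_rate": errors / len(records),
        "p95_latency_ms": latencies[rank - 1],
    }


if __name__ == "__main__":
    sample = [
        "2024-05-01T12:00:00Z GET /cart 200 10ms",
        "2024-05-01T12:00:01Z GET /cart 200 20ms",
        "2024-05-01T12:00:02Z POST /checkout 500 30ms",
        "2024-05-01T12:00:03Z GET /search 200 40ms",
        "this line is malformed and gets skipped",
    ]
    print(summarize(sample))
```

A rising error rate points at where to troubleshoot, while the latency percentile tracked over time shows whether a service is over- or under-provisioned, which is where the cost savings tend to come from.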

Are there any books, blogs, or any other resources that you highly recommend on the subject of DevOps?

There are a lot of resources out there to help someone learn about these. My favorites are tech blogs from major companies like Netflix, Google, Airbnb etc., along with some books:

The DevOps Handbook: How to Create World-Class Agility, Reliability, & Security in Technology Organizations by Gene Kim, Jez Humble, Patrick Debois, and John Willis.

Site Reliability Engineering: How Google Runs Production Systems, edited by Betsy Beyer, Chris Jones, Jennifer Petoff, and Niall Richard Murphy.

The Site Reliability Workbook: Practical Ways to Implement SRE by David K. Rensin, Kent Kawahara, Niall Richard Murphy, Betsy Beyer, Stephen Thorne.

Kubernetes: Up and Running: Dive into the Future of Infrastructure by Brendan Burns, Joe Beda, and Kelsey Hightower.

Cloud Native DevOps with Kubernetes: Building, Deploying, and Scaling Modern Applications in the Cloud by John Arundel and Justin Domingus.

When Kubernetes Isn’t the Right Choice: Understanding the Limitations by Harsha Patil, published on Medium.

