Hemanthakumar
Open to work

Hi, I'm Hemanth

Site Reliability Engineer

I'm an SRE with 5+ years of experience in cloud infrastructure, Kubernetes, and CI/CD automation on AWS. At GoGuardian, I drive reliability and cost efficiency across EKS-based systems — including AI-augmented workflows that cut hours of manual security analysis to minutes.

I'm drawn to problems where automation has a clear leverage point: a 90-minute DDoS outage that prompted a full architecture redesign, a 3-day manual patch cycle that became a 4-hour automated run, a 60-minute vulnerability review that now takes under 2 minutes. My focus is building infrastructure that's observable, secure, and boring to operate.

Professional Experience

Site Reliability Engineer

GoGuardian July 2023 – Present
  • Migrated Jenkins from EC2 to EKS — cut CI/CD costs by 50%, eliminated 20–30 min agent queues
  • Designed CloudFront + WAF DDoS defense that blocked 4 attacks over 2 years with zero production impact
  • Led EKS cluster upgrade v1.23 → v1.28 via blue-green strategy with under 5 min of user-facing impact
  • Built AI skill files (Claude/Codex) cutting vulnerability analysis from 60 min to under 2 min — enabling engineers to self-serve security investigations without SRE involvement
  • Automated kernel patching across the EC2 fleet, reducing security vulnerabilities by 80% and cutting patch cycle from 2–3 days to 4 hours
  • Developed Python automation scripts reducing manual operational effort by 70%, freeing teams to focus on higher-impact work
  • Built centralized Datadog dashboards covering health across all services, traffic breakdowns, and DDoS monitoring; defined SLOs with burn-rate alerts to surface reliability risk before customer impact
  • Owned end-to-end incident management — configured PagerDuty routing and escalation policies, authored runbooks for common failure scenarios, and led post-mortem reviews
  • Designed disaster recovery strategy with defined RTO targets: stateless workloads reprovisionable in <1 hr via Terraform; databases restorable in 1–6 hrs with tested restore procedures
  • Centralized secrets management with AWS Secrets Manager, eliminating hardcoded credentials and enforcing least-privilege IAM policies across services
  • Led MongoDB Atlas version upgrades via staged rollout (dev → QA → prod), validated index integrity post-upgrade, and maintained a tested revert plan

DevOps Engineer

42Gears Mobility Systems November 2020 – July 2023
  • Replaced Cluster Autoscaler with Karpenter — 25% compute cost reduction, node startup 3–5 min → 45 sec
  • Built GitLab CI/CD pipelines from scratch across 5+ microservices, reducing manual intervention by 60%
  • Developed Python scripts to automate log analysis, reporting, and data processing — 300% increase in processing speed
  • Designed AWS VPC architecture with public/private subnet segmentation, routing tables, IGWs, and security group/NACL rules across dev, QA, and production
  • Set up and managed GKE clusters including node pool configuration, workload deployment, and version upgrades across environments
  • Managed EKS cluster upgrades across environments with zero downtime
  • Mentored 2 junior engineers on CI/CD, Kubernetes, and cloud infrastructure — both delivering independently within 3 months

Technical Skills

Cloud Platforms

AWS, GCP

Infrastructure as Code

Terraform, Ansible, Packer

Containers & Orchestration

Kubernetes, Docker, EKS, GKE, Helm, Karpenter

CI/CD

Jenkins, GitLab CI, GitHub Actions, ArgoCD

Programming & AI

Python, AI/LLM Tooling, Claude, Codex

Observability

Datadog, Prometheus, Grafana, CloudWatch, SLO Monitoring

Security

AWS WAF, AWS Shield, Secrets Manager, IAM, DDoS Protection, Patch Management

Incident Management

PagerDuty, On-Call, Post-Mortems, Runbooks

Databases

MongoDB Atlas, SQL

Education

Bachelor of Engineering

Electronics and Communication Engineering

Visvesvaraya Technological University

2015 – 2019