Gowtham Sarveswaran

Based in Denver. Building cloud infrastructure at Spectrum, and building AI infrastructure tools on nights and weekends because I like understanding systems all the way down.

I build cloud platforms that get calmer as they scale.

Site Reliability Engineer / Cloud Platform Engineer / ML infrastructure builder

I work on Kubernetes, AWS, Terraform, observability, incident response, FinOps, and automation for production systems. I like the kind of engineering where the proof is simple: fewer repeated pages, safer deploys, clearer failure modes, and systems that are easier for the next person to operate.

Projects

Small tools, real systems, public receipts.

  • Redline

    Local-first eval infrastructure that turns prompt logs into regression suites and blocks unsafe prompt/model changes in CI.

  • Jawbreaker

    MiniCPM5-1B scam detection with a custom LoRA, a 632-case eval suite, a published model, an open dataset, and a live Gradio Space.

  • Link

    Local, source-backed memory for AI agents: Markdown wiki, reviewed memories, CLI/MCP/skills, and a local viewer.

  • Picochat

    End-to-end small-language-model factory with dashboard training, honest evals, contamination checks, release gates, serving, and H100/H200 runbooks.

Hackathons

Short builds, real constraints.

  • OpenAI Parameter Golf Challenge

    Accepted non-record submissions. Worked through 8xH100 training throughput, resource contention, scheduling pressure, and hardware constraints at scale.

  • Hugging Face Build Small Hackathon

    Built Jawbreaker: a local scam-detection SLM for a real family safety problem, with MiniCPM fine-tuning, custom LoRA, a 632-case eval suite, published model, open dataset, and live Gradio Space.

Work

Production infrastructure, not slideware.

  • Public Cloud Engineer, Spectrum, 2025-present

    Cloud foundations work across AWS, Kubernetes, observability, automation, identity/access management, cost optimization, and on-call reliability.

  • Cloud Systems Engineer IV, Spectrum, 2021-2025

    Kubernetes and Docker operations, Terraform/Ansible delivery patterns, ArgoCD GitOps, monitoring automation, Tailscale access, deploy safety, and stateful workload reliability on RDS Aurora.

  • Cloud Systems Engineer II/III, Spectrum, 2016-2021

    Python and Bash automation, Helm and GitLab CI/CD, AWS best-practices training, and infrastructure tooling proofs of concept.

Systems

Things I like working on.

  • Kubernetes platforms and EKS operations
  • Terraform, Terragrunt, GitOps, CI/CD
  • Prometheus, Grafana, Splunk, alert quality
  • Datadog, CloudWatch, ELK, service health
  • Incident response and post-incident automation
  • SSO, Cognito, Entra ID, SCIM, access reviews
  • Cloud cost optimization and FinOps
  • GPU clusters, eval pipelines, MCP tooling
  • Model and prompt release gates
  • Runbooks that become automation

Education

Short version.

  • MS in Computer Engineering, University of Colorado Denver.
  • MBA in Global Leadership, Colorado Technical University.
  • AWS SysOps Certified. Certified Kubernetes Administrator.