Gowtham Sarveswaran

Based in Denver. Building cloud infrastructure at Spectrum by day and AI infrastructure tools on nights and weekends, because I like understanding systems all the way down.

I build cloud platforms that get calmer as they scale.

Site Reliability Engineer / Cloud Platform Engineer / ML infrastructure builder

Projects

Small tools, real systems, public receipts.

  • Link (GitHub)

    Local, source-backed memory for AI agents: Markdown wiki, reviewed memories, CLI/MCP/skills, and a local viewer.

  • Jawbreaker (GitHub)

    MiniCPM5-1B scam detection with a custom LoRA, a 632-case eval suite, a published model, an open dataset, and a live Gradio Space.

  • Picochat (GitHub)

    End-to-end small-language-model factory with dashboard training, honest evals, contamination checks, release gates, serving, and H100/H200 runbooks.

  • Redline (GitHub)

    Local-first eval infrastructure that turns prompt logs into regression suites and blocks unsafe prompt/model changes in CI.

Hackathons

Short builds, real constraints.

  • OpenAI Parameter Golf Challenge

    Accepted non-record submissions. Worked through 8xH100 training throughput, resource contention, scheduling pressure, and hardware constraints at scale.

  • Hugging Face Build Small Hackathon

    Built Jawbreaker: a local scam-detection SLM for a real family safety problem, with MiniCPM fine-tuning, a custom LoRA, a 632-case eval suite, a published model, an open dataset, and a live Gradio Space.

Work

Production infrastructure, not slideware.

  • Public Cloud Engineer, Spectrum, 2025-present

    Cloud foundations work across AWS, Kubernetes, observability, automation, identity/access management, cost optimization, and on-call reliability. Guide architecture decisions, Terraform module design, deploy safety, and PR review across a small cloud infrastructure team.

  • Cloud Systems Engineer IV, Spectrum, 2021-2025

    Kubernetes and Docker operations, Terraform/Ansible delivery patterns, ArgoCD GitOps, monitoring automation, Tailscale access, deploy safety, backup/restore, capacity planning, and stateful workload reliability on RDS Aurora.

  • Cloud Systems Engineer II/III, Spectrum, 2016-2021

    Python and Bash automation, Helm and GitLab CI/CD, AWS best-practices training, and infrastructure tooling proofs of concept.

Systems

Things I like working on.

Platform

  • Kubernetes and EKS operations
  • AWS Organizations, IAM, VPC, DNS
  • Terraform, Terragrunt, Ansible, Helm
  • GitOps and CI/CD delivery patterns

Reliability

  • Prometheus, Grafana, Datadog
  • Splunk, ELK, CloudWatch
  • Alert quality and SLO/SLA management
  • Incident response and post-incident automation

Identity & cost

  • SSO, Cognito, Entra ID, SCIM
  • Tailscale and access reviews
  • Cloud cost optimization and FinOps

ML infrastructure

  • GPU clusters and eval pipelines
  • MCP tooling
  • Model and prompt release gates

Education

Short version.

  • MS in Computer Engineering, University of Colorado Denver.
  • MBA in Global Leadership, Colorado Technical University.
  • AWS Certified SysOps Administrator. Certified Kubernetes Administrator (CKA).