Hello, I'm

SOWMITH KUPPA

Building reliable AI and RAG systems with scalable MLOps infrastructure.

About

Hi there! I'm Sowmith Kuppa, an M.Sc. Computer Science graduate of Old Dominion University, where I focused on machine learning, MLOps, and reliable AI systems.

I've built and deployed deep learning models in PyTorch and TensorFlow, designed Retrieval-Augmented Generation (RAG) pipelines, and developed data-driven experiments that improved model performance and research outcomes. My recent work combines AI reliability, cloud automation, and data engineering, turning complex research ideas into scalable, production-ready systems.

Before diving into AI, I worked in IT consulting and systems administration, where I managed large-scale Windows and Linux environments, automated infrastructure with Ansible and Terraform, and led CI/CD initiatives using Jenkins and Kubernetes. That background gives me a strong understanding of real-world system reliability and the importance of clean, automated deployment pipelines.

I'm passionate about building responsible, reproducible AI solutions, from data preprocessing to monitoring and model refinement, and I'm always looking for ways to bridge research, automation, and practical impact.

Skills

PyTorch
TensorFlow
Kubernetes
Docker
Terraform
Ansible
Jenkins
GCP
AWS
Python
Go
PostgreSQL
MySQL
Prometheus
Grafana
ArgoCD
Helm
Java
GitLab

Machine Learning & AI

PyTorch
TensorFlow
Scikit-learn
RAG
Agentic AI
Model Deployment
Hyperparameter Tuning
ML Pipeline Automation
Apache Spark

Data Science & Analytics

Python
Pandas
NumPy
Data Preprocessing
Statistical Analysis
Data Visualization

MLOps & Automation

Kubernetes
Docker
Terraform
Ansible
Jenkins
ArgoCD
Helm
GitLab
GitLab CI/CD
Canary/Blue-Green Deployments

Cloud & Containerization

GCP
AWS
Kubernetes
Docker
Cloud Run
Cloud Storage
Vertex AI
Vector Search
GKE

Databases

PostgreSQL
MySQL
Google Cloud Storage

Programming Languages

Python
Go
Java
JavaScript
HTML
CSS
Bash
C++
SQL
PowerShell

Observability & Monitoring

Prometheus
Grafana
ML Monitoring
Datadog

Collaboration & Workflow Tools

JIRA
ServiceNow
SharePoint

Experience

Current

AI Research Intern

Old Dominion University

January 2025 - Present (1 year)

Norfolk, Virginia, United States

  • Designed and implemented machine learning models using PyTorch to support research on reliable AI systems, focusing on model robustness and performance evaluation.
  • Developed and integrated Retrieval-Augmented Generation (RAG) pipelines to enhance dataset quality and context relevance, improving model accuracy and research insights by 20%.
  • Created data mining and drift detection algorithms to analyze large datasets, generating actionable visual reports to monitor and explain model behavior over time.
  • Performed iterative model refinement and testing, leading to measurable improvements in performance and contributing to findings on reliable and scalable AI techniques.

Graduate Research Assistant

Old Dominion University

October 2023 - May 2025 (1 year 8 months)

Norfolk, Virginia, United States

Key Responsibilities:

  • CI/CD: Hands-on experience with Jenkins job configurations, pipelines, and plugins
  • Infrastructure as Code: Terraform for provisioning VMs and managing multi-provider environments
  • Containerization: Docker, Container Orchestration, Debugging Docker logs
  • Configuration Management: Ansible and Ansible-Lint for orchestration and automation
  • Monitoring: Zabbix, Graylog, Prometheus & Grafana for Cluster Monitoring
  • Windows/Windows Server: Active Directory, GPO Configurations, DNS & DHCP, SCCM
  • Resolved 100+ tickets related to computer networks, configurations and hardware troubleshooting

Machine Learning Intern

Campalin Innovations

July 2022 - September 2022 (3 months)

  • Developed an AI customer support chatbot using a RAG pipeline with a ChromaDB vector database for efficient knowledge retrieval, integrated with a React.js and CSS frontend, reducing average response time by ~50%.
  • Implemented a Node.js backend to manage conversation workflows, API requests, and response selection, handling simulated user queries with ~85% correct response accuracy.
  • Collaborated with mentors and teammates to optimize data preprocessing, improve model performance, and enhance UI/UX, resulting in a 30% reduction in repetitive support queries during testing.

Projects

Active

MedSim

Virtual Patient Simulator | Jan 2025 – Present

Comprehensive medical simulation platform with AI-powered evaluation capabilities. Re-architected into microservices with a hybrid RAG pipeline on GCP.

  • Microservices architecture with React.js & FastAPI
  • Hybrid RAG pipeline using Vertex AI Vector Search & Gemini 2.5 Pro
  • Automated OSCE-style evaluation engine for real-time clinical performance analysis
  • Containerized deployment on Google Cloud Run with optimized cold-start
React.js, FastAPI, GCP, Vertex AI, RAG, Docker, Cloud Run
AI/ML

Clinical-Copilot

AI-Powered Clinical Assistant | Oct 2025

Advanced AI system designed to assist healthcare professionals with clinical decision-making and medical knowledge retrieval.

  • AI-powered clinical assistance and decision support
  • Medical knowledge retrieval and processing
  • Python-based implementation
Python, AI, ML, Healthcare
AI Agent

Manuscript

AI Agent for Documentation | Sep 2025

An AI agent for querying internal documentation. It streamlines knowledge management and enables intelligent document search and retrieval.

  • AI-powered document querying and retrieval
  • Internal documentation management system
  • Intelligent search and knowledge extraction
AI, Documentation, NLP, Search
MLOps

AI-monitoring

MLOps Infrastructure | Feb 2025

Learning MLOps with Kubeflow and DevOps principles. Comprehensive monitoring and management solution for machine learning workflows in production.

  • Kubeflow-based MLOps pipeline
  • DevOps principles for ML lifecycle management
  • Production-ready monitoring and orchestration
Kubeflow, MLOps, Kubernetes, DevOps
AI/ML

Transgaurd

Fraud Detection AI Model | Jun 2022

A machine learning model that uses transaction details to detect whether a transaction is fraudulent.

  • AI-powered fraud detection using transaction data
  • Machine learning classification model
  • Real-time transaction analysis
Python, Machine Learning, AI, Fraud Detection

Education

2023 - 2025

Master of Science - MS, Computer Science

Old Dominion University

August 2023 - May 2025

2019 - 2023

Bachelor of Technology - BTech, Computer Science

National Institute of Technology Puducherry

August 2019 - May 2023

Certifications

CKA: Certified Kubernetes Administrator

The Linux Foundation

April 2025

Networking Essentials

Professional Certification

June 2022

Introduction to Cybersecurity Tools & Cyber Attacks

Professional Certification

July 2022

Get in touch

Send a Message

Have a project in mind? Let's discuss it.