Active across 50+ public developer, AI, open source, and research surfaces
Code, datasets, package registries, preprints, reproducible demos, agent workflows, and OSS contribution traces.
▰▰▰ The Lab ▰▰▰ Publications ▰▰▰ Bridge ▰▰▰ Stack ▰▰▰ Track Record ▰▰▰ Credentials ▰▰▰
8+ years building production systems at Fortune 100 scale
Former SDE at Amazon Web Services • Currently at Southwest Airlines
Deep expertise in ML systems, distributed architectures, and full-stack engineering

Now: shipped the @mukundakatta/agent* reliability stack (fit → guard → snap → vet → cast), 6 matching MCP servers in the official MCP Registry, 3 new GitHub Actions on the Marketplace, and published four new GitLab-born agent packages on PyPI. Plus 40+ open PRs across MCP SDKs, FastMCP, claude-code-action, and Anthropic's agent SDK.
| 🦊 THIS GITLAB | 🐙 GITHUB |
|---|---|
| 4 agent-infra repos | 222 original repos |
| All public + published | 105 merged upstream PRs |
| GitLab Registry + PyPI | npm + PyPI package work |

The Lab
Four sibling repos under mukunda.vjcs6-group. Each one solves a single concrete problem; together they form a personal agent stack.
Fresh Contributions
| Surface | Latest proof |
|---|---|
| OpenAI GPT Store | Agent Eval Lab - public GPT for lightweight agent evaluation and scenario walkthroughs |
| Poe | AgentEvalLab - public bot for agent-eval prompts and scoring flows |
| Poe | OpsScorecardLab - public bot for turning eval scenarios into operations scorecards |
| Poe | RepoLandscapeLab - public bot for mapping premium agent repo surfaces |
| Replicate | agent-eval-lab - public model/app page for eval-oriented agent interactions |
| Replicate | ops-scorecard-lab - public app page for ops scorecard generation |
| Replicate | repo-landscape-lab - public app page for repo-surface mapping |
| Hugging Face | Agent Labs Portfolio - curated collection tying together the live Spaces and datasets |
| Hugging Face | Ops Scorecard Lab - public Space for turning rough workflows into operator-facing scorecards |
| Modal | agent-eval-lab endpoint - live API endpoint returning structured eval JSON |
| Modal | ops-scorecard-lab endpoint - live API endpoint for scorecard generation |
| Modal | repo-landscape-lab endpoint - live API endpoint for repo-landscape mapping |
| Modal | Agent Labs Portal - public two-panel demo surface for evaluation plans and ops scorecards |
| OpenRouter | Agent Eval Lab - public OpenRouter app analytics page seeded from the live Hugging Face Space |
| OpenRouter | Ops Scorecard Lab - public OpenRouter app analytics page for the scorecard Space |
| Netlify | Agent Eval Lab Static Portal - verified public portal for the agent-eval research and demo surface |
| Observable | Agent Eval Notebook - public notebook surface for lightweight scorecard exploration |
| Streamlit | Agent Eval Scorecard - live app for scoring agent behavior with compact operational criteria |
| Replit | agent-eval-replit-demo - public Replit project with a hosted demo surface |
| Cloudflare Pages | Agent Evaluation Field Notes - static field-notes surface for scorecards, replay, and RAG guardrails |
| Firebase Hosting | Agent Evaluation Field Notes - Firebase-hosted mirror of the field-notes surface |
| Codeberg Pages | MukundaKatta.codeberg.page - public portfolio page routing across the non-GitHub footprint |
| GitHub | agent-eval-public-notes - public notes for scorecards, replay debugging, and RAG guardrail checks |
| GitLab | agent-eval-public-notes - GitLab mirror of the reusable agent-eval notes |
| GitHub | agent-eval-platform-starters - starter artifacts for publishing field notes across cloud, notebook, data, and docs platforms |
| GitLab | agent-eval-platform-starters - GitLab mirror of the platform-starter artifacts |
| Gitea | agent-eval-platform-starters - Gitea mirror of the platform-starter artifacts |
| StackBlitz | agent-eval-platform-starters workspace - browser-editable workspace URL for the starter artifacts |
| Gitpod | agent-eval-platform-starters workspace - cloud workspace launch URL for the starter artifacts |
| Read the Docs | Agent Evaluation Field Notes - hosted documentation for the field-notes project |
| Val Town | agent-scorecard-val - public TypeScript scorecard function for agent evaluation notes |
| Google Colab | Agent Evaluation Field Notes Scorecard - public notebook for scenario scoring and scorecard walkthroughs |
| CodeSandbox | Agent Evaluation Field Notes - public sandbox preview for the scorecard app |
| GitHub Gist | Operational scorecard template - standalone template for tool-using agent reviews |
| GitHub Gist | Trajectory replay debugging checklist - replay checklist for agent workflow regressions |
| GitHub Gist | RAG guardrail smoke tests - prompt-injection and vector-poisoning smoke tests |
| GitLab Snippet | Operational scorecard template - public snippet mirror for scorecard evaluation |
| GitLab Snippet | Trajectory replay debugging checklist - public snippet mirror for replay debugging |
| GitLab Snippet | RAG guardrail smoke tests - public snippet mirror for RAG guardrails |
| Kaggle | Premium Agent Repo Landscape - public dataset mapping premium agent repos by surface, stack, and focus |
| Kaggle | Agent Eval Scenarios - public eval dataset for lightweight agent benchmarking |
| Kaggle | building-a-lightweight-agent-eval-benchmark - clean public notebook replacement with a successful run and resilient dataset loading |
| Codeberg | premium-agent-landscape - public showcase repo for agent portfolio mapping and presentation |
| Codeberg | agent-eval-lab - public repo for evaluation artifacts and benchmark framing |
| Codeberg | apache-contribution-atlas - public tracker for Apache-facing contribution work |
| Codeberg | Documentation PR #784 - clarified HTTPS auth with 2FA and token-based Git usage |
| GitHub | agent-eval-lab-static - source repo for the Netlify research portal |
| Apache | fluss PR #3243 - added a blog contribution guide for the Fluss website community docs |
| Apache | fluss PR #3244 - added an FIP contribution guide for the Fluss contributor workflow |
| Apache | pulsar-site PR #1139 - fixed failover standby mapping in the 3.0.x docs |
Publications
             ┌───────────────────────────┐
             │   agent-skills-playbook   │  ← reusable behaviors
             └─────────────┬─────────────┘
                           │ loaded by
                           ▼
             ┌───────────────────────────┐
             │  personal-agent-harness   │  ← the runtime
             └─────────────┬─────────────┘
                           │ specialized into
             ┌─────────────┴─────────────┐
             ▼                           ▼
    ┌──────────────────┐        ┌──────────────────┐
    │ browser-research │        │  ml-intern-lab   │
    └──────────────────┘        └──────────────────┘

Published Packages
pip install agent-skills-playbook
pip install personal-agent-harness
pip install browser-research-agent
pip install ml-intern-lab

| Package | PyPI | GitLab |
|---|---|---|
| agent-skills-playbook | PyPI | Repo |
| personal-agent-harness | PyPI | Repo |
| browser-research-agent | PyPI | Repo |
| ml-intern-lab | PyPI | Repo |
Bridge to the Wider Portfolio
Stack
┌─────────────────────┬────────────────────────────────────────────────────────────┐
│ ML Systems │ Fault prediction · embedding pipelines · evaluation │
│ Agentic AI │ RAG · LangGraph · query routing · hallucination detection │
│ Cloud │ AWS Bedrock/SageMaker · GCP · Azure · K8s · Terraform │
│ Full-Stack │ React/TS · Java/Python APIs · CI/CD · zero-downtime │
└─────────────────────┴────────────────────────────────────────────────────────────┘

Track Record
| Era | Role · Company | What I owned |
|---|---|---|
| 2025 - now | AI/ML Engineer · Southwest Airlines | production ML, agentic RAG, Bedrock migration |
| 2024 - 2025 | AI/ML Engineer · GPS IT Solutions | RAG platforms, model-risk governance, vector search |
| 2022 - 2024 | SDE · Amazon Web Services | enterprise cloud systems, React/Java/Python, CI/CD |
| 2022 - 2022 | Data Engineer · GPS IT Solutions | AWS Glue, PySpark, on-prem → cloud pipelines |
| 2017 - 2020 | Software Engineer · American Express | Python REST APIs at high-volume transaction scale |
Numbers worth showing
- 78% infra cost reduction on the SageMaker → Bedrock migration ($1,740 → $371/mo)
- 600x retrieval-latency improvement on the ML prediction system
- 30K+ entries in the 9-stage agentic RAG pipeline (LangGraph + Bedrock Nova + FAISS + BM25)
- 5 prediction types in the aircraft-maintenance fault-prediction system
- 23 automated evaluation tests for the AI model-risk governance framework
- 40% content production-time reduction on the GPT-4 + RAG content platform
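
For anyone who wants to sanity-check the cost figure above, here is a minimal sketch of the arithmetic. The helper name `percent_reduction` is just an illustration, not part of any package listed here; the two dollar amounts are the monthly figures quoted in the first bullet.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Monthly infrastructure cost quoted above, in USD.
monthly_before = 1740.0  # SageMaker
monthly_after = 371.0    # Bedrock

reduction = percent_reduction(monthly_before, monthly_after)
print(f"{reduction:.1f}% reduction")
```

This works out to roughly 78.7%, which matches the 78% headline figure when rounded down.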
Credentials
Education · M.S. Big Data Analytics & IT, University of Central Missouri (2021-2022) · B.Tech Mechanical Engineering, SRM University (2012-2016)