
Mukunda Katta

@mukunda.vjcs6
🤘 Photon in a double slit

Active across 50+ public developer, AI, open source, and research surfaces
Code, datasets, package registries, preprints, reproducible demos, agent workflows, and OSS contribution traces.



GitHub · Bitbucket · Kaggle · Codeberg · Apache · Hugging Face · ORCID · Zenodo · SSRN · Academia.edu · Authorea · GPT Store · Poe · Replicate · Modal · OpenRouter · Codeberg Pages · Portfolio · LinkedIn · X · Email

▰▰▰  The Lab  ▰▰▰  Publications  ▰▰▰  Bridge  ▰▰▰  Stack  ▰▰▰  Track Record  ▰▰▰  Credentials  ▰▰▰


  8+ years building production systems at Fortune 100 scale
  Former SDE at Amazon Web Services  •  Currently at Southwest Airlines
  Deep expertise in ML systems, distributed architectures, and full-stack engineering

Now: shipped the @mukundakatta/agent* reliability stack (fit → guard → snap → vet → cast), 6 matching MCP servers in the official MCP Registry, and 3 new GitHub Actions on the Marketplace; published 4 new GitLab-born agent packages on PyPI; and have 40+ PRs open across MCP SDKs, FastMCP, claude-code-action, and Anthropic's agent SDK.


🦊  THIS GITLAB                                    🐙  GITHUB
─────────────────────                             ─────────────────────
4 agent-infra repos                               222 original repos
All public + published                            105 merged upstream PRs
GitLab Registry + PyPI                            npm + PyPI package work

The Lab

Four sibling repos under mukunda.vjcs6-group. Each one solves a single concrete problem; together they form a personal agent stack.


Fresh Contributions

| Surface | Latest proof |
| --- | --- |
| OpenAI GPT Store | Agent Eval Lab - public GPT for lightweight agent evaluation and scenario walkthroughs |
| Poe | AgentEvalLab - public bot for agent-eval prompts and scoring flows |
| Poe | OpsScorecardLab - public bot for turning eval scenarios into operations scorecards |
| Poe | RepoLandscapeLab - public bot for mapping premium agent repo surfaces |
| Replicate | agent-eval-lab - public model/app page for eval-oriented agent interactions |
| Replicate | ops-scorecard-lab - public app page for ops scorecard generation |
| Replicate | repo-landscape-lab - public app page for repo-surface mapping |
| Hugging Face | Agent Labs Portfolio - curated collection tying together the live Spaces and datasets |
| Hugging Face | Ops Scorecard Lab - public Space for turning rough workflows into operator-facing scorecards |
| Modal | agent-eval-lab endpoint - live API endpoint returning structured eval JSON |
| Modal | ops-scorecard-lab endpoint - live API endpoint for scorecard generation |
| Modal | repo-landscape-lab endpoint - live API endpoint for repo-landscape mapping |
| Modal | Agent Labs Portal - public two-panel demo surface for evaluation plans and ops scorecards |
| OpenRouter | Agent Eval Lab - public OpenRouter app analytics page seeded from the live Hugging Face Space |
| OpenRouter | Ops Scorecard Lab - public OpenRouter app analytics page for the scorecard Space |
| Netlify | Agent Eval Lab Static Portal - verified public portal for the agent-eval research and demo surface |
| Observable | Agent Eval Notebook - public notebook surface for lightweight scorecard exploration |
| Streamlit | Agent Eval Scorecard - live app for scoring agent behavior with compact operational criteria |
| Replit | agent-eval-replit-demo - public Replit project with a hosted demo surface |
| Cloudflare Pages | Agent Evaluation Field Notes - static field-notes surface for scorecards, replay, and RAG guardrails |
| Firebase Hosting | Agent Evaluation Field Notes - Firebase-hosted mirror of the field-notes surface |
| Codeberg Pages | MukundaKatta.codeberg.page - public portfolio page routing across the non-GitHub footprint |
| GitHub | agent-eval-public-notes - public notes for scorecards, replay debugging, and RAG guardrail checks |
| GitLab | agent-eval-public-notes - GitLab mirror of the reusable agent-eval notes |
| GitHub | agent-eval-platform-starters - starter artifacts for publishing field notes across cloud, notebook, data, and docs platforms |
| GitLab | agent-eval-platform-starters - GitLab mirror of the platform-starter artifacts |
| Gitea | agent-eval-platform-starters - Gitea mirror of the platform-starter artifacts |
| StackBlitz | agent-eval-platform-starters workspace - browser-editable workspace URL for the starter artifacts |
| Gitpod | agent-eval-platform-starters workspace - cloud workspace launch URL for the starter artifacts |
| Read the Docs | Agent Evaluation Field Notes - hosted documentation for the field-notes project |
| Val Town | agent-scorecard-val - public TypeScript scorecard function for agent evaluation notes |
| Google Colab | Agent Evaluation Field Notes Scorecard - public notebook for scenario scoring and scorecard walkthroughs |
| CodeSandbox | Agent Evaluation Field Notes - public sandbox preview for the scorecard app |
| GitHub Gist | Operational scorecard template - standalone template for tool-using agent reviews |
| GitHub Gist | Trajectory replay debugging checklist - replay checklist for agent workflow regressions |
| GitHub Gist | RAG guardrail smoke tests - prompt-injection and vector-poisoning smoke tests |
| GitLab Snippet | Operational scorecard template - public snippet mirror for scorecard evaluation |
| GitLab Snippet | Trajectory replay debugging checklist - public snippet mirror for replay debugging |
| GitLab Snippet | RAG guardrail smoke tests - public snippet mirror for RAG guardrails |
| Kaggle | Premium Agent Repo Landscape - public dataset mapping premium agent repos by surface, stack, and focus |
| Kaggle | Agent Eval Scenarios - public eval dataset for lightweight agent benchmarking |
| Kaggle | building-a-lightweight-agent-eval-benchmark - clean public notebook replacement with a successful run and resilient dataset loading |
| Codeberg | premium-agent-landscape - public showcase repo for agent portfolio mapping and presentation |
| Codeberg | agent-eval-lab - public repo for evaluation artifacts and benchmark framing |
| Codeberg | apache-contribution-atlas - public tracker for Apache-facing contribution work |
| Codeberg | Documentation PR #784 - clarified HTTPS auth with 2FA and token-based Git usage |
| GitHub | agent-eval-lab-static - source repo for the Netlify research portal |
| Apache | fluss PR #3243 - added a blog contribution guide for the Fluss website community docs |
| Apache | fluss PR #3244 - added an FIP contribution guide for the Fluss contributor workflow |
| Apache | pulsar-site PR #1139 - fixed failover standby mapping in the 3.0.x docs |

Publications

| Type | Title | Venue |
| --- | --- | --- |
| Landing Page | Lightweight Evaluation and Operational Scorecards for Tool-Using AI Agents | GitHub Pages |
| Preprint | Lightweight Evaluation and Operational Scorecards for Tool-Using AI Agents | Zenodo |
| Artifact Repo | lightweight-agent-eval-paper | GitHub |
| Archive | lightweight-agent-eval-paper | Software Heritage, archived successfully |
| Preprint Submission | Lightweight Evaluation and Operational Scorecards for Tool-Using AI Agents | SSRN, in process (PRELIMINARY_UPLOAD) |
| Preprint Submission | Lightweight Evaluation and Operational Scorecards for Tool-Using AI Agents | Research Square, declined as not suitable for posting |
| Preprint Submission | Lightweight Evaluation and Operational Scorecards for Tool-Using AI Agents | MetaArXiv on OSF Preprints, declined as out of scope |
| Preprint | AI Eval Forge: Mixed-Check Regression Testing for LLM and Agent Workflows | Zenodo |
| Artifact Repo | ai-eval-forge-paper | GitHub |
| Preprint Submission | AI Eval Forge: Mixed-Check Regression Testing for LLM and Agent Workflows | SSRN, public abstract page |
| Preprint Submission | AI Eval Forge: Mixed-Check Regression Testing for LLM and Agent Workflows | MetaArXiv on OSF Preprints, submitted |
| Preprint | Karna: A Chat-Native, Multi-Channel Architecture for Personal AI Chief-of-Staff Agents | Zenodo |
| Preprint Submission | Karna: A Chat-Native, Multi-Channel Architecture for Personal AI Chief-of-Staff Agents | Qeios, submitted (article 5249, live in 1 business day) |
| Artifact Mirror | Karna: A Chat-Native, Multi-Channel Architecture for Personal AI Chief-of-Staff Agents | Figshare public dataset mirror |
| Preprint | Six Reliability Primitives for LLM Agents: An Artifact Pattern for Stackable, Single-Concern Libraries | Zenodo |
| Preprint | Agent Trajectory Replay for Debugging Tool-Using AI Workflow Regressions | Zenodo |
| Preprint Submission | Agent Trajectory Replay for Debugging Tool-Using AI Workflow Regressions | SSRN, submitted for review |
| Preprint | Small-Rule Guardrails for Retrieval-Augmented Generation: Prompt Injection and Vector Poisoning Checks | Zenodo |
| Preprint Mirror | Small-Rule Guardrails for Retrieval-Augmented Generation: Prompt Injection and Vector Poisoning Checks | Figshare |
| Preprint | Chetana: A Theory-Indexed Probe Framework for AI Consciousness Indicator Scoring | Zenodo |
| Preprint | ML Intern Lab: A Minimal Agentic Workflow for Reproducible Machine Learning Experiment Reports | Zenodo |
| Preprint Submission | ML Intern Lab: A Minimal Agentic Workflow for Reproducible Machine Learning Experiment Reports | SSRN, submitted for review |
| Preprint Mirror | ML Intern Lab: A Minimal Agentic Workflow for Reproducible Machine Learning Experiment Reports | Academia.edu |
| Preprint Submission | Citation Traceability for Web-Native AI Research Workflows | MetaArXiv on OSF Preprints, resubmitted and pending moderator review |
| Preprint Submission | Context Forge: A Lightweight Method for Diversity-Aware Context Packing and Prompt-Injection-Aware Retrieval | Research Square, QA/QC check |
| Article | Lightweight Evaluation for Tool-Using AI Agents | Hashnode |
| Research Profile | Mukunda Katta | ORCID |
| Research Profile | Mukunda Katta | Academia.edu |
| Research Profile | Mukunda Katta | Authorea profile, new submissions paused during platform migration |

🦊  agent-skills-playbook

BEHAVIOR PACK

Production-grade AI agent skills, prompts, and operating playbooks. Reusable behavior packs for research, code review, README writing, handoff briefs, and security passes - each one a small SKILL.md with before/after examples.

PyPI

pip install agent-skills-playbook

skills · SKILL.md · coding agents

🦊  personal-agent-harness

RUNTIME

Lightweight personal agent runtime with memory, task loops, tool adapters, and local-first safety rails. Repeatable workflows without becoming a giant framework - JSON memory, approval gates, replayable run logs.

PyPI

pip install personal-agent-harness

runtime · memory · tool adapters · safety
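
The three mechanisms named for personal-agent-harness (JSON memory, approval gates, replayable run logs) can be illustrated with a small standard-library sketch. This is not the package's API: `TinyHarness`, `run_step`, `replay`, and the file names are hypothetical, chosen only to show how the pieces could fit together.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


class TinyHarness:
    """Hypothetical illustration only; not the personal-agent-harness API."""

    def __init__(self, workdir: Path, approve: Callable[[str], bool]):
        self.memory_path = workdir / "memory.json"  # JSON memory file
        self.log_path = workdir / "run_log.jsonl"   # append-only run log
        self.approve = approve                      # approval gate callback

    def load_memory(self) -> dict:
        if self.memory_path.exists():
            return json.loads(self.memory_path.read_text())
        return {}

    def run_step(self, name: str, tool: Callable[[dict], dict]) -> bool:
        """Run one tool step if the gate approves; log the outcome either way."""
        memory = self.load_memory()
        approved = self.approve(name)
        if approved:
            memory.update(tool(memory))
            self.memory_path.write_text(json.dumps(memory, indent=2))
        with self.log_path.open("a") as log:
            log.write(json.dumps({"step": name, "approved": approved}) + "\n")
        return approved

    def replay(self) -> list[dict]:
        """Read the run log back in order, for example when debugging a run."""
        if not self.log_path.exists():
            return []
        return [json.loads(line) for line in self.log_path.read_text().splitlines()]


def demo() -> tuple[dict, list[dict]]:
    with tempfile.TemporaryDirectory() as tmp:
        # The gate here is a simple deny-list; a real gate might prompt a human.
        harness = TinyHarness(Path(tmp), approve=lambda step: step != "delete_files")
        harness.run_step("take_note", lambda mem: {"note": "hello"})
        harness.run_step("delete_files", lambda mem: {"deleted": True})
        return harness.load_memory(), harness.replay()
```

In this sketch the denied step leaves memory untouched but still appears in the log, which is what makes the run replayable after the fact.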

🦊  browser-research-agent

SPECIALIST · WEB

An AI research agent that searches, reads, summarizes, and cites web sources for repeatable market and repository intelligence. Source-quality labels, recency filters, GitHub trend analysis, markdown briefs.

PyPI

pip install browser-research-agent

research · citations · trend analysis

🦊  ml-intern-lab

SPECIALIST · ML

An ML-engineer agent sandbox for reading papers, running experiments, and shipping model reports. Paper notes → experiment plan → baseline run → metrics → report, all tracked.

PyPI

pip install ml-intern-lab

agentic ML · experiments · paper-to-report

                  ┌───────────────────────────┐
                  │   agent-skills-playbook   │   ← reusable behaviors
                  └─────────────┬─────────────┘
                                │ loaded by
                                ▼
                  ┌───────────────────────────┐
                  │   personal-agent-harness  │   ← the runtime
                  └─────────────┬─────────────┘
                                │ specialized into
                  ┌─────────────┴─────────────┐
                  ▼                           ▼
        ┌──────────────────┐        ┌──────────────────┐
        │ browser-research │        │   ml-intern-lab  │
        └──────────────────┘        └──────────────────┘
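
The layering in the diagram (behaviors loaded by a runtime, which is then specialized) can be sketched in a few lines of Python. The classes below are hypothetical stand-ins, not the real interfaces of agent-skills-playbook, personal-agent-harness, or browser-research-agent; they only show the relationship between the layers.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """A reusable behavior: a name plus instruction text (hypothetical shape)."""
    name: str
    instructions: str


@dataclass
class Runtime:
    """Stand-in for the runtime layer: loads skills and builds prompts."""
    skills: dict[str, Skill] = field(default_factory=dict)

    def load(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def build_prompt(self, skill_name: str, task: str) -> str:
        skill = self.skills[skill_name]
        return f"{skill.instructions}\n\nTask: {task}"


class ResearchAgent(Runtime):
    """Stand-in for a specialist: same runtime, narrower entry point."""

    def brief(self, topic: str) -> str:
        return self.build_prompt("research", f"Write a cited brief on {topic}")


agent = ResearchAgent()
agent.load(Skill("research", "Cite every source."))
prompt = agent.brief("MCP servers")
```

The specialist adds no new machinery of its own here; it reuses whatever skills the runtime has loaded, which is the point of keeping the behaviors in a separate layer.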

Published Packages

pip install agent-skills-playbook
pip install personal-agent-harness
pip install browser-research-agent
pip install ml-intern-lab

| Package | PyPI | GitLab |
| --- | --- | --- |
| agent-skills-playbook | PyPI | Repo |
| personal-agent-harness | PyPI | Repo |
| browser-research-agent | PyPI | Repo |
| ml-intern-lab | PyPI | Repo |

Bridge to the Wider Portfolio

🐙 GitHub

@MukundaKatta

222 originals · 105 upstream PRs
OpenAI · Anthropic · Google · MS
Apache · HuggingFace · Pydantic

📦 Registries

npm · PyPI

52 npm packages · 52 PyPI ports
fit · guard · snap · vet · cast
kavach · streamparse · skillint

🔌 Integrations

MCP · Marketplace · 🤗

6 MCP-Registry servers
7 GitHub-Marketplace Actions
14 HF Spaces · 13 HF Datasets


Stack

 ┌─────────────────────┬────────────────────────────────────────────────────────────┐
 │  ML Systems         │  Fault prediction · embedding pipelines · evaluation       │
 │  Agentic AI         │  RAG · LangGraph · query routing · hallucination detection │
 │  Cloud              │  AWS Bedrock/SageMaker · GCP · Azure · K8s · Terraform     │
 │  Full-Stack         │  React/TS · Java/Python APIs · CI/CD · zero-downtime       │
 └─────────────────────┴────────────────────────────────────────────────────────────┘

Track Record

| Era | Role · Company | What I owned |
| --- | --- | --- |
| 2025 - now | AI/ML Engineer · Southwest Airlines | production ML, agentic RAG, Bedrock migration |
| 2024 - 2025 | AI/ML Engineer · GPS IT Solutions | RAG platforms, model-risk governance, vector search |
| 2022 - 2024 | SDE · Amazon Web Services | enterprise cloud systems, React/Java/Python, CI/CD |
| 2022 - 2022 | Data Engineer · GPS IT Solutions | AWS Glue, PySpark, on-prem → cloud pipelines |
| 2017 - 2020 | Software Engineer · American Express | Python REST APIs at high-volume transaction scale |

Numbers worth showing
  • 78% infra cost reduction on the SageMaker → Bedrock migration ($1,740 → $371/mo)
  • 600x retrieval-latency improvement on the ML prediction system
  • 30K+ entries in the 9-stage agentic RAG pipeline (LangGraph + Bedrock Nova + FAISS + BM25)
  • 5 prediction types in the aircraft-maintenance fault-prediction system
  • 23 automated evaluation tests for the AI model-risk governance framework
  • 40% content production-time reduction on the GPT-4 + RAG content platform
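
The cost figure in the first bullet can be checked with simple arithmetic. A minimal sketch: the function name `percent_reduction` is illustrative, and the two dollar amounts are the ones quoted in that bullet.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


monthly_before = 1740.0  # USD per month, as quoted in the list above
monthly_after = 371.0    # USD per month, as quoted in the list above
reduction = percent_reduction(monthly_before, monthly_after)
print(f"{reduction:.1f}% reduction")  # prints "78.7% reduction"
```

The quoted 78% is this value rounded down.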

Credentials

Education  ·  M.S. Big Data Analytics & IT, University of Central Missouri (2021-2022)  ·  B.Tech Mechanical Engineering, SRM University (2012-2016)


ANTHROPIC

MCP Advanced, Claude · Bedrock, Claude · Vertex, Intro to MCP, Claude Code, Claude API, Agent Skills, Subagents

AWS

AWS GenAI Apps, AWS AI Solutions, AWS AI Fundamentals, Amazon Q


🦊  Mirrored on GitHub  ·  agent workspace on Bitbucket  ·  refreshed 2026-05-03


Info

Member since April 30, 2026