AI Software Engineer building systems that see, reason, and act.
I work at the intersection of computer vision, robotics, multimodal AI, and agentic software systems. My focus is practical automation: taking perception models, retrieval systems, and LLM agents from prototype to usable workflows in industrial and robotics environments.
Currently, I am building agentic and multimodal AI systems for industrial automation at Musashi AI North America. Previously, I worked on agricultural and robotic vision systems at Vineland Research and DaoAI Robotics. I hold an M.A.Sc. in Robotics from McMaster University.
- Multimodal inspection systems that combine images, documents, sensor data, and structured reasoning
- Vision and generative AI pipelines for robotics, manufacturing, and intelligent monitoring
- Agentic workflows using LangGraph, LangChain, RAG, tool calling, and human-in-the-loop review
- End-to-end AI applications with FastAPI, React/TypeScript, Docker, cloud services, and edge deployment paths
- Computer vision systems involving defect detection, 3D perception, segmentation, and model optimization
A multimodal AI agent for industrial quality inspection.
It combines defect images, inspection standard documents, and optional sensor data to identify defects, retrieve the relevant standard clauses, assign severity, explain risk, generate structured inspection reports, and flag cases that require human review.
Tech focus: multimodal AI, RAG, LangGraph, FastAPI, industrial inspection, manufacturing AI.
Repository: https://github.com/peige-guo/Multimodal_Industrial_Inspection_Agent
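As a rough illustration of the "structured report plus human-review flag" idea described above, here is a minimal Python sketch. It is not taken from the repository: the `Finding` and `InspectionReport` classes, the `build_report` function, and the 0.8 confidence threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical threshold, chosen only for illustration.
REVIEW_CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Finding:
    defect_type: str    # e.g. "scratch"
    confidence: float   # detector confidence in [0, 1]
    severity: str       # e.g. "minor", "major", "critical"
    standard_clauses: List[str] = field(default_factory=list)  # retrieved clause IDs


@dataclass
class InspectionReport:
    findings: List[Finding]
    needs_human_review: bool
    review_reasons: List[str]


def build_report(findings: List[Finding]) -> InspectionReport:
    """Collect findings and record every reason a human should look at them."""
    reasons: List[str] = []
    for f in findings:
        if f.confidence < REVIEW_CONFIDENCE_THRESHOLD:
            reasons.append(f"{f.defect_type}: low confidence ({f.confidence:.2f})")
        if f.severity == "critical":
            reasons.append(f"{f.defect_type}: critical severity")
        if not f.standard_clauses:
            reasons.append(f"{f.defect_type}: no supporting standard clause retrieved")
    return InspectionReport(findings, bool(reasons), reasons)
```

In this sketch a report is escalated whenever any finding is uncertain, critical, or unsupported by a retrieved clause, and the reasons are kept so a reviewer can see why the case was flagged.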
A LangGraph-based research assistant that performs multi-step web research with query generation, reflection, follow-up search, streaming UI, and cited answers.
Tech focus: LangGraph, DeepSeek, Tavily Search, React, TypeScript, streaming agent UI.
Repository: https://github.com/peige-guo/research-chatbot
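The query generation, search, and reflection loop mentioned above can be sketched in plain Python as follows. This is an assumption-laden outline rather than the project's actual LangGraph graph: `research`, `generate_queries`, `search`, `reflect`, and `max_rounds` are hypothetical names, and the three callables stand in for whatever LLM and search calls a real implementation would make.

```python
from typing import Callable, List


def research(
    question: str,
    generate_queries: Callable[[str], List[str]],
    search: Callable[[str], List[str]],
    reflect: Callable[[str, List[str]], List[str]],
    max_rounds: int = 3,
) -> List[str]:
    """Run search rounds until reflection proposes no follow-up queries.

    Returns the collected source identifiers (deduplicated, in first-seen
    order) so a final answer can cite them.
    """
    sources: List[str] = []
    queries = generate_queries(question)
    for _ in range(max_rounds):
        for query in queries:
            for source in search(query):
                if source not in sources:
                    sources.append(source)
        # Reflection returns follow-up queries, or an empty list when
        # the gathered sources are judged sufficient.
        queries = reflect(question, sources)
        if not queries:
            break
    return sources
```

Passing the steps in as callables keeps the loop testable with simple stubs; the `max_rounds` cap bounds cost if reflection never reports that the sources are sufficient.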
A real-time agent for analyzing X/Twitter trends and public opinion. It retrieves social data, filters relevant content, performs RAG-based analysis, and generates responses or trend reports.
Tech focus: LangGraph, LangChain, FastAPI, LangServe, Streamlit, RAG, social data analysis.
Repository: https://github.com/peige-guo/twitter_trend_agent
An LLM + MCP automation project for running seller-backend and logistics workflows. It explores browser automation, tool abstraction, workflow orchestration, and exception handling for real-world business processes.
Tech focus: MCP, LLM tools, browser automation, workflow automation, FastAPI/Python.
Repository: https://github.com/peige-guo/auto-shipping
AI / ML:
PyTorch · Hugging Face · OpenAI / DeepSeek / Gemini / Qwen · LangChain · LangGraph · RAG · Vector Databases
Computer Vision / Robotics:
OpenCV · YOLO · SAM · 3D Vision · ROS2 · Sensor Fusion · Defect Detection · Robotic Perception
Backend / Systems:
Python · FastAPI · Pydantic · Docker · Kubernetes · AWS · PostgreSQL · Redis
Frontend / Product:
React · TypeScript · TailwindCSS · Streamlit · Agent UI · Human-in-the-loop Workflows
Optimization / Deployment:
TensorRT · ONNX · Edge AI · Model Serving · CI/CD
I am especially interested in systems that connect perception, reasoning, and action:
- Multimodal transformer agents for real-time industrial perception
- Vision-language models for quality inspection and robotics
- RAG over technical standards, manuals, and inspection documents
- Human-review workflows for safety-critical AI decisions
- Edge/cloud deployment of AI inspection and monitoring systems
- Agentic automation for manufacturing, robotics, and operations
I like building AI systems that are transparent, reproducible, and useful beyond demos.
The projects here are experiments toward one larger direction: agentic multimodal AI for real-world automation.
If you are working on computer vision, robotics, industrial AI, VLMs, or agent systems, I am happy to connect.
Email: guopeige@gmail.com
LinkedIn: https://www.linkedin.com/in/peigeguo/
GitHub: https://github.com/peige-guo