A powerful Zotero AI and MCP plugin with ChatGPT, Gemini 3.5, Claude, DeepSeek V4, Grok, OpenRouter, Kimi 2.6, GLM 5, SiliconFlow, GPT-oss, Gemma 4, Qwen 3.7
Updated May 23, 2026 - JavaScript
Gemma Gem runs Google's Gemma 4 model entirely on-device via WebGPU — no API keys, no cloud, no data leaving your machine.
PokeClaw (PocketClaw) — first on-device AI that controls your Android phone. Gemma 4, no cloud, no API key. Poke is short for Pocket.
OpenClaw alternative in your pocket
🚀 PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support
Local AI desktop app — chat, agent mode, image gen, video gen. Supports Ollama, Gemma 4, Llama, Qwen, OpenAI, Anthropic. Single .exe, no Docker.
Local AI Assistant on your phone
Run local LLMs like Gemma, Qwen, and LLaMA on Android for offline, private, real-time chat and question answering with LiteRT and ONNX Runtime.
llama.cpp fork with TurboQuant WHT-rotated KV cache & weight compression + Gemma 4 MTP and Qwen 3.6 NextN speculative decoding (+30-50% throughput).
An end-to-end course on AI agents and agentic AI, with 15+ AI agent projects covering real-time use cases and industry expertise.
A privacy-first Android chat app that runs large language models entirely on-device. No internet, no cloud, no tracking. Built with Kotlin, Jetpack Compose, and llama.cpp with optimized ARM NEON/SVE inference.
Agentic ✧ Gemma Inference for Android System Intelligence
Automated image & video captioning using Qwen-VL, Gemma4 and SAM3.
MCore-Bridge: Provides Megatron-Core model definitions for state-of-the-art large models and makes Megatron training as simple as Transformers, with support for 300+ large language models (Qwen3-Next, GLM-5.1, DeepSeek-V3.2, MiniMax-2.7, ...) and 200+ multimodal large models (Qwen3.5, Qwen3-Omni, Gemma4, ...).
A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows, macOS, and Linux with full GPU support.
High-performance on-device LLM inference for React Native, powered by LiteRT-LM and Nitro Modules
Terminal-native AI assistant built specifically for Termux and Android development.
Intent Coding (hjx): From Vibe to Verifiable. Production-grade AI logic with zero-cost static execution.