Model swapping for llama.cpp (or any local OpenAI API compatible server)
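Model swapping works because every OpenAI-style request names its model: a proxy in front of llama.cpp can read the `model` field and start or stop the matching backend on demand, while clients keep using the standard API. A minimal client-side sketch, assuming a local endpoint at http://localhost:8080/v1 and placeholder model IDs (none of these values come from a specific project):

```python
# Sketch: talking to a local OpenAI-compatible server and switching
# models per request. Base URL, port, and model names are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

for model in ("llama-3.1-8b", "qwen2.5-7b"):  # hypothetical model IDs
    reply = client.chat.completions.create(
        model=model,  # a swapping proxy loads the matching backend here
        messages=[{"role": "user", "content": "Say hello."}],
    )
    print(model, "->", reply.choices[0].message.content)
```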
✨ Kubectl plugin to create manifests with LLMs
The easiest way to use Ollama in .NET
Fine-tune, build, and deploy open-source LLMs easily!
Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B.
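The "grep for questions" idea reduces to a filter loop: split the input into chunks, ask a local model whether each chunk bears on the question, and emit only the hits. A rough sketch against any OpenAI-compatible local endpoint; the URL, model name, chunk size, and prompt are illustrative choices, not this project's implementation:

```python
# "grep for natural language": keep only chunks the model judges relevant.
# Endpoint, model name, and chunking strategy are illustrative.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

def nl_grep(question: str, text: str, chunk_lines: int = 10):
    lines = text.splitlines()
    for i in range(0, len(lines), chunk_lines):
        chunk = "\n".join(lines[i : i + chunk_lines])
        verdict = client.chat.completions.create(
            model="mistral-7b-instruct",  # placeholder local model
            messages=[{
                "role": "user",
                "content": f"Question: {question}\n---\n{chunk}\n---\n"
                           "Does this text help answer the question? "
                           "Reply YES or NO.",
            }],
        ).choices[0].message.content
        if verdict and verdict.strip().upper().startswith("YES"):
            yield chunk
```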
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
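The KV cache grows linearly with context length, so at millions of tokens it dominates memory; quantizing cached keys and values is what makes such contexts feasible. A toy numpy illustration of the general idea only (not KVQuant's actual per-channel/per-token scheme): scale cached keys per channel to 4-bit signed levels and check the reconstruction error.

```python
# Toy KV-cache quantization sketch (NOT the KVQuant algorithm):
# per-channel scaling of cached keys to 4-bit signed integer levels.
import numpy as np

rng = np.random.default_rng(0)
K = rng.normal(size=(1024, 128)).astype(np.float32)  # [tokens, head_dim]

scale = np.abs(K).max(axis=0) / 7.0        # one scale per channel
Kq = np.clip(np.round(K / scale), -8, 7)   # 4-bit levels in [-8, 7]
K_hat = Kq * scale                         # dequantize for attention

print(f"mean abs error: {np.abs(K - K_hat).mean():.4f}")  # ~8x smaller than fp32
```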
AubAI brings you on-device gen-AI capabilities, including offline text generation and more, directly within your app.
Social and customizable AI writing assistant!
A local and uncensored AI entity.
LLM RAG application with cross-encoder re-ranking for YouTube videos
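Cross-encoder re-ranking scores each (query, passage) pair jointly instead of comparing precomputed embeddings, so it is slower than a bi-encoder but more accurate as a second stage over a small candidate set. A minimal sketch using the sentence-transformers CrossEncoder class; the model name and passages are placeholders, not taken from the project above:

```python
# Second-stage re-ranking sketch: a first-stage retriever (bi-encoder or
# BM25) produces candidates, then a cross-encoder rescores each pair.
from sentence_transformers import CrossEncoder

query = "How do I quantize a model to GGUF?"
candidates = [  # placeholder passages from a first-stage retriever
    "Transcript chunk about converting models with llama.cpp scripts.",
    "Transcript chunk about cooking pasta.",
    "Transcript chunk about 4-bit quantization trade-offs.",
]

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(query, c) for c in candidates])

for score, text in sorted(zip(scores, candidates), reverse=True):
    print(f"{score:+.3f}  {text}")
```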
Run GGUF LLM models in the latest versions of TextGen-webui and koboldcpp
Secure Flutter desktop app connecting Auth0 authentication with local Ollama AI models via encrypted tunneling. Access your private AI instances remotely while keeping data on your hardware.
Full featured demo application for OllamaSharp
Use your open source local model from the terminal
LocalLLaMA Archive: community-powered static archive for r/LocalLLaMA
A control server for managing multiple Llama Server instances with a web-based dashboard.
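A dashboard like this ultimately reduces to polling each managed instance for liveness. llama.cpp's llama-server exposes a /health endpoint; a minimal polling sketch follows, where the instance names and ports are made up for illustration:

```python
# Minimal status poll across several llama-server instances.
# Instance names and ports are illustrative, not from the project above.
import requests

INSTANCES = {"chat-8b": 8080, "code-7b": 8081}

for name, port in INSTANCES.items():
    try:
        r = requests.get(f"http://localhost:{port}/health", timeout=2)
        status = "up" if r.ok else f"error {r.status_code}"
    except requests.RequestException:
        status = "down"
    print(f"{name:>8}: {status}")
```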
Copilot hack for running local copilot without auth and proxying
Lightweight Python tool using Optuna for tuning llama.cpp flags: towards optimal tok/s for your machine
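Flag tuning of this kind treats tokens-per-second as a black-box objective: each Optuna trial picks a flag combination, runs a benchmark, and reports the measured throughput. A sketch of that loop, driving llama.cpp's llama-bench; the model path, flag ranges, and the "avg_ts" field name in the JSON output are assumptions, not this tool's actual code:

```python
# Black-box search over llama.cpp runtime flags with Optuna.
import json
import subprocess

import optuna

MODEL = "model.gguf"  # placeholder path

def run_bench(threads: int, batch: int, gpu_layers: int) -> float:
    # llama-bench can emit machine-readable output via -o json; the exact
    # throughput field name ("avg_ts" here) is an assumption.
    out = subprocess.run(
        ["llama-bench", "-m", MODEL, "-t", str(threads),
         "-b", str(batch), "-ngl", str(gpu_layers), "-o", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return max(row["avg_ts"] for row in json.loads(out))

def objective(trial: optuna.Trial) -> float:
    return run_bench(
        threads=trial.suggest_int("threads", 1, 16),
        batch=trial.suggest_categorical("batch", [128, 256, 512]),
        gpu_layers=trial.suggest_int("gpu_layers", 0, 99),
    )

study = optuna.create_study(direction="maximize")  # maximize tok/s
study.optimize(objective, n_trials=30)
print("best flags:", study.best_params, "->", study.best_value, "tok/s")
```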
Local AI Search assistant web or CLI for ollama and llama.cpp. Lightweight and easy to run, providing a Perplexity-like experience.