| Name | URL | Tags | Status | Docs | Description |
|------|-----|------|--------|------|-------------|
| DeepSeek | https://www.deepseek.com/ | tool, provider | integrated | tools/ai_knowledge/deepseek.md | Open-source LLM provider from China. |
| Ollama | https://ollama.com/ | tool, infrastructure | integrated | services/ollama.md | Local LLM runner for macOS, Linux, and Windows. |
| LangSmith | https://www.langchain.com/langsmith | tool, benchmarking | integrated | tools/benchmarking/langsmith.md | Unified platform for debugging, testing, and monitoring LLM applications. |
| OpenPipe | https://openpipe.ai/ | tool, infrastructure | integrated | tools/infrastructure/openpipe.md | Data-driven fine-tuning platform for replacing generic LLMs with smaller, faster models. |
| vLLM | https://github.com/vllm-project/vllm | tool, infrastructure | integrated | tools/infrastructure/vllm.md | High-throughput LLM serving engine using PagedAttention. |
| Text Generation Inference (TGI) | https://github.com/huggingface/text-generation-inference | tool, infrastructure | integrated | tools/infrastructure/tgi.md | Hugging Face's production inference server. |
| SGLang | https://github.com/sgl-project/sglang | tool, infrastructure | integrated | tools/infrastructure/sglang.md | Fast structured generation runtime from LMSYS. |
| ExLlamaV2 | https://github.com/turboderp/exllamav2 | tool, infrastructure | integrated | tools/infrastructure/exllamav2.md | Optimized GPTQ/EXL2 inference for consumer GPUs. |
| Fireworks AI | https://fireworks.ai/ | provider | integrated | tools/providers/fireworks.md | Fast inference and fine-tuning for open models. |
| Replicate | https://replicate.com/ | provider | integrated | tools/providers/replicate.md | API for running a wide variety of open-source models. |
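Several of the entries above (Ollama, vLLM, TGI, SGLang) expose a local HTTP API once running. As a minimal sketch, here is how the Ollama entry might be queried over its default local endpoint — `http://localhost:11434/api/generate` is Ollama's standard port, but the model name `llama3` is an assumption and must already be pulled:

```python
import json
import urllib.request

# Ollama's default local generate endpoint (assumes the server is running).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    # Minimal non-streaming request body for Ollama's /api/generate.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    # POST the JSON payload and return the "response" field of the reply.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# Example (requires a running Ollama server and a pulled model):
# print(generate("llama3", "Summarize PagedAttention in one sentence."))
```

The same request shape (model + prompt + stream flag) carries over conceptually to the OpenAI-compatible endpoints that vLLM and SGLang serve, though their exact routes and fields differ.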