Goal: build useful “plugins” for LM Studio that feel practical in daily life — like a SearXNG web search tool, plus a tool router that converts natural language into reliable tool calls.
This is not hype. This is how you turn a local model into a real assistant: it can search, retrieve, summarize, and cite — instead of guessing.
Rule I use: If the model needs fresh facts, it should search. If it doesn’t search, it should say “I don’t know.”
✅ What you’re building (v1)
- SearXNG running locally (your private metasearch engine)
- One tool: search_web(query) returns results in JSON
- Tool router: natural language → “use tool” vs “answer directly”
- LM Studio integration: via Tools/Function Calling OR MCP (recommended)
The vibe: your assistant stops being “chat-only” and starts being “tool-enabled.”
🧠 Two ways to do “plugins” with LM Studio
Option A — Tools / Function Calling (fast to ship)
You define tools in the chat request. The model returns a tool call. Your app executes the tool and sends results back.
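Concretely, a tool definition is just a JSON schema. Here is a minimal sketch in TypeScript, following the standard OpenAI-style function-calling shape that LM Studio's /v1/chat/completions endpoint accepts (see the docs linked at the bottom):

```ts
// Minimal tool definition for the OpenAI-compatible "tools" array.
const tools = [
  {
    type: "function",
    function: {
      name: "search_web",
      description: "Search the web via a local SearXNG instance.",
      parameters: {
        type: "object",
        properties: {
          query: { type: "string", description: "The search query" },
          limit: { type: "number", description: "Max results, default 5" },
        },
        required: ["query"],
      },
    },
  },
];
```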
Option B — MCP servers (the clean, scalable approach)
MCP (Model Context Protocol) lets LM Studio use external tool servers in a standardized way. You can run tool servers locally, restrict which tools are allowed, and reuse them across apps.
Beginner recommendation: start with Tools/Function Calling to learn the pattern, then graduate to MCP when you want a stable “tool ecosystem.”
🔎 Step 1 — Run SearXNG locally
SearXNG is a self-hostable metasearch engine. Running it locally gives you a search endpoint your AI can call.
✅ Easiest setup: Docker Compose
Use the official SearXNG Docker stack (it includes sensible defaults). Once running, you’ll have a local endpoint like:
http://localhost:8080
✅ Confirm JSON search works
SearXNG can return JSON results. A typical pattern is:
GET /search?q=your+query&format=json
Heads-up: the JSON format is off by default. If you get a 403, add json to search.formats in your settings.yml and restart. Once results come back as JSON, you’re ready to connect the AI.
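If you'd rather verify from code than from the browser, here is a quick sketch (Node 18+, run as an ES module for top-level await). It assumes SearXNG on localhost:8080 as above; SearXNG puts hits in a results array, with the snippet in a field called content:

```ts
// Quick JSON check against SearXNG.
const res = await fetch("http://localhost:8080/search?q=lm+studio&format=json");
// A 403 here usually means the json format is not enabled in settings.yml.
if (!res.ok) throw new Error(`SearXNG returned ${res.status}`);
const data = await res.json();
// Hits live in `results`; each has title, url, and a `content` snippet.
console.log(data.results.slice(0, 3).map((r: any) => r.title));
```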
🧰 Step 2 — Create a “search_web” tool (simple contract)
Whether you use Tools/Function Calling or MCP, your tool should do one job well:
- Take a query string
- Call SearXNG
- Return a clean list: title, URL, snippet
Tool contract (concept):
search_web({ query, limit=5 }) -> [{ title, url, snippet }]
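A minimal implementation sketch in TypeScript, assuming the localhost:8080 endpoint from Step 1 (the timeout is there because of the gotchas below):

```ts
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

// One job, done well: query SearXNG, return a clean result list.
async function searchWeb(query: string, limit = 5): Promise<SearchResult[]> {
  const params = new URLSearchParams({ q: query, format: "json" });
  const res = await fetch(`http://localhost:8080/search?${params}`, {
    signal: AbortSignal.timeout(10_000), // never let a search hang the chat
  });
  if (!res.ok) throw new Error(`SearXNG error: ${res.status}`);
  const data = await res.json();
  return (data.results ?? []).slice(0, limit).map((r: any) => ({
    title: r.title ?? "",
    url: r.url ?? "",
    snippet: r.content ?? "", // SearXNG calls the snippet field `content`
  }));
}
```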
🤖 Step 3 — Tool router: natural language → tool use
You have two router styles:
Router Style 1 — Let the model call tools (cleanest)
You give the model tool definitions and a system rule like:
"If the user asks for recent facts, prices, news, or verification, call search_web first.
If the question is opinion or writing, answer directly."
The model decides. Your app executes the tool call and feeds results back.
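A sketch of what that request looks like, assuming LM Studio's local server on its default port 1234 and reusing the tools definition from Option A (swap in whatever model you have loaded):

```ts
// Router Style 1: hand the model the tools plus the routing rule,
// then let it decide.
const response = await fetch("http://localhost:1234/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "your-loaded-model", // placeholder
    messages: [
      {
        role: "system",
        content:
          "If the user asks for recent facts, prices, news, or verification, " +
          "call search_web first (at most 2 searches). " +
          "If the question is opinion or writing, answer directly.",
      },
      { role: "user", content: "What changed in LM Studio recently?" },
    ],
    tools, // the search_web definition from Option A
  }),
});
const message = (await response.json()).choices[0].message;
// message.tool_calls is set when the model decided to search.
```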
Router Style 2 — Use a cheap “router model” (saves tokens)
A tiny model makes a fast decision before involving the big model:
Router prompt (cheap model):
"Return JSON only:
{ action: 'search'|'answer', query?: string }
User request: ..."
If action = search, you call SearXNG, then pass results to the larger model for a final response. This saves time and generation cost.
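A sketch of the router call, again against LM Studio's OpenAI-compatible endpoint. The model name is a placeholder; use any small instruct model you have loaded:

```ts
// Router Style 2: a small model classifies before the big model runs.
async function route(
  userRequest: string
): Promise<{ action: "search" | "answer"; query?: string }> {
  const res = await fetch("http://localhost:1234/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "small-router-model", // placeholder for any tiny instruct model
      temperature: 0, // routing should be deterministic
      messages: [
        {
          role: "system",
          content:
            'Return JSON only: { "action": "search" | "answer", "query"?: string }',
        },
        { role: "user", content: userRequest },
      ],
    }),
  });
  const text = (await res.json()).choices[0].message.content;
  return JSON.parse(text); // in production: validate, fall back to "answer"
}
```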
🧪 A minimal Tools/Function Calling flow (concept)
1) User asks: "What changed in LM Studio recently?"
2) Model returns tool_call: search_web({ query: "LM Studio 0.4.0 new features" })
3) Your app calls SearXNG and returns results to the model
4) Model writes the final answer with citations/snippets
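Here is that whole loop as one sketch, reusing searchWeb and tools from above. It caps tool use at 2 rounds, which also handles the looping gotcha below:

```ts
// The full round trip: detect tool_calls, execute, feed results back.
async function chatWithTools(messages: any[]): Promise<string> {
  const url = "http://localhost:1234/v1/chat/completions";
  const headers = { "Content-Type": "application/json" };
  for (let round = 0; round < 2; round++) { // "at most 2 searches"
    const res = await fetch(url, {
      method: "POST",
      headers,
      body: JSON.stringify({ model: "your-loaded-model", messages, tools }),
    });
    const msg = (await res.json()).choices[0].message;
    if (!msg.tool_calls?.length) return msg.content ?? ""; // answered directly
    messages.push(msg); // keep the assistant's tool call in the history
    for (const call of msg.tool_calls) {
      const args = JSON.parse(call.function.arguments);
      const results = await searchWeb(args.query, args.limit ?? 5);
      messages.push({
        role: "tool",
        tool_call_id: call.id,
        content: JSON.stringify(results),
      });
    }
  }
  // Tool budget spent: force a final answer by offering no tools.
  const final = await fetch(url, {
    method: "POST",
    headers,
    body: JSON.stringify({ model: "your-loaded-model", messages }),
  });
  return (await final.json()).choices[0].message.content ?? "";
}
```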
Important: always set a timeout on the search call, and show “working…” feedback in your UI if the search takes a while.
🧱 The MCP approach (recommended once you want stability)
With LM Studio’s native REST API, you can enable MCP servers in two ways:
- Ephemeral MCP: defined per-request (great for testing)
- mcp.json servers: configured once, reused everywhere (best for daily workflow)
Pro tip: restrict tool access. Give the model only the tools it needs for the current job.
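For reference, an mcp.json entry is a small JSON block. This sketch follows the common mcpServers notation; the server script name is hypothetical, and the exact fields LM Studio expects are in the MCP docs linked below:

```json
{
  "mcpServers": {
    "searxng-tools": {
      "command": "node",
      "args": ["./searxng-mcp-server.js"]
    }
  }
}
```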
⚠️ Gotchas (the ones that waste hours)
1) Tool spam / looping
Cause: the model keeps calling tools because the prompt is vague.
Fix: limit allowed tools and set a hard max: “at most 2 searches.”
2) Slow responses
Cause: search + summarization can be slower than pure chat.
Fix: stream results, show “Searching…” immediately, and keep the final answer short.
3) Bad search results
Cause: broad queries or noisy engines.
Fix: teach the router to refine queries (add year, product version, location).
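One cheap way to do that is a single extra instruction in the router prompt. A sketch, extending the Router Style 2 prompt from above:

```ts
// Same JSON contract as Router Style 2, plus one refinement instruction.
const routerSystemPrompt =
  'Return JSON only: { "action": "search" | "answer", "query"?: string }. ' +
  "When searching, rewrite the query to be specific: add the current year, " +
  "the product version, or the location where relevant.";
```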
🔗 The links you actually need
- LM Studio Tools / Function Calling: https://lmstudio.ai/docs/developer/openai-compat/tools
- LM Studio “Using MCP via API” (ephemeral + mcp.json): https://lmstudio.ai/docs/developer/core/mcp
- LM Studio REST API (v1) overview: https://lmstudio.ai/docs/developer/rest
- SearXNG official docs: https://docs.searxng.org/
- SearXNG Docker stack (official): https://github.com/searxng/searxng-docker
🔗 Curtision internal links (swap these for your real pages)
- Curtision: LM Studio local server setup
- Curtision: SearXNG install + configuration
- Curtision: tool router / MCP starter pack
If you want the “v2” of this post, I can write it as a full copy-paste build guide with:
Docker Compose for SearXNG, a Node tool server (Tools + MCP versions), and a router prompt tuned for low tokens + high reliability.