Bridge Your Editor and RAG System with MCP
In the last post, we built a Retrieval-Augmented Generation (RAG) system using Ollama, OpenWebUI, AI models, and vector databases. We focused on how to fetch relevant context for a query and get a meaningful response using this pipeline.
In this post, we’ll take it a step further. We’ll use the Model Context Protocol (MCP) to talk to that RAG system programmatically, so you can ask questions and get answers without ever leaving your favorite editor.
What is MCP?
MCP (Model Context Protocol) is a protocol designed for building AI agents that can interact with external tools and services like APIs, databases, or your local filesystem. Think of it as a bridge between your AI model and the real world.
Whether it’s triggering a CI pipeline, fetching documentation, or searching internal APIs, MCP lets you do it from within the context of a conversation. You can read more in the official MCP documentation.
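Under the hood, MCP messages are JSON-RPC 2.0. When an editor invokes a tool, it sends a request along these lines (the tool name and arguments here are illustrative, matching the server we build below):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "your-mcp-server",
    "arguments": { "question": "How do we rotate API keys?" }
  }
}
```

The SDK handles this wire format for you, so you never construct these messages by hand.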
Use Case: Ask Internal Questions Inside Your Editor
Let’s say your company has several internal tools (dashboards, deployment platforms, knowledge portals, and so on) used by multiple teams. These systems often contain valuable tribal knowledge, but:
- They’re not open-source
- They’re behind VPNs or login walls
- Tools like GitHub Copilot or Cursor have no visibility into them
This makes them inaccessible for code completion or AI-based help.
Now imagine this:
- You create a knowledge base out of those internal tools using your RAG setup.
- You plug it into OpenWebUI for querying.
- You use MCP to bridge your local editor with that system.
Result? You start asking real, context-aware questions inside your editor and get intelligent, up-to-date answers, with code suggestions and completions powered by your own infrastructure.
How to use MCP?
Here, MCP is the glue between your editor and the RAG system we built earlier: you ask a question from your coding environment, and the MCP server fetches a context-rich answer on your behalf, without you ever leaving your editor.
Let’s look at the code
OpenWebUI exposes an OpenAPI specification that describes how you can interact with it programmatically. Its documentation is worth exploring to see the available endpoints and learn how to access them.
Since OpenWebUI exposes a public API, we can use it to query our knowledge base and retrieve answers programmatically.
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import fetch from "node-fetch";
import { z } from "zod";

// Create server instance
const server = new McpServer({
  name: "your-mcp-server",
  version: "0.0.1",
});

// Read the OpenWebUI API key from the environment so it never lands in code
const apiKey = process.env.API_KEY;
if (!apiKey) {
  throw new Error("API_KEY is not set");
}

// Register a tool the editor's AI agent can invoke with a question
server.tool(
  "your-mcp-server",
  "Returns the relevant docs and usage examples from the knowledge base",
  {
    question: z.string(),
  },
  async ({ question }) => {
    try {
      const response = await fetch("http://localhost:3001/api/chat", {
        method: "POST",
        headers: {
          "x-api-key": apiKey,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          question,
          // You can add more knowledge names here
          knowledge_names: ["Security"],
          model_name: "llama3:8b",
        }),
      });

      if (!response.ok) {
        throw new Error(`API request failed with status ${response.status}`);
      }

      const data = await response.json();
      return {
        content: [
          {
            type: "text",
            text: JSON.stringify(data, null, 2),
          },
        ],
      };
    } catch (error: unknown) {
      const errorMessage =
        error instanceof Error ? error.message : "Unknown error occurred";
      return {
        content: [
          {
            type: "text",
            text: `Error making chat API request: ${errorMessage}`,
          },
        ],
      };
    }
  },
);

// Serve over stdio so the editor can spawn and talk to this process
async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
}

main().catch((error) => {
  console.error("Fatal error in main():", error);
  process.exit(1);
});
```
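One detail worth calling out: in TypeScript, a caught value has type `unknown`, so the handler narrows it with `instanceof` before reading `.message`. Here is that pattern in isolation:

```typescript
// Narrow an unknown thrown value into a readable message,
// just as the tool handler's catch block does.
function toErrorMessage(error: unknown): string {
  return error instanceof Error ? error.message : "Unknown error occurred";
}

console.log(toErrorMessage(new Error("API request failed with status 500")));
// prints "API request failed with status 500"
console.log(toErrorMessage("non-Error values can be thrown too"));
// prints "Unknown error occurred"
```

Returning the failure as tool `content` (rather than re-throwing) lets the editor's agent see the error text and react to it.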
Adding the MCP server to your editor
Most MCP-capable editors read a JSON configuration that tells them how to launch the server over stdio. In VS Code, for example, this goes in `.vscode/mcp.json`:
```json
{
  "servers": {
    "your-mcp-server": {
      "type": "stdio",
      "command": "node",
      "args": [
        "--experimental-strip-types",
        "/<path-to-your-mcp-server>"
      ],
      "env": {
        "API_KEY": "<your-api-key>"
      }
    }
  }
}
```
The best part? You can use a single MCP server to connect to multiple RAG systems, whether they are team-specific, department-wide, or purpose-driven (docs vs. incidents, for example).
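To make that concrete, here is a small sketch of parameterizing the request body from the tool above so one server can query several knowledge bases per call. The helper name and knowledge base names are hypothetical; the field names match the request body we sent earlier:

```typescript
// Hypothetical helper: build the OpenWebUI request body for a question,
// fanning out across whichever knowledge bases the caller selects.
type ChatRequest = {
  question: string;
  knowledge_names: string[];
  model_name: string;
};

function buildChatRequest(
  question: string,
  knowledgeNames: string[],
  modelName = "llama3:8b",
): ChatRequest {
  return { question, knowledge_names: knowledgeNames, model_name: modelName };
}

const body = buildChatRequest("How do we rotate API keys?", [
  "Security",
  "Incidents",
]);
console.log(body.knowledge_names.length); // prints 2
```

You could expose the knowledge base list as a second zod parameter on the tool, letting the agent pick which systems to search per question.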
Even better, because an editor can register several MCP servers and tools at once, agents can call on all of them, collaborating until they gather the information needed to answer your question. It’s like having a team of AI assistants working together behind the scenes.
Conclusion
By combining OpenWebUI, RAG, and MCP, you are not just building another chatbot; you are creating a powerful, editor-integrated assistant tailored to your internal knowledge and workflows.
With MCP as the bridge, you can:
- Ask rich, context-aware questions directly from your editor
- Connect to one or many RAG systems behind the scenes
- Enable multiple agents to collaborate and fetch the best answers for you
This setup isn’t just convenient; it’s a step toward truly personalized developer experiences, especially in enterprise environments where traditional tools like Copilot and Cursor fall short.
What’s Next?
In the next post, we’ll take it further and explore how to:
- Add chat and code autocompletion powered by your knowledge base
- Use the Continue.dev extension to bring this experience into VS Code or other editors
- Turn your internal docs and tools into a real-time coding assistant: just like Copilot, but with answers your team actually cares about