Introducing "RAGent," a RAG system compatible with various data sources
SRG (Service Reliability Group) mainly provides cross-sectional support for the infrastructure of our media services: improving existing services, launching new ones, and contributing to OSS.
This article explains a method for efficiently building a highly accurate RAG system by eliminating knowledge fragmentation using the open source software "RAGent," which integrates internal documents with real-time information from Slack.
Building a RAG (Retrieval-Augmented Generation) system that utilizes internal documents and chat tool logs is a topic of great interest to many engineers.
However, building a highly accurate search system and integrating the information that flows in every day involve many challenges.
This time, we will introduce "RAGent," an OSS tool for building RAG systems.
This is an ambitious product that supports hybrid search using Amazon Bedrock and OpenSearch, as well as the latest MCP (Model Context Protocol).
- Challenges in utilizing internal knowledge and the emergence of RAGent
- RAGent Architecture and Key Features
- High-precision retrieval through hybrid search
- Data storage and management
- Versatile Interfaces
- Embedding independent RAG
- Real-time Slack search integration
- URL Reference Function
- MCP and Observability
- MCP Server
- Observability with OpenTelemetry
- Usage example
- Automating Internal QA with Devin
- Future RAGent
- Summary
- Reference Links
Challenges in utilizing internal knowledge and the emergence of RAGent
In typical RAG systems, documents are vectorized and stored in a database (Vector Store), but constantly vectorizing highly real-time information like Slack poses challenges in terms of operational costs and delays in updating the data.
RAGent is a CLI tool that aims to solve this problem by combining vector search over Markdown documents with real-time search in Slack.
It is implemented in Go and uses the AWS environment (Bedrock, OpenSearch, S3) as the backend.
RAGent Architecture and Key Features
RAGent is not just a search tool but a comprehensive system that provides everything from data vectorization to a chat interface, a Slack bot, and an MCP server.
Let's take a look at its core architecture and functionality.
High-precision retrieval through hybrid search
RAGent's search engine uses a hybrid search that combines vector search (semantic search) and BM25 (keyword search).
- Vector Search: Vectorize text with Amazon Bedrock (Titan Text Embeddings v2) and search by semantic similarity.
- BM25: Uses OpenSearch, with the kuromoji tokenizer, to perform searches based on keyword frequency.
By integrating these results (using RRF or a weighted sum), it is possible to pick up both "documents that do not have matching words but are similar in meaning" and "documents that contain specific technical terms."
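As an illustration of the RRF (Reciprocal Rank Fusion) option mentioned above, here is a minimal Go sketch. It is not RAGent's actual implementation: the function name rrfFuse, the constant k = 60 (a commonly used default), and the document IDs are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges several ranked lists of document IDs with Reciprocal Rank
// Fusion. Each document scores the sum of 1/(k + rank) over the lists it
// appears in, where rank starts at 1. (Hypothetical helper, not RAGent's API.)
func rrfFuse(k float64, rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}

	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Sort by descending score; break ties by ID so the output is deterministic.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	vector := []string{"runbook.md", "oncall.md", "deploy.md"} // semantic matches
	bm25 := []string{"deploy.md", "runbook.md", "faq.md"}      // keyword matches
	fmt.Println(rrfFuse(60, vector, bm25))
}
```

Documents that rank well in both lists (here runbook.md and deploy.md) rise to the top, while documents found by only one method are still kept further down the list.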
Data storage and management
A distinctive feature is the dual-backend configuration: the vector data and metadata generated by the vectorize command are stored in Amazon S3 Vectors (an S3 bucket), while a search index is created in OpenSearch.
This allows for both data persistence and search performance.
Versatile Interfaces
RAGent covers a variety of use cases with five commands:
- vectorize
- query
- chat
- slack-bot
- mcp-server
Embedding independent RAG
One of the most interesting features of RAGent is its Slack search integration, which takes an "embedding independent" approach.
Real-time Slack search integration
Typically, when working with chat logs in RAG, messages need to be periodically batched and vectorized.
With SLACK_SEARCH_ENABLED=true set, RAGent avoids this batch step. When a user poses a question, it generates an answer using the following flow:
- Document Search: Get information from pre-vectorized Markdown/CSV.
- Slack Search: Search messages through the Slack API with the user's query.
- Context Integration: Pass both results to the LLM (Claude) to generate an answer.
This clever design eliminates the need to wait for vector database updates and reduces operational costs.
Slack search is implemented to include not only the matching message but also the thread content as context, so ongoing discussions are reflected in the answer immediately.
URL Reference Function
Additionally, if you include a Slack message URL (a permalink such as https://slack.com/...) in your query, RAGent will automatically retrieve the thread and load it as context.
For Markdown files, you can configure the source URL (article source) of the file to be indexed as metadata, so you can also find a Markdown file by reverse lookup from its URL.
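To make the permalink handling more concrete, here is a minimal Go sketch of how Slack message permalinks could be detected in a query and converted into a channel ID and message timestamp. It is not taken from RAGent's source: SlackRef, extractSlackRefs, and the example workspace are hypothetical, and the pattern is a simplification that only covers the common https://<workspace>.slack.com/archives/<channel>/p<timestamp> form.

```go
package main

import (
	"fmt"
	"regexp"
)

// permalinkRe matches the common Slack message permalink form. The 16 digits
// after "p" are the message timestamp without its decimal point.
var permalinkRe = regexp.MustCompile(
	`https://[a-z0-9-]+\.slack\.com/archives/([A-Z0-9]+)/p(\d{10})(\d{6})`)

// SlackRef identifies one message: the channel ID plus the timestamp in the
// "seconds.microseconds" form used by the Slack Web API. (Hypothetical type.)
type SlackRef struct {
	ChannelID string
	Timestamp string
}

// extractSlackRefs returns every Slack permalink found in the query text.
func extractSlackRefs(query string) []SlackRef {
	var refs []SlackRef
	for _, m := range permalinkRe.FindAllStringSubmatch(query, -1) {
		refs = append(refs, SlackRef{
			ChannelID: m[1],
			Timestamp: m[2] + "." + m[3], // re-insert the decimal point
		})
	}
	return refs
}

func main() {
	q := "Summarize https://example.slack.com/archives/C0123ABCD/p1700000000123456 please"
	for _, ref := range extractSlackRefs(q) {
		fmt.Printf("channel=%s ts=%s\n", ref.ChannelID, ref.Timestamp)
	}
}
```

The resulting channel ID and timestamp are the values a client would typically pass to the Slack Web API to fetch the thread replies.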
MCP and Observability
MCP Server
Running the mcp-server command makes RAGent's hybrid search available to MCP clients as ragent-hybrid_search. The ability to seamlessly access internal documentation and information from Slack while working locally or in Devin will greatly improve developer productivity.
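For readers unfamiliar with MCP, a client invokes a server tool by sending a JSON-RPC 2.0 request with the method tools/call. The Go sketch below builds such a request. The envelope follows the MCP specification, but the rest is an assumption: the tool name is copied as-is from this article and may differ from what the server actually registers, and the "query" argument key is hypothetical, so check the tool's published input schema before relying on it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams is the params object of an MCP tools/call request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// newToolCall serializes a tools/call request for the given tool.
// The "query" argument key is an assumption made for this sketch.
func newToolCall(id int, tool, query string) ([]byte, error) {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params: toolCallParams{
			Name:      tool,
			Arguments: map[string]any{"query": query},
		},
	}
	return json.Marshal(req)
}

func main() {
	body, err := newToolCall(1, "ragent-hybrid_search", "deploy steps")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```

In practice an MCP client such as Devin builds and sends this request itself; the sketch only shows what travels over the wire.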
Observability with OpenTelemetry
With production operations in mind, it supports tracing and metric output via OpenTelemetry (OTel).
The Slack Bot's response speed, search hit count, error rate, etc. can be visualized using Grafana, making it easy to monitor system performance and implement improvement cycles.
Usage example
I can't go into detail here, so I'll show just one example of how it can be used.
There are many other use cases, such as creating specifications and investigating problems.
Automating Internal QA with Devin
We use two features of Devin to automate our internal QA process.
- Playbook: Triggers Devin from Slack, like a custom slash command.
- MCP: Registers an MCP server so that Devin can work with RAGent.

Devin has access to the following data sources, making it the most powerful agent:
- Source code
- RAGent
- Internal documentation (Markdown)
- Internal documents (CSV, Google Drive)
- Slack
- BigQuery
Future RAGent
RAGent currently depends on the following AWS services because we built it to meet internal needs:
- Bedrock (Embedding model used when vectorizing)
- S3 Vectors (vector database)
In the future, we plan to make these components selectable so that users can choose alternatives.
Summary
RAGent is a practical RAG system that skillfully combines AWS managed services and integrates two major sources of information: internal documents and chat logs.
In particular, the "embedding-independent" approach, which incorporates Slack's search functionality directly into the RAG pipeline, is an excellent solution that strikes a good balance between real-time performance and operational load.
It is distributed as a single binary written in Go, so the barrier to adoption is low.
Reference Links
SRG is looking for people to work with us.
If you are interested, please contact us here.
