Introducing "RAGent," a RAG system compatible with various data sources.
I'm Hasegawa (@rarirureluis) from the Service Reliability Group (SRG) of the Media Management Division.
#SRG (the Service Reliability Group) primarily provides comprehensive support for the infrastructure surrounding our media services, focusing on improving existing services, launching new ones, and contributing to open-source software (OSS).
This article explains how to efficiently build a highly accurate RAG system with "RAGent," an open-source tool that integrates internal documents with real-time information from Slack, thereby eliminating knowledge fragmentation.
This time, we'll introduce RAGent, an open-source tool for building RAG systems that can handle Markdown documents, CSVs, and Slack conversation logs in an integrated manner.
This ambitious product offers hybrid search capabilities utilizing Amazon Bedrock and OpenSearch, and also supports MCP.
- Challenges in utilizing internal knowledge and the emergence of RAGent
- RAGent's Architecture and Key Features
- High-precision retrieval through hybrid search
- Data storage and management
- Diverse interfaces
- Embedding independent RAG
- Real-time integration of Slack search
- URL referencing function
- MCP and observability
- MCP Server
- Observability with OpenTelemetry
- Usage example
- Automating internal QA with Devin
- Future of RAGent
- Summary
- Reference links
Challenges in utilizing internal knowledge and the emergence of RAGent
In typical RAG systems, documents are vectorized and stored in a database (Vector Store), but continuously vectorizing highly real-time information like that from Slack presented challenges in terms of operational costs and delays in updates.
RAGent is a CLI tool that attempts to solve this problem by combining vector search of Markdown documents with real-time Slack search.
It is implemented in Go and uses the AWS environment (Bedrock, OpenSearch, S3) as its backend.
RAGent's Architecture and Key Features
RAGent is more than just a search tool; it's a comprehensive system that provides everything from data vectorization to a chat interface, Slack bots, and MCP servers.
Let's take a look at its core architecture and features.
High-precision retrieval through hybrid search
RAGent's search engine employs hybrid search, which combines vector search (semantic search) with BM25 (keyword search).
- Vector search: Use Amazon Bedrock (Titan Text Embeddings v2) to vectorize text and search by semantic similarity.
- BM25: Use OpenSearch (with the kuromoji analyzer for Japanese text) to perform searches based on keyword frequency.
By integrating these results (using RRF or weighted sums), RAGent can surface both documents with similar meanings but no identical wording and documents containing specific technical terms.
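To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF) in Python. It is not taken from RAGent's source code: the function name, the document IDs, and the constant k=60 (a commonly used default) are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results, ordered best-first by each retriever.
vector_hits = ["doc-a", "doc-b", "doc-c"]
bm25_hits = ["doc-c", "doc-a", "doc-d"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that appear in both lists (doc-a and doc-c here) rise above documents found by only one retriever, which is the behavior hybrid search relies on. RRF uses only ranks, so the vector and BM25 scores do not need to be normalized to a common scale.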
Data storage and management
A distinctive feature is the dual-backend configuration: the vector data and metadata generated by the vectorize command are stored in Amazon S3 Vectors, while a search index is created in OpenSearch.
This ensures both data persistence and search performance.
Diverse interfaces
RAGent supports a variety of use cases with the following five commands:
- vectorize
- query
- chat
- slack-bot
- mcp-server
Embedding independent RAG
One of the most interesting features of RAGent is its Slack search integration, which takes an "embedding-independent" approach.
Real-time integration of Slack search
Typically, when working with chat logs in RAG, you need to periodically vectorize messages in batch processing.
With SLACK_SEARCH_ENABLED=true set, RAGent instead generates an answer to a user's question using the following flow:
- Document Search: Information is retrieved from pre-vectorized Markdown/CSV files.
- Slack Search: Messages are searched through the Slack API using the user's query.
- Contextual Integration: Both results are passed to the LLM (Claude) to generate the answer.
This is a clever design that eliminates the need to wait for updates to the vector database and reduces operational costs.
Slack search is implemented to include not only the matching message but also the content of its thread as part of the context, so "ongoing discussions" are reflected in the answer immediately.
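The flow can be sketched roughly as follows. This is an illustrative outline in Python, not RAGent's actual implementation (which is written in Go): the Snippet type, the function names, and the stubbed search functions are all hypothetical, and the real document search, Slack API call, and LLM call are replaced by placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Snippet:
    source: str  # "docs" or "slack"
    text: str


def build_context(
    query: str,
    search_docs: Callable[[str], List[str]],
    search_slack: Callable[[str], List[List[str]]],
    slack_enabled: bool = True,
) -> List[Snippet]:
    """Collect context from pre-vectorized documents and, optionally, Slack."""
    snippets = [Snippet("docs", text) for text in search_docs(query)]
    if slack_enabled:
        # Each Slack hit is a whole thread (list of messages), not one message.
        for thread in search_slack(query):
            snippets.append(Snippet("slack", "\n".join(thread)))
    return snippets


def build_prompt(query: str, snippets: List[Snippet]) -> str:
    """Merge both result sets into a single prompt for the LLM."""
    context = "\n\n".join(f"[{s.source}] {s.text}" for s in snippets)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Stubs standing in for the real vector search and the Slack API.
def fake_search_docs(query: str) -> List[str]:
    return ["Deploys run through the release pipeline."]


def fake_search_slack(query: str) -> List[List[str]]:
    return [["Is the deploy frozen today?", "Yes, until 17:00."]]


question = "Can I deploy today?"
snippets = build_context(question, fake_search_docs, fake_search_slack)
prompt = build_prompt(question, snippets)
print(prompt)
```

Because Slack is queried at question time, nothing from Slack has to be embedded or stored in advance; only the document side depends on the vector index.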
URL referencing function
Furthermore, if you include a Slack message URL (permalink) such as https://slack.com/... in your query, RAGent will automatically retrieve that thread and load it as context.
For Markdown files, RAGent can be configured to index the source URL (article source) of each Markdown file as metadata, allowing you to look up the Markdown file in reverse from its URL.
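As an illustration of what resolving a permalink involves, the sketch below extracts the channel ID and message timestamp from a Slack permalink. It assumes the common permalink shape of /archives/ followed by the channel ID and a "p" plus 16 digits, with an optional thread_ts query parameter. The function name and the example URL are hypothetical, and this is not necessarily how RAGent parses URLs.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed permalink path: /archives/<channel ID>/p<16-digit timestamp>
_PERMALINK_PATH = re.compile(r"^/archives/(?P<channel>[A-Z0-9]+)/p(?P<ts>\d{16})$")


def parse_slack_permalink(url):
    """Return channel and timestamps for a Slack permalink, or None."""
    parsed = urlparse(url)
    if not (parsed.hostname or "").endswith(".slack.com"):
        return None
    match = _PERMALINK_PATH.match(parsed.path)
    if match is None:
        return None
    digits = match.group("ts")
    # The API timestamp format is "<seconds>.<microseconds>".
    message_ts = f"{digits[:10]}.{digits[10:]}"
    # Replies carry the parent's timestamp in thread_ts; otherwise the
    # message itself is treated as the thread root.
    thread_ts = parse_qs(parsed.query).get("thread_ts", [message_ts])[0]
    return {
        "channel": match.group("channel"),
        "message_ts": message_ts,
        "thread_ts": thread_ts,
    }


link = (
    "https://example.slack.com/archives/C0123ABCD/p1700000000123456"
    "?thread_ts=1700000000.000100&cid=C0123ABCD"
)
print(parse_slack_permalink(link))
```

The channel ID and thread timestamp are the two values needed to fetch a thread's replies from the Slack API.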
MCP and observability
MCP Server
The mcp-server command starts an MCP server that makes RAGent's search available to MCP clients as the ragent-hybrid_search tool. The ability to seamlessly access internal documentation and Slack information while working locally or with Devin significantly improves developer productivity.
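For readers unfamiliar with MCP, the sketch below builds the kind of JSON-RPC 2.0 request an MCP client sends to invoke a tool. The tools/call method and the name/arguments layout come from the MCP specification. Treating ragent-hybrid_search as the tool name and "query" as its argument are assumptions, so check the tool's published input schema before relying on them.

```python
import json


def build_tool_call(request_id, query):
    """Build an MCP tools/call request as a plain dictionary."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "ragent-hybrid_search",
            # The argument name "query" is an assumption for illustration.
            "arguments": {"query": query},
        },
    }


request = build_tool_call(1, "How do we rotate database credentials?")
print(json.dumps(request, indent=2))
```

In practice an MCP client library or an agent such as Devin builds and sends this message for you; the sketch only shows what crosses the wire.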
Observability with OpenTelemetry
With production deployment in mind, it supports tracing and metric output using OpenTelemetry (OTel).
The system is designed to facilitate performance monitoring and improvement cycles, as it allows you to visualize Slack bot response speed, search hit count, and error rate using tools like Grafana.
Usage example
I can't go into detail, but here I'll introduce just one example of its use.
There are many other uses as well (such as creating specifications or troubleshooting).
Automating internal QA with Devin
We are using two features of Devin to automate our internal QA process.
- Playbook: Trigger Devin from Slack in a way that resembles a custom slash command.
- MCP: You can register an MCP server and integrate it with RAGent.

This Devin agent is especially powerful because it can access the following data sources:
- Source code
- RAGent
  - Internal documents (Markdown)
  - Internal documents (CSV, Google Drive)
  - Slack
- BigQuery
Future of RAGent
Currently, RAGent depends on the following AWS services because it was built to meet internal company needs:
- Bedrock (the Embedding model used when working with vectors)
- S3 Vectors (Vector Database)
We plan to expand these options in the future so that users can choose from them.
Summary
RAGent is a practical RAG system that cleverly combines AWS managed services to integrate two major sources of information: internal documentation and chat logs.
In particular, the "embedding-independent" approach of directly integrating Slack's search functionality into the RAG pipeline is a solution that strikes an excellent balance between real-time capabilities and operational burden.
Because it is distributed as a single binary written in Go, the barrier to entry is kept low.
Reference links
SRG is looking for new team members.
If you are interested, please contact us here.
