Building Agentic RAG with LlamaIndex - Notes
Link to DeepLearning.AI Section.
Router Query Engine
- import libraries
- set up the API key
- load the document, split it, and get nodes
- set up the LLM and embedding model
- create the indexes: summary and vector
import os
import nest_asyncio
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import SummaryIndex, VectorStoreIndex
from llama_index.core.tools import QueryEngineTool
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from utils import get_router_query_engine

def get_openai_api_key():
    openai_api_key = os.getenv("OPENAI_API_KEY")
    return openai_api_key

OPENAI_API_KEY = get_openai_api_key()

# allow nested event loops (needed for async queries inside a notebook)
nest_asyncio.apply()
# load documents
documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()
# split documents into nodes
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
# set up LLM and embeddings
Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
# create indexes
summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)
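A quick sanity check (not in the original notes, just a hypothetical aside) to confirm the split before building anything on top of it:
# inspect the parsed nodes
print(len(nodes))                    # number of ~1024-token chunks
print(nodes[0].get_content()[:200])  # preview of the first chunk's text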
Summary Index
- Think of it like a book's chapter summaries
- Stores every node and synthesizes over all of them at query time; with response_mode="tree_summarize" the answer is built up hierarchically
- Good for questions that need broad understanding or synthesis of the content
- Better for "What's the main point of...?" or "Summarize..." type questions
- Uses an LLM over every node, which is more expensive but gives better high-level understanding (see the sketch below)
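A minimal sketch (not from the notes) of querying the summary index directly, reusing the summary_index built above:
# query the summary index on its own
summary_qe = summary_index.as_query_engine(response_mode="tree_summarize")
response = summary_qe.query("Summarize MetaGPT in three sentences.")
print(str(response))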
Vector Index
- Think of it like a smart Ctrl+F search
- Converts text chunks into numerical vectors (like GPS coordinates for meaning)
- Good for finding specific information or answering detailed questions
- Better for "Where does it mention...?" or "What are the specific details about...?" type questions
- Uses similarity search over the embeddings to find the top-k relevant chunks, which is faster and cheaper than summarization (see the sketch below)
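The same kind of sketch for the vector index, with an explicit top-k (similarity_top_k=2 is an arbitrary choice here):
# query the vector index on its own
vector_qe = vector_index.as_query_engine(similarity_top_k=2)
response = vector_qe.query("How do agents share information?")
print(str(response))
for node in response.source_nodes:
    print(node.score, node.node_id)  # similarity score per retrieved chunk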
In this code, both are used because they complement each other:
- Summary index (summary_tool) for summarization questions
- Vector index (vector_tool) for specific detail retrieval
Query Engines
- Create 2x query engines: summary_query_engine and vector_query_engine
- Create 2x query engine tools: summary_tool and vector_tool
- Create 1x router query engine: query_engine
# create query engines
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()

# create query engine tools
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)

# create router query engine
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True,
)
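Under the hood, LLMSingleSelector prompts the LLM with the query plus each tool's description and picks exactly one tool. A hedged sketch of inspecting that choice directly (assuming the selector's select() call, which takes tool metadata and a query):
from llama_index.core.tools import ToolMetadata

selector = LLMSingleSelector.from_defaults()
choices = [
    ToolMetadata(
        name="summary_tool",
        description="Useful for summarization questions related to MetaGPT",
    ),
    ToolMetadata(
        name="vector_tool",
        description="Useful for retrieving specific context from the MetaGPT paper.",
    ),
]
result = selector.select(choices, query="What is the summary of the document?")
print(result.selections)  # chosen tool index plus the LLM's stated reason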
Testing queries through the router with query_engine.query(); verbose=True prints which engine gets selected:
# response
response = query_engine.query("What is the summary of the document?")
print(str(response))
This was the response:
Selecting query engine 0: Useful for summarization questions related to MetaGPT.
The document introduces MetaGPT, a meta-programming framework that enhances multi-agent systems using Large Language Models (LLMs) by incorporating human-like Standardized Operating Procedures (SOPs). It assigns specific roles to agents, streamlines workflows, and improves task decomposition, ensuring efficient collaboration through structured outputs and a communication protocol. MetaGPT achieves state-of-the-art performance in code generation benchmarks, emphasizing role specialization, workflow management, and efficient sharing mechanisms. The framework also includes an executable feedback mechanism to iteratively improve code quality. Additionally, the document discusses the software development process with MetaGPT, highlighting its success in achieving superior performance and its potential for future research in human-inspired techniques for artificial multi-agent systems.
print(len(response.source_nodes))
34
All 34 nodes appear as sources because the summary index passes every chunk to the LLM rather than retrieving a subset.
response = query_engine.query(
    "How do agents share information with other agents?"
)
print(str(response))
This was the response:
Selecting query engine 1: This choice is more relevant as it specifically mentions retrieving specific context, which is necessary for understanding how agents share information with other agents..
Agents share information with other agents by utilizing a shared message pool where they can publish structured messages. This shared message pool allows all agents to exchange messages directly, enabling them to both publish their own messages and access messages from other entities transparently. Additionally, agents can subscribe to relevant messages based on their role profiles, allowing them to extract the information they need for their specific tasks and responsibilities.
Router Query Engine Request
Now, building and calling the router query engine in one step via the helper function:
# everything together
query_engine = get_router_query_engine("metagpt.pdf")
response = query_engine.query("Tell me about the ablation study results?")
print(str(response))
The router's final response:
Selecting query engine 1: Ablation study results are specific context from the MetaGPT paper, making choice 2 the most relevant..
The ablation study results show that MetaGPT effectively addresses challenges related to context utilization, code hallucinations, and information overload in software development. By accurately unfolding natural language descriptions, maintaining information validity, and focusing on granular tasks like requirement analysis, MetaGPT mitigates issues such as incomplete implementation, missing dependencies, and undiscovered bugs. Additionally, the use of a global message pool and subscription mechanism helps manage information overload by streamlining communication and filtering out irrelevant contexts, thereby enhancing the relevance and utility of information in software development.
get_router_query_engine("metagpt.pdf")
This function lives in a utils.py file in this setup and is imported at the bottom of the imports list (see above):
# utils.py
from llama_index.core import SimpleDirectoryReader, SummaryIndex, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import QueryEngineTool
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

def get_router_query_engine(file_path: str, llm=None, embed_model=None):
    """Get router query engine."""
    llm = llm or OpenAI(model="gpt-3.5-turbo")
    embed_model = embed_model or OpenAIEmbedding(model="text-embedding-ada-002")

    # load documents and split into nodes
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)

    # build both indexes
    summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes, embed_model=embed_model)

    # query engines
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
        llm=llm,
    )
    vector_query_engine = vector_index.as_query_engine(llm=llm)

    # wrap the engines as tools with routing descriptions
    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_query_engine,
        description=(
            "Useful for summarization questions related to MetaGPT"
        ),
    )
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_query_engine,
        description=(
            "Useful for retrieving specific context from the MetaGPT paper."
        ),
    )

    # router that picks one tool per query
    query_engine = RouterQueryEngine(
        selector=LLMSingleSelector.from_defaults(),
        query_engine_tools=[
            summary_tool,
            vector_tool,
        ],
        verbose=True,
    )
    return query_engine
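Since the helper accepts optional llm and embed_model arguments, models can be swapped without touching the rest of the pipeline. A hypothetical example (the gpt-4o-mini model name is an assumption; any LlamaIndex-supported LLM works):
# reuse the helper with a different LLM (hypothetical model choice)
engine = get_router_query_engine("metagpt.pdf", llm=OpenAI(model="gpt-4o-mini"))
response = engine.query("What benchmarks does MetaGPT evaluate on?")
print(str(response))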