AI Retrieval Systems with GraphRAG
At Kmeleon, we’re committed to helping enterprises navigate the rapidly evolving AI landscape, equipping them with the tools and knowledge needed to stay competitive. As AI continues to advance, techniques like Retrieval-Augmented Generation (RAG) are becoming crucial for enhancing the capabilities of Large Language Models (LLMs). By leveraging external data sources, RAG systems deliver more accurate, relevant, and contextually rich responses. One of the latest breakthroughs in this field is GraphRAG, which combines vector databases with knowledge graphs to create a hybrid retrieval mechanism. This innovation significantly boosts the performance and applicability of RAG systems. In this article, we delve into the intricacies of GraphRAG, its components, and how it can transform your business, aligning perfectly with Kmeleon Tech’s mission to prepare companies for the future.
Hybrid Retrieval Mechanism
GraphRAG uses a hybrid approach that integrates vector databases and knowledge graphs. Vector databases such as Pinecone and Milvus, and similarity-search libraries like Faiss, handle semantic search and similarity matching, excelling at finding contextually similar documents based on embeddings from models like BERT or GPT. Knowledge graph databases such as Neo4j and Amazon Neptune handle structured queries and relationships between entities, allowing for data retrieval based on complex relationships and predefined schemas.
This hybrid mechanism sets the foundation for combining search results in a way that optimizes both precision and relevance.
Combining Search Results
In GraphRAG, initial candidates are retrieved from the vector database based on semantic similarity. These results are then refined or augmented using the knowledge graph to ensure they meet specific relational criteria or contain certain entities. This involves implementing a ranking algorithm that considers both semantic similarity scores from the vector database and relevance scores from the knowledge graph, ensuring that only the most contextually and relationally relevant documents are selected.
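To make this concrete, below is a minimal sketch of one way such a ranking could work. The 0.7/0.3 weighting and the saturation at ten graph matches are illustrative assumptions, not prescribed values; in practice you would tune them against your own relevance data.

def combine_scores(semantic_score, graph_hits, alpha=0.7):
    # Normalize the graph signal into [0, 1], saturating at 10 matches
    graph_score = min(graph_hits, 10) / 10.0
    # Weighted blend of vector similarity and graph relevance
    return alpha * semantic_score + (1 - alpha) * graph_score

candidates = [
    {"text": "Doc A", "semantic_score": 0.91, "graph_hits": 2},
    {"text": "Doc B", "semantic_score": 0.84, "graph_hits": 9},
]
ranked = sorted(
    candidates,
    key=lambda c: combine_scores(c["semantic_score"], c["graph_hits"]),
    reverse=True,
)
for c in ranked:
    print(c["text"], round(combine_scores(c["semantic_score"], c["graph_hits"]), 3))

Note how Doc B, despite a lower similarity score, ranks first once its stronger graph support is blended in; that reordering is exactly what the graph refinement step is for.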
Contextual Embeddings and Entity Linking
GraphRAG enhances embeddings by incorporating entity information from the knowledge graph, making vectors more contextually aware. Entity linking helps identify and connect entities in input queries, improving the precision of information retrieval from the vector database.
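As a rough illustration of the idea, the sketch below prepends linked entity names to a document before it is embedded, so the resulting vector carries explicit entity context. Here link_entities is a hypothetical stand-in for a real entity-linking step against your knowledge graph.

def link_entities(text):
    # Hypothetical: return canonical entity names found in the text
    return ["GraphRAG", "Neo4j"]

def contextualize(text):
    entities = link_entities(text)
    return "Entities: " + ", ".join(entities) + "\n" + text

doc = "GraphRAG refines Neo4j-backed retrieval with graph context."
print(contextualize(doc))  # this enriched string is what gets embedded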
Below is a Python code snippet that demonstrates how to use LangChain with a Neo4j graph database and a Qdrant vector database to implement a basic GraphRAG system locally.
Pre-requisites:
Make sure you have Qdrant installed and running in your local environment.
Make sure you have Neo4j set up and running.
Make sure your OpenAI API key is available as the OPENAI_API_KEY environment variable, since the snippet uses OpenAI embeddings.
Installation:
pip install langchain openai neo4j qdrant-client
Snippet:
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Qdrant
from neo4j import GraphDatabase
from qdrant_client import QdrantClient

# Initialize the Neo4j graph database driver
neo4j_uri = "bolt://localhost:7687"
neo4j_user = "neo4j"
neo4j_password = "your_password"
graph_db = GraphDatabase.driver(neo4j_uri, auth=(neo4j_user, neo4j_password))

# Initialize the Qdrant vector database client
qdrant_client = QdrantClient("localhost", port=6333)

# Initialize the embedding model (reads OPENAI_API_KEY from the environment)
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

# Initialize the vector store with Qdrant
vector_store = Qdrant(client=qdrant_client, collection_name="documents", embeddings=embeddings)

# Placeholder functions, defined before they are used
def extract_entities(text):
    # Implement entity extraction logic here
    return ["GraphRAG", "Vector Databases", "Knowledge Graphs"]

def rank_and_combine_results(candidates, refined_results):
    # Implement ranking and combination logic here
    return refined_results

# Define your query
query = "What are the key advantages of GraphRAG?"

# Step 1: Retrieve candidates from the vector store based on semantic similarity
candidates = vector_store.similarity_search(query, k=5)

# Step 2: Use the Neo4j knowledge graph to refine the results
refined_results = []
for candidate in candidates:
    # similarity_search returns Document objects; the text is in page_content
    entities = extract_entities(candidate.page_content)
    # Query the knowledge graph for relationships involving those entities
    with graph_db.session() as session:
        records = session.run(
            "MATCH (e1)-[r]->(e2) "
            "WHERE e1.name IN $entities OR e2.name IN $entities "
            "RETURN e1, r, e2",
            {"entities": entities},
        )
        # Keep the candidate only if the graph returned at least one match
        if records.peek() is not None:
            refined_results.append(candidate)

# Combine results from both the vector store and the knowledge graph
final_results = rank_and_combine_results(candidates, refined_results)

# Output the final results
for result in final_results:
    print(result.page_content)
Explanation:
Components:
OpenAIEmbeddings: A LangChain wrapper used to create document embeddings, allowing for semantic search.
Qdrant: The LangChain vector store integration that handles similarity search against the Qdrant collection.
Neo4j Python driver: Executes the Cypher queries that check entity relationships in the graph database.
Vector Store Search:
The query is first passed to the Qdrant vector store to retrieve the most semantically similar documents.
Graph Refinement:
The candidate documents are then refined by querying the Neo4j knowledge graph to ensure they meet specific relational criteria.
Final Combination:
The results from both the vector store and graph database are combined and ranked to produce the most relevant and accurate final results.
Placeholders:
The extract_entities and rank_and_combine_results functions are placeholders where you can implement your logic for entity extraction and result ranking.
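For example, a minimal real implementation of extract_entities could use spaCy's pretrained NER pipeline (assuming spacy is installed and the en_core_web_sm model has been downloaded); what it returns depends entirely on the model, so domain-specific entities may need a custom linker.

import spacy

nlp = spacy.load("en_core_web_sm")

def extract_entities(text):
    # Collect unique entity surface forms, preserving order of appearance
    doc = nlp(text)
    seen = []
    for ent in doc.ents:
        if ent.text not in seen:
            seen.append(ent.text)
    return seen

print(extract_entities("Neo4j and Amazon Neptune are graph databases."))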
Feedback Loop and Continuous Learning
User feedback mechanisms allow for continuous improvement of embeddings and the knowledge graph’s structure. Regular updates ensure that the vector database and knowledge graph reflect new information and changing contexts, maintaining the system's relevance and accuracy.
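One lightweight way to act on such feedback, assuming the Qdrant collection from the snippet above, is to re-embed a corrected document and upsert it over the old point; the point id and payload schema here are illustrative.

from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct

client = QdrantClient("localhost", port=6333)

def apply_feedback(point_id, corrected_text, embed):
    # embed is any callable mapping text to a vector, e.g. the
    # embed_query method of an OpenAIEmbeddings instance
    vector = embed(corrected_text)
    client.upsert(
        collection_name="documents",
        points=[PointStruct(id=point_id, vector=vector,
                            payload={"page_content": corrected_text})],
    )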
Use of Metadata and Annotations
The system enriches documents with metadata extracted from the knowledge graph, aiding better indexing and retrieval within the vector database. Annotations with relevant entities and relationships from the knowledge graph further enhance matching and relevance scoring.
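As a sketch of what this can look like with Qdrant, graph-derived annotations can be stored in each point's payload and used as a filter during similarity search. The "entities" payload field and the dummy query vector are assumptions for this example; a real query vector would come from the same embedding model used at indexing time.

from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue

client = QdrantClient("localhost", port=6333)

# Only consider documents whose payload was annotated with this entity
graph_filter = Filter(must=[
    FieldCondition(key="entities", match=MatchValue(value="GraphRAG")),
])

hits = client.search(
    collection_name="documents",
    query_vector=[0.1] * 1536,  # stand-in for a real 1536-dim query embedding
    query_filter=graph_filter,
    limit=5,
)
for hit in hits:
    print(hit.payload)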
Advanced Query Techniques
Knowledge graphs expand initial queries with related entities and concepts, retrieving a broader set of relevant documents. They also handle complex queries involving multiple entities and their relationships, which might be challenging to represent in vector space alone.
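A simple version of graph-based query expansion, assuming the Neo4j setup from the snippet above, is to pull the names of entities one hop away from the query's entities and append them to the search string. The Cypher pattern and the one-hop limit are illustrative choices.

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "your_password"))

def expand_query(query, entities):
    with driver.session() as session:
        records = session.run(
            "MATCH (e)-[]-(related) "
            "WHERE e.name IN $entities "
            "RETURN DISTINCT related.name AS name LIMIT 10",
            {"entities": entities},
        )
        neighbors = [r["name"] for r in records]
    # Append neighbor names so the vector search sees the wider context
    return query + " " + " ".join(n for n in neighbors if n)

print(expand_query("advantages of GraphRAG", ["GraphRAG"]))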
Addressing Model Output Accuracy and Hallucinations
Model output accuracy and hallucinations are significant obstacles preventing enterprises from moving LLM use cases into production. GraphRAG not only enhances retrieval accuracy but also mitigates the risk of hallucinations—where models generate plausible-sounding but incorrect information—by cross-verifying retrieved data with structured knowledge graphs.
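The sketch below illustrates the cross-verification idea under the same Neo4j assumptions as above: before surfacing a generated claim that two entities are related, check that the graph actually contains some relationship between them, and flag the claim if it does not.

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "your_password"))

def relationship_exists(entity_a, entity_b):
    with driver.session() as session:
        record = session.run(
            "MATCH (a {name: $a})-[r]-(b {name: $b}) "
            "RETURN count(r) AS n",
            {"a": entity_a, "b": entity_b},
        ).single()
    return record["n"] > 0

if not relationship_exists("GraphRAG", "Knowledge Graphs"):
    print("Claim not supported by the graph; flag for review.")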
Fine-tuning and RAG are the principal methods companies use to customize LLMs, each with its own advantages. Combining these methods with considerations like small language models (SLMs), inference optimization, model routing, and agentic design patterns helps enterprises create more sophisticated deployment strategies.
The Need for Augmentation
As Databricks CTO Matei Zaharia points out, state-of-the-art AI results increasingly rely on compound systems with multiple components. Databricks found that 60% of LLM applications use some form of RAG, highlighting the importance of augmenting out-of-the-box LLMs with additional data sources to enhance performance and reliability.
Economic Advantages of RAG
Efficient fine-tuning techniques like LoRA can enhance model performance for specific tasks but can be costly. In contrast, RAG offers economic benefits by allowing enterprises to update retrieval data sources cheaply, supporting real-time factuality. Each step of the RAG process can be optimized for performance gains, and RAG combined with citations can provide an audit trail, increasing trust in model outputs.
The Role of Knowledge Graphs
Knowledge graphs are a crucial yet often overlooked component of the RAG stack. They enhance baseline RAG with additional context specific to a company or domain, significantly improving RAG performance. By organizing data into interconnected entities and relationships, knowledge graphs ensure that RAG-generated responses are accurate and contextually relevant.
Components of the RAG Stack
The emerging RAG stack comprises several components:
Data Pipes/Extraction: Unstructured data is piped in from various locations, with data extraction being a key bottleneck.
Vector Databases: Store mathematical embeddings representing the underlying text.
Vector Ops: Streamline processes related to vector creation, optimization, and analytics.
Graph Databases: Store graph structures representing semantic connections and relationships.
Graph Ops: Streamline processes related to graph creation, orchestration, optimization, and analytics.
LLM Orchestration: Manage information manipulation with multi-agent systems.
LLMs: The foundational models that RAG systems build upon to generate appropriate and relevant responses.
Knowledge graphs function within RAG as data stores for retrieving information and as semantic structures to retrieve vector chunks. GraphRAG can handle complex queries and aggregate information across datasets, offering significant advantages over baseline RAG.
Evaluation and Results
To evaluate the effectiveness of GraphRAG, we compared it against naive RAG and hierarchical source-text summarization, using GPT-4 as the underlying LLM. We generated a diverse set of activity-centered sense-making questions from two datasets: podcast transcripts and news articles. The evaluation was based on three key metrics:
Comprehensiveness: Ensures all aspects are covered in detail.
Diversity: Provides different perspectives.
Empowerment: Supports informed decision-making.
The results were compelling:
GraphRAG outperformed naive RAG on comprehensiveness and diversity, achieving a win rate of approximately 70–80%.
When using intermediate- and low-level community summaries, GraphRAG also surpassed source text summarization in these metrics at lower token costs (20–70% token use per query).
Performance remained competitive with hierarchical source text summarization for the highest-level communities while maintaining substantially lower token costs (2–3% token use per query).
These results highlight the efficiency and effectiveness of GraphRAG in producing more comprehensive and diverse outputs while maintaining a lower computational cost.
At Kmeleon, we believe in empowering businesses with the latest advancements in AI to ensure they remain ahead of the curve. GraphRAG represents a significant leap forward in AI, seamlessly blending the strengths of vector databases and knowledge graphs to elevate Retrieval-Augmented Generation systems. By adopting these cutting-edge technologies, enterprises can unlock more accurate, relevant, and contextually rich AI outputs. As this technology evolves, GraphRAG offers the potential to revolutionize various industries, opening up new avenues for information retrieval and AI-driven decision-making. Kmeleon Tech is here to help your business harness these innovations, paving the way for a future-ready enterprise. We invite you to contact us and explore how we can drive your company’s success through the power of Gen AI.