QAFD-RAG

Paper · Code (coming soon) · Data

Query-Aware Flow Diffusion for Graph-Based
Retrieval-Augmented Generation

QAFD-RAG is a graph-based retrieval-augmented generation framework that uses query-aware flow diffusion to retrieve contextually relevant subgraphs from a knowledge graph with statistical retrieval guarantees.

The Problem

Existing graph-based RAG methods rely on static retrieval strategies, such as community detection (GraphRAG) or one-hop entity lookup (LightRAG), that ignore query context and miss relevant multi-hop connections.

Our Solution

QAFD-RAG dynamically re-weights graph edges based on query relevance and propagates flow through the knowledge graph to discover multi-hop context, assembling a query-specific subgraph for the LLM.
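The core idea can be sketched in a few lines. The following is a minimal, illustrative implementation of query-aware flow diffusion, not the paper's actual algorithm: it assumes unit-norm embeddings, a fixed sink capacity per node, and proportional pushing of excess mass along query-re-weighted edges. All names and parameters are hypothetical.

```python
import numpy as np

def query_aware_flow_diffusion(adj, node_emb, query_emb, seeds,
                               capacity=1.0, source_mass=10.0,
                               max_iters=100, eps=1e-6):
    """Sketch of query-aware flow diffusion (illustrative only).

    adj:       dict mapping node -> {neighbor: base_edge_weight}
    node_emb:  dict mapping node -> unit-norm embedding vector
    query_emb: unit-norm query embedding
    seeds:     query-matched nodes that receive the initial source mass
    """
    # Re-weight each edge by the query relevance of its endpoints
    # (one possible scheme; negative relevance zeroes the edge out).
    def w(u, v):
        rel = 0.5 * (node_emb[u] @ query_emb + node_emb[v] @ query_emb)
        return adj[u][v] * max(rel, 0.0)

    mass = {u: 0.0 for u in adj}
    for s in seeds:
        mass[s] = source_mass / len(seeds)

    for _ in range(max_iters):
        pushed = False
        for u in adj:
            excess = mass[u] - capacity
            if excess <= eps:
                continue
            # Node u keeps `capacity` mass and pushes the excess to
            # neighbors proportionally to query-aware edge weights.
            weights = {v: w(u, v) for v in adj[u]}
            total = sum(weights.values())
            if total <= eps:
                continue
            mass[u] = capacity
            for v, wt in weights.items():
                mass[v] += excess * wt / total
            pushed = True
        if not pushed:
            break

    # Nodes that end up holding mass form the query-specific subgraph.
    return {u: m for u, m in mass.items() if m > eps}
```

Because edges pointing toward query-irrelevant regions get near-zero weight, mass never diffuses into them, which is how irrelevant neighborhoods are suppressed.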

Approach Comparison

How different retrieval strategies explore a knowledge graph

GraphRAG: Community Detection
LightRAG: One-Hop Entity-Centric
QAFD-RAG (Ours): Query-Aware Flow Diffusion

Comparison on Wikipedia pages (Apple fruit, Apple Inc., Amazon River, Amazon.com). Query: "Introduce Steve Jobs's products in Apple." GraphRAG retrieves entire communities, mixing relevant nodes with irrelevant ones. LightRAG focuses on 1-hop neighborhoods. QAFD-RAG re-weights edges by query relevance, suppressing irrelevant neighborhoods.

How It Works

A two-stage pipeline from raw documents to grounded LLM answers

Stage 1 — Knowledge Graph Construction

  1. Chunking — Documents split into overlapping token-based chunks.
  2. Entity Extraction — LLM identifies entities and relationships.
  3. Graph Assembly — Entities become nodes, relationships become weighted edges.
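The construction stage above can be sketched as follows. This is a simplified stand-in, not the repository's implementation: `chunk_document` splits by characters rather than tokens, and `assemble_graph` assumes the LLM extraction step yields (head, relation, tail) triples. Both function names are hypothetical.

```python
def chunk_document(text, chunk_size=1200, overlap=100):
    """Split text into overlapping chunks (character-based stand-in
    for the token-based chunking described above)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def assemble_graph(triples):
    """Merge LLM-extracted (head, relation, tail) triples into a
    weighted graph; repeated relations increase edge weight."""
    nodes, edges = set(), {}
    for head, relation, tail in triples:
        nodes.update([head, tail])
        key = (head, tail)
        weight, rels = edges.get(key, (0.0, []))
        edges[key] = (weight + 1.0, rels + [relation])
    return nodes, edges
```

Overlapping chunks ensure entities split across a chunk boundary are still seen whole in at least one chunk, and merging duplicate triples lets edge weights reflect how often a relationship is attested.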

Stage 2 — Query-Aware Retrieval

  1. Entity Matching — Query keywords matched to KG entities via vector similarity.
  2. Flow Diffusion — Mass propagates with query-aware edge re-weighting.
  3. Context Assembly — Top-ranked nodes form context for the LLM.
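Steps 1 and 3 of the retrieval stage can be sketched as below (step 2 is the diffusion itself). This is an illustrative sketch under assumed conventions, with cosine similarity on unit-norm embeddings for entity matching and a simple top-k cut for context assembly; the names are hypothetical, not the repository's API.

```python
import numpy as np

def match_entities(query_emb, entity_embs, top_k=5):
    """Rank KG entities by cosine similarity to the query embedding.
    entity_embs: dict of entity name -> unit-norm vector."""
    scores = {e: float(v @ query_emb) for e, v in entity_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def assemble_context(node_mass, node_text, budget=3):
    """Keep the top-`budget` nodes by diffusion mass and join their
    descriptions into a context string for the LLM prompt."""
    top = sorted(node_mass, key=node_mass.get, reverse=True)[:budget]
    return "\n".join(f"- {n}: {node_text[n]}" for n in top)
```

The matched entities seed the flow diffusion, and the per-node mass it returns provides the ranking that `assemble_context` consumes.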

Benchmarks

Evaluated across diverse retrieval and generation tasks

UltraDomain
Diverse Domain QA
Comprehensiveness · Diversity · Relevance · Logicality · Coherence
Multi-hop QA
MuSiQue · HotpotQA · 2WikiMultiHopQA
F1 · Exact Match
Text-to-SQL
Spider2-lite (Pagila, etc.)
Schema Retrieval Accuracy
Summarization
SQuALITY
BLEU · ROUGE · METEOR

Supported Models

Embeddings: openai-small · openai-large · jina-v3 · gritlm · nvidia-nv-embed-v2
LLMs: gpt-4o-mini · gpt-4o · gpt-5-nano · gpt-5-mini · gpt-5 · gpt-oss-120b

Quick Start

# 1. Install dependencies
pip install -r requirements.txt
# 2. Set your OpenAI API key
export OPENAI_API_KEY="sk-..."
# 3. Build a knowledge graph
./run.sh ultradomain --build --max-documents 100
# 4. Run a benchmark
./run.sh ultradomain --questions 10

Python API

from src import QAFD_RAG, QueryParam
rag = QAFD_RAG(
    working_dir="./my_kg",
    llm_model_name="gpt-4o-mini",
    embedding_model_key="jina-v3",
)
# Index documents
rag.insert(["Document text 1...", "Document text 2..."])
# Query
answer = rag.query("What is X?", param=QueryParam(mode="hybrid"))
print(answer)

Citation

@inproceedings{zhou2026qafd,
  title={Query-Aware Flow Diffusion for Graph-Based RAG with Retrieval Guarantees},
  author={Zhuoping Zhou and Davoud Ataee Tarzanagh and Sima Didari and Wenjun Hu
          and Baruch Gutow and Oxana Verkholyak and Masoud Faraki and Heng Hao
          and Hankyu Moon and Seungjai Min},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}