Learn Before
Dense Seed Pool with L2-Normalized all-MiniLM-L6-v2 Embeddings and Inner-Product Search
MOOC-CS: Language-Matched Controls (Results) in Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls
Method Part 1: Multilingual Encoder + CJK Query Rewrite as a MOOC-CS Control (Auditable Strict-Parity Graph-RAG Paper)
Method Part 2: MOOC-CS Prerequisite Benchmark (Auditable Strict-Parity Graph-RAG Paper)
Graph Effect on MOOC-CS Is Conditional on Dense Seed Pool Quality
On MOOC-CS, the paper concludes that the graph effect is conditional on the quality of the dense seed pool. Under MiniLM + English-template queries, the gap between flat-dense and hierarchical-baseline R@10 is small (16.0 vs. 23.1, a 7.1-point gain). Under multilingual + CJK-only queries, the same fixed-depth hierarchical policy widens that gap to 49.2 vs. 68.1 (an 18.9-point gain), with the adaptive policy close behind at 65.5. Because the graph policy is unchanged across rows, the much larger graph gain in the language-matched configuration is attributable to a better dense seed pool, not to a different traversal policy.
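The arithmetic behind this comparison can be made explicit with a minimal sketch. It uses only the four recall values quoted above and assumes that, in each pair, the lower number is the flat-dense score and the higher one is the hierarchical score. The dictionary keys and the `graph_gain` helper are illustrative names, not part of the paper.

```python
# Recall values quoted in the text above (in percentage points).
# Assumption: the first number of each pair is flat dense, the second hierarchical.
RECALL = {
    "MiniLM + English-template": {"flat_dense": 16.0, "hierarchical": 23.1},
    "multilingual + CJK-only": {"flat_dense": 49.2, "hierarchical": 68.1},
}


def graph_gain(row):
    """Absolute gain of the hierarchical policy over flat dense retrieval."""
    return round(row["hierarchical"] - row["flat_dense"], 1)


# The traversal policy is the same in both rows, so any difference in gain
# comes from the seed pool the policy starts from.
gains = {name: graph_gain(row) for name, row in RECALL.items()}

for name, gain in gains.items():
    print(f"{name}: graph gain = {gain:.1f} points")
```

Read this way, the graph adds 7.1 points on top of the weaker seed pool and 18.9 points on top of the language-matched one, which is the pattern the conclusion rests on.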
Tags
Science
Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls
Related
Fixed Top-m Dense Seed Pool as a Strict-Parity Control
Graph Effect on MOOC-CS Is Conditional on Dense Seed Pool Quality
Flat Dense Retrieval Baseline in Strict-Parity Prerequisite Retrieval
Template Stripping on MOOC-CS Raises Hierarchical R@10 from 23.1 to 26.5 (MiniLM Encoder)
Multilingual Encoder Alone Does Not Improve MOOC-CS Recall (Hierarchical R@10 = 22.3 vs 23.1)
Multilingual Encoder + CJK-Only Queries Jumps MOOC-CS Hierarchical R@10 to 68.1
MiniLM Encoder + CJK-Only Queries on MOOC-CS: Hierarchical R@10 Rises from 23.1 to 26.5 with Flat Dense at 21.7
MOOC-CS Error Taxonomy: Residual Failures Dominated by Distant Misses and Bilingual Aliasing
Language-Matched Seeding as a Prerequisite for Graph-Expansion Gains