---
title: Hybrid Search Implementation
category: product
entity_type: skill
price: $15
canonical: https://forgehouse.ai/skills/hybrid-search-implementation/
lang: en
hreflang_alt: https://forgehouse.ai/tr/skiller/hybrid-search-implementation/
last_updated: 2026-06-20
---

# Hybrid Search Implementation

> Combine vector and keyword search for improved retrieval.

Combines vector similarity and keyword (BM25) search into one retrieval pipeline so you catch both semantic matches and exact terms like names, codes, and domain jargon that pure vector search misses. It fuses the two signals with RRF or weighted scoring, optionally reranks with a cross-encoder, and logs every score so recall regressions are debuggable.

## Use cases
- Building a RAG system that needs higher recall than vector search alone
- Handling queries with exact terms: product codes, names, SKUs
- Improving search over domain-specific vocabulary and synonyms
- Adding cross-encoder reranking on top of a fused candidate set
- Tuning the vector-versus-keyword balance per query type
- Diagnosing a recall drop by tracing which leg returned each result

## Benefits
- Lift recall noticeably by catching what semantic search and keyword search each miss
- Return relevant results for exact-match queries that pure vector search drops
- Trade latency for quality deliberately with cascade reranking on only the candidates that need it
- Debug recall regressions fast because every result carries its vector, keyword, and fused score

## What’s included
- Reciprocal Rank Fusion and normalized linear-combination fusion implementations
- PostgreSQL hybrid search with pgvector HNSW plus full-text GIN and in-query RRF
- Elasticsearch hybrid search using script_score and native 8.x RRF ranking
- A complete hybrid RAG pipeline class with parallel search, fusion, and optional reranking
- Cross-encoder reranking step and a cascade pattern to cut P99 latency
- Best-practice and anti-pattern guidance on weight tuning, over-fetch, and edge cases

## Who it’s for
ML and backend engineers building RAG systems or search engines where neither vector nor keyword search alone gives sufficient recall.

## How it runs
Vectors understand intent but miss product codes; keywords nail exact matches but miss meaning. This pipeline indexes both, fuses the two rankings properly, and keeps every result traceable to its source leg.
1. Indexes every document twice in the same store: a dense vector embedding under an HNSW index for semantic intent, and a full-text keyword index for exact matches like product codes and names
2. At query time, over-fetches roughly 3x the requested result count from both legs in parallel, because fusion needs headroom to surface documents one leg missed
3. Fuses the two ranked lists with Reciprocal Rank Fusion or a min-max-normalized weighted sum; raw scores are never compared directly because BM25 and cosine similarity live on different scales
4. Optionally cascades into a cross-encoder reranker applied only to the top 20-50 candidates, where the expensive model is affordable and lifts final ordering quality
5. Attaches three scores to every result (vector, keyword, fused) plus a source tag, so a future recall drop can be traced to the exact failing leg instead of blind debugging
6. Tunes weights empirically per query type: keyword-heavy queries shift weight toward exact match, natural-language queries shift toward semantic, validated with A/B tests rather than defaults

## FAQ
### We run PostgreSQL, not Elasticsearch. Is that covered?
Yes. There's a complete PostgreSQL setup combining a pgvector HNSW index with a full-text GIN index and in-query RRF fusion. Elasticsearch 8.x gets its own implementation using script_score and native RRF, so you pick whichever store you already run.

### If I use a bigger embedding model, do I still need the keyword leg?
Yes. Exact terms like product codes, SKUs, and names keep slipping through pure vector search no matter the model size. The BM25 leg catches those, the two signals fuse via RRF or weighted scoring, and every result logs which leg returned it so recall regressions stay debuggable.

### Does it tune the vector-versus-keyword weights for my data automatically?
No. It gives per-query-type balance guidance and warns about common weight-tuning anti-patterns, but the right weights come from testing against your own queries. The score logging makes that testing practical; the decision stays yours.

## Price
$15, one-time, no subscription. VAT included.

Related guide: [AI and LLM engineering](https://forgehouse.ai/guides/ai-llm-engineering/)