When a reader types "how do I set up single sign-on" into your knowledge base search, they expect to find your SAML Configuration Guide, even though the words don't match. Traditional keyword search fails here. Semantic search doesn't.
This post explains exactly how FinalDoc's AI-powered search works, from embedding generation to query-time ranking.
The Embedding Pipeline
When you publish an article, FinalDoc runs a background process:
- Chunking: the article is split into overlapping chunks of 500-1000 tokens. Overlap ensures we don't lose context at chunk boundaries.
- Embedding: each chunk is sent to OpenAI's text-embedding-3-small model, which returns a 1536-dimensional vector
- Storage: vectors are stored in PostgreSQL using the pgvector extension, alongside the chunk text and article metadata
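The chunking step can be sketched in a few lines. This is a simplified illustration, not FinalDoc's actual implementation; the 750-token chunk size and 100-token overlap are example values within the 500-1000 range described above:

```python
def chunk_text(tokens, chunk_size=750, overlap=100):
    """Split a token list into overlapping chunks.

    chunk_size and overlap are illustrative values, not
    FinalDoc's actual configuration.
    """
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # final chunk reaches the end of the article
    return chunks

tokens = list(range(2000))  # stand-in for real token IDs
chunks = chunk_text(tokens)
```

Each chunk shares its first 100 tokens with the tail of the previous chunk, so a sentence that straddles a boundary still appears intact in at least one chunk.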
The embedding captures the meaning of the text, not just the words. "SSO setup" and "single sign-on configuration" produce nearly identical vectors because they mean the same thing.
Hybrid Search
Pure semantic search is powerful but not perfect. It can miss exact term matches: if someone searches for an error code like ERR_AUTH_FAILED, semantic similarity might not rank the right article first.
FinalDoc uses a hybrid approach that combines three search methods:
1. Semantic Search (pgvector)
Convert the query to an embedding, then find the nearest article chunks using cosine similarity:
SELECT * FROM article_chunks ORDER BY embedding <=> query_embedding LIMIT 10
This finds conceptually related content regardless of keyword overlap.
2. Full-Text Search (PostgreSQL tsvector)
Standard PostgreSQL full-text search with ranking. Handles exact matches, stemming, and phrase queries:
SELECT * FROM articles WHERE search_vector @@ plainto_tsquery('english', query)
3. Fuzzy Search (pg_trgm)
Trigram similarity for typo tolerance. When a user types "authenication" instead of "authentication," fuzzy search still finds the right articles.
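Trigram similarity is easy to model: each string is broken into three-character sequences, and the score is the overlap between the two trigram sets. The sketch below is a simplified version of pg_trgm's behavior (the real extension also normalizes non-alphanumeric characters):

```python
def trigrams(s):
    # pg_trgm pads with two leading spaces and one trailing space
    s = "  " + s.lower() + " "
    return {s[i:i + 3] for i in range(len(s) - 2)}

def similarity(a, b):
    """Jaccard overlap of trigram sets, as pg_trgm computes it."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)
```

Even with the typo, "authenication" and "authentication" share most of their trigrams, so the score stays well above pg_trgm's default 0.3 match threshold.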
Scoring and Ranking
Results from all three methods are merged using a weighted scoring formula:
- Semantic similarity: 50% weight (highest, because intent matching is most valuable)
- Full-text relevance: 35% weight (exact matches should rank high)
- Fuzzy similarity: 15% weight (catches typos and near-matches)
Articles that appear in multiple result sets get boosted. An article that's both semantically similar AND contains the exact keywords is almost certainly the right result.
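The merge step can be sketched as a weighted sum over per-method scores. This is an illustrative implementation, assuming each method returns normalized scores in [0, 1] keyed by article ID:

```python
def merge_results(semantic, fulltext, fuzzy):
    """Combine per-method scores with the 50/35/15 weights above.

    Each argument is a dict of article_id -> score in [0, 1].
    Articles found by several methods accumulate score from each,
    which is what boosts multi-method matches to the top.
    """
    weights = {"semantic": 0.50, "fulltext": 0.35, "fuzzy": 0.15}
    combined = {}
    for name, results in (("semantic", semantic),
                          ("fulltext", fulltext),
                          ("fuzzy", fuzzy)):
        for article_id, score in results.items():
            combined[article_id] = combined.get(article_id, 0.0) + weights[name] * score
    # Highest combined score first
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```

Note how an article with semantic score 0.9 plus full-text score 0.8 (combined 0.73) outranks one with a perfect semantic score but no keyword match (0.50).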
Performance
Search must be fast. Readers expect results as they type. Our targets:
- < 200ms for semantic search on 10,000 article chunks
- < 50ms for full-text search with GIN index
- < 100ms for fuzzy search with trigram index
pgvector uses HNSW (Hierarchical Navigable Small World) indexes for approximate nearest neighbor search. This gives us O(log n) query time instead of O(n) brute-force comparison.
We also use Redis caching for repeat queries. The same search query returns cached results for 60 seconds, eliminating database hits for common searches.
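The caching pattern looks roughly like this. The sketch uses an in-process dict as a stand-in for Redis so it stays self-contained, but the shape matches the Redis version: a SETEX-style write with a 60-second TTL on a key derived from the normalized query:

```python
import hashlib
import json
import time

class SearchCache:
    """In-process stand-in for the Redis search cache (illustrative)."""

    def __init__(self, ttl=60):
        self.ttl = ttl
        self.store = {}

    def _key(self, query):
        # Normalize so "SSO Setup" and "sso setup" share one cache entry
        return hashlib.sha256(query.strip().lower().encode()).hexdigest()

    def get(self, query):
        entry = self.store.get(self._key(query))
        if entry is None:
            return None
        expires_at, value = entry
        if time.time() >= expires_at:
            return None  # expired; caller falls through to the database
        return json.loads(value)

    def set(self, query, results):
        # Equivalent of Redis SETEX key 60 <serialized results>
        self.store[self._key(query)] = (time.time() + self.ttl, json.dumps(results))
```

Normalizing the key means trivially different spellings of the same query hit the same cached result.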
Smart Suggestions
When all three search methods return zero results, FinalDoc doesn't show an empty page. Instead, the AI generates smart suggestions:
- The zero-result query is embedded and compared against all article chunks at a lower similarity threshold
- The top 3 loosely related articles are shown as "You might be looking for..."
- The AI chatbot offers to answer the question directly: "Can't find what you need? Ask me!"
This turns a dead-end into an engagement opportunity. Zero-result queries are also logged for your content team: they're signals that you need to write new articles.
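The fallback can be sketched as an embedding search with a relaxed cutoff. The 0.35 threshold below is an illustrative value, not FinalDoc's actual setting, and the chunk records are simplified to just an article ID and a vector:

```python
def smart_suggestions(query_embedding, chunks, threshold=0.35, top_k=3):
    """Return up to top_k loosely related article IDs for a
    zero-result query, ranked by cosine similarity."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb)

    scored = [(cosine(query_embedding, c["embedding"]), c["article_id"])
              for c in chunks]
    scored.sort(reverse=True)

    seen, suggestions = set(), []
    for score, article_id in scored:
        # Dedupe chunks from the same article; drop anything below
        # the relaxed threshold
        if score >= threshold and article_id not in seen:
            seen.add(article_id)
            suggestions.append(article_id)
        if len(suggestions) == top_k:
            break
    return suggestions
```

Because the threshold is lower than the one used for normal search, the reader sees plausible neighbors rather than an empty page.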
Embedding Management
Embeddings need to stay in sync with content. FinalDoc handles this automatically:
- On publish: new article → generate embeddings
- On update: changed article → regenerate embeddings for modified chunks
- On delete: removed article → delete associated embeddings
- Bulk embed: admin command to regenerate all embeddings (useful after initial import)
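The lifecycle rules above reduce to a small dispatch function. This sketch keeps embeddings in an in-memory dict and elides chunking; `embed` is a stand-in for the OpenAI call, not a real API:

```python
def sync_embeddings(event, article, index, embed):
    """Keep an embedding index (article_id -> list of vectors) in
    sync with article lifecycle events. Illustrative only: chunking
    is elided and `embed` is a caller-supplied stand-in."""
    if event in ("publish", "update"):
        # Publish and update both (re)generate embeddings;
        # a real implementation would re-embed only changed chunks
        index[article["id"]] = [embed(article["body"])]
    elif event == "delete":
        index.pop(article["id"], None)
    return index
```

The same handler shape covers the bulk-embed admin command: iterate every article and fire a "publish" event for each.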
You can monitor embedding status in Settings → AI Configuration, which shows the total number of embedded chunks and the last sync timestamp.
Privacy
For teams using Private AI (BYOK), the embedding model also runs on your infrastructure. Deploy text-embedding-3-small on Azure OpenAI or use Amazon Titan Embeddings on AWS Bedrock. Your article content never leaves your cloud, even for search indexing.