Smart Sentence Retriever

NLP Embeddings & Serverless Retrieval

Demo

Search a fixed Alice in Wonderland corpus by meaning, not exact wording.

Type a query (a phrase or full sentence).
Adjust “Top results” to control how many matches are returned.
Click “Find Sentences” and review the ranked semantic matches.
If the demo is warming up after idle time, wait for the status to turn ready and try again.

STAR Summary

Cleaned Project Gutenberg text, split it into sentences, and precomputed embeddings for the fixed corpus.
Compared several embedding models on a sample of the corpus, weighing clustering quality against model size and serverless deployment cost.
Deployed the selected model behind an AWS Lambda Function URL and built a browser demo that checks `/health` before ranking top-k matches by cosine similarity.
Shipped a working semantic-search demo for Alice in Wonderland that returns ranked sentence matches instead of keyword hits.
Kept the site embed usable with warm-up status, inline similarity bars, and an open-in-new-tab fallback for a larger view.
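
The ranking step described above (top-k matches by cosine similarity over precomputed embeddings) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `cosine_similarity` and `top_k`, the placeholder sentences, and the 3-dimensional toy vectors are all assumptions made for the example. A real deployment would use vectors produced by the selected embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, sentences, embeddings, k=3):
    """Return the k (score, sentence) pairs most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, emb), sent)
        for sent, emb in zip(sentences, embeddings)
    ]
    # Highest similarity first, then keep only the first k entries.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical placeholder sentences and toy vectors (not real embeddings).
sentences = ["placeholder sentence A", "placeholder sentence B", "placeholder sentence C"]
embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]

query_vec = [1.0, 0.0, 0.0]  # stands in for the embedded user query
results = top_k(query_vec, sentences, embeddings, k=2)
for score, sentence in results:
    print(f"{score:.3f}  {sentence}")
```

Because the corpus is fixed, the sentence vectors only need to be computed once; at query time only the query itself has to be embedded before scoring.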

Project Links

Data Links

Corpus (Project Gutenberg)

Notes

The live demo searches a fixed Alice in Wonderland corpus. It checks endpoint health before enabling queries because the Lambda function can cold-start after idle time.
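
One way to gate queries on endpoint health is a bounded polling loop, sketched below. The actual demo performs this check in the browser; this Python version only illustrates the idea. The helper names `probe_health` and `wait_until_ready`, the retry count, and the delay are hypothetical, and the sketch assumes that an HTTP 200 from `/health` means the endpoint is ready, which the source does not specify.

```python
import time
import urllib.request


def probe_health(base_url, timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200 (assumed ready signal)."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers URLError, HTTPError, and timeouts: treat all as "not ready yet".
        return False


def wait_until_ready(probe, attempts=5, delay=2.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing `delay` seconds between failures."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Simulated cold start: the first two probes fail, the third succeeds.
responses = iter([False, False, True])
ready = wait_until_ready(lambda: next(responses), delay=0.0, sleep=lambda _: None)
print("ready" if ready else "still warming up")
```

Against a real endpoint, the probe would be passed as `lambda: probe_health(base_url)`, with `base_url` set to the deployed Function URL.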