Search a fixed corpus (Alice in Wonderland) by meaning, not exact wording.
- Type a query (a phrase or full sentence).
- Adjust “Top results” to control how many matches are returned.
- Click “Find Sentences” and review the ranked results.
- If the status shows “Warming up”, wait a moment and try again.
STAR Summary
- Situation
- I wanted a fast way to find sentences that match a question, even when the wording is different.
- Task
- Owned the end-to-end build, from implementation through the final deliverable.
- Action
- Cleaned the text of Alice in Wonderland, split it into sentences, and precomputed embeddings.
- Experimented with several embedding models on a small sample (~800 sentences) and compared clustering quality using silhouette score as a quick heuristic.
- Deployed the selected model behind an AWS Lambda API (with CORS enabled) that returns the top-k semantic matches for a query.
- Result
- Selected an embedding model that clustered the ~800-sentence sample cleanly (the result is dataset- and k-dependent).
- Shipped a serverless demo: embed the query, then rank cached sentence embeddings by cosine similarity.
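
The ranking step (embed the query, then rank cached sentence embeddings by cosine similarity) can be sketched roughly as follows. This is a minimal illustration, not the deployed Lambda code: the function names `cosine_similarity` and `top_k_matches` are hypothetical, and the three-dimensional vectors and placeholder sentences are hand-made stand-ins for real model embeddings and corpus text.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_matches(
    query_vec: Sequence[float],
    sentences: Sequence[str],
    sentence_vecs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k sentences whose cached embeddings are most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), sentence)
        for sentence, vec in zip(sentences, sentence_vecs)
    ]
    # Highest similarity first; only the score is used as the sort key.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(sentence, score) for score, sentence in scored[:k]]


# Toy data (assumption): placeholder sentences and made-up 3-d "embeddings".
# A real setup would load precomputed vectors produced by the embedding model.
sentences = [
    "placeholder sentence about a rabbit",
    "placeholder sentence about a tea party",
    "placeholder sentence about a queen",
]
sentence_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vec = [0.9, 0.1, 0.0]  # stands in for the embedded user query

for sentence, score in top_k_matches(query_vec, sentences, sentence_vecs, k=2):
    print(f"{score:.3f}  {sentence}")
```

Because the sentence embeddings are computed once and cached, only the query needs to be embedded per request; the `k` argument plays the role of the “Top results” control.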