Search built for models, not people
Traditional web search optimizes for a human who will glance at a page of results, recognize the one they want, and click. That interface assumes a person in the loop doing the final filtering.
An LLM has a different shape of need. It can’t skim — it ingests. It wants results that are precise, complete, and structured, because whatever comes back goes straight into a context window and gets reasoned over. Noise isn’t a minor annoyance; it’s tokens spent and attention diluted.
A tiny sketch of the shape of the problem:
def answer(question: str, search) -> str:
    # The quality of the answer is bounded by the quality of what we retrieve.
    # `search` and `llm` are placeholders for whatever retriever and model client you use.
    docs = search(question, k=5)  # retrieval is the bottleneck
    context = "\n\n".join(d.text for d in docs)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")
The model's answer is only as good as what search returns. If retrieval returns the wrong five documents, no amount of clever prompting saves the answer. That’s why I find this layer so interesting: it’s upstream of everything else.
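Since whatever retrieval returns goes straight into the context window, one way to make the cost of noise concrete is to pack results under an explicit token budget. The sketch below is only an illustration under stated assumptions: the `Result` shape, the `min_score` cutoff, and the word-count stand-in for a tokenizer are all hypothetical, not taken from any particular search API.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One retrieved document (hypothetical shape, not a real search API)."""
    title: str
    text: str
    score: float  # relevance score from the retriever, higher is better


def rough_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def pack_context(results: list[Result], budget: int, min_score: float = 0.5) -> str:
    """Keep the highest-scoring results that fit within a token budget."""
    kept = []
    used = 0
    for r in sorted(results, key=lambda r: r.score, reverse=True):
        if r.score < min_score:
            break  # everything after this point scores even lower
        cost = rough_tokens(r.text)
        if used + cost > budget:
            continue  # skip what does not fit; a smaller result may still fit
        kept.append(f"[{r.title}]\n{r.text}")
        used += cost
    return "\n\n".join(kept)


results = [
    Result("A", "alpha beta gamma", 0.9),
    Result("B", "one two three four five six", 0.8),
    Result("C", "noise noise", 0.2),
    Result("D", "delta epsilon", 0.7),
]
context = pack_context(results, budget=6)
```

With a budget of six "tokens", the sketch keeps A and D, skips B because it would overflow the budget, and drops C as low-scoring noise. A real system would swap in the model's actual tokenizer and a cutoff tuned on data.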
A few open questions I keep coming back to
- What does “relevance” even mean when the consumer is a model, not a person?
- How do you evaluate retrieval when the downstream task is open-ended generation?
- Where should the boundary sit between retrieval and reasoning?
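None of these questions has a settled answer, but for the evaluation one there is a conventional baseline worth having in hand: recall@k and precision@k against a set of documents labeled relevant. The sketch below assumes such labels exist (a big assumption for open-ended generation) and uses made-up document ids; it measures retrieval in isolation and says nothing about downstream answer quality.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top k results that are relevant."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant) / len(top)


# Hypothetical ids: the retriever's ranked output and the labeled relevant set.
retrieved = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d2", "d3"}

r = recall_at_k(retrieved, relevant, k=5)
p = precision_at_k(retrieved, relevant, k=5)
```

Here two of the three relevant documents are retrieved, and two of the five retrieved documents are relevant. For a model consumer, the precision side arguably matters more than it does for a person, since every irrelevant result still gets read.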
More to come.