A Furniture Discovery Example: Why LLMs Fail at Retrieval
Imagine you’re building an agent for a high-end furniture marketplace. A user asks: "Show me some modern lounge chairs that are actually comfortable and won't take forever to ship."
If you build this using only an LLM + a standard Vector DB (RAG), here are the failure modes:
- The Semantic Trap: The Vector DB finds 50 "lounge chairs" where the text embedding matches "modern." But it doesn't know what "comfortable" means. To a database, comfort is just a word; to a business, comfort is a behavioral signal found in low return rates and high repeat-purchase rates.
- The Data Stale-mate: The LLM recommends a beautiful chair that went out of stock ten minutes ago because the vector index is only updated via weekly batch jobs.
- The Context Window Tax: You can't shovel 500 candidate chairs into a prompt to "let the LLM decide." It's expensive and high-latency, and LLMs suffer from positional bias, often picking the first items they see regardless of quality.
The outcome? An agent that is a "stochastic parrot"—it looks smart but makes low-quality decisions that lose revenue and erode user trust.
Step 1: Connecting Your Data (The Foundation)
High-fidelity retrieval requires a unified view of your product catalog, user profiles, and event streams. Shaped provides three ways to do this:
- The Console: Select from 20+ native connectors (BigQuery, Snowflake, Segment, Amplitude) and authenticate in clicks.
- The CLI: Use `shaped create-table` to provision schemas.
- The SDKs: Use the Python or TypeScript SDKs to stream data directly from your application.
For our furniture agent, we’ll use the Python SDK to connect our catalog.
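Here's a minimal sketch of that connection. The client class, method names, and schema fields below are illustrative assumptions rather than the exact SDK surface; consult the Shaped docs for the precise signatures.

```python
# Minimal sketch: provision a catalog table and stream rows into it.
# The Shaped client class and method names here are assumptions.
import os

from shaped import Shaped  # hypothetical import path

client = Shaped(api_key=os.environ["SHAPED_API_KEY"])

# Provision the schema (the SDK analogue of `shaped create-table` in the CLI).
client.create_table(
    name="furniture_catalog",
    schema={
        "item_id": "string",
        "title": "string",
        "description": "string",
        "price": "float",
        "inventory": "int",
        "ships_in_days": "int",
    },
)

# Stream catalog rows directly from the application.
client.insert(
    table_name="furniture_catalog",
    rows=[{
        "item_id": "chair_1042",
        "title": "Walnut Lounge Chair",
        "description": "Mid-century modern lounge chair in solid walnut.",
        "price": 1249.00,
        "inventory": 7,
        "ships_in_days": 3,
    }],
)
```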
Step 2: AI Enrichment (Transforming Content to Intelligence)
Raw data like "Walnut Chair" isn't enough for an agent to determine if something is "ergonomic." Shaped allows you to create AI Views—materialized, LLM-enriched representations of your data. This creates "agent-ready" features before a user even asks a question.
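As a sketch, an AI View for our catalog might look like the following. The `create_view` call, the enrichment spec, and the field names are assumptions for illustration; the real AI View API may differ.

```python
# Illustrative AI View: a materialized, LLM-enriched representation of the
# raw catalog joined with review data. Method and field names are assumptions.
import os

from shaped import Shaped  # hypothetical import path, as in Step 1

client = Shaped(api_key=os.environ["SHAPED_API_KEY"])

client.create_view(
    name="furniture_catalog_enriched",
    source_tables=["furniture_catalog", "reviews"],
    enrichments=[
        {
            # Derive comfort from behavioral and sentiment signals,
            # not from marketing copy.
            "output_column": "comfort_score",
            "type": "llm",
            "prompt": (
                "Given this item's reviews, return rate, and repeat-purchase "
                "rate, rate its real-world comfort from 0.0 to 1.0."
            ),
        },
        {
            "output_column": "style_tags",
            "type": "llm",
            "prompt": "Extract style tags (e.g. 'modern', 'mid-century') "
                      "from the product description.",
        },
    ],
)
```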
Outcome: Your agent now has access to a "comfort_score" feature that was derived from actual customer sentiment, not just marketing copy.
Step 3: Defining the Engine (The Intelligence Layer)
An Engine is where Shaped trains models on your data. Unlike a simple vector DB, a Shaped Engine combines semantic embeddings with behavioral models like ELSA or Two-Tower.
You can define this in a YAML file via the CLI or configure it through the Shaped Console.
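For illustration, here's roughly what that YAML might contain, written to disk from Python. The keys and the CLI command in the comment are assumptions, not the documented schema.

```python
# Sketch of an engine definition combining semantic embeddings with a
# behavioral model. YAML keys and the CLI command are illustrative assumptions.
from pathlib import Path

engine_yaml = """
engine:
  name: furniture_discovery
  data:
    items: furniture_catalog_enriched   # the AI View from Step 2
    events: user_events                 # clicks, purchases, returns
  models:
    - type: embedding   # semantic similarity over titles and descriptions
    - type: elsa        # behavioral model trained on purchase/return events
"""

Path("engine.yaml").write_text(engine_yaml)
# Then apply it via the CLI (command name hypothetical):
#   shaped create-engine --file engine.yaml
```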
Step 4: High-Signal Retrieval with ShapedQL
Now, when your agent needs to find the "best" chairs, it doesn't just do a vector lookup. It executes a ShapedQL query that handles the four stages of discovery: Retrieve, Filter, Score, and Reorder.
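Here's an illustrative sketch of such a query issued through the Python client. The ShapedQL keywords, the `query` method, and the scoring fields are assumptions meant to show the shape of the four stages, not verbatim syntax.

```python
# Illustrative ShapedQL query covering Retrieve, Filter, Score, and Reorder.
# Syntax and method names are assumptions; treat this as pseudocode.
import os

from shaped import Shaped  # hypothetical import path, as in Step 1

client = Shaped(api_key=os.environ["SHAPED_API_KEY"])

results = client.query(
    engine_name="furniture_discovery",
    user_id="user_789",
    shaped_ql="""
    -- Retrieve: candidates from both semantic and behavioral indexes
    RETRIEVE items FROM furniture_discovery
      USING embedding('modern lounge chair'), elsa(user_id)
    -- Filter: hard business constraints the LLM can never violate
    FILTER inventory > 0 AND ships_in_days <= 7
    -- Score: blend behavioral relevance with the enriched comfort feature
    -- (boosting high-margin items later is a one-line change here)
    SCORE 0.6 * elsa_score + 0.4 * comfort_score
    -- Reorder: diversify so the top 5 aren't near-identical chairs
    REORDER BY diversity(style_tags)
    LIMIT 5
    """,
)
```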
Step 5: Defining the Agent Logic
Finally, your LLM agent (using OpenAI, Anthropic, etc.) calls Shaped as a tool. Instead of giving the LLM a "junk drawer" of 50 semi-relevant chairs, you give it 5 deterministic, business-validated, and behaviorally-ranked items.
The Python Integration:
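What follows is a sketch of the full loop using the standard OpenAI tool-calling pattern; the Shaped client calls and ShapedQL string are the same illustrative assumptions as in the previous steps.

```python
# Sketch: the LLM calls Shaped as a tool instead of ranking candidates
# in its own context window. Shaped client names are assumptions.
import json
import os

from openai import OpenAI
from shaped import Shaped  # hypothetical import path, as in Step 1

llm = OpenAI()
shaped_client = Shaped(api_key=os.environ["SHAPED_API_KEY"])

def search_furniture(query: str, user_id: str) -> list[dict]:
    """Return 5 in-stock, behaviorally ranked items from Shaped."""
    return shaped_client.query(  # hypothetical method, as in Step 4
        engine_name="furniture_discovery",
        user_id=user_id,
        shaped_ql=f"""
        RETRIEVE items FROM furniture_discovery
          USING embedding('{query}'), elsa(user_id)
        FILTER inventory > 0 AND ships_in_days <= 7
        SCORE 0.6 * elsa_score + 0.4 * comfort_score
        LIMIT 5
        """,
    )

tools = [{
    "type": "function",
    "function": {
        "name": "search_furniture",
        "description": "Find in-stock, behaviorally ranked furniture for this user.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": (
    "Show me some modern lounge chairs that are actually comfortable "
    "and won't take forever to ship."
)}]
response = llm.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)

# If the model chose the tool, execute it and hand back the 5 ranked items.
call = response.choices[0].message.tool_calls[0]
items = search_furniture(json.loads(call.function.arguments)["query"], user_id="user_789")
messages += [
    response.choices[0].message,
    {"role": "tool", "tool_call_id": call.id, "content": json.dumps(items)},
]
final = llm.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
print(final.choices[0].message.content)
```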
The Outcomes: Why This Wins
By moving retrieval and ranking from the prompt into Shaped, you achieve:
- Zero Hallucinations on Constraints: If ShapedQL filters for inventory > 0, the agent cannot recommend an out-of-stock item.
- Behavioral Accuracy: Your agent suggests items people actually buy and keep (via the ELSA model), not just items that match the word "modern."
- Low Latency: Shaped’s fast_tier serves these queries in <50ms, ensuring your agent doesn't feel sluggish.
- Developer Agility: Change your business logic (e.g., "boost high-margin items") by editing a single ShapedQL line in the Console, without re-deploying your agent code.
Stop treating your agent’s context window like a search bar. Give it the high-signal, behavioral intelligence it needs to actually work.
Want to try Shaped with your own data? Sign up for a free trial with $300 credits here.