Quick Answer: How to Prevent Repetitive Agent Recommendations
Building AI agents that never repeat the same recommendations requires implementing stateful filtering at the retrieval layer. Here’s what you need to know:
- Use prebuilt('exclude_seen') filters to automatically filter out items users have already interacted with
- Track interactions in real time by storing user-item engagement in an interaction table
- Apply filters in the WHERE clause after retrieval but before scoring to minimize computation
- Configure personal filters once in your engine config, then reference them in any query
- Achieve sub-50ms filtering on millions of items without manual state management
- Avoid the trap of client-side deduplication which creates race conditions and scales poorly
Nothing kills an agent’s UX faster than watching it recommend the same three articles, products, or restaurants you’ve already seen. Users lose trust immediately when an AI assistant ignores their history and serves up content they’ve already engaged with. Yet most retrieval systems handle this poorly—either ignoring the problem entirely or implementing fragile client-side solutions that break under load.
The $800K Problem: When Agents Forget What Users Have Seen
A leading travel booking platform discovered their AI concierge was costing them $800,000 per month in lost revenue. The culprit? Their recommendation engine kept suggesting hotels users had already viewed, declined, or booked. Users would ask “What are some good hotels in Paris?” three times in a conversation, and each time the agent would confidently recommend the same five properties.
The engineering team tried multiple fixes. First, they implemented client-side deduplication—tracking seen items in the session state and filtering them out before displaying results. This worked in development but created race conditions in production when users opened multiple tabs or switched devices. Then they attempted a Redis cache with 24-hour TTLs to track viewed items, which added 200ms of latency to every query and still missed interactions that happened in the last few seconds.
The real issue ran deeper than implementation details. Their architecture fundamentally separated retrieval from state management. The vector search engine had no concept of user history. The recommendation model knew nothing about what had already been shown. The application layer tried to bridge this gap, but by then it was too late—the damage was done in wasted compute, poor rankings, and frustrated users.
This pattern repeats across industries. Content platforms re-suggest articles users read yesterday. E-commerce sites show purchased items in “recommended for you” feeds. Restaurant apps propose venues where users already have reservations. Each repetition erodes trust and wastes valuable recommendation slots that could introduce genuinely new options.
The Traditional Approach: Why Client-Side Filtering Fails
Most engineering teams start with the obvious solution: filter out seen items in the application layer after retrieval. The logic seems sound—pull a large set of candidates from the retrieval system, then remove items the user has already interacted with before displaying results.
Here’s what that typically looks like:
```python
# traditional_filtering.py
# Traditional client-side filtering (don't do this)
# Assumes `user_id` and `user_query_embedding` are defined upstream
from pinecone import Pinecone
import redis

# Fetch candidates from vector database
pc = Pinecone(api_key="...")
index = pc.Index("products")
results = index.query(
    vector=user_query_embedding,
    top_k=100,  # Over-fetch to account for filtering
    include_metadata=True
)

# Fetch seen items from Redis. Note: without decode_responses=True,
# smembers returns bytes and the string membership check below never matches
redis_client = redis.Redis(...)
seen_items = redis_client.smembers(f"user:{user_id}:seen")

# Filter out seen items
filtered_results = [
    item for item in results['matches']
    if item['id'] not in seen_items
][:20]  # Take top 20 after filtering

# Track new items as seen
for item in filtered_results:
    redis_client.sadd(f"user:{user_id}:seen", item['id'])
```
This approach has five critical failures that become apparent under production load:
1. Wasted Compute and Cost. You’re embedding the query, searching the vector index, computing similarity scores, and retrieving metadata for potentially hundreds of items you’ll immediately discard. If 60% of candidates are already seen (common for engaged users), you’re burning 60% of your compute budget on items you’ll never show. At scale, that means paying for roughly 150,000 wasted candidate retrievals per day on top of the 100,000 useful ones.
2. Inconsistent Ranking Quality. Retrieving top 100 candidates, then filtering 60 seen items, means you’re actually ranking from position 41-100 in the true similarity ranking. You’ve lost your top 40 most relevant items. The user gets recommendations from the second tier of quality, not the first. Your carefully tuned ranking model optimizes for the wrong set.
3. Race Conditions and State Sync. User opens three tabs, each querying recommendations. Tab 1 gets items A, B, C. Tab 2 queries before Tab 1 updates the seen set, also gets A, B, C. Tab 3 now sees A, B, C again. Multi-device usage compounds this. The same user on mobile and desktop sees duplicates because state synchronization has ~2-5 second lag.
4. Database Pressure from State Lookups. Every query hits your state store (Redis, PostgreSQL, etc.) twice: once to fetch the seen set before filtering, once to update it after showing results. For a system serving 10,000 queries per second, that’s 20,000 database operations per second just for deduplication. Your state store becomes the bottleneck, not your ML model.
5. Complex Application Logic. Your application code now handles state management, deduplication, re-ranking after filtering, and seen-set updates. This complexity leaks across services. Your recommendation API needs to know about state. Your web frontend needs to track interactions. Your mobile app implements a different version. Testing becomes a nightmare because behavior depends on temporal state.
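Failure 3 is easy to reproduce in miniature. The sketch below uses a plain dict in place of Redis and hypothetical item IDs: two tabs that both read the seen set before either writes back serve identical results.

```python
# Toy reproduction of the read-then-write race in client-side filtering.
# A plain dict stands in for Redis; item IDs are hypothetical.
seen_store = {"user_1": set()}
catalog = ["A", "B", "C", "D", "E"]

def recommend(user_id, k=3):
    seen = set(seen_store[user_id])  # read (a network hop in real life)
    return [c for c in catalog if c not in seen][:k]

# Two tabs query before either writes back its results
tab_1 = recommend("user_1")
tab_2 = recommend("user_1")
print(tab_1 == tab_2)  # True -- both tabs show A, B, C

# Only after the delayed write-back does filtering catch up
seen_store["user_1"].update(tab_1)
print(recommend("user_1"))  # ['D', 'E']
```

The race is inherent to the read-filter-write sequence: any query that lands between the read and the write sees stale state.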
Here’s what the architecture looks like with traditional client-side filtering:
App → vector index (retrieve 100 items) → state store (Redis/PostgreSQL) to fetch the seen set → filter in app code → re-rank from position 41+ → state store again (Redis/PostgreSQL) to record new impressions.
Every request requires four network hops, two database queries, and filtering logic spread across three services. Latency ranges from 150-300ms depending on cache hit rates. Error handling is complex because failures can occur at each step. And the whole system gets more fragile as the seen set grows—users with thousands of interactions take longer to filter than new users.
The fundamental mistake is treating retrieval and state as separate concerns. Your retrieval system should understand user history natively, filtering candidates before expensive scoring operations. This isn’t just about performance—it’s about architectural coherence. When retrieval knows what’s been seen, it can make better decisions about what to retrieve in the first place.
The Shaped Edge: Stateful Retrieval with Prebuilt Filters
Shaped takes a different approach: state-aware retrieval where the engine knows about user history and applies filters at the optimal point in the pipeline. Instead of fetching candidates then filtering in application code, you declare filters once in your engine configuration and reference them in queries.
Here’s the complete architecture with Shaped’s prebuilt filters:
Step 1: Define a Personal Filter in Engine Config
First, you configure the filter in your engine YAML. This tells Shaped what data constitutes “seen” items and how to match them to users.
```yaml
# engine_config.yaml
name: recommendations_engine
data:
  item_table: products
  interaction_table: user_interactions
  filters:
    - name: exclude_seen
      filter_type:
        type: personal_filter
        user_id_column: user_id
        item_id_column: item_id
      filter_table:
        type: query
        query: >
          SELECT item_id, user_id
          FROM user_interactions
          WHERE interaction_type IN ('view', 'click', 'purchase')
index:
  embeddings:
    - name: product_content_embedding
      encoder:
        type: hugging_face
        model_name: "sentence-transformers/all-MiniLM-L6-v2"
        batch_size: 256
        item_fields: [name, description, category]
training:
  models:
    - name: als
      policy_type: als
```
This configuration creates a personal filter dataset that maps users to items they’ve interacted with. The personal_filter type tells Shaped to maintain a per-user index of seen items. The filter automatically updates as new interactions arrive in the user_interactions table.
Step 2: Apply the Filter in Queries
Now you can reference this filter in any query using prebuilt():
```python
# rank_query.py
# Python SDK example
from shaped import RankQueryBuilder, Similarity, ColumnOrder

query = (
    RankQueryBuilder()
    .from_entity('item')
    .retrieve(
        Similarity(
            embedding_ref='product_content_embedding',
            encoder={'type': 'interaction_round_robin', 'input_user_id': '$user_id'},
            limit=50
        ),
        ColumnOrder(
            columns=['_derived_popular_rank ASC'],
            limit=50
        )
    )
    .filter(
        predicate="prebuilt('exclude_seen', input_user_id='$user_id')"
    )
    .score(
        value_model='click_through_rate',
        input_user_id='$user_id',
        input_interactions_item_ids='$interaction_item_ids'
    )
    .limit(20)
    .build()
)

# Execute query
response = client.rank(
    query=query,
    parameters={'user_id': 'user_12345'}
)
```

Or using ShapedQL:

```sql
SELECT *
FROM similarity(embedding_ref='product_content_embedding',
                encoder='interaction_round_robin',
                input_user_id='$user_id',
                limit=50),
     column_order(columns='_derived_popular_rank ASC', limit=50)
WHERE prebuilt('exclude_seen', input_user_id='$user_id')
ORDER BY score(expression='click_through_rate',
               input_user_id='$user_id',
               input_interactions_item_ids='$interaction_item_ids')
LIMIT 20
```
The filter runs after retrieval but before scoring, removing seen items from the candidate set before expensive model inference. This is the optimal placement in the pipeline—you only score items the user hasn’t seen.
Step 3: Track Interactions in Real-Time
Shaped automatically ingests new interactions from your interaction table. When a user clicks, views, or purchases an item, that interaction flows into the filter dataset without manual synchronization:
```python
# tracking.py
# Your application code just tracks interactions normally;
# Shaped handles the rest automatically
from datetime import datetime

interaction_event = {
    'user_id': 'user_12345',
    'item_id': 'product_789',
    'interaction_type': 'view',
    'timestamp': datetime.now()
}

# Add to your interaction table via connector or API --
# the exclude_seen filter updates automatically
```
How This Changes the Architecture
Compare the data flow: with Shaped, the filter runs in the same system that handles retrieval and scoring. No state synchronization across services. No race conditions. No manual cache invalidation. The engine maintains the seen set as a materialized view that updates in real time as interactions stream in.
Why This Works: The Four-Stage Pipeline with Optimal Filter Placement
Understanding where filtering happens in the retrieval pipeline explains why prebuilt filters outperform client-side approaches. Modern ranking systems operate in four stages:
Stage 1: Retrieval pulls candidate items from indexes. This is cheap (vector similarity, lexical search) but imprecise—you retrieve 50-1000 candidates knowing most won’t be perfect.
Stage 2: Filtering removes unwanted items based on business rules. This is where prebuilt('exclude_seen') runs. Filtering after retrieval but before scoring saves compute on items you’ll discard anyway.
Stage 3: Scoring applies expensive ML models to predict engagement, conversion, or other objectives. This is your most computationally costly step—you want to score only items that passed filtering.
Stage 4: Reordering applies diversity and exploration to avoid echo chambers and surface variety.
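The savings from running the filter before Stage 3 can be sketched with back-of-envelope arithmetic; the numbers below (100 candidates, 60% seen, 50 ms per inference) are illustrative assumptions, not benchmarks.

```python
# Back-of-envelope cost of filter placement (illustrative numbers).
CANDIDATES = 100      # retrieved in Stage 1
SEEN_FRACTION = 0.6   # share the exclusion filter would remove
INFERENCE_MS = 50     # Stage 3 model cost per item

# Score-then-filter: every candidate pays for model inference
score_then_filter_ms = CANDIDATES * INFERENCE_MS

# Filter-then-score: only unseen candidates reach the model
unseen = round(CANDIDATES * (1 - SEEN_FRACTION))
filter_then_score_ms = unseen * INFERENCE_MS

print(score_then_filter_ms - filter_then_score_ms)  # 3000 ms saved per query
```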
Here’s where traditional client-side filtering gets the order wrong:
Score-then-filter (traditional): retrieve 100 → score all 100 (expensive!) → drop 60 seen items → come up short (need 20 more?). Filter-then-score (Shaped): retrieve 100 → drop 60 seen items (save 60%!) → score only the 40 unseen (perfect!).
By filtering before scoring, you avoid running ML inference on items you’ll immediately discard. For a model with 50ms inference time, filtering 60 items saves 3,000ms of compute per query. At 10,000 queries per second, that’s 8.3 hours of compute saved every second.
The placement matters for ranking quality too. Because the filter runs inside the engine, you select from the full similarity-ranked candidate set of genuinely relevant items—not from the second tier (positions 41-100) that over-fetch-then-filter leaves you with.
Implementation Guide: Building Exclusion Filters Step by Step
Let’s build a complete recommendation system with exclude_seen filtering, walking through each component.
Setting Up Your Data Schema
Start by defining your item table and interaction table. The interaction table tracks every user engagement:
```yaml
# tables/user_interactions.yaml
name: user_interactions
schema_type: CUSTOM
column_schema:
  user_id: String
  item_id: String
  interaction_type: String  # 'view', 'click', 'purchase', 'save'
  timestamp: DateTime
  session_id: String
```

```yaml
# tables/products.yaml
name: products
schema_type: CUSTOM
column_schema:
  item_id: String
  name: String
  description: String
  category: String
  price: Float
  inventory_count: Integer
  created_at: DateTime
```

Upload these schemas using the Shaped CLI:

```shell
shaped create-table --file tables/user_interactions.yaml
shaped create-table --file tables/products.yaml
```
Configuring the Engine with Filters
Now create your engine configuration with the exclude_seen filter:
```yaml
# engine_config.yaml
name: product_recommendations
data:
  item_table: products
  user_table: users
  interaction_table: user_interactions
  # Define the exclude_seen filter
  filters:
    - name: exclude_seen
      filter_type:
        type: personal_filter
        user_id_column: user_id
        item_id_column: item_id
      filter_table:
        type: query
        query: >
          SELECT DISTINCT item_id, user_id
          FROM user_interactions
          WHERE interaction_type IN ('view', 'click', 'purchase')
            AND timestamp > now() - INTERVAL '90 days'
index:
  embeddings:
    - name: product_content_embedding
      encoder:
        type: hugging_face
        model_name: "sentence-transformers/all-MiniLM-L6-v2"
        batch_size: 256
        item_fields: [name, description, category]
    - name: als_embedding
      encoder:
        type: trained_model
        model_ref: als
training:
  models:
    - name: als
      policy_type: als
    - name: click_through_rate
      policy_type: elsa
      label_column: label
      prediction_type: binary_classification
```
This configuration creates:
- A content embedding for semantic search over product text
- A collaborative filtering embedding (ALS) based on user-item interactions
- A click-through rate prediction model
- An exclude_seen filter that looks at the last 90 days of interactions
Building Queries with Exclusion
With the filter configured, you can now build queries that automatically exclude seen items:
```python
# get_personalized_feed.py
# Basic personalized feed with exclusion
from shaped import RankQueryBuilder, Similarity, ColumnOrder

def get_personalized_feed(user_id: str, limit: int = 20):
    query = (
        RankQueryBuilder()
        .from_entity('item')
        .retrieve(
            # Content-based retrieval
            Similarity(
                embedding_ref='product_content_embedding',
                encoder={'type': 'interaction_round_robin', 'input_user_id': '$user_id'},
                limit=50
            ),
            # Collaborative filtering
            Similarity(
                embedding_ref='als_embedding',
                encoder={'type': 'precomputed_user', 'input_user_id': '$user_id'},
                limit=50
            ),
            # Popular items fallback
            ColumnOrder(
                columns=['_derived_popular_rank ASC'],
                limit=50
            )
        )
        .filter(
            predicate="prebuilt('exclude_seen', input_user_id='$user_id')"
        )
        .score(
            value_model='click_through_rate',
            input_user_id='$user_id'
        )
        .reorder(diversity=0.3)
        .limit(limit)
        .build()
    )
    return client.rank(query=query, parameters={'user_id': user_id})
```
Handling Edge Cases
New users with no history: The filter gracefully handles users with zero interactions—it simply doesn’t filter anything, returning pure ranked results.
```python
# No special handling needed -- this works for both new and existing users
results = get_personalized_feed(user_id="brand_new_user")
```

Time windows for recency: You might want to exclude only recently seen items, allowing re-recommendations after 30 days:

```yaml
filters:
  - name: exclude_recent_views
    filter_type:
      type: personal_filter
      user_id_column: user_id
      item_id_column: item_id
    filter_table:
      type: query
      query: >
        SELECT item_id, user_id
        FROM user_interactions
        WHERE interaction_type = 'view'
          AND timestamp > now() - INTERVAL '30 days'
```
Combining multiple filters: You can apply multiple prebuilt filters in a single query:
```sql
-- exclusion_query.sql
SELECT *
FROM similarity(embedding_ref='product_embedding', limit=100)
WHERE prebuilt('exclude_seen', input_user_id='$user_id')
  AND prebuilt('exclude_out_of_stock', input_user_id='$user_id')
  AND price <= 200
LIMIT 20
```
Filter-specific interaction types: Different recommendation surfaces might exclude different interaction types. A “browse again” widget might exclude purchases but allow views, while a main feed excludes everything:
```yaml
# filters_config.yaml
filters:
  - name: exclude_purchased
    filter_table:
      type: query
      query: >
        SELECT item_id, user_id
        FROM user_interactions
        WHERE interaction_type = 'purchase'
  - name: exclude_all_interactions
    filter_table:
      type: query
      query: >
        SELECT item_id, user_id
        FROM user_interactions
        WHERE interaction_type IN ('view', 'click', 'save', 'purchase')
```
Advanced Patterns: Beyond Basic Exclusion
Once you have basic exclusion working, several advanced patterns become possible.
Pattern 1: Category-Aware Exclusion for Diversity
Instead of excluding all seen items, exclude items from categories the user has already explored recently. This maintains diversity while avoiding exact duplicates:
```yaml
# exploration_filter.yaml
filters:
  - name: exclude_explored_categories
    filter_type:
      type: personal_filter
      user_id_column: user_id
      item_id_column: item_id
    filter_table:
      type: query
      query: |
        WITH user_categories AS (
          SELECT u.user_id, i.category, COUNT(*) AS interaction_count
          FROM user_interactions u
          JOIN products i ON u.item_id = i.item_id
          WHERE u.timestamp > now() - INTERVAL '7 days'
          GROUP BY u.user_id, i.category
          HAVING COUNT(*) >= 5  -- Saturated on this category
        )
        SELECT p.item_id, uc.user_id
        FROM products p
        JOIN user_categories uc ON p.category = uc.category
```
Pattern 2: Time-Decayed Exclusion
Instead of hard filtering, apply a penalty to recently seen items using value models. Items seen yesterday get a large penalty, items seen 30 days ago get a small penalty:
```sql
-- dynamic_scoring.sql
SELECT *
FROM similarity(embedding_ref='content_embedding', limit=100)
ORDER BY score(
  expression='
    click_through_rate -
    CASE
      WHEN item.last_seen_days < 1 THEN 10.0
      WHEN item.last_seen_days < 7 THEN 5.0
      WHEN item.last_seen_days < 30 THEN 1.0
      ELSE 0.0
    END
  ',
  input_user_id='$user_id'
)
LIMIT 20
```
This requires adding a view that computes last_seen_days:
```yaml
# items_with_recency.yaml
views:
  - name: items_with_recency
    view_type: SQL
    query: >
      SELECT
        i.*,
        COALESCE(
          EXTRACT(days FROM (now() - MAX(u.timestamp))),
          999
        ) AS last_seen_days
      FROM products i
      LEFT JOIN user_interactions u
        ON i.item_id = u.item_id
        AND u.user_id = $input_user_id
      GROUP BY i.item_id
```
Pattern 3: Collaborative Exclusion
Exclude items that similar users have already engaged with extensively. This is useful for discovery feeds where you want to surface items that your peer group hasn’t saturated:
```yaml
# peer_saturation_filter.yaml
filters:
  - name: exclude_peer_saturated
    filter_type:
      type: personal_filter
      user_id_column: user_id
      item_id_column: item_id
    filter_table:
      type: query
      query: |
        WITH similar_users AS (
          -- Find users with similar taste (simplified)
          SELECT u2.user_id AS similar_user_id
          FROM user_interactions u1
          JOIN user_interactions u2
            ON u1.item_id = u2.item_id
            AND u1.user_id != u2.user_id
          WHERE u1.user_id = $input_user_id
          GROUP BY u2.user_id
          HAVING COUNT(DISTINCT u1.item_id) >= 5
          ORDER BY COUNT(DISTINCT u1.item_id) DESC
          LIMIT 50
        )
        SELECT u.item_id, $input_user_id AS user_id
        FROM user_interactions u
        JOIN similar_users s ON u.user_id = s.similar_user_id
        GROUP BY u.item_id
        HAVING COUNT(DISTINCT u.user_id) >= 30  -- 30+ similar users engaged
```
Pattern 4: Session-Aware Filtering
For conversational agents, exclude items mentioned earlier in the current conversation without excluding items from previous sessions:
```python
# chat_recommendations.py
# Track session-specific exclusions
from shaped import RankQueryBuilder, Similarity

def get_chat_recommendations(user_id: str, session_id: str, mentioned_items: list):
    builder = (
        RankQueryBuilder()
        .from_entity('item')
        .retrieve(
            Similarity(
                embedding_ref='content_embedding',
                encoder={'type': 'interaction_round_robin', 'input_user_id': '$user_id'},
                limit=50
            )
        )
    )
    # Exclude items mentioned in THIS conversation. Build the NOT IN list
    # explicitly: f-stringing a Python tuple breaks on empty lists and leaves
    # a trailing comma for single-item lists.
    if mentioned_items:
        exclusion = ", ".join(f"'{item}'" for item in mentioned_items)
        builder = builder.filter(predicate=f"item.item_id NOT IN ({exclusion})")
    query = (
        builder
        .score(value_model='click_through_rate', input_user_id='$user_id')
        .limit(10)
        .build()
    )
    return client.rank(query=query, parameters={'user_id': user_id})
```
Pattern 5: Progressive Exploration
Combine exclusion with exploration to progressively expand the user’s horizon. Start with close-to-history recommendations, then gradually introduce more diverse items:
```python
# progressive_feed.py
def get_progressive_feed(user_id: str, diversity_level: float = 0.0):
    """
    diversity_level: 0.0 = only exclude exact matches
                     0.5 = exclude similar items
                     1.0 = exclude entire explored clusters
    """
    query = (
        RankQueryBuilder()
        .from_entity('item')
        .retrieve(
            Similarity(
                embedding_ref='content_embedding',
                encoder={'type': 'interaction_round_robin', 'input_user_id': '$user_id'},
                limit=100
            )
        )
        .filter(
            predicate="prebuilt('exclude_seen', input_user_id='$user_id')"
        )
        .score(
            value_model='click_through_rate',
            input_user_id='$user_id'
        )
        .reorder(
            diversity=diversity_level,
            exploration=diversity_level * 0.5
        )
        .limit(20)
        .build()
    )
    return client.rank(query=query, parameters={'user_id': user_id})

# Use in conversation
turn_1 = get_progressive_feed(user_id='user123', diversity_level=0.0)  # Similar to history
turn_2 = get_progressive_feed(user_id='user123', diversity_level=0.3)  # Moderate diversity
turn_3 = get_progressive_feed(user_id='user123', diversity_level=0.6)  # High diversity
```
Hot Take: Client-Side Deduplication Is an Anti-Pattern
Here’s a controversial opinion: if you’re filtering seen items in your application code, you’re doing it wrong. Full stop.
The industry has normalized this pattern. Nearly every recommendation tutorial shows filtering in the app layer. Major platforms run variations of this approach at scale. But that doesn’t make it right—it makes it cargo-culted technical debt.
Filtering belongs in the retrieval system, not the application layer. Here’s why this isn’t just an optimization, it’s an architectural imperative:
Retrieval systems should be state-aware by design. Just like databases maintain indexes and constraints, retrieval engines should maintain user state and apply it during candidate selection. Offloading this to the application is like offloading SQL WHERE clauses to application code—technically possible, but architecturally wrong.
State and ranking are coupled, not separate. You can’t rank effectively without knowing what the user has seen. A model that predicts “click-through rate” needs to know if the item is novel or repeated. Separating state from ranking means your model optimizes for the wrong objective.
Scaling requires pushing logic down the stack. As query volume grows, you need systems that can apply filters efficiently at the data layer. Pulling millions of candidates into application memory to filter them scales linearly with traffic. Filtering in the engine scales logarithmically because it runs on indexed data.
The resistance to this idea comes from how we’ve historically built these systems. Vector databases didn’t support user state, so we added it in Redis. Recommendation APIs didn’t track interactions, so we added event streams. We’ve built complex distributed systems to compensate for retrieval engines that lack basic stateful capabilities.
But modern retrieval platforms like Shaped prove this complexity is unnecessary. When retrieval natively understands user history, filtering becomes declarative configuration instead of imperative code. You express “exclude items the user has seen” once in a config file, not repeatedly in every service that calls the API.
When to Use Exclude Seen (and When Not To)
Prebuilt exclusion filters aren’t appropriate for every scenario. Here’s a framework for deciding when they make sense:
Use Exclude Seen When:
You have high-engagement users. If users typically interact with 50+ items, exclusion prevents frustrating repetition. Content platforms, e-commerce sites, and social feeds fall into this category.
Recommendations refresh frequently. If users return daily or weekly expecting fresh content, exclusion is essential. Without it, you’ll show the same popular items every time.
Your catalog is large relative to engagement. With 10,000 items and users who interact with 100, you have plenty of unseen options to recommend. Exclusion improves quality without exhausting the catalog.
You track interactions already. If you’re logging views, clicks, or purchases for analytics, you already have the data needed for exclusion filters. The marginal cost is minimal.
You want to optimize for discovery. Exclusion forces the system to surface items users haven’t found yet, expanding their awareness of your catalog.
Skip Exclude Seen When:
You have a small catalog. With 100 items and active users, exclusion might eliminate most of your inventory. A user who’s seen 80 items has only 20 options left—you might want to re-recommend rather than scrape the bottom of the catalog.
Re-engagement is the goal. Email digest recommendations might benefit from showing items users engaged with before but didn’t complete (e.g., “finish watching this series”). Exclusion would hide these opportunities.
Interactions don’t signal saturation. A view doesn’t mean the user is done with an item. News articles users “viewed” by scrolling past might deserve a second chance. In these cases, time-decay or soft penalties work better than hard exclusion.
You’re doing pure exploration. For “random” or “serendipity” features where users explicitly want surprise, exclusion works against the goal. Let them encounter familiar items mixed with new ones.
Cold start is your primary challenge. If most users are new with few interactions, exclusion doesn’t buy you much. Focus on solving cold start before worrying about repetition.
Decision Matrix
| Factor | Use Exclusion | Skip Exclusion |
|---|---|---|
| Avg interactions per user | >50 | <10 |
| Catalog size | >1000 items | <100 items |
| Update frequency | Daily/hourly | Weekly/monthly |
| Recommendation goal | Discovery | Re-engagement |
| Interaction meaning | Saturation | Incomplete |
| User base | Mostly returning | Mostly new |
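Read as a rough heuristic, the matrix can be encoded directly. The thresholds below come from the table; treating each factor as an equal vote is an illustrative assumption, not a product rule.

```python
# Toy scoring heuristic for the decision matrix (equal-weight votes).
def should_use_exclusion(avg_interactions: int, catalog_size: int,
                         returning_share: float) -> bool:
    votes = 0
    votes += 1 if avg_interactions > 50 else -1   # interactions-per-user row
    votes += 1 if catalog_size > 1000 else -1     # catalog-size row
    votes += 1 if returning_share > 0.5 else -1   # user-base row
    return votes > 0

print(should_use_exclusion(80, 10_000, 0.7))  # True  -- engaged users, large catalog
print(should_use_exclusion(5, 80, 0.2))       # False -- mostly-new users, tiny catalog
```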
Common Pitfalls When Implementing Exclusion Filters
Even with prebuilt filters, there are ways to shoot yourself in the foot. Here are the five mistakes teams make most often:
Pitfall 1: Filtering Too Aggressively
Excluding everything a user has ever interacted with can backfire. A user who’s been on your platform for three years might have “seen” 80% of your catalog. With aggressive filtering, you’re left recommending from the dregs—low-quality items that didn’t make the cut in previous sessions.
Solution: Use time windows or interaction types selectively. Exclude views from the last 30 days, but allow views from six months ago. Exclude purchases permanently (they own it), but allow clicks to resurface after time passes.
```yaml
# filter_config.yaml
# Good: time-bounded exclusion
filters:
  - name: exclude_recent_views
    filter_table:
      type: query
      query: >
        SELECT item_id, user_id
        FROM user_interactions
        WHERE interaction_type = 'view'
          AND timestamp > now() - INTERVAL '30 days'
```
Pitfall 2: Ignoring Filter Performance
Filters run on every query. If your filter query scans millions of rows or does complex joins, you’ll add 50-100ms to every request. Users notice this latency.
Solution: Ensure your filter query has proper indexes and uses efficient joins. The filter table should be materialized or use indexed lookups, not table scans.
```yaml
# optimization.yaml
# Bad: full table scan on every query
query: SELECT item_id, user_id FROM user_interactions

# Good: indexed lookup with time bounds
# (assumes an index on (timestamp, user_id, item_id))
query: >
  SELECT item_id, user_id
  FROM user_interactions
  WHERE timestamp > now() - INTERVAL '90 days'
```
Pitfall 3: Not Tracking Impressions
You might exclude items the user clicked but forget to exclude items they scrolled past without engaging. If you show the same 20 items across multiple sessions because the user never clicked, you’re still being repetitive.
Solution: Track impressions (items shown to the user) in addition to engagement. Update your interaction table with “shown” events.
```python
# log_impressions.py
# Log impressions, not just clicks
for item in recommendations:
    log_interaction(
        user_id=user_id,
        item_id=item['id'],
        interaction_type='shown',
        session_id=session_id
    )
```
Pitfall 4: Forgetting to Test Cold Start
Your exclusion logic might work great for active users but break the experience for new users with no history. An empty filter returns all candidates, but if you’re not prepared for this, you might have bugs.
Solution: Explicitly test with users who have zero interactions. Make sure the fallback behavior (no filtering) is intentional and produces good results.
```python
# test_cold_start.py
# Test case: new user with no interactions
def test_cold_start_user():
    new_user_id = create_test_user(interaction_count=0)
    results = get_personalized_feed(new_user_id)
    assert len(results) == 20  # Should still get full results
    assert_popular_items_included(results)  # Should fall back to popularity
```
Pitfall 5: Over-Filtering in Multi-Retrieval Queries
When you use multiple retrievers (content similarity + collaborative filtering + trending), and all three return overlapping items, aggressive filtering might eliminate most candidates before scoring.
Solution: Retrieve more candidates per retriever (e.g., 100 instead of 50) when using exclusion filters. This ensures you have enough candidates after filtering.
```python
# adjust_limits.py
# Adjust candidate set size when filtering
query = (
    RankQueryBuilder()
    .from_entity('item')
    .retrieve(
        Similarity(embedding_ref='content', limit=100),  # Increased from 50
        Similarity(embedding_ref='collaborative', limit=100),
        ColumnOrder(columns=['popular_rank ASC'], limit=100)
    )
    .filter(predicate="prebuilt('exclude_seen', input_user_id='$user_id')")
    .limit(20)
    .build()
)
```
FAQ: Common Questions About Exclude Seen Filters
Q: How do I exclude items from the current session without affecting other sessions?
A: Use a separate session-based filter or pass session-specific items as parameters. For session context, you can use the item.item_id NOT IN clause with parameters:
```python
# dynamic_query.py
# Build the NOT IN list explicitly so a single item doesn't produce a
# trailing comma (guard separately for an empty session list)
exclusion = ", ".join(f"'{item}'" for item in current_session_items)
query = f"""SELECT * FROM similarity(embedding_ref='embedding', limit=100)
WHERE prebuilt('exclude_seen', input_user_id='$user_id')
  AND item.item_id NOT IN ({exclusion})
LIMIT 20"""
```
Q: What happens if the filter removes all candidates?
A: If all retrieved candidates are filtered out, you’ll get fewer results than requested. To prevent empty results, retrieve significantly more candidates than you need (e.g., retrieve 200 to return 20 final results). You can also add fallback retrievers like popularity ranking that don’t depend on personalization.
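One defensive pattern for this: over-retrieve, and if filtering still leaves you short, top up from an unpersonalized fallback. A sketch with stand-in retrieval functions (not Shaped API calls):

```python
# Defensive pattern for filters that empty the candidate set.
# fetch_ranked / fetch_popular are stand-ins for your real retrieval calls.
def fetch_ranked(user_id, retrieve_limit, final_limit):
    # Stand-in: pretend heavy filtering left only 2 personalized results
    return [{"item_id": "p1"}, {"item_id": "p2"}][:final_limit]

def fetch_popular(limit):
    # Stand-in: unpersonalized popularity ranking
    return [{"item_id": f"pop{i}"} for i in range(limit)]

def recommendations_with_fallback(user_id, k=5):
    # Over-retrieve so heavy filtering still leaves enough candidates
    results = fetch_ranked(user_id, retrieve_limit=200, final_limit=k)
    if len(results) < k:
        # Top up from popularity, skipping anything already picked
        seen_ids = {r["item_id"] for r in results}
        for item in fetch_popular(limit=k * 2):
            if item["item_id"] not in seen_ids and len(results) < k:
                results.append(item)
    return results

print(len(recommendations_with_fallback("user_1")))  # 5
```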
Q: Can I exclude items based on complex rules beyond simple seen/not-seen?
A: Yes. The filter query can use any SQL logic. For example, exclude items from categories the user has bought from five times, or exclude items similar to ones they’ve returned:
```yaml
# complex_filter.yaml
filters:
  - name: exclude_complex
    filter_table:
      type: query
      query: >
        SELECT p.item_id, u.user_id
        FROM products p
        JOIN user_interactions u ON p.category = (
          SELECT category FROM products WHERE item_id = u.item_id
        )
        WHERE u.interaction_type = 'purchase'
        GROUP BY p.item_id, u.user_id, p.category
        HAVING COUNT(*) >= 5
```
Q: How often do filters update when new interactions come in?
A: Shaped maintains filters as materialized views that update in real-time as new data arrives in your interaction table. The latency depends on your connector—streaming connectors (Kafka, PostgreSQL CDC) update within seconds, batch connectors within 15 minutes. For most applications, this is fast enough. If you need instant updates, use a streaming connector.
Q: Can I have different exclusion rules for different recommendation surfaces?
A: Absolutely. Create multiple filters with different names and reference them in the appropriate queries:
# filters.yaml
filters:
  - name: exclude_all_interactions
    filter_table:
      query: SELECT item_id, user_id FROM user_interactions
  - name: exclude_only_purchases
    filter_table:
      query: >
        SELECT item_id, user_id
        FROM user_interactions
        WHERE interaction_type = 'purchase'
Then use prebuilt('exclude_all_interactions') in main feeds and prebuilt('exclude_only_purchases') in browse-again widgets.
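One lightweight way to keep surfaces and filter names in sync on the client side is a lookup table; a sketch (the SURFACE_FILTERS mapping and filter_predicate helper are hypothetical application code, not a Shaped API):

```python
# surface_filters.py
# Map each recommendation surface to its exclusion filter, mirroring
# the two filters defined in filters.yaml above.
SURFACE_FILTERS = {
    "home_feed": "exclude_all_interactions",
    "browse_again": "exclude_only_purchases",
}

def filter_predicate(surface: str, user_param: str = "$user_id") -> str:
    """Build the prebuilt(...) predicate string for a given surface."""
    name = SURFACE_FILTERS[surface]
    return f"prebuilt('{name}', input_user_id='{user_param}')"

print(filter_predicate("browse_again"))
# prebuilt('exclude_only_purchases', input_user_id='$user_id')
```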
Q: What’s the performance impact of filtering on large user histories?
A: Negligible. Shaped maintains filters as pre-computed indexes optimized for lookups. Even for users with 100,000 interactions, filter application adds <5ms to query latency. The filter is essentially a hash set lookup, not a table scan.
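To see why history size doesn't matter, here is a toy model of the hash-set lookup described above (illustrative only, not Shaped internals):

```python
# filter_lookup.py
# Exclusion is a set-membership check per candidate: cost scales with
# the candidate count, not with the size of the user's history.
seen = {f"item_{i}" for i in range(100_000)}  # user with 100k interactions
candidates = [f"item_{i}" for i in range(99_990, 100_010)]  # 20 retrieved candidates

survivors = [c for c in candidates if c not in seen]  # O(1) lookup each
print(len(survivors))  # 10 -> only the never-seen items remain
```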
Next Steps: Implementing Exclude Seen in Your System
Here’s how to go from reading this article to having exclude_seen filters running in production:
Quick Start (Under 30 Minutes)
Step 1: Verify Your Interaction Data (5 min)
Check that you have an interaction table with user_id, item_id, and timestamp columns. If you don’t, create one:
# user_interactions.yaml
name: user_interactions
schema_type: CUSTOM
column_schema:
  user_id: String
  item_id: String
  interaction_type: String
  timestamp: DateTime
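Before wiring up the table, it can be worth sanity-checking that your event payloads actually carry the columns the personal filter needs; a minimal sketch (the missing_columns helper is illustrative, column names match the YAML above):

```python
# validate_interactions.py
# Quick check that an interaction event has the columns exclude_seen
# depends on: user_id, item_id, timestamp.
REQUIRED = {"user_id", "item_id", "timestamp"}

def missing_columns(event: dict) -> set:
    """Return the required columns absent from an event payload."""
    return REQUIRED - event.keys()

event = {"user_id": "u1", "item_id": "i9",
         "interaction_type": "view", "timestamp": "2024-05-01T12:00:00Z"}
print(missing_columns(event))  # set() -> usable by the filter
```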
Step 2: Add Filter to Engine Config (10 min)
Open your engine YAML and add the filters section:
# filter_config.yaml
data:
  filters:
    - name: exclude_seen
      filter_type:
        type: personal_filter
        user_id_column: user_id
        item_id_column: item_id
      filter_table:
        type: query
        query: >
          SELECT item_id, user_id
          FROM user_interactions
          WHERE timestamp > now() - INTERVAL '90 days'
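The 90-day window in the query means an interaction stops excluding an item once it ages out. The logic in miniature (illustrative Python, not how Shaped evaluates the filter):

```python
# window_check.py
# An interaction only excludes an item while it is inside the window.
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=90)  # mirrors INTERVAL '90 days' in the filter query

def still_excluded(interaction_ts: datetime, now: datetime) -> bool:
    return now - interaction_ts <= WINDOW

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(still_excluded(now - timedelta(days=30), now))   # True  -> still excluded
print(still_excluded(now - timedelta(days=120), now))  # False -> eligible again
```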
Step 3: Update One Query to Use the Filter (10 min)
Modify an existing recommendation query to include the filter:
# rank_evolution.py
# Before
query = RankQueryBuilder().from_entity('item').retrieve(...).limit(20).build()

# After
query = (
    RankQueryBuilder()
    .from_entity('item')
    .retrieve(...)
    .filter(predicate="prebuilt('exclude_seen', input_user_id='$user_id')")
    .limit(20)
    .build()
)
Step 4: Deploy and Test (5 min)
Deploy your updated engine config and test with a user who has interaction history:
# terminal
$ shaped update-engine --file engine_config.yaml

# Test query
$ curl -X POST https://api.shaped.ai/v1/rank \
    -H "Authorization: Bearer $SHAPED_API_KEY" \
    -d '{
      "query": "SELECT * FROM similarity(...) WHERE prebuilt('"'"'exclude_seen'"'"', input_user_id='"'"'test_user'"'"') LIMIT 20"
    }'
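A quick post-deploy check is to diff the returned items against the test user's known history; a sketch with made-up data (the leaked_items helper, history, and results are illustrative):

```python
# verify_exclusion.py
# Assert that no recommended item appears in the user's recent history.
def leaked_items(results: list[str], history: set[str]) -> list[str]:
    """Items that should have been excluded but came back anyway."""
    return [r for r in results if r in history]

history = {"hotel_a", "hotel_b"}        # items the test user interacted with
results = ["hotel_c", "hotel_d"]        # items the rank endpoint returned
print(leaked_items(results, history))   # [] -> the filter is working
```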
Production Checklist
Before rolling out to all users, verify these items:
- Filter query has proper indexes on timestamp and user_id columns
- Time window for exclusion is appropriate (30-90 days typically)
- Cold start users (no interactions) get reasonable results
- Candidate retrieval sizes are increased to account for filtering (retrieve 100+ to return 20)
- Monitoring is in place for query latency and filter performance
- Edge cases are handled (all candidates filtered, user with 10,000+ interactions)
- Interaction logging is reliable and captures all relevant events
- Different interaction types (view, click, purchase) have appropriate filters
- A/B test is configured to measure impact on engagement metrics
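For the cold-start item on the checklist, one common pattern is to skip the personal filter entirely when a user has no history, since there is nothing to exclude; a hypothetical sketch (build_predicate is illustrative application code, not a Shaped API):

```python
# cold_start_fallback.py
# Skip the exclude_seen predicate for users with no interaction history.
from typing import Optional

def build_predicate(interaction_count: int) -> Optional[str]:
    """Return the filter predicate, or None for cold-start users."""
    if interaction_count == 0:
        return None  # nothing to exclude; run the query unfiltered
    return "prebuilt('exclude_seen', input_user_id='$user_id')"

print(build_predicate(0))   # None -> query runs without the filter
print(build_predicate(42))  # personal filter applied
```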
Getting Help
If you encounter issues:
- Documentation: Full filter reference at docs.shaped.ai
- Slack Community: Join the Shaped community Slack for real-time help
- Support: Email support@shaped.ai with your engine config and error logs
- Office Hours: Schedule a session with the Shaped team for a demo
Conclusion: Stateful Retrieval Is the Future
The shift from stateless to stateful retrieval systems mirrors the evolution of databases in the 1970s. Early database systems were simple stores—you queried them, they returned data, that’s it. As applications grew complex, databases absorbed more logic: constraints, triggers, stored procedures, and indexes. The same progression is happening in retrieval.
For too long, we’ve treated retrieval engines as stateless functions: send a query, get candidates, filter and rank in application code. This made sense when vector databases were pure similarity search, nothing more. But modern retrieval demands more—personalization, behavioral signals, business rules, and yes, state awareness.
Exclude_seen filters represent a small but significant piece of this evolution. They move state management from fragile application logic into the retrieval layer where it belongs. The result is faster queries, simpler code, better rankings, and users who trust your recommendations because they never see the same content twice.
The broader trend is clear: retrieval systems will absorb more intelligence traditionally implemented in application layers. Filtering, ranking, personalization, exploration, and real-time adaptation will all migrate down the stack into purpose-built engines. The applications that embrace this shift early will scale better and iterate faster than those clinging to the old client-side patterns.
Start small. Add exclude_seen to one recommendation surface. Measure the impact on engagement and user satisfaction. Then expand from there. Your users will notice the difference immediately—and they’ll wonder why they ever had to see the same recommendations twice.
Ready to implement stateful retrieval? Try Shaped for free with $100 credits here.
See Shaped in action
Talk to an engineer about your specific use case — search, recommendations, or feed ranking.