Stop Being Repetitive: Using 'Exclude Seen' Filters in Agents

Your AI agent just cost you $800,000 by recommending the same hotels users already rejected—three times in one conversation. Most retrieval systems optimize for similarity while ignoring user history, wasting 60% of compute on items users have already seen. Learn how to implement stateful filtering with prebuilt('exclude_seen'), cutting query latency from roughly 300ms to under 50ms while ensuring agents never repeat themselves.

Quick Answer: How to Prevent Repetitive Agent Recommendations

Building AI agents that never repeat the same recommendations requires implementing stateful filtering at the retrieval layer. Here's what you need to know:

  • Use prebuilt('exclude_seen') filters to automatically filter out items users have already interacted with
  • Track interactions in real-time by storing user-item engagement in an interaction table
  • Apply filters in the WHERE clause, after retrieval but before scoring, to minimize computation
  • Configure personal filters once in your engine config, then reference them in any query
  • Achieve sub-50ms filtering on millions of items without manual state management
  • Avoid the trap of client-side deduplication which creates race conditions and scales poorly
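
If you just want the shape of the solution before the deep dive, here's a minimal sketch using the SDK calls covered below (the filter and embedding names are the ones configured later in this article, and client is assumed to be an initialized Shaped client):

quick_sketch.py
# Minimal sketch: retrieve, drop seen items, and return -- all engine-side.
# Assumes an engine configured with an 'exclude_seen' personal filter (built below).
from shaped import RankQueryBuilder, Similarity

query = (
    RankQueryBuilder()
    .from_entity('item')
    .retrieve(Similarity(embedding_ref='product_content_embedding', limit=100))
    .filter(predicate="prebuilt('exclude_seen', input_user_id='$user_id')")
    .limit(20)
    .build()
)
response = client.rank(query=query, parameters={'user_id': 'user_12345'})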

Nothing kills an agent's UX faster than watching it recommend the same three articles, products, or restaurants you've already seen. Users lose trust immediately when an AI assistant ignores their history and serves up content they've already engaged with. Yet most retrieval systems handle this poorly—either ignoring the problem entirely or implementing fragile client-side solutions that break under load.

The $800K Problem: When Agents Forget What Users Have Seen

A leading travel booking platform discovered their AI concierge was costing them $800,000 per month in lost revenue. The culprit? Their recommendation engine kept suggesting hotels users had already viewed, declined, or booked. Users would ask "What are some good hotels in Paris?" three times in a conversation, and each time the agent would confidently recommend the same five properties.

The engineering team tried multiple fixes. First, they implemented client-side deduplication—tracking seen items in the session state and filtering them out before displaying results. This worked in development but created race conditions in production when users opened multiple tabs or switched devices. Then they attempted a Redis cache with 24-hour TTLs to track viewed items, which added 200ms of latency to every query and still missed interactions that happened in the last few seconds.

The real issue ran deeper than implementation details. Their architecture fundamentally separated retrieval from state management. The vector search engine had no concept of user history. The recommendation model knew nothing about what had already been shown. The application layer tried to bridge this gap, but by then it was too late—the damage was done in wasted compute, poor rankings, and frustrated users.

This pattern repeats across industries. Content platforms re-suggest articles users read yesterday. E-commerce sites show purchased items in "recommended for you" feeds. Restaurant apps propose venues where users already have reservations. Each repetition erodes trust and wastes valuable recommendation slots that could introduce genuinely new options.

The Traditional Approach: Why Client-Side Filtering Fails

Most engineering teams start with the obvious solution: filter out seen items in the application layer after retrieval. The logic seems sound—pull a large set of candidates from the retrieval system, then remove items the user has already interacted with before displaying results.

Here's what that typically looks like:

traditional_filtering.py
# Traditional client-side filtering (don't do this)
from pinecone import Pinecone
import redis

# Assumes user_id and user_query_embedding come from the request context.

# Fetch candidates from vector database
pc = Pinecone(api_key="...")
index = pc.Index("products")
results = index.query(
    vector=user_query_embedding,
    top_k=100,  # Over-fetch to account for filtering
    include_metadata=True
)

# Fetch seen items from Redis
# (decode_responses=True so members come back as str, not bytes)
redis_client = redis.Redis(decode_responses=True)  # connection kwargs omitted
seen_items = redis_client.smembers(f"user:{user_id}:seen")

# Filter out seen items
filtered_results = [
    item for item in results['matches']
    if item['id'] not in seen_items
][:20]  # Take top 20 after filtering

# Track new items as seen
# (not atomic with the read above -- hence the race conditions discussed below)
for item in filtered_results:
    redis_client.sadd(f"user:{user_id}:seen", item['id'])

This approach has five critical failures that become apparent under production load:

1. Wasted Compute and Cost. You're embedding the query, searching the vector index, computing similarity scores, and retrieving metadata for potentially hundreds of items you'll immediately discard. If 60% of candidates are already seen (common for engaged users), you're burning 60% of your compute budget on items you'll never show. At scale, that 60/40 split means paying for roughly 150,000 wasted candidate retrievals for every 100,000 useful ones.

2. Inconsistent Ranking Quality. Seen items tend to cluster near the top of a similarity ranking: they were relevant enough to surface before, which is exactly why the user has already seen them. Retrieve the top 100 candidates and filter out 60 seen items, and you're often left recommending from the lower reaches of the ranking rather than its head. The user gets recommendations from the second tier of quality, not the first, and your carefully tuned ranking model optimizes for the wrong set.

3. Race Conditions and State Sync. User opens three tabs, each querying recommendations. Tab 1 gets items A, B, C. Tab 2 queries before Tab 1 updates the seen set, also gets A, B, C. Tab 3 now sees A, B, C again. Multi-device usage compounds this. The same user on mobile and desktop sees duplicates because state synchronization has ~2-5 second lag.

4. Database Pressure from State Lookups. Every query hits your state store (Redis, PostgreSQL, etc.) twice: once to fetch the seen set before filtering, once to update it after showing results. For a system serving 10,000 queries per second, that's 20,000 database operations per second just for deduplication. Your state store becomes the bottleneck, not your ML model.

5. Complex Application Logic. Your application code now handles state management, deduplication, re-ranking after filtering, and seen-set updates. This complexity leaks across services. Your recommendation API needs to know about state. Your web frontend needs to track interactions. Your mobile app implements a different version. Testing becomes a nightmare because behavior depends on temporal state.
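
To make the waste concrete, here's the back-of-envelope arithmetic behind the numbers above (illustrative only, not a benchmark):

waste_calculator.py
# Illustrative arithmetic for the failure modes above -- not a benchmark.
SEEN_RATE = 0.60   # fraction of retrieved candidates already seen
CANDIDATES = 100   # candidates retrieved per query
QPS = 10_000       # queries per second (state-store example above)

wasted_per_query = int(CANDIDATES * SEEN_RATE)    # 60 candidates fetched for nothing
useful_per_query = CANDIDATES - wasted_per_query  # 40 candidates you can actually show
state_ops_per_sec = QPS * 2                       # read seen set + write it back

print(f"{wasted_per_query}/{CANDIDATES} candidates wasted per query "
      f"({wasted_per_query / useful_per_query:.1f} wasted per useful one)")
print(f"{state_ops_per_sec:,} state-store ops/sec just for deduplication")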

Here's what the architecture looks like with traditional client-side filtering:

Traditional Client-Side Filtering Architecture

User query → App server → Vector DB (retrieve 100 items)
           → State store (Redis/PostgreSQL): fetch seen set
           → App: filter seen items, re-rank from position 41+
           → State store (Redis/PostgreSQL): update seen set
           → Return 20 items → User

Every request requires four network hops, two database queries, and filtering logic spread across three services. Latency ranges from 150-300ms depending on cache hit rates. Error handling is complex because failures can occur at each step. And the whole system gets more fragile as the seen set grows—users with thousands of interactions take longer to filter than new users.

The fundamental mistake is treating retrieval and state as separate concerns. Your retrieval system should understand user history natively, filtering candidates before expensive scoring operations. This isn't just about performance—it's about architectural coherence. When retrieval knows what's been seen, it can make better decisions about what to retrieve in the first place.

The Shaped Edge: Stateful Retrieval with Prebuilt Filters

Shaped takes a different approach: state-aware retrieval where the engine knows about user history and applies filters at the optimal point in the pipeline. Instead of fetching candidates then filtering in application code, you declare filters once in your engine configuration and reference them in queries.

Here's the complete architecture with Shaped's prebuilt filters:

Step 1: Define a Personal Filter in Engine Config

First, you configure the filter in your engine YAML. This tells Shaped what data constitutes "seen" items and how to match them to users.

engine_config.yaml
# engine_config.yaml
name: recommendations_engine

data:
  item_table: products
  interaction_table: user_interactions
  filters:
    - name: exclude_seen
      filter_type:
        type: personal_filter
        user_id_column: user_id
        item_id_column: item_id
      filter_table:
        type: query
        query: >
          SELECT item_id, user_id
           FROM user_interactions
          WHERE interaction_type IN ('view', 'click', 'purchase')

index:
  embeddings:
    - name: product_content_embedding
      encoder:
        type: hugging_face
        model_name: "sentence-transformers/all-MiniLM-L6-v2"
        batch_size: 256
        item_fields: [name, description, category]

training:
  models:
    - name: als
      policy_type: als

This configuration creates a personal filter dataset that maps users to items they've interacted with. The personal_filter type tells Shaped to maintain a per-user index of seen items. The filter automatically updates as new interactions arrive in the user_interactions table.

Step 2: Apply the Filter in Queries

Now you can reference this filter in any query using prebuilt():

rank_query.py
# Python SDK example
from shaped import RankQueryBuilder, Similarity, ColumnOrder

query = (
    RankQueryBuilder()
    .from_entity('item')
    .retrieve(
        Similarity(
            embedding_ref='product_content_embedding',
            encoder={'type': 'interaction_round_robin', 'input_user_id': '$user_id'},
            limit=50
        ),
        ColumnOrder(
            columns=['_derived_popular_rank ASC'],
            limit=50
        )
    )
    .filter(
        predicate="prebuilt('exclude_seen', input_user_id='$user_id')"
    )
    .score(
        value_model='click_through_rate',
        input_user_id='$user_id',
        input_interactions_item_ids='$interaction_item_ids'
    )
    .limit(20)
    .build()
)

# Execute query
response = client.rank(
    query=query,
    parameters={'user_id': 'user_12345'}
)

Or using ShapedQL:

rank_query.sql
SELECT *
FROM similarity(embedding_ref='product_content_embedding',
                encoder='interaction_round_robin',
                input_user_id='$user_id',
                limit=50),
     column_order(columns='_derived_popular_rank ASC', limit=50)
WHERE prebuilt('exclude_seen', input_user_id='$user_id')
ORDER BY score(expression='click_through_rate',
               input_user_id='$user_id',
               input_interactions_item_ids='$interaction_item_ids')
LIMIT 20

The filter runs after retrieval but before scoring, removing seen items from the candidate set before expensive model inference. This is the optimal placement in the pipeline—you only score items the user hasn't seen.

Step 3: Track Interactions in Real-Time

Shaped automatically ingests new interactions from your interaction table. When a user clicks, views, or purchases an item, that interaction flows into the filter dataset without manual synchronization:

tracking.py
# Your application code just tracks interactions normally
# Shaped handles the rest automatically
from datetime import datetime

interaction_event = {
    'user_id': 'user_12345',
    'item_id': 'product_789',
    'interaction_type': 'view',
    'timestamp': datetime.now()
}

# Add to your interaction table via connector or API
# The exclude_seen filter updates automatically
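
For example, if your interaction table is backed by PostgreSQL with a CDC connector (one common setup; table and column names follow the schema used in this article, and the connection string is a placeholder), tracking is an ordinary insert:

track_interaction.py
# Minimal sketch: write the event to the source table Shaped ingests from.
# Assumes a PostgreSQL-backed user_interactions table; the DSN is a placeholder.
from datetime import datetime, timezone
import psycopg2

conn = psycopg2.connect("dbname=app user=app")  # placeholder connection string
with conn, conn.cursor() as cur:
    cur.execute(
        """
        INSERT INTO user_interactions (user_id, item_id, interaction_type, timestamp)
        VALUES (%s, %s, %s, %s)
        """,
        ('user_12345', 'product_789', 'view', datetime.now(timezone.utc)),
    )
# The exclude_seen filter picks this up automatically via the connector.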

How This Changes the Architecture

Compare the data flow:

Flow Optimization Comparison

Traditional flow: Retrieve 100 (initial retrieval) → Score 100 (expensive compute!) → Filter 60 (removing seen items) → Return 40
Shaped flow:      Retrieve 100 (initial retrieval) → Filter 60 (remove seen first) → Score 40 (save 60% compute) → Return 20

The filter runs in the same system that handles retrieval and scoring. No state synchronization across services. No race conditions. No manual cache invalidation. The engine maintains the seen set as a materialized view that updates in real-time as interactions stream in.

Why This Works: The Four-Stage Pipeline with Optimal Filter Placement

Understanding where filtering happens in the retrieval pipeline explains why prebuilt filters outperform client-side approaches. Modern ranking systems operate in four stages:

Stage 1: Retrieval pulls candidate items from indexes. This is cheap (vector similarity, lexical search) but imprecise—you retrieve 50-1000 candidates knowing most won't be perfect.

Stage 2: Filtering removes unwanted items based on business rules. This is where prebuilt('exclude_seen') runs. Filtering after retrieval but before scoring saves compute on items you'll discard anyway.

Stage 3: Scoring applies expensive ML models to predict engagement, conversion, or other objectives. This is your most computationally costly step—you want to score only items that passed filtering.

Stage 4: Reordering applies diversity and exploration to avoid echo chambers and surface variety.
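
In the Python SDK used throughout this article, the four stages map one-to-one onto builder methods. A sketch, annotated by stage (embedding, filter, and model names follow the earlier config):

four_stage_pipeline.py
from shaped import RankQueryBuilder, Similarity

query = (
    RankQueryBuilder()
    .from_entity('item')
    # Stage 1: Retrieval -- cheap, imprecise candidate generation
    .retrieve(Similarity(embedding_ref='product_content_embedding', limit=100))
    # Stage 2: Filtering -- remove seen items before any expensive work
    .filter(predicate="prebuilt('exclude_seen', input_user_id='$user_id')")
    # Stage 3: Scoring -- ML inference only on surviving candidates
    .score(value_model='click_through_rate', input_user_id='$user_id')
    # Stage 4: Reordering -- diversity/exploration on the scored list
    .reorder(diversity=0.3)
    .limit(20)
    .build()
)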

Here's where traditional client-side filtering gets the order wrong:

Pipeline Stage Comparison: Where Filtering Happens Matters

❌ Traditional flow (filter after scoring):
Retrieve 100 items → Score ALL 100 items (expensive!) → Filter: remove 60 seen → Return 40 items (need 20 more? retrieve again and repeat the cycle)
Problem: you score 100 items but discard 60 of them. That's 60% wasted compute on items you'll never show!

✅ Shaped flow (filter before scoring):
Retrieve 100 items → Filter FIRST: remove 60 seen → Score only 40 items (save 60%!) → Return 20 items
Benefit: filtering before scoring means you only run ML inference on items you'll actually use. 60% less compute, and better-quality results from the top 40 instead of positions 41-100.

By filtering before scoring, you avoid running ML inference on items you'll immediately discard. For a model with 50ms inference time, filtering 60 items saves 3,000ms of compute per query. At 10,000 queries per second, that's 8.3 hours of compute saved every second.

Placement matters for ranking quality too. Because the engine filters as part of retrieval, it returns the top candidates among items the user hasn't seen: the genuine head of the unseen ranking. With over-fetch-then-filter, you retrieve a fixed top 100 with seen items included, discard the seen ones, and hope enough relevant candidates survive. For engaged users, that often means recommending from the second tier (positions 41-100) of the original ranking.

Implementation Guide: Building Exclusion Filters Step by Step

Let's build a complete recommendation system with exclude_seen filtering, walking through each component.

Setting Up Your Data Schema

Start by defining your item table and interaction table. The interaction table tracks every user engagement:

setup_schemas.yaml
# tables/user_interactions.yaml
name: user_interactions
schema_type: CUSTOM
column_schema:
  user_id: String
  item_id: String
  interaction_type: String # 'view', 'click', 'purchase', 'save'
  timestamp: DateTime
  session_id: String

# tables/products.yaml
name: products
schema_type: CUSTOM
column_schema:
  item_id: String
  name: String
  description: String
  category: String
  price: Float
  inventory_count: Integer
  created_at: DateTime

Upload these schemas using the Shaped CLI:

terminal
$ shaped create-table --file tables/user_interactions.yaml
$ shaped create-table --file tables/products.yaml

Configuring the Engine with Filters

Now create your engine configuration with the exclude_seen filter:

engine_config.yaml
# engine_config.yaml
name: product_recommendations

data:
  item_table: products
  user_table: users
  interaction_table: user_interactions
  
  # Define the exclude_seen filter
  filters:
    - name: exclude_seen
      filter_type:
        type: personal_filter
        user_id_column: user_id
        item_id_column: item_id
      filter_table:
        type: query
        query: >
          SELECT DISTINCT item_id, user_id
           FROM user_interactions
          WHERE interaction_type IN ('view', 'click', 'purchase')
            AND timestamp > now() - INTERVAL '90 days'

index:
  embeddings:
    - name: product_content_embedding
      encoder:
        type: hugging_face
        model_name: "sentence-transformers/all-MiniLM-L6-v2"
        batch_size: 256
        item_fields: [name, description, category]
    - name: als_embedding
      encoder:
        type: trained_model
        model_ref: als

training:
  models:
    - name: als
      policy_type: als
    - name: click_through_rate
      policy_type: elsa
      label_column: label
      prediction_type: binary_classification

This configuration creates:

  • A content embedding for semantic search over product text
  • A collaborative filtering embedding (ALS) based on user-item interactions
  • A click-through rate prediction model
  • An exclude_seen filter that looks at the last 90 days of interactions

Building Queries with Exclusion

With the filter configured, you can now build queries that automatically exclude seen items:

get_personalized_feed.py
# Basic personalized feed with exclusion
from shaped import RankQueryBuilder, Similarity, ColumnOrder

def get_personalized_feed(user_id: str, limit: int = 20):
    query = (
        RankQueryBuilder()
        .from_entity('item')
        .retrieve(
            # Content-based retrieval
            Similarity(
                embedding_ref='product_content_embedding',
                encoder={'type': 'interaction_round_robin', 'input_user_id': '$user_id'},
                limit=50
            ),
            # Collaborative filtering
            Similarity(
                embedding_ref='als_embedding',
                encoder={'type': 'precomputed_user', 'input_user_id': '$user_id'},
                limit=50
            ),
            # Popular items fallback
            ColumnOrder(
                columns=['_derived_popular_rank ASC'],
                limit=50
            )
        )
        .filter(
            predicate="prebuilt('exclude_seen', input_user_id='$user_id')"
        )
        .score(
            value_model='click_through_rate',
            input_user_id='$user_id'
        )
        .reorder(diversity=0.3)
        .limit(limit)
        .build()
    )

    return client.rank(query=query, parameters={'user_id': user_id})

Handling Edge Cases

New users with no history: The filter gracefully handles users with zero interactions—it simply doesn't filter anything, returning pure ranked results.

recency_logic.py
# No special handling needed - this works for both new and existing users
results = get_personalized_feed(user_id="brand_new_user")

Time windows for recency: You might want to exclude only recently seen items, allowing re-recommendations after 30 days:

config.yaml
filters:
  - name: exclude_recent_views
    filter_type:
      type: personal_filter
      user_id_column: user_id
      item_id_column: item_id
    filter_table:
      type: query
      query: >
        SELECT item_id, user_id
         FROM user_interactions
        WHERE interaction_type = 'view'
          AND timestamp > now() - INTERVAL '30 days'

Combining multiple filters: You can apply multiple prebuilt filters in a single query:

exclusion_query.sql
SELECT *
FROM similarity(embedding_ref='product_embedding', limit=100)
WHERE prebuilt('exclude_seen', input_user_id='$user_id')
  AND prebuilt('exclude_out_of_stock', input_user_id='$user_id')
  AND price <= 200
LIMIT 20
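
The same combination works in the Python builder, assuming (as the ShapedQL above suggests) that prebuilt calls and plain column conditions compose with AND inside a single predicate string:

combined_filters.py
from shaped import RankQueryBuilder, Similarity

query = (
    RankQueryBuilder()
    .from_entity('item')
    .retrieve(Similarity(embedding_ref='product_embedding', limit=100))
    .filter(
        # Multiple prebuilt filters plus a column condition in one predicate
        predicate=(
            "prebuilt('exclude_seen', input_user_id='$user_id') "
            "AND prebuilt('exclude_out_of_stock', input_user_id='$user_id') "
            "AND price <= 200"
        )
    )
    .limit(20)
    .build()
)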

Filter-specific interaction types: Different recommendation surfaces might exclude different interaction types. A "browse again" widget might exclude purchases but allow views, while a main feed excludes everything:

filters_config.yaml
filters:
  - name: exclude_purchased
    filter_table:
      type: query
      query: >
        SELECT item_id, user_id
         FROM user_interactions
        WHERE interaction_type = 'purchase'
        
  - name: exclude_all_interactions
    filter_table:
      type: query
      query: >
        SELECT item_id, user_id
         FROM user_interactions
        WHERE interaction_type IN ('view', 'click', 'save', 'purchase')

Advanced Patterns: Beyond Basic Exclusion

Once you have basic exclusion working, several advanced patterns become possible.

Pattern 1: Category-Aware Exclusion for Diversity

Instead of excluding all seen items, exclude items from categories the user has already explored recently. This maintains diversity while avoiding exact duplicates:

exploration_filter.yaml
filters:
  - name: exclude_explored_categories
    filter_type:
      type: personal_filter
      user_id_column: user_id
      item_id_column: item_id
    filter_table:
      type: query
      query: >
        WITH user_categories AS (
          SELECT u.user_id, i.category, COUNT(*) as interaction_count
          FROM user_interactions u
          JOIN products i ON u.item_id = i.item_id
          WHERE u.timestamp > now() - INTERVAL '7 days'
          GROUP BY u.user_id, i.category
          HAVING COUNT(*) >= 5  -- Saturated on this category
        )
        SELECT p.item_id, uc.user_id
        FROM products p
        JOIN user_categories uc ON p.category = uc.category

Pattern 2: Time-Decayed Exclusion

Instead of hard filtering, apply a penalty to recently seen items using value models. Items seen yesterday get a large penalty, items seen 30 days ago get a small penalty:

dynamic_scoring.sql
SELECT *
FROM similarity(embedding_ref='content_embedding', limit=100)
ORDER BY score(
  expression='
    click_through_rate - 
    CASE 
      WHEN item.last_seen_days < 1 THEN 10.0
      WHEN item.last_seen_days < 7 THEN 5.0
      WHEN item.last_seen_days < 30 THEN 1.0
      ELSE 0.0
    END
  ',
  input_user_id='$user_id'
)
LIMIT 20

This requires adding a view that computes last_seen_days:

items_with_recency.yaml
views:
  - name: items_with_recency
    view_type: SQL
    query: >
      SELECT
        i.*,
        COALESCE(
          EXTRACT(days FROM (now() - MAX(u.timestamp))),
          999
        ) as last_seen_days
      FROM products i
      LEFT JOIN user_interactions u
        ON i.item_id = u.item_id
        AND u.user_id = $input_user_id
      GROUP BY i.item_id

Pattern 3: Collaborative Exclusion

Exclude items that similar users have already engaged with extensively. This is useful for discovery feeds where you want to surface items that your peer group hasn't saturated:

peer_saturation_filter.yaml
filters:
  - name: exclude_peer_saturated
    filter_type:
      type: personal_filter
      user_id_column: user_id
      item_id_column: item_id
    filter_table:
      type: query
      query: >
        WITH similar_users AS (
          -- Find users with similar taste (simplified)
          SELECT u2.user_id as similar_user_id
          FROM user_interactions u1
          JOIN user_interactions u2
             ON u1.item_id = u2.item_id
             AND u1.user_id != u2.user_id
          WHERE u1.user_id = $input_user_id
          GROUP BY u2.user_id
          HAVING COUNT(DISTINCT u1.item_id) >= 5
          ORDER BY COUNT(DISTINCT u1.item_id) DESC
          LIMIT 50
        )
        SELECT u.item_id, $input_user_id as user_id
        FROM user_interactions u
        JOIN similar_users s ON u.user_id = s.similar_user_id
        GROUP BY u.item_id
        HAVING COUNT(DISTINCT u.user_id) >= 30  -- 30+ similar users engaged

Pattern 4: Session-Aware Filtering

For conversational agents, exclude items mentioned earlier in the current conversation without excluding items from previous sessions:

chat_recommendations.py
# Track session-specific exclusions
def get_chat_recommendations(user_id: str, session_id: str, mentioned_items: list):
    builder = (
        RankQueryBuilder()
        .from_entity('item')
        .retrieve(
            Similarity(
                embedding_ref='content_embedding',
                encoder={'type': 'interaction_round_robin', 'input_user_id': '$user_id'},
                limit=50
            )
        )
    )

    # Exclude items mentioned in THIS conversation.
    # Note: str(tuple(...)) breaks on empty and single-element lists
    # (trailing comma), so build the NOT IN list explicitly.
    if mentioned_items:
        id_list = ", ".join(f"'{item_id}'" for item_id in mentioned_items)
        builder = builder.filter(predicate=f"item.item_id NOT IN ({id_list})")

    query = (
        builder
        .score(value_model='click_through_rate', input_user_id='$user_id')
        .limit(10)
        .build()
    )

    return client.rank(query=query, parameters={'user_id': user_id})

Pattern 5: Progressive Exploration

Combine exclusion with exploration to progressively expand the user's horizon. Start with close-to-history recommendations, then gradually introduce more diverse items:

progressive_feed.py
def get_progressive_feed(user_id: str, diversity_level: float = 0.0):
    """
    diversity_level: 0.0 = only exclude exact matches
                    0.5 = exclude similar items
                      1.0 = exclude entire explored clusters
    """
    query = (
        RankQueryBuilder()
        .from_entity('item')
        .retrieve(
            Similarity(
                embedding_ref='content_embedding',
                encoder={'type': 'interaction_round_robin', 'input_user_id': '$user_id'},
                limit=100
            )
        )
        .filter(
            predicate="prebuilt('exclude_seen', input_user_id='$user_id')"
        )
        .score(
            value_model='click_through_rate',
            input_user_id='$user_id'
        )
        .reorder(
            diversity=diversity_level,
            exploration=diversity_level * 0.5
        )
        .limit(20)
        .build()
    )
        
    return client.rank(query=query, parameters={'user_id': user_id})

# Use in conversation
turn_1 = get_progressive_feed(user_id='user123', diversity_level=0.0)  # Similar to history
turn_2 = get_progressive_feed(user_id='user123', diversity_level=0.3)  # Moderate diversity
turn_3 = get_progressive_feed(user_id='user123', diversity_level=0.6)  # High diversity

Hot Take: Client-Side Deduplication Is an Anti-Pattern

Here's a controversial opinion: if you're filtering seen items in your application code, you're doing it wrong. Full stop.

The industry has normalized this pattern. Nearly every recommendation tutorial shows filtering in the app layer. Major platforms run variations of this approach at scale. But that doesn't make it right—it makes it cargo-culted technical debt.

Filtering belongs in the retrieval system, not the application layer. Here's why this isn't just an optimization, it's an architectural imperative:

Retrieval systems should be state-aware by design. Just like databases maintain indexes and constraints, retrieval engines should maintain user state and apply it during candidate selection. Offloading this to the application is like offloading SQL WHERE clauses to application code—technically possible, but architecturally wrong.

State and ranking are coupled, not separate. You can't rank effectively without knowing what the user has seen. A model that predicts "click-through rate" needs to know if the item is novel or repeated. Separating state from ranking means your model optimizes for the wrong objective.

Scaling requires pushing logic down the stack. As query volume grows, you need systems that can apply filters efficiently at the data layer. Pulling millions of candidates into application memory to filter them scales linearly with traffic. Filtering in the engine runs against pre-built indexes, so per-query cost stays nearly flat even as traffic and user histories grow.

The resistance to this idea comes from how we've historically built these systems. Vector databases didn't support user state, so we added it in Redis. Recommendation APIs didn't track interactions, so we added event streams. We've built complex distributed systems to compensate for retrieval engines that lack basic stateful capabilities.

But modern retrieval platforms like Shaped prove this complexity is unnecessary. When retrieval natively understands user history, filtering becomes declarative configuration instead of imperative code. You express "exclude items the user has seen" once in a config file, not repeatedly in every service that calls the API.

When to Use Exclude Seen (and When Not To)

Prebuilt exclusion filters aren't appropriate for every scenario. Here's a framework for deciding when they make sense:

Use Exclude Seen When:

You have high-engagement users. If users typically interact with 50+ items, exclusion prevents frustrating repetition. Content platforms, e-commerce sites, and social feeds fall into this category.

Recommendations refresh frequently. If users return daily or weekly expecting fresh content, exclusion is essential. Without it, you'll show the same popular items every time.

Your catalog is large relative to engagement. With 10,000 items and users who interact with 100, you have plenty of unseen options to recommend. Exclusion improves quality without exhausting the catalog.

You track interactions already. If you're logging views, clicks, or purchases for analytics, you already have the data needed for exclusion filters. The marginal cost is minimal.

You want to optimize for discovery. Exclusion forces the system to surface items users haven't found yet, expanding their awareness of your catalog.

Skip Exclude Seen When:

You have a small catalog. With 100 items and active users, exclusion might eliminate most of your inventory. A user who's seen 80 items has only 20 options left—you might want to re-recommend rather than scrape the bottom of the catalog.

Re-engagement is the goal. Email digest recommendations might benefit from showing items users engaged with before but didn't complete (e.g., "finish watching this series"). Exclusion would hide these opportunities.

Interactions don't signal saturation. A view doesn't mean the user is done with an item. News articles users "viewed" by scrolling past might deserve a second chance. In these cases, time-decay or soft penalties work better than hard exclusion.

You're doing pure exploration. For "random" or "serendipity" features where users explicitly want surprise, exclusion works against the goal. Let them encounter familiar items mixed with new ones.

Cold start is your primary challenge. If most users are new with few interactions, exclusion doesn't buy you much. Focus on solving cold start before worrying about repetition.

Decision Matrix

Factor                    | Use Exclusion    | Skip Exclusion
--------------------------|------------------|---------------
Avg interactions per user | >50              | <10
Catalog size              | >1000 items      | <100 items
Update frequency          | Daily/hourly     | Weekly/monthly
Recommendation goal       | Discovery        | Re-engagement
Interaction meaning       | Saturation       | Incomplete
User base                 | Mostly returning | Mostly new
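
If you'd rather encode this matrix as a quick sanity check, here's a hypothetical helper that tallies the factors (the function name and the majority-vote threshold are made up to mirror the table; this is not a Shaped API):

exclusion_decision.py
# Hypothetical helper mirroring the decision matrix above.
def should_use_exclusion(avg_interactions: float, catalog_size: int,
                         mostly_returning: bool, goal_is_discovery: bool) -> bool:
    votes = [
        avg_interactions > 50,   # high-engagement users
        catalog_size > 1000,     # enough unseen inventory
        mostly_returning,        # repetition is actually visible to users
        goal_is_discovery,       # surfacing new items is the objective
    ]
    return sum(votes) >= 3       # majority of factors point to exclusion

print(should_use_exclusion(120, 50_000, True, True))   # True: use exclude_seen
print(should_use_exclusion(5, 80, False, False))       # False: skip it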

Common Pitfalls When Implementing Exclusion Filters

Even with prebuilt filters, there are ways to shoot yourself in the foot. Here are the five mistakes teams make most often:

Pitfall 1: Filtering Too Aggressively

Excluding everything a user has ever interacted with can backfire. A user who's been on your platform for three years might have "seen" 80% of your catalog. With aggressive filtering, you're left recommending from the dregs—low-quality items that didn't make the cut in previous sessions.

Solution: Use time windows or interaction types selectively. Exclude views from the last 30 days, but allow views from six months ago. Exclude purchases permanently (they own it), but allow clicks to resurface after time passes.

filter_config.yaml
# Good: Time-bounded exclusion
filters:
  - name: exclude_recent_views
    filter_table:
      type: query
      query: >
        SELECT item_id, user_id
         FROM user_interactions
        WHERE interaction_type = 'view'
          AND timestamp > now() - INTERVAL '30 days'

Pitfall 2: Ignoring Filter Performance

Filters run on every query. If your filter query scans millions of rows or does complex joins, you'll add 50-100ms to every request. Users notice this latency.

Solution: Ensure your filter query has proper indexes and uses efficient joins. The filter table should be materialized or use indexed lookups, not table scans.

optimization.yaml
# Bad: Full table scan on every query
query: SELECT item_id, user_id FROM user_interactions

# Good: Indexed lookup with time bounds
query: >
  SELECT item_id, user_id
   FROM user_interactions
  WHERE timestamp > now() - INTERVAL '90 days'
  -- Assumes index on (timestamp, user_id, item_id)
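
If your interaction table lives in PostgreSQL (one common setup; the connection string is a placeholder, adjust for your warehouse), creating that index is a one-time operation:

create_index.py
# One-time index creation so the filter query avoids a table scan.
# Assumes a PostgreSQL-backed user_interactions table; the DSN is a placeholder.
import psycopg2

conn = psycopg2.connect("dbname=app user=app")
with conn, conn.cursor() as cur:
    cur.execute(
        "CREATE INDEX IF NOT EXISTS idx_interactions_recency "
        "ON user_interactions (timestamp, user_id, item_id)"
    )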

Pitfall 3: Not Tracking Impressions

You might exclude items the user clicked but forget to exclude items they scrolled past without engaging. If you show the same 20 items across multiple sessions because the user never clicked, you're still being repetitive.

Solution: Track impressions (items shown to the user) in addition to engagement. Update your interaction table with "shown" events.

log_impressions.py
# Log impressions, not just clicks
# (log_interaction is your application's own event-logging helper)
for item in recommendations:
    log_interaction(
        user_id=user_id,
        item_id=item['id'],
        interaction_type='shown',
        session_id=session_id
    )
# Remember to add 'shown' to the interaction types your exclusion filter matches on

Pitfall 4: Forgetting to Test Cold Start

Your exclusion logic might work great for active users but break the experience for new users with no history. An empty filter returns all candidates, but if you're not prepared for this, you might have bugs.

Solution: Explicitly test with users who have zero interactions. Make sure the fallback behavior (no filtering) is intentional and produces good results.

test_cold_start.py
# Test case: new user with no interactions
# (create_test_user and assert_popular_items_included are your own test helpers)
def test_cold_start_user():
    new_user_id = create_test_user(interaction_count=0)
    results = get_personalized_feed(new_user_id)

    assert len(results) == 20  # Should still get full results
    assert_popular_items_included(results)  # Should fall back to popularity

Pitfall 5: Over-Filtering in Multi-Retrieval Queries

When you use multiple retrievers (content similarity + collaborative filtering + trending), and all three return overlapping items, aggressive filtering might eliminate most candidates before scoring.

Solution: Retrieve more candidates per retriever (e.g., 100 instead of 50) when using exclusion filters. This ensures you have enough candidates after filtering.

adjust_limits.py
# Adjust candidate set size when filtering
query = (
    RankQueryBuilder()
    .from_entity('item')
    .retrieve(
        Similarity(embedding_ref='content', limit=100),  # Increased from 50
        Similarity(embedding_ref='collaborative', limit=100),
        ColumnOrder(columns=['popular_rank ASC'], limit=100)
    )
    .filter(predicate="prebuilt('exclude_seen', input_user_id='$user_id')")
    .limit(20)
    .build()
)

FAQ: Common Questions About Exclude Seen Filters

Q: How do I exclude items from the current session without affecting other sessions?

A: Use a separate session-based filter or pass session-specific items as parameters. For session context, you can use the item.item_id NOT IN clause with parameters:

dynamic_query.py
query = f"""SELECT * FROM similarity(embedding_ref='embedding', limit=100)
WHERE prebuilt('exclude_seen', input_user_id='$user_id') 
  AND item.item_id NOT IN {tuple(current_session_items)}
LIMIT 20"""

Q: What happens if the filter removes all candidates?

A: If all retrieved candidates are filtered out, you'll get fewer results than requested. To prevent empty results, retrieve significantly more candidates than you need (e.g., retrieve 200 to return 20 final results). You can also add fallback retrievers like popularity ranking that don't depend on personalization.
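
A quick way to size the candidate pool: if a fraction p of a user's candidates is typically already seen and you need k final results, retrieve at least k / (1 - p), plus a safety margin. A sketch of that arithmetic:

candidate_sizing.py
import math

def candidates_needed(k: int, seen_rate: float, safety: float = 1.5) -> int:
    """How many candidates to retrieve so roughly k survive exclusion filtering."""
    return math.ceil(k / (1.0 - seen_rate) * safety)

print(candidates_needed(20, 0.60))   # 75  -> retrieve ~75-100 candidates
print(candidates_needed(20, 0.90))   # 300 -> heavy users need much deeper retrieval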

Q: Can I exclude items based on complex rules beyond simple seen/not-seen?

A: Yes. The filter query can use any SQL logic. For example, exclude items from categories the user has bought from five times, or exclude items similar to ones they've returned:

complex_filter.yaml
filters:
  - name: exclude_complex
    filter_table:
      type: query
      query: >
        SELECT p.item_id, u.user_id
        FROM products p
        JOIN user_interactions u ON p.category = (
          SELECT category FROM products WHERE item_id = u.item_id
        )
        WHERE u.interaction_type = 'purchase'
        GROUP BY p.item_id, u.user_id, p.category
        HAVING COUNT(*) >= 5

Q: How often do filters update when new interactions come in?

A: Shaped maintains filters as materialized views that update in real-time as new data arrives in your interaction table. The latency depends on your connector—streaming connectors (Kafka, PostgreSQL CDC) update within seconds, batch connectors within 15 minutes. For most applications, this is fast enough. If you need instant updates, use a streaming connector.

Q: Can I have different exclusion rules for different recommendation surfaces?

A: Absolutely. Create multiple filters with different names and reference them in the appropriate queries:

filters.yaml
filters:
  - name: exclude_all_interactions
    filter_table:
      type: query
      query: SELECT item_id, user_id FROM user_interactions

  - name: exclude_only_purchases
    filter_table:
      type: query
      query: >
        SELECT item_id, user_id
        FROM user_interactions
        WHERE interaction_type = 'purchase'

Then use prebuilt('exclude_all_interactions') in main feeds and prebuilt('exclude_only_purchases') in browse-again widgets.
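
In the Python builder, that's just a different filter name per surface. A minimal sketch reusing the embedding from earlier:

surface_queries.py
from shaped import RankQueryBuilder, Similarity

def main_feed_query():
    # Main feed: hide everything the user has touched
    return (
        RankQueryBuilder()
        .from_entity('item')
        .retrieve(Similarity(embedding_ref='product_content_embedding', limit=100))
        .filter(predicate="prebuilt('exclude_all_interactions', input_user_id='$user_id')")
        .limit(20)
        .build()
    )

def browse_again_query():
    # Browse-again widget: allow previously viewed items, hide only purchases
    return (
        RankQueryBuilder()
        .from_entity('item')
        .retrieve(Similarity(embedding_ref='product_content_embedding', limit=100))
        .filter(predicate="prebuilt('exclude_only_purchases', input_user_id='$user_id')")
        .limit(20)
        .build()
    )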

Q: What's the performance impact of filtering on large user histories?

A: Negligible. Shaped maintains filters as pre-computed indexes optimized for lookups. Even for users with 100,000 interactions, filter application adds <5ms to query latency. The filter is essentially a hash set lookup, not a table scan.

Next Steps: Implementing Exclude Seen in Your System

Here's how to go from reading this article to having exclude_seen filters running in production:

Quick Start (Under 30 Minutes)

Step 1: Verify Your Interaction Data (5 min)

Check that you have an interaction table with user_id, item_id, and timestamp columns. If you don't, create one:

user_interactions.yaml
name: user_interactions
schema_type: CUSTOM
column_schema:
  user_id: String
  item_id: String
  interaction_type: String
  timestamp: DateTime

Step 2: Add Filter to Engine Config (10 min)

Open your engine YAML and add the filters section:

filter_config.yaml
data:
  filters:
    - name: exclude_seen
      filter_type:
        type: personal_filter
        user_id_column: user_id
        item_id_column: item_id
      filter_table:
        type: query
        query: >
          SELECT item_id, user_id
           FROM user_interactions
          WHERE timestamp > now() - INTERVAL '90 days'

Step 3: Update One Query to Use the Filter (10 min)

Modify an existing recommendation query to include the filter:

rank_evolution.py
# Before
query = RankQueryBuilder().from_entity('item').retrieve(...).limit(20).build()

# After
query = (
    RankQueryBuilder()
    .from_entity('item')
    .retrieve(...)
    .filter(predicate="prebuilt('exclude_seen', input_user_id='$user_id')")
    .limit(20)
    .build()
)

Step 4: Deploy and Test (5 min)

Deploy your updated engine config and test with a user who has interaction history:

terminal
$ shaped update-engine --file engine_config.yaml

# Test query
$ curl -X POST https://api.shaped.ai/v1/rank \
  -H "Authorization: Bearer $SHAPED_API_KEY" \
  -d '{
    "query": "SELECT * FROM similarity(...) WHERE prebuilt('"'"'exclude_seen'"'"', input_user_id='"'"'test_user'"'"') LIMIT 20"
  }'

Production Checklist

Before rolling out to all users, verify these items:

  • [ ] Filter query has proper indexes on timestamp and user_id columns
  • [ ] Time window for exclusion is appropriate (30-90 days typically)
  • [ ] Cold start users (no interactions) get reasonable results
  • [ ] Candidate retrieval sizes are increased to account for filtering (retrieve 100+ to return 20)
  • [ ] Monitoring is in place for query latency and filter performance (see the sketch after this checklist)
  • [ ] Edge cases are handled (all candidates filtered, user with 10,000+ interactions)
  • [ ] Interaction logging is reliable and captures all relevant events
  • [ ] Different interaction types (view, click, purchase) have appropriate filters
  • [ ] A/B test is configured to measure impact on engagement metrics
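
For the monitoring item above, a lightweight starting point is to time every rank call and flag result sets that come back short after filtering. A sketch, assuming the client and query objects from earlier (the response shape is an assumption; adapt to your SDK version):

monitor_rank.py
import time

def monitored_rank(client, query, parameters, expected: int = 20):
    """Time each rank call and flag result sets that came back short."""
    start = time.perf_counter()
    response = client.rank(query=query, parameters=parameters)
    latency_ms = (time.perf_counter() - start) * 1000

    returned = len(response.get('items', []))  # assumed response shape
    if latency_ms > 100:
        print(f"WARN: rank latency {latency_ms:.0f}ms for user {parameters.get('user_id')}")
    if returned < expected:
        print(f"WARN: only {returned}/{expected} results returned (over-filtering?)")
    return response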

Getting Help

If you encounter issues:

  • Documentation: Full filter reference at docs.shaped.ai
  • Slack Community: Join the Shaped community Slack for real-time help
  • Support: Email support@shaped.ai with your engine config and error logs
  • Office Hours: Schedule a session with the Shaped team for a demo

Conclusion: Stateful Retrieval Is the Future

The shift from stateless to stateful retrieval systems mirrors the evolution of databases in the 1970s. Early database systems were simple stores—you queried them, they returned data, that's it. As applications grew complex, databases absorbed more logic: constraints, triggers, stored procedures, and indexes. The same progression is happening in retrieval.

For too long, we've treated retrieval engines as stateless functions: send a query, get candidates, filter and rank in application code. This made sense when vector databases were pure similarity search, nothing more. But modern retrieval demands more—personalization, behavioral signals, business rules, and yes, state awareness.

Filters like exclude_seen represent a small but significant piece of this evolution. They move state management from fragile application logic into the retrieval layer where it belongs. The result is faster queries, simpler code, better rankings, and users who trust your recommendations because they never see the same content twice.

The broader trend is clear: retrieval systems will absorb more intelligence traditionally implemented in application layers. Filtering, ranking, personalization, exploration, and real-time adaptation will all migrate down the stack into purpose-built engines. The applications that embrace this shift early will scale better and iterate faster than those clinging to the old client-side patterns.

Start small. Add exclude_seen to one recommendation surface. Measure the impact on engagement and user satisfaction. Then expand from there. Your users will notice the difference immediately—and they'll wonder why they ever had to see the same recommendations twice.

Ready to implement stateful retrieval? Try Shaped for free with $100 credits here.

