Beyond Retrieval: Optimizing Relevance with Reranking

Retrieving a strong list of candidate items is just the first step—the real challenge is ranking them in the most relevant, personalized order for each user and goal. This post explores how reranking transforms basic search results or recommendations into truly optimized experiences, the technical hurdles of building high-performance reranking systems, and why mastering reranking is key to delivering better engagement, clicks, and conversions.

In the world of search and recommendations, getting a relevant set of candidate items is only half the battle. You might have a great keyword search engine pulling back documents, a rule-based system generating initial product suggestions, or even another recommendation model providing a baseline list. But are these candidates ordered in the best possible way for each individual user and your specific business goals? Often, the answer is no. The initial retrieval step might prioritize keyword density, broad category matches, or simple popularity, missing the nuanced signals of personal relevance.

This is where reranking comes in. Reranking takes a pre-existing list of candidate items and intelligently reorders them using more sophisticated models or objectives, such as deep personalization, optimizing for click-through rate, or balancing multiple goals. It allows you to leverage investments in existing retrieval systems while layering on powerful, context-aware optimization. However, building a custom, high-performance reranking system is a complex undertaking.

The Standard Approach: Building a Custom Reranking Layer

Adding a sophisticated reranking layer on top of an existing candidate generation system typically involves these challenging steps:

Step 1: Candidate Generation (The Prerequisite)

  • Method: Use your existing system (e.g., Elasticsearch, Solr, a database query, a rules engine, a basic recommendation model) to generate an initial list of candidate item IDs based on the context (search query, user location, category page, etc.); a minimal sketch follows this list.
  • The Challenge: While this step is assumed complete, the quality and diversity of these candidates significantly impact the potential of the reranking step.
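
For illustration only, candidate generation from a relational catalog might be as simple as a category-filtered, popularity-ordered query. The database, table, and column names below are hypothetical placeholders, not a specific schema.

generate_candidates.py
# Hypothetical sketch: pull a broad, popularity-ordered candidate set from an item catalog.
import sqlite3

def get_candidates(category: str, limit: int = 50) -> list:
    conn = sqlite3.connect("catalog.db")  # assumed local catalog database
    rows = conn.execute(
        "SELECT item_id FROM items WHERE category = ? ORDER BY popularity DESC LIMIT ?",
        (category, limit),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]

candidate_item_ids = get_candidates("headphones")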

Step 2: Gathering Data for Reranking

  • Identify & Integrate Data Sources: For each candidate item, you need its features (metadata like category, price, publish date, text descriptions). You also need rich user interaction history and potentially real-time user context.
  • Data Joining & Feature Engineering: In real time, fetch features for all candidates and combine them with user data to create input vectors for the ranking model. This often requires complex, low-latency data lookups and feature transformations (see the sketch after this step).

The Challenge: Joining disparate data sources (candidate source, item catalog, user profiles, interaction logs) in real-time with low latency is a major engineering hurdle.
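
To make the real-time join concrete, here is a rough sketch that assembles per-candidate feature vectors from a hypothetical item feature cache and user profile store. The store interfaces and feature names are illustrative, not a specific product's API.

build_feature_vectors.py
# Illustrative sketch: join candidate IDs with item and user features at request time.
def build_feature_vectors(user_id, candidate_item_ids, item_store, user_store):
    item_features = item_store.multi_get(candidate_item_ids)  # hypothetical batch lookup: {item_id: {...}}
    user_profile = user_store.get(user_id)                    # hypothetical profile: {"avg_price": ..., "fav_category": ...}

    vectors = []
    for item_id in candidate_item_ids:
        feats = item_features.get(item_id, {})
        vectors.append({
            "item_id": item_id,
            "price": feats.get("price", 0.0),
            "category_match": int(feats.get("category") == user_profile.get("fav_category")),
            "price_gap": feats.get("price", 0.0) - user_profile.get("avg_price", 0.0),
        })
    return vectors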

Step 3: Building Sophisticated Ranking Models

  • Algorithm Selection: Simple scoring rules are insufficient; you need advanced machine learning models, often Learning-to-Rank (LTR) approaches (such as LambdaMART or RankNet) or deep learning models that capture complex interactions between user, item, and context features.
  • Model Training & Optimization: Needs large labeled datasets (e.g., search logs with clicks), specialized ML frameworks, significant compute resources for training, and expertise in LTR techniques to optimize for specific metrics (like NDCG, MAP, or CTR); a minimal training sketch follows below.

The Challenge: Requires deep ML/LTR expertise, significant infrastructure for training, and robust MLOps practices for experimentation and deployment.
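
As one possible starting point, a LambdaMART-style model can be trained with LightGBM's LGBMRanker. The feature matrix, labels, and query grouping below are random placeholders standing in for data you would derive from your own search logs; this is a sketch, not a production recipe.

train_ranker.py
# Minimal LTR training sketch with LightGBM (placeholder data).
import numpy as np
import lightgbm as lgb

# X: one row per (query, candidate) pair; y: graded relevance labels derived from clicks.
X = np.random.rand(1000, 10)
y = np.random.randint(0, 3, size=1000)
# group: number of candidates per query; must sum to len(X).
group = [50] * 20

ranker = lgb.LGBMRanker(objective="lambdarank", metric="ndcg", n_estimators=200)
ranker.fit(X, y, group=group)

# At serving time, score one query's candidates and sort descending by score.
scores = ranker.predict(X[:50])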

Step 4: Real-Time Scoring and Serving Infrastructure

  • Low-Latency Inference: Deploy the trained ranking model behind a high-throughput, low-latency API endpoint (a minimal serving sketch follows this step).
  • Scalability & Reliability: Ensure the reranking service can handle peak traffic loads and is fault-tolerant.

The Challenge: Building and managing scalable, low-latency ML model serving infrastructure is operationally intensive.
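
A bare-bones endpoint, sketched here with FastAPI and a stubbed-out scoring function, shows the shape of this layer; the hard part is everything around it (latency budgets, autoscaling, monitoring, model rollout).

rank_service.py
# Minimal serving sketch with FastAPI; score_candidates is a stub standing in for the trained model.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class RankRequest(BaseModel):
    user_id: str
    candidate_item_ids: list[str]

def score_candidates(user_id: str, item_ids: list[str]) -> list[float]:
    # Stub: in a real system this builds feature vectors and calls the trained ranker.
    return [0.0] * len(item_ids)

@app.post("/rerank")
def rerank(req: RankRequest):
    scores = score_candidates(req.user_id, req.candidate_item_ids)
    ranked = [item for _, item in sorted(zip(scores, req.candidate_item_ids), key=lambda p: p[0], reverse=True)]
    return {"item_ids": ranked}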

Step 5: Handling Real-Time or Unseen Items

  • Feature Availability: What happens if a candidate item is brand new and its features aren't yet fully ingested into the feature store used by the ranking model? The system needs to handle missing features gracefully or accept features supplied directly with the request (see the fallback sketch below).
  • Real-time Feature Updates: Incorporating very fresh item features (e.g., just-updated stock levels, breaking news relevance) into the ranking model in real-time adds another layer of complexity.

The Challenge: Standard feature stores might have latency, making it hard to rank based on truly real-time information or on items unknown to the main catalog.
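
One pragmatic pattern is to prefer features supplied directly with the request, fall back to the feature store, and finally fall back to neutral defaults for brand-new items. The defaults and store interface below are hypothetical.

feature_fallback.py
# Illustrative fallback logic for unseen or partially-ingested items.
DEFAULT_FEATURES = {"price": 0.0, "category": "unknown", "popularity": 0.0}

def features_with_fallback(item_id, feature_store, request_features=None):
    # 1) Prefer fresh features passed with the request (e.g. straight from the search hit).
    if request_features:
        return {**DEFAULT_FEATURES, **request_features}
    # 2) Otherwise use the feature store, if the item is known there.
    stored = feature_store.get(item_id)
    if stored:
        return {**DEFAULT_FEATURES, **stored}
    # 3) Brand-new item: fall back to neutral defaults so it can still be ranked.
    return dict(DEFAULT_FEATURES)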

The Shaped Approach: Seamless Reranking with the rank API

Building a custom reranking layer is often complex and resource-intensive. Shaped offers a much simpler and more powerful solution by allowing you to leverage its sophisticated ranking models, trained on your data, via the rank API, specifically using the item_ids or item_features parameters.

You bring the candidates; Shaped provides the state-of-the-art, personalized ranking intelligence.

How Shaped Streamlines Reranking:

  1. Leverage Existing Retrieval: Keep using your existing search engine, database query, or rule engine to generate the initial candidate set.
  2. Provide Candidates to Shaped: Pass the list of candidate item IDs directly to Shaped's rank API.
  3. Two Flexible Methods:
    • item_ids Parameter: Provide a list of candidate item IDs. Shaped will look up these items and their features within its own data catalog (built from your connected datasets) and use your trained model to score and reorder them based on the user's context and learned preferences. Ideal when candidates are known items within Shaped.
    • item_features Parameter: Provide both the item IDs and their relevant features directly within the API call as a structured dictionary. Shaped uses these supplied features immediately for ranking, bypassing its internal catalog lookup. Perfect for reranking items not yet ingested by Shaped, using real-time features, or integrating with systems where features are readily available alongside IDs.
  4. Sophisticated Ranking: Shaped applies its powerful models (trained on your data, understanding user preferences, item attributes, and context) to intelligently reorder the provided candidates.
  5. Managed ML & Infrastructure: Shaped handles the training, deployment, scaling, and maintenance of the underlying ranking models and serving infrastructure.

Implementing Reranking with Shaped: A Conceptual Example

Let's illustrate reranking search results obtained from an external search engine (e.g., Elasticsearch).

Goal: Take the top 50 search results from Elasticsearch for a user's query and use Shaped to rerank them for personalization.

1. Define Your Shaped Model (Foundation): You need a standard Shaped model trained on your user interactions and item metadata. This model learns the user preferences and item relationships needed for personalized ranking.

reranking_model.yaml
# reranking_model.yaml
model:
  name: personalized_reranker_v1
connectors:
  # Connect your item catalog (so Shaped knows features for method 1)
  - type: Dataset
    name: product_catalog
    id: items
  # Connect user interactions (to learn preferences)
  - type: Dataset
    name: user_events
    id: interactions
fetch:
  items: |
    SELECT item_id, title, description, category, price, image_url
    FROM items
  events: |
    SELECT user_id, item_id, timestamp AS created_at, event_type FROM interactions

2. Create the Model & Monitor Training:

CLI
shaped create-model --file reranking_model.yaml
shaped view-model --model-name personalized_reranker_v1  # Wait for ACTIVE

3. Fetch Candidates and Rerank (Application Backend Logic):

  • Step A (Your Backend): User performs a search query. Your backend queries your external search engine (e.g., Elasticsearch).
search_candidates.py
# Assume 'elasticsearch_client' is your ES client instance
search_query = "wireless headphones"
user_id = "USER_ABC"
num_candidates = 50

# Get initial candidate IDs from external search engine
es_response = elasticsearch_client.search(
    index="products",
    body={"query": {"match": {"description": search_query}}},
    size=num_candidates
)
candidate_item_ids = [hit['_id'] for hit in es_response['hits']['hits']]
# candidate_item_ids is now a list like ['prod_101', 'prod_555', 'prod_213', ...]
  • Step B (Your Backend - Method 1: Using item_ids): If all candidate items are expected to exist in Shaped's catalog.
rank_candidates.py
from shaped import Shaped

shaped_client = Shaped()
model_name = 'personalized_reranker_v1'

try:
    rerank_response = shaped_client.rank(
        model_name=model_name,
        user_id=user_id,
        item_ids=candidate_item_ids,  # Pass the list of IDs from ES
        # No 'limit' needed here usually, as we rank the provided list
        return_metadata=True  # Get full details for display
    )
    if rerank_response and rerank_response.ids:
        personalized_results = rerank_response.metadata or [{'id': item_id} for item_id in rerank_response.ids]
        print(f"Reranked {len(personalized_results)} items using item_ids.")
        # ... Render these personalized_results in the search UI ...
    else:
        print("Reranking with item_ids failed or returned empty.")
        # Fallback: Show original ES results?
except Exception as e:
    print(f"Error reranking with item_ids: {e}")
    # Fallback logic
    
  • Step C (Your Backend - Method 2: Using item_features): If candidates might be new, or you have real-time features available directly from the search result.
rank_candidates_with_features.py
# Assume your ES hit includes some features needed for ranking
# Construct the item_features dictionary
item_features_dict = {
    "item_id": [],
    "category": [],
    "price": [],
    # Add other features defined in your Shaped model's fetch.items query
}
for hit in es_response['hits']['hits']:
    item_features_dict["item_id"].append(hit['_id'])
    item_features_dict["category"].append(hit['_source'].get('category', 'unknown'))
    item_features_dict["price"].append(hit['_source'].get('price', 0.0))
    # ... populate other features ...

try:
    rerank_response_features = shaped_client.rank(
        model_name=model_name,
        user_id=user_id,
        item_features=item_features_dict,  # Pass the dictionary of features
        return_metadata=True  # Often useful even when providing features
    )

    if rerank_response_features and rerank_response_features.ids:
        personalized_results_feat = rerank_response_features.metadata or [{'id': item_id} for item_id in rerank_response_features.ids]
        print(f"Reranked {len(personalized_results_feat)} items using item_features.")
        # ... Render these results ...
    else:
        print("Reranking with item_features failed or returned empty.")
        # Fallback
except Exception as e:
    print(f"Error reranking with item_features: {e}")
    # Fallback logic
    

(Node.js examples would follow similar logic, structuring the itemIds array or itemFeatures object correctly)

  • Step D (Your Frontend): Display the reranked, personalized list of results to the user.

Conclusion: Add Intelligence, Not Infrastructure

You've already invested in systems to retrieve relevant items – whether it's a powerful search engine, curated lists, or a basic recommendation algorithm. Shaped's reranking capability allows you to elevate these systems by adding a layer of sophisticated, personalized ranking without building and maintaining complex ML infrastructure yourself.

By simply providing candidate item IDs (using item_ids) or even item features directly (using item_features) to the Shaped rank API, you leverage state-of-the-art ranking models trained on your specific data. Improve the relevance of your search results, refine curated feeds, and optimize any list of items for maximum user engagement and conversion, all with minimal integration effort.

Ready to optimize your existing item lists with personalized reranking?

Request a demo of Shaped today to see how reranking can enhance your use case. Or, start exploring immediately with our free trial sandbox.
