Keep Shoppers Engaged: Powering "Similar Items" Carousels on PDPs

Product Detail Pages (PDPs) are critical decision points, but when a featured item isn’t quite right, showing relevant alternatives can keep users engaged and reduce drop-off. Building "Similar Items" recommendations usually involves complex pipelines: structured metadata, collaborative filtering, embeddings, and vector search. This article breaks down the standard approaches and challenges, then shows how Shaped simplifies it all with a single API call that blends content and behavioral signals to deliver high-quality, low-latency similar item recommendations.

Beyond the Main Product: The Power of Alternatives

Product Detail Pages (PDPs) are critical decision points in the e-commerce journey. A user has shown interest in a specific item, but it might not be exactly right – perhaps the color is off, the price is slightly too high, or they simply want to see comparable options before committing. Leaving them at a dead end if the featured product isn't perfect risks losing the sale entirely. This is where "Similar Items" or "More Like This" recommendations become invaluable.

By showcasing relevant alternatives directly on the PDP, businesses can keep users engaged, facilitate product discovery, increase the likelihood of finding the right product, and ultimately boost conversion rates and average order value. Unlike cross-sells or upsells like "Frequently Bought Together", similar item recommendations focus on providing comparable choices to the item currently being viewed. Building the intelligence to accurately identify true similarity, however, involves tackling significant technical hurdles.

The Standard Approach: Engineering Item Similarity

Determining which items are genuinely similar to a given product means understanding complex relationships based on product attributes, user behavior, or often both. Building this capability typically requires:

Step 1: Gathering and Preparing Product Data

Rich, accurate product metadata is the foundation, but it is rare in practice.

  • Identify & Integrate Data Sources: Product catalog data including titles, descriptions, categories, brands, colors, materials, price, images, and any other relevant attributes.
  • Data Cleaning & Structuring: Ensure consistency and accuracy in metadata (e.g., standardizing "Blue" vs "blue").
  • Feature Engineering (for Content-Based): Extract meaningful features from text (keywords, TF-IDF scores and text embeddings) or images (visual embeddings).

The Challenge: Requires clean, comprehensive metadata. Feature engineering for text/images demands specialized machine-learning and data science skills.
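As a rough illustration of the cleaning and feature-engineering steps above, here is a minimal sketch in plain Python. The helper names (normalize_attribute, tokenize, tfidf_vectors), the smoothed IDF formula, and the three-item catalog are assumptions made for this example; a production pipeline would more likely rely on an established library and a much larger corpus.

```python
import math
import re
from collections import Counter


def normalize_attribute(value: str) -> str:
    """Lowercase and collapse whitespace so 'Blue' and ' blue ' compare equal."""
    return re.sub(r"\s+", " ", value).strip().lower()


def tokenize(text: str) -> list[str]:
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: dict[str, str]) -> dict[str, dict[str, float]]:
    """Return one sparse TF-IDF vector (term -> weight) per item."""
    tokens = {item_id: tokenize(text) for item_id, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many item descriptions each term appears.
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))
    vectors: dict[str, dict[str, float]] = {}
    for item_id, toks in tokens.items():
        counts = Counter(toks)
        vector = {}
        for term, count in counts.items():
            tf = count / len(toks)
            # Smoothed IDF (an assumption; other variants are equally valid).
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            vector[term] = tf * idf
        vectors[item_id] = vector
    return vectors


# Hypothetical catalog used only for illustration.
catalog = {
    "ITEM_101": "Blue cotton crew-neck t-shirt",
    "ITEM_182": "Navy blue cotton polo shirt",
    "ITEM_900": "Stainless steel water bottle",
}
vectors = tfidf_vectors(catalog)
```

Terms that appear in only one description (such as "crew") receive a higher weight than terms shared across items (such as "blue"), which is what lets distinctive vocabulary drive similarity later on.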

Step 2: Modeling Similarity (Content-Based)

This approach defines similarity based on the items' inherent characteristics.

  • Algorithm Selection:
    • Simple: Basic attribute matching (e.g., same category and brand). Often too crude.
    • Advanced: Calculate similarity based on text descriptions (e.g., cosine similarity on TF-IDF or sentence-transformer embeddings) or image features (e.g., similarity between image embeddings generated by CNNs).
  • Model Training & Infrastructure: Requires ML frameworks, potentially significant compute for training embedding models, and often vector databases (like Pinecone, Weaviate, Turbopuffer, LanceDB) for efficient similarity lookups.

The Challenge: Requires deep ML expertise, complex model training pipelines, and specialized infrastructure (vector databases) for scalable lookups. Defining "similarity" based purely on content can miss nuances captured by user behavior.
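To make the content-based idea concrete, the sketch below computes cosine similarity between sparse item vectors and returns the top matches with a brute-force scan. The vectors, item IDs, and function names are hypothetical; at catalog scale the scan is typically replaced by an approximate nearest-neighbor lookup in a vector database.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as term -> weight."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    item_id: str, vectors: dict[str, dict[str, float]], k: int = 5
) -> list[tuple[str, float]]:
    """Score every other item against the query item and keep the top k."""
    query = vectors[item_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in vectors.items()
        if other_id != item_id  # never recommend the item being viewed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical content vectors (for example, TF-IDF weights or embeddings).
item_vectors = {
    "ITEM_101": {"blue": 1.0, "cotton": 1.0, "shirt": 1.0},
    "ITEM_182": {"blue": 1.0, "cotton": 1.0, "polo": 1.0},
    "ITEM_900": {"steel": 1.0, "bottle": 1.0},
}
neighbors = most_similar("ITEM_101", item_vectors, k=2)
```

The same scoring function applies to dense embeddings if they are stored as index -> value mappings, although a numeric array library would be the more natural choice there.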

Step 3: Modeling Similarity (Collaborative Filtering)

This approach defines similarity based on how users interact with items.

  • Identify & Integrate Interaction Data: Collect user interaction data like product views, clicks, add-to-carts, and purchases. Focus on co-occurrence patterns (e.g., items frequently viewed together, items bought in the same session).
  • Algorithm Selection: Implement item-to-item collaborative filtering algorithms (e.g., based on matrix factorization techniques like ALS or SVD, or analyzing co-occurrence matrices).
  • Data Processing & Calculation: Requires processing large volumes of interaction data to build user-item matrices or co-occurrence tables and calculate similarity scores.

The Challenge: Needs substantial amounts of interaction data to be effective. Suffers from the "cold start" problem for new items with few interactions. Doesn't understand content similarity for items with sparse interaction data.
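One simple form of item-to-item collaborative filtering is a co-occurrence count normalized by item popularity. The sketch below assumes interactions have already been grouped into sessions; the session data and the function name are invented for illustration, and matrix factorization methods such as ALS would need a dedicated library instead.

```python
import math
from collections import defaultdict
from itertools import combinations


def cooccurrence_similarity(
    sessions: list[list[str]],
) -> dict[str, dict[str, float]]:
    """Cosine-normalized co-occurrence: n_ab / sqrt(n_a * n_b)."""
    item_counts: dict[str, int] = defaultdict(int)
    pair_counts: dict[tuple[str, str], int] = defaultdict(int)
    for items in sessions:
        unique = sorted(set(items))  # count each item once per session
        for item in unique:
            item_counts[item] += 1
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1

    sims: dict[str, dict[str, float]] = {item: {} for item in item_counts}
    for (a, b), n_ab in pair_counts.items():
        score = n_ab / math.sqrt(item_counts[a] * item_counts[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


# Hypothetical browsing sessions, one list of viewed item IDs per session.
sessions = [
    ["ITEM_101", "ITEM_182", "ITEM_427"],
    ["ITEM_101", "ITEM_182"],
    ["ITEM_101", "ITEM_900"],
    ["ITEM_427", "ITEM_182"],
]
similarities = cooccurrence_similarity(sessions)
```

ITEM_900 appears in a single session, so it has a score against only one other item. That is the cold-start limitation in miniature: without interactions, this method has nothing to say.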

Step 4: Hybrid Approaches and Serving

Often, the best results come from combining content and collaborative signals.

  • Blending Logic: Develop strategies to weigh and combine scores from content-based and collaborative models.
  • Serving Infrastructure: Build or configure APIs to retrieve similar item candidates (e.g., from a vector DB or pre-computed lists) with low latency when a user visits a PDP. Needs to handle filtering (e.g., remove the item being viewed, filter by stock).
  • Pre-computation vs. Real-time: Decide whether to pre-compute all pairwise similarities (can be computationally expensive and potentially stale) or perform lookups in real-time.

The Challenge: Blending scores effectively is complex. Serving low-latency recommendations requires optimized infrastructure. Managing pre-computation jobs adds operational overhead.
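A linear blend is probably the simplest way to combine the two signals. The sketch below assumes both score sets are already on a comparable 0 to 1 scale and that an in-stock set is available at serving time; the weight alpha, the scores, and the function names are illustrative assumptions rather than recommended values.

```python
def blend_scores(
    content: dict[str, float],
    collaborative: dict[str, float],
    alpha: float = 0.5,
) -> dict[str, float]:
    """Weighted sum of content and collaborative scores; missing scores count as 0."""
    candidates = set(content) | set(collaborative)
    return {
        item: alpha * content.get(item, 0.0)
        + (1.0 - alpha) * collaborative.get(item, 0.0)
        for item in candidates
    }


def serve_similar(
    current_item_id: str,
    blended: dict[str, float],
    in_stock: set[str],
    limit: int = 5,
) -> list[str]:
    """Drop the viewed item and out-of-stock items, then rank by blended score."""
    eligible = [
        item for item in blended
        if item != current_item_id and item in in_stock
    ]
    eligible.sort(key=lambda item: blended[item], reverse=True)
    return eligible[:limit]


# Hypothetical candidate scores for the item being viewed (ITEM_101).
content_scores = {"ITEM_182": 0.9, "ITEM_427": 0.4, "ITEM_555": 0.8, "ITEM_101": 1.0}
collab_scores = {"ITEM_182": 0.5, "ITEM_427": 0.8}

blended = blend_scores(content_scores, collab_scores, alpha=0.6)
carousel = serve_similar(
    "ITEM_101", blended, in_stock={"ITEM_182", "ITEM_427"}, limit=5
)
```

Treating a missing collaborative score as zero penalizes new items; a cold-start-aware variant might fall back to the content score alone when no interaction data exists.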

Step 5: Monitoring, A/B Testing, and Iteration

Similarity logic needs continuous refinement.

  • Key Metrics: Track CTR on similar items module, conversion rate from clicks, contribution to AOV.
  • A/B Testing: Compare different similarity algorithms (content vs. collaborative vs. hybrid), blending strategies, or UI presentations.
  • Analysis & Refinement: Ongoing analysis to understand what drives performance and iterate on the models.

The Challenge: Needs robust A/B testing frameworks and dedicated resources for analysis and optimization.
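For the A/B testing step, a two-proportion z-test is one common way to check whether a CTR difference between two similarity variants is likely to be real. The sketch below uses only the standard library; the click and impression counts are made up for illustration, and a real experiment would also need a pre-planned sample size and guardrail metrics.

```python
import math


def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate, guarding against zero impressions."""
    return clicks / impressions if impressions else 0.0


def two_proportion_z_test(
    clicks_a: int, n_a: int, clicks_b: int, n_b: int
) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the CTR difference B minus A."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical results: variant A (content-only) vs. variant B (hybrid).
z, p_value = two_proportion_z_test(
    clicks_a=400, n_a=10_000, clicks_b=460, n_b=10_000
)
```

With these invented counts the hybrid variant's CTR (4.6%) is higher than the control's (4.0%), and the p-value falls below the conventional 0.05 threshold.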

The Shaped Approach: Simplified Similarity with similar_items

Building robust similar item recommendations involves navigating a maze of data processing, complex ML modeling, and infrastructure management. Shaped dramatically simplifies this with its dedicated similar_items endpoint, powered by sophisticated models that automatically learn complex similarity patterns.

Shaped's underlying models (often leveraging Transformers) learn deep representations of items based on both their metadata (content) and how users interact with them (collaborative signals). The similar_items endpoint provides direct, low-latency access to this learned understanding.

How Shaped Streamlines Similar Item Recommendations:

  • Unified Data Integration: Connect your item metadata and user interaction data using Shaped's connectors. The same data used for personalization feeds the similarity understanding.
  • Automated Model Training: Shaped handles the complex process of training models that intrinsically understand nuanced item similarities, blending content and collaborative signals automatically.
  • Dedicated similar_items API: A single, simple API call retrieves a list of items deemed most similar to a given item_id, based on the model's deep understanding.
  • Contextual Similarity (Optional): You can optionally provide a user_id to the similar_items call. This allows Shaped to potentially tailor the similarity context slightly based on that specific user's preferences and history, although the primary driver remains item-to-item similarity.
  • Managed Infrastructure: Shaped manages the complex model training, serving infrastructure, and low-latency API delivery needed for real-time recommendations.

Building a "Similar Items" Carousel with Shaped

Let's illustrate using Shaped's similar_items endpoint to populate a "More Like This" section on a PDP.

Goal: When a user views the PDP for ITEM_101, display a list of the 5 most similar items.

1. Ensure Data is Connected: Assume item_metadata (with fields like title, description, image_url, category, brand) and user_interactions datasets are connected and used for model training in Shaped.

2. Define Your Shaped Model (YAML): A standard recommendation model definition is usually sufficient. The model trained for personalized ranking (rank) often inherently learns the relationships needed for similar_items. Ensure item metadata fields are included.

product_similarity_model.yaml

model:
  name: product_discovery_engine # Can power rank, similar_items etc.
  connectors:
  - type: Dataset
    name: item_metadata
    id: items
  - type: Dataset # user interaction dataset
    name: user_interactions
    id: interactions
  fetch:
    items: |
      SELECT
        item_id,
        title,
        description, -- Important for content understanding
        category,
        brand,
        image_url,
        product_url
      FROM items
    events: |
      SELECT
        user_id,
        item_id,
        timestamp AS created_at,
        event_value,
        1 AS label -- label is the objective of the model
      FROM interactions

3. Create the Model:

create-model.sh

shaped create-model --file product_similarity_model.yaml

4. Monitor Training: Wait for the model product_discovery_engine to become ACTIVE.

view-product-model.sh

shaped view-model --model-name product_discovery_engine

5. Fetch Similar Items (Application Backend Logic): When a user lands on the PDP for ITEM_101:

  • Step A (Your Backend): Identify the item_id of the product being viewed ('ITEM_101'). Optionally, identify the user_id if the user is logged in and you want potentially contextualized similarity.
  • Step B (Your Backend): Call Shaped's similar_items API endpoint.

similar_items.py

from shaped import Shaped

shaped_client = Shaped()
model_name = 'product_discovery_engine'
current_item_id = 'ITEM_101'
logged_in_user_id = 'USER_456'  # Optional: set to None if user is anonymous
num_similar_items = 5

response = shaped_client.similar_items(
    model_name=model_name,
    item_id=current_item_id,
    # Optionally provide user_id for potentially contextualized similarity
    user_id=logged_in_user_id or None,
    limit=num_similar_items,
    return_metadata=True,  # Get full details for display
)
# With return_metadata=True, item details are read from response.metadata
similar_items_list = response.metadata
print(f"Found {len(similar_items_list)} similar items for {current_item_id}")

Example API Response (with return_metadata=False):

similar_items_response.json

{
  "ids": [
    "ITEM_427",
    "ITEM_182",
    "ITEM_332",
    "ITEM_827",
    "ITEM_403"
  ]
}

(With return_metadata=True, each ID would be replaced/accompanied by its full metadata object)

  • Step C (Your Frontend): Use the list of similar items returned in response.metadata to render the "Similar Items" carousel on the PDP.
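Before handing results to the frontend, the backend usually reshapes them into display-ready cards. The sketch below is one hypothetical way to do that. It assumes the IDs and metadata arrive as two parallel lists and that each metadata entry is a dictionary with the title, image_url, and product_url fields selected in the model definition; the actual response shape should be checked against Shaped's documentation.

```python
def build_carousel_cards(
    ids: list[str],
    metadata: list[dict],
    current_item_id: str,
    limit: int = 5,
) -> list[dict]:
    """Turn (id, metadata) pairs into cards, skipping unusable entries."""
    cards = []
    for item_id, meta in zip(ids, metadata):
        if item_id == current_item_id:
            continue  # defensive: never show the product being viewed
        if not meta.get("title") or not meta.get("product_url"):
            continue  # a card without a title or link cannot be rendered
        cards.append(
            {
                "item_id": item_id,
                "title": meta["title"],
                "image_url": meta.get("image_url"),  # may be None
                "url": meta["product_url"],
            }
        )
    return cards[:limit]


# Hypothetical data standing in for a similar_items response.
sample_ids = ["ITEM_427", "ITEM_182", "ITEM_101"]
sample_metadata = [
    {"title": "Navy Polo", "image_url": "/img/427.jpg", "product_url": "/p/427"},
    {"title": "", "image_url": "/img/182.jpg", "product_url": "/p/182"},
    {"title": "Blue Tee", "image_url": "/img/101.jpg", "product_url": "/p/101"},
]
cards = build_carousel_cards(sample_ids, sample_metadata, current_item_id="ITEM_101")
```

Keeping this reshaping on the backend means the frontend only needs to iterate over a flat list of cards.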

Conclusion: Effortless Similarity, Deeper Engagement

Showing relevant similar items on PDPs is a powerful way to keep users engaged and guide them towards the perfect product. However, building the underlying intelligence traditionally requires complex data pipelines, sophisticated content-based or collaborative filtering models, and significant infrastructure management.

Shaped cuts through this complexity with its similar_items endpoint. By leveraging automatically trained models that understand nuanced item relationships from both content and user behavior, Shaped allows you to easily integrate powerful similar item recommendations with a simple API call. Reduce development time, eliminate infrastructure headaches, and start providing more engaging product discovery experiences today.

Ready to add powerful "Similar Items" recommendations to your PDPs?

Request a demo of Shaped today to see the similar_items endpoint in action. Or, start exploring immediately with our free trial sandbox.
