How to Deploy a Production Two-Tower Model in Less Than a Day

If you’ve read our Deep Dive on Two-Tower Models, you already know the theory: two independent neural networks, one for the user, one for the item, mapping features into a shared embedding space for lightning-fast retrieval. But there is a massive gap between understanding the math of Two-Tower models and actually productionizing them. Usually, this requires a team of ML engineers to manage PyTorch training loops, GPU clusters, and specialized vector databases for Approximate Nearest Neighbor (ANN) search.

Shaped closes that gap. In this post, we’ll walk through how to deploy a production-grade Two-Tower model using Shaped, from data ingestion to sub-50ms retrieval.

The Scenario: Scaling a Global Fashion Marketplace

Imagine you are building the discovery engine for a high-growth fashion marketplace. You have millions of users and tens of millions of unique SKUs.

  • The Problem: Pure collaborative filtering (like ALS) fails because your inventory is highly ephemeral: items sell out and new arrivals appear every second. This is the "Cold Start" problem. A new item has no interaction history, so a model that learns only from interactions has no basis for deciding who should see it.
  • The Solution: A Two-Tower Model. By using neural networks to encode both user preferences and item metadata (brand, material, style), the model can recommend a brand-new item the moment it’s listed, as long as its features "look" like something the user enjoys.
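
To make the cold-start argument concrete, here is a minimal sketch in plain Python. It is not Shaped's implementation: the feature vectors are hand-set stand-ins for what a trained Item Tower would learn, and `FEATURE_VECTORS`, `item_tower`, and `score` are hypothetical names chosen for illustration. The point is only that an item's embedding depends on its metadata, so an item with zero interactions can still be scored.

```python
import math

# Hypothetical vectors for item metadata values. In a real Two-Tower model
# these would be learned during training; here they are hand-set.
FEATURE_VECTORS = {
    ("material", "linen"): [1.0, 0.0],
    ("material", "wool"): [0.0, 1.0],
    ("style", "summer"): [0.9, 0.1],
    ("style", "winter"): [0.1, 0.9],
}


def item_tower(features):
    """Average the vectors of the item's metadata values, then L2-normalize."""
    vecs = [FEATURE_VECTORS[(key, value)] for key, value in features.items()]
    dim = len(vecs[0])
    mean = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean]


def score(user_vec, item_vec):
    """Dot product in the shared embedding space."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


# Stand-in for a User Tower output: a user who leans toward summer linen.
user_embedding = [1.0, 0.0]

# Both items were listed seconds ago and have no clicks or purchases.
new_linen_dress = item_tower({"material": "linen", "style": "summer"})
new_wool_coat = item_tower({"material": "wool", "style": "winter"})

linen_score = score(user_embedding, new_linen_dress)
wool_score = score(user_embedding, new_wool_coat)
print(linen_score > wool_score)  # the linen dress ranks higher for this user
```

A model that learns only from interactions would have nothing to say about either item until interactions accumulate.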

What is Shaped?

Shaped is a retrieval database that enables you to build state-of-the-art discovery systems without the infrastructure overhead. While a standard vector database just stores and searches embeddings, Shaped handles the entire Machine Learning lifecycle:

  • Data Layer: Connects to your production data via 20+ native connectors.
  • Intelligence Layer: Automatically trains models (like Two-Tower, ELSA, or BERT4Rec) on your data and handles embedding generation.
  • Query Layer: Serves ranked results in real-time via a familiar SQL-like interface called ShapedQL.

By using Shaped, you move from raw data to a deployed Two-Tower model in a single configuration file, offloading the GPU management and pipeline engineering to us.

Step 1: Connecting Your Data

A Two-Tower model is only as smart as the features you give it. You need three datasets: User attributes, Item attributes, and the Interaction stream (clicks, likes, purchases).
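
As a rough picture of what those three datasets contain, the sketch below models one row of each as a Python dataclass and checks that every interaction points at a known user and item. The attribute names mirror the fields used later in this post; `user_id`, `item_id`, `event_type`, `timestamp`, and the `dangling_interactions` helper are assumptions for illustration, not a schema that Shaped requires.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: str
    location: str
    style_preferences: list
    gender: str


@dataclass
class Item:
    item_id: str
    brand: str
    category: str
    color: str
    material_description: str


@dataclass
class Interaction:
    user_id: str
    item_id: str
    event_type: str  # e.g. "click", "like", "purchase"
    timestamp: int   # Unix seconds


def dangling_interactions(users, items, interactions):
    """Return interactions whose user_id or item_id has no attribute row."""
    user_ids = {u.user_id for u in users}
    item_ids = {i.item_id for i in items}
    return [
        x for x in interactions
        if x.user_id not in user_ids or x.item_id not in item_ids
    ]


users = [User("u1", "Berlin", ["minimalist"], "f")]
items = [Item("sku1", "Acme", "dress", "white", "100% linen")]
interactions = [
    Interaction("u1", "sku1", "click", 1700000000),
    Interaction("u2", "sku1", "purchase", 1700000050),  # u2 has no profile row
]

bad = dangling_interactions(users, items, interactions)
print([x.user_id for x in bad])  # ['u2']
```

A check like this is worth running before training, because an interaction without a matching attribute row gives the corresponding tower no features to encode.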

Shaped gives you the flexibility to connect these however you work. You can select native connectors for BigQuery or Snowflake in the Shaped Console, or use the Python SDK to define your tables programmatically.

main.py
import shaped

client = shaped.Client(api_key="your_api_key")

# Connect your Fashion Marketplace tables via the Python SDK
client.create_table(
    name="clothing_catalog",
    schema_type="POSTGRES",
    connection_config={
        "host": "db.fashion-market.com",
        "database": "inventory",
        "table": "products"
    },
    replication_key="updated_at" 
)

# You can also stream real-time interaction events 
# (clicks, purchases) via our Custom API or Segment

Step 2: Defining the Two-Tower Engine

In a traditional setup, you’d now have to write complex feature engineering and training code. In Shaped, you define your towers in a declarative YAML configuration.

You can upload this via the Shaped CLI (shaped create-engine --file tower_config.yaml) or set it up directly in the Dashboard.

tower_config.yaml
# tower_config.yaml
version: v2
name: fashion_two_tower_engine

data:
  item_table: { name: "clothing_catalog" }
  user_table: { name: "user_profiles" }
  interaction_table: { name: "user_clicks" }

training:
  models:
    - name: fashion_retrieval_model
      policy_type: two-tower
      # The networks learn to map these features into the same space
      user_fields: [location, style_preferences, gender]
      item_fields: [brand, category, color, material_description]

index:
  embeddings:
    - name: fashion_embeddings
      encoder:
        type: trained_model
        model_ref: fashion_retrieval_model

What happens next? Shaped automates the heavy lifting:

  1. Training: It provisions GPUs and trains the neural networks on your interaction data.
  2. Indexing: It pre-computes the embeddings for your entire catalog (the Item Tower).
  3. Deployment: It hosts the User Tower as a real-time service for sub-50ms inference.
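
The split between offline indexing and online serving can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Shaped's code: the "towers" simply normalize numeric features that are already provided, the exact dot-product scan stands in for an ANN index, and all function names are hypothetical.

```python
import heapq
import math


def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def item_tower(item):
    # Stand-in for a trained network: features are already numeric here.
    return normalize(item["features"])


def user_tower(user):
    # Stand-in for a trained network, evaluated per request.
    return normalize(user["features"])


def build_index(catalog):
    """Offline step: embed every item once and store vectors by item id."""
    return {item["item_id"]: item_tower(item) for item in catalog}


def retrieve(user, index, k):
    """Online step: embed the user, then rank items by dot product.

    This is an exact scan over all items; an ANN index would replace it
    at catalog scale.
    """
    u = user_tower(user)
    scored = (
        (sum(a * b for a, b in zip(u, vec)), item_id)
        for item_id, vec in index.items()
    )
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


catalog = [
    {"item_id": "linen_dress", "features": [0.9, 0.1]},
    {"item_id": "wool_coat", "features": [0.1, 0.9]},
    {"item_id": "cotton_tee", "features": [0.7, 0.3]},
]

index = build_index(catalog)  # computed once, refreshed when the catalog changes
top = retrieve({"user_id": "u1", "features": [1.0, 0.0]}, index, k=2)
print(top)  # ['linen_dress', 'cotton_tee']
```

Because the two towers never see each other's inputs, the item side can be computed ahead of time and only the user side has to run inside the request path.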

Step 3: High-Performance Querying with ShapedQL

At query time, your application sends a request to the Shaped Query API. Shaped takes the user_id, runs it through the live User Tower to create a real-time embedding, and performs an ANN search against the Item Tower index.

You can execute and test this via ShapedQL in the Query Console or through your backend SDK.

query.sql
-- Retrieve the top 50 items matched via the Two-Tower model,
-- then filter by the user's specific size and real-time inventory
-- and return at most 20 results.
SELECT *
FROM similarity(
    embedding_ref='fashion_embeddings',
    encoder='precomputed_user',
    input_user_id=$user_id,
    limit=50
)
WHERE size IN ('M', 'L') 
  AND inventory_count > 0
LIMIT 20
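
The retrieve-then-filter pattern in this query can be mirrored in plain Python. The sketch below assumes the filter runs over the retrieved candidates, as the query's comment describes; how Shaped actually executes the WHERE clause is not covered in this post. The candidate rows and the `filter_candidates` helper are hypothetical.

```python
def filter_candidates(candidates, sizes, limit):
    """Keep in-stock candidates in the allowed sizes, preserving rank order."""
    kept = [
        c for c in candidates
        if c["size"] in sizes and c["inventory_count"] > 0
    ]
    return kept[:limit]


# Candidates as they might come back from retrieval, best match first.
candidates = [
    {"item_id": "a", "size": "M", "inventory_count": 3},
    {"item_id": "b", "size": "S", "inventory_count": 5},  # wrong size
    {"item_id": "c", "size": "L", "inventory_count": 0},  # out of stock
    {"item_id": "d", "size": "L", "inventory_count": 1},
]

results = filter_candidates(candidates, sizes={"M", "L"}, limit=20)
print([c["item_id"] for c in results])  # ['a', 'd']
```

When filtering happens after retrieval, the result can contain fewer rows than the final limit, which is one reason to retrieve more candidates (50 here) than you intend to return (20).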

The Outcome: Why Engineers Use Shaped for Two-Tower

Productionizing this architecture yourself usually takes months of engineering effort. With Shaped, you get the benefits of state-of-the-art neural retrieval in days:

  • Solve the Cold Start: Because the Item Tower encodes metadata (brand, style, color), new items can be recommended as soon as they are indexed, based on how similar their content is to what the user already likes.
  • Sub-50ms Latency: By using our fast_tier (Redis-backed) serving, you can calculate user embeddings and search millions of items in a blink.
  • Hands-Free MLOps: You don't need to manage GPU instances, re-indexing pipelines, or versioning. When your data updates, Shaped automatically refreshes the towers and the index.
  • Deterministic Control: Combine neural retrieval with standard SQL WHERE clauses to ensure you never recommend an out-of-stock item or a size that doesn't fit.

Stop managing infrastructure and start building discovery.

Want to try Shaped with your own data? Sign up for a free trial with $300 credits here.
