Connect Your Users: Building "People to Follow" Recommendations

Helping users discover people they’d genuinely want to follow is a cornerstone of engaging digital communities, but building smart, scalable user similarity systems is notoriously complex. Traditional approaches involve messy data pipelines, heavyweight ML infrastructure, and cold-start problems that slow down development. In this article, we explore how Shaped radically simplifies user-to-user recommendations through its similar_users API, letting you power “People to Follow” and “Suggested Connections” features with just a single call, all backed by real-time behavioral data and deep learned user embeddings.

Fostering Connections in Digital Communities

Social networks, community forums, marketplaces, and collaborative platforms thrive on connection. Helping users discover and connect with others who share their interests, behaviors, or roles is fundamental to growth, engagement, and creating a vibrant ecosystem. A common and powerful way to facilitate this is through "People to Follow," "Suggested Connections," or "Similar Users" recommendations.

Surfacing relevant profiles for a user to connect with can spark new interactions, expand their network, expose them to new content or opportunities, and ultimately make the platform more engaging and valuable. However, identifying who is genuinely similar or relevant to a specific user is a complex task. Simply suggesting friends-of-friends or matching basic profile tags often misses the deeper signals that drive meaningful connections. Building a system to intelligently identify these affinities is a significant technical undertaking.

The Standard Approach: Engineering User Similarity

Determining which users are "similar" requires analyzing potentially vast amounts of data about their profiles, behaviors, and interactions. Building this capability from scratch typically involves:

Step 1: Gathering User Profile and Interaction Data

You need comprehensive data about your users.

  • Identify & Integrate User Profiles: Collect explicit profile information like location, job title, declared interests, bio text, skills, etc.
  • Identify & Integrate Interaction Data: Gather data on user actions: content they create, like, share, bookmark, comment on; profiles they view; items they buy/sell; groups they join; people they already follow/interact with; people they are friends with.
  • Data Cleaning & Pipelines: Ensure profile data consistency and build reliable pipelines to ingest both profile updates and ongoing user interactions.

The Challenge: Requires clean, well-structured profile data (which users may not always provide) and robust tracking of diverse interaction types.
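As a rough illustration of the cleaning step, the sketch below maps two differently shaped event sources onto one common record and drops incomplete or duplicated rows. The raw field names (uid, postId, follower, followee) and the target shape are assumptions made for this example, not a schema any particular tracker produces.

```javascript
// Hypothetical raw events from two trackers; all field names are assumptions.
const rawLikes = [
  { uid: 'u1', postId: 'p9', ts: '2024-05-01T10:00:00Z' },
  { uid: 'u1', postId: 'p9', ts: '2024-05-01T10:00:00Z' }, // exact duplicate
  { uid: null, postId: 'p3', ts: '2024-05-01T11:00:00Z' }, // missing user id
];
const rawFollows = [
  { follower: 'u1', followee: 'u2', at: 1714557600000 }, // epoch milliseconds
];

// Map each source onto one shape: { userId, targetId, eventType, createdAt }.
function normalizeLike(e) {
  return {
    userId: e.uid,
    targetId: e.postId,
    eventType: 'like',
    createdAt: new Date(e.ts).toISOString(),
  };
}

function normalizeFollow(e) {
  return {
    userId: e.follower,
    targetId: e.followee,
    eventType: 'follow',
    createdAt: new Date(e.at).toISOString(),
  };
}

// Drop records without ids and remove exact repeats.
function cleanEvents(events) {
  const seen = new Set();
  return events.filter((e) => {
    if (!e.userId || !e.targetId) return false;
    const key = `${e.userId}|${e.targetId}|${e.eventType}|${e.createdAt}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const events = cleanEvents([
  ...rawLikes.map(normalizeLike),
  ...rawFollows.map(normalizeFollow),
]);
console.log(events);
```

A real pipeline would also need to handle late-arriving events and schema changes, which this sketch leaves out.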

Step 2: Modeling Similarity (Attribute-Based)

This approach matches users based on their explicit profile characteristics.

  • Algorithm Selection: Implement logic to calculate similarity based on shared attributes (e.g., number of matching interests, same location/industry). Can involve simple matching or more complex text analysis on bios/descriptions.
  • Implementation: Requires systems to store and efficiently query user profiles based on attribute filters.

The Challenge: Similarity can be superficial. Relies heavily on users having complete and accurate profiles. Misses similarity based on actual behavior or interests not explicitly stated.
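One simple way to score attribute overlap is Jaccard similarity on declared interests, optionally combined with an exact match on location. The sketch below assumes profiles shaped like { interests: [...], location: '...' }; the 0.8 / 0.2 weights are arbitrary placeholders rather than recommended values.

```javascript
// Jaccard similarity: |A ∩ B| / |A ∪ B| over two lists treated as sets.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  for (const x of setA) {
    if (setB.has(x)) shared += 1;
  }
  return shared / (setA.size + setB.size - shared);
}

// Weighted mix of interest overlap and an exact location match.
// The weights are illustrative assumptions.
function attributeSimilarity(u, v) {
  const interestScore = jaccard(u.interests, v.interests);
  const locationScore = u.location && u.location === v.location ? 1 : 0;
  return 0.8 * interestScore + 0.2 * locationScore;
}

const alice = { interests: ['climbing', 'rust'], location: 'Berlin' };
const bob = { interests: ['rust', 'baking'], location: 'Berlin' };
console.log(attributeSimilarity(alice, bob));
```

Two users with empty interest lists score 0 here, which shows how quickly this approach degrades when profiles are incomplete.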

Step 3: Modeling Similarity (Behavior-Based - Collaborative Filtering)

This approach finds users who act similarly or like similar things.

  • Algorithm Selection:
    • User-User Collaborative Filtering: Calculate similarity scores between users directly based on overlapping interactions (e.g., liking the same posts, buying similar products, following the same accounts). Requires computing a potentially huge user-user similarity matrix.
    • Item-Based / Latent Factor Models: Use techniques like matrix factorization (SVD, ALS) or deep learning models (embedding users based on their interaction sequences) to place users in a "preference space". Users close together in this space are considered similar.
  • Model Training & Infrastructure: Requires significant compute resources, ML frameworks, and expertise to process large interaction datasets and train complex models effectively.

The Challenge: Computationally expensive, especially user-user CF at scale. Needs large amounts of interaction data. Suffers from the "cold start" problem for new users with little activity.
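To make user-user collaborative filtering concrete, here is a minimal in-memory sketch that treats each user's interactions as a set of item IDs and compares users with cosine similarity on those sets. The { userId, itemId } record shape is an assumption, and the brute-force loop over all users is exactly the part that becomes expensive at scale.

```javascript
// Group interactions into a Map of userId -> Set of itemIds.
function buildUserItemSets(interactions) {
  const byUser = new Map();
  for (const { userId, itemId } of interactions) {
    if (!byUser.has(userId)) byUser.set(userId, new Set());
    byUser.get(userId).add(itemId);
  }
  return byUser;
}

// Cosine similarity for binary vectors: |A ∩ B| / sqrt(|A| * |B|).
function cosineOnSets(a, b) {
  if (a.size === 0 || b.size === 0) return 0;
  let overlap = 0;
  for (const x of a) {
    if (b.has(x)) overlap += 1;
  }
  return overlap / Math.sqrt(a.size * b.size);
}

// Brute-force scan: compare the target user against every other user.
function mostSimilarUsers(byUser, targetId, k) {
  const target = byUser.get(targetId);
  if (!target) return []; // cold start: no interactions recorded yet
  const scored = [];
  for (const [otherId, items] of byUser) {
    if (otherId === targetId) continue;
    const score = cosineOnSets(target, items);
    if (score > 0) scored.push({ userId: otherId, score });
  }
  return scored.sort((x, y) => y.score - x.score).slice(0, k);
}

const byUser = buildUserItemSets([
  { userId: 'u1', itemId: 'p1' }, { userId: 'u1', itemId: 'p2' }, { userId: 'u1', itemId: 'p3' },
  { userId: 'u2', itemId: 'p2' }, { userId: 'u2', itemId: 'p3' },
  { userId: 'u3', itemId: 'p3' }, { userId: 'u3', itemId: 'p8' }, { userId: 'u3', itemId: 'p9' },
  { userId: 'u4', itemId: 'p7' },
]);
console.log(mostSimilarUsers(byUser, 'u1', 2));
```

A user with no recorded interactions returns an empty list, which is the cold-start problem in miniature.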

Step 4: Hybrid Approaches and Serving

Combining attribute and behavioral signals often yields better results but increases complexity.

  • Blending Logic: Develop strategies to combine similarity scores derived from profile attributes and user behavior.
  • Serving Infrastructure: Build APIs to retrieve lists of similar users with low latency when needed. Requires efficient lookups (e.g., from pre-computed lists or embedding similarity searches). Needs filtering logic (e.g., exclude users already followed).
  • Pre-computation vs. Real-time: Decide whether to periodically pre-compute similarity scores or perform lookups on demand.

The Challenge: Blending different signals effectively is complex. Serving requires optimized infrastructure, potentially including vector databases if using embeddings.
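A minimal sketch of the blending and filtering logic might look like the following. It assumes attribute and behavior scores have already been computed as Maps of candidate ID to score in the 0 to 1 range; the linear blend and the 0.7 behavior weight are illustrative assumptions, not a recommended setting.

```javascript
// Linear blend of two score maps; candidates missing from one map score 0 there.
function blendScores(attrScores, behaviorScores, behaviorWeight = 0.7) {
  const blended = new Map();
  const ids = new Set([...attrScores.keys(), ...behaviorScores.keys()]);
  for (const id of ids) {
    const a = attrScores.get(id) ?? 0;
    const b = behaviorScores.get(id) ?? 0;
    blended.set(id, (1 - behaviorWeight) * a + behaviorWeight * b);
  }
  return blended;
}

// Remove the user themselves and anyone already followed, then keep the top k.
function topSuggestions(blended, currentUserId, alreadyFollowed, k) {
  return [...blended.entries()]
    .filter(([id]) => id !== currentUserId && !alreadyFollowed.has(id))
    .sort((x, y) => y[1] - x[1])
    .slice(0, k)
    .map(([id, score]) => ({ userId: id, score }));
}

const attrScores = new Map([['u2', 0.9], ['u3', 0.2], ['u4', 0.5]]);
const behaviorScores = new Map([['u2', 0.1], ['u3', 0.8], ['u5', 0.6]]);
const blended = blendScores(attrScores, behaviorScores);
console.log(topSuggestions(blended, 'u1', new Set(['u5']), 2));
```

In practice the blend weight is something to tune through experiments rather than fix up front.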

Step 5: Monitoring, A/B Testing, and Iteration

Suggestion quality has to be measured and improved continuously.

  • Key Metrics: Track connection/follow acceptance rate, profile views from suggestions, subsequent interaction rates between newly connected users.
  • A/B Testing: Compare different similarity algorithms, blending strategies, or UI presentations of the suggestions.
  • Analysis & Refinement: Analyze results to understand what drives successful connections and iterate on the models.

The Challenge: Needs robust experimentation infrastructure and ongoing effort for analysis and optimization.
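As a small example of the metrics side, the sketch below computes a follow acceptance rate per A/B variant from an impression log. The log shape ({ variant, followed }) is an assumption, and the comparison stops at raw rates; a real experiment would also need a significance test and enough traffic to trust the difference.

```javascript
// Aggregate impressions into shown / followed counts and a follow rate per variant.
function followRateByVariant(impressions) {
  const stats = new Map();
  for (const { variant, followed } of impressions) {
    const s = stats.get(variant) ?? { shown: 0, followed: 0 };
    s.shown += 1;
    if (followed) s.followed += 1;
    stats.set(variant, s);
  }
  const result = {};
  for (const [variant, s] of stats) {
    result[variant] = { ...s, followRate: s.followed / s.shown };
  }
  return result;
}

// Hypothetical log: one row per suggestion shown to a user.
const impressionLog = [
  { variant: 'A', followed: true }, { variant: 'A', followed: false },
  { variant: 'A', followed: false }, { variant: 'A', followed: false },
  { variant: 'B', followed: true }, { variant: 'B', followed: true },
  { variant: 'B', followed: false }, { variant: 'B', followed: false },
];
const rates = followRateByVariant(impressionLog);
console.log(rates);
```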

The Shaped Approach: Simplified User Similarity with similar_users

Building an effective user similarity engine involves navigating complex data integration, machine learning modeling, and infrastructure challenges. Shaped drastically simplifies this with its dedicated similar_users endpoint, powered by the same sophisticated models that drive personalized recommendations.

Shaped's models learn deep representations (embeddings) of your users based on their interactions, profile attributes (if provided during training), and their relationships within the platform's ecosystem. The similar_users endpoint provides direct, low-latency access to this learned understanding of user affinity.
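Conceptually, similarity over embeddings comes down to finding the vectors closest to a given user's vector. The toy sketch below uses two-dimensional vectors and a brute-force cosine search to show the idea; it is not a description of how Shaped computes or serves results, and the embedding values are made up for illustration.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force nearest neighbours; production systems typically use an
// approximate nearest-neighbour index instead of scanning every vector.
function nearestUsers(embeddings, userId, k) {
  const query = embeddings[userId];
  if (!query) return [];
  return Object.entries(embeddings)
    .filter(([id]) => id !== userId)
    .map(([id, vec]) => ({ userId: id, score: cosine(query, vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Made-up 2-D embeddings: u2 points almost the same way as u1, u4 the opposite way.
const toyEmbeddings = {
  u1: [1, 0],
  u2: [0.9, 0.1],
  u3: [0, 1],
  u4: [-1, 0],
};
console.log(nearestUsers(toyEmbeddings, 'u1', 2));
```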

How Shaped Streamlines Similar User Recommendations:

  • Unified Data & Model: Leverages the same connected data (interactions, user profiles, item data) and the same core model trained for personalized item ranking (rank) to understand user similarities. No need to build and maintain a separate system.
  • Automated Learning: Shaped automatically learns complex relationships and similarities between users by analyzing behavioral patterns and profile features during the model training process.
  • Dedicated similar_users API: A single, straightforward API call retrieves a list of user IDs deemed most similar to a given user_id, based on the model's deep understanding.
  • Managed Infrastructure: Shaped handles the complex model training, embedding generation, efficient similarity lookups, and low-latency API delivery.

Building a "People to Follow" Feature with Shaped

Let's illustrate using Shaped's similar_users endpoint to populate a "Who to Follow" suggestion list for a user on a social platform.

Goal:

When USER_123 visits their feed or a dedicated discovery page, show them the 5 most relevant users they aren't already following.

1. Ensure Data is Connected:

Assume user_interactions (likes, posts, shares, follows, profile views), user_profiles (optional but helpful: interests, location, bio), and potentially content_metadata datasets are connected to Shaped.

2. Define Your Shaped Model (YAML): A standard model definition that includes user interactions is typically sufficient. Including user profile features can help the model learn richer user representations.

social_discovery_model.yaml

model:
  name: social_discovery_engine
connectors:
  - type: Dataset
    name: user_interactions
    id: interactions
  - type: Dataset # Optional but recommended
    name: user_profiles
    id: users
    # Potentially connect item/content data if interactions relate to items
fetch:
    # Define how to fetch user events (likes, follows, posts, etc.)
    events: |
        SELECT
          user_id,
          post_id AS item_id, -- Map content interactions too
          timestamp AS created_at,
          'like' AS event_type
        FROM like_events
        -- UNION ALL ... include other relevant interaction types ...
    users: |
        SELECT
          user_id,
          location,
          declared_interests,
          signup_date,
          follower_userids
        FROM users

  • Key Point: The model learns user similarity from the patterns in the events data, and the users data can enrich what it learns.

3. Create the Model

create_user_similarity_model.sh

shaped create-model --file social_discovery_model.yaml

4. Monitor Training: Wait for the model social_discovery_engine to become ACTIVE.

view_social_discovery_model.sh

shaped view-model --model-name social_discovery_engine

5. Fetch Similar Users (Application Backend Logic): When you need to generate suggestions for USER_123:

  • Step A (Your Backend): Identify the user_id of the current user ('USER_123').
  • Step B (Your Backend): Call Shaped's similar_users API endpoint.
recommend_similar_users.js

const { Shaped } = require('@shaped/shaped');

const shapedClient = new Shaped();
const modelName = 'social_discovery_engine';
const currentUserId = 'USER_123';
// Request more than the 5 suggestions displayed, leaving room for the filtering in Step C.
const numSuggestions = 10;

// CommonJS modules have no top-level await, so the call lives in an async function.
async function fetchSimilarUsers() {
    const response = await shapedClient.similarUsers({
        modelName: modelName,
        userId: currentUserId,
        limit: numSuggestions,

        // Ensures Shaped doesn't return profiles the user already follows.
        filter_predicate: "not array_has_any(follower_userids, user_id)"
    });
    console.log(`Found ${response.ids.length} relevant users to suggest for ${currentUserId}`);
    return response.ids;
}

fetchSimilarUsers().catch(console.error);


Example API Response:

similar_users_response.json

{
  "ids": [
    "user427010",
    "user182094",
    "user332874",
    "user827918",
    "user403528",
    "user991002"
  ]
}

The ids array is truncated here; the actual response contains up to the number of IDs requested by limit.

  • Step C (Your Backend/Frontend): After receiving the list of similar user IDs from Shaped and performing necessary filtering (removing self, removing existing connections), use these IDs to fetch the full user profile details (name, avatar, bio, follower count, etc.) from your own user database. Then, render the "People to Follow" UI component with this information.
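Step C can be sketched as a small pure function. The version below assumes you already hold the current user's following list as a Set and a Map of profiles loaded from your own database; buildSuggestions, the profile fields, and the sample data are hypothetical names chosen for this example.

```javascript
// Filter the returned ids and attach profile details, keeping Shaped's ordering.
function buildSuggestions(similarIds, currentUserId, followingIds, profilesById, maxToShow = 5) {
  const seen = new Set();
  const suggestions = [];
  for (const id of similarIds) {
    // Skip self, existing connections, and repeated ids.
    if (id === currentUserId || followingIds.has(id) || seen.has(id)) continue;
    const profile = profilesById.get(id);
    if (!profile) continue; // e.g. a deleted or deactivated account
    seen.add(id);
    suggestions.push({ id, name: profile.name, avatarUrl: profile.avatarUrl });
    if (suggestions.length === maxToShow) break;
  }
  return suggestions;
}

// Sample data: ids as in the example response, plus the current user for illustration.
const similarIds = ['user427010', 'USER_123', 'user182094', 'user332874', 'user827918', 'user403528'];
const followingIds = new Set(['user182094']);
const profilesById = new Map([
  ['user427010', { name: 'Ada', avatarUrl: '/avatars/ada.png' }],
  ['user182094', { name: 'Ben', avatarUrl: '/avatars/ben.png' }],
  ['user332874', { name: 'Cleo', avatarUrl: '/avatars/cleo.png' }],
  ['user403528', { name: 'Dev', avatarUrl: '/avatars/dev.png' }],
]);

const suggestions = buildSuggestions(similarIds, 'USER_123', followingIds, profilesById, 3);
console.log(suggestions);
```

Requesting more IDs than you plan to display leaves room for this filtering while still filling all the slots in the UI.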

Conclusion: Spark Connections with Effortless User Similarity

Helping users discover meaningful connections is key to building thriving online communities and social platforms. Yet, traditional approaches to user similarity detection often require complex machine learning models and significant infrastructure overhead.

Shaped dramatically streamlines this process with the similar_users endpoint. By tapping into the deep user understanding already learned by your core Shaped models, you can retrieve highly relevant lists of similar users with a single API call. Eliminate the need for separate similarity engines, reduce development complexity, and focus on building features that foster meaningful connections between your users.

Ready to help your users build their network?

Request a demo of Shaped today to see how easy it is to power connection discovery. Or, start exploring immediately with our free trial sandbox.
