Powering AI Personalization with Your PostgreSQL Data and Shaped

PostgreSQL is a powerful source of structured business data, but tapping into it for real-time personalization often requires complex pipelines and engineering overhead. This article shows how Shaped’s direct PostgreSQL connector makes it easy to activate your relational data for intelligent recommendations and search. It explains how Shaped securely connects to your database, syncs relevant tables incrementally, and uses that data to train models and serve AI-powered results via API. From purchase histories and product catalogs to user profiles and transactional data, Shaped helps teams turn trusted SQL data into dynamic, personalized user experiences with minimal setup.

Activating Your Relational Data for Intelligent Experiences

PostgreSQL is a cornerstone for countless applications, prized for its reliability, robustness, and SQL compliance. It often serves as the primary operational database, holding critical business data like user account information, detailed product catalogs, transaction histories, and application state. While PostgreSQL excels at managing structured, transactional data, unlocking its value for dynamic, AI-driven personalization like real-time recommendations or intelligently ranked search requires connecting it to specialized machine learning platforms.

How do you leverage the structured user profiles and purchase histories in your PostgreSQL database to predict future behavior? How do you ensure your AI models always have the latest product attributes from your PostgreSQL catalog tables? How do you train sophisticated models on this relational data without complex ETL processes or putting excessive load on your production database? This is where Shaped's dedicated PostgreSQL connector provides a seamless and efficient solution.

Shaped is an AI-native relevance platform designed to connect directly to your PostgreSQL database, ingest data from specified tables, train state-of-the-art machine learning models, and serve personalized search rankings and recommendations via simple APIs. This post explains the benefits of connecting PostgreSQL to Shaped and provides a step-by-step guide to setting up the integration.

Why Connect PostgreSQL to Shaped? Leverage Your Operational Database

Connecting your PostgreSQL database directly to Shaped allows you to activate your core operational data for powerful personalization and analytics use cases:

  • Data-Rich Recommendations: Utilize the structured, reliable data in PostgreSQL to fuel highly relevant suggestions:
    • Leverage Transactional History: Generate recommendations based on purchase patterns, order details, or other relational data stored in PostgreSQL.
    • Accurate Catalog Awareness: Incorporate detailed and up-to-date product attributes, categories, pricing, and inventory levels directly from your PostgreSQL catalog tables.
    • User Profile & Segment Personalization: Utilize user demographics, subscription statuses, loyalty tiers, or other structured attributes from PostgreSQL user tables to tailor recommendations.
    • Relational "Similar Items": Discover items related not just by behavior, but also by structured attributes defined in your PostgreSQL schema.
  • Enhanced Search Relevance: Improve search results by incorporating trusted data from your operational database:
    • Attribute-Based Filtering & Faceting: Easily use accurate item attributes synced from PostgreSQL for powerful filtering via Shaped's APIs.
    • Optimize Ranking with Business Data: Train models using historical conversion data, user lifetime value, or other business metrics stored in PostgreSQL.
  • Simplified Data Flow: Avoid building and maintaining complex ETL pipelines to export data from PostgreSQL for ML purposes. Shaped's connector handles the data synchronization directly.
  • Scheduled Updates & Incremental Syncs: Keep models fresh by periodically syncing only new or updated data from your PostgreSQL tables based on a replication key, minimizing load on your database.
  • Secure Connectivity: Options for SSL encryption and SSH tunneling ensure your data remains secure during transit.

How it Works: The PostgreSQL Connector

Shaped connects to your PostgreSQL instance using standard database credentials (username/password) for a read-only user you create. You configure which schema and table Shaped should sync.

To efficiently keep data up-to-date after the initial load, Shaped relies on a replication_key. This is a column in your PostgreSQL table that reliably increases over time for new or updated records (e.g., an updated_at timestamp, a created_at timestamp, or an auto-incrementing primary key id). On subsequent syncs, Shaped queries PostgreSQL for rows where the replication_key value is greater than the maximum value seen in the previous sync, fetching only the changes. Shaped also supports secure connections via SSL and SSH tunneling.

Connecting PostgreSQL to Shaped

Setting up the connection involves creating a read-only user in PostgreSQL, ensuring network connectivity (IP allowlisting), and configuring the dataset in Shaped.

Step 1: Prepare PostgreSQL - Create Read-Only User & Grant Permissions

For security, create a dedicated PostgreSQL user with only the necessary read permissions on the specific schema and tables Shaped needs.

  1. Connect to PostgreSQL: Use psql or another SQL client to connect to your target database as an administrative user.
  1. Create Read-Only User: Execute the following SQL commands, replacing database_name, public (if using a different schema), and table names as needed. Choose a strong password.
setup_readonly_user.sql

-- 1. Create a new user with a secure password
CREATE USER shaped_readonly WITH PASSWORD 'YOUR_SECURE_PASSWORD_HERE!';

-- 2. Grant the user permission to connect to the specific database
GRANT CONNECT ON DATABASE your_database_name TO shaped_readonly;

-- 3. Grant the user permission to use the relevant schema (e.g., public)
GRANT USAGE ON SCHEMA public TO shaped_readonly;

-- 4. Grant the user SELECT (read) permission on the specific tables needed

-- Option A: Grant access to ALL tables in the schema (simpler, less secure)
GRANT SELECT ON ALL TABLES IN SCHEMA public TO shaped_readonly;

-- Option B: Grant access ONLY to specific tables (Recommended)
-- GRANT SELECT ON TABLE public.your_users_table TO shaped_readonly;
-- GRANT SELECT ON TABLE public.your_items_table TO shaped_readonly;
-- GRANT SELECT ON TABLE public.your_events_table TO shaped_readonly;

-- 5. Ensure the user can access future tables created in the schema (Optional but Recommended)
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO shaped_readonly;
  1. Secure Credentials: Securely store the username (shaped_readonly in this example) and the password you created.
  1. IP Allowlisting: Depending on where your PostgreSQL database is hosted (e.g., AWS RDS, Google Cloud SQL, self-hosted), you will likely need to configure its firewall rules or security groups to allow incoming connections from Shaped's specific IP addresses. Contact the Shaped team to obtain these IPs.

Step 2: Configure the Shaped Dataset (YAML)

Define the PostgreSQL connection details, target table, replication key, and optional security parameters in a Shaped dataset configuration file.

postgres_dataset.yaml

name: your_postgres_dataset_name

schema_type: POSTGRES
table: your_table_name
user: shaped_readonly
password: YOUR_SECURE_PASSWORD_HERE!
host: your_postgres_host.com
port: 5432
database: your_database_name
replication_key: updated_at

# Optional fields

# database_schema: public
# columns: ["user_id", "item_id", "timestamp", "category", "price"]
# unique_keys: ["order_id", "item_id"]

# ssl_certificate_authority: |
#   -----BEGIN CERTIFICATE-----
#   (CA cert content here)
#   -----END CERTIFICATE-----

# ssl_client_certificate: |
#   -----BEGIN CERTIFICATE-----
#   (Client cert content here)
#   -----END CERTIFICATE-----

# ssl_client_private_key: |
#   -----BEGIN PRIVATE KEY-----
#   (Client private key content here)
#   -----END PRIVATE KEY-----

# ssh_tunnel_host: your_bastion_host.com
# ssh_tunnel_port: 22
# ssh_tunnel_username: your_ssh_user
# ssh_tunnel_password: your_ssh_password

# ssh_tunnel_private_key: |
#   -----BEGIN RSA PRIVATE KEY-----
#   (SSH private key content here)
#   -----END RSA PRIVATE KEY-----

# ssh_tunnel_private_key_password: your_ssh_key_password

# batch_size: 50000
# schedule_interval: "@hourly"

Key Configuration Points:

  • Credentials & Connection: Ensure user, password, host, port, and database are correct.
  • table & database_schema: Specify the exact source table and its schema (if not public).
  • replication_key: Essential for efficient incremental updates. Choose a suitable timestamp or auto-incrementing ID column.
  • columns (Optional): Best practice is to select only the columns needed for your models to improve efficiency.
  • Security (SSL/SSH): Use these optional fields if your database connection requires specific SSL certificates or must be routed through an SSH bastion host. Provide certificate/key content directly in the YAML.

Step 3: Create the Dataset in Shaped

Use the Shaped CLI to create the dataset from your YAML configuration file:

postgres_dataset.yaml

1 shaped create-dataset --file postgres_dataset.yaml

Shaped will validate the configuration, attempt to connect to your PostgreSQL database (check IP allowlisting!), and begin the initial data sync. Monitor the status via the Shaped Dashboard or CLI (shaped view-dataset --dataset-name your_postgres_dataset_name).

What Happens Next? Syncing, Training, Serving from PostgreSQL

Once connected:

  1. Initial Sync: Shaped performs a full sync of the specified table based on your configuration.
  2. Incremental Syncs: On the schedule_interval (default: hourly), Shaped queries PostgreSQL for rows where the replication_key is greater than the last synced value, efficiently fetching only new/updated data.
  3. Model Training: Shaped uses the synced data to train its advanced AI models for personalization.
  4. API Serving: After models are trained, Shaped's APIs serve personalized search rankings and recommendations derived from your PostgreSQL operational data.
  5. Continuous Updates: Scheduled syncs and model retraining keep personalization fresh based on the latest data in your PostgreSQL database.

Conclusion: Unlock Your Operational PostgreSQL Data for AI

Your PostgreSQL database is a vital source of truth for your business operations and customer information. Shaped's PostgreSQL connector provides a secure and efficient way to activate this valuable data for state-of-the-art AI personalization without disrupting your operational database or requiring complex ETL. By connecting Shaped, you can transform your relational data into dynamic, personalized experiences that enhance user engagement and drive business growth.

Ready to power intelligent recommendations and search with your PostgreSQL data?

Request a demo of Shaped today to see it in action with your specific use case. Or, start exploring immediately with our free trial sandbox.

Get up and running with one engineer in one sprint

Guaranteed lift within your first 30 days or your money back

100M+
Users and items
1000+
Queries per second
1B+
Requests

Related Posts

Tullie Murrell
 | 
February 27, 2025

Beyond Dot Products: Retrieval with Learned Similarities

Nic Scheltema
 | 
May 30, 2025

Bringing Collaborative Filtering to LLMs with AdaptRec

Tullie Murrell
 | 
May 28, 2025

A/B Testing Your Rankings: Metrics That Matter in the Real World