Activating Your Relational Data for Intelligent Experiences
PostgreSQL is a cornerstone for countless applications, prized for its reliability, robustness, and SQL compliance. It often serves as the primary operational database, holding critical business data like user account information, detailed product catalogs, transaction histories, and application state. While PostgreSQL excels at managing structured, transactional data, unlocking its value for dynamic, AI-driven personalization like real-time recommendations or intelligently ranked search requires connecting it to specialized machine learning platforms.
How do you leverage the structured user profiles and purchase histories in your PostgreSQL database to predict future behavior? How do you ensure your AI models always have the latest product attributes from your PostgreSQL catalog tables? How do you train sophisticated models on this relational data without complex ETL processes or putting excessive load on your production database? This is where Shaped's dedicated PostgreSQL connector provides a seamless and efficient solution.
Shaped is an AI-native relevance platform designed to connect directly to your PostgreSQL database, ingest data from specified tables, train state-of-the-art machine learning models, and serve personalized search rankings and recommendations via simple APIs. This post explains the benefits of connecting PostgreSQL to Shaped and provides a step-by-step guide to setting up the integration.
Why Connect PostgreSQL to Shaped? Leverage Your Operational Database
Connecting your PostgreSQL database directly to Shaped allows you to activate your core operational data for powerful personalization and analytics use cases:
- Data-Rich Recommendations: Utilize the structured, reliable data in PostgreSQL to fuel highly relevant suggestions:
  - Leverage Transactional History: Generate recommendations based on purchase patterns, order details, or other relational data stored in PostgreSQL.
  - Accurate Catalog Awareness: Incorporate detailed and up-to-date product attributes, categories, pricing, and inventory levels directly from your PostgreSQL catalog tables.
  - User Profile & Segment Personalization: Utilize user demographics, subscription statuses, loyalty tiers, or other structured attributes from PostgreSQL user tables to tailor recommendations.
  - Relational "Similar Items": Discover items related not just by behavior, but also by structured attributes defined in your PostgreSQL schema.
- Enhanced Search Relevance: Improve search results by incorporating trusted data from your operational database:
  - Attribute-Based Filtering & Faceting: Easily use accurate item attributes synced from PostgreSQL for powerful filtering via Shaped's APIs.
  - Optimize Ranking with Business Data: Train models using historical conversion data, user lifetime value, or other business metrics stored in PostgreSQL.
- Simplified Data Flow: Avoid building and maintaining complex ETL pipelines to export data from PostgreSQL for ML purposes. Shaped's connector handles the data synchronization directly.
- Scheduled Updates & Incremental Syncs: Keep models fresh by periodically syncing only new or updated data from your PostgreSQL tables based on a replication key, minimizing load on your database.
- Secure Connectivity: Options for SSL encryption and SSH tunneling keep your data protected in transit.
How it Works: The PostgreSQL Connector
Shaped connects to your PostgreSQL instance using standard database credentials (username/password) for a read-only user you create. You configure which schema and table Shaped should sync.
To efficiently keep data up-to-date after the initial load, Shaped relies on a replication_key. This is a column in your PostgreSQL table whose value reliably increases for new or updated records: for example, an updated_at timestamp for tables whose rows change, or a created_at timestamp or auto-incrementing primary key id for append-only tables (these two do not change when an existing row is updated, so updates to existing rows would go undetected). On subsequent syncs, Shaped queries PostgreSQL for rows where the replication_key value is greater than the maximum value seen in the previous sync, fetching only the changes. Shaped also supports secure connections via SSL and SSH tunneling.
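The replication-key pattern described above can be illustrated with a minimal sketch. It is not Shaped's implementation: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere, and the products table, its columns, and the sync_once helper are hypothetical names chosen for illustration.

```python
import sqlite3

def sync_once(conn, table, replication_key, last_seen=None):
    """Fetch rows changed since `last_seen`; return (rows, new high-water mark).

    Table and column names are interpolated directly because they are
    trusted constants in this sketch; the key value is bound as a parameter.
    """
    sql = f"SELECT * FROM {table}"
    params = ()
    if last_seen is not None:
        # Incremental sync: only rows whose key is greater than the previous maximum.
        sql += f" WHERE {replication_key} > ?"
        params = (last_seen,)
    sql += f" ORDER BY {replication_key}"
    rows = conn.execute(sql, params).fetchall()
    new_mark = rows[-1][replication_key] if rows else last_seen
    return rows, new_mark

# Hypothetical source table (SQLite standing in for PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?, ?)", [
    (1, "kettle", "2024-01-01T10:00:00"),
    (2, "toaster", "2024-01-01T11:00:00"),
])

# Initial load: no high-water mark yet, so every row is fetched.
initial_rows, mark = sync_once(conn, "products", "updated_at")

# Later, one row is updated and one is inserted.
conn.execute("UPDATE products SET name = 'steel kettle', "
             "updated_at = '2024-01-02T09:00:00' WHERE id = 1")
conn.execute("INSERT INTO products VALUES (3, 'blender', '2024-01-02T09:30:00')")

# Incremental sync: only rows with updated_at greater than the stored mark.
changed_rows, mark = sync_once(conn, "products", "updated_at", mark)
changed_ids = [row["id"] for row in changed_rows]
```

In this sketch the second call returns only the updated and the newly inserted row, while the untouched row is skipped. This is why the key column has to advance on every write you want picked up.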
Connecting PostgreSQL to Shaped
Setting up the connection involves creating a read-only user in PostgreSQL, ensuring network connectivity (IP allowlisting), and configuring the dataset in Shaped.
Step 1: Prepare PostgreSQL - Create Read-Only User & Grant Permissions
For security, create a dedicated PostgreSQL user with only the necessary read permissions on the specific schema and tables Shaped needs.
- Connect to PostgreSQL: Use psql or another SQL client to connect to your target database as an administrative user.
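- Create Read-Only User: Execute SQL commands that create the user and grant it CONNECT on the database, USAGE on the schema, and SELECT on the tables Shaped should read. Substitute your own database_name, schema (public unless you use a different one), and table names. Choose a strong password.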
- Secure Credentials: Securely store the username (shaped_readonly in this example) and the password you created.
- IP Allowlisting: Depending on where your PostgreSQL database is hosted (e.g., AWS RDS, Google Cloud SQL, self-hosted), you will likely need to configure its firewall rules or security groups to allow incoming connections from Shaped's specific IP addresses. Contact the Shaped team to obtain these IPs.
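As a rough sketch of the user-creation step above, the small helper below renders standard PostgreSQL statements for a read-only user limited to specific tables. The readonly_user_sql and quote_ident helpers and the users and products table names are hypothetical; the statements are ordinary PostgreSQL DDL and are not taken from Shaped's documentation, so compare them with Shaped's own instructions before running anything.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def readonly_user_sql(user, database, schema, tables):
    """Build statements that create a user able only to read the given tables."""
    u, d, s = quote_ident(user), quote_ident(database), quote_ident(schema)
    statements = [
        # Replace the placeholder with a strong password before running.
        f"CREATE USER {u} WITH PASSWORD '<choose-a-strong-password>';",
        f"GRANT CONNECT ON DATABASE {d} TO {u};",
        f"GRANT USAGE ON SCHEMA {s} TO {u};",
    ]
    for table in tables:
        # SELECT only: no INSERT, UPDATE, or DELETE privileges are granted.
        statements.append(f"GRANT SELECT ON {s}.{quote_ident(table)} TO {u};")
    return statements

# Hypothetical table names; substitute the tables Shaped should sync.
statements = readonly_user_sql(
    "shaped_readonly", "database_name", "public", ["users", "products"]
)
print("\n".join(statements))
```

Run the printed statements in psql as an administrative user. Granting SELECT table by table, rather than on every table in the schema, keeps the user's access limited to what the sync needs.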
Step 2: Configure the Shaped Dataset (YAML)
Define the PostgreSQL connection details, target table, replication key, and optional security parameters in a Shaped dataset configuration file.
Key Configuration Points:
- Credentials & Connection: Ensure user, password, host, port, and database are correct.
- table & database_schema: Specify the exact source table and its schema (if not public).
- replication_key: Essential for efficient incremental updates. Choose a suitable timestamp or auto-incrementing ID column.
- columns (Optional): Best practice is to select only the columns needed for your models to improve efficiency.
- Security (SSL/SSH): Use these optional fields if your database connection requires specific SSL certificates or must be routed through an SSH bastion host. Provide certificate/key content directly in the YAML.
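Before submitting a configuration, it can help to check locally that the fields listed above are present. The sketch below is an illustration only: the field names come from the configuration points above, but treating these seven as required, the check_dataset_config helper, and the sample values (host, table, columns) are assumptions. The authoritative schema for a dataset file is the one in Shaped's documentation.

```python
# Fields named in the configuration points above; treating them all as
# required is an assumption of this sketch.
REQUIRED_FIELDS = ("user", "password", "host", "port",
                   "database", "table", "replication_key")

def check_dataset_config(config):
    """Return a list of human-readable problems found in a config dictionary."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not config.get(name)]
    port = config.get("port")
    if port is not None and not (isinstance(port, int) and 0 < port < 65536):
        problems.append("port must be an integer between 1 and 65535")
    return problems

# Hypothetical values for illustration.
config = {
    "user": "shaped_readonly",
    "password": "<your-password>",
    "host": "db.example.com",
    "port": 5432,
    "database": "database_name",
    "database_schema": "public",
    "table": "products",
    "replication_key": "updated_at",
    "columns": ["id", "name", "updated_at"],
}
problems = check_dataset_config(config)

# The same config with the replication key left out.
incomplete = {k: v for k, v in config.items() if k != "replication_key"}
incomplete_problems = check_dataset_config(incomplete)
```

A check like this catches missing fields before the dataset is created. It does not verify that the credentials work or that the database is reachable from Shaped's IP addresses.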
Step 3: Create the Dataset in Shaped
Use the Shaped CLI to create the dataset from your YAML configuration file.
Shaped will validate the configuration, attempt to connect to your PostgreSQL database (check IP allowlisting!), and begin the initial data sync. Monitor the status via the Shaped Dashboard or CLI (shaped view-dataset --dataset-name your_postgres_dataset_name).
What Happens Next? Syncing, Training, Serving from PostgreSQL

Once connected:
- Initial Sync: Shaped performs a full sync of the specified table based on your configuration.
- Incremental Syncs: On the schedule_interval (default: hourly), Shaped queries PostgreSQL for rows where the replication_key is greater than the last synced value, efficiently fetching only new/updated data.
- Model Training: Shaped uses the synced data to train its advanced AI models for personalization.
- API Serving: After models are trained, Shaped's APIs serve personalized search rankings and recommendations derived from your PostgreSQL operational data.
- Continuous Updates: Scheduled syncs and model retraining keep personalization fresh based on the latest data in your PostgreSQL database.
Conclusion: Unlock Your Operational PostgreSQL Data for AI
Your PostgreSQL database is a vital source of truth for your business operations and customer information. Shaped's PostgreSQL connector provides a secure and efficient way to activate this valuable data for state-of-the-art AI personalization without disrupting your operational database or requiring complex ETL. By connecting Shaped, you can transform your relational data into dynamic, personalized experiences that enhance user engagement and drive business growth.
Ready to power intelligent recommendations and search with your PostgreSQL data?
Request a demo of Shaped today to see it in action with your specific use case. Or, start exploring immediately with our free trial sandbox.