Skip to main content

PostgreSQL Vault Configuration

PostgreSQL vaults enable secure connections to PostgreSQL databases for data quality monitoring and analysis.

Overview

PostgreSQL is one of the most popular open-source relational database management systems. DeepDQ's PostgreSQL vault provides seamless integration with PostgreSQL instances, enabling DeepDQ to execute Sentinels for data quality monitoring, discover schemas for Data Catalog, track relationships for Data Lineage, and power the DAB Chatbot with query capabilities.

Configuration Parameters

Required Fields

  • Name: A unique identifier for your PostgreSQL vault
  • Type: PostgreSQL (automatically selected)
  • Host: The hostname or IP address of your PostgreSQL server
  • Port: The port number (default: 5432)
  • Database: The name of the database to connect to
  • Username: The database username for authentication
  • Password: The database password (securely encrypted)

Example Configuration

Name: Production PostgreSQL Vault
Type: PostgreSQL
Host: your-postgres-host.example.com
Port: 5432
Database: your_database
Username: your_username
Password: [encrypted]

Supported PostgreSQL Versions

DeepDQ supports the following PostgreSQL versions:

  • PostgreSQL 12.x
  • PostgreSQL 13.x
  • PostgreSQL 14.x
  • PostgreSQL 15.x
  • PostgreSQL 16.x

Connection Requirements

Network Access

  • Ensure DeepDQ can reach your PostgreSQL instance
  • Configure firewall rules to allow connections on the specified port
  • For cloud deployments, verify security group settings
  • DeepDQ Static IP: Contact salesandsupport@deepanalyze.ai to get the Static IP for whitelisting

Database Permissions

The PostgreSQL user configured in the vault requires the following minimum permissions:

  • CONNECT privilege on the target database
  • SELECT privilege on tables to be monitored
  • USAGE privilege on schemas containing monitored tables

SSL/TLS Support

DeepDQ supports secure connections to PostgreSQL:

  • SSL connections are automatically detected and used when available
  • Encrypted data transmission for enhanced security
  • Automatic SSL mode detection and configuration

Common Use Cases

Data Quality Monitoring with Sentinels

  • Execute automated SQL queries for data validation
  • Monitor data freshness and completeness
  • Detect anomalies and data quality issues
  • Track key metrics and business rules

Data Catalog Integration

  • Automatic schema discovery and cataloging
  • Table and column metadata extraction
  • Data type and constraint documentation
  • Schema change detection and monitoring

Data Lineage Tracking

  • Map data flow between PostgreSQL tables
  • Track transformation dependencies
  • Monitor cross-database relationships
  • Visualize data movement patterns

DAB Chatbot Queries

  • Natural language querying of PostgreSQL data
  • Automated insights and data exploration
  • Interactive data analysis capabilities
  • Business intelligence through conversation

Best Practices

Security

  • Use dedicated database users for DeepDQ connections
  • Implement least-privilege access principles
  • Regularly rotate database passwords
  • Enable SSL/TLS for all connections

Performance

  • Create appropriate indexes for frequently queried tables
  • Monitor connection pool usage
  • Configure connection timeouts appropriately
  • Use read replicas for monitoring when possible

Monitoring

  • Set up connection health checks
  • Monitor query performance and execution times
  • Track connection pool utilization
  • Alert on connection failures

Troubleshooting

Common Connection Issues

Connection Refused

  • Verify host and port configuration
  • Check PostgreSQL service status
  • Validate firewall and network connectivity

Authentication Failed

  • Confirm username and password accuracy
  • Check PostgreSQL authentication configuration (pg_hba.conf)
  • Verify user permissions and database access

SSL/TLS Issues

  • Validate SSL certificate configuration
  • Check SSL mode requirements
  • Verify certificate authority trust chain

Performance Issues

Slow Query Execution

  • Review query complexity and table sizes
  • Check for missing indexes on monitored columns
  • Monitor PostgreSQL performance metrics
  • Consider query optimization strategies

Advanced Configuration

Connection Parameters

Additional PostgreSQL connection parameters can be configured:

  • application_name: Identify DeepDQ connections in PostgreSQL logs
  • statement_timeout: Set maximum query execution time
  • idle_in_transaction_session_timeout: Manage idle connections

High Availability

For production environments with high availability requirements:

  • Configure connection failover options
  • Use connection pooling for improved performance
  • Implement monitoring and alerting for vault health

Schema Discovery

PostgreSQL vaults automatically discover:

  • Available databases and schemas
  • Table structures and relationships
  • Column data types and constraints
  • Index information for optimization