Aurora PostgreSQL Internal Tables vs S3 External Data: A Real-World Performance & Cost Benchmark

Aurora PostgreSQL supports reading data directly from Amazon S3 using aws_s3 functions, making it possible to offload heavy or cold data without fully moving it to a data lake platform. But does it perform well enough for analytical queries? In this benchmark, I tested identical datasets—one stored locally in Aurora tables, and the other offloaded to S3 and queried through external table definitions—to compare latency, resource impact, and practical usability.

Aurora PostgreSQL doesn’t provide a traditional federated query capability for S3 CSV files like some other database systems. Instead, you can:

  • Use External Tables via aws_s3 Functions
  • Create a View that dynamically loads S3 data
  • Use a Stored Procedure for ad-hoc S3 queries
  • And the list goes on…
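To make the first option concrete, here is a minimal sketch of the aws_s3 import pattern. The bucket name, object key, region, and table name are illustrative assumptions, not values from the benchmark; the function signatures (`aws_s3.table_import_from_s3`, `aws_commons.create_s3_uri`) are the ones documented for Aurora PostgreSQL.

```sql
-- One-time setup (requires an IAM role granting the cluster S3 read access)
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;

-- Stage a CSV object from S3 into a local table (table must already exist)
SELECT aws_s3.table_import_from_s3(
  'sales_staging',                  -- target table name
  '',                               -- column list ('' = all columns)
  '(format csv, header true)',      -- COPY-style options
  aws_commons.create_s3_uri('my-bucket', 'exports/sales.csv', 'us-east-1')
);
```

Once the import completes, the staged rows can be queried like any other Aurora table.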

I chose the first option because it offers several advantages:

  • Performance Control – You can optimize the function for your specific use case
  • Reusability – Create once, use multiple times with different parameters
  • Memory Management – Uses temporary tables that are automatically cleaned up
  • Flexibility – Easy to modify for different S3 files or filtering
  • Error Handling – Ability to implement proper error handling and logging
  • Cost Efficiency – Data is loaded only when actually needed
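The advantages above can be sketched as a reusable PL/pgSQL function. This is a hypothetical illustration, not the exact function used in the benchmark: the table layout, function name, bucket, and column types are assumptions, while the aws_s3 calls themselves match the documented API.

```sql
-- Sketch: stage an S3 CSV into a session-scoped temp table on demand.
CREATE OR REPLACE FUNCTION query_s3_sales(p_object_key text)
RETURNS TABLE (sale_id int, amount numeric, sold_at date)
LANGUAGE plpgsql AS $$
BEGIN
  -- Temp table is dropped automatically at end of session (Memory Management)
  CREATE TEMP TABLE IF NOT EXISTS s3_sales_tmp
    (sale_id int, amount numeric, sold_at date);

  -- Data is pulled from S3 only when the function is called (Cost Efficiency);
  -- p_object_key lets one function target different files (Reusability)
  PERFORM aws_s3.table_import_from_s3(
    's3_sales_tmp', '', '(format csv, header true)',
    aws_commons.create_s3_uri('my-bucket', p_object_key, 'us-east-1'));

  RETURN QUERY SELECT t.sale_id, t.amount, t.sold_at FROM s3_sales_tmp t;
EXCEPTION WHEN OTHERS THEN
  -- Surface the failure without aborting the caller (Error Handling)
  RAISE NOTICE 'S3 import failed for %: %', p_object_key, SQLERRM;
  RETURN;
END $$;
```

A caller would then run, for example, `SELECT * FROM query_s3_sales('exports/2024/sales.csv');` and filter or join the result like a normal table.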