Performance Tuning and Optimization 13
Optimization
Trainer: Rupinder Singh
Course Agenda
• Introduction
• SQL Tuning
• Performance Tuning
• Database Maintenance
PostgreSQL Services: Getting Started, Enterprise Strategy, Custom Services, Optimization
• Quick Deploy, Performance Tuning, PostgreSQL Strategy Implementation
• Solution Design, Automation Services, Security Assessment, Embedded Expert
• Migration Assessment, Monitoring Best Practices, Enterprise Architecture, Training
• Migration Assistance, Backup Best Practices
* Postgres Enterprise Manager, Backup and Recovery Tool, Failover Manager, Replication Server, Containers, Kubernetes Operator, PostGIS, Pgpool, PgBouncer, Connectors, Foreign Data Wrappers, Migration Toolkit
• Security - Password policy management, session tag auditing, data redaction, SQL injection
protection, and procedural language code obfuscation
• Performance - Query optimizer hints, SQL session/system wait diagnostics
export PATH
export PGDATA=/var/lib/pgsql/13/data/
export PGUSER=postgres
export PGPORT=5432
export PGDATABASE=postgres
• Log off and log in again for the changes to take effect
• Statement Processing
• Common Query Performance Issues
• SQL Tuning Goals
• SQL Tuning Steps
• Identify slow queries
• Review the query execution plan
• Review Optimizer statistics and behaviour
• Restructure SQL statements
• Review Indexes
Parse
• Check Syntax
• Call Traffic Cop
• Identify Query Type
• Command Processor if needed
• Break Query into Tokens
Optimize
• Planner generates Plan
• Uses Database Statistics
• Query Cost Calculation
• Choose best plan
Execute
• Execute Query based on query plan
Identify Slow Queries → Review the Query Execution Plan → Review Optimizer Statistics and Behavior → Restructure SQL Statements → Add / Remove Indexes → Review Final Execution Plan
• log_min_duration_statement
• Sets a minimum statement execution time (in milliseconds) that
causes a statement to be logged
• All SQL statements that run for the time specified or longer will
be logged with their duration
• Enabling this option can be useful in tracking down non-optimized queries in your applications
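As a sketch, the threshold can be set cluster-wide and reloaded without a restart (the 250 ms value here is an arbitrary example, not a recommendation):

```sql
-- Log every statement that runs for 250 ms or longer (example threshold)
ALTER SYSTEM SET log_min_duration_statement = 250;

-- Reload the configuration so the change takes effect without a restart
SELECT pg_reload_conf();

-- Verify the current setting
SHOW log_min_duration_statement;
```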
• Plan details include Join Type, Join Order, Loop etc.
• EXPLAIN options:
  COSTS [ boolean ]
  SETTINGS [ boolean ]
  BUFFERS [ boolean ]
  TIMING [ boolean ]
  SUMMARY [ boolean ]
• Example
postgres=# EXPLAIN SELECT * FROM emp;
QUERY PLAN
------------------------------------------------------
Seq Scan on emp (cost=0.00..1.14 rows=14 width=135)
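The options listed above are passed in parentheses after EXPLAIN; for instance, against the same emp table:

```sql
-- EXPLAIN with options: actually run the query, report buffer usage,
-- and include the planning/execution time summary
EXPLAIN (ANALYZE, BUFFERS, COSTS on, TIMING on, SUMMARY on)
SELECT * FROM emp;
```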
• Load data:
=# INSERT INTO city VALUES (1,'Edmonton'), (2,'Calgary'),
   (3,'SherwoodPark'), (4,'STAlbert');
=# INSERT INTO office VALUES(generate_series(1,100),4);
=# INSERT INTO office VALUES(generate_series(101,200),3);
=# INSERT INTO office VALUES(generate_series(201,300),2);
=# INSERT INTO office VALUES(generate_series(301,400),1);
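The slides do not show the table definitions; a minimal schema that makes the inserts above work could look like this (column names inferred from the join query used later):

```sql
-- Hypothetical definitions matching the INSERT statements above
CREATE TABLE city (
    cityid   integer,
    cityname text
);

CREATE TABLE office (
    officeid integer,
    cityid   integer
);
```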
• Update the statistics for city and office table:
=# ANALYZE city;
=# ANALYZE office;
• Plan:
=# EXPLAIN ANALYZE SELECT city.cityname, office.officeid,
office.cityid FROM city, office WHERE office.cityid =
city.cityid;
Hash Join (cost=1.09..12.59 rows=400 width=20) (actual time=0.057..0.770
rows=400 loops=1)
Hash Cond: (office.cityid = city.cityid)
-> Seq Scan on office (cost=0.00..6.00 rows=400 width=10) (actual
time=0.012..0.128 rows=400 loops=1)
-> Hash (cost=1.04..1.04 rows=4 width=15) (actual time=0.013..0.013 rows=4
loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on city (cost=0.00..1.04 rows=4 width=15) (actual
time=0.002..0.005 rows=4 loops=1)
Planning time: 0.577 ms
Execution time: 0.893 ms
• Table Statistics
• Absolutely critical to have accurate statistics to ensure optimal query plans
• Are not updated in real time, so should be manually updated after bulk operations
• Can be updated using the ANALYZE command or the OS command vacuumdb with the -Z option
• Stored in pg_class and pg_statistic
• You can run the ANALYZE command from psql on specific tables and just specific columns
• Autovacuum will run ANALYZE as configured
• Syntax for ANALYZE
=# ANALYZE [ VERBOSE ] [ table_name [ ( column_name [, ...] ) ] ]
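For example, statistics can be refreshed for a whole table, or just one column, after a bulk load:

```sql
-- Refresh statistics for the whole table
ANALYZE office;

-- Refresh statistics for a single column, with progress output
ANALYZE VERBOSE city (cityname);
```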
© Copyright EnterpriseDB Corporation, 2020. All rights reserved.
Step 4 - Restructure SQL Statements
• SP-GiST - like GiST, offers an infrastructure that supports various kinds of searches
• GIN - can handle values that contain more than one key, for example arrays
• BRIN (Block Range Index) - accelerates scanning of large tables by maintaining summary data about block ranges. Very small index size compared to B-Tree, the tradeoff being you can’t select a specific row, so they are not useful for all data sets
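As a sketch, assuming a hypothetical events table with a tags array and a created_at column that correlates with insertion order:

```sql
-- Hypothetical table for illustration
CREATE TABLE events (
    id         bigint,
    tags       text[],
    created_at timestamptz
);

-- GIN suits multi-key values such as arrays (e.g. tag containment queries)
CREATE INDEX events_tags_gin ON events USING gin (tags);

-- BRIN suits very large tables where created_at follows physical row
-- order; the index stays tiny compared to a B-Tree
CREATE INDEX events_created_brin ON events USING brin (created_at);
```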
• Statement Processing
• Common Query Performance Issues
• SQL Tuning Goals
• SQL Tuning Steps
• Identify slow queries
• Review the query execution plan
• Optimizer statistics and behaviour
• Restructure SQL statements
• Indexes
Recommendations
o max_connections
• Sets the maximum number of concurrent connections
• Each user connection has an associated user backend process on
the server
• User backend processes are terminated when a user logs off
• Connection pooling can decrease the overhead on postmaster by
reusing existing user backend processes
o shared_buffers
• Sets the number of shared memory buffers used by the database
server
• Each buffer is 8K bytes
• Minimum value must be 16 and at least 2 x max_connections
• 6% - 25% of available memory is a good general guideline
• You may find better results keeping the setting relatively low and
using the operating system cache more instead
o work_mem
• Amount of memory in KB to be used by internal sorts and hash tables
before switching to temporary disk files
• Minimum allowed value is 64 KB
• It is set in KB
• Increasing the work_mem often helps in faster sorting
• work_mem settings can also be changed on a per session basis
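Because work_mem can be changed per session, a heavy reporting query can be given more sort memory without raising the global default:

```sql
-- Raise the sort/hash memory for this session only (example value)
SET work_mem = '64MB';

-- ... run the sort- or hash-heavy query here ...

-- Revert to the server default for the rest of the session
RESET work_mem;
```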
o maintenance_work_mem
• Maximum memory in KB to be used in maintenance operations such
as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN
KEY
• Minimum allowed value is 1024 KB
• It is set in KB
• Performance for vacuuming and restoring database dumps can be
improved by increasing this value
o autovacuum_work_mem
• Maximum amount of memory to be used by each autovacuum
worker process
• Default value is -1, which indicates that maintenance_work_mem is used instead
o huge_pages
• Enables/disables the use of huge memory pages
• Valid values are try (the default), on, and off
• May help performance by using smaller page tables, and thus spending less CPU time on memory management
• This parameter is only supported on Linux
o effective_cache_size
• Size of memory available for disk cache that is available to a single
query
• A higher value favors index scans
• This parameter does not reserve kernel disk cache; it is used only for
estimation purposes
• ½ of RAM is a conservative setting
• ¾ of RAM is more aggressive
• Find the optimal value looking at OS stats after increasing or
decreasing this parameter
o temp_file_limit
• Maximum amount of disk space that a session can use for
temporary files
• A transaction attempting to exceed this limit will be cancelled
• Default is -1 (no limit)
• This setting constrains the total space used at any instant by all
temporary files used by a given Postgres session
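Pulling the memory-related parameters together, a postgresql.conf sketch for a hypothetical 16 GB server might look like this; the values are illustrative starting points under the guidelines above, not tuned recommendations:

```
# postgresql.conf - illustrative values for a 16 GB machine
max_connections = 200
shared_buffers = 4GB            # roughly 25% of RAM
work_mem = 16MB                 # per sort/hash operation, per backend
maintenance_work_mem = 512MB    # VACUUM, CREATE INDEX, etc.
effective_cache_size = 12GB     # ~3/4 of RAM, used for estimation only
temp_file_limit = 10GB          # cap per-session temporary file usage
```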
o wal_level
• wal_level determines how much information is written to the WAL
• The default value is replica, which adds the logging required for WAL archiving as well as the information required to run read-only queries on a replica server
• The value logical adds the information required for logical decoding
• The value minimal removes all logging except the information required to recover from a crash or immediate shutdown
• This parameter can only be set at server start
o wal_buffers
• Number of disk-page buffers allocated in shared memory for WAL
data
• Each buffer is 8K bytes
• Needs to be only large enough to hold the amount of WAL data
created by a typical transaction since the WAL data is flushed out to
disk upon every transaction commit
• Minimum allowed value is 4
• Default setting is -1 (auto-tuned)
• Checkpoints
• Writes the current in-memory modified pages (known as dirty pages) to the disk
• An automatic checkpoint occurs each time the
max_wal_size is reached
o checkpoint_timeout
• Maximum time between automatic WAL checkpoints in seconds
before a checkpoint is forced
• A larger setting results in fewer checkpoints
• Range is 30 – 3600 seconds
• The default is 300 seconds
o fsync
• Ensures that all the WAL buffers are written to the WAL logs at each
COMMIT
• When on, fsync() or another wal_sync_method call is invoked
• Turning this off will be a performance boost but there is a risk of data
corruption
• Can be turned off during initial loading of a new database cluster
from a backup file
• synchronous_commit = off can provide similar benefits for
noncritical transactions without any risk of data corruption
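The safer alternative mentioned above can be applied per session or per transaction, as a sketch:

```sql
-- Trade a small window of possible transaction loss (not corruption)
-- for faster commits, in this session only
SET synchronous_commit = off;

-- Noncritical work here commits without waiting for the WAL flush
-- (audit_log_example is a hypothetical table for illustration)
INSERT INTO audit_log_example VALUES (1, 'noncritical event');
```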
• Parallel scans - The following types of parallel table scans are supported:
• Parallel sequential scan - Allows multiple workers to perform a sequential scan
• Parallel bitmap heap scan - Allows a single index scan to dispatch parallel workers to process different areas of the heap
• Parallel index scan or parallel index-only scan - Allows B-tree index pages to be searched by separate parallel workers
• Parallel joins - Allows nested loop, hash join or merge joins to be performed in parallel
• Parallel aggregation - Allows queries with aggregations to be parallelized
• Parallel DDLs - CREATE TABLE AS SELECT, CREATE INDEX, and CREATE MATERIALIZED VIEW
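Whether a plan uses parallelism can be checked with EXPLAIN; a sketch against the office table from the earlier example (small tables will usually not trigger parallel plans, so this is purely illustrative):

```sql
-- Allow up to 4 workers per Gather node in this session
SET max_parallel_workers_per_gather = 4;

-- A parallel plan shows Gather / Parallel Seq Scan nodes
EXPLAIN SELECT count(*) FROM office;
```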
• pg_prewarm
• Can be used to load relation data into either the operating system buffer cache or into the
PostgreSQL buffer cache
• Supports prefetch method for operating system buffer cache and buffer method for PostgreSQL
buffer cache
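As a sketch, pg_prewarm ships as a contrib extension; 'buffer' loads blocks into shared_buffers, while 'prefetch' only hints the operating system cache:

```sql
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Load the emp table into the PostgreSQL buffer cache;
-- returns the number of blocks prewarmed
SELECT pg_prewarm('emp', 'buffer');

-- Ask the operating system to prefetch the table's blocks instead
SELECT pg_prewarm('emp', 'prefetch');
```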
• While inserting data using multiple INSERTs, use BEGIN at the start and COMMIT at the end
• Use COPY to load all the rows in one command, instead of using a series of
INSERT commands
• If you cannot use COPY, it might help to use PREPARE to create a prepared
INSERT statement, and then use EXECUTE as many times as required
• If you are loading a freshly created table, the fastest method is to create the
table, bulk load the table’s data using COPY, then create any indexes needed
for the table. EDB*Loader of Advanced Server is 2x faster than COPY
• It might be useful to drop foreign key constraints, load the data, and then
re-create the constraints
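The recommendations above can be sketched as follows; the file path and csv format are assumptions for illustration:

```sql
BEGIN;

-- One COPY instead of many INSERTs (path and format are examples)
COPY city FROM '/tmp/city.csv' WITH (FORMAT csv);

COMMIT;

-- For a freshly created table: load the data first, then build indexes
CREATE INDEX city_cityname_idx ON city (cityname);
```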
• VACUUM FULL
• More aggressive algorithm compared to VACUUM
• Compacts tables by writing a complete new version of the table file with no dead space
• Takes more time
• Requires extra disk space for the new copy of the table, until the operation completes
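A hedged example; note that VACUUM FULL holds an ACCESS EXCLUSIVE lock on the table while it rewrites it, blocking all other access:

```sql
-- Rewrite the table into a new file with no dead space,
-- reporting progress as it runs
VACUUM (FULL, VERBOSE) city;
```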
• Introduction
• SQL Tuning
• Performance Tuning
• Routine Maintenance Tasks
[email protected]
www.enterprisedb.com