SlideShare a Scribd company logo
1 of 41
Download to read offline
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
Don’t Do This
Jimmy Angelakos
Senior Solutions Architect
FOSDEM 2023-02-05
PostgreSQL Devroom
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
What is this talk?
• Not all-inclusive
• There is literally nothing you
cannot mess up
• Misconceptions
• Confusing things
• Common but impactful
mistakes
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
We’ll be looking at
• Bad SQL
• Improper data types
• Improper feature usage
• Performance considerations
• Security considerations
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
Bad SQL
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
NOT IN (i)
5
Doesn’t work the way you expect!
• As in: SELECT WHERE NOT IN (SELECT )
… … …
• SQL is not Python or Ruby!
– SELECT a FROM tab1 WHERE a NOT IN (1, null); returns NO rows!
– SELECT a FROM tab1 WHERE a NOT IN (SELECT b FROM tab2);
same, if any b is NULL
• Why is this bad even if no NULLs?
– Query planning / optimization
– Subplan instead of anti-join
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
NOT IN (ii)
6
What to do instead?
• Anti-join
• SELECT col
FROM tab1
WHERE NOT EXISTS
(SELECT col
FROM tab2
WHERE tab1.col = tab2.col);
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
NOT IN (iii)
7
Or:
• SELECT col
FROM tab1
LEFT JOIN tab2 USING (col)
WHERE tab2.col IS NULL;
• NOT IN is OK, if you know there are no NULLs
– e.g. excluding constants: NOT IN (1,3,5,7,11)
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
BETWEEN (i)
8
Especially with TIMESTAMPs
• BETWEEN (1 AND 100) is inclusive (closed interval)
• When is this bad?
SELECT sum(amount)
FROM transactions
WHERE transaction_timestamp
BETWEEN ('2023-02-05 00:00' AND '2023-02-06 00:00');
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
BETWEEN (ii)
9
Be explicit instead, and use:
SELECT sum(amount)
FROM transactions
WHERE transaction_timestamp >= '2023-02-05 00:00'
AND transaction_timestamp < '2023-02-06 00:00';
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Using upper case in identifiers
10
For table or column names
• Postgres makes everything lower case unless you double quote it
• CREATE TABLE Plerp ( );
…
CREATE TABLE "Quux" ( );
…
– Creates a table named plerp and one named Quux
– TABLE Plerp; works – TABLE "Plerp"; fails
– TABLE Quux; fails – TABLE "Quux"; works
– Same with column names
• For pretty column names: SELECT col FROM plerp AS "Pretty Name";
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
Improper
data types
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
TIMESTAMP (WITHOUT TIME ZONE)
12
a.k.a. naïve timestamps
• Stores a date and time with no time zone information
– Arithmetic between timestamps entered at different time zones is
meaningless and gives wrong results
• TIMESTAMPTZ (TIMESTAMP WITH TIME ZONE) stores a moment in time
– Arithmetic works correctly
– Displays in your time zone, but can display it AT TIME ZONE
• Don’t use TIMESTAMP to store UTC because the DB doesn’t know it’s UTC
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
TIMETZ
13
Or TIME WITH TIME ZONE has questionable usefulness
• Only there for SQL compliance
– Time zones in the real world have little meaning without dates
– Offset can vary with Daylight Savings
– Not possible to do arithmetic across DST boundaries
• Use TIMESTAMPTZ instead
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
CURRENT_TIME
14
Is TIMETZ. Instead use:
• CURRENT_TIMESTAMP or now() for a TIMESTAMPTZ
• LOCALTIMESTAMP for a TIMESTAMP
• CURRENT_DATE for a DATE
• LOCALTIME for a TIME
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
CHAR(n) / VARCHAR(n)
15
Padded with whitespace up to length n
• Padding spaces are ignored when comparing
– But not for pattern matching with LIKE & regular expressions!
• Actually not stored as fixed-width field!
– Can waste space storing irrelevant spaces
– Performance-wise, spend extra time stripping spaces
– Index created for CHAR(n) may not work with a TEXT parameter
• company_name VARCHAR(50) → Peterson's and Sons and Friends Bits & Parts Limited
• To restrict length, just enforce CHECK constraint
• Bottom line: just use TEXT (VARCHAR)
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
MONEY
16
Get away 🎶
• Fixed-point
– Doesn’t handle fractions of a cent, etc. – rounding may be off!
• Doesn’t store currency type, assumes server LC_MONETARY
• Accepts garbage input:
# SELECT ',123,456,,7,8.1,0,9'::MONEY;
money
----------------
£12,345,678.11
(1 row)
• Just use NUMERIC and store currency in another column
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
SERIAL
17
Used to be useful shorthand but now more trouble than it’s worth
• Non SQL Standard
• Permissions for sequence created by SERIAL need to be managed separately from the
table
• CREATE TABLE LIKE
… will use the same sequence!
• Use identity columns instead:
CREATE TABLE tab (id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
content TEXT);
• With an identity column, you don’t need to know the name of the sequence:
ALTER TABLE tab ALTER COLUMN id RESTART WITH 1000;
• BUT: if application depends on a serial sequence with no gaps (e.g. for receipt numbers),
generate that in the application
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
Improper
feature
usage
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
SQL_ASCII
19
Is not a database encoding
• No encoding conversion or validation!
– Byte values 0-127 interpreted as ASCII
– Byte values 128-255 uninterpreted
• Setting behaves differently from other character sets
• Can end up storing a mixture of encodings
– And no way to recover original strings
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
CREATE RULE
20
RULEs are not the same as TRIGGERs
• Rules don’t simply apply conditional logic
– They rewrite queries to modify or add extra queries
– All non-trivial rules will probably have unintended side-effects
– Non SQL Standard
• If you are not creating writable VIEWs, use TRIGGERs instead
• Look for Depesz’s exhaustive blog post on rules:
https://www.depesz.com/2010/06/15/to-rule-or-not-to-rule-that-is-the-question
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
CREATE TABLE ( ) INHERITS
… … (i)
21
Table inheritance
• Seemed like a good idea before ORMs…
• e.g. CREATE TABLE events (id BIGINT, many columns );
… …
CREATE TABLE meetings (scheduled_time TIMESTAMPTZ)
INHERITS events;
• Was used to implement partitioning (< PG 10)
• Incompatible with declarative partitioning (>= PG 10):
– One cannot inherit from a partitioned table
– One cannot add inheritance to a partitioned table
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
CREATE TABLE ( ) INHERITS
… … (ii)
22
How to undo table inheritance
• You can replace table inheritance with foreign key relations
• Create a new table to hold the data, and add the FK column:
CREATE TABLE new_meetings LIKE meetings;
ALTER TABLE new_meetings ADD item_id BIGINT;
• Copy data from old table into new one (may take a long time):
INSERT INTO new_meetings
SELECT *, id FROM meetings;
• Create required constraints, indexes, triggers etc. for new_meetings
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
CREATE TABLE ( ) INHERITS
… … (iii)
23
How to undo table inheritance (continued)
• Very dirty hack (if your table is huge) - create the FK but do not validate it now
to avoid the full table scan:
ALTER TABLE new_meetings
CONSTRAINT event_id_fk
FOREIGN KEY (event_id)
REFERENCES events (id)
NOT VALID;
• If doing this on a live system, create a trigger to replicate changes coming into
meetings also into new_meetings
• Normally one should not touch pg_catalog directly, but we can
UPDATE pg_constraint SET convalidated = true WHERE conname = 'event_id_fk';
as we are confident that data in FK column is valid (as exact copy of the original table)
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
CREATE TABLE ( ) INHERITS
… … (iv)
24
How to undo table inheritance (continued)
• Inside a transaction, perform all the DDL at once:
DO $$
BEGIN
ALTER TABLE meetings RENAME TO old_meetings;
ALTER TABLE new_meetings RENAME TO meetings;
DROP TABLE old_meetings;
-- IMPORTANT: Create trigger to INSERT/UPDATE/DELETE items in
-- events as they get changed in meetings - it's easy as now
-- we have the FK.
COMMIT;
END $$ LANGUAGE plpgsql;
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Partitioning by multiple keys (i)
25
Is not partitioning on multiple levels
• Be careful!
• CREATE TABLE transactions ( , location_code
… TEXT, tstamp TIMESTAMPTZ)
PARTITION BY RANGE (tstamp, location_code);
• CREATE TABLE transactions_2023_02_a
PARTITION OF transactions
FOR VALUES FROM ('2023-02-01', 'AAA') TO ('2023-03-01', 'BAA');
• CREATE TABLE transactions_2023_02_b
PARTITION OF transactions
FOR VALUES FROM ('2023-02-01', 'BAA') TO ('2023-03-01', 'BZZ');
ERROR: partition "transactions_2023_02_b" would overlap partition
"transactions_2023_02_a"
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Partitioning by multiple keys (ii)
26
Subpartitioning is what you actually need
• CREATE TABLE transactions ( , location_code
… TEXT, tstamp TIMESTAMPTZ)
PARTITION BY RANGE (tstamp);
• CREATE TABLE transactions_2023_02
PARTITION OF transactions
FOR VALUES FROM ('2023-02-01') TO ('2023-03-01')
PARTITION BY HASH (location_code);
• CREATE TABLE transactions_2023_02_p1
PARTITION OF transactions_2023_02
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
Performance
considerations
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Number of connections (i)
28
Don’t overload your server for no reason
• max_connections = 5000 🤘
• Every client connection spawns a separate backend process
– IPC via semaphores & shared memory
– Risk: CPU context switching
• Accessing the same objects from multiple connections may incur many
Lightweight Locks (LWLocks or “latches”)
– Lock becomes heavily contended, lots of lockers slow each other down
– You may be making your data hotter for no reason
– No queuing, more or less random
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Number of connections (ii)
29
Mitigation strategy
• Pre-PG 13: Snapshot contention
– Each transaction has an MVCC snapshot – even if idle!
• Contention often caused by too much concurrency
– Insert a connection pooler (e.g. PgBouncer) between application and DB
– Allow fewer connections into the DB, make the rest queue for their turn
– “Throttle” or introduce latency on the application side, to save your
server performance
●
Sounds counter-intuitive!
●
Doesn’t necessarily slow anything down – queries may execute
faster!
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
High transaction rate (i)
30
Just because you can, doesn’t mean you should
• Postgres assigns an identifier to each transaction
– Unsigned 32-bit int (4.2B values)
– Circular space, with a visibility horizon
• XID wraparound: you try to read a very old tuple that is > 2.1B XIDs in the past
• Very heavy OLTP workloads can go through 2.1B transactions in a short time
– For you, that’s the future! (invisible)
– Freezing: Flag tuple as “frozen” which is known to always be in the past
• Need to make sure FREEZE happens before XID wraparound
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
High transaction rate (ii)
31
What can you do?
• Can batching help?
– Does application really need to commit everything atomically?
– Batch size 1000 will have 1/1000th the burn rate
• Increase effectiveness of autovacuum
– More efficient FREEZE
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Turning off autovacuum (i)
32
a.k.a. the MVCC maintenance operation. Yeah, don’t.
• Removes dead tuples, freezes tuples (among other things)
• Has overhead
– Scans tables & indexes
– Needs, obtains, and waits for locks
– Has limited capacity by default
• People are concerned about overhead
– Alternative is worse! You can’t avoid VACUUM in Postgres (yet).
– You can outrun it (and then you’ll need VACUUM FULL)
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Turning off autovacuum (ii)
33
For most production workloads, defaults are too low
• Make it work harder to avoid problems
• Increase potency via:
– maintenance_work_mem (1GB is good)
– autovacuum_max_workers
– autovacuum_vacuum_cost_delay / autovacuum_vacuum_cost_limit
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Explicit locking (i)
34
a.k.a. heavyweight locks
• Table-level (e.g. SHARE) or row-level (e.g. FOR UPDATE)
• Conflict with other lock modes (e.g. ACCESS EXCLUSIVE with ROW
EXCLUSIVE)
• Block read/write access totally leading to waits
• Disastrous for performance
– Unless your application is exquisitely crafted
– Hint: it isn’t
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Explicit locking (ii)
35
Lock contention: waiting for explicit locks
• Avoid explicit locking!
• Use SSI (Serializable Snapshot Isolation, SERIALIZABLE isolation level)
• Make application tolerant
– Allow it to fail and retry
• Slightly reduced concurrency, but:
– No blocking, no explicit locks needed (SIReadLocks, rw-conflicts)
– Best performance choice for some application types
2022 Copyright © EnterpriseDB Corporation All Rights Reserved
Security
considerations
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
psql --W or --password
37
Request password before attempting connection
• It will ask for a password even if the server doesn’t require one
• Unnecessary: psql will always ask for a password if required by server
• Insecure: You may think you’re logging in with a password
– But the server may be in trust mode and letting you in anyhow
– Also, you may be entering the wrong password and still getting in
– From a different client, you may get a surprise!
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Come In WE’RE OPEN
listen_addresses = "*"
38
Listening for connections from clients
• There’s a reason the default is
'localhost' (only TCP/IP loopback)
• Make sure you only enable the
interfaces and networks which you
actually want to have access to the
database server
• e.g. Internet connection on one network
& private network on another interface
• Don’t advertise your presence:
3600000 MySQL/MariaDB servers
(port 3306) found exposed on the
Internet in May 2022
“We’re open” by Enrico Donelli is licensed under CC BY-ND 2.0.
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
No door
pg_hba.conf → trust
39
Host-Based Authentication
• Called that for a reason, i.e. configuring
with host … like:
host mydb myuser 10.10.10.10/32 md5
• trust with host(ssl) is a Very Bad Idea™
– Even for local e.g. improper user
can connect to the DB
– Postgres might be fine, but
other software on the same
server could be compromised
• Default to giving access only where
strictly necessary (better safe...)
“Broken door” by Arno Volkers is licensed under CC BY-NC-ND 2.0.
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Database owned by superuser
40
Do you really need to?
• Use superuser only for management of global objects
– Such as users
– Good security practice
• Superuser bypasses a lot of checks
• (Bad) code that’s normally harmless could be exploited in harmful way
with superuser access
• Try to restrict database ownership to standard users
2023 Copyright © EnterpriseDB Corporation All Rights Reserved
Find me on
Find me on Mastodon: @vyruss@fosstodon.org
Mastodon: @vyruss@fosstodon.org
Photo: “The Devil’s Beef Tub”, Scotland
Photo: “The Devil’s Beef Tub”, Scotland
Thank you!
Thank you!

More Related Content

Similar to Don't Do This [FOSDEM 2023]

2 designing tables
2 designing tables2 designing tables
2 designing tablesRam Kedem
 
DNN Database Tips & Tricks
DNN Database Tips & TricksDNN Database Tips & Tricks
DNN Database Tips & TricksWill Strohl
 
How to teach an elephant to rock'n'roll
How to teach an elephant to rock'n'rollHow to teach an elephant to rock'n'roll
How to teach an elephant to rock'n'rollPGConf APAC
 
Developers' mDay 2017. - Bogdan Kecman Oracle
Developers' mDay 2017. - Bogdan Kecman OracleDevelopers' mDay 2017. - Bogdan Kecman Oracle
Developers' mDay 2017. - Bogdan Kecman OraclemCloud
 
Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0
Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0
Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0mCloud
 
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL ServerGeek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL ServerIDERA Software
 
You might be paying too much for BigQuery
You might be paying too much for BigQueryYou might be paying too much for BigQuery
You might be paying too much for BigQueryRyuji Tamagawa
 
IDUG NA 2014 / 11 tips for DB2 11 for z/OS
IDUG NA 2014 / 11 tips for DB2 11 for z/OSIDUG NA 2014 / 11 tips for DB2 11 for z/OS
IDUG NA 2014 / 11 tips for DB2 11 for z/OSCuneyt Goksu
 
Oracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the IndicesOracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the IndicesKellyn Pot'Vin-Gorman
 
Aioug vizag oracle12c_new_features
Aioug vizag oracle12c_new_featuresAioug vizag oracle12c_new_features
Aioug vizag oracle12c_new_featuresAiougVizagChapter
 
Cloudera Impala technical deep dive
Cloudera Impala technical deep diveCloudera Impala technical deep dive
Cloudera Impala technical deep divehuguk
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Altinity Ltd
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseAltinity Ltd
 
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1sunildupakuntla
 
PostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesPostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesInMobi Technology
 
GLOC 2014 NEOOUG - Oracle Database 12c New Features
GLOC 2014 NEOOUG - Oracle Database 12c New FeaturesGLOC 2014 NEOOUG - Oracle Database 12c New Features
GLOC 2014 NEOOUG - Oracle Database 12c New FeaturesBiju Thomas
 
PostgreSQL 13 is Coming - Find Out What's New!
PostgreSQL 13 is Coming - Find Out What's New!PostgreSQL 13 is Coming - Find Out What's New!
PostgreSQL 13 is Coming - Find Out What's New!EDB
 

Similar to Don't Do This [FOSDEM 2023] (20)

2 designing tables
2 designing tables2 designing tables
2 designing tables
 
DNN Database Tips & Tricks
DNN Database Tips & TricksDNN Database Tips & Tricks
DNN Database Tips & Tricks
 
MS SQL Server
MS SQL ServerMS SQL Server
MS SQL Server
 
How to teach an elephant to rock'n'roll
How to teach an elephant to rock'n'rollHow to teach an elephant to rock'n'roll
How to teach an elephant to rock'n'roll
 
Rdbms day3
Rdbms day3Rdbms day3
Rdbms day3
 
Developers' mDay 2017. - Bogdan Kecman Oracle
Developers' mDay 2017. - Bogdan Kecman OracleDevelopers' mDay 2017. - Bogdan Kecman Oracle
Developers' mDay 2017. - Bogdan Kecman Oracle
 
Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0
Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0
Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0
 
MariaDB Temporal Tables
MariaDB Temporal TablesMariaDB Temporal Tables
MariaDB Temporal Tables
 
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL ServerGeek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
Geek Sync I Need for Speed: In-Memory Databases in Oracle and SQL Server
 
You might be paying too much for BigQuery
You might be paying too much for BigQueryYou might be paying too much for BigQuery
You might be paying too much for BigQuery
 
IDUG NA 2014 / 11 tips for DB2 11 for z/OS
IDUG NA 2014 / 11 tips for DB2 11 for z/OSIDUG NA 2014 / 11 tips for DB2 11 for z/OS
IDUG NA 2014 / 11 tips for DB2 11 for z/OS
 
Oracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the IndicesOracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the Indices
 
Aioug vizag oracle12c_new_features
Aioug vizag oracle12c_new_featuresAioug vizag oracle12c_new_features
Aioug vizag oracle12c_new_features
 
Cloudera Impala technical deep dive
Cloudera Impala technical deep diveCloudera Impala technical deep dive
Cloudera Impala technical deep dive
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
 
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
 
PostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesPostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major Features
 
GLOC 2014 NEOOUG - Oracle Database 12c New Features
GLOC 2014 NEOOUG - Oracle Database 12c New FeaturesGLOC 2014 NEOOUG - Oracle Database 12c New Features
GLOC 2014 NEOOUG - Oracle Database 12c New Features
 
PostgreSQL 13 is Coming - Find Out What's New!
PostgreSQL 13 is Coming - Find Out What's New!PostgreSQL 13 is Coming - Find Out What's New!
PostgreSQL 13 is Coming - Find Out What's New!
 

More from Jimmy Angelakos

Slow things down to make them go faster [FOSDEM 2022]
Slow things down to make them go faster [FOSDEM 2022]Slow things down to make them go faster [FOSDEM 2022]
Slow things down to make them go faster [FOSDEM 2022]Jimmy Angelakos
 
Changing your huge table's data types in production
Changing your huge table's data types in productionChanging your huge table's data types in production
Changing your huge table's data types in productionJimmy Angelakos
 
The State of (Full) Text Search in PostgreSQL 12
The State of (Full) Text Search in PostgreSQL 12The State of (Full) Text Search in PostgreSQL 12
The State of (Full) Text Search in PostgreSQL 12Jimmy Angelakos
 
Deploying PostgreSQL on Kubernetes
Deploying PostgreSQL on KubernetesDeploying PostgreSQL on Kubernetes
Deploying PostgreSQL on KubernetesJimmy Angelakos
 
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph DatabaseBringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph DatabaseJimmy Angelakos
 
Using PostgreSQL with Bibliographic Data
Using PostgreSQL with Bibliographic DataUsing PostgreSQL with Bibliographic Data
Using PostgreSQL with Bibliographic DataJimmy Angelakos
 
Eισαγωγή στην PostgreSQL - Χρήση σε επιχειρησιακό περιβάλλον
Eισαγωγή στην PostgreSQL - Χρήση σε επιχειρησιακό περιβάλλονEισαγωγή στην PostgreSQL - Χρήση σε επιχειρησιακό περιβάλλον
Eισαγωγή στην PostgreSQL - Χρήση σε επιχειρησιακό περιβάλλονJimmy Angelakos
 
PostgreSQL: Mέθοδοι για Data Replication
PostgreSQL: Mέθοδοι για Data ReplicationPostgreSQL: Mέθοδοι για Data Replication
PostgreSQL: Mέθοδοι για Data ReplicationJimmy Angelakos
 

More from Jimmy Angelakos (8)

Slow things down to make them go faster [FOSDEM 2022]
Slow things down to make them go faster [FOSDEM 2022]Slow things down to make them go faster [FOSDEM 2022]
Slow things down to make them go faster [FOSDEM 2022]
 
Changing your huge table's data types in production
Changing your huge table's data types in productionChanging your huge table's data types in production
Changing your huge table's data types in production
 
The State of (Full) Text Search in PostgreSQL 12
The State of (Full) Text Search in PostgreSQL 12The State of (Full) Text Search in PostgreSQL 12
The State of (Full) Text Search in PostgreSQL 12
 
Deploying PostgreSQL on Kubernetes
Deploying PostgreSQL on KubernetesDeploying PostgreSQL on Kubernetes
Deploying PostgreSQL on Kubernetes
 
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph DatabaseBringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
 
Using PostgreSQL with Bibliographic Data
Using PostgreSQL with Bibliographic DataUsing PostgreSQL with Bibliographic Data
Using PostgreSQL with Bibliographic Data
 
Eισαγωγή στην PostgreSQL - Χρήση σε επιχειρησιακό περιβάλλον
Eισαγωγή στην PostgreSQL - Χρήση σε επιχειρησιακό περιβάλλονEισαγωγή στην PostgreSQL - Χρήση σε επιχειρησιακό περιβάλλον
Eισαγωγή στην PostgreSQL - Χρήση σε επιχειρησιακό περιβάλλον
 
PostgreSQL: Mέθοδοι για Data Replication
PostgreSQL: Mέθοδοι για Data ReplicationPostgreSQL: Mέθοδοι για Data Replication
PostgreSQL: Mέθοδοι για Data Replication
 

Recently uploaded

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 

Recently uploaded (20)

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 

Don't Do This [FOSDEM 2023]

  • 1. 2022 Copyright © EnterpriseDB Corporation All Rights Reserved Don’t Do This Jimmy Angelakos Senior Solutions Architect FOSDEM 2023-02-05 PostgreSQL Devroom
  • 2. 2022 Copyright © EnterpriseDB Corporation All Rights Reserved 2022 Copyright © EnterpriseDB Corporation All Rights Reserved What is this talk? • Not all-inclusive • There is literally nothing you cannot mess up • Misconceptions • Confusing things • Common but impactful mistakes
  • 3. 2022 Copyright © EnterpriseDB Corporation All Rights Reserved 2022 Copyright © EnterpriseDB Corporation All Rights Reserved We’ll be looking at • Bad SQL • Improper data types • Improper feature usage • Performance considerations • Security considerations
  • 4. 2022 Copyright © EnterpriseDB Corporation All Rights Reserved Bad SQL
  • 5. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved NOT IN (i) 5 Doesn’t work the way you expect! • As in: SELECT WHERE NOT IN (SELECT ) … … … • SQL is not Python or Ruby! – SELECT a FROM tab1 WHERE a NOT IN (1, null); returns NO rows! – SELECT a FROM tab1 WHERE a NOT IN (SELECT b FROM tab2); same, if any b is NULL • Why is this bad even if no NULLs? – Query planning / optimization – Subplan instead of anti-join
  • 6. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved NOT IN (ii) 6 What to do instead? • Anti-join • SELECT col FROM tab1 WHERE NOT EXISTS (SELECT col FROM tab2 WHERE tab1.col = tab2.col);
  • 7. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved NOT IN (iii) 7 Or: • SELECT col FROM tab1 LEFT JOIN tab2 USING (col) WHERE tab2.col IS NULL; • NOT IN is OK, if you know there are no NULLs – e.g. excluding constants: NOT IN (1,3,5,7,11)
  • 8. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved BETWEEN (i) 8 Especially with TIMESTAMPs • BETWEEN (1 AND 100) is inclusive (closed interval) • When is this bad? SELECT sum(amount) FROM transactions WHERE transaction_timestamp BETWEEN ('2023-02-05 00:00' AND '2023-02-06 00:00');
  • 9. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved BETWEEN (ii) 9 Be explicit instead, and use: SELECT sum(amount) FROM transactions WHERE transaction_timestamp >= '2023-02-05 00:00' AND transaction_timestamp < '2023-02-06 00:00';
  • 10. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Using upper case in identifiers 10 For table or column names • Postgres makes everything lower case unless you double quote it • CREATE TABLE Plerp ( ); … CREATE TABLE "Quux" ( ); … – Creates a table named plerp and one named Quux – TABLE Plerp; works – TABLE "Plerp"; fails – TABLE Quux; fails – TABLE "Quux"; works – Same with column names • For pretty column names: SELECT col FROM plerp AS "Pretty Name";
  • 11. 2022 Copyright © EnterpriseDB Corporation All Rights Reserved Improper data types
  • 12. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved TIMESTAMP (WITHOUT TIME ZONE) 12 a.k.a. naïve timestamps • Stores a date and time with no time zone information – Arithmetic between timestamps entered at different time zones is meaningless and gives wrong results • TIMESTAMPTZ (TIMESTAMP WITH TIME ZONE) stores a moment in time – Arithmetic works correctly – Displays in your time zone, but can display it AT TIME ZONE • Don’t use TIMESTAMP to store UTC because the DB doesn’t know it’s UTC
  • 13. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved TIMETZ 13 Or TIME WITH TIME ZONE has questionable usefulness • Only there for SQL compliance – Time zones in the real world have little meaning without dates – Offset can vary with Daylight Savings – Not possible to do arithmetic across DST boundaries • Use TIMESTAMPTZ instead
  • 14. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved CURRENT_TIME 14 Is TIMETZ. Instead use: • CURRENT_TIMESTAMP or now() for a TIMESTAMPTZ • LOCALTIMESTAMP for a TIMESTAMP • CURRENT_DATE for a DATE • LOCALTIME for a TIME
  • 15. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved CHAR(n) / VARCHAR(n) 15 Padded with whitespace up to length n • Padding spaces are ignored when comparing – But not for pattern matching with LIKE & regular expressions! • Actually not stored as fixed-width field! – Can waste space storing irrelevant spaces – Performance-wise, spend extra time stripping spaces – Index created for CHAR(n) may not work with a TEXT parameter • company_name VARCHAR(50) → Peterson's and Sons and Friends Bits & Parts Limited • To restrict length, just enforce CHECK constraint • Bottom line: just use TEXT (VARCHAR)
  • 16. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved MONEY 16 Get away 🎶 • Fixed-point – Doesn’t handle fractions of a cent, etc. – rounding may be off! • Doesn’t store currency type, assumes server LC_MONETARY • Accepts garbage input: # SELECT ',123,456,,7,8.1,0,9'::MONEY; money ---------------- £12,345,678.11 (1 row) • Just use NUMERIC and store currency in another column
  • 17. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved SERIAL 17 Used to be useful shorthand but now more trouble than it’s worth • Non SQL Standard • Permissions for sequence created by SERIAL need to be managed separately from the table • CREATE TABLE LIKE … will use the same sequence! • Use identity columns instead: CREATE TABLE tab (id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY, content TEXT); • With an identity column, you don’t need to know the name of the sequence: ALTER TABLE tab ALTER COLUMN id RESTART WITH 1000; • BUT: if application depends on a serial sequence with no gaps (e.g. for receipt numbers), generate that in the application
  • 18. 2022 Copyright © EnterpriseDB Corporation All Rights Reserved Improper feature usage
  • 19. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved SQL_ASCII 19 Is not a database encoding • No encoding conversion or validation! – Byte values 0-127 interpreted as ASCII – Byte values 128-255 uninterpreted • Setting behaves differently from other character sets • Can end up storing a mixture of encodings – And no way to recover original strings
  • 20. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved CREATE RULE 20 RULEs are not the same as TRIGGERs • Rules don’t simply apply conditional logic – They rewrite queries to modify or add extra queries – All non-trivial rules will probably have unintended side-effects – Non SQL Standard • If you are not creating writable VIEWs, use TRIGGERs instead • Look for Depesz’s exhaustive blog post on rules: https://www.depesz.com/2010/06/15/to-rule-or-not-to-rule-that-is-the-question
  • 21. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved CREATE TABLE ( ) INHERITS … … (i) 21 Table inheritance • Seemed like a good idea before ORMs… • e.g. CREATE TABLE events (id BIGINT, many columns ); … … CREATE TABLE meetings (scheduled_time TIMESTAMPTZ) INHERITS events; • Was used to implement partitioning (< PG 10) • Incompatible with declarative partitioning (>= PG 10): – One cannot inherit from a partitioned table – One cannot add inheritance to a partitioned table
  • 22. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved CREATE TABLE ( ) INHERITS … … (ii) 22 How to undo table inheritance • You can replace table inheritance with foreign key relations • Create a new table to hold the data, and add the FK column: CREATE TABLE new_meetings LIKE meetings; ALTER TABLE new_meetings ADD item_id BIGINT; • Copy data from old table into new one (may take a long time): INSERT INTO new_meetings SELECT *, id FROM meetings; • Create required constraints, indexes, triggers etc. for new_meetings
  • 23. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved CREATE TABLE ( ) INHERITS … … (iii) 23 How to undo table inheritance (continued) • Very dirty hack (if your table is huge) - create the FK but do not validate it now to avoid the full table scan: ALTER TABLE new_meetings CONSTRAINT event_id_fk FOREIGN KEY (event_id) REFERENCES events (id) NOT VALID; • If doing this on a live system, create a trigger to replicate changes coming into meetings also into new_meetings • Normally one should not touch pg_catalog directly, but we can UPDATE pg_constraint SET convalidated = true WHERE conname = 'event_id_fk'; as we are confident that data in FK column is valid (as exact copy of the original table)
  • 24. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved CREATE TABLE ( ) INHERITS … … (iv) 24 How to undo table inheritance (continued) • Inside a transaction, perform all the DDL at once: DO $$ BEGIN ALTER TABLE meetings RENAME TO old_meetings; ALTER TABLE new_meetings RENAME TO meetings; DROP TABLE old_meetings; -- IMPORTANT: Create trigger to INSERT/UPDATE/DELETE items in -- events as they get changed in meetings - it's easy as now -- we have the FK. COMMIT; END $$ LANGUAGE plpgsql;
  • 25. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Partitioning by multiple keys (i) 25 Is not partitioning on multiple levels • Be careful! • CREATE TABLE transactions ( , location_code … TEXT, tstamp TIMESTAMPTZ) PARTITION BY RANGE (tstamp, location_code); • CREATE TABLE transactions_2023_02_a PARTITION OF transactions FOR VALUES FROM ('2023-02-01', 'AAA') TO ('2023-03-01', 'BAA'); • CREATE TABLE transactions_2023_02_b PARTITION OF transactions FOR VALUES FROM ('2023-02-01', 'BAA') TO ('2023-03-01', 'BZZ'); ERROR: partition "transactions_2023_02_b" would overlap partition "transactions_2023_02_a"
  • 26. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Partitioning by multiple keys (ii) 26 Subpartitioning is what you actually need • CREATE TABLE transactions ( , location_code … TEXT, tstamp TIMESTAMPTZ) PARTITION BY RANGE (tstamp); • CREATE TABLE transactions_2023_02 PARTITION OF transactions FOR VALUES FROM ('2023-02-01') TO ('2023-03-01') PARTITION BY HASH (location_code); • CREATE TABLE transactions_2023_02_p1 PARTITION OF transactions_2023_02 FOR VALUES WITH (MODULUS 4, REMAINDER 0);
  • 27. 2022 Copyright © EnterpriseDB Corporation All Rights Reserved Performance considerations
  • 28. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Number of connections (i) 28 Don’t overload your server for no reason • max_connections = 5000 🤘 • Every client connection spawns a separate backend process – IPC via semaphores & shared memory – Risk: CPU context switching • Accessing the same objects from multiple connections may incur many Lightweight Locks (LWLocks or “latches”) – Lock becomes heavily contended, lots of lockers slow each other down – You may be making your data hotter for no reason – No queuing, more or less random
  • 29. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Number of connections (ii) 29 Mitigation strategy • Pre-PG 13: Snapshot contention – Each transaction has an MVCC snapshot – even if idle! • Contention often caused by too much concurrency – Insert a connection pooler (e.g. PgBouncer) between application and DB – Allow fewer connections into the DB, make the rest queue for their turn – “Throttle” or introduce latency on the application side, to save your server performance ● Sounds counter-intuitive! ● Doesn’t necessarily slow anything down – queries may execute faster!
  • 30. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved High transaction rate (i) 30 Just because you can, doesn’t mean you should • Postgres assigns an identifier to each transaction – Unsigned 32-bit int (4.2B values) – Circular space, with a visibility horizon • XID wraparound: you try to read a very old tuple that is > 2.1B XIDs in the past • Very heavy OLTP workloads can go through 2.1B transactions in a short time – For you, that’s the future! (invisible) – Freezing: Flag tuple as “frozen” which is known to always be in the past • Need to make sure FREEZE happens before XID wraparound
  • 31. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved High transaction rate (ii) 31 What can you do? • Can batching help? – Does application really need to commit everything atomically? – Batch size 1000 will have 1/1000th the burn rate • Increase effectiveness of autovacuum – More efficient FREEZE
  • 32. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Turning off autovacuum (i) 32 a.k.a. the MVCC maintenance operation. Yeah, don’t. • Removes dead tuples, freezes tuples (among other things) • Has overhead – Scans tables & indexes – Needs, obtains, and waits for locks – Has limited capacity by default • People are concerned about overhead – Alternative is worse! You can’t avoid VACUUM in Postgres (yet). – You can outrun it (and then you’ll need VACUUM FULL)
  • 33. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Turning off autovacuum (ii) 33 For most production workloads, defaults are too low • Make it work harder to avoid problems • Increase potency via: – maintenance_work_mem (1GB is good) – autovacuum_max_workers – autovacuum_vacuum_cost_delay / autovacuum_vacuum_cost_limit
  • 34. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Explicit locking (i) 34 a.k.a. heavyweight locks • Table-level (e.g. SHARE) or row-level (e.g. FOR UPDATE) • Conflict with other lock modes (e.g. ACCESS EXCLUSIVE with ROW EXCLUSIVE) • Block read/write access totally leading to waits • Disastrous for performance – Unless your application is exquisitely crafted – Hint: it isn’t
  • 35. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Explicit locking (ii) 35 Lock contention: waiting for explicit locks • Avoid explicit locking! • Use SSI (Serializable Snapshot Isolation, SERIALIZABLE isolation level) • Make application tolerant – Allow it to fail and retry • Slightly reduced concurrency, but: – No blocking, no explicit locks needed (SIReadLocks, rw-conflicts) – Best performance choice for some application types
  • 36. 2022 Copyright © EnterpriseDB Corporation All Rights Reserved Security considerations
  • 37. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved psql --W or --password 37 Request password before attempting connection • It will ask for a password even if the server doesn’t require one • Unnecessary: psql will always ask for a password if required by server • Insecure: You may think you’re logging in with a password – But the server may be in trust mode and letting you in anyhow – Also, you may be entering the wrong password and still getting in – From a different client, you may get a surprise!
  • 38. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Come In WE’RE OPEN listen_addresses = "*" 38 Listening for connections from clients • There’s a reason the default is 'localhost' (only TCP/IP loopback) • Make sure you only enable the interfaces and networks which you actually want to have access to the database server • e.g. Internet connection on one network & private network on another interface • Don’t advertise your presence: 3600000 MySQL/MariaDB servers (port 3306) found exposed on the Internet in May 2022 “We’re open” by Enrico Donelli is licensed under CC BY-ND 2.0.
  • 39. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved No door pg_hba.conf → trust 39 Host-Based Authentication • Called that for a reason, i.e. configuring with host … like: host mydb myuser 10.10.10.10/32 md5 • trust with host(ssl) is a Very Bad Idea™ – Even for local e.g. improper user can connect to the DB – Postgres might be fine, but other software on the same server could be compromised • Default to giving access only where strictly necessary (better safe...) “Broken door” by Arno Volkers is licensed under CC BY-NC-ND 2.0.
  • 40. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Database owned by superuser 40 Do you really need to? • Use superuser only for management of global objects – Such as users – Good security practice • Superuser bypasses a lot of checks • (Bad) code that’s normally harmless could be exploited in harmful way with superuser access • Try to restrict database ownership to standard users
  • 41. 2023 Copyright © EnterpriseDB Corporation All Rights Reserved Find me on Find me on Mastodon: @vyruss@fosstodon.org Mastodon: @vyruss@fosstodon.org Photo: “The Devil’s Beef Tub”, Scotland Photo: “The Devil’s Beef Tub”, Scotland Thank you! Thank you!