PostgreSQL 15.10 commit log

Stamp 15.10.

commit   : a4bd20b6d7f9d42750b797c450592f55d5374c1f    
  
author   : Tom Lane <[email protected]>    
date     : Mon, 18 Nov 2024 15:35:15 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Mon, 18 Nov 2024 15:35:15 -0500    

M configure
M configure.ac

Fix recently-exposed portability issue in regex optimization.

commit   : 6ab39c02747c33173e5e33291e66cebbdbc75d82    
  
author   : Tom Lane <[email protected]>    
date     : Sun, 17 Nov 2024 14:14:06 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Sun, 17 Nov 2024 14:14:06 -0500    

fixempties() counts the number of in-arcs in the regex NFA and then  
allocates an array of that many arc pointers.  If the NFA contains no  
arcs, this amounts to malloc(0) for which some platforms return NULL.  
The code mistakenly treats that as indicating out-of-memory.  Thus,  
we can get a bogus "out of memory" failure for some unsatisfiable  
regexes.  
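
A minimal sketch of the failure mode and one way to guard against it, assuming a hypothetical helper named alloc_pointer_array(); this is not the actual code in regc_nfa.c. The C standard allows malloc(0) to return either NULL or a unique pointer, so a caller that equates NULL with out-of-memory must avoid asking for zero bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not PostgreSQL code): allocate an array of nelems
 * pointers without ever asking malloc() for zero bytes.
 */
void **
alloc_pointer_array(size_t nelems)
{
    /* malloc(0) may legally return NULL, so request at least one slot. */
    size_t n = (nelems > 0) ? nelems : 1;

    return malloc(n * sizeof(void *));
}

/*
 * Returns 1 if the allocation succeeded.  Because the helper never passes
 * a zero size, a NULL result here can be treated as a real failure.
 */
int
pointer_array_alloc_ok(size_t nelems)
{
    void **arr = alloc_pointer_array(nelems);
    int ok = (arr != NULL);

    free(arr);
    return ok;
}
```

With this guard, an NFA with zero in-arcs would produce a small non-empty allocation rather than a platform-dependent NULL.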
  
This happens only in v15 and earlier, since bea3d7e38 switched to  
using palloc() rather than bare malloc().  And, at least among the  
platforms in the buildfarm, only AIX seems to return NULL.  So the  
impact is pretty narrow.  But I don't especially want to ship code  
that is failing its own regression tests, so let's fix this for  
this week's releases.  
  
A quick code survey says that there is only the one place in  
src/backend/regex/ that is at risk of doing malloc(0), so we'll just  
band-aid that place.  A more future-proof fix could be to install a  
malloc() wrapper similar to pg_malloc().  But this code seems unlikely  
to change much more in the affected branches, so that's probably  
overkill.  
  
The only known test case for this involves a complemented character  
class in a bracket expression, for example [^\s\S], and we didn't  
support that in v13.  So it may be that the problem is unreachable  
in v13.  But I'm not 100% sure of that, so patch v13 too.  
  
Discussion: https://postgr.es/m/[email protected]  

M src/backend/regex/regc_nfa.c

Release notes for 17.2, 16.6, 15.10, 14.15, 13.18, 12.22.

commit   : b57d9d2e5d5543cc0c4b2de70d65d7b7c4115da6    
  
author   : Tom Lane <[email protected]>    
date     : Sat, 16 Nov 2024 17:09:53 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Sat, 16 Nov 2024 17:09:53 -0500    

M doc/src/sgml/release-15.sgml

Undo unintentional ABI break in struct ResultRelInfo.

commit   : 17db248f318f09b143af208fdcc1f067b3b0b2cb    
  
author   : Tom Lane <[email protected]>    
date     : Sat, 16 Nov 2024 12:58:26 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Sat, 16 Nov 2024 12:58:26 -0500    

Commits aac2c9b4f et al. added a bool field to struct ResultRelInfo.  
That's no problem in the master branch, but in released branches  
care must be taken when modifying publicly-visible structs to avoid  
an ABI break for extensions.  Frequently we solve that by adding the  
new field at the end of the struct, and that's what was done here.  
But ResultRelInfo has stricter constraints than just about any other  
node type in Postgres.  Some executor APIs require extensions to index  
into arrays of ResultRelInfo, which means that any change whatever in  
sizeof(ResultRelInfo) causes a fatal ABI break.  
  
Fortunately, this is easy to fix, because the new field can be  
squeezed into available padding space instead --- indeed, that's where  
it was put in master, so this fix also removes a cross-branch coding  
variation.  
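
The difference between appending a field and placing it in padding can be sketched with toy structs. The struct names and fields below are hypothetical stand-ins, not the real ResultRelInfo, and the size relationships assume a typical ABI in which a pointer is aligned more strictly than a bool:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical released layout: padding follows "flag" before "ptr". */
struct OriginalLayout
{
    int   index;
    bool  flag;
    void *ptr;
};

/* New field appended at the end: the struct grows. */
struct AppendedLayout
{
    int   index;
    bool  flag;
    void *ptr;
    bool  new_flag;
};

/* New field placed in the existing padding: the size is unchanged. */
struct PaddedLayout
{
    int   index;
    bool  flag;
    bool  new_flag;
    void *ptr;
};

/*
 * Byte offset of element i in an array, as computed by code that was
 * compiled against a particular element size.
 */
size_t
element_offset(size_t elem_size, size_t i)
{
    return elem_size * i;
}
```

An extension compiled against OriginalLayout computes element offsets with the old size, so it stays correct against PaddedLayout but would read the wrong bytes for every element after the first if the server switched to AppendedLayout.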
  
Per report from Pavan Deolasee.  Patch v14-v17 only; earlier versions  
did not gain the extra field, nor is there any problem in master.  
  
Discussion: https://postgr.es/m/CABOikdNmVBC1LL6pY26dyxAS2f+gLZvTsNt=2XbcyG7WxXVBBQ@mail.gmail.com  

M src/include/nodes/execnodes.h

Fix per-session activation of ALTER {ROLE|DATABASE} SET role.

commit   : edf80895f6bda824403f843df91cbc83890e4b6c    
  
author   : Noah Misch <[email protected]>    
date     : Fri, 15 Nov 2024 20:39:56 -0800    
  
committer: Noah Misch <[email protected]>    
date     : Fri, 15 Nov 2024 20:39:56 -0800    

After commit 5a2fed911a85ed6d8a015a6bafe3a0d9a69334ae, the catalog state  
resulting from these commands ceased to affect sessions.  Restore the  
longstanding behavior, which is like beginning the session with a SET  
ROLE command.  If cherry-picking the CVE-2024-10978 fixes, default to  
including this, too.  (This fixes an unintended side effect of fixing  
CVE-2024-10978.)  Back-patch to v12, like that commit.  The release team  
decided to include v12, despite the original intent to halt v12 commits  
earlier this week.  
  
Tom Lane and Noah Misch.  Reported by Etienne LAFARGE.  
  
Discussion: https://postgr.es/m/CADOZwSb0UsEr4_UTFXC5k7=fyyK8uKXekucd+-uuGjJsGBfxgw@mail.gmail.com  

M src/backend/utils/init/miscinit.c
M src/backend/utils/misc/guc.c
M src/test/modules/unsafe_tests/Makefile
A src/test/modules/unsafe_tests/expected/setconfig.out
A src/test/modules/unsafe_tests/sql/setconfig.sql

Fix a possibility of logical replication slot's restart_lsn going backwards.

commit   : 91771b3fbbc33e066e9a28a7d85bde87f5a0c900    
  
author   : Masahiko Sawada <[email protected]>    
date     : Fri, 15 Nov 2024 17:06:02 -0800    
  
committer: Masahiko Sawada <[email protected]>    
date     : Fri, 15 Nov 2024 17:06:02 -0800    

Previously LogicalIncreaseRestartDecodingForSlot() accidentally  
accepted any LSN as the candidate_lsn and candidate_valid after the  
restart_lsn of the replication slot was updated, so it potentially  
caused the restart_lsn to move backwards.  
  
A scenario where this could happen in logical replication is: after a  
logical replication restart, based on previous candidate_lsn and  
candidate_valid values in memory, the restart_lsn advances upon  
receiving a subscriber acknowledgment. Then, logical decoding restarts  
from an older point, setting candidate_lsn and candidate_valid based  
on an old RUNNING_XACTS record. Subsequent subscriber acknowledgments  
then update the restart_lsn to an LSN older than the current value.  
  
In the reported case, after WAL files were removed by a checkpoint,  
the retreated restart_lsn prevented logical replication from  
restarting due to missing WAL segments.  
  
This change essentially turns an 'if' condition into an 'else if'  
condition within the function. The previous code had an asymmetry in  
this regard compared to LogicalIncreaseXminForSlot(), which does  
almost the same thing for different fields.  
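
The effect of 'if' versus 'else if' can be illustrated with a heavily simplified, single-threaded model. SlotModel, propose_restart_lsn() and confirm_flush() are hypothetical names invented for this sketch; the real function takes locks and handles more cases. The use_else_if flag selects between the old and new control flow:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;
#define INVALID_LSN ((Lsn) 0)

/* Hypothetical, simplified stand-in for a replication slot. */
typedef struct SlotModel
{
    Lsn restart_lsn;
    Lsn confirmed_flush;
    Lsn candidate_valid;    /* flush position at which the candidate applies */
    Lsn candidate_lsn;      /* proposed new restart_lsn */
} SlotModel;

void
propose_restart_lsn(SlotModel *slot, Lsn current_lsn, Lsn restart_lsn,
                    bool use_else_if)
{
    bool handled = false;

    if (restart_lsn <= slot->restart_lsn)
        handled = true;     /* stale proposal: meant to be ignored */
    else if (current_lsn <= slot->confirmed_flush)
    {
        /* Already flushed far enough: candidate is directly usable. */
        slot->candidate_valid = current_lsn;
        slot->candidate_lsn = restart_lsn;
        handled = true;
    }

    /* With 'else if', the last branch runs only if no earlier one matched. */
    if (use_else_if && handled)
        return;

    if (slot->candidate_valid == INVALID_LSN)
    {
        slot->candidate_valid = current_lsn;
        slot->candidate_lsn = restart_lsn;
    }
}

/* Subscriber acknowledgment: apply the candidate once it becomes valid. */
void
confirm_flush(SlotModel *slot, Lsn flushed)
{
    slot->confirmed_flush = flushed;
    if (slot->candidate_valid != INVALID_LSN &&
        slot->candidate_valid <= flushed)
    {
        slot->restart_lsn = slot->candidate_lsn;
        slot->candidate_valid = INVALID_LSN;
        slot->candidate_lsn = INVALID_LSN;
    }
}
```

In this model, a stale proposal made with use_else_if set to false is stored as a candidate and later moves restart_lsn backwards on acknowledgment; with use_else_if set to true the stale proposal is dropped and restart_lsn stays put.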
  
The WAL removal issue was reported by Hubert Depesz Lubaczewski.  
  
Backpatch to all supported versions, as the bug has existed since 9.4,  
where logical decoding was introduced.  
  
Reviewed-by: Tomas Vondra, Ashutosh Bapat, Amit Kapila  
Discussion: https://postgr.es/m/Yz2hivgyjS1RfMKs%40depesz.com  
Discussion: https://postgr.es/m/85fff40e-148b-4e86-b921-b4b846289132%40vondra.me  
Backpatch-through: 13  

M src/backend/replication/logical/logical.c

Avoid assertion due to disconnected NFA sub-graphs in regex parsing.

commit   : 2496c3f6f1bf5a735184d27d81527dfea7ad9e9b    
  
author   : Tom Lane <[email protected]>    
date     : Fri, 15 Nov 2024 18:23:38 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Fri, 15 Nov 2024 18:23:38 -0500    

In commit 08c0d6ad6 which introduced "rainbow" arcs in regex NFAs,  
I didn't think terribly hard about what to do when creating the color  
complement of a rainbow arc.  Clearly, the complement cannot match any  
characters, and I took the easy way out by just not building any arcs  
at all in the complement arc set.  That mostly works, but Nikolay  
Shaplov found a case where it doesn't: if we decide to delete that  
sub-NFA later because it's inside a "{0}" quantifier, delsub()  
suffered an assertion failure.  That's because delsub() relies on  
the target sub-NFA being fully connected.  That was always true  
before, and the best fix seems to be to restore that property.  
Hence, invent a new arc type CANTMATCH that can be generated in  
place of an empty color complement, and drop it again later when we  
start NFA optimization.  (At that point we don't need to do delsub()  
any more, and besides there are other cases where NFA optimization can  
lead to disconnected subgraphs.)  
  
It appears that this bug has no consequences in a non-assert-enabled  
build: there will be some transiently leaked NFA states/arcs, but  
they'll get cleaned up eventually.  Still, we don't like assertion  
failures, so back-patch to v14 where rainbow arcs were introduced.  
  
Per bug #18708 from Nikolay Shaplov.  
  
Discussion: https://postgr.es/m/[email protected]  

M src/backend/regex/regc_color.c
M src/backend/regex/regc_nfa.c
M src/backend/regex/regcomp.c
M src/include/regex/regguts.h
M src/test/modules/test_regex/expected/test_regex.out
M src/test/modules/test_regex/sql/test_regex.sql

Avoid deleting critical WAL segments during pg_rewind

commit   : e28cf2fbc222a607377813590e4bee448fcf0a29    
  
author   : Álvaro Herrera <[email protected]>    
date     : Fri, 15 Nov 2024 12:53:12 +0100    
  
committer: Álvaro Herrera <[email protected]>    
date     : Fri, 15 Nov 2024 12:53:12 +0100    

Previously, in unlucky cases, it was possible for pg_rewind to remove  
certain WAL segments from the rewound demoted primary.  In particular  
this happens if those files have been marked for archival (i.e., their  
.ready files were created) but not yet archived; the newly promoted node  
no longer has such files because they have been recycled, but they  
are likely critical for recovery in the demoted node.  If pg_rewind  
removes them, recovery is not possible anymore.  
  
Fix this by maintaining a hash table of files in this situation in the  
scan that looks for a checkpoint, which the decide_file_actions phase  
can consult so that it knows to preserve them.  
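
The idea of recording files during one phase and consulting that record in a later phase can be sketched as follows. All names here (KeepSet, keepset_add(), decide_target_only_file()) are hypothetical, and the sketch swaps in a small fixed-size array with linear search in place of the hash table the patch uses:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_KEEP 64
#define MAX_NAME 64

/* Hypothetical action codes for a file present only on the target. */
typedef enum { ACTION_NONE, ACTION_REMOVE } FileActionModel;

/* Hypothetical set of WAL segment names that must be preserved. */
typedef struct KeepSet
{
    char names[MAX_KEEP][MAX_NAME];
    int  count;
} KeepSet;

/* Record a segment name while scanning for the checkpoint. */
bool
keepset_add(KeepSet *set, const char *name)
{
    if (set->count >= MAX_KEEP || strlen(name) >= MAX_NAME)
        return false;
    strcpy(set->names[set->count++], name);
    return true;
}

bool
keepset_contains(const KeepSet *set, const char *name)
{
    for (int i = 0; i < set->count; i++)
    {
        if (strcmp(set->names[i], name) == 0)
            return true;
    }
    return false;
}

/* Later phase: keep recorded segments, remove other target-only files. */
FileActionModel
decide_target_only_file(const KeepSet *set, const char *name)
{
    return keepset_contains(set, name) ? ACTION_NONE : ACTION_REMOVE;
}
```

A real hash table gives constant-time lookups; the linear array is used here only to keep the sketch short.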
  
Backpatch to 14.  The problem also exists in 13, but that branch was not  
blessed with commit eb00f1d4bf96, so this patch is difficult to apply  
there.  Users of older releases will just have to continue to be extra  
careful when rewinding.  
  
Co-authored-by: Полина Бунгина (Polina Bungina) <[email protected]>  
Co-authored-by: Alexander Kukushkin <[email protected]>  
Reviewed-by: Kyotaro Horiguchi <[email protected]>  
Reviewed-by: Atsushi Torikoshi <[email protected]>  
Discussion: https://postgr.es/m/CAAtGL4AhzmBRsEsaDdz7065T+k+BscNadfTqP1NcPmsqwA5HBw@mail.gmail.com  

M src/bin/pg_rewind/filemap.c
M src/bin/pg_rewind/filemap.h
M src/bin/pg_rewind/parsexlog.c
M src/bin/pg_rewind/pg_rewind.c
A src/bin/pg_rewind/t/010_keep_recycled_wals.pl
M src/tools/pgindent/typedefs.list

Fix race conditions with drop of reused pgstats entries

commit   : 154c5b42a3d80424f7b7beef33a69600245c147d    
  
author   : Michael Paquier <[email protected]>    
date     : Fri, 15 Nov 2024 11:32:18 +0900    
  
committer: Michael Paquier <[email protected]>    
date     : Fri, 15 Nov 2024 11:32:18 +0900    

This fixes a set of race conditions with cumulative statistics where a  
shared stats entry could be dropped while it should still be valid  
because it has been reused: an entry may refer to a different object  
that requires the same hash key.  This can happen with various stats  
kinds, like:  
- Replication slots that compute internally an index number, for  
different slot names.  
- Stats kinds that use an OID in the object key, where a wraparound  
causes the same key to be used if an OID is used for the same object.  
- As of PostgreSQL 18, custom pgstats kinds could also be an issue,  
depending on their implementation.  
  
This issue is fixed by introducing a counter called "generation" in the  
shared entries via PgStatShared_HashEntry, initialized at 0 when an  
entry is created and incremented when the same entry is reused, to avoid  
concurrent issues on drop because of other backends still holding a  
reference to it.  This "generation" is copied to the local copy that a  
backend holds when looking at an object, then cross-checked with the  
shared entry to make sure that the entry is not dropped if it has been  
reused, even when its "refcount" would otherwise justify the drop.  
  
This problem could show up when a backend shuts down and needs to  
discard any entries it still holds, causing statistics to be removed  
when they should not, or even an assertion failure.  Another report  
involved a failure in a standby after an OID wraparound, where the  
startup process would FATAL on a "can only drop stats once", stopping  
recovery abruptly.  The buildfarm has been sporadically complaining  
about the problem, as well, but the window is hard to reach with the  
in-core tests.  
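
The generation cross-check described above can be sketched with a simplified, single-threaded model. SharedEntryModel, LocalRefModel and the functions below are hypothetical names for illustration only; the real code in pgstat_shmem.c uses atomics and locking that are omitted here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a shared stats hash entry. */
typedef struct SharedEntryModel
{
    uint32_t refcount;
    uint32_t generation;    /* bumped each time the entry is reused */
} SharedEntryModel;

/* Hypothetical stand-in for a backend-local reference. */
typedef struct LocalRefModel
{
    SharedEntryModel *shared;
    uint32_t          generation;   /* copied when the reference is taken */
} LocalRefModel;

void
entry_init(SharedEntryModel *e)
{
    e->refcount = 0;
    e->generation = 0;
}

LocalRefModel
entry_acquire(SharedEntryModel *e)
{
    LocalRefModel ref = { e, e->generation };

    e->refcount++;
    return ref;
}

/* The same hash key now refers to a different object. */
void
entry_reuse(SharedEntryModel *e)
{
    e->generation++;
}

/*
 * Release a local reference.  Returns true only if the caller should free
 * the shared entry: the refcount reached zero AND the entry has not been
 * reused since this reference was taken.
 */
bool
entry_release(LocalRefModel *ref)
{
    SharedEntryModel *e = ref->shared;

    e->refcount--;
    if (e->refcount > 0)
        return false;
    return e->generation == ref->generation;
}
```

In this model, a reference taken before entry_reuse() can no longer cause the entry to be freed, even though its release brings the refcount to zero.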
  
Note that the issue can be reproduced easily by adding a sleep before  
dshash_find() in pgstat_release_entry_ref() to enlarge the problematic  
window while repeating test_decoding's isolation test oldest_xmin a  
couple of times, for example, as pointed out by Alexander Lakhin.  
  
Reported-by: Alexander Lakhin, Peter Smith  
Author: Kyotaro Horiguchi, Michael Paquier  
Reviewed-by: Bertrand Drouvot  
Discussion: https://postgr.es/m/CAA4eK1KxuMVyAryz_Vk5yq3ejgKYcL6F45Hj9ZnMNBS-g+PuZg@mail.gmail.com  
Discussion: https://postgr.es/m/[email protected]  
Backpatch-through: 15  

M src/backend/utils/activity/pgstat_shmem.c
M src/include/utils/pgstat_internal.h

Count contrib/bloom index scans in pgstat view.

commit   : 16a2bb0793ac269943b0d676bac781e3691e28a0    
  
author   : Peter Geoghegan <[email protected]>    
date     : Tue, 12 Nov 2024 20:57:39 -0500    
  
committer: Peter Geoghegan <[email protected]>    
date     : Tue, 12 Nov 2024 20:57:39 -0500    

Maintain the pg_stat_user_indexes.idx_scan pgstat counter during  
contrib/Bloom index scans.  
  
Oversight in commit 9ee014fc, which added the Bloom index contrib  
module.  
  
Author: Masahiro Ikeda <[email protected]>  
Reviewed-By: Peter Geoghegan <[email protected]>  
Discussion: https://postgr.es/m/[email protected]  
Backpatch: 13- (all supported branches).  

M contrib/bloom/blscan.c

Fix arrays comparison in CompareOpclassOptions()

commit   : 713b8546aba66be102bdd8b320c06ea3b2813fd9    
  
author   : Alexander Korotkov <[email protected]>    
date     : Tue, 12 Nov 2024 01:44:20 +0200    
  
committer: Alexander Korotkov <[email protected]>    
date     : Tue, 12 Nov 2024 01:44:20 +0200    

The current code calls array_eq() and does not provide FmgrInfo.  This commit  
provides initialization of FmgrInfo and uses C collation as the safe option  
for text comparison because we don't know anything about the semantics of  
opclass options.  
  
Backpatch to 13, where opclass options were introduced.  
  
Reported-by: Nicolas Maus  
Discussion: https://postgr.es/m/18692-72ea398df3ec6712%40postgresql.org  
Backpatch-through: 13  

M contrib/pg_trgm/expected/pg_trgm.out
M contrib/pg_trgm/sql/pg_trgm.sql
M src/backend/commands/indexcmds.c