doc: Fix XML_CATALOG_FILES env var for Apple Silicon machines
commit : aad9a2ee7c2d381bfa3317ecfb2cd28601b0aa19 author : Daniel Gustafsson <firstname.lastname@example.org> date : Mon, 27 Mar 2023 21:35:30 +0200 committer: Daniel Gustafsson <email@example.com> date : Mon, 27 Mar 2023 21:35:30 +0200
Homebrew changed the prefix for Apple Silicon based machines, so our advice for XML_CATALOG_FILES needs to mention both. More info on the Homebrew change can be found at: https://github.com/Homebrew/brew/issues/9177

This is a backpatch of commits 4c8d65408 and 5a91c7975, the latter of which contained a small fix based on a report from Dagfinn Ilmari Mannsåker.

Author: Julien Rouhaud <firstname.lastname@example.org>
Discussion: https://postgr.es/m/20230327082441.h7pa2vqiobbyo7rd@jrouhaud
Reject attempts to alter composite types used in indexes.
commit : cd07163c0e36596a53154c7fb7ffb479d225fe78 author : Tom Lane <email@example.com> date : Mon, 27 Mar 2023 15:04:02 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Mon, 27 Mar 2023 15:04:02 -0400
find_composite_type_dependencies() ignored indexes, which is a poor decision because an expression index could have a stored column of a composite (or other container) type even when the underlying table does not. Teach it to detect such cases and error out. We have to work a bit harder than for other relations because the pg_depend entry won't identify the specific index column of concern, but it's not much new code.

This does not address bug #17872's original complaint that dropping a column in such a type might lead to violations of the uniqueness property that a unique index is supposed to ensure. That seems of much less concern to me because it won't lead to crashes.

Per bug #17872 from Alexander Lakhin. Back-patch to all supported branches.

Discussion: https://email@example.com
Fix oversights in array manipulation.
commit : ad5fe7420594811d3ca4054a8397c5cc8c6575e2 author : Tom Lane <firstname.lastname@example.org> date : Sun, 26 Mar 2023 13:41:06 -0400 committer: Tom Lane <email@example.com> date : Sun, 26 Mar 2023 13:41:06 -0400
The nested-arrays code path in ExecEvalArrayExpr() used palloc to allocate the result array, whereas every other array-creating function has used palloc0 since 18c0b4ecc. This mostly works, but unused bits past the end of the nulls bitmap may end up undefined. That causes valgrind complaints with -DWRITE_READ_PARSE_PLAN_TREES, and could cause planner misbehavior as cited in 18c0b4ecc. There seems no very good reason why we should strive to avoid palloc0 in just this one case, so fix it the easy way with s/palloc/palloc0/.

While looking at that I noted that we also failed to check for overflow of "nbytes" and "nitems" while summing the sizes of the sub-arrays, potentially allowing a crash due to undersized output allocation. For "nbytes", follow the policy used by other array-munging code of checking for overflow after each addition. (As elsewhere, the last addition of the array's overhead space doesn't need an extra check, since palloc itself will catch a value between 1Gb and 2Gb.) For "nitems", there's no very good reason to sum the inputs at all, since we can perfectly well use ArrayGetNItems' result instead of ignoring it.

Per discussion of this bug, also remove redundant zeroing of the nulls bitmap in array_set_element and array_set_slice.

Patch by Alexander Lakhin and myself, per bug #17858 from Alexander Lakhin; thanks also to Richard Guo. These bugs are a dozen years old, so back-patch to all supported branches.

Discussion: https://firstname.lastname@example.org
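The "check for overflow after each addition" policy described above can be sketched in isolation like this (a hedged illustration with an invented function name, not PostgreSQL's actual array code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the policy: sum sub-array sizes, failing as soon as any
 * single addition wraps, rather than checking only the final total.
 * The helper name and signature are invented for this example. */
static bool
sum_subarray_sizes(const size_t *sizes, int n, size_t *total_out)
{
    size_t total = 0;

    for (int i = 0; i < n; i++)
    {
        /* unsigned addition wraps around, so a decrease proves overflow */
        if (total + sizes[i] < total)
            return false;
        total += sizes[i];
    }
    *total_out = total;
    return true;
}
```

Checking after every addition is essential because two large inputs can wrap back into a small, plausible-looking total, which would then drive an undersized allocation.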
Ignore generated columns during apply of update/delete.
commit : 0f2d4adbf85f7f4e3d5afda419e8fc87b89862f2 author : Amit Kapila <email@example.com> date : Thu, 23 Mar 2023 11:08:38 +0530 committer: Amit Kapila <firstname.lastname@example.org> date : Thu, 23 Mar 2023 11:08:38 +0530
We fail to apply updates and deletes when REPLICA IDENTITY FULL is used for a table having generated columns. Previously, generated columns were not ignored while comparing tuples from the publisher and subscriber during apply of updates and deletes.

Author: Onder Kalaci
Reviewed-by: Shi yu, Amit Kapila
Backpatch-through: 12
Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
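The fixed comparison loop can be sketched roughly as follows (the struct and names are invented for illustration and are not PostgreSQL's TupleDesc layout):

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch: under REPLICA IDENTITY FULL, skip generated attributes when
 * matching publisher and subscriber tuples, since their stored values
 * may legitimately differ between the two nodes. */
typedef struct AttSketch
{
    bool attgenerated;      /* generated column? then ignore it */
    int  publisher_val;
    int  subscriber_val;
} AttSketch;

static bool
tuples_equal_sketch(const AttSketch *atts, int natts)
{
    for (int i = 0; i < natts; i++)
    {
        if (atts[i].attgenerated)
            continue;       /* excluded from the comparison */
        if (atts[i].publisher_val != atts[i].subscriber_val)
            return false;
    }
    return true;
}
```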
doc: Add description of some missing monitoring functions
commit : 488ace333584d345b4bf76992e81b0b7ef52cbe0 author : Michael Paquier <email@example.com> date : Wed, 22 Mar 2023 18:32:09 +0900 committer: Michael Paquier <firstname.lastname@example.org> date : Wed, 22 Mar 2023 18:32:09 +0900
This commit adds documentation for two monitoring functions:

- pg_stat_get_xact_blocks_fetched()
- pg_stat_get_xact_blocks_hit()

The descriptions of these functions were removed in ddfc2d9, then further simplified by 5f2b089, on the assumption that all the functions whose descriptions were removed are used in system views. Unfortunately, some of them are not used in any system view, so they lacked documentation. This gap has existed in the docs for a long time, so backpatch all the way down.

Reported-by: Michael Paquier
Author: Bertrand Drouvot
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/ZBeeH5UoNkTPrwHO@paquier.xyz
Backpatch-through: 11
Ignore dropped columns during apply of update/delete.
commit : fc63e6ba8670e0eb1bc40ae9fe4acdd4203bc36e author : Amit Kapila <email@example.com> date : Tue, 21 Mar 2023 08:50:23 +0530 committer: Amit Kapila <firstname.lastname@example.org> date : Tue, 21 Mar 2023 08:50:23 +0530
We fail to apply updates and deletes when REPLICA IDENTITY FULL is used for a table having dropped columns. Previously, dropped columns were not ignored while comparing tuples from the publisher and subscriber during apply of updates and deletes.

Author: Onder Kalaci, Shi yu
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
Fix race in parallel hash join batch cleanup, take II.
commit : 44d44aa9712f315b125c9c5f2a1640ebb70e1d2f author : Thomas Munro <email@example.com> date : Tue, 21 Mar 2023 14:29:34 +1300 committer: Thomas Munro <firstname.lastname@example.org> date : Tue, 21 Mar 2023 14:29:34 +1300
With unlucky timing and parallel_leader_participation=off (not the default), PHJ could attempt to access per-batch shared state just as it was being freed. There was code intended to prevent that by checking for a cleared pointer, but it was racy. Fix, by introducing an extra barrier phase. The new phase PHJ_BUILD_RUNNING means that it's safe to access the per-batch state to find a batch to help with, and PHJ_BUILD_DONE means that it is too late. The last to detach will free the array of per-batch state as before, but now it will also atomically advance the phase, so that late attachers can avoid the hazard. This mirrors the way per-batch hash tables are freed (see phases PHJ_BATCH_PROBING and PHJ_BATCH_DONE).

An earlier attempt to fix this (commit 3b8981b6, later reverted) missed one special case. When the inner side is empty (the "empty inner" optimization), the build barrier would only make it to the PHJ_BUILD_HASHING_INNER phase before workers attempted to detach from the hashtable. In that case, fast-forward the build barrier to PHJ_BUILD_RUNNING before proceeding, so that our later assertions hold and we can still negotiate who is cleaning up.

Revealed by build farm failures, where BarrierAttach() failed a sanity check assertion, because the memory had been clobbered by dsa_free(). In non-assert builds, the result could be a segmentation fault.

Back-patch to all supported releases.

Author: Thomas Munro <email@example.com>
Author: Melanie Plageman <firstname.lastname@example.org>
Reported-by: Michael Paquier <email@example.com>
Reported-by: David Geier <firstname.lastname@example.org>
Tested-by: David Geier <email@example.com>
Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz
Doc: fix documentation example for bytea hex output format.
commit : 92865681c23000e390345129ee742662e33aa09e author : Tom Lane <firstname.lastname@example.org> date : Sat, 18 Mar 2023 16:11:22 -0400 committer: Tom Lane <email@example.com> date : Sat, 18 Mar 2023 16:11:22 -0400
Per report from rsindlin Discussion: https://firstname.lastname@example.org
Fix pg_dump for hash partitioning on enum columns.
commit : 8f83ce8c5244ce40514e8643a648704d8c85baa9 author : Tom Lane <email@example.com> date : Fri, 17 Mar 2023 13:31:40 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Fri, 17 Mar 2023 13:31:40 -0400
Hash partitioning on an enum is problematic because the hash codes are derived from the OIDs assigned to the enum values, which will almost certainly be different after a dump-and-reload than they were before. This means that some rows probably end up in different partitions than before, causing restore to fail because of partition constraint violations. (pg_upgrade dodges this problem by using hacks to force the enum values to keep the same OIDs, but that's neither possible nor desirable for pg_dump.)

Users can work around that by specifying --load-via-partition-root, but since that's a dump-time not restore-time decision, one might find out the need for it far too late. Instead, teach pg_dump to apply that option automatically when dealing with a partitioned table that has hash-on-enum partitioning.

Also deal with a pre-existing issue for --load-via-partition-root mode: in a parallel restore, we try to TRUNCATE target tables just before loading them, in order to enable some backend optimizations. This is bad when using --load-via-partition-root because (a) we're likely to suffer deadlocks from restore jobs trying to restore rows into other partitions than they came from, and (b) if we miss getting a deadlock we might still lose data due to a TRUNCATE removing rows from some already-completed restore job.

The fix for this is conceptually simple: just don't TRUNCATE if we're dealing with a --load-via-partition-root case. The tricky bit is for pg_restore to identify those cases. In dumps using COPY commands we can inspect each COPY command to see if it targets the nominal target table or some ancestor. However, in dumps using INSERT commands it's pretty impractical to examine the INSERTs in advance. To provide a solution for that going forward, modify pg_dump to mark TABLE DATA items that are using --load-via-partition-root with a comment. (This change also responds to a complaint from Robert Haas that the dump output for --load-via-partition-root is pretty confusing.) pg_restore checks for the special comment as well as checking the COPY command if present. This will fail to identify the combination of --load-via-partition-root and --inserts in pre-existing dump files, but that should be a pretty rare case in the field. If it does happen you will probably get a deadlock failure that you can work around by not using parallel restore, which is the same as before this bug fix.

Having done this, there seems no remaining reason for the alarmism in the pg_dump man page about combining --load-via-partition-root with parallel restore, so remove that warning.

Patch by me; thanks to Julien Rouhaud for review. Back-patch to v11 where hash partitioning was introduced.

Discussion: https://email@example.com
tests: Prevent syslog activity by slapd, take 2
commit : dbe926b91ecb01e05343ac8539e66ef0d4c3b11e author : Andres Freund <firstname.lastname@example.org> date : Thu, 16 Mar 2023 23:03:31 -0700 committer: Andres Freund <email@example.com> date : Thu, 16 Mar 2023 23:03:31 -0700
Unfortunately it turns out that the logfile-only option added in b9f8d1cbad7 is only available in openldap starting with 2.6. Luckily the option to control the log level (loglevel/-s) has been around for much longer. As it turns out, loglevel/-s only controls what goes into syslog, not what ends up in the file specified with 'logfile' and stderr. While we currently are specifying 'logfile', nothing ends up in it, as that option only controls debug messages, and we didn't set a debug level. The debug level can only be configured on the command line and also prevents forking. That would require larger changes, so this commit doesn't tackle that issue.

Specify the syslog level when starting slapd using -s, as that makes it possible to prevent all syslog messages if one uses '0' instead of 'none', while loglevel doesn't prevent the first message.

Discussion: https://firstname.lastname@example.org
Backpatch: 11-
tests: Minimize syslog activity by slapd
commit : 54f07ced96e9125e5cdc9dd260e208687b533799 author : Andres Freund <email@example.com> date : Thu, 16 Mar 2023 17:48:47 -0700 committer: Andres Freund <firstname.lastname@example.org> date : Thu, 16 Mar 2023 17:48:47 -0700
Until now the tests using slapd spammed syslog for every connection / query. Use logfile-only to prevent syslog activity. Unfortunately that only takes effect after logging the first message, but that's still much better than the prior situation. Discussion: https://email@example.com Backpatch: 11-
Small tidyup for commit d41a178b, part II.
commit : 8fcd1517f0d7d57fe9cf586a1419936ce90c5e97 author : Thomas Munro <firstname.lastname@example.org> date : Fri, 17 Mar 2023 14:44:12 +1300 committer: Thomas Munro <email@example.com> date : Fri, 17 Mar 2023 14:44:12 +1300
Further to commit 6a9229da, checking for NULL is now redundant. An "out of memory" error would have been thrown already by palloc() and treated as FATAL, so we can delete a few more lines. Back-patch to all releases, like those other commits. Reported-by: Tom Lane <firstname.lastname@example.org> Discussion: https://postgr.es/m/4040668.1679013388%40sss.pgh.pa.us
Work around spurious compiler warning in inet operators
commit : c1266562feebf167105a15b0575ad863492d533d author : Andres Freund <email@example.com> date : Thu, 16 Mar 2023 14:08:44 -0700 committer: Andres Freund <firstname.lastname@example.org> date : Thu, 16 Mar 2023 14:08:44 -0700
gcc 12+ has complaints like the following:

../../../../../pgsql/src/backend/utils/adt/network.c: In function 'inetnot':
../../../../../pgsql/src/backend/utils/adt/network.c:1893:34: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1893 |                         pdst[nb] = ~pip[nb];
      |                         ~~~~~~~~~^~~~~~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16
   27 |         unsigned char ipaddr[16];       /* up to 128 bits of address */
      |                       ^~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16

This is due to a compiler bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104986

It has been a year since the bug was reported without getting fixed. As the warnings are verbose and use of gcc 12 is becoming more common, it seems worth working around the bug. Particularly because a simple reformulation of the loop condition fixes the issue and isn't any less readable.

Author: Tom Lane <email@example.com>
Author: Andres Freund <firstname.lastname@example.org>
Discussion: https://email@example.com
Backpatch: 11-
Small tidyup for commit d41a178b.
commit : 6f508b8bce2ed66eef310e1b77a84ce1cd222aeb author : Thomas Munro <firstname.lastname@example.org> date : Fri, 17 Mar 2023 09:44:42 +1300 committer: Thomas Munro <email@example.com> date : Fri, 17 Mar 2023 09:44:42 +1300
A comment was left behind claiming that we needed to use malloc() rather than palloc() because the corresponding free would run in another thread, but that's not true anymore. Remove that comment. And, with the reason being gone, we might as well actually use palloc(). Back-patch to supported releases, like d41a178b. Discussion: https://postgr.es/m/CA%2BhUKG%2BpdM9v3Jv4tc2BFx2jh_daY3uzUyAGBhtDkotEQDNPYw%40mail.gmail.com
Fix waitpid() emulation on Windows.
commit : 8362884275b5ba0620b1edfff260078910159c98 author : Thomas Munro <firstname.lastname@example.org> date : Wed, 15 Mar 2023 13:17:18 +1300 committer: Thomas Munro <email@example.com> date : Wed, 15 Mar 2023 13:17:18 +1300
Our waitpid() emulation didn't prevent a PID from being recycled by the OS before the call to waitpid(). The postmaster could finish up tracking more than one child process with the same PID, and confuse them.

Fix, by moving the guts of pgwin32_deadchild_callback() into waitpid(), so that resources are released synchronously. The process and PID continue to exist until we close the process handle, which only happens once we're ready to adjust our book-keeping of running children.

This seems to explain a couple of failures on CI. It had never been reported before, despite the code being as old as the Windows port. Perhaps Windows started recycling PIDs more rapidly, or perhaps timing changes due to commit 7389aad6 made it more likely to break.

Thanks to Alexander Lakhin for analysis and Andres Freund for tracking down the root cause.

Back-patch to all supported branches.

Reported-by: Andres Freund <firstname.lastname@example.org>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de
Fix corner case bug in numeric to_char() some more.
commit : 6d3a9a60f78557dc6ab170db074f9e74da539d93 author : Tom Lane <email@example.com> date : Tue, 14 Mar 2023 19:17:31 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Tue, 14 Mar 2023 19:17:31 -0400
The band-aid applied in commit f0bedf3e4 turns out to still need some work: it made sure we didn't set Np->last_relevant too small (to the left of the decimal point), but it didn't prevent setting it too large (off the end of the partially-converted string). This could result in fetching data beyond the end of the allocated space, which with very bad luck could cause a SIGSEGV, though I don't see any hazard of interesting memory disclosure. Per bug #17839 from Thiago Nunes. The bug's pretty ancient, so back-patch to all supported versions. Discussion: https://email@example.com
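The fix amounts to a two-sided clamp on the "last relevant" pointer. A rough sketch of the idea, with invented names (this is not the actual to_char() formatting code):

```c
#include <assert.h>

/* Sketch: keep a "last relevant digit" pointer from retreating left of
 * the decimal point or advancing past the end of the partially
 * converted string. The lower bound corresponds to the earlier
 * band-aid (f0bedf3e4); the upper bound is the new check. */
static const char *
clamp_last_relevant(const char *decimal_point, const char *string_end,
                    const char *candidate)
{
    if (candidate < decimal_point)
        return decimal_point;   /* don't set it too small */
    if (candidate > string_end)
        return string_end;      /* don't fetch beyond the allocation */
    return candidate;
}
```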
Fix JSON error reporting for many cases of erroneous string values.
commit : c25a929a6c8869a148b3ee064eb03ab1d3cb127d author : Tom Lane <firstname.lastname@example.org> date : Mon, 13 Mar 2023 15:19:00 -0400 committer: Tom Lane <email@example.com> date : Mon, 13 Mar 2023 15:19:00 -0400
The majority of error exit cases in json_lex_string() failed to set lex->token_terminator, causing problems for the error context reporting code: it would see token_terminator less than token_start and do something more or less nuts. In v14 and up the end result could be as bad as a crash in report_json_context(). Older versions accidentally avoided that fate; but all versions produce error context lines that are far less useful than intended, because they'd stop at the end of the prior token instead of continuing to where the actually-bad input is.

To fix, invent some macros that make it less notationally painful to do the right thing. Also add documentation about what the function is actually required to do; and in >= v14, add an assertion in report_json_context about token_terminator being sufficiently far advanced.

Per report from Nikolay Shaplov. Back-patch to all supported versions.

Discussion: https://postgr.es/m/7332649.x5DLKWyVIX@thinkpad-pgpro
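The macro pattern can be sketched like so (struct, macro, and function names here are invented for illustration; this is not the real json_lex_string()):

```c
#include <assert.h>
#include <stddef.h>

/* Sketch: every error exit records where the bad token ends before
 * returning, so the error-context code never sees token_terminator
 * sitting behind token_start. */
typedef struct LexSketch
{
    const char *token_start;
    const char *token_terminator;
} LexSketch;

#define FAIL_AT(lex, pos) \
    do { (lex)->token_terminator = (pos); return -1; } while (0)

static int
lex_string_sketch(LexSketch *lex, const char *s, const char *end)
{
    lex->token_start = s;
    for (const char *p = s; p < end; p++)
    {
        if (*p == '\n')             /* pretend a raw newline is invalid */
            FAIL_AT(lex, p + 1);    /* terminator advanced past bad input */
    }
    lex->token_terminator = end;
    return 0;
}
```

Funneling every error return through one macro is what makes the invariant (token_terminator >= token_start, advanced to the bad input) easy to audit.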
Fix failure to detect some cases of improperly-nested aggregates.
commit : 62a91a1b092606e55d8a9807d249ceda58feebb0 author : Tom Lane <firstname.lastname@example.org> date : Mon, 13 Mar 2023 12:40:28 -0400 committer: Tom Lane <email@example.com> date : Mon, 13 Mar 2023 12:40:28 -0400
check_agg_arguments_walker() supposed that it needn't descend into the arguments of a lower-level aggregate function, but this is just wrong in the presence of multiple levels of sub-select. The oversight would lead to executor failures on queries that should be rejected. (Prior to v11, they actually were rejected, thanks to a "redundant" execution-time check.) Per bug #17835 from Anban Company. Back-patch to all supported branches. Discussion: https://firstname.lastname@example.org
Fix inconsistent error handling for GSS encryption in PQconnectPoll()
commit : 2bc36a56cbd15415f85f3364044b778b21b0504c author : Michael Paquier <email@example.com> date : Mon, 13 Mar 2023 16:36:34 +0900 committer: Michael Paquier <firstname.lastname@example.org> date : Mon, 13 Mar 2023 16:36:34 +0900
The error cases for TLS and GSS encryption were inconsistent. After TLS fails, the connection is marked as dead and follow-up calls of PQconnectPoll() return immediately, but GSS encryption was not doing that, so the connection would still have been allowed to enter the GSS handling code. This was handled incorrectly when gssencmode was set to "require". "prefer" was working correctly, and this could not happen under "disable" as GSS encryption would not be attempted.

This commit makes the error handling of GSS encryption on par with the TLS portion, fixing the case of gssencmode=require.

Reported-by: Jacob Champion
Author: Michael Paquier
Reviewed-by: Jacob Champion, Stephen Frost
Discussion: https://email@example.com
Backpatch-through: 12
Mark unsafe_tests module as not runnable with installcheck
commit : 13196cc755982e9cbccf92be67e36d5480e73edc author : Andrew Dunstan <firstname.lastname@example.org> date : Sun, 12 Mar 2023 09:00:32 -0400 committer: Andrew Dunstan <email@example.com> date : Sun, 12 Mar 2023 09:00:32 -0400
This was an omission in the original creation of the module. Also slightly adjust some wording to avoid a double "is". Backpatch the non-meson piece of this to release 12, where the module was introduced. Discussion: https://firstname.lastname@example.org
Fix misbehavior in contrib/pg_trgm with an unsatisfiable regex.
commit : 1279414bc613316d3cb7216790bc943d5af8be70 author : Tom Lane <email@example.com> date : Sat, 11 Mar 2023 12:15:41 -0500 committer: Tom Lane <firstname.lastname@example.org> date : Sat, 11 Mar 2023 12:15:41 -0500
If the regex compiler can see that a regex is unsatisfiable (for example, '$foo') then it may emit an NFA having no arcs. pg_trgm's packGraph function did the wrong thing in this case; it would access off the end of a work array, and with bad luck could produce a corrupted output data structure causing more problems later. This could end with wrong answers or crashes in queries using a pg_trgm GIN or GiST index with such a regex. Fix by not trying to de-duplicate if there aren't at least 2 arcs. Per bug #17830 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://email@example.com
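The shape of the guard (only attempt de-duplication when there are at least two entries) can be sketched with an invented helper; this is not pg_trgm's actual packGraph() code:

```c
#include <assert.h>

/* Sketch: de-duplicate a sorted array only when there are at least two
 * entries, so an NFA with zero arcs can't drive the loop off the end
 * of the work array. */
static int
dedup_sorted(int *arcs, int narcs)
{
    if (narcs < 2)
        return narcs;       /* nothing to merge; avoids bogus accesses */

    int nkept = 1;
    for (int i = 1; i < narcs; i++)
    {
        if (arcs[i] != arcs[nkept - 1])
            arcs[nkept++] = arcs[i];
    }
    return nkept;
}
```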
Ensure COPY TO on an RLS-enabled table copies no more than it should.
commit : a30310833d071cfe9c1e7a9768e4d4df47685106 author : Tom Lane <firstname.lastname@example.org> date : Fri, 10 Mar 2023 13:52:28 -0500 committer: Tom Lane <email@example.com> date : Fri, 10 Mar 2023 13:52:28 -0500
The COPY documentation is quite clear that "COPY relation TO" copies rows from only the named table, not any inheritance children it may have. However, if you enabled row-level security on the table then this stopped being true, because the code forgot to apply the ONLY modifier in the "SELECT ... FROM relation" query that it constructs in order to allow RLS predicates to be attached. Fix that.

Report and patch by Antonin Houska (comment adjustments and test case by me). Back-patch to all supported branches.

Discussion: https://postgr.es/m/3472.1675251957@antos
Fix race in SERIALIZABLE READ ONLY.
commit : e30fd0942493b1ba260b122c5ae4d8941926eb8f author : Thomas Munro <firstname.lastname@example.org> date : Thu, 9 Mar 2023 16:33:24 +1300 committer: Thomas Munro <email@example.com> date : Thu, 9 Mar 2023 16:33:24 +1300
Commit bdaabb9b started skipping doomed transactions when building the list of possible conflicts for SERIALIZABLE READ ONLY. That makes sense, because doomed transactions won't commit, but a couple of subtle things broke:

1. If all uncommitted r/w transactions are doomed, a READ ONLY transaction would arbitrarily not benefit from the safe snapshot optimization. It would not be taken immediately, and yet no other transaction would set SXACT_FLAG_RO_SAFE later.

2. In the same circumstances but with DEFERRABLE, GetSafeSnapshot() would correctly exit its wait loop without sleeping and then take the optimization in non-assert builds, but assert builds would fail a sanity check that SXACT_FLAG_RO_SAFE had been set by another transaction.

This is similar to the case for PredXact->WritableSxactCount == 0. We should opt out immediately if our possibleUnsafeConflicts list is empty after filtering.

The code to maintain the serializable global xmin is moved down below the new opt out site, because otherwise we'd have to reverse its effects before returning.

Back-patch to all supported releases. Bug #17368.

Reported-by: Alexander Lakhin <firstname.lastname@example.org>
Discussion: https://postgr.es/m/17116-d6ca217acc180e30%40postgresql.org
Discussion: https://postgr.es/m/20110707212159.GF76634%40csail.mit.edu
Fix corruption due to vacuum_defer_cleanup_age underflowing 64bit xids
commit : 3c92f7e9d851084bf51cb9231353c54474fe0438 author : Andres Freund <email@example.com> date : Tue, 7 Mar 2023 21:36:52 -0800 committer: Andres Freund <firstname.lastname@example.org> date : Tue, 7 Mar 2023 21:36:52 -0800
When vacuum_defer_cleanup_age is bigger than the current xid, including the epoch, the subtraction of vacuum_defer_cleanup_age would lead to a wrapped-around xid. While that normally is not a problem, the subsequent conversion to a 64bit xid results in a 64bit xid very far into the future. As that xid is used as a horizon to detect whether row versions are old enough to be removed, that allows removal of rows that are still visible (i.e. corruption).

If vacuum_defer_cleanup_age was never changed from the default, there is no chance of this bug occurring.

This bug was introduced in dc7420c2c92. A lesser version of it exists in 12-13, introduced by fb5344c969a, affecting only GiST. The 12-13 version of the issue can, in rare cases, lead to pages in a gist index getting recycled too early, potentially causing index entries to be found multiple times.

The fix is fairly simple - don't allow vacuum_defer_cleanup_age to retreat further than FirstNormalTransactionId.

Patches to make similar bugs easier to find, by adding asserts to the 64bit xid infrastructure, have been proposed, but are not suitable for backpatching.

Currently there are no tests for vacuum_defer_cleanup_age. A patch introducing infrastructure to make writing a test easier has been posted to the list.

Reported-by: Michail Nikolaev <email@example.com>
Reviewed-by: Matthias van de Meent <firstname.lastname@example.org>
Author: Andres Freund <email@example.com>
Discussion: https://firstname.lastname@example.org
Backpatch: 12-
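The clamped subtraction can be sketched as follows (an invented helper for illustration, not the actual horizon-computation code):

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_NORMAL_XID ((uint32_t) 3)   /* FirstNormalTransactionId */

/* Sketch of the fix: subtract the defer age from a 32-bit xid without
 * retreating past FirstNormalTransactionId. Unchecked, the unsigned
 * subtraction would wrap around and, after conversion to a 64-bit xid,
 * land far in the future, allowing visible rows to be removed. */
static uint32_t
deferred_horizon(uint32_t xid, uint32_t defer_age)
{
    if (xid <= FIRST_NORMAL_XID || defer_age >= xid - FIRST_NORMAL_XID)
        return FIRST_NORMAL_XID;    /* clamp instead of wrapping */
    return xid - defer_age;
}
```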
Fix more bugs caused by adding columns to the end of a view.
commit : 5a19da58eed20b5259ad5d6ae53f54eea08d34c9 author : Tom Lane <email@example.com> date : Tue, 7 Mar 2023 18:21:37 -0500 committer: Tom Lane <firstname.lastname@example.org> date : Tue, 7 Mar 2023 18:21:37 -0500
If a view is defined atop another view, and then CREATE OR REPLACE VIEW is used to add columns to the lower view, then when the upper view's referencing RTE is expanded by ApplyRetrieveRule we will have a subquery RTE with fewer eref->colnames than output columns. This confuses various code that assumes those lists are always in sync, as they are in plain parser output.

We have seen such problems before (cf commit d5b760ecb), and now I think the time has come to do what was speculated about in that commit: let's make ApplyRetrieveRule synthesize some column names to preserve the invariant that holds in parser output. Otherwise we'll be chasing this class of bugs indefinitely. Moreover, it appears from testing that this actually gives us better results in the test case d5b760ecb added, and likely in other corner cases that we lack coverage for.

In HEAD, I replaced d5b760ecb's hack to make expandRTE exit early with an elog(ERROR) call, since the case is now presumably unreachable. But it seems like changing that in back branches would bring more risk than benefit, so there I just updated the comment.

Per bug #17811 from Alexander Lakhin. Back-patch to all supported branches.

Discussion: https://email@example.com
Fix some more cases of missed GENERATED-column updates.
commit : 23b75dd03da1a4a6db080b27bb3c8efcf7dbc9ae author : Tom Lane <firstname.lastname@example.org> date : Mon, 6 Mar 2023 18:31:16 -0500 committer: Tom Lane <email@example.com> date : Mon, 6 Mar 2023 18:31:16 -0500
If UPDATE is forced to retry after an EvalPlanQual check, it neglected to repeat GENERATED-column computations, even though those might well have changed since we're dealing with a different tuple than before. Fixing this is mostly a matter of looping back a bit further when we retry. In v15 and HEAD that's most easily done by altering the API of ExecUpdateAct so that it includes computing GENERATED expressions.

Also, if an UPDATE in a partitioned table turns into a cross-partition INSERT operation, we failed to recompute GENERATED columns. That's a bug since 8bf6ec3ba allowed partitions to have different generation expressions; although it seems to have no ill effects before that. Fixing this is messier because we can now have situations where the same query needs both the UPDATE-aligned set of GENERATED columns and the INSERT-aligned set, and it's unclear which set will be generated first (else we could hack things by forcing the INSERT-aligned set to be generated, which is indeed how fe9e658f4 made it work for MERGE). The best fix seems to be to build and store separate sets of expressions for the INSERT and UPDATE cases. That would create ABI issues in the back branches, but so far it seems we can leave this alone in the back branches.

Per bug #17823 from Hisahiro Kauchi. The first part of this affects all branches back to v12 where GENERATED columns were added.

Discussion: https://firstname.lastname@example.org
Fix assert failures in parallel SERIALIZABLE READ ONLY.
commit : afa122e41c651edfaa17b47652025ea48085eb94 author : Thomas Munro <email@example.com> date : Mon, 6 Mar 2023 15:07:15 +1300 committer: Thomas Munro <firstname.lastname@example.org> date : Mon, 6 Mar 2023 15:07:15 +1300
1. Make sure that we don't decrement SxactGlobalXminCount twice when the SXACT_FLAG_RO_SAFE optimization is reached in a parallel query. This could trigger a sanity check failure in assert builds. Non-assert builds recompute the count in SetNewSxactGlobalXmin(), so the problem was hidden, explaining the lack of field reports. Add a new isolation test to exercise that case.

2. Remove an assertion that the DOOMED flag can't be set on a partially released SERIALIZABLEXACT. Instead, ignore the flag (our transaction was already determined to be read-only safe, and DOOMED is in fact set during partial release, and there was already an assertion that it wasn't set sooner). Improve an existing isolation test so that it reaches that case (previously it wasn't quite testing what it was supposed to be testing; see discussion).

Back-patch to 12. Bug #17116. Defects in commit 47a338cf.

Reported-by: Alexander Lakhin <email@example.com>
Discussion: https://postgr.es/m/17116-d6ca217acc180e30%40postgresql.org
Avoid fetching one past the end of translate()'s "to" parameter.
commit : b162660d3ac44688f18919cd460423022e467512 author : Tom Lane <firstname.lastname@example.org> date : Wed, 1 Mar 2023 11:30:17 -0500 committer: Tom Lane <email@example.com> date : Wed, 1 Mar 2023 11:30:17 -0500
This is usually harmless, but if you were very unlucky it could provoke a segfault due to the "to" string being right up against the end of memory. Found via valgrind testing (so we might've found it earlier, except that our regression tests lacked any exercise of translate()'s deletion feature).

Fix by switching the order of the test-for-end-of-string and advance-pointer steps. While here, compute "to_ptr + tolen" just once. (Smarter compilers might figure that out for themselves, but let's just make sure.)

Report and fix by Daniil Anisimov, in bug #17816.

Discussion: https://firstname.lastname@example.org
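The corrected ordering can be sketched with an invented helper (not the actual translate() source): test against the precomputed end of "to" before dereferencing or advancing, so the byte one past the end of "to" is never fetched.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch: map character c using "from"/"to" pairs, deleting it (return
 * -1) when "from" is longer than "to", and passing it through when it
 * is not in "from" at all. */
static int
map_char(char c, const char *from, const char *to, size_t tolen)
{
    const char *to_end = to + tolen;    /* computed just once */
    const char *t = to;

    for (const char *f = from; *f != '\0'; f++)
    {
        if (*f == c)
            return (t < to_end) ? *t : -1;  /* past "to": delete */
        if (t < to_end)
            t++;                            /* advance only after the check */
    }
    return (unsigned char) c;               /* not in "from": unchanged */
}
```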
Don't force SQL_ASCII/no-locale for installcheck in vcregress.pl
commit : 11290c89bbe52283186b3bd38f06ae677d99ca9f author : Andrew Dunstan <email@example.com> date : Sun, 26 Feb 2023 06:48:41 -0500 committer: Andrew Dunstan <firstname.lastname@example.org> date : Sun, 26 Feb 2023 06:48:41 -0500
It's been this way for a very long time, but it appears to have been masking an issue that only manifests with different settings. Therefore, run the tests in the installation's default encoding/locale. Backpatch to all live branches.
Fix MULTIEXPR_SUBLINK with partitioned target tables, yet again.
commit : 904b171a465583b35ba199d38868e9e49fcb8ef4 author : Tom Lane <email@example.com> date : Sat, 25 Feb 2023 14:44:14 -0500 committer: Tom Lane <firstname.lastname@example.org> date : Sat, 25 Feb 2023 14:44:14 -0500
We already tried to fix this in commits 3f7323cbb et al (and follow-on fixes), but now it emerges that there are still unfixed cases; moreover, these cases affect all branches, not only pre-v14. I thought we had eliminated all cases of making multiple clones of an UPDATE's target list when we nuked inheritance_planner. But it turns out we still do that in some partitioned-UPDATE cases, notably including INSERT ... ON CONFLICT UPDATE, because ExecInitPartitionInfo thinks it's okay to clone and modify the parent's targetlist. This fix is based on a suggestion from Andres Freund: let's stop abusing the ParamExecData.execPlan mechanism, which was only ever meant to handle initplans, and instead solve the execution timing problem by having the expression compiler move MULTIEXPR_SUBLINK steps to the front of their expression step lists. This is feasible because (a) all branches still in support compile the entire targetlist of an UPDATE into a single ExprState, and (b) we know that all MULTIEXPR_SUBLINKs do need to be evaluated --- none could be buried inside a CASE, for example. There is a minor semantics change concerning the order of execution of the MULTIEXPR's subquery versus other parts of the parent targetlist, but that seems like something we can get away with. By doing that, we no longer need to worry about whether different clones of a MULTIEXPR_SUBLINK share output Params; their usage of that data structure won't overlap. Per bug #17800 from Alexander Lakhin. Back-patch to all supported branches. In v13 and earlier, we can revert 3f7323cbb and follow-on fixes; however, I chose to keep the SubPlan.subLinkId field added in ccbb54c72. We don't need that anymore in the core code, but it's cheap enough to fill, and removing a plan node field in a minor release seems like it'd be asking for trouble. Andres Freund and Tom Lane Discussion: https://email@example.com
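Moving the MULTIEXPR_SUBLINK steps to the front of the step list amounts to a stable partition of the list. A minimal sketch of that operation (the `Step` type, flag name, and size limit are invented for the example; PostgreSQL's real ExprState steps look nothing like this):

```c
#include <stdbool.h>
#include <string.h>

typedef struct Step
{
    int  id;            /* stand-in for a real expression step */
    bool is_multiexpr;  /* would be "is a MULTIEXPR_SUBLINK step" */
} Step;

/* Stably move flagged steps to the front so they execute first;
 * relative order within each group is preserved. */
static void move_flagged_steps_to_front(Step *steps, int n)
{
    Step tmp[64];       /* sketch assumes n <= 64 */
    int  k = 0;

    for (int i = 0; i < n; i++)     /* flagged steps first, in order */
        if (steps[i].is_multiexpr)
            tmp[k++] = steps[i];
    for (int i = 0; i < n; i++)     /* then the remaining steps */
        if (!steps[i].is_multiexpr)
            tmp[k++] = steps[i];
    memcpy(steps, tmp, (size_t) n * sizeof(Step));
}
```

Because the flagged steps run before anything else in the compiled expression, their output Params are already filled by the time any consumer step reads them, which is why the execPlan trigger mechanism is no longer needed.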
Fix mishandling of OLD/NEW references in subqueries in rule actions.
commit : 4fd093af7186e9e3bad805e580edd1991f8f3350 author : Dean Rasheed <firstname.lastname@example.org> date : Sat, 25 Feb 2023 14:47:03 +0000 committer: Dean Rasheed <email@example.com> date : Sat, 25 Feb 2023 14:47:03 +0000
If a rule action contains a subquery that refers to columns from OLD or NEW, then those are really lateral references, and the planner will complain if it sees such things in a subquery that isn't marked as lateral. However, at rule-definition time, the user isn't required to mark the subquery with LATERAL, and so it can fail when the rule is used. Fix this by marking such subqueries as lateral in the rewriter, at the point where they're used. Dean Rasheed and Tom Lane, per report from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/5e09da43-aaba-7ea7-0a51-a2eb981b058b%40gmail.com
Don't repeatedly register cache callbacks in pgoutput plugin.
commit : 95558bc8ff89c5887f1bffc9d152ca603637e2c0 author : Tom Lane <firstname.lastname@example.org> date : Thu, 23 Feb 2023 15:40:28 -0500 committer: Tom Lane <email@example.com> date : Thu, 23 Feb 2023 15:40:28 -0500
Multiple cycles of starting up and shutting down the plugin within a single session would eventually lead to "out of relcache_callback_list slots", because pgoutput_startup blindly re-registered its cache callbacks each time. Fix it to register them only once, as all other users of cache callbacks already take care to do. This has been broken all along, so back-patch to all supported branches. Shi Yu Discussion: https://postgr.es/m/OSZPR01MB631004A78D743D68921FFAD3FDA79@OSZPR01MB6310.jpnprd01.prod.outlook.com
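The register-only-once pattern is simple enough to show in miniature (a hedged sketch of the idea, not the actual pgoutput code; the slot limit and names are invented):

```c
#include <stdbool.h>

#define MAX_CB_SLOTS 4              /* tiny stand-in for the fixed slot array */

typedef void (*cache_callback)(void);

static cache_callback cb_slots[MAX_CB_SLOTS];
static int n_cb_slots = 0;

static int register_cache_callback(cache_callback fn)
{
    if (n_cb_slots >= MAX_CB_SLOTS)
        return -1;                  /* "out of relcache_callback_list slots" */
    cb_slots[n_cb_slots++] = fn;
    return 0;
}

static void my_cache_cb(void) {}

/* The fix: remember across startup/shutdown cycles that the callbacks are
 * already registered, and register them only on the first startup. */
static bool callbacks_registered = false;

static void plugin_startup(void)
{
    if (!callbacks_registered)
    {
        register_cache_callback(my_cache_cb);
        callbacks_registered = true;
    }
}
```

Without the guard, each startup consumes another slot, and the fixed-size slot array eventually fills up, exactly as described in the report.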
Fix multi-row DEFAULT handling for INSERT ... SELECT rules.
commit : 98b83b7349821b05134e6f50f516ecac878cb91d author : Dean Rasheed <firstname.lastname@example.org> date : Thu, 23 Feb 2023 10:57:46 +0000 committer: Dean Rasheed <email@example.com> date : Thu, 23 Feb 2023 10:57:46 +0000
Given an updatable view with a DO ALSO INSERT ... SELECT rule, a multi-row INSERT ... VALUES query on the view fails if the VALUES list contains any DEFAULTs that are not replaced by view defaults. This manifests as an "unrecognized node type" error, or an Assert failure, in an assert-enabled build. The reason is that when RewriteQuery() attempts to replace the remaining DEFAULT items with NULLs in any product queries, using rewriteValuesRTEToNulls(), it assumes that the VALUES RTE is located at the same rangetable index in each product query. However, if the product query is an INSERT ... SELECT, then the VALUES RTE is actually in the SELECT part of that query (at the same index), rather than the top-level product query itself. Fix, by descending to the SELECT in such cases. Note that we can't simply use getInsertSelectQuery() for this, since that expects to be given a raw rule action with OLD and NEW placeholder entries, so we duplicate its logic instead. While at it, beef up the checks in getInsertSelectQuery() by checking that the jointree->fromlist node is indeed a RangeTblRef, and that the RTE it points to has rtekind == RTE_SUBQUERY. Per bug #17803, from Alexander Lakhin. Back-patch to all supported branches. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/17803-53c63ed4ecb4eac6%40postgresql.org
Fix snapshot handling in logicalmsg_decode
commit : 497f863f05982c193c12a115f00be6efa7214b29 author : Tomas Vondra <firstname.lastname@example.org> date : Wed, 22 Feb 2023 15:24:09 +0100 committer: Tomas Vondra <email@example.com> date : Wed, 22 Feb 2023 15:24:09 +0100
When decoding a transactional logical message, logicalmsg_decode called SnapBuildGetOrBuildSnapshot. But we may not have a consistent snapshot yet at that point. We don't actually need the snapshot in this case (during replay we'll have the snapshot from the transaction), so in practice this is harmless. But in assert-enabled builds this crashes. Fixed by requesting the snapshot only in the non-transactional case, where we are guaranteed to have SNAPBUILD_CONSISTENT. Backpatch to 11. The issue exists since 9.6. Backpatch-through: 11 Reviewed-by: Andres Freund Discussion: https://firstname.lastname@example.org
Add missing support for the latest SPI status codes.
commit : 52dbd9f845987ff3a6f97d30b3bebb13fdb4b2b4 author : Dean Rasheed <email@example.com> date : Wed, 22 Feb 2023 13:28:30 +0000 committer: Dean Rasheed <firstname.lastname@example.org> date : Wed, 22 Feb 2023 13:28:30 +0000
SPI_result_code_string() was missing support for SPI_OK_TD_REGISTER, and in v15 and later, it was missing support for SPI_OK_MERGE, as was pltcl_process_SPI_result(). The last of those would trigger an error if a MERGE was executed from PL/Tcl. The others seem fairly innocuous, but worth fixing. Back-patch to all supported branches. Before v15, this is just adding SPI_OK_TD_REGISTER to SPI_result_code_string(), which is unlikely to be seen by anyone, but seems worth doing for completeness. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCUg8V%2BK%2BGcafOPqymxk84Y_prXgfe64PDoopjLFH6Z0Aw%40mail.gmail.com https://postgr.es/m/CAEZATCUMe%2B_KedPMM9AxKqm%3DSZogSxjUcrMe%2BsakusZh3BFcQw%40mail.gmail.com
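The pattern behind the fix: a code-to-string mapping must cover every status code the system can return, plus a fallback for unknown values so a new code degrades gracefully rather than erroring. The enum names and numeric values below are made up for the example; they are not the real SPI codes:

```c
#include <string.h>

enum demo_status
{
    DEMO_OK_SELECT = 5,
    DEMO_OK_INSERT = 6,
    DEMO_OK_TD_REGISTER = 14,   /* the kind of case that was missing */
    DEMO_OK_MERGE = 18          /* a code added in a later release */
};

static const char *demo_status_string(int code)
{
    switch (code)
    {
        case DEMO_OK_SELECT:      return "DEMO_OK_SELECT";
        case DEMO_OK_INSERT:      return "DEMO_OK_INSERT";
        case DEMO_OK_TD_REGISTER: return "DEMO_OK_TD_REGISTER";
        case DEMO_OK_MERGE:       return "DEMO_OK_MERGE";
        default:                  return "Unrecognized code";
    }
}
```

The bug class here is that the switch and the enum drift apart when new codes are added; the fix simply adds the cases that had been forgotten.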
Fix erroneous Valgrind markings in AllocSetRealloc.
commit : 463bef38332efaef39de22e4325688924a934b76 author : Tom Lane <email@example.com> date : Tue, 21 Feb 2023 18:47:47 -0500 committer: Tom Lane <firstname.lastname@example.org> date : Tue, 21 Feb 2023 18:47:47 -0500
If asked to decrease the size of a large (>8K) palloc chunk, AllocSetRealloc could improperly change the Valgrind state of memory beyond the new end of the chunk: it would mark data UNDEFINED as far as the old end of the chunk after having done the realloc(3) call, thus tromping on the state of memory that no longer belongs to it. One would normally expect that memory to now be marked NOACCESS, so that this mislabeling might prevent detection of later errors. If realloc() had chosen to move the chunk someplace else (unlikely, but well within its rights) we could also mismark perfectly-valid DEFINED data as UNDEFINED, causing false-positive valgrind reports later. Also, any malloc bookkeeping data placed within this area might now be wrongly marked, causing additional problems. Fix by replacing relevant uses of "oldsize" with "Min(size, oldsize)". It's sufficient to mark as far as "size" when that's smaller, because whatever remains in the new chunk size will be marked NOACCESS below, and we expect realloc() to have taken care of marking the memory beyond the new official end of the chunk. While we're here, also rename the function's "oldsize" variable to "oldchksize" to more clearly explain what it actually holds, namely the distance to the end of the chunk (that is, requested size plus trailing padding). This is more consistent with the use of "size" and "chksize" to hold the new requested size and chunk size. Add a new variable "oldsize" in the one stanza where we're actually talking about the old requested size. Oversight in commit c477f3e44. Back-patch to all supported branches, as that was, just in case anybody wants to do valgrind testing on back branches. Karina Litskevich Discussion: https://postgr.es/m/CACiT8iaAET-fmzjjZLjaJC4zwSJmrFyL7LAdHwaYyjjQOQ4hcg@mail.gmail.com
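The essence of the "Min(size, oldsize)" fix can be sketched with stand-in Valgrind macros (this is not AllocSetRealloc itself; the stub macro just records what would be marked, so the sketch compiles anywhere):

```c
#include <stddef.h>

/* Stand-in for Valgrind's client-request macro; here it only records the
 * length of the last range marked UNDEFINED. */
static size_t last_undefined_len;
#define VALGRIND_MAKE_MEM_UNDEFINED(ptr, len) \
    ((void) (ptr), last_undefined_len = (len))

#define Min(a, b) ((a) < (b) ? (a) : (b))

/* After realloc, mark as UNDEFINED only up to the smaller of the new
 * requested size and the old chunk size: anything between there and the
 * new chunk size is marked NOACCESS separately, and memory beyond the
 * old chunk never belonged to us in the first place. */
static void mark_realloced_chunk(char *block, size_t size, size_t oldchksize)
{
    VALGRIND_MAKE_MEM_UNDEFINED(block, Min(size, oldchksize));
}
```

Marking as far as `oldchksize` unconditionally, as the buggy code did, tromps on memory past the new end of the chunk when the chunk shrinks.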
Print the correct aliases for DML target tables in ruleutils.
commit : 3dd287c14facd5ee17e386175752a75ce16a1daa author : Tom Lane <email@example.com> date : Fri, 17 Feb 2023 16:40:34 -0500 committer: Tom Lane <firstname.lastname@example.org> date : Fri, 17 Feb 2023 16:40:34 -0500
ruleutils.c blindly printed the user-given alias (or nothing if there hadn't been one) for the target table of INSERT/UPDATE/DELETE queries. That works a large percentage of the time, but not always: for queries appearing in WITH, it's possible that we chose a different alias to avoid conflict with outer-scope names. Since the chosen alias would be used in any Var references to the target table, this'd lead to an inconsistent printout with consequences such as dump/restore failures. The correct logic for printing (or not) a relation alias was embedded in get_from_clause_item. Factor it out to a separate function so that we don't need a jointree node to use it. (Only a limited part of that function can be reached from these new call sites, but this seems like the cleanest non-duplicative factorization.) In passing, I got rid of a redundant "\d+ rules_src" step in rules.sql. Initial report from Jonathan Katz; thanks to Vignesh C for analysis. This has been broken for a long time, so back-patch to all supported branches. Discussion: https://email@example.com Discussion: https://postgr.es/m/CALDaNm1MMntjmT_NJGp-Z=xbF02qHGAyuSHfYHias3TqQbPF2w@mail.gmail.com
Fix handling of SCRAM-SHA-256's channel binding with RSA-PSS certificates
commit : a40e7b75e689c3f7f9c9c77106a1ff59fa61d938 author : Michael Paquier <firstname.lastname@example.org> date : Wed, 15 Feb 2023 10:12:38 +0900 committer: Michael Paquier <email@example.com> date : Wed, 15 Feb 2023 10:12:38 +0900
OpenSSL 1.1.1 and newer versions have added support for RSA-PSS certificates, which requires the use of a specific routine in OpenSSL to determine which hash function the certificate uses when channel binding is used in SCRAM-SHA-256. X509_get_signature_nid(), the original routine the channel binding code has relied on, is not able to determine which hash algorithm to use for such certificates. However, X509_get_signature_info(), new to OpenSSL 1.1.1, is able to do it. This commit switches the channel binding logic to rely on X509_get_signature_info() over X509_get_signature_nid(), which would be the choice when building with 1.1.1 or newer. The error could have been triggered on the client or the server, hence libpq and the backend need to have their related code paths patched. Note that attempting to load an RSA-PSS certificate with OpenSSL 1.1.0 or older leads to a failure due to an unsupported algorithm. The discovery of relying on X509_get_signature_info() comes from Jacob, the tests have been written by Heikki (with a few tweaks from me), while I have bundled the whole together while adding the bits needed for MSVC and meson. This issue exists since channel binding exists, so backpatch all the way down. Some tests are added in 15~, triggered if compiling with OpenSSL 1.1.1 or newer, where the certificate and key files can easily be generated for RSA-PSS. Reported-by: Gunnar "Nick" Bluth Author: Jacob Champion, Heikki Linnakangas Discussion: https://firstname.lastname@example.org Backpatch-through: 11
Disable WindowAgg inverse transitions when subplans are present
commit : ac55abd33537f5ac38563826bd9423f961ac66f0 author : David Rowley <email@example.com> date : Mon, 13 Feb 2023 17:08:46 +1300 committer: David Rowley <firstname.lastname@example.org> date : Mon, 13 Feb 2023 17:08:46 +1300
When an aggregate function is used as a WindowFunc and a tuple transitions out of the window frame, we ordinarily try to make use of the aggregate function's inverse transition function to "unaggregate" the exiting tuple. This optimization is disabled for various cases, including when the aggregate contains a volatile function. In such a case we'd be unable to ensure that the transition value was calculated to the same value during transitions and inverse transitions. Unfortunately, we did this check by calling contain_volatile_functions() which does not recursively search SubPlans for volatile functions. If the aggregate function's arguments or its FILTER clause contained a subplan with volatile functions then we'd fail to notice this. Here we fix this by just disabling the optimization when the WindowFunc contains any subplans. Volatile functions are not the only reason that a subplan may have nonrepeatable results. Bug: #17777 Reported-by: Anban Company Discussion: https://postgr.es/m/17777-860b739b6efde977%40postgresql.org Reviewed-by: Tom Lane Backpatch-through: 11
Stop recommending auto-download of DTD files, and indeed disable it.
commit : 11f1f9f4fa309d2592acd71de01765a333a435bb author : Tom Lane <email@example.com> date : Wed, 8 Feb 2023 17:15:23 -0500 committer: Tom Lane <firstname.lastname@example.org> date : Wed, 8 Feb 2023 17:15:23 -0500
It appears no longer possible to build the SGML docs without a local installation of the DocBook DTD, because sourceforge.net now only permits HTTPS access, and no common version of xsltproc supports that. Hence, remove the bits of our documentation suggesting that that's possible or useful. In fact, we might as well add the --nonet option to the build recipes automatically, for a bit of extra security. Also fix our documentation-tool-installation recipes for macOS to ensure that xmllint and xsltproc are pulled in from MacPorts or Homebrew. The previous recipes assumed you could use the Apple-supplied versions of these tools; which still works, except that you'd need to set an environment variable to ensure that they would find DTD files provided by those package managers. Simpler and easier to just recommend pulling in the additional packages. In HEAD, also document how to build docs using Meson, and adjust "ninja docs" to just build the HTML docs, for consistency with the default behavior of doc/src/sgml/Makefile. In a fit of neatnik-ism, I also made the ordering of the package lists match the order in which the tools are described at the head of the appendix. Aleksander Alekseev, Peter Eisentraut, Tom Lane Discussion: https://postgr.es/m/CAJ7c6TO8Aro2nxg=EQsVGiSDe-TstP4EsSvDHd7DSRsP40PgGA@mail.gmail.com
Backpatch OpenSSL 3.0.0 compatibility in tests
commit : 6133a0f4c7c39c9490b5aef91efba76ce83c5a02 author : Peter Eisentraut <email@example.com> date : Fri, 5 Jun 2020 11:18:11 +0200 committer: Andrew Dunstan <firstname.lastname@example.org> date : Fri, 5 Jun 2020 11:18:11 +0200
Backport of commit f0d2c65f17 to releases 11 and 12. This means the SSL tests will fail on machines with extremely old versions of OpenSSL, but we don't know of anything trying to run such tests. The ability to build is not affected. Discussion: https://email@example.com
Make EXEC_BACKEND more convenient on Linux and FreeBSD.
commit : 6b4dba711a4ec4be87850108d4f9db12eecd399e author : Michael Paquier <firstname.lastname@example.org> date : Wed, 8 Feb 2023 13:09:52 +0900 committer: Michael Paquier <email@example.com> date : Wed, 8 Feb 2023 13:09:52 +0900
Try to disable ASLR when building in EXEC_BACKEND mode, to avoid random memory mapping failures while testing. For developer use only, no effect on regular builds. This was originally applied as of f3e7806 for v15~, but the recently-added buildfarm member gokiburi tests this configuration on older branches as well, causing them to fail randomly as ASLR would be enabled. Suggested-by: Andres Freund <firstname.lastname@example.org> Tested-by: Bossart, Nathan <email@example.com> Discussion: https://postgr.es/m/20210806032944.m4tz7j2w47mant26%40alap3.anarazel.de Backpatch-through: 12