pgBackRest 2.51 commit log

v2.50: Performance Improvements and Bug Fixes

commit   : 1add35624dbf4061bcb19521022b318418b0bba5    
  
author   : David Steele <[email protected]>    
date     : Mon, 22 Jan 2024 09:54:59 -0300    
  
committer: David Steele <[email protected]>    
date     : Mon, 22 Jan 2024 09:54:59 -0300    

Click here for diff

Bug Fixes:  
  
* Fix short read in block incremental restore. (Reviewed by Stephen Frost, Brent Graveland. Reported by Adol Rodriguez, Brent Graveland.)  
* Fix overflow suppressing backup progress in info output. (Fixed by Robert Donovan. Reviewed by Joe Wildish.)  
  
Improvements:  
  
* Preserve partial files during block incremental delta restore. (Reviewed by Stephen Frost.)  
* Add support for alternate compile-time page sizes. (Contributed by Viktor Kurilko. Reviewed by David Steele.)  
* Skip files truncated during backup when bundling. (Contributed by Georgy Shelkovy. Reviewed by David Steele.)  
* Improve SFTP storage error messages. (Contributed by Reid Thompson. Reviewed by David Steele.)  

M CONTRIBUTING.md
M README.md
M doc/resource/exe.cache
M doc/resource/git-history.cache
M doc/xml/auto/metric-coverage-report.auto.xml
M doc/xml/release/2024/2.50.xml
M meson.build
M src/build/configure.ac
M src/configure
M src/version.h
M test/code-count/file-type.yaml

Cleanup tablespace tests in command/backup module.

commit   : fd8974914d8702e5d889f03e08eeda676435faa2    
  
author   : David Steele <[email protected]>    
date     : Fri, 12 Jan 2024 13:55:08 -0300    
  
committer: David Steele <[email protected]>    
date     : Fri, 12 Jan 2024 13:55:08 -0300    

Click here for diff

Tablespaces were enabled for tests that did not have tablespaces, resulting in tablespace_map being present in backups even when it was not needed.  
  
Instead of specifying if tablespaces are present, automatically detect tablespaces in hrnBackupPqScript().  

M test/src/common/harnessBackup.c
M test/src/common/harnessBackup.h
M test/src/module/command/backupTest.c

commit   : 3926dd346ef7b6e239e939e9a046a4be8597c584    
  
author   : David Steele <[email protected]>    
date     : Thu, 4 Jan 2024 14:55:44 -0300    
  
committer: David Steele <[email protected]>    
date     : Thu, 4 Jan 2024 14:55:44 -0300    

Click here for diff

M LICENSE
M src/common/macro.h
M src/postgres/interface/static.vendor.h
M src/postgres/interface/version.vendor.h

Remove installation of old yum.p.o repository key.

commit   : a484862763e85a7cc11b6a45366b3b78b06cd373    
  
author   : David Steele <[email protected]>    
date     : Thu, 4 Jan 2024 14:53:44 -0300    
  
committer: David Steele <[email protected]>    
date     : Thu, 4 Jan 2024 14:53:44 -0300    

Click here for diff

M doc/xml/user-guide.xml

Break up scripts in db unit test module.

commit   : 55f22489630ae409f03a2ad9ec37c3e38a3f7ab2    
  
author   : David Steele <[email protected]>    
date     : Wed, 3 Jan 2024 18:15:50 -0300    
  
committer: David Steele <[email protected]>    
date     : Wed, 3 Jan 2024 18:15:50 -0300    

Click here for diff

Long scripts followed by a number of tests are really challenging to debug and update.  
  
Instead, break up the scripts to be inline with the tests that they drive. This should make maintenance of the tests much simpler.  

M test/src/module/db/dbTest.c

Improve SFTP storage error messages.

commit   : 7c17eec3db455d3e8e660f4f8247b6975517648f    
  
author   : Reid Thompson <[email protected]>    
date     : Wed, 3 Jan 2024 11:01:50 -0500    
  
committer: GitHub <[email protected]>    
date     : Wed, 3 Jan 2024 11:01:50 -0500    

Click here for diff

Use storageSftpEvalLibSsh2Error() in more locations to provide better error information. Also add storageSftpLibSsh2SessionLastError() for the same reason.

M doc/xml/release/2024/2.50.xml
M src/storage/sftp/read.c
M src/storage/sftp/storage.c
M src/storage/sftp/write.c
M test/src/common/harnessLibSsh2.c
M test/src/module/storage/sftpTest.c

Remove FreeBSD 12 and add FreeBSD 14 to Cirrus CI.

commit   : 802ae7914897f3d53788e526ce22562e434c92ea    
  
author   : David Steele <[email protected]>    
date     : Wed, 3 Jan 2024 12:43:50 -0300    
  
committer: David Steele <[email protected]>    
date     : Wed, 3 Jan 2024 12:43:50 -0300    

Click here for diff

FreeBSD 12 is now EOL.  
  
Also update the image version for FreeBSD 13.  

M .cirrus.yml

Remove resolved comment.

commit   : 8a8cfba62bfe2293d6355f68d44bab5323a91118    
  
author   : Reid Thompson <[email protected]>    
date     : Wed, 27 Dec 2023 12:53:53 -0300    
  
committer: David Steele <[email protected]>    
date     : Wed, 27 Dec 2023 12:53:53 -0300    

Click here for diff

M test/src/module/storage/sftpTest.c

Fix short read in block incremental restore.

commit   : f4a4af299bee93ab7dde599ccdb35becc1c2afad    
  
author   : David Steele <[email protected]>    
date     : Wed, 27 Dec 2023 12:49:47 -0300    
  
committer: GitHub <[email protected]>    
date     : Wed, 27 Dec 2023 12:49:47 -0300    

Click here for diff

During restore it is possible to read all the blocks out of a compressed super block without reading all the input. This is because the compression format may have some trailing bytes that are not required for decompression but are required to indicate that data has ended. If a buffer aligned with the compressed data in a certain way, these last bytes might not be read.
  

  
Explicitly read out any final bytes at the end of each super block to handle this case. This should always result in no additional data out and we check for that, but it does move the read position to the beginning of the next compressed super block so decompression can begin without error.

M doc/xml/release/2024/2.50.xml
M doc/xml/release/contributor.xml
M src/command/restore/blockDelta.c
M src/common/io/read.c
M src/common/io/read.h
M test/define.yaml
M test/src/module/command/restoreTest.c
M test/src/module/common/ioTest.c

Fix incorrect test comment.

commit   : c47b39acf67b375cbee9773904ad1afc6a1bb9ec    
  
author   : David Steele <[email protected]>    
date     : Wed, 27 Dec 2023 12:39:45 -0300    
  
committer: David Steele <[email protected]>    
date     : Wed, 27 Dec 2023 12:39:45 -0300    

Click here for diff

Left over from an older implementation.  

M test/src/module/command/backupTest.c

Move block testBlockDelta() to harness module.

commit   : 4324b568a922105b272e2031559dae5f374a04da    
  
author   : David Steele <[email protected]>    
date     : Tue, 26 Dec 2023 21:07:56 -0300    
  
committer: David Steele <[email protected]>    
date     : Tue, 26 Dec 2023 21:07:56 -0300    

Click here for diff

This makes the function available to other test modules.  
  
Also rename to hrnBlockDeltaRender().  

M test/define.yaml
A test/src/common/harnessBlockIncr.c
A test/src/common/harnessBlockIncr.h
M test/src/module/command/backupTest.c

Refactor skip files truncated during backup when bundling.

commit   : 9049fec2c02c4110e16ffdff7b56f83d8ff32b4a    
  
author   : David Steele <[email protected]>    
date     : Fri, 22 Dec 2023 13:16:45 -0300    
  
committer: David Steele <[email protected]>    
date     : Fri, 22 Dec 2023 13:16:45 -0300    

Click here for diff

Refactor 02eea555 to always close the file immediately on EOF and use backupCopyResultCopy to continue processing. Closing the file immediately saves a later EOF check and is friendlier to added logic in this area. Using backupCopyResultCopy to continue is clearer also makes it easier to add new logic.  
  
Also store zero checksum so the bulk of results collection can be moved within the copy block.  

M doc/xml/release/2024/2.50.xml
M src/command/backup/file.c

Allow const checksum buffers to be returned from backupFile().

commit   : c8795094d418802a0e24d4d19da66973c71b00e5    
  
author   : David Steele <[email protected]>    
date     : Fri, 22 Dec 2023 12:48:01 -0300    
  
committer: David Steele <[email protected]>    
date     : Fri, 22 Dec 2023 12:48:01 -0300    

Click here for diff

This allows less duplication of buffers.  
  
For delta check return file->pgFileSize/file->pgFileChecksum instead of pgTestSize/pgTestChecksum since this saves one buffer duplication and we know these values are equal since we just checked them.  
  
Also add an assert to ensure copyChecksum is valid relative to size.  

M doc/xml/release/2024/2.50.xml
M src/command/backup/file.c
M src/command/backup/file.h
M src/command/backup/protocol.c

Do not preserve block incremental if file is less than prior block size.

commit   : 4f760df417e17dc03ce07c0116dfaac506e04c84    
  
author   : David Steele <[email protected]>    
date     : Fri, 22 Dec 2023 00:59:12 -0300    
  
committer: GitHub <[email protected]>    
date     : Fri, 22 Dec 2023 00:59:12 -0300    

Click here for diff

If a file stored with block incremental shrinks below the prior block size then the map is useless and the entire file needs to be stored again.
  

  
In this case use the new block incremental values (even if none) rather than preserving the old ones.

M doc/xml/release/2024/2.50.xml
M src/info/manifest.c
M test/src/module/command/backupTest.c

Update warning for backup resume invalid repo file.

commit   : 3cd8249dbaa085bd075464d420f68a2a8b34a165    
  
author   : David Steele <[email protected]>    
date     : Thu, 21 Dec 2023 15:28:37 -0300    
  
committer: David Steele <[email protected]>    
date     : Thu, 21 Dec 2023 15:28:37 -0300    

Click here for diff

This warning has had a note since the C migration that it should be moved below the backup file log message, so do that.  
  
Also update the warning message a bit to correct for tense. This message was likely in a different place originally.  

M src/command/backup/backup.c
M test/src/module/command/backupTest.c

Refactor backupFile() to remove backupCopyResultReCopy.

commit   : 701865eca1c8896a913e05f41a748c225b84af11    
  
author   : David Steele <[email protected]>    
date     : Thu, 21 Dec 2023 15:20:03 -0300    
  
committer: David Steele <[email protected]>    
date     : Thu, 21 Dec 2023 15:20:03 -0300    

Click here for diff

Having two enum values for file copy makes things a bit more complicated than they need to be (especially in an upcoming commit).  
  
Instead add a flag to indicate that the repository file was invalid since the only purpose is to trigger a warning message.  

M src/command/backup/backup.c
M src/command/backup/file.c
M src/command/backup/file.h
M src/command/backup/protocol.c

Preserve partial files during block incremental delta restore.

commit   : a42614e8f3a9e6871c16331f01c2cd765ab181aa    
  
author   : David Steele <[email protected]>    
date     : Thu, 21 Dec 2023 15:08:07 -0300    
  
committer: GitHub <[email protected]>    
date     : Thu, 21 Dec 2023 15:08:07 -0300    

Click here for diff

Previously files that were smaller than the expected size were not preserved for block incremental, even though it is possible that block incremental could make use of a partial file.
  

  
One example is when a restore encounters an error. On retry the partial file can be used as a starting point rather than copying again from the beginning. Another example is restoring a backup where a file is larger than what already exists in the data directory.
  

  
Preserve any size file when block incremental will be used for the delta in order to reuse partial files when possible. If the file is smaller than expected then disable the whole-file checksum to reduce overhead.

M doc/xml/release/2024/2.50.xml
M src/command/restore/file.c
M test/src/module/command/restoreTest.c

Refactor backup incremental manifest generation.

commit   : ad8febec0851473240f6f20bd4eb299b525a6d44    
  
author   : David Steele <[email protected]>    
date     : Thu, 21 Dec 2023 13:32:10 -0300    
  
committer: GitHub <[email protected]>    
date     : Thu, 21 Dec 2023 13:32:10 -0300    

Click here for diff

This refactor should provide more clarity on what factors affect an incremental, rather that just having one big expression do it all. Overall this may be slightly more efficient since some values are reused that before were recalculated.
  

  
No behavioral changes are introduced.

M doc/xml/release/2024/2.50.xml
M src/info/manifest.c

Add tests to command/backup and info/manifest modules.

commit   : f3584e2143de518bda52151a7102ed638dcb0e95    
  
author   : David Steele <[email protected]>    
date     : Thu, 21 Dec 2023 11:20:10 -0300    
  
committer: David Steele <[email protected]>    
date     : Thu, 21 Dec 2023 11:20:10 -0300    

Click here for diff

These tests exercise various interesting cases and provide coverage for proposed improvements.  

M test/src/module/command/backupTest.c
M test/src/module/info/manifestTest.c

Fix overflow suppressing backup status in info output.

commit   : 25f14898babf4ad78baef3e8343933a863633975    
  
author   : Robert Donovan <[email protected]>    
date     : Thu, 21 Dec 2023 13:16:13 +0000    
  
committer: GitHub <[email protected]>    
date     : Thu, 21 Dec 2023 13:16:13 +0000    

Click here for diff

Writing the sz and szCplt parameters in the lock file used jsonWriteUInt64() but reading these parameters used jsonReadUInt(). This caused a silent exception for any backups larger than MAX_UINT and prevented the info command from reporting progress.
  

  
Correct this so the reads are symmetric and verified before/after with a test.

M doc/xml/release/2024/2.50.xml
M doc/xml/release/contributor.xml
M src/common/lock.c
M test/src/module/common/lockTest.c

Use original file size to log size changes during backup.

commit   : 8af3c1c9acc7117a0cf8e3ce862c5639c82bf78d    
  
author   : David Steele <[email protected]>    
date     : Sun, 17 Dec 2023 13:15:03 -0300    
  
committer: David Steele <[email protected]>    
date     : Sun, 17 Dec 2023 13:15:03 -0300    

Click here for diff

c9703b35 added logging for file size changes during backup. Later 5ed6f8df added the sizeOriginal member to ManifestFile, which arguably is better to use for logging rather than size before backup since it will always contain the original size. Size could in theory be modified for deduplication purposes.  
  
Update logging to use sizeOriginal.  

M src/command/backup/backup.c

Add block incremental test where timestamp changes but file is the same.

commit   : bb6e5164ee6efd40fde73acd8f7b30c4afa1548d    
  
author   : David Steele <[email protected]>    
date     : Sat, 16 Dec 2023 11:42:27 -0300    
  
committer: David Steele <[email protected]>    
date     : Sat, 16 Dec 2023 11:42:27 -0300    

Click here for diff

If delta is not enabled, then the timestamp is used to determine if a file has changed. If the timestamp changes but the file is the same then the prior map will be stored unchanged in the new backup. This is not quite as bad as storing the entire file but it is obviously not ideal.  
  
This will be fixed in a future commit, but add the test now to show the current behavior.  

M test/src/module/command/backupTest.c

Skip files truncated during backup when bundling.

commit   : 02eea555c7169e99892fdbc8ce7cc1ee59f50509    
  
author   : Georgy Shelkovy <[email protected]>    
date     : Thu, 14 Dec 2023 22:21:06 +0500    
  
committer: GitHub <[email protected]>    
date     : Thu, 14 Dec 2023 22:21:06 +0500    

Click here for diff

In bundle mode pgBackRest skips files of zero size, that is, it does not queue them for copying.
  

  
After splitting the files into bundles, pgBackRest launches one or more processes that directly perform the backup, namely, read the files and, if necessary, write them to the bundles.
  

  
If during the time between the distribution of all files among bundles and the direct copying of a file to a bundle, this file of non-zero size was truncated to zero size (for example, when the table was truncated), then pgBackRest still unconditionally places such a zero-size file in the bundle, taking up space in it equal to the size of the headings, and additionally writes the original file size to the manifest.
  

  
In debug build an assertion was added, that does not allow zero-size files to be written to bundles, which leads to an error.
  

  
To solve the problem, this patch, when reading the next file, loads one buffer from the file to detect if it is zero-size. If so it marks the file as truncated and continues on to the next file.
  

  
The advantages of the solution are that, firstly, the assert will not fire on debug builds, and secondly, we will not place zero-size files in bundles, which exactly corresponds to the specification.
  

  
The patch adds the backupCopyResultTruncate value to the BackupCopyResult enumeration to use it to indicate the result when a non-zero size file is truncated to zero size during the backup process.

M doc/xml/release/2024/2.50.xml
M src/command/backup/backup.c
M src/command/backup/file.c
M src/command/backup/file.h
M src/info/manifest.c
M test/src/module/command/backupTest.c

Add support for alternate compile-time page sizes.

commit   : 89d5278b74ab0d880345575d7ac4c239cd1fce54    
  
author   : Viktor Kurilko <[email protected]>    
date     : Thu, 14 Dec 2023 19:28:52 +0300    
  
committer: GitHub <[email protected]>    
date     : Thu, 14 Dec 2023 19:28:52 +0300    

Click here for diff

Alternate pages sizes can be selected at compile-time, .e.g. 4096. While compile-time settings are generally not well tested by core, some established forks such as Greenplum use them.

M doc/xml/release/2024/2.50.xml
M doc/xml/release/contributor.xml
M src/command/backup/backup.c
M src/command/backup/file.c
M src/command/backup/file.h
M src/command/backup/pageChecksum.c
M src/command/backup/pageChecksum.h
M src/command/backup/protocol.c
M src/postgres/interface.c
M src/postgres/interface.h
M src/postgres/interface/page.c
D src/postgres/interface/pageChecksum.vendor.c.inc
M src/postgres/interface/static.vendor.h
M test/code-count/file-type.yaml
M test/src/common/harnessPostgres.c
M test/src/module/command/backupTest.c
M test/src/module/postgres/interfaceTest.c

Fix flapping test on older ninja versions in test unit.

commit   : d205a61949ae8491f5ce9b2019cda79d2fe7b216    
  
author   : David Steele <[email protected]>    
date     : Fri, 1 Dec 2023 11:54:30 -0300    
  
committer: David Steele <[email protected]>    
date     : Fri, 1 Dec 2023 11:54:30 -0300    

Click here for diff

Older versions of ninja may fail to rebuild correctly when changes are made to the configuration. In this case there is an automatic retry but the unexpected log output would cause the test to fail.  
  
For tests that are expected to succeed, check that the log is empty but also accept a retry message as long as the test does eventually succeed.  
  
Add a new harness function, harnessLogResultEmptyOrContains(), to make this work and also clean up some adjacent code.  

M test/src/common/harnessLog.c
M test/src/common/harnessLog.h
M test/src/common/harnessTest.h
M test/src/module/test/testTest.c

Use unique port for each server unit test.

commit   : 7ce0f5a94c229abe415029ce80241a64665c93a2    
  
author   : David Steele <[email protected]>    
date     : Thu, 30 Nov 2023 16:43:09 -0300    
  
committer: David Steele <[email protected]>    
date     : Thu, 30 Nov 2023 16:43:09 -0300    

Click here for diff

If the same port is reused too quickly bind may fail with this error:  
  
FileOpenError: unable to bind socket: [98] Address already in use  
  
We specify SO_REUSEADDR when creating the socket but apparently this is not always enough if the port is reused very rapidly.  
  
Fix this (hopefully) by using a unique port for each test that needs one. This does in theory limit the number of tests that can use ports, but we allow 768 per test, whereas the test that uses the most ports is common/io-tls with 4.  

M test/src/common/harnessServer.c
M test/src/common/harnessServer.h
M test/src/module/command/serverTest.c
M test/src/module/common/ioHttpTest.c
M test/src/module/common/ioTlsTest.c
M test/src/module/protocol/protocolTest.c
M test/src/module/storage/azureTest.c
M test/src/module/storage/gcsTest.c
M test/src/module/storage/s3Test.c

Output coverage report on test failure in CI.

commit   : a14732789baacb4d35d5da6d0adf337ceea7f595    
  
author   : David Steele <[email protected]>    
date     : Wed, 29 Nov 2023 09:31:57 -0300    
  
committer: David Steele <[email protected]>    
date     : Wed, 29 Nov 2023 09:31:57 -0300    

Click here for diff

This allows analysis of coverage failures that only happen in CI. It is not ideal since the report needs to be copied from the log output into an HTML file where it can be viewed, but better than nothing.  

M .github/workflows/test.yml
M test/lib/pgBackRestTest/Common/CoverageTest.pm
M test/test.pl

Improve comments in socket test harness.

commit   : cb6bceb9f1074436b7d0b67a94d782113b5f2b35    
  
author   : David Steele <[email protected]>    
date     : Tue, 28 Nov 2023 16:38:42 -0300    
  
committer: David Steele <[email protected]>    
date     : Tue, 28 Nov 2023 16:38:42 -0300    

Click here for diff

M test/src/common/harnessSocket.c
M test/src/common/harnessSocket.h

Allow custom type/message for errRetryAdd().

commit   : 70e15dacc77bcc2b7e4c2a222d683872d2180f5b    
  
author   : David Steele <[email protected]>    
date     : Tue, 28 Nov 2023 16:35:37 -0300    
  
committer: David Steele <[email protected]>    
date     : Tue, 28 Nov 2023 16:35:37 -0300    

Click here for diff

It may be useful to customize the message or add a message that was never thrown. The latter case will be used in an upcoming commit.  

M src/common/error/retry.c
M src/common/error/retry.h
M src/common/io/http/request.c
M src/common/io/socket/client.c
M src/protocol/server.c
M test/src/module/common/errorRetryTest.c

Begin v2.50 development.

commit   : 85bc9f27d8b70a879051f604861078034263b095    
  
author   : David Steele <[email protected]>    
date     : Mon, 27 Nov 2023 09:06:53 -0300    
  
committer: David Steele <[email protected]>    
date     : Mon, 27 Nov 2023 09:06:53 -0300    

Click here for diff

M doc/resource/git-history.cache
M doc/xml/release.xml
A doc/xml/release/2024/2.50.xml
M meson.build
M src/build/configure.ac
M src/configure
M src/version.h