Skip to content

Comments

HBASE-29905 BackupLogCleaner: skip tables no longer in the backup set#7761

Open
janvanbesien wants to merge 1 commit intoapache:masterfrom
NGDATA:HBASE-29905_backup_cleaner_stale_entries
Open

HBASE-29905 BackupLogCleaner: skip tables no longer in the backup set#7761
janvanbesien wants to merge 1 commit intoapache:masterfrom
NGDATA:HBASE-29905_backup_cleaner_stale_entries

Conversation

@janvanbesien
Copy link

BackupLogCleaner.serverToPreservationBoundaryTs() computes the WAL deletion boundary by iterating over tableSetTimestampMap from persisted BackupInfo sessions. This map can contain entries for tables that were once part of the backup set but have since had all their backups deleted. Their stale, old timestamps drag the minimum WAL boundary back, preventing old WALs from being cleaned up.

Fix: when computing boundaries, load the incrbackupset per backup root from BackupSystemTable and skip tables not in the active set. The incrbackupset is populated on every full backup and pruned when backups are deleted, so it accurately reflects which tables still need WAL retention.

@janvanbesien janvanbesien force-pushed the HBASE-29905_backup_cleaner_stale_entries branch from 5a9a2ef to 9f69699 Compare February 17, 2026 07:50
@DieterDP-ng
Copy link
Contributor

Hi Jan,

2 remarks regarding the format of the PR:

  • HBASE accepts PRs for the master branch. If accepted for master, it's backported to the other branches by the person doing the merge.
  • HBASE asks that the commit header message matches the title of the ticket it solves.

}

@Test
public void testRemovedTableDoesNotPinWals() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of this test is misleading. I'd also add a textual description as to the specific case you're checking here.

The current name implies it has to do with a table that was removed. That is also the problem originally encountered (in NGDATA): logs were not being removed for a table that has already been removed. But you're not testing for that here, this test doesn't remove a table.

There's a difference in the logic that's relevant. If you were to extend your testcase to the following: full backup F1, incremental I1, full backup F2 (table 1 only), incremental I2 (table 1 only), delete F1, I1, I'm pretty sure the test would fail because I2 would still refer to table2.

However, if you deleted table 2 after I1, I suspect it will work as expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to also modify your commit message a bit to explain the main use case being tables that have since been deleted.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review dieter.

As I understand it, the problem can be triggered by explicitly removing a table from the backup (i.e. explicitly deciding to no longer backup said table) or by removing the table completely (i.e. implicitly it is no longer in any future backup either). You're right that I mixed the two cases and that the primary trigger was the latter, not the former.

I now have a test for both scenarios. Does that make sense?

Also updated commit message and rebased on master.

…es in system:backup table

BackupLogCleaner.serverToPreservationBoundaryTs() computes the WAL
deletion boundary by iterating over tableSetTimestampMap from persisted
BackupInfo sessions. This map can contain entries for tables that were
once part of the backup set but have since been deleted (or had all
their backups deleted). Their stale, old timestamps drag the minimum WAL
boundary back, preventing old WALs from being cleaned up.

Fix: when computing boundaries, load the incrbackupset per backup root
from BackupSystemTable and skip tables not in the active set. The
incrbackupset is populated on every full backup and pruned when backups
are deleted, so it accurately reflects which tables still need WAL
retention.
@janvanbesien janvanbesien force-pushed the HBASE-29905_backup_cleaner_stale_entries branch from 9f69699 to 81e9e0d Compare February 17, 2026 13:53
@janvanbesien janvanbesien changed the base branch from branch-2.6 to master February 17, 2026 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants