brandur commented on Jul 5, 2025:

As described in #972, it may be possible for huge numbers of snoozes to
blow out a job row's `attempted_by` array as a job is locked over and
over again. Multiplied across many jobs, this can produce vast
quantities of data sent over the network.

Here, put a ratchet on `attempted_by` so that if the array becomes
larger than 100 elements, we knock the oldest entry off in favor of the
most recent client and the 99 freshest existing entries.

Unfortunately, the implementation isn't particularly clean in either
Postgres or SQLite. In Postgres it would've been cleaner if
`attempted_by` were stored in reverse order with the newest client at
the front, because the built-in array functions are friendlier to that
layout. Since it's not, we have to do something a little hackier
involving a `CASE` statement instead.

SQLite is even worse. It has no array functions at all, which doesn't
help, but moreover every strategy I tried ended up blocked by a sqlc
SQLite bug. After trying everything I could think of, I ended up having
to extract the piece that does the array truncation into a SQL template
string to get this over the line. This workaround could be removed in
the future if any one of a number of outstanding sqlc bugs is fixed
(e.g. [1]).

[1] sqlc-dev/sqlc#3610
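
To give a flavor of the truncation piece: assuming `attempted_by` is stored as a JSON array in SQLite, the templated fragment might look roughly like the sketch below. This is an illustration of the general approach, not the PR's literal SQL:

    -- append the new client, then keep only the trailing @max_attempted_by entries
    attempted_by = (
        SELECT json_group_array(value)
        FROM (
            SELECT key, value
            FROM json_each(json_insert(river_job.attempted_by, '$[#]', @attempted_by))
            ORDER BY key
            LIMIT @max_attempted_by
            OFFSET max(json_array_length(river_job.attempted_by) + 1 - @max_attempted_by, 0)
        )
    )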

brandur force-pushed the brandur-max-attempted-by-length branch 10 times, most recently from e8c7aec to 55a1f73 on July 5, 2025 at 21:03.

    attempted_by = CASE WHEN array_length(river_job.attempted_by, 1) >= @max_attempted_by::int
                        THEN array_append(
                            river_job.attempted_by[array_length(river_job.attempted_by, 1) + 2 - @max_attempted_by:],
                            @attempted_by::text
                        )
                        ELSE array_append(river_job.attempted_by, @attempted_by::text)
                        END

brandur (author) commented:
@bgentry I don't love how gross this got. Feel free to take a shot at golfing it down a bit.

It's unfortunate that new entries were appended instead of prepended, because if they were prepended, we could do this much more nicely, like:

    attempted_by = trim_array(
        array_prepend(@attempted_by::text, river_job.attempted_by),
        -- clamp at zero since trim_array rejects a negative trim count
        greatest(array_length(river_job.attempted_by, 1) + 1 - @max_attempted_by, 0)
    )

Not worth trying to change it at this point though.

We could also use more of a table-based approach like the SQLite version, but I left this as array functions since I suspect they're more performant.
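
For illustration only (this isn't code from the PR), a table-based Postgres version might unnest the array, keep the newest entries, and re-aggregate:

    attempted_by = (
        SELECT array_agg(client ORDER BY ord)
        FROM unnest(
            array_append(coalesce(river_job.attempted_by, '{}'::text[]), @attempted_by::text)
        ) WITH ORDINALITY AS t(client, ord)
        -- keep only the trailing @max_attempted_by entries
        WHERE ord > cardinality(coalesce(river_job.attempted_by, '{}'::text[])) + 1 - @max_attempted_by::int
    )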

bgentry commented:
This might be just a little cleaner than the CASE?

diff --git a/riverdriver/riverdatabasesql/internal/dbsqlc/river_job.sql.go b/riverdriver/riverdatabasesql/internal/dbsqlc/river_job.sql.go
index df48a3a..9025b95 100644
--- a/riverdriver/riverdatabasesql/internal/dbsqlc/river_job.sql.go
+++ b/riverdriver/riverdatabasesql/internal/dbsqlc/river_job.sql.go
@@ -281,18 +281,14 @@ SET
     state = 'running',
     attempt = river_job.attempt + 1,
     attempted_at = coalesce($1::timestamptz, now()),
-    attempted_by = CASE WHEN array_length(river_job.attempted_by, 1) >= $2::int
-                        THEN array_append(
-                            -- +2 instead of +1 because in one of those history
-                            -- making mistakes that's likely in aggregate cost
-                            -- humanity >$10B in bugs and lost productivity by
-                            -- now, like strings, Postgres array indexing start
-                            -- at 1 instead of 0.
-                            river_job.attempted_by[array_length(river_job.attempted_by, 1) + 2 - $2:],
-                            $3::text
-                        )
-                        ELSE array_append(river_job.attempted_by, $3::text)
-                        END
+    attempted_by = (
+        coalesce(river_job.attempted_by, '{}'::text[])
+        || $2::text
+        -- +2 instead of +1 because, in one of those history-making mistakes
+        -- that's likely cost humanity >$10B in bugs and lost productivity by
+        -- now, Postgres array indexing, like string indexing, starts at 1
+        -- instead of 0.
+    )[ GREATEST(cardinality(river_job.attempted_by) + 2 - $3::int, 1) : ]
 FROM
     locked_jobs
 WHERE
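
To make the slice arithmetic concrete: with the cap $3 = 100 and an array already at 100 entries, the appended array has 101 elements and the lower bound is GREATEST(100 + 2 - 100, 1) = 2, so the slice keeps elements 2 through 101 and the oldest falls off. With only 5 entries, the bound clamps to 1 and the whole appended array is kept.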

brandur (author) commented:

I modified the existing one to flatten it out a bit more, but TBH I have trouble reading this new version, so I'm going to largely leave it as the original.

brandur requested a review from bgentry on July 5, 2025 at 21:15.
bgentry left a review:
One idea to improve but otherwise LGTM!

brandur force-pushed the brandur-max-attempted-by-length branch from 55a1f73 to 78c9baa on July 7, 2025 at 23:06.
brandur force-pushed the brandur-max-attempted-by-length branch from d7d9cc0 to 90ee8dc on July 7, 2025 at 23:10.
brandur merged commit bec6982 into master on July 7, 2025; 10 checks passed.
brandur deleted the brandur-max-attempted-by-length branch on July 7, 2025 at 23:15.