Skip to content

Conversation

@mcollina
Copy link
Member

@mcollina mcollina commented Feb 8, 2026

Summary

Fix releaseWritingBuf() to correctly handle partial writes that split multi-byte UTF-8 characters.

The previous implementation incorrectly converted byte counts to character counts by using:

n = Buffer.from(writingBuf).subarray(0, n).toString().length;

When n bytes cuts through a multi-byte character, the incomplete UTF-8 sequence becomes U+FFFD (replacement character) via .toString(), which has a different .length than the original character. This caused:

  • 3-byte characters (CJK) to be silently dropped
  • 4-byte characters (emoji) to leave lone surrogates in the buffer

The fix backs up from the byte position to find a valid UTF-8 character boundary by checking for continuation bytes (pattern 10xxxxxx), then decodes the properly-aligned bytes to get the correct character count.

Also fixes a typo where this._asyncDrainScheduled was used instead of the private field this.#asyncDrainScheduled.

Fixes: #61744

Fix releaseWritingBuf() to correctly handle partial writes that split
multi-byte UTF-8 characters. The previous implementation incorrectly
converted byte counts to character counts, causing:

- 3-byte characters (CJK) to be silently dropped
- 4-byte characters (emoji) to leave lone surrogates in the buffer

The fix backs up from the byte position to find a valid UTF-8 character
boundary by checking for continuation bytes (pattern 10xxxxxx), then
decodes the properly-aligned bytes to get the correct character count.

Also fixes a typo where this._asyncDrainScheduled was used instead of
the private field this.#asyncDrainScheduled.

Fixes: nodejs#61744
@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/streams

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. stream Issues and PRs related to the stream subsystem. labels Feb 8, 2026
@codecov
Copy link

codecov bot commented Feb 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.74%. Comparing base (f77a709) to head (4f5cf65).
⚠️ Report is 59 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #61745      +/-   ##
==========================================
- Coverage   89.74%   89.74%   -0.01%     
==========================================
  Files         674      675       +1     
  Lines      204360   204537     +177     
  Branches    39265    39308      +43     
==========================================
+ Hits       183412   183569     +157     
- Misses      13251    13257       +6     
- Partials     7697     7711      +14     
Files with missing lines Coverage Δ
lib/internal/streams/fast-utf8-stream.js 74.57% <100.00%> (+0.33%) ⬆️

... and 44 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@mcollina mcollina added the request-ci Add this label to start a Jenkins CI on a PR. label Feb 9, 2026
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Feb 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-ci PRs that need a full CI run. stream Issues and PRs related to the stream subsystem.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

UTF-8 character corruption in fast-utf8-stream.js via releaseWritingBuf() leads to silent data loss on partial writes

4 participants