-
Notifications
You must be signed in to change notification settings - Fork 10
Fix use-after-free in IOCP socket service shutdown #131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
📝 Walkthrough🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #131 +/- ##
===========================================
+ Coverage 81.03% 81.21% +0.17%
===========================================
Files 64 64
Lines 5685 5685
===========================================
+ Hits 4607 4617 +10
+ Misses 1078 1068 -10 see 2 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
|
An automated preview of the documentation is available at https://131.corosio.prtest3.cppalliance.org/index.html If more commits are pushed to the pull request, the docs will rebuild at the same URL. 2026-02-12 20:56:03 UTC |
|
GCOVR code coverage report https://131.corosio.prtest3.cppalliance.org/gcovr/index.html Build time: 2026-02-12 21:04:15 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/corosio/src/detail/iocp/sockets.cpp`:
- Around line 620-633: The destructor currently deletes nodes returned by
socket_wrapper_list_.pop_front() and acceptor_wrapper_list_.pop_front(), but
those popped nodes keep stale next_/prev_ pointers that later cause list
corruption when their destructor calls unregister_impl() -> remove(); fix by
ensuring pop_front() clears the popped node's linkage (set node->next_ =
node->prev_ = nullptr) so deleted nodes are clearly unlinked, or alternatively
harden remove() in intrusive.hpp to detect and ignore already-unlinked nodes
(e.g., verify a node is linked to this list before dereferencing prev_/next_);
target the pop_front implementation and/or remove/unregister_impl functions to
implement this defensive reset/check.
| win_sockets:: | ||
| ~win_sockets() | ||
| { | ||
| // Delete wrappers that survived shutdown. This runs after | ||
| // win_scheduler is destroyed (reverse creation order), so | ||
| // all coroutine frames and their tcp_socket members are gone. | ||
| for (auto* w = socket_wrapper_list_.pop_front(); w != nullptr; | ||
| w = socket_wrapper_list_.pop_front()) | ||
| delete w; | ||
|
|
||
| for (auto* w = acceptor_wrapper_list_.pop_front(); w != nullptr; | ||
| w = acceptor_wrapper_list_.pop_front()) | ||
| delete w; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, find the file and examine the destructor context
cat -n src/corosio/src/detail/iocp/sockets.cpp | sed -n '610,640p'Repository: cppalliance/corosio
Length of output: 1035
🏁 Script executed:
# Find the list member declarations to identify the list type
rg -n "socket_list_|acceptor_list_|socket_wrapper_list_|acceptor_wrapper_list_" src/corosio/src/detail/iocp/sockets.cpp -B2 -A2Repository: cppalliance/corosio
Length of output: 2035
🏁 Script executed:
# Find the win_sockets class definition
rg -n "class win_sockets|struct win_sockets" src/corosio/src/detail/iocp/ -A30Repository: cppalliance/corosio
Length of output: 8992
🏁 Script executed:
# Find member variable declarations in win_sockets class
rg -n "socket_list_|acceptor_list_|socket_wrapper_list_|acceptor_wrapper_list_" src/corosio/src/detail/iocp/sockets.hppRepository: cppalliance/corosio
Length of output: 320
🏁 Script executed:
# Search for intrusive_list or list template type definition
rg -n "intrusive_list|class.*list" src/corosio/src/detail/iocp/ -B2 -A2 | head -50Repository: cppalliance/corosio
Length of output: 3489
🏁 Script executed:
# Look for the list implementation in capy namespace or detail
find src -name "*.hpp" -o -name "*.h" | xargs rg -l "intrusive_list|pop_front.*remove" | head -10Repository: cppalliance/corosio
Length of output: 484
🏁 Script executed:
# Read the intrusive list implementation
cat -n src/corosio/src/detail/intrusive.hpp | head -200Repository: cppalliance/corosio
Length of output: 5998
🏁 Script executed:
# Look specifically for pop_front and remove implementations
rg -n "pop_front|void remove" src/corosio/src/detail/intrusive.hpp -A15Repository: cppalliance/corosio
Length of output: 418
🏁 Script executed:
# Look at the full remove implementation and check node initialization
rg -n "remove.*noexcept" src/corosio/src/detail/intrusive.hpp -A10Repository: cppalliance/corosio
Length of output: 352
🏁 Script executed:
# Check if there are constructors for win_socket_impl_internal or win_socket_impl
rg -n "win_socket_impl_internal::|win_socket_impl::" src/corosio/src/detail/iocp/sockets.cpp -A5 | head -40Repository: cppalliance/corosio
Length of output: 939
🏁 Script executed:
# Check the destructor chain: look for ~win_socket_impl and ~win_socket_impl_internal
rg -n "~win_socket_impl|unregister_impl" src/corosio/src/detail/iocp/sockets.cpp -B2 -A8 | head -60Repository: cppalliance/corosio
Length of output: 647
🏁 Script executed:
# Check if nodes could be re-added to socket_list_ between pop_front and remove
rg -n "socket_list_.push_back|socket_list_.push_front" src/corosio/src/detail/iocp/sockets.cppRepository: cppalliance/corosio
Length of output: 114
🏁 Script executed:
# Check the full unregister_impl implementations for both socket and acceptor
sed -n '691,697p' src/corosio/src/detail/iocp/sockets.cpp && echo "---" && sed -n '833,837p' src/corosio/src/detail/iocp/sockets.cppRepository: cppalliance/corosio
Length of output: 298
🏁 Script executed:
# Verify shutdown() completely drains both lists
sed -n '635,660p' src/corosio/src/detail/iocp/sockets.cppRepository: cppalliance/corosio
Length of output: 818
Do not call remove() on intrusive list nodes that were already returned by pop_front().
When shutdown() drains socket_list_ and acceptor_list_ via pop_front(), the returned nodes retain stale next_/prev_ pointers to their former neighbors. Later, when those nodes are deleted in ~win_sockets(), the destructor chain triggers unregister_impl() → remove() on these already-popped nodes.
The remove() implementation (intrusive.hpp lines 116–126) directly dereferences w->prev_ and w->next_ without validation, and conditionally updates head_ and tail_ based on those stale pointers. If a popped node was the first in the list (prev_ == nullptr), remove() will set head_ to a stale pointer; if it was the last (next_ == nullptr), remove() will set tail_ to a stale pointer. This silently corrupts the list structure, even though the lists are empty at this point.
This pattern works only because the comment correctly notes that "all coroutine frames and their tcp_socket members are gone" at destruction time and the lists are no longer accessed. However, the implementation is fragile and would fail silently if the shutdown sequence is ever changed. Either reset node pointers in pop_front() (see line 112 of intrusive.hpp—currently only updates the list, not the popped node), or document this contract explicitly and consider defensive checks in remove() to reject already-unlinked nodes.
🤖 Prompt for AI Agents
In `@src/corosio/src/detail/iocp/sockets.cpp` around lines 620 - 633, The
destructor currently deletes nodes returned by socket_wrapper_list_.pop_front()
and acceptor_wrapper_list_.pop_front(), but those popped nodes keep stale
next_/prev_ pointers that later cause list corruption when their destructor
calls unregister_impl() -> remove(); fix by ensuring pop_front() clears the
popped node's linkage (set node->next_ = node->prev_ = nullptr) so deleted nodes
are clearly unlinked, or alternatively harden remove() in intrusive.hpp to
detect and ignore already-unlinked nodes (e.g., verify a node is linked to this
list before dereferencing prev_/next_); target the pop_front implementation
and/or remove/unregister_impl functions to implement this defensive reset/check.
During execution_context destruction, win_sockets::shutdown() was eagerly deleting socket wrapper objects while the scheduler still had pending IOCP operations referencing them. When win_scheduler:: shutdown() later drained those operations and destroyed coroutine frames, the tcp_socket destructors inside those frames would call release() on already-freed wrappers, corrupting the heap. The fix follows the same pattern Asio uses in win_iocp_socket_service_base::base_shutdown(): shutdown() now only closes socket handles to force pending I/O to complete; wrapper deletion is deferred to ~win_sockets(), which runs after the scheduler has finished draining all outstanding operations. Also fixes accept_op's destroy path to properly release its pre-allocated accepted_socket and peer_wrapper before destroying the coroutine frame.
post_deferred_completions() was called with completed_ops_ by reference. When PostQueuedCompletionStatus failed, the recovery path did completed_ops_.splice(ops) where ops aliased completed_ops_, corrupting the intrusive queue and permanently losing operations. Leaked ops never decrement outstanding_work_ so run() hangs. Drain completed_ops_ into a local queue before calling post_deferred_completions, matching the epoll/kqueue backends.
Reset stop_event_posted_ flag in restart() so a stopped IOCP scheduler can run again. Set linger(true, 0) on server sockets in accept churn benchmarks to avoid TIME_WAIT accumulation.
Summary by CodeRabbit
Bug Fixes
Refactor