fix(routes): add IPIP return path on leader when local=false#7

Merged
kvaps merged 1 commit into main from fix/cilium-leader-return-ipip
Feb 16, 2026

@kvaps commented Feb 16, 2026

Summary

  • When running with --local=false and --compatibility=cilium, the leader did not create IPIP return routes for non-leader nodes in the same location.
  • Non-leaders encapsulated traffic to the leader via IPIP through Cilium's VXLAN overlay, but the leader sent replies directly via the physical interface. Those replies were dropped either by the cloud network (IP protocol 4 blocked) or by reverse path filtering.
  • Add policy rules on the leader (iif kilo0 → table 1107) and routes in that table via the overlay gateway (the Cilium internal IP), so that IPIP outer packets traverse the VXLAN tunnel back to non-leaders.
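As a rough illustration of the state this adds on the leader, the sketch below renders one route and one policy rule per non-leader as iproute2-style command strings. It is not the code from pkg/mesh/routes.go: the `peer` type, the `returnPathCommands` helper, and all addresses are hypothetical, and it assumes each non-leader has its own overlay gateway (its Cilium internal IP).

```go
package main

import (
	"fmt"
	"net/netip"
)

// peer is a hypothetical pairing of a non-leader's private IP with the
// overlay gateway (for example its Cilium internal IP) used to reach it.
type peer struct {
	PrivateIP string
	OverlayGW string
}

// returnPathCommands renders, for each peer, an onlink host route in the
// given table via the overlay gateway, plus a policy rule that sends traffic
// arriving on the WireGuard interface for that peer to the same table.
func returnPathCommands(peers []peer, table int, tunnelDev, wgDev string) ([]string, error) {
	var cmds []string
	for _, p := range peers {
		ip, err := netip.ParseAddr(p.PrivateIP)
		if err != nil {
			return nil, fmt.Errorf("private IP %q: %w", p.PrivateIP, err)
		}
		gw, err := netip.ParseAddr(p.OverlayGW)
		if err != nil {
			return nil, fmt.Errorf("overlay gateway %q: %w", p.OverlayGW, err)
		}
		// Host prefix: /32 for IPv4, /128 for IPv6.
		dst := netip.PrefixFrom(ip, ip.BitLen())
		cmds = append(cmds,
			fmt.Sprintf("ip route add %s via %s dev %s table %d onlink", dst, gw, tunnelDev, table),
			fmt.Sprintf("ip rule add iif %s to %s lookup %d", wgDev, dst, table),
		)
	}
	return cmds, nil
}

func main() {
	// Example addresses are made up for illustration.
	peers := []peer{
		{PrivateIP: "10.2.1.4", OverlayGW: "10.0.1.7"},
		{PrivateIP: "10.2.1.5", OverlayGW: "10.0.2.9"},
	}
	cmds, err := returnPathCommands(peers, 1107, "tunl0", "kilo0")
	if err != nil {
		panic(err)
	}
	for _, c := range cmds {
		fmt.Println(c)
	}
}
```

Rendering strings keeps the sketch free of external dependencies; the actual change configures routes and rules programmatically rather than shelling out to `ip`.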

Test plan

  • Deployed on a mixed cluster (Azure workers + bare-metal control plane) with --local=false --encapsulate=crosssubnet --compatibility=cilium
  • Verified leader creates correct routes: 10.2.1.x via <cilium_ip> dev tunl0 table 1107 onlink
  • Verified iif kilo0 policy rules are created for non-leader private IPs
  • Ping from remote zone to worker nodes works (was failing before)
  • kubectl exec on worker nodes works (was timing out before)
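The route check in this test plan could be scripted along the following lines. This is only a sketch: `routeUsesOverlay` is a hypothetical helper, the addresses are made up, and it assumes routes are available as one text line each in the form quoted above (destination first, then `via`, `dev`, and the `onlink` flag).

```go
package main

import (
	"fmt"
	"strings"
)

// routeUsesOverlay reports whether a single route line has dst as its
// destination, gw as its next hop, dev as its output device, and carries
// the onlink flag. Other fields (such as "table 1107") are ignored.
func routeUsesOverlay(line, dst, gw, dev string) bool {
	f := strings.Fields(line)
	if len(f) == 0 || f[0] != dst {
		return false
	}
	var via, oif string
	onlink := false
	for i := 1; i < len(f); i++ {
		switch f[i] {
		case "via":
			if i+1 < len(f) {
				via = f[i+1]
				i++
			}
		case "dev":
			if i+1 < len(f) {
				oif = f[i+1]
				i++
			}
		case "onlink":
			onlink = true
		}
	}
	return via == gw && oif == dev && onlink
}

func main() {
	// Hypothetical route line in the format shown in the test plan.
	line := "10.2.1.4 via 10.0.1.7 dev tunl0 table 1107 onlink"
	fmt.Println(routeUsesOverlay(line, "10.2.1.4", "10.0.1.7", "tunl0"))
}
```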

Summary by CodeRabbit

  • New Features
    • Improved mesh network routing to ensure traffic is properly tunneled through overlay gateways when operating in non-local contexts with encapsulation enabled.

When running with --local=false and --compatibility=cilium, the leader
node did not create IPIP return routes for non-leader nodes in the same
location. This caused asymmetric routing: non-leaders encapsulated
traffic to the leader via IPIP (through Cilium's VXLAN overlay), but
the leader sent replies directly via the physical interface, which could
be dropped by cloud networks blocking IP protocol 4 or by reverse path
filtering on the non-leaders.

Add a new routing block for the !local case that creates:
- Routes in table 1107 using the overlay gateway (Cilium internal IP)
  so IPIP outer packets traverse the VXLAN tunnel
- Policy rules matching traffic arriving on the WireGuard interface
  (iif kilo0) destined for non-leader private IPs

This only activates when the encapsulator returns a gateway different
from the node's private IP, i.e. when an overlay like Cilium is in use.
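The activation condition described here can be sketched as a small predicate. The function name `needsOverlayReturnRoute` and the example addresses are hypothetical; the real check lives in pkg/mesh/routes.go and may be structured differently.

```go
package main

import (
	"fmt"
	"net/netip"
)

// needsOverlayReturnRoute reports whether the leader should install an
// overlay return route for a node: only in the non-local case, and only when
// the encapsulator's gateway differs from the node's private IP (that is,
// when an overlay such as Cilium supplies its own gateway address).
func needsOverlayReturnRoute(local bool, gateway, privateIP string) (bool, error) {
	gw, err := netip.ParseAddr(gateway)
	if err != nil {
		return false, fmt.Errorf("gateway %q: %w", gateway, err)
	}
	ip, err := netip.ParseAddr(privateIP)
	if err != nil {
		return false, fmt.Errorf("private IP %q: %w", privateIP, err)
	}
	if local {
		return false, nil
	}
	// Unmap so that an IPv4-mapped IPv6 form compares equal to plain IPv4.
	return gw.Unmap() != ip.Unmap(), nil
}

func main() {
	// Overlay in use: the gateway is a (made-up) Cilium internal IP.
	fmt.Println(needsOverlayReturnRoute(false, "10.0.1.7", "10.2.1.4"))
	// No overlay: the gateway is the node's own private IP.
	fmt.Println(needsOverlayReturnRoute(false, "10.2.1.4", "10.2.1.4"))
}
```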

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

coderabbitai bot commented Feb 16, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

This change adds a new routing block in pkg/mesh/routes.go that generates additional routes for non-local scenarios, directing traffic destined to other nodes' private IPs through an overlay gateway via the tunnel interface when encapsulation is enabled.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Overlay Gateway Routing (pkg/mesh/routes.go) | Adds conditional routing logic to generate routes for non-local nodes' private IPs through an overlay gateway via IPIP tunnel, including corresponding policy rules and onlink routes when not in local location and encapsulation is enabled. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Through tunnels deep and gateways grand,
Routes now whisper across the land,
No local node left out of sight,
Overlay magic makes packets alight!
IPIP paths guide the way,
Mesh traffic hops by night and day. 🌙

@gemini-code-assist

Summary of Changes

Hello @kvaps, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical routing problem in environments using kilo with "--local=false" and Cilium compatibility. Previously, leader nodes would not correctly route IPIP return traffic to non-leader nodes, causing network communication failures. The changes introduce specific routing rules on the leader to ensure that IPIP encapsulated traffic is properly sent back through the overlay network, thereby restoring full connectivity and preventing packet drops.

Highlights

  • IPIP Return Path Fix: Resolved an issue where leader nodes, when configured with "--local=false" and "--compatibility=cilium", failed to establish correct IPIP return paths for non-leader nodes, leading to dropped packets.
  • Overlay Gateway Routing: Implemented routing rules on the leader to direct IPIP outer packets through the overlay gateway (Cilium internal IP) via the VXLAN tunnel, preventing direct replies that were blocked by cloud networks or reverse path filtering.
  • Enhanced Connectivity: Ensured proper functionality for cross-subnet communication, including pings from remote zones and "kubectl exec" commands on worker nodes, which were previously failing or timing out.

Changelog

  • pkg/mesh/routes.go
    • Added conditional logic to create IPIP return routes for non-leader nodes when "local" is false and an encapsulation strategy is active.
    • Included new "netlink.Route" and "netlink.Rule" configurations to utilize the overlay gateway for IPIP traffic.

Activity

  • Deployed the changes on a mixed cluster (Azure workers + bare-metal control plane) with "--local=false --encapsulate=crosssubnet --compatibility=cilium".
  • Verified that the leader node correctly creates routes like "10.2.1.x via <cilium_ip> dev tunl0 table 1107 onlink".
  • Confirmed the creation of "iif kilo0" policy rules for non-leader private IPs.
  • Validated that ping from remote zones to worker nodes now functions correctly.
  • Ensured "kubectl exec" on worker nodes no longer times out.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a fix for a routing issue where IPIP return paths were missing on leader nodes when local=false, particularly in environments using Cilium. The change correctly adds routes and rules to ensure return traffic is properly encapsulated and routed through the overlay network. The approach is sound, and I have one suggestion to ensure the source IP of the new route is explicitly set for correctness and consistency.

@kvaps marked this pull request as ready for review February 16, 2026 10:49
@kvaps merged commit 294b7b6 into main Feb 16, 2026
12 of 13 checks passed