Skip to content

feat(grpc): Implement PickFirst load balancer#2570

Open
nathanielford wants to merge 41 commits into
hyperium:masterfrom
nathanielford:implement/PickFirstLB
Open

feat(grpc): Implement PickFirst load balancer#2570
nathanielford wants to merge 41 commits into
hyperium:masterfrom
nathanielford:implement/PickFirstLB

Conversation

@nathanielford
Copy link
Copy Markdown
Collaborator

@nathanielford nathanielford commented Mar 25, 2026

Motivation

Full implementation of the pick first load balancer, including 'Happy eyeballs' features.

Solution

Load balancing implementation to pick the first available endpoint to connect to, maintaining stickiness across endpoint updates if configured. Handles accepting new LB configuration and subchannel reconstruction.

Prototype is at https://github.com/nathanielford/grpc-rust-testbed/tree/main/pick_first_lib

Notes

  • Ended up including all happy eyeball features because it wasn't clear where best to slice the line. Considering this a full implementation, and it should be reviewed as such.
  • This does use tokio::spawn and tokio::time, which may need to be replaced to make things runtime agnostic. Please comment in the PR whether this is the case.

@nathanielford nathanielford requested a review from dfawley March 25, 2026 15:30
@nathanielford nathanielford self-assigned this Mar 25, 2026
@nathanielford nathanielford requested review from arjan-bal and removed request for dfawley March 25, 2026 16:17
@nathanielford nathanielford force-pushed the implement/PickFirstLB branch 3 times, most recently from a442272 to 73397ff Compare March 25, 2026 17:29
Copy link
Copy Markdown
Collaborator

@arjan-bal arjan-bal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving initial comments while I review the remaining changes.

Comment thread grpc/src/client/load_balancing/pick_first.rs
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs
@arjan-bal arjan-bal removed their assignment Mar 30, 2026
@nathanielford nathanielford force-pushed the implement/PickFirstLB branch from 73397ff to bdf7fc3 Compare April 6, 2026 19:32
@nathanielford nathanielford force-pushed the implement/PickFirstLB branch from 7ea10ad to e1bcaf4 Compare April 27, 2026 20:55
@nathanielford nathanielford requested a review from arjan-bal April 27, 2026 21:38
@nathanielford nathanielford marked this pull request as ready for review April 27, 2026 21:39
@arjan-bal arjan-bal assigned arjan-bal and unassigned nathanielford Apr 28, 2026
…A61 endpoint handling

This commit implements the PickFirst load balancer policy for Tonic gRPC, focusing on:
- Efficient subchannel management with backoff preservation.
- "Stickiness" support: continuing to use an existing Ready subchannel if it remains in resolver updates.
- Compliance with gRFC A61: endpoints are now shuffled before being flattened into an address list, ensuring multiple addresses for a single endpoint (e.g., IPv4/IPv6) stay together.
- Clean state reset: subchannels and selected state are now cleared when receiving an empty address list.
- Alignment with the updated synchronous testing framework in master.

Includes comprehensive test coverage for basic connection, failover, stickiness, exhaustion, deterministic endpoint shuffling, de-duplication, and empty updates.
…active failover

This change enhances the PickFirst load balancing policy to better support
gRFC A61 (Happy Eyeballs) and improve connection establishment latency.

Key changes:
- Implement IPv6/IPv4 address interleaving in `compile_address` to ensure
  subsequent connection attempts alternate between protocol families.
- Introduce a `subchannel_states` cache in `PickFirstPolicy` to track the
  connectivity status of managed subchannels.
- Refactor connection logic to use a `frontier_index` and proactively skip
  subchannels known to be in `TransientFailure` (e.g., during backoff).
- Update `advance_frontier` to safely maintain the index within the bounds
  of the address list, ensuring the policy remains reactive to recovery.
- Add deterministic unit tests for shuffling and interleaving logic.
@nathanielford
Copy link
Copy Markdown
Collaborator Author

Note that this now includes #2631, which will merge first so as to keep a clearer record. However, I needed the fix to get the tests ported from Go to work.

Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
@dfawley dfawley assigned arjan-bal and unassigned nathanielford May 20, 2026
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated

// Should NOT have any more events (no Connect, no UpdatePicker),
// because it stuck to the original selected subchannel.
std::thread::sleep(Duration::from_millis(50));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use tokio::time::timeout or tokio::time::sleep to avoid blocking the entire thread. Same comment for other tests.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Tokio only runs tests with a single thread, so there's no way for the code under test to fail if the thread is blocked. (https://docs.rs/tokio/latest/tokio/attr.test.html#current-thread-runtime)

I've replaced everything to use helpers to pull from the channel.

This might be less perfect for test cases that know a call into the policy should trigger an event inline, but I think it's OK. Let me know what you think.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be less perfect for test cases that know a call into the policy should trigger an event inline.

Looks good to me. I didn't notice that the mpsc channel was not an async channel, so I didn't consider the result of calling recv.

Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment on lines +1208 to +1221
let mut found = None;
for _ in 0..10 {
match rx.try_recv() {
Ok(event) => {
found = Some(event);
break;
}
Err(std::sync::mpsc::TryRecvError::Empty) => {
// Yield to runtime to allow timer task to run.
tokio::time::sleep(std::time::Duration::from_millis(10)).await;
}
Err(e) => panic!("error recv: {:?}", e),
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of looping, we can use a tokio::time::timout of 10s to wrap the rx.recv. We should define 10s as a global const named DEFAULT_TEST_DURATION and re-use it for other tests. For events not expected to happen, we should define a constant DEFAULT_TEST_SHORT_DURATION and set it to 10 millis.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. But note that we can't use recv, as that will block the thread. PTAL.

There's an unfortunate difference between go and rust, which is that we don't get the concept of a context with a deadline in it from the start. So if we use the 10s timeout for every operation, it doesn't apply to the whole test. We could follow this convention, but I don't know if it's worth it. For now I'm just making every thing that could timeout use 10s. I don't think we'll have an issue where things will run really slowly and take many tens of seconds and then timeout, so I think this is fine.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But note that we can't use recv, as that will block the thread. PTAL.

I didn't notice that the mpsc channel was not an async channel. I suspect this is because the TestChannelController is using a std::sync::mpsc instead of tokio mpsc channel, making the receiver blocking. In the DNS resolver tests, I've used an unbounded tokio mspc channel for which sends are sync, but receives can be awaited:

pub(crate) fn new_pair() -> (Self, mpsc::UnboundedReceiver<ResolverUpdate>) {
let (update_tx, update_rx) = mpsc::unbounded_channel();
let cc = Self {
update_result: Ok(()),
update_tx,
};
(cc, update_rx)
}

Maybe a similar pattern can avoid the need for polling in the pickfirst tests? This can be attempted in a later PR.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the test event channel should just be synchronous since all accesses are done synchronously (from non-async lb policy methods). But the work scheduler should probably be an async channel. We can do that in a follow-up?

Comment thread grpc/src/client/load_balancing/pick_first.rs Outdated
Comment thread grpc/src/client/load_balancing/pick_first.rs
Comment thread grpc/src/client/load_balancing/pick_first.rs

// If the timer expires during a connection pass, the LB should advance to
// the next subchannel and trigger a connection attempt.
#[tokio::test(start_paused = true)]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to use start_paused for this test specifically?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why it would have been like that. Maybe something about the sleep stuff below? I think this is fixed now.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can revert the Cargo.toml changes now.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Collaborator

@arjan-bal arjan-bal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-test changes look good. Left some minor comments on the test code.

@arjan-bal arjan-bal assigned dfawley and unassigned arjan-bal May 21, 2026
@arjan-bal arjan-bal changed the title Implement PickFirst load balancer feat(grpc): Implement PickFirst load balancer May 21, 2026
Copy link
Copy Markdown
Collaborator

@arjan-bal arjan-bal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread grpc/src/client/load_balancing/pick_first.rs

// If the timer expires during a connection pass, the LB should advance to
// the next subchannel and trigger a connection attempt.
#[tokio::test(start_paused = true)]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can revert the Cargo.toml changes now.


// Should NOT have any more events (no Connect, no UpdatePicker),
// because it stuck to the original selected subchannel.
std::thread::sleep(Duration::from_millis(50));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be less perfect for test cases that know a call into the policy should trigger an event inline.

Looks good to me. I didn't notice that the mpsc channel was not an async channel, so I didn't consider the result of calling recv.

Comment on lines +1208 to +1221
let mut found = None;
for _ in 0..10 {
match rx.try_recv() {
Ok(event) => {
found = Some(event);
break;
}
Err(std::sync::mpsc::TryRecvError::Empty) => {
// Yield to runtime to allow timer task to run.
tokio::time::sleep(std::time::Duration::from_millis(10)).await;
}
Err(e) => panic!("error recv: {:?}", e),
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But note that we can't use recv, as that will block the thread. PTAL.

I didn't notice that the mpsc channel was not an async channel. I suspect this is because the TestChannelController is using a std::sync::mpsc instead of tokio mpsc channel, making the receiver blocking. In the DNS resolver tests, I've used an unbounded tokio mspc channel for which sends are sync, but receives can be awaited:

pub(crate) fn new_pair() -> (Self, mpsc::UnboundedReceiver<ResolverUpdate>) {
let (update_tx, update_rx) = mpsc::unbounded_channel();
let cc = Self {
update_result: Ok(()),
update_tx,
};
(cc, update_rx)
}

Maybe a similar pattern can avoid the need for polling in the pickfirst tests? This can be attempted in a later PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants