Overview
The Gossip protocol in Agave implements a distributed, eventually consistent communication layer that allows validators to share cluster-wide information. It uses a Cluster Replicated Data Store (CRDS) with push and pull mechanisms to propagate data efficiently across thousands of nodes without centralized coordination.
Architecture
Core Components
ClusterInfo (cluster_info.rs:150-200): The main gossip coordinator that manages the local CRDS, handles incoming/outgoing messages, and maintains network topology.
CRDS (crds.rs:66-85): The replicated data store mapping CrdsValueLabel -> CrdsValue, supporting partial updates and concurrent access.
CrdsGossip: Coordinates push and pull gossip strategies.
CrdsGossipPull (crds_gossip_pull.rs:1-150): Implements anti-entropy pull requests using bloom filters.
CrdsGossipPush: Implements eager push dissemination to a subset of peers.
ContactInfo (contact_info.rs): Contains node identity and network endpoints (gossip, TVU, TPU, etc.).
Network Topology
Gossip organizes the network in layers:
- Layer 0: Leader nodes
- Layer 1: As many nodes as possible (efficient fan-out)
- Layer 2: Remaining nodes (can fit 2^20 nodes if layer 1 has 2^10)
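The layer sizes above follow from repeated fan-out. A quick arithmetic sketch (the per-hop fanout of 2^10 is an assumption consistent with the numbers above, not a constant from the code):

```rust
// Sketch (not Agave code): capacity of a layered gossip network where
// each node in a layer can reach `fanout` nodes in the layer below.
fn layer_capacity(fanout: u64, layer: u32) -> u64 {
    fanout.pow(layer)
}

fn main() {
    let fanout: u64 = 1 << 10; // 2^10 nodes reachable per hop (assumed)
    assert_eq!(layer_capacity(fanout, 1), 1 << 10); // layer 1: 1,024 nodes
    assert_eq!(layer_capacity(fanout, 2), 1 << 20); // layer 2: 1,048,576 nodes
}
```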
CRDS (Cluster Replicated Data Store)
Data Model
From crds.rs:66-85, the CRDS stores versioned values:
- Partial Updates: Each CrdsValueLabel maps to one CrdsValue
- Non-Atomic: Full record updates are not atomic
- Versioned: Each value has a wallclock timestamp and local metadata
- Sharded: CRDS_SHARDS_BITS = 12 (4096 shards) for parallelism
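One way the sharding can work is to index by the top bits of a value's hash. This is an illustrative sketch: the constant matches the docs above, but the exact indexing scheme in crds.rs is an assumption.

```rust
// Illustrative sharding by the top CRDS_SHARDS_BITS of a value's 64-bit
// hash, spreading values evenly across shards for parallel access.
const CRDS_SHARDS_BITS: u32 = 12; // 4096 shards, per the docs above

fn shard_index(hash_u64: u64) -> usize {
    (hash_u64 >> (64 - CRDS_SHARDS_BITS)) as usize
}

fn main() {
    assert_eq!(shard_index(0), 0);
    assert_eq!(shard_index(u64::MAX), (1 << CRDS_SHARDS_BITS) - 1); // 4095
    assert!(shard_index(0x1234_5678_9ABC_DEF0) < (1 << CRDS_SHARDS_BITS));
}
```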
CrdsValue Types
CRDS stores various cluster information:
- ContactInfo: Node identity and network endpoints
- Vote: Validator vote state
- LowestSlot: Oldest slot a node has available
- SnapshotHashes: Available snapshot information
- EpochSlots: Slots the node has observed
- DuplicateShred: Evidence of duplicate-block production (a slashable condition)
Merge Strategy
From crds.rs:187-200, values are merged using the overrides function, which decides whether an incoming value replaces the currently stored one.
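A hedged sketch of what an overrides rule can look like, assuming the newer wallclock wins with ties broken by a value hash so replicas converge deterministically (the real function in crds.rs is more involved):

```rust
// Sketch of an "overrides" rule: newer wallclock wins; ties break on a
// value hash so all nodes converge on the same winner (assumed rule).
struct Versioned {
    wallclock: u64,
    value_hash: u64,
}

fn overrides(incoming: &Versioned, stored: &Versioned) -> bool {
    // Lexicographic comparison: wallclock first, then hash as tiebreaker.
    (incoming.wallclock, incoming.value_hash) > (stored.wallclock, stored.value_hash)
}

fn main() {
    let stored = Versioned { wallclock: 100, value_hash: 7 };
    let newer = Versioned { wallclock: 200, value_hash: 1 };
    assert!(overrides(&newer, &stored));  // newer wallclock replaces
    assert!(!overrides(&stored, &newer)); // stale update is rejected
}
```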
Versioned CRDS Values
From crds.rs:121-132, the num_push_recv field tracks message origin:
- None: Local message or pull request
- Some(0): Received via pull response
- Some(k) where k > 0: Received via push with k-1 duplicates
Gossip Pull Protocol
Anti-Entropy Mechanism
The pull protocol implements anti-entropy to ensure eventual consistency:
- Construct Bloom Filter: Create a bloom filter of local CRDS data
- Send Pull Request: Ask random peer for data not in bloom filter
- Receive Response: Peer sends values not matching bloom filter
- Merge Data: Integrate new values into local CRDS
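The four steps above can be sketched end to end. This is a toy model, not Agave's implementation: the bloom filter here is minimal, values are plain `u64`s, and the hash mixing is illustrative.

```rust
use std::collections::HashSet;

// Minimal bloom filter sketch (Agave's CrdsFilter is more elaborate).
struct Bloom {
    bits: Vec<bool>,
    hashes: u32,
}

impl Bloom {
    fn new(num_bits: usize, hashes: u32) -> Self {
        Bloom { bits: vec![false; num_bits], hashes }
    }
    // Cheap per-hash-function mixing (illustrative, not Agave's hashing).
    fn index(&self, item: u64, k: u32) -> usize {
        let h = item.wrapping_mul(0x9E37_79B9_7F4A_7C15).rotate_left(k * 7);
        (h as usize) % self.bits.len()
    }
    fn add(&mut self, item: u64) {
        for k in 0..self.hashes {
            let i = self.index(item, k);
            self.bits[i] = true;
        }
    }
    fn contains(&self, item: u64) -> bool {
        (0..self.hashes).all(|k| self.bits[self.index(item, k)])
    }
}

// Step 3: the peer answers with values NOT matching the filter.
fn pull_response(peer_data: &HashSet<u64>, filter: &Bloom) -> Vec<u64> {
    peer_data.iter().copied().filter(|v| !filter.contains(*v)).collect()
}

fn main() {
    let local: HashSet<u64> = [1, 2, 3].into_iter().collect();
    let peer: HashSet<u64> = [2, 3, 4, 5].into_iter().collect();

    // Step 1: build a bloom filter over local CRDS data.
    let mut filter = Bloom::new(1024, 3);
    for v in &local {
        filter.add(*v);
    }

    // Steps 2-3: send the filter; peer replies with uncovered values.
    let resp = pull_response(&peer, &filter);

    // Step 4: merge; nothing already held locally comes back (no false
    // negatives), though bloom false positives can suppress some values.
    for v in &resp {
        assert!(!local.contains(v));
    }
    assert!(resp.len() <= 2); // only 4 and 5 can possibly be returned
}
```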
See crds_gossip_pull.rs:1-13.
Bloom Filter Configuration
The bloom filter parameters are defined in crds_gossip_pull.rs:53-57.
CrdsFilter
The CrdsFilter structure (crds_gossip_pull.rs:60-83):
- Higher mask_bits = smaller subset of data
- Allows gradual sync of large datasets
- Computed from ratio of items to max capacity
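The third bullet can be made concrete. The exact formula below is an assumption: pick just enough mask bits that each filter's slice of the key space stays under the filter's item capacity.

```rust
// Sketch: derive mask_bits from the ratio of items to capacity, so that
// num_items / 2^mask_bits <= max_items (assumed formula, illustrative).
fn mask_bits(num_items: f64, max_items: f64) -> u32 {
    (num_items / max_items).log2().ceil().max(0.0) as u32
}

fn main() {
    assert_eq!(mask_bits(100.0, 1000.0), 0);     // small set: one filter covers all
    assert_eq!(mask_bits(4000.0, 1000.0), 2);    // split key space into 2^2 slices
    assert_eq!(mask_bits(100_000.0, 1000.0), 7); // larger set => higher mask_bits
}
```

This matches the bullets above: the more items relative to capacity, the higher mask_bits, and the smaller the slice of data each filter covers.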
Pull Request Flow
From crds_gossip_pull.rs:67-73, pull requests are sent once every five gossip rounds (PULL_REQUEST_PERIOD = 5).
Gossip Push Protocol
Eager Push Dissemination
The push protocol eagerly sends new data to peers:
- New Local Value: Node creates or receives a new CRDS value
- Select Peers: Choose subset of cluster based on stake weight
- Push Message: Send value to selected peers
- Prune Messages: Peers can request to be removed from push path
Push messages are capped at a maximum payload size (PUSH_MESSAGE_MAX_PAYLOAD_SIZE).
Push Deduplication
The CRDS tracks push duplicates in VersionedCrdsValue.num_push_recv:
- First push: num_push_recv = Some(1)
- Subsequent pushes: Increment the counter
- Metrics track redundant pushes for monitoring
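The counter behavior described above can be sketched in a few lines. The assumption is simply that each push receipt bumps an `Option<u32>` starting from `Some(1)`:

```rust
// Sketch of the num_push_recv bookkeeping: None until the value arrives
// via push, then Some(k) after k push receipts (assumed from the docs).
fn record_push(num_push_recv: &mut Option<u32>) {
    *num_push_recv = Some(num_push_recv.unwrap_or(0) + 1);
}

fn main() {
    let mut n: Option<u32> = None; // value not yet seen via push
    record_push(&mut n);
    assert_eq!(n, Some(1)); // first push
    record_push(&mut n);
    assert_eq!(n, Some(2)); // k = 2 => k - 1 = 1 redundant duplicate
}
```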
Stake-Weighted Selection
Peers are selected for push based on stake weight:
- Higher-stake nodes are more likely to be selected
- Ensures critical validators receive updates quickly
- Uses WeightedShuffle for stake-proportional randomization
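A minimal sketch of stake-proportional selection: pick one peer with probability proportional to stake, given a uniform roll in `[0, total_stake)`. Agave's WeightedShuffle generalizes this idea to a full stake-weighted permutation; the code below is illustrative, not its implementation.

```rust
// Illustrative stake-weighted pick: walk the cumulative stake until the
// roll falls inside a peer's interval, so selection odds match stake.
fn weighted_pick<'a>(stakes: &[(&'a str, u64)], roll: u64) -> &'a str {
    let mut acc = 0u64;
    for &(peer, stake) in stakes {
        acc += stake;
        if roll < acc {
            return peer;
        }
    }
    unreachable!("roll must be less than total stake");
}

fn main() {
    let stakes = [("low", 10), ("high", 80), ("mid", 10)]; // total = 100
    // 80 of the 100 possible rolls (10..=89) land on the high-stake peer,
    // so high-stake validators receive updates first most of the time.
    assert_eq!(weighted_pick(&stakes, 5), "low");
    assert_eq!(weighted_pick(&stakes, 50), "high");
    assert_eq!(weighted_pick(&stakes, 95), "mid");
}
```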
Cluster Discovery
Entrypoints
Entrypoints (cluster_info.rs:156) bootstrap a node into the cluster:
- Used for initial cluster discovery
- Node sends pull requests to entrypoints
- Learns about other cluster members from responses
Contact Info Propagation
Each node publishes its ContactInfo containing:
- Pubkey: Node identity
- Gossip Socket: Where to send gossip messages
- TVU Socket: Transaction Verification Unit endpoint
- TVU Forwards: For forwarding shreds
- TPU Socket: Transaction Processing Unit endpoint
- TPU Forwards: For forwarding transactions
- RPC Socket: JSON RPC endpoint (if enabled)
- Wallclock: Timestamp for versioning
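The fields above can be pictured as a plain struct. This is a field-for-field sketch of the list, not the real type: contact_info.rs uses a more compact representation (the field names and types here are assumptions).

```rust
use std::net::SocketAddr;

// Sketch mirroring the ContactInfo fields listed above (illustrative;
// the actual struct in contact_info.rs encodes sockets differently).
struct ContactInfoSketch {
    pubkey: [u8; 32],        // node identity
    gossip: SocketAddr,      // where to send gossip messages
    tvu: SocketAddr,         // shred reception
    tvu_forwards: SocketAddr,
    tpu: SocketAddr,         // transaction processing
    tpu_forwards: SocketAddr,
    rpc: Option<SocketAddr>, // present only if RPC is enabled
    wallclock: u64,          // timestamp for versioning
}

fn main() {
    let info = ContactInfoSketch {
        pubkey: [0u8; 32],
        gossip: "127.0.0.1:8001".parse().unwrap(),
        tvu: "127.0.0.1:8002".parse().unwrap(),
        tvu_forwards: "127.0.0.1:8003".parse().unwrap(),
        tpu: "127.0.0.1:8004".parse().unwrap(),
        tpu_forwards: "127.0.0.1:8005".parse().unwrap(),
        rpc: None, // RPC disabled on this node
        wallclock: 42,
    };
    assert_eq!(info.gossip.port(), 8001);
    assert!(info.rpc.is_none());
}
```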
Gossip Timing
Core Intervals
From cluster_info.rs:97-102, each gossip round performs the following:
- Process incoming messages
- Send push messages (every round)
- Send pull requests (every 5 rounds)
- Prune inactive peers
- Update metrics
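The round structure above can be sketched as a schedule: push every round, pull every fifth round (PULL_REQUEST_PERIOD = 5, from the pull section). The action names are illustrative.

```rust
// Sketch of the per-round schedule described above (action names are
// illustrative labels, not functions in cluster_info.rs).
const PULL_REQUEST_PERIOD: u64 = 5; // pull every 5th round, per the docs

fn actions_for_round(round: u64) -> Vec<&'static str> {
    let mut actions = vec!["process_incoming", "send_push", "prune", "metrics"];
    if round % PULL_REQUEST_PERIOD == 0 {
        actions.push("send_pull");
    }
    actions
}

fn main() {
    assert!(actions_for_round(3).contains(&"send_push"));  // push: every round
    assert!(!actions_for_round(3).contains(&"send_pull")); // pull: not yet
    assert!(actions_for_round(5).contains(&"send_pull"));  // pull: 5th round
}
```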
Timeouts and Cleanup
- CRDS Timeout: 15 seconds (CRDS_GOSSIP_PULL_CRDS_TIMEOUT_MS)
- Ping Cache TTL: 1280 seconds (GOSSIP_PING_CACHE_TTL)
- Failed Inserts: 20 seconds (FAILED_INSERTS_RETENTION_MS)
Network Layer
Socket Management
ClusterInfo manages multiple UDP sockets:
- Gossip Socket: Primary gossip communication
- TVU Sockets: Multiple sockets for receiving shreds
  - Minimum: 1 (MINIMUM_NUM_TVU_RECEIVE_SOCKETS)
  - Default: 1 (DEFAULT_NUM_TVU_RECEIVE_SOCKETS)
- TVU Retransmit: 12 sockets by default (DEFAULT_NUM_TVU_RETRANSMIT_SOCKETS)
Channel Capacity
Channel capacities are defined in cluster_info.rs:106-121.
Ping/Pong Protocol
Nodes exchange pings to verify liveness:
- Ping Cache: 126,976-entry capacity (GOSSIP_PING_CACHE_CAPACITY)
- Rate Limiting: 1280/64 = 20-second delay between pings
- Token-based: Pings contain random tokens, verified in pongs
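The 20-second figure above is the cache TTL divided by 64; a tiny sketch of that arithmetic (the reading that the delay is derived as TTL/64 is an assumption from the bullet above):

```rust
// Rate-limit delay as described above: GOSSIP_PING_CACHE_TTL / 64.
const GOSSIP_PING_CACHE_TTL_SECS: u64 = 1280; // from the timeouts section

fn ping_rate_limit_delay_secs() -> u64 {
    GOSSIP_PING_CACHE_TTL_SECS / 64
}

fn main() {
    assert_eq!(ping_rate_limit_delay_secs(), 20); // 1280 / 64 = 20 seconds
}
```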
Data Propagation
Push Fanout
Push messages are sent to a calculated subset:
- Prune Data: Max 32 nodes per prune message (MAX_PRUNE_DATA_NODES)
- Payload Limits:
  - Push: PUSH_MESSAGE_MAX_PAYLOAD_SIZE
  - Pull Response: PULL_RESPONSE_MAX_PAYLOAD_SIZE
Pull Response Sizing
Pull responses are sized to fit in packets:
- Minimum serialized size is tracked (PULL_RESPONSE_MIN_SERIALIZED_SIZE)
- Responses are split across multiple messages if needed
- Ensures efficient network utilization
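The splitting step can be sketched as greedy bin packing: accumulate serialized values until the next one would exceed the payload limit, then start a new message. Sizes and the packing strategy here are illustrative, not Agave's exact logic.

```rust
// Sketch: split serialized value sizes (bytes) into chunks, each chunk
// staying within max_payload, as a pull response sender might.
fn split_response(value_sizes: &[usize], max_payload: usize) -> Vec<Vec<usize>> {
    let mut chunks: Vec<Vec<usize>> = vec![];
    let mut current: Vec<usize> = vec![];
    let mut used = 0;
    for &size in value_sizes {
        // Start a new chunk when this value would overflow the payload
        // (an oversized lone value still gets its own chunk).
        if used + size > max_payload && !current.is_empty() {
            chunks.push(std::mem::take(&mut current));
            used = 0;
        }
        current.push(size);
        used += size;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    let chunks = split_response(&[400, 500, 300, 200], 1000);
    // 400 + 500 fits in one packet; 300 would overflow, so it starts the next.
    assert_eq!(chunks, vec![vec![400, 500], vec![300, 200]]);
}
```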
Duplicate Shred Handling
Duplicate shreds (slashing evidence) have special handling:
- Tracked separately in CRDS (duplicate_shreds BTreeMap)
- Maximum payload size: DUPLICATE_SHRED_MAX_PAYLOAD_SIZE
- Prioritized propagation for security
Metrics and Monitoring
GossipStats
From cluster_info.rs:160, GossipStats collects comprehensive metrics:
- Messages sent/received
- Push/pull counts
- Duplicate detection
- Timing information
- Signature sampling
CRDS Statistics
CRDS-level statistics are defined in crds.rs:104-118.
Signature Sampling
From crds.rs:60-64, a sparse sample of message signatures is logged.
Configuration
Capacity Limits
Capacity limits are defined in cluster_info.rs:128.
Contact Info Management
- Debug: Log contact info every 10 seconds
- Save: Persist contact info every 60 seconds
Key Files
- cluster_info.rs:1-200 - Main ClusterInfo coordination
- crds.rs:1-200 - CRDS data store implementation
- crds_gossip_pull.rs:1-150 - Pull protocol with bloom filters
- crds_gossip_push.rs - Push protocol implementation
- contact_info.rs - Node contact information
- crds_value.rs - CRDS value types
- crds_data.rs - CRDS data variants
Related Components
- ClusterInfoVoteListener: Monitors votes from gossip
- TVU: Uses gossip for shred reception coordination
- RepairService: Uses cluster info for repair requests
- ServeRepair: Responds to repair requests using cluster topology