AccountsDB

Overview

AccountsDB is Agave’s high-performance persistent storage system for account data. It provides concurrent append-only writes with parallel reads, using memory-mapped files for efficient data access. The architecture supports rapid account lookups, snapshot generation, and efficient storage reclamation.

Architecture

Core Components

AccountsDB consists of several integrated subsystems:

AccountsDb (accounts-db/src/accounts_db.rs): The main storage engine that manages persistent account data across memory-mapped AppendVec files.
AccountsIndex (accounts-db/src/accounts_index.rs): In-memory and disk-backed index mapping pubkeys to account storage locations, supporting both in-memory-only and disk-backed modes.
AccountsCache (accounts-db/src/accounts_cache.rs): Write cache that holds recently modified accounts until they are flushed to persistent storage.
Accounts (accounts-db/src/accounts.rs): Synchronization layer that handles account locks and coordinates access between AccountsDb and the index.

Storage Model

Accounts are stored in AppendVec files with the following characteristics:

Each AppendVec stores accounts for a single slot
A single appending writer with many concurrent readers
Memory-mapped files for zero-copy access
Stored at path: <path>/<pid>/data/

The system tracks updates using a global atomic write_version counter. Each commit increments this version, allowing the index to identify the latest account state across multiple slots.
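
As a minimal sketch of the versioning idea, the snippet below uses an atomic counter to hand out write versions and then picks the newest of several stored versions. `WriteVersionSketch`, `reserve`, and `latest` are hypothetical names chosen for illustration; they are not the identifiers used in accounts_db.rs.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a global write-version counter.
#[derive(Default)]
pub struct WriteVersionSketch(AtomicU64);

impl WriteVersionSketch {
    /// Reserve `count` consecutive versions and return the first one.
    /// `fetch_add` returns the previous value, so concurrent callers
    /// never receive overlapping ranges.
    pub fn reserve(&self, count: u64) -> u64 {
        self.0.fetch_add(count, Ordering::AcqRel)
    }
}

/// Given (write_version, payload) pairs for one account, return the
/// pair with the highest write version.
pub fn latest<T>(versions: &[(u64, T)]) -> Option<&(u64, T)> {
    versions.iter().max_by_key(|entry| entry.0)
}
```

A commit that stores three accounts would call `reserve(3)` once and assign the returned value and the two following values to its accounts.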

Persistent Store Layout:
  accounts_db/
    ├── accounts/
    │   ├── 0.0      # AppendVec for slot 0
    │   ├── 1.0      # AppendVec for slot 1
    │   └── ...
    └── index/       # Persistent index data (optional)

Account Indexing

Index Structure

The AccountsIndex maps Pubkey -> SlotList<AccountInfo> where:

SlotList: SmallVec of (Slot, AccountInfo) tuples tracking account versions
AccountInfo: Contains storage location (offset, storage ID) and metadata
RefCount: Tracks number of storages containing the account (typically 1)

The index supports two operational modes:

In-Memory Only (IndexLimit::InMemOnly): All index data kept in RAM for maximum performance.
Disk-Backed (IndexLimit::Limit): Less-frequently-accessed entries evicted to disk, reducing memory pressure.
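
The mapping described above can be sketched with standard-library types. This is an illustration only: `IndexSketch`, the `[u8; 32]` key alias, and the two `AccountInfo` fields are assumptions, and a plain `Vec` stands in for the SmallVec-based SlotList.

```rust
use std::collections::HashMap;

type Pubkey = [u8; 32]; // stand-in for the real Pubkey type
type Slot = u64;

/// Assumed shape: where the account lives in storage.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AccountInfo {
    pub storage_id: u32,
    pub offset: usize,
}

/// A Vec stands in for the SmallVec-based SlotList.
type SlotList = Vec<(Slot, AccountInfo)>;

#[derive(Default)]
pub struct IndexSketch {
    map: HashMap<Pubkey, SlotList>,
}

impl IndexSketch {
    /// Insert a new (slot, info) entry, or replace the entry that
    /// already exists for that slot.
    pub fn upsert(&mut self, key: Pubkey, slot: Slot, info: AccountInfo) {
        let list = self.map.entry(key).or_default();
        match list.iter().position(|(s, _)| *s == slot) {
            Some(pos) => list[pos].1 = info,
            None => list.push((slot, info)),
        }
    }

    /// All known versions of the account, one per slot.
    pub fn slot_list(&self, key: &Pubkey) -> Option<&SlotList> {
        self.map.get(key)
    }
}
```

In the common case the slot list holds a single entry, which is why a small inline vector is a natural fit for it.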

Index Configuration

Key parameters defined in accounts_index.rs:58-75:

BINS_DEFAULT: 8192 - Number of bins for sharding the index
BINS_FOR_TESTING: 2 - Reduced bins for test environments
Default write cache limit: 15GB (WRITE_CACHE_LIMIT_BYTES_DEFAULT)
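
To illustrate how a bin count such as 8192 can shard keys, the sketch below maps a 32-byte key to a bin using the top bits of its first three bytes. The function name and the exact bit selection are assumptions made for this example, not a statement of how accounts_index.rs computes bins.

```rust
/// Map a 32-byte key to one of `bins` shards using its first 24 bits.
/// `bins` must be a power of two no larger than 2^24.
pub fn bin_from_pubkey(pubkey: &[u8; 32], bins: usize) -> usize {
    assert!(bins.is_power_of_two() && bins <= (1 << 24));
    // Pack the first three bytes into a 24-bit big-endian prefix.
    let prefix = ((pubkey[0] as usize) << 16)
        | ((pubkey[1] as usize) << 8)
        | (pubkey[2] as usize);
    // Keep only the top log2(bins) bits of that prefix.
    let shift = 24 - bins.trailing_zeros();
    prefix >> shift
}
```

With 8192 bins the function keeps the top 13 bits of the prefix, so keys that are uniformly distributed spread evenly across bins and each bin can be locked independently.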

Scanning and Iteration

AccountsIndex supports flexible scanning with configurable filters:

pub enum ScanFilter {
    All,                      // Scan both in-memory and on-disk
    OnlyAbnormal,             // Skip normal entries (ref_count=1, slot_list.len=1)
    OnlyAbnormalWithVerify,   // Verify on-disk entries are abnormal
    OnlyAbnormalTest,         // Simulate uncached disk index
}

pub enum ScanOrder {
    Unsorted,  // Fastest, no ordering guarantees
    Sorted,    // Ordered by pubkey
}
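
The following sketch shows how a filter and an ordering option might combine during a scan. It is not the AccountsIndex API: `FilterSketch`, `OrderSketch`, `EntrySketch`, and `scan` are hypothetical, and only the "normal entry" rule (ref_count = 1 and a slot list of length 1) is taken from the comments above.

```rust
type Pubkey = [u8; 32]; // stand-in for the real Pubkey type

#[derive(Clone, Copy, PartialEq, Eq)]
pub enum FilterSketch {
    All,
    OnlyAbnormal,
}

#[derive(Clone, Copy, PartialEq, Eq)]
pub enum OrderSketch {
    Unsorted,
    Sorted,
}

pub struct EntrySketch {
    pub key: Pubkey,
    pub ref_count: u64,
    pub slot_list_len: usize,
}

impl EntrySketch {
    /// "Normal" entries have one reference and one slot-list item.
    fn is_normal(&self) -> bool {
        self.ref_count == 1 && self.slot_list_len == 1
    }
}

/// Collect the keys that pass the filter, sorting only on request.
pub fn scan(entries: &[EntrySketch], filter: FilterSketch, order: OrderSketch) -> Vec<Pubkey> {
    let mut keys: Vec<Pubkey> = entries
        .iter()
        .filter(|e| filter == FilterSketch::All || !e.is_normal())
        .map(|e| e.key)
        .collect();
    if order == OrderSketch::Sorted {
        keys.sort_unstable();
    }
    keys
}
```

Skipping normal entries is useful for maintenance passes, since an account with a single version in a single storage usually needs no cleanup.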

Snapshot and Account Loading

Snapshot Generation

AccountsDB generates snapshots containing:

Account Storage: All active AppendVec files
Index State: Metadata for reconstructing the accounts index
Bank State: Slot, parent relationships, and bank metadata

Snapshots enable fast validator bootstrapping without replaying the entire ledger.

Account Loading Process

When loading accounts for transaction processing:

Index Lookup: Query AccountsIndex for account location
Storage Access: Read from memory-mapped AppendVec at specified offset
Cache Check: Check AccountsCache for recent modifications
Ancestor Search: Walk slot ancestry until account found

The system uses Ancestors (set of parent slots) to determine account visibility at any point in the chain.
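
A simplified view of ancestor-based visibility: among all versions of an account, pick the newest one whose slot belongs to the ancestor set. In this sketch `Location` and `latest_visible` are hypothetical, and the ancestor set is assumed to already contain the current slot and any rooted slots that should be visible.

```rust
use std::collections::HashSet;

type Slot = u64;

/// Assumed shape: an account version is either still in the write
/// cache or already stored in an AppendVec.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Location {
    Cached,
    Stored { storage_id: u32, offset: usize },
}

/// Return the newest slot-list entry that is visible from a fork
/// described by `ancestors`.
pub fn latest_visible(
    slot_list: &[(Slot, Location)],
    ancestors: &HashSet<Slot>,
) -> Option<(Slot, Location)> {
    slot_list
        .iter()
        .copied()
        .filter(|(slot, _)| ancestors.contains(slot))
        .max_by_key(|(slot, _)| *slot)
}
```

The returned `Location` then tells the caller whether to read the account from the cache or from a memory-mapped storage at the given offset.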

Storage Management

Shrinking and Compaction

AccountsDB reclaims space through shrinking:

Identifies storages in which a significant share of accounts is dead
Copies alive accounts to new AppendVec
Updates index to point to new locations
Deletes old storage files

Shrinking is triggered when:

Storage utilization falls below threshold
Periodic cleanup intervals elapse

Reference counts guide shrinking decisions:

ref_count = 1: Account exists in only one storage
ref_count > 1: Account duplicated across storages (older versions exist)

See ShrinkCollectAliveSeparatedByRefs structure at accounts_db.rs:145-152 for reference count categorization.
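
The shrink decision and the split by reference count can be sketched as follows. All names here (`StoredAccountSketch`, `ShrinkPlan`, `alive_ratio`, `plan_shrink`) and the byte-based utilization measure are assumptions for illustration; the real criteria live in accounts_db.rs.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct StoredAccountSketch {
    pub id: u32,
    pub size: usize,
    pub alive: bool,
    pub ref_count: u64,
}

/// Alive accounts that would be copied to a new storage, separated
/// by reference count.
pub struct ShrinkPlan {
    pub one_ref: Vec<StoredAccountSketch>,
    pub many_refs: Vec<StoredAccountSketch>,
}

/// Fraction of the storage's bytes that belong to alive accounts.
pub fn alive_ratio(accounts: &[StoredAccountSketch]) -> f64 {
    let total: usize = accounts.iter().map(|a| a.size).sum();
    if total == 0 {
        return 1.0;
    }
    let alive: usize = accounts.iter().filter(|a| a.alive).map(|a| a.size).sum();
    alive as f64 / total as f64
}

/// Return a plan only when utilization is below `threshold`.
pub fn plan_shrink(accounts: &[StoredAccountSketch], threshold: f64) -> Option<ShrinkPlan> {
    if alive_ratio(accounts) >= threshold {
        return None;
    }
    let (one_ref, many_refs): (Vec<_>, Vec<_>) = accounts
        .iter()
        .filter(|a| a.alive)
        .cloned()
        .partition(|a| a.ref_count == 1);
    Some(ShrinkPlan { one_ref, many_refs })
}
```

Dead accounts never appear in the plan, so writing the plan to a new storage and dropping the old file reclaims their space.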

Ancient Append Vecs

Older storages are converted to ancient format:

Larger file sizes for better storage efficiency
Special handling during shrink operations
Separate retention policies

Cleanup and Purging

The system tracks obsolete accounts and reclaims storage:

Dead accounts: Accounts that have zero lamports or have been overwritten by a newer version
Rooted slots: Only slots that are rooted (finalized) are cleaned
Background service: AccountsBackgroundService performs periodic cleanup
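
A heavily simplified sketch of cleaning one account's slot list is shown below. It assumes every slot at or below `max_clean_root` is rooted, treats older rooted versions as overwritten, and drops a zero-lamport version only when nothing newer exists. The function name and these rules are illustrative assumptions, not the actual clean algorithm.

```rust
type Slot = u64;

/// Remove dead (slot, lamports) entries from `slot_list` and return
/// them so the caller can reclaim their storage.
pub fn clean_slot_list(
    slot_list: &mut Vec<(Slot, u64)>,
    max_clean_root: Slot,
) -> Vec<(Slot, u64)> {
    // Newest version that lives in a rooted slot, if any.
    let newest_rooted = slot_list
        .iter()
        .map(|(slot, _)| *slot)
        .filter(|slot| *slot <= max_clean_root)
        .max();
    let Some(newest) = newest_rooted else {
        return Vec::new();
    };
    let has_newer = slot_list.iter().any(|(slot, _)| *slot > newest);
    let mut reclaimed = Vec::new();
    slot_list.retain(|&(slot, lamports)| {
        let overwritten = slot < newest;
        let zero_lamport_tip = slot == newest && lamports == 0 && !has_newer;
        let dead = overwritten || zero_lamport_tip;
        if dead {
            reclaimed.push((slot, lamports));
        }
        !dead
    });
    reclaimed
}
```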

Concurrency and Locking

Account Locks

The AccountLocks structure (accounts.rs:69) manages read/write locks:

Read locks: Multiple concurrent readers allowed
Write locks: Exclusive access required
Validation: validate_account_locks ensures transaction conflicts are detected

Locks are held only during transaction processing, not storage operations.
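
The read/write rules above can be sketched with a set of write-locked keys and a reader count per key. `AccountLocksSketch`, `try_lock`, and `unlock` are hypothetical names, and the sketch assumes a transaction never lists the same key twice.

```rust
use std::collections::{HashMap, HashSet};

type Pubkey = [u8; 32]; // stand-in for the real Pubkey type

#[derive(Default)]
pub struct AccountLocksSketch {
    write_locks: HashSet<Pubkey>,
    readonly_locks: HashMap<Pubkey, u64>,
}

impl AccountLocksSketch {
    /// Lock every account a transaction needs, or none of them.
    pub fn try_lock(&mut self, writable: &[Pubkey], readonly: &[Pubkey]) -> bool {
        // A writer conflicts with any existing reader or writer.
        let write_conflict = writable
            .iter()
            .any(|k| self.write_locks.contains(k) || self.readonly_locks.contains_key(k));
        // A reader conflicts only with an existing writer.
        let read_conflict = readonly.iter().any(|k| self.write_locks.contains(k));
        if write_conflict || read_conflict {
            return false;
        }
        self.write_locks.extend(writable.iter().copied());
        for key in readonly {
            *self.readonly_locks.entry(*key).or_insert(0) += 1;
        }
        true
    }

    /// Release locks previously granted by `try_lock`.
    pub fn unlock(&mut self, writable: &[Pubkey], readonly: &[Pubkey]) {
        for key in writable {
            self.write_locks.remove(key);
        }
        for key in readonly {
            let remaining = match self.readonly_locks.get_mut(key) {
                Some(count) => {
                    *count -= 1;
                    *count
                }
                None => continue,
            };
            if remaining == 0 {
                self.readonly_locks.remove(key);
            }
        }
    }
}
```

Checking all conflicts before taking any lock keeps the operation all-or-nothing, so a rejected transaction leaves the lock table unchanged.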

Write Cache

The write cache (AccountsCache) provides:

Per-slot caching of modified accounts
Lock-free reads from cache
Periodic flushing to persistent storage
Configurable size limits (default 15GB)

When the cache exceeds limits, the oldest slots are flushed to AppendVec files.
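
The size-limited, oldest-first flush can be sketched with an ordered map keyed by slot. `WriteCacheSketch` and its methods are hypothetical, and the sketch ignores any other condition (such as whether a slot is rooted) that a real flush may require.

```rust
use std::collections::BTreeMap;

type Slot = u64;

#[derive(Default)]
pub struct WriteCacheSketch {
    // BTreeMap keeps slots ordered, so the oldest slot is first.
    slots: BTreeMap<Slot, Vec<Vec<u8>>>,
    total_bytes: usize,
}

impl WriteCacheSketch {
    /// Cache one modified account's data under its slot.
    pub fn store(&mut self, slot: Slot, account_data: Vec<u8>) {
        self.total_bytes += account_data.len();
        self.slots.entry(slot).or_default().push(account_data);
    }

    pub fn total_bytes(&self) -> usize {
        self.total_bytes
    }

    /// Evict the oldest slots until the cache fits within
    /// `limit_bytes`; return the flushed slots in flush order.
    pub fn flush_until_under(&mut self, limit_bytes: usize) -> Vec<Slot> {
        let mut flushed = Vec::new();
        while self.total_bytes > limit_bytes {
            let Some((slot, accounts)) = self.slots.pop_first() else {
                break;
            };
            self.total_bytes -= accounts.iter().map(Vec::len).sum::<usize>();
            // A real implementation would write `accounts` to an
            // AppendVec here before dropping them.
            flushed.push(slot);
        }
        flushed
    }
}
```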

Performance Optimizations

Parallel Processing

AccountsDB leverages parallelism extensively:

Parallel scanning: Uses Rayon for multi-threaded account scans
Concurrent storage access: Memory-mapped files enable lock-free reads
Sharded index: BINS_DEFAULT = 8192 bins reduce lock contention
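
To show why sharding helps parallel scans, the sketch below sums a value across bins with one thread per bin. It deliberately uses scoped threads from the standard library instead of Rayon so that it has no external dependencies; `parallel_total` and the representation of a bin as a `Vec<u64>` of lamport balances are assumptions for illustration.

```rust
use std::thread;

/// Sum the lamports held in every bin, scanning bins in parallel.
pub fn parallel_total(bins: &[Vec<u64>]) -> u64 {
    thread::scope(|scope| {
        // Each thread reads only its own bin, so no locking is needed.
        let handles: Vec<_> = bins
            .iter()
            .map(|bin| scope.spawn(move || bin.iter().sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("scan thread panicked"))
            .sum::<u64>()
    })
}
```

A thread pool such as Rayon's avoids spawning one thread per bin, which matters at 8192 bins; the partitioning idea is the same.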

Memory Management

Memory-mapped I/O: Zero-copy access to account data
io_uring support: Asynchronous I/O for large operations (configurable)
Buffer limits: TOTAL_IO_URING_BUFFERS_SIZE_LIMIT = 2GB

Index Optimization

SmallVec for SlotList: Optimized for the common case (a single slot per account)
Disk eviction: Moves cold index data to disk to reduce RAM usage
Sharding: 8192 bins reduce lock contention on index updates

Configuration

AccountsDbConfig

Key configuration options in AccountsDbConfig:

pub struct AccountsDbConfig {
    pub index: AccountsIndexConfig,
    pub accounts_hash_cache_path: Option<PathBuf>,
    pub filler_accounts_config: FillerAccountsConfig,
    pub write_cache_limit_bytes: Option<u64>,
    pub ancient_append_vec_offset: Option<i64>,
    // ... additional fields
}
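
Optional fields such as write_cache_limit_bytes suggest a "use the default unless overridden" pattern, sketched below. `ConfigSketch`, its `bins` field, and both constants are hypothetical mirrors written for this example; the 15 GB value is taken from the text and assumed here to mean decimal gigabytes.

```rust
/// Assumed defaults, mirroring the values quoted in this document.
const WRITE_CACHE_LIMIT_BYTES_SKETCH: u64 = 15_000_000_000;
const BINS_SKETCH: usize = 8192;

/// Hypothetical, trimmed-down mirror of a config with optional
/// overrides. It is not the real AccountsDbConfig.
#[derive(Default)]
pub struct ConfigSketch {
    pub write_cache_limit_bytes: Option<u64>,
    pub bins: Option<usize>,
}

impl ConfigSketch {
    /// Use the override when present, otherwise the default.
    pub fn effective_write_cache_limit(&self) -> u64 {
        self.write_cache_limit_bytes
            .unwrap_or(WRITE_CACHE_LIMIT_BYTES_SKETCH)
    }

    pub fn effective_bins(&self) -> usize {
        self.bins.unwrap_or(BINS_SKETCH)
    }
}
```

A test setup could then override only what it needs, for example `ConfigSketch { bins: Some(2), ..Default::default() }`.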

Testing Configuration

Predefined configs for testing:

ACCOUNTS_DB_CONFIG_FOR_TESTING: Minimal configuration with 2 bins
ACCOUNTS_DB_CONFIG_FOR_BENCHMARKS: Production-like with 8192 bins

Key Files

accounts_db.rs:1-200 - Main AccountsDB implementation and storage logic
accounts_index.rs:1-200 - Index structure and account lookup
accounts.rs:1-150 - Synchronization and lock management
accounts_cache.rs - Write cache for recently modified accounts
account_storage.rs - AppendVec file management
account_info.rs - AccountInfo metadata structure

Related Components

Runtime/Bank: Primary consumer of AccountsDB for transaction processing
Snapshots: Generate and load snapshots from AccountsDB state
AccountsBackgroundService: Asynchronous cleanup and maintenance
