Overview
AccountsDB is Agave’s high-performance persistent storage system for account data. It provides concurrent append-only writes with parallel reads, using memory-mapped files for efficient data access. The architecture supports rapid account lookups, snapshot generation, and efficient storage reclamation.
Architecture
Core Components
AccountsDB consists of several integrated subsystems:
AccountsDb (accounts-db/src/accounts_db.rs): The main storage engine that manages persistent account data across memory-mapped append-vec files.
AccountsIndex (accounts-db/src/accounts_index.rs): In-memory and disk-backed index mapping pubkeys to account storage locations, supporting both cached and persistent modes.
AccountsCache (accounts-db/src/accounts_cache.rs): Write-back cache that holds recently modified accounts until they’re flushed to persistent storage.
Accounts (accounts-db/src/accounts.rs): Synchronization layer handling account locks and coordinating access between AccountsDB and the index.
Storage Model
Accounts are stored in AppendVec files with the following characteristics:
- Each AppendVec stores accounts for a single slot
- Appends from a single writer thread, concurrent with many readers
- Memory-mapped files for zero-copy access
- Stored at path: <path>/<pid>/data/
Account writes are tagged with a write_version counter. Each commit increments this version, allowing the index to identify the latest account state across multiple slots.
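The storage model above can be illustrated with a minimal in-memory sketch. Everything here is hypothetical and simplified: `ToyDb`, `SlotStore`, and `StoredAccount` are not Agave types, a `Vec` stands in for a memory-mapped AppendVec, and the version counter is assumed to be a single counter shared by all slots.

```rust
use std::collections::HashMap;

/// Hypothetical 32-byte account key (stands in for a real `Pubkey`).
type Key = [u8; 32];

/// One stored account version inside an append-only slot store.
#[derive(Debug, Clone, PartialEq)]
struct StoredAccount {
    key: Key,
    lamports: u64,
    write_version: u64,
}

/// Toy stand-in for an AppendVec: entries are only ever pushed, never edited.
#[derive(Default)]
struct SlotStore {
    entries: Vec<StoredAccount>,
}

/// Toy database: one append-only store per slot plus one version counter
/// (assumed here to be shared across all slots).
#[derive(Default)]
struct ToyDb {
    stores: HashMap<u64, SlotStore>,
    next_write_version: u64,
}

impl ToyDb {
    /// Appends a new version of `key` to the store for `slot` and returns
    /// the (slot, offset) pair where it landed.
    fn store(&mut self, slot: u64, key: Key, lamports: u64) -> (u64, usize) {
        let write_version = self.next_write_version;
        self.next_write_version += 1;
        let store = self.stores.entry(slot).or_default();
        store.entries.push(StoredAccount { key, lamports, write_version });
        (slot, store.entries.len() - 1)
    }

    /// Scans every slot and returns the version with the highest write_version.
    fn latest(&self, key: &Key) -> Option<&StoredAccount> {
        self.stores
            .values()
            .flat_map(|s| s.entries.iter())
            .filter(|a| &a.key == key)
            .max_by_key(|a| a.write_version)
    }
}
```

The full scan in `latest` is deliberately naive; avoiding that scan is the job of the index described in the next section.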
Account Indexing
Index Structure
The AccountsIndex maps Pubkey -> SlotList<AccountInfo> where:
- SlotList: SmallVec of (Slot, AccountInfo) tuples tracking account versions
- AccountInfo: Contains storage location (offset, storage ID) and metadata
- RefCount: Tracks number of storages containing the account (typically 1)
In-Memory (IndexLimit::InMemOnly): All index data kept in RAM for maximum performance.
Disk-Backed (IndexLimit::Limit): Less-frequently-accessed entries evicted to disk, reducing memory pressure.
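As a rough illustration of the mapping described in this section, the sketch below models an index entry with a slot list and a reference count. The names `ToyIndex` and `IndexEntry` are hypothetical, a plain `Vec` stands in for SmallVec so the example needs no external crate, and the reference count is simplified to "one per slot entry".

```rust
use std::collections::HashMap;

type Pubkey = [u8; 32]; // hypothetical stand-in for the real Pubkey type
type Slot = u64;

/// Where an account version lives: which storage and at what offset.
#[derive(Debug, Clone, Copy, PartialEq)]
struct AccountInfo {
    storage_id: u32,
    offset: usize,
}

/// Plain Vec standing in for SmallVec<(Slot, AccountInfo)>.
type SlotList = Vec<(Slot, AccountInfo)>;

#[derive(Default)]
struct IndexEntry {
    slot_list: SlotList,
    /// Simplified: counts slot-list entries, i.e. storages holding this key.
    ref_count: u64,
}

#[derive(Default)]
struct ToyIndex {
    map: HashMap<Pubkey, IndexEntry>,
}

impl ToyIndex {
    /// Records that `key` was written in `slot` at `info`.
    /// A second write in the same slot replaces the earlier location.
    fn upsert(&mut self, key: Pubkey, slot: Slot, info: AccountInfo) {
        let entry = self.map.entry(key).or_default();
        if let Some(existing) = entry.slot_list.iter_mut().find(|e| e.0 == slot) {
            existing.1 = info;
        } else {
            entry.slot_list.push((slot, info));
            entry.ref_count += 1;
        }
    }

    fn get(&self, key: &Pubkey) -> Option<&IndexEntry> {
        self.map.get(key)
    }
}
```

Most accounts have a single entry in their slot list, which is why a small inline vector is a natural fit for the real structure.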
Index Configuration
Key parameters defined in accounts_index.rs:58-75:
- BINS_DEFAULT: 8192 - Number of bins for sharding the index
- BINS_FOR_TESTING: 2 - Reduced bins for test environments
- Default write cache limit: 15GB (WRITE_CACHE_LIMIT_BYTES_DEFAULT)
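To show how a bin count turns into a shard choice, here is one possible scheme: take the leading 24 bits of the key and keep only as many high bits as the bin count needs. The function `bin_for_key` is hypothetical and assumes the bin count is a power of two; it is a sketch of the idea, not a copy of the code in accounts_index.rs.

```rust
/// Bin count taken from the text above; must be a power of two for this sketch.
const BINS_DEFAULT: usize = 8192;

/// Maps a 32-byte key to a bin index in `0..bins` using its first three bytes.
/// Hypothetical helper for illustration only.
fn bin_for_key(key: &[u8; 32], bins: usize) -> usize {
    assert!(bins.is_power_of_two() && bins <= (1 << 24));
    // Interpret the first 24 bits of the key as a big-endian integer.
    let prefix = ((key[0] as usize) << 16) | ((key[1] as usize) << 8) | (key[2] as usize);
    // Keep only the top log2(bins) bits of that prefix.
    let shift = 24 - bins.trailing_zeros();
    prefix >> shift
}
```

Because keys are effectively uniformly distributed, slicing off leading bits spreads entries evenly, and each bin can then carry its own lock.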
Scanning and Iteration
AccountsIndex supports flexible scanning with configurable filters.
Snapshot and Account Loading
Snapshot Generation
AccountsDB generates snapshots containing:
- Account Storage: All active AppendVec files
- Index State: Metadata for reconstructing the accounts index
- Bank State: Slot, parent relationships, and bank metadata
Account Loading Process
When loading accounts for transaction processing:
- Index Lookup: Query AccountsIndex for account location
- Storage Access: Read from memory-mapped AppendVec at specified offset
- Cache Check: Check AccountsCache for recent modifications
- Ancestor Search: Walk slot ancestry until account found
Lookups use Ancestors (set of parent slots) to determine account visibility at any point in the chain.
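The ancestor-based visibility rule can be sketched as a filter over a slot list: only versions written in slots on the current fork are candidates, and the newest of those wins. `latest_visible` is a hypothetical helper; it ignores rooted slots and the write cache, both of which a real lookup would also need to consider.

```rust
use std::collections::HashSet;

type Slot = u64;

/// Where an account version lives (same toy shape as an index entry).
#[derive(Debug, Clone, Copy, PartialEq)]
struct AccountInfo {
    storage_id: u32,
    offset: usize,
}

/// Returns the newest slot-list entry whose slot is in `ancestors`,
/// or None if no version of the account is visible on this fork.
fn latest_visible(
    slot_list: &[(Slot, AccountInfo)],
    ancestors: &HashSet<Slot>,
) -> Option<(Slot, AccountInfo)> {
    slot_list
        .iter()
        .filter(|(slot, _)| ancestors.contains(slot))
        .max_by_key(|(slot, _)| *slot)
        .copied()
}
```

Two banks on different forks can therefore resolve the same pubkey to different versions simply by passing different ancestor sets.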
Storage Management
Shrinking and Compaction
AccountsDB reclaims space through shrinking:
- Identifies storages with significant dead accounts
- Copies alive accounts to new AppendVec
- Updates index to point to new locations
- Deletes old storage files
Shrinking is triggered when:
- Storage utilization falls below threshold
- Periodic cleanup intervals elapse
During a shrink, alive accounts are separated by reference count:
- ref_count = 1: Account exists in only one storage
- ref_count > 1: Account duplicated across storages (older versions exist)
See the ShrinkCollectAliveSeparatedByRefs structure at accounts_db.rs:145-152 for the reference count categorization.
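The shrink decision and the reference-count split can be sketched as follows. The types `Entry` and `AliveByRefs` and the alive-ratio threshold are assumptions made for illustration; the sketch only mirrors the idea behind ShrinkCollectAliveSeparatedByRefs, not its actual fields.

```rust
/// One account version held in a storage file (toy model).
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    key: u8,
    alive: bool,
    ref_count: u64,
}

/// Alive accounts split by reference count (hypothetical shape).
#[derive(Debug, Default, PartialEq)]
struct AliveByRefs {
    one_ref: Vec<Entry>,
    many_refs: Vec<Entry>,
}

/// True when the share of alive entries drops below `min_alive_ratio`.
/// The ratio-based trigger is an assumed policy for this sketch.
fn should_shrink(storage: &[Entry], min_alive_ratio: f64) -> bool {
    if storage.is_empty() {
        return false;
    }
    let alive = storage.iter().filter(|e| e.alive).count();
    (alive as f64) / (storage.len() as f64) < min_alive_ratio
}

/// Collects the alive entries that would be copied into a new storage,
/// grouped by whether they live in one storage or several.
fn collect_alive(storage: &[Entry]) -> AliveByRefs {
    let mut out = AliveByRefs::default();
    for entry in storage.iter().filter(|e| e.alive) {
        if entry.ref_count == 1 {
            out.one_ref.push(entry.clone());
        } else {
            out.many_refs.push(entry.clone());
        }
    }
    out
}
```

After the copy, the index would be repointed at the new locations and the old file dropped, as the steps above describe.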
Ancient Append Vecs
Older storages are converted to ancient format:
- Larger file sizes for better storage efficiency
- Special handling during shrink operations
- Separate retention policies
Cleanup and Purging
The system tracks obsolete accounts and reclaims storage:
- Dead accounts: Accounts with zero lamports or overwritten
- Rooted slots: Only clean slots that are finalized
- Background service: AccountsBackgroundService performs periodic cleanup
Concurrency and Locking
Account Locks
The AccountLocks structure (accounts.rs:69) manages read/write locks:
- Read locks: Multiple concurrent readers allowed
- Write locks: Exclusive access required
- Validation: validate_account_locks ensures transaction conflicts are detected
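The read/write rules above amount to a small lock table: a set of write-locked keys and a counter of readers per key. The sketch below uses a hypothetical `ToyAccountLocks` type with made-up method names, and it assumes a transaction never lists the same key twice.

```rust
use std::collections::{HashMap, HashSet};

type Pubkey = [u8; 32]; // hypothetical stand-in for the real Pubkey type

#[derive(Default)]
struct ToyAccountLocks {
    write_locks: HashSet<Pubkey>,
    readonly_locks: HashMap<Pubkey, u64>, // key -> number of active readers
}

impl ToyAccountLocks {
    /// Takes all requested locks, or none of them if any would conflict.
    fn try_lock(&mut self, writable: &[Pubkey], readonly: &[Pubkey]) -> bool {
        // A writer conflicts with any existing writer or reader.
        let write_conflict = writable
            .iter()
            .any(|k| self.write_locks.contains(k) || self.readonly_locks.contains_key(k));
        // A reader conflicts only with an existing writer.
        let read_conflict = readonly.iter().any(|k| self.write_locks.contains(k));
        if write_conflict || read_conflict {
            return false;
        }
        for k in writable {
            self.write_locks.insert(*k);
        }
        for k in readonly {
            *self.readonly_locks.entry(*k).or_insert(0) += 1;
        }
        true
    }

    /// Releases locks previously taken by a successful `try_lock`.
    fn unlock(&mut self, writable: &[Pubkey], readonly: &[Pubkey]) {
        for k in writable {
            self.write_locks.remove(k);
        }
        for k in readonly {
            let now_unused = match self.readonly_locks.get_mut(k) {
                Some(count) => {
                    *count -= 1;
                    *count == 0
                }
                None => false,
            };
            if now_unused {
                self.readonly_locks.remove(k);
            }
        }
    }
}
```

Checking every key before taking any lock keeps the operation all-or-nothing, so a rejected transaction leaves the table unchanged.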
Write Cache
The write cache (AccountsCache) provides:
- Per-slot caching of modified accounts
- Lock-free reads from cache
- Periodic flushing to persistent storage
- Configurable size limits (default 15GB)
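A per-slot write cache with a size limit and a flush step might look like the sketch below. `ToyWriteCache` is hypothetical: it limits the number of cached entries rather than bytes, stores only a lamport balance per account, and flushes every slot up to a given root, which is an assumed policy.

```rust
use std::collections::{BTreeMap, HashMap};

type Pubkey = [u8; 32]; // hypothetical stand-in for the real Pubkey type
type Slot = u64;

struct ToyWriteCache {
    /// slot -> (key -> lamports); BTreeMap keeps slots in ascending order.
    slots: BTreeMap<Slot, HashMap<Pubkey, u64>>,
    /// Entry-count limit standing in for a byte limit.
    max_entries: usize,
}

impl ToyWriteCache {
    fn new(max_entries: usize) -> Self {
        Self { slots: BTreeMap::new(), max_entries }
    }

    /// Caches the latest value of `key` for `slot`, replacing any earlier one.
    fn store(&mut self, slot: Slot, key: Pubkey, lamports: u64) {
        self.slots.entry(slot).or_default().insert(key, lamports);
    }

    fn load(&self, slot: Slot, key: &Pubkey) -> Option<u64> {
        self.slots.get(&slot)?.get(key).copied()
    }

    /// Total number of cached account entries across all slots.
    fn len(&self) -> usize {
        self.slots.values().map(|accounts| accounts.len()).sum()
    }

    fn needs_flush(&self) -> bool {
        self.len() > self.max_entries
    }

    /// Removes and returns every cached slot at or below `max_root`,
    /// oldest first, ready to be written to persistent storage.
    fn flush_rooted(&mut self, max_root: Slot) -> Vec<(Slot, HashMap<Pubkey, u64>)> {
        let rooted: Vec<Slot> = self.slots.range(..=max_root).map(|(s, _)| *s).collect();
        rooted
            .into_iter()
            .filter_map(|s| self.slots.remove(&s).map(|accounts| (s, accounts)))
            .collect()
    }
}
```

Keying the cache by slot means a whole slot can be flushed or discarded in one step.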
Performance Optimizations
Parallel Processing
AccountsDB leverages parallelism extensively:
- Parallel scanning: Use Rayon for multi-threaded account scans
- Concurrent storage access: Memory-mapped files enable lock-free reads
- Sharded index: BINS_DEFAULT = 8192 bins reduce lock contention
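A parallel scan over independent storages can be sketched with scoped threads from the standard library. This deliberately swaps in `std::thread::scope` for Rayon so the example has no external dependencies; the function name and the "list of lamport balances per storage" data shape are assumptions for illustration.

```rust
use std::thread;

/// Sums lamports across all storages, scanning each chunk of storages
/// on its own thread. Storages are read-only, so no locking is needed.
fn parallel_total_lamports(storages: &[Vec<u64>], threads: usize) -> u64 {
    let threads = threads.max(1);
    // Ceiling division so every storage lands in exactly one chunk.
    let chunk_size = ((storages.len() + threads - 1) / threads).max(1);
    thread::scope(|scope| {
        // Spawn all workers first, then join them, so they run concurrently.
        let handles: Vec<_> = storages
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().flatten().sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("scan thread panicked"))
            .sum::<u64>()
    })
}
```

With Rayon, the same shape is usually written as a parallel iterator over the storages followed by a reduction, which also adds work stealing across threads.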
Memory Management
- Memory-mapped I/O: Zero-copy access to account data
- io_uring support: Asynchronous I/O for large operations (configurable)
- Buffer limits: TOTAL_IO_URING_BUFFERS_SIZE_LIMIT = 2GB
Index Optimization
- SmallVec for SlotList: Optimize for common case (single slot)
- Disk eviction: Move cold index data to disk to reduce RAM
- Sharding: 8192 bins reduce lock contention on index updates
Configuration
AccountsDbConfig
Key configuration options are defined in AccountsDbConfig.
Testing Configuration
Predefined configs for testing:
- ACCOUNTS_DB_CONFIG_FOR_TESTING: Minimal configuration with 2 bins
- ACCOUNTS_DB_CONFIG_FOR_BENCHMARKS: Production-like with 8192 bins
Key Files
- accounts_db.rs:1-200 - Main AccountsDB implementation and storage logic
- accounts_index.rs:1-200 - Index structure and account lookup
- accounts.rs:1-150 - Synchronization and lock management
- accounts_cache.rs - Write-back cache for hot accounts
- account_storage.rs - AppendVec file management
- account_info.rs - AccountInfo metadata structure
Related Components
- Runtime/Bank: Primary consumer of AccountsDB for transaction processing
- Snapshots: Generate and load snapshots from AccountsDB state
- AccountsBackgroundService: Asynchronous cleanup and maintenance