Skip to content

Commit

Permalink
Initial cap 62
Browse files Browse the repository at this point in the history
  • Loading branch information
SirTyson committed Dec 2, 2024
1 parent cbd5984 commit 0b41ae7
Show file tree
Hide file tree
Showing 2 changed files with 283 additions and 0 deletions.
2 changes: 2 additions & 0 deletions core/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,8 @@
| [CAP-0058](cap-0058.md) | Constructors for Soroban Contracts | Dmytro Kozhevin | Implemented |
| [CAP-0059](cap-0059.md) | Host functions for BLS12-381 | Jay Geng | Implemented |
| [CAP-0060](cap-0060.md) | Update to Wasmi register machine| Graydon Hoare | Accepted |
| [CAP-0057](cap-0057.md) | Smart Contract Standardized Asset (Stellar Asset Contract) Extension: Memo | Tomer Weller | Draft |
| [CAP-0057](cap-0057.md) | Soroban Live State Prioritization | Garand Tyson | Draft |

### Rejected Proposals
| Number | Title | Author | Status |
Expand Down
281 changes: 281 additions & 0 deletions core/cap-0062.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,281 @@
```
CAP: 0062
Title: Soroban Live State Prioritization
Working Group:
Owner: Garand Tyson <@SirTyson>
Authors: Garand Tyson <@SirTyson>
Consulted: Dmytro Kozhevin <@dmkozh>, Nicolas Barry <@MonsieurNicolas>
Status: Draft
Created: 2024-12-02
Discussion: https://github.com/stellar/stellar-protocol/discussions/1575
Protocol version: 23
```

## Simple Summary

This proposal allows the network to evict `PERSISTENT` entries from the live state BucketList to a separate BucketList
containing archival state. Note that evicted entries are not deleted from validators.

## Working Group

As specified in the Preamble.

## Motivation

This is a first step towards full State Archival, which will lower the storage requirements of validators
and decrease the growth of History Archives. Additionally, this opens the door for database optimizations of
live Soroban state, increasing limits and throughput, as well as automatic entry restoration.

### Goals Alignment

This change is aligned with the goal of lowering the cost and increasing the scale of the network.

## Abstract

Currently both live and archived Soroban state are maintained in a single BucketList DB. This makes data prioritization difficult
since all state exists in a singular data structure. This proposal separates archived state and live state into two separate DBs.
While these DBs are still both persisted on disk by validators, optimizing live state access is significantly easier. In particular,
live state can be entirely cached in memory, removing disk reads from the Soroban transaction execution path, greatly increasing
read limits and throughput.

Each validator will maintain two BucketLists, the Live BucketList and the Hot Archive BucketList. `PERSISTENT` Soroban
data entries and code entries that have expired will be "evicted" from the Live BucketList and moved to the Hot Archive
BucketList. Both the Live BucketList and Hot Archive BucketList are persisted on disk and written to History Archives.

Note that this proposal is a subset of [CAP-0057](cap-0057.md). The Hot Archive here is the same as CAP-0057, and this CAP
is a strict subset of CAP-0057. The purpose is not to replace CAP-0057, but to offer a smaller first step implementation wise.

## Specification

### XDR changes

```
enum BucketListType
{
LIVE = 0,
HOT_ARCHIVE = 1,
};
enum HotArchiveBucketEntryType
{
HOT_ARCHIVE_METAENTRY = -1, // Bucket metadata, should come first.
HOT_ARCHIVE_ARCHIVED = 0, // Entry is Archived
HOT_ARCHIVE_LIVE = 1, // Entry was previously HOT_ARCHIVE_ARCHIVED but
// has been added back to the live BucketList.
};
union HotArchiveBucketEntry switch (HotArchiveBucketEntryType type)
{
case HOT_ARCHIVE_ARCHIVED:
LedgerEntry archivedEntry;
case HOT_ARCHIVE_LIVE:
LedgerKey key;
case HOT_ARCHIVE_METAENTRY:
BucketMetadata metaEntry;
};
struct LedgerCloseMetaV1
{
...
// same as LedgerCloseMetaV1 up to here
// Soroban code, data, and TTL keys that are being evicted at this ledger.
// Note: This has been renamed from evictedTemporaryLedgerKeys in V1
LedgerKey evictedLedgerKeys<>;
...
};
enum LedgerEntryChangeType
{
LEDGER_ENTRY_CREATED = 0, // entry was added to the ledger
LEDGER_ENTRY_UPDATED = 1, // entry was modified in the ledger
LEDGER_ENTRY_REMOVED = 2, // entry was removed from the ledger
LEDGER_ENTRY_STATE = 3, // value of the entry
LEDGER_ENTRY_RESTORE = 4 // archived entry was restored in the ledger
};
union LedgerEntryChange switch (LedgerEntryChangeType type)
{
case LEDGER_ENTRY_CREATED:
LedgerEntry created;
case LEDGER_ENTRY_UPDATED:
LedgerEntry updated;
case LEDGER_ENTRY_REMOVED:
LedgerKey removed;
case LEDGER_ENTRY_STATE:
LedgerEntry state;
case LEDGER_ENTRY_RESTORE:
LedgerEntry restored;
};
// Ledger access settings for contracts.
struct ConfigSettingContractLedgerCostV0
{
...
// Note: The following fields are the same as they were before,
// but have just been renamed
// The following parameters determine the write fee per 1KB.
// Write fee grows linearly until bucket list reaches this size
int64 sorobanLiveStateTargetSizeBytes;
// Fee per 1KB write when the bucket list is empty
int64 writeFee1KBSorobanStateLow;
// Fee per 1KB write when the bucket list has reached `sorobanLiveStateTargetSizeBytes`
int64 writeFee1KBSorobanStateHigh;
// Write fee multiplier for any additional data past the first `sorobanLiveStateSizeBytes`
uint32 sorobanStateWriteFeeGrowthFactor;
};
```

### Semantics

With this change, the current BucketList is divided into two separate BucketLists as follows.

#### Live State BucketList

The Live State BucketList most closely resembles the current BucketList. It will contain all “live” state of the ledger, including
Stellar classic entries, live Soroban entries, network config settings, etc.
The Live State BucketList is published to the History Archive on every checkpoint ledger via the "history" category. The Live BucketList
functions identically to the current BucketList.

#### Hot Archive BucketList

The Hot Archive is a BucketList that stores evicted `PERSISTENT` entries
It contains `HotArchiveBucketEntry` type entries and is constructed as follows:

1. Whenever a `PERSISTENT` entry is evicted, the entry is deleted from the Live State BucketList and added to the Hot Archive as a `HOT_ARCHIVE_ARCHIVED`
entry. The corresponding `TTLEntry` is deleted and not stored in the Hot Archive.
2. If an archived entry is restored and the entry currently exists in the Hot Archive, the `HOT_ARCHIVE_ARCHIVED` node previously stored in the Hot
Archive is overwritten by a `HOT_ARCHIVE_LIVE` entry. Similar to `DEADENTRY` in the Live BucketList, `HOT_ARCHIVE_LIVE` are dropped when the bottom
most Bucket merges.

For Bucket merges, the newest version of a given key is always taken. At the bottom level, `HOT_ARCHIVE_LIVE` entries are dropped.
The `HOT_ARCHIVE_LIVE` state indicates that the given key currently exists in the Live BucketList. Thus, any Hot Archive reference
is out of date and can be dropped.

The current Hot Archive is published to the History Archive via the "history" category on every checkpoint ledger.

##### Hot Archive BucketList Depth and Spill Schedule

The Hot Archive is guaranteed to experience less churn than the Live BucketList, given that the only events
than can modify the Hot Archive BucketList are eviction and restoration. Thus, the spill schedule and depth
of the Hot Archive BucketList should not match that of the Live BucketList. Long term, it is expected that
the Hot Archive BucketList become significantly bigger than the Live BucketList (assuming State Archival
for classic entry types and increased Soroban adoption).

Specific configurations will need to be investigated, but a slower spill schedule is most likely optimal. A
deeper BucketList (to accommodate the expected size difference between Live and Hot Archive) may not be
necessary, as the spill schedule impacts the maximum "capacity" of a BucketList along with the depth.

#### Changes to History Archives

The `HistoryArchiveState` will be extended as follows:

```
version: number identifying the file format version
server: an optional debugging string identifying the software that wrote the file
networkPassphrase: an optional string identifying the networkPassphrase
currentLedger: a number denoting the ledger this file describes the state of
currentBuckets: an array containing an encoding of the live state bucket list for this ledger
hotArchiveBuckets: an array containing an encoding of the hot archive bucket list for this ledger
```

Bucket files for the `hotArchiveBuckets` will be stored just as Bucket files are currently stored.

#### Changes to LedgerHeader

While the XDR structure of `LedgerHeader` remains unchanged, `bucketListHash` will be changed to the following:

```
header.bucketListHash = SHA256(liveStateBucketListHash, hotArchiveHash)
```

#### Meta

Whenever a `PERSISTENT` entry is evicted (i.e. removed from the Live State BucketList and added to the Hot Archive),
the entry key and its associated TTL key will be emitted via `evictedLedgerKeys`
(this field has been renamed and was previously named `evictedTemporaryLedgerKeys`).

Whenever an entry is restored via `RestoreFootprintOp`, the `LedgerEntry` being restored and its
associated TTL will be emitted as a `LedgerEntryChange` of type `LEDGER_ENTRY_RESTORE`. Note that
entries being restored may not have been evicted such that entries currently in the Live State BucketList
can see `LEDGER_ENTRY_RESTORE` events.

#### Default Values for Network Configs

`sorobanLiveStateTargetSizeBytes` (formerly `bucketListTargetSizeBytes`) will need to be changed
to reflect the new size calculations. Additionally, the "fee curve" may need to be revisted
now that classic state growth no longer affects Soroban fees.

## Design Rationale

### CAP-0062 vs. CAP-0057

This CAP is a strict subset of [CAP-0057](cap-0057.md). The intention is not to replace CAP-0057, but to implement this CAP
first as a step towards CAP-0057. At current usage levels, "full state archival" as specified by CAP-0057 (i.e. evicted entries are deleted
from validators) is not a necessary or useful optimization. The space savings of validators and the History Archive would be
minimal, and the additional complexity, both from a computational and UX standpoint, do not justify the minimal storage savings.
Long term, should usage increase, the tradeoffs introduced by CAP-0057 will make more sense, but that is not the case currently
or in the near term future.

The reasoning behind introducing CAP-0057 before it was strictly necessary was to prevent breakage of applications. If applications
and users were not prepared for State Archival, it would be very difficult to enable CAP-0057. However, because
[CAP-0046-12](./cap-0046-12.md) is implemented and State Archival semantics are already required, upgrading to CAP-0057 should not
be breaking, especially since much of the complexity, like proof generation, is handled automatically by RPC endpoints.

While there are no immediate benefits to the "full" state archival introduced in CAP-0057, there are significant short term and long
term gains from "partial state archival" introduced in this CAP. Specifically, this CAP will allow for full in-memory Soroban live state
caching and automatic restoration of Soroban entries via `InvokeHostFunctionOp`, to be detailed in a future CAP.

### Fee Changes

Because ledger state is now split into two BucketLists, the BucketList size component for storage fees must be
reconsidered. Currently, the total size of the BucketList is used for Soroban related fees. This is frustrating
to end users, as classic state dominates the BucketList size such that changes unrelated to Soroban have the most
impact towards Soroban related fees. The purpose of the fee is to give back pressure to new writes when the BucketList
is large. This should discourage writes (or financially prevent DOS attacks) and give time for State Archival to evict
state, reduce the size of the BucketList, thus reducing fees. The issue is, classic entries dominate storage and are
not subject to State Archival, so this back pressure system is not effective. In practice, should BucketList size
rapidly increase, it is most likely due to classic state, and the only way to lower fees is via network config upgrade.

This CAP changes the BucketList size used in fee calculations to Live Soroban State Size. Specifically, Live Soroban
State Size is is the total amount of state currently consumed by Soroban entries in the Live BucketList, in bytes.
This includes TTL entries and shadows.

The Hot Archive size does not impact fees. Long term, it is expected that CAP-0057 will be implemented and the
Hot Archive will be dropped from validators. Thus long term, from a network health standpoint, the size of
the Hot Archive does not have a significant impact on network sustainability.

Prior to CAP-0057, large Hot Archives do impact network sustainability, specifically with respect to History
Archive size, catchup time, and node disk requirements. However, a fee based on Hot Archive size does not
seem to address this issue. First, there is currently no way to remove state from the Hot Archive. Only
restore operations remove Hot Archive state, but this state is just added back to the Live BucketList. Thus
fees would continually increase without any benefit to network sustainability, as no action or system is
currently in place to reduce Hot Archive size.

While not improving network sustainability, Hot Archive fees could prevent state based DOS attacks, where
state is rapidly added to the network. However, unlike the live BucketList, operations cannot directly
write state to the Hot Archive (with the exception of restores, which always decrease Hot Archive size).
The rate of eviction (i.e. the rate at which state is added to the Hot Archive) can already be controlled
via network config settings. The rate at which state is added to the Live BucketList is also already
controlled by network config settings. Thus, enough tools are already in place to prevent state based DOS
attacks without the addition of a Hot Archive based fee.

### Expectations for Downstream Systems

Other than small changes in restoration meta and eviction meta, no significant changes will impact
downstream systems.

## Security Concerns

This introduces no novel security concerns. All state is still maintained by validators and written to History
Archives. Splitting state into separate DBs has no inherit risks.

## Test Cases

TBD

## Implementation

TBD

0 comments on commit 0b41ae7

Please sign in to comment.