Until now, state sync on the Internet Computer used the legacy P2P layer, which works over TCP.
The new P2P layer is tailored for state sync and “uses the QUIC transport protocol, making it simpler, more performant, secure and ready for future evolution.”
The Internet Computer blockchain “relies on a peer-to-peer (P2P) protocol that distributes messages among nodes of each subnet.”
These messages are “generated by the Internet Computer protocol’s clients.”
These clients each implement higher-layer protocols, “such as the Internet Computer Consensus protocol, or the state-sync protocol, which this article will explain in detail later.”
Essentially, each client “has very different requirements and behaviors.”
Until now, a single P2P component was used “for all intra-subnet communication, by all protocol clients.” This approach led “to complex code and sub-optimal performance, making it harder to introduce changes in the implementation.”
The new P2P layer for state sync “decouples the P2P communication of different clients, simplifies the code, and makes it possible to introduce new features and improved networking for the Internet Computer.”
Part of this is achieved by adopting the QUIC transport protocol, “which replaces the currently used TCP. QUIC has many advantages over TCP in the setting of the Internet Computer — first and foremost, the ability to multiplex multiple streams.”
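To make the idea of multiplexing concrete, here is a toy sketch. It is not QUIC and not the Internet Computer's implementation: it only shows several logical streams sharing one connection by tagging each frame with a stream ID, so the receiver can reassemble each stream independently. The stream names (`consensus`, `state_sync`), the function names and the frame size are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import zip_longest


def multiplex(streams, frame_size=4):
    """Interleave frames from several logical streams onto one connection.

    `streams` maps a stream ID to the bytes that stream wants to send.
    Returns a list of (stream_id, payload) frames in send order.
    """
    per_stream = []
    for stream_id, data in streams.items():
        # Cut each stream's data into fixed-size frames tagged with its ID.
        chunks = [data[i:i + frame_size] for i in range(0, len(data), frame_size)]
        per_stream.append([(stream_id, chunk) for chunk in chunks])

    frames = []
    # Round-robin: take one frame from each stream in turn until all are drained.
    for round_frames in zip_longest(*per_stream):
        frames.extend(f for f in round_frames if f is not None)
    return frames


def demultiplex(frames):
    """Reassemble each logical stream from the interleaved frames."""
    received = defaultdict(bytes)
    for stream_id, payload in frames:
        received[stream_id] += payload
    return dict(received)


streams = {"consensus": b"block-proposal", "state_sync": b"chunk-0-of-state"}
frames = multiplex(streams)
reassembled = demultiplex(frames)
```

In real QUIC, streams also have independent flow control, and a lost packet on one stream does not stall delivery on the others. This toy version does not model either property.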
At the time of this blog post, the NNS (the DAO that governs the Internet Computer) has already begun rolling out the new P2P layer for state sync, following the community’s adoption of the proposal for replica version cabe2ae3 on eight subnets.
This means the new QUIC transport is now active “on all of the Internet Computer subnets. However, state sync is using the new P2P layer only in a smaller set of subnets for the moment.”
This update from the Internet Computer developers explains the technical changes in detail, and “how the new QUIC features are being leveraged to create a new and improved P2P layer for the Internet Computer.”
The peer-to-peer protocol of the Internet Computer
The peer-to-peer (P2P) layer of the Internet Computer is “responsible for message delivery within subnets of nodes.”
Each subnet operates a separate peer-to-peer network, and nodes “within each such subnet use this network to send messages to each other.”
On top of this layer there are multiple components that “use the P2P layer to exchange messages between peers, the most famous being the Internet Computer Consensus protocol.”
Other components, including the state sync protocol, “also use the P2P layer to exchange messages with peers on the same subnet.”
The state sync protocol of the Internet Computer
At a very high level, the state sync protocol of the Internet Computer “enables nodes, for example, ones that have fallen behind, to synchronize the replicated state of the subnet without having to (re-)execute all the blocks in the respective subnet blockchain.”
Instead, they can immediately download “the required state, and verify its authenticity relative to the subnet’s chain key, using this state sync protocol.”
The state sync protocol works as follows: “Up-to-date nodes create a so-called checkpoint every couple of hundreds of rounds (usually 500).”
This involves writing the replicated state to disk, “computing a hash of this state, and consensus attempting to agree on this state by including the hash in so-called catch-up packages (CUPs). The hash is computed in a manner that makes it impossible to come up with two different states that have the same hash, which means that the existence of an agreed-upon CUP for a particular height also implies that (1) the majority of nodes agree on the state at a particular height and (2) a majority of nodes is actually able to serve this state as part of state sync.”
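The checkpoint-and-verify idea can be sketched as follows. This is an illustration under stated assumptions, not the Internet Computer's actual hashing scheme: it assumes SHA-256, a state represented as a list of byte chunks, and hypothetical names (`CHECKPOINT_INTERVAL`, `state_hash`, `verify_synced_state`). Checking the CUP's signature against the subnet's chain key is omitted.

```python
import hashlib

CHECKPOINT_INTERVAL = 500  # "usually 500" rounds, per the text


def is_checkpoint_height(height):
    """Return True if a checkpoint would be created at this height."""
    return height > 0 and height % CHECKPOINT_INTERVAL == 0


def state_hash(chunks):
    """Hash each chunk, then hash the sequence of fixed-length chunk digests.

    Hashing digests rather than raw bytes keeps chunk boundaries unambiguous.
    """
    outer = hashlib.sha256()
    for chunk in chunks:
        outer.update(hashlib.sha256(chunk).digest())
    return outer.hexdigest()


def verify_synced_state(downloaded_chunks, agreed_hash):
    """Accept a downloaded state only if it matches the hash agreed on in the CUP."""
    return state_hash(downloaded_chunks) == agreed_hash


# An up-to-date node computes the hash at a checkpoint height...
state = [b"canister-a", b"canister-b"]
agreed = state_hash(state)

# ...and a node that has fallen behind checks what it downloaded against it.
ok = verify_synced_state(state, agreed)
tampered_ok = verify_synced_state([b"canister-a", b"tampered"], agreed)
```

The sketch relies on the collision resistance of the hash function: a syncing node that receives different bytes will, for all practical purposes, compute a different hash and reject the download.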
The team is also working “on revamping the P2P layer used by other clients.”
Their plan is to also “make it use the new transport component, and therefore eventually have messages from all clients multiplexed on the same QUIC connection, obsoleting the TCP transport. Using QUIC could help dynamically prioritize traffic of different clients. For example, since state sync may be a bandwidth-heavy operation, we may want to prioritize it over other clients, so it finishes as soon as possible. However, if many nodes on the same subnet are state-syncing, we might also want to make sure that the consensus protocol has precedence over state sync, otherwise it might not make progress fast enough.”
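The trade-off described above can be sketched with a simple weighted round-robin scheduler. This is purely illustrative: the client names, the weights, the one-third threshold and the `pick_weights` policy are assumptions, not the Internet Computer's design.

```python
def pick_weights(syncing_nodes, subnet_size):
    """Hypothetical policy: favor state sync unless many peers are syncing."""
    if syncing_nodes * 3 >= subnet_size:  # a third or more of the subnet
        return {"consensus": 3, "state_sync": 1}
    return {"consensus": 1, "state_sync": 3}


def schedule(queues, weights):
    """Weighted round-robin over per-client message queues.

    In each cycle, up to weights[client] messages are sent for each client
    (defaulting to 1 for clients without an explicit weight).
    Returns the send order as a list of (client, message) pairs.
    """
    queues = {client: list(msgs) for client, msgs in queues.items()}
    order = []
    while any(queues.values()):
        for client in list(queues):
            weight = weights.get(client, 1)
            take = queues[client][:weight]
            queues[client] = queues[client][weight:]
            order.extend((client, msg) for msg in take)
    return order


queues = {
    "consensus": ["c1", "c2", "c3", "c4"],
    "state_sync": ["s1", "s2", "s3", "s4"],
}

# One node out of 13 is syncing: state sync gets most of the bandwidth.
few_syncing = schedule(queues, pick_weights(1, 13))

# Five nodes out of 13 are syncing: consensus takes precedence.
many_syncing = schedule(queues, pick_weights(5, 13))
```

With one QUIC connection carrying all clients, a policy like this could in principle be adjusted at runtime, which is the kind of dynamic prioritization the quoted passage has in mind.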
The revamped P2P layer for consensus, “which is currently being implemented, significantly improves the existing P2P layer in several aspects including scalability, performance, and reliability of the P2P and the Consensus layers.”
For more technical details on this update, see the original post from the Internet Computer developers.