From 6265f16b6326ec35458f64fe89451b0ea43df93d Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Wed, 12 Mar 2025 17:09:26 +0100 Subject: [PATCH 01/21] dht: initial draft --- src/routing/kad-dht.md | 176 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 176 insertions(+) create mode 100644 src/routing/kad-dht.md diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md new file mode 100644 index 00000000..f83242bc --- /dev/null +++ b/src/routing/kad-dht.md @@ -0,0 +1,176 @@ +--- +title: Kademlia DHT +description: > + The IPFS Distributed Hash Table (DHT) specification defines a structured + overlay network used for peer routing and content routing in the + InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT + specification, adapting and adding features to support IPFS-specific + requirements. +date: 2022-08-26 +maturity: reliable +editors: + - name: Guillaume Michel + github: guillaumemichel + affiliation: + name: Shipyard + url: https://ipshipyard.com +tags: ['routing'] +order: 1 +--- + +The IPFS Distributed Hash Table (DHT) specification defines a structured +overlay network used for peer routing and content routing in the +InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT +specification, adapting and adding features to support IPFS-specific +requirements. + +## Introduction + +FIXME: + +### Relation to [libp2p kad-dht](https://github.com/libp2p/specs/tree/master/kad-dht) + +The IPFS Kademlia DHT specification is a specialization of the libp2p Kademlia DHT. + +It is possible to use an alternative DHT specification alongside an IPFS +implementation, rather than the one detailed here. This document specifically +outlines all protocol customizations and adaptations required for participation +in the [Amino DHT](#relation-to-the-amino-dht). If you're designing a new +Kademlia-based DHT for use with IPFS, some details in this specification may +appear overly specific or prescriptive. 
+ +### Relation to the [Amino DHT](https://blog.ipfs.tech/2023-09-amino-refactoring/#why-amino) + +The Amino DHT is the swarm of peers also referred to as the _Public IPFS DHT_. +It implements the IPFS Kademlia DHT specification and uses the protocol +identifier `/ipfs/kad/1.0.0`. The Amino DHT can be joined by using the [Amino +DHT +Bootstrappers](https://docs.ipfs.tech/concepts/public-utilities/#amino-dht-bootstrappers). + +The Amino DHT is utilized by multiple IPFS implementations, including +[`kubo`](https://github.com/ipfs/kubo) and +[`helia`](https://github.com/ipfs/helia). Multiple DHT swarms can coexist and +nodes MAY participate in multiple DHT swarms. DHT swarms can be either public +or private. + +Note that there could be multiple distinct DHT swarms using the same protocol +identifier as long as they don't have any common peers. This practice is +discouraged as networks will immediately merge if they enter in contact. Each +DHT swarm SHOULD have a dedicated protocol identifier. + +## Protocol Parameters + +FIXME: move parameters to appropriate sections + +The IPFS Kademlia DHT defines a number of Client and Server parameters that +need to be set to ensure the DHT operates correctly as a system. + +### Protocol Identifier + +All nodes participating in the same DHT swarm MUST use the same protocol +identifier. The protocol identifier uniquely identifies a DHT swarm. It follows +the format `//kad/`, e.g `/ipfs/kad/1.0.0` for the Amino +DHT protocol version `1.0.0`, or `/ipfs/lan/kad/1.0.0` for a local DHT swarm. + +### Routing Table Bucket Size + +DHT Servers MUST have a routing table bucket size of `20` (see [Routing +Table](#routing-table)). This corresponds to the `k` value as defined in the +original Kademlia paper [0]. The `k` value is also used as a replication factor +and defines how many peers are returned to a lookup request. 
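Since `k` also determines how many peers a lookup returns, selecting them amounts to sorting candidates by XOR distance to the target. The following is a minimal illustrative sketch (helper names are ours, not from any implementation; 1-byte identifiers are used for brevity, real identifiers are 256 bits):

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// xorDistance returns the XOR of two equal-length Kademlia identifiers.
// Compared as big-endian unsigned integers, the results order peers by
// Kademlia distance.
func xorDistance(a, b []byte) []byte {
	d := make([]byte, len(a))
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// kClosest returns up to k identifiers from peers, ordered by XOR distance
// to target (closest first).
func kClosest(target []byte, peers [][]byte, k int) [][]byte {
	sorted := append([][]byte{}, peers...)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(xorDistance(target, sorted[i]),
			xorDistance(target, sorted[j])) < 0
	})
	if len(sorted) > k {
		sorted = sorted[:k]
	}
	return sorted
}

func main() {
	target := []byte{0xA0}
	peers := [][]byte{{0xA1}, {0x20}, {0x80}}
	// distances to 0xA0 are 0x01, 0x80 and 0x20 respectively
	fmt.Println(kClosest(target, peers, 2))
}
```

Comparing XOR results bytewise works because the XOR metric orders keys exactly like big-endian unsigned integers.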
+ +While DHT Client technically don't need to store a routing table, DHT Clients +MUST nonetheless use a replication factor of `20`. If Client implementations +decide to include a routing table, they SHOULD use a bucket size of `20`. + +### Provide Validity + +Provide Validity defines the time-to-live (TTL) of a Provider Record on a DHT +Server. DHT Servers MUST implement a Provide Validity of `48h`. + +### Provider Record Republish Interval + +Because of the churn in the network, Provider Records need to be republished +more often than their validity period. DHT Clients SHOULD republish Provider +Records every `22h` +([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17-provider-record-liveness.md#42-alternative-k-values-and-their-performance-comparison)). + +### Provider Addresses TTL + +DHT Servers SHOULD persist the multiaddresses of providers for `24h` after the +`PROVIDE` operation. This allows DHT Servers to serve the multiaddresses of the +content provider alongside the provide record, avoiding an additional DHT walk +for the Client +([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17.1-sharing-prs-with-multiaddresses.md)). + +### Concurrency + +Implementation specific. Recommendation is `10` + +### Resiliency + +Implementation specific. Recommendation is `3` + +### Routing Table Refresh Interval + +SHOULD `10min`. Only peers that have been seen in the last 10 minutes should remain in the routing table. If peer hasn't been seen recently, try to ping it to see if it's still alive. + +## DHT Swarm + +## Routing Table + +### Routing Table Refresh + +### Public addresses + +### IP Diversity Filter + +SHOULD implement. + +## Lookup Process + +### Lookup termination + +This is hard + +## Peer Routing + +DHT Clients that want to be routable must make sure they are in the peerstore of the closest DHT servers to their own PeerID. 
+ +When performing a `FIND_NODE` lookup, the client will converge to the closest nodes in XOR distance to the requested PeerID. These nodes are expected to know the multiaddrs of the target peer. The + +### Signed Peer Records + +## Content Routing + +### Provider Records + +### IPNS + +### Validators + +## Wire format + +Currently same as libp2p kad-dht + +Profobuf + +## Backpressure + +TBD + +## Client Optimizations + +### Checking peer behaviour before adding to routing table + +Make a `FIND_NODE` request and inspect response before adding node to RT. Followed https://blog.ipfs.tech/2023-ipfs-unresponsive-nodes/ + +## libp2p Kademlia DHT Implementations + +* Go: [`libp2p/go-libp2p-kad-dht`](https://github.com/libp2p/go-libp2p-kad-dht) +* JS: [libp2p/kad-dht](https://github.com/libp2p/js-libp2p/tree/main/packages/kad-dht) +* Rust: [libp2p-kad](https://github.com/libp2p/rust-libp2p/tree/master/protocols/kad) + +## References + +[0]: Maymounkov, P., & Mazières, D. (2002). Kademlia: A Peer-to-Peer Information System Based on the XOR Metric. In P. Druschel, F. Kaashoek, & A. Rowstron (Eds.), Peer-to-Peer Systems (pp. 53–65). Berlin, Heidelberg: Springer Berlin Heidelberg. [DOI](https://doi.org/10.1007/3-540-45748-8_5) [pdf](https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf) From 1d3f8d2fb9f5bac64cb945911af33e5acf9784e5 Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Thu, 13 Mar 2025 17:40:21 +0100 Subject: [PATCH 02/21] routing table --- src/routing/kad-dht.md | 235 ++++++++++++++++++++++++++++++----------- 1 file changed, 175 insertions(+), 60 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index f83242bc..45785573 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -6,10 +6,11 @@ description: > InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT specification, adapting and adding features to support IPFS-specific requirements. 
-date: 2022-08-26 +date: FIXME maturity: reliable editors: - name: Guillaume Michel + url: https://guillaume.michel.id github: guillaumemichel affiliation: name: Shipyard @@ -28,6 +29,10 @@ requirements. FIXME: +Distributed Key-Value Store + +Goal of DHT is to find the closest peers to some key (in a specific geometry). Once this routing to the closest nodes is possible, nodes can interact with these nodes in various ways, including in asking them to store and serve data. + ### Relation to [libp2p kad-dht](https://github.com/libp2p/specs/tree/master/kad-dht) The IPFS Kademlia DHT specification is a specialization of the libp2p Kademlia DHT. @@ -39,112 +44,218 @@ in the [Amino DHT](#relation-to-the-amino-dht). If you're designing a new Kademlia-based DHT for use with IPFS, some details in this specification may appear overly specific or prescriptive. -### Relation to the [Amino DHT](https://blog.ipfs.tech/2023-09-amino-refactoring/#why-amino) +### Relation to the [Amino DHT](#amino-dht) -The Amino DHT is the swarm of peers also referred to as the _Public IPFS DHT_. -It implements the IPFS Kademlia DHT specification and uses the protocol -identifier `/ipfs/kad/1.0.0`. The Amino DHT can be joined by using the [Amino -DHT -Bootstrappers](https://docs.ipfs.tech/concepts/public-utilities/#amino-dht-bootstrappers). +Nodes participating in the [Amino DHT Swarm](#amino-dht) MUST implement the +IPFS Kademlia DHT specification. The IPFS Kademlia DHT specification MAY be +used in other DHT swarms as well. -The Amino DHT is utilized by multiple IPFS implementations, including -[`kubo`](https://github.com/ipfs/kubo) and -[`helia`](https://github.com/ipfs/helia). Multiple DHT swarms can coexist and -nodes MAY participate in multiple DHT swarms. DHT swarms can be either public -or private. 
+## DHT Swarms -Note that there could be multiple distinct DHT swarms using the same protocol -identifier as long as they don't have any common peers. This practice is -discouraged as networks will immediately merge if they enter in contact. Each -DHT swarm SHOULD have a dedicated protocol identifier. +A DHT swarm is a group of interconnected nodes running the IPFS Kademlia DHT protocol, collectively identified by a unique protocol identifier. IPFS nodes MAY participate in multiple DHT swarms simultaneously. DHT swarms can be either public or private. -## Protocol Parameters +### Protocol Identifier -FIXME: move parameters to appropriate sections +All nodes participating in the same DHT swarm MUST use the same libp2p protocol +identifier. The libp2p protocol identifier uniquely identifies a DHT swarm. It +follows the format `//kad/`, e.g `/ipfs/kad/1.0.0` for +the Amino DHT protocol version `1.0.0`, or `/ipfs/lan/kad/1.0.0` for a local +DHT swarm. -The IPFS Kademlia DHT defines a number of Client and Server parameters that -need to be set to ensure the DHT operates correctly as a system. +Note that there could be multiple distinct DHT swarms using the same libp2p +protocol identifier as long as they don't have any common peers. This practice +is discouraged as networks will immediately merge if they enter in contact. +Each DHT swarm SHOULD have a dedicated protocol identifier. -### Protocol Identifier +### Amino DHT -All nodes participating in the same DHT swarm MUST use the same protocol -identifier. The protocol identifier uniquely identifies a DHT swarm. It follows -the format `//kad/`, e.g `/ipfs/kad/1.0.0` for the Amino -DHT protocol version `1.0.0`, or `/ipfs/lan/kad/1.0.0` for a local DHT swarm. +The [Amino DHT](https://blog.ipfs.tech/2023-09-amino-refactoring/#why-amino) is +the swarm of peers also referred to as the _Public IPFS DHT_. It implements the +IPFS Kademlia DHT specification and uses the protocol identifier +`/ipfs/kad/1.0.0`. 
The Amino DHT can be joined by using the [Amino DHT
+Bootstrappers](https://docs.ipfs.tech/concepts/public-utilities/#amino-dht-bootstrappers).

-### Routing Table Bucket Size
+The Amino DHT is utilized by multiple IPFS implementations, including
+[`kubo`](https://github.com/ipfs/kubo) and
+[`helia`](https://github.com/ipfs/helia).
+
+### Client and Server Mode
+
+A node operating in Server Mode (or DHT Server) is responsible for responding
+to lookup queries from other nodes and storing records. It stores a share of
+the global DHT state, and needs to ensure that this state is up-to-date.
+
+A node operating in Client Mode (or DHT Client) is simply a client able to make
+requests to DHT Servers. DHT Clients don't answer queries and don't store
+records.
+
+Having a large number of reliable DHT servers benefits the network by
+distributing the load of handling queries and storing records. Nodes SHOULD
+operate in Server Mode if they are publicly reachable and have sufficient
+resources. Conversely, nodes behind NATs or firewalls, or with intermittent
+availability, low bandwidth, or limited CPU, RAM, or storage resources, SHOULD
+operate in Client Mode. Operating a DHT server without the capacity to respond
+quickly to queries negatively impacts network performance.
+
+DHT Servers advertise the libp2p Kademlia protocol identifier via the [libp2p
+identify
+protocol](https://github.com/libp2p/specs/blob/master/identify/README.md). In
+addition, DHT Servers accept incoming streams using the Kademlia protocol
+identifier. DHT Clients do not advertise support for the libp2p Kademlia
+protocol identifier. In addition, they do not offer the Kademlia protocol
+identifier for incoming streams.
+
+## Kademlia Keyspace
+
+Kademlia [0] operates on a binary keyspace defined as $\{0, 1\}^m$. In
+particular, the IPFS Kademlia DHT uses a keyspace of length $m=256, containing
+all bitstrings of 256 bits.
The distance between any pair of keys is defined as +the bitwise XOR of the two keys, resulting in a new key representing the +distance between the two keys. This keyspace is used for indexing both nodes +and content. + +The Kademlia node identifier is derived from the node's [Peer +ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md). The +Kademlia node identifier is computed as the digest of the SHA256 hash function +of the binary representation of the Peer ID. The Kademlia identifier is a +256-bit number, which is used as the node's identifier in the Kademlia +keyspace. + +Example: + +```sh +PeerID b58 representation: 12D3KooWKudojFn6pff7Kah2Mkem3jtFfcntpG9X3QBNiggsYxK2 +PeerID hex representation: 0024080112209e3b433cbd31c2b8a6ebbdca998bd0f4c2141c9c9af5422e976051b1e63af14d +Kademlia identifier (hex): e43d28f0996557c0d5571d75c62a57a59d7ac1d30a51ecedcdb9d5e4afa56100 +``` -DHT Servers MUST have a routing table bucket size of `20` (see [Routing -Table](#routing-table)). This corresponds to the `k` value as defined in the -original Kademlia paper [0]. The `k` value is also used as a replication factor -and defines how many peers are returned to a lookup request. +## Routing Table -While DHT Client technically don't need to store a routing table, DHT Clients -MUST nonetheless use a replication factor of `20`. If Client implementations -decide to include a routing table, they SHOULD use a bucket size of `20`. +The Kademlia Routing Table maintains contact information about other DHT +Servers in the network. It has knowledge about all nearby nodes and +progressively fewer nodes as the XOR distance increases. This structure allows +efficient and rapid navigation of the network during lookups. -### Provide Validity +The Routing Table MUST contain information about at least `k` DHT Servers whose +Kademlia Identifier shares a common prefix of length `l` with the local node, +for every `l` in `[0, 255]`, provided such nodes exist. 
The set of `k` peers +sharing a common prefix of length `l` with the local node is called the +_bucket_ `l`. -Provide Validity defines the time-to-live (TTL) of a Provider Record on a DHT -Server. DHT Servers MUST implement a Provide Validity of `48h`. +In practice, buckets with smaller indices will typically be full, as many nodes +in the network share shorter prefix lengths with the local node. Conversely, +buckets beyond a certain index usually remain empty, since it's statistically +unlikely that any node will have an identifier sharing a very long common +prefix with the local node. For more information see [bucket population +measurements](https://github.com/probe-lab/network-measurements/blob/master/results/rfm19-dht-routing-table-health.md#peers-distribution-in-the-k-buckets). -### Provider Record Republish Interval +The IPFS Kademlia DHT uses a bucket size of `k = 20`. This corresponds to the +`k` value as defined in the original Kademlia paper [0]. The `k` value is also +used as a replication factor and defines how many peers are returned to a +lookup request. -Because of the churn in the network, Provider Records need to be republished -more often than their validity period. DHT Clients SHOULD republish Provider -Records every `22h` -([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17-provider-record-liveness.md#42-alternative-k-values-and-their-performance-comparison)). +Note that DHT Clients are never included in a Routing Table. -### Provider Addresses TTL +Each DHT Server MUST store the public +[multiaddresses](https://github.com/libp2p/specs/blob/master/addressing/README.md) +for every node in its Routing Table. DHT Servers MUST discard nodes with only +private and/or relay multiaddresses. Additionally, DHT Servers must verify that +these nodes are reachable and replace any nodes that are no longer accessible. 
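The identifier derivation and the common-prefix computation that determines a peer's bucket can be sketched with the Go standard library. This is an illustrative sketch (helper names are ours; the byte strings below are placeholders, not real multihash-encoded libp2p Peer IDs):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math/bits"
)

// kademliaID derives the 256-bit Kademlia identifier as the SHA256 digest
// of the binary representation of a Peer ID.
func kademliaID(peerID []byte) [32]byte {
	return sha256.Sum256(peerID)
}

// commonPrefixLen returns the length of the common bit prefix of two
// Kademlia identifiers, i.e. the index of the bucket the remote peer
// falls into from the local peer's perspective.
func commonPrefixLen(a, b [32]byte) int {
	for i := 0; i < 32; i++ {
		if x := a[i] ^ b[i]; x != 0 {
			return i*8 + bits.LeadingZeros8(x)
		}
	}
	return 256 // identical identifiers
}

func main() {
	local := kademliaID([]byte("peer-A"))  // placeholder Peer ID bytes
	remote := kademliaID([]byte("peer-B")) // placeholder Peer ID bytes
	fmt.Println("local: ", hex.EncodeToString(local[:]))
	fmt.Println("remote:", hex.EncodeToString(remote[:]))
	fmt.Println("bucket index:", commonPrefixLen(local, remote))
}
```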
-DHT Servers SHOULD persist the multiaddresses of providers for `24h` after the -`PROVIDE` operation. This allows DHT Servers to serve the multiaddresses of the -content provider alongside the provide record, avoiding an additional DHT walk -for the Client -([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17.1-sharing-prs-with-multiaddresses.md)). +### Replacement Policy -### Concurrency +Nodes MUST NOT be removed from the Routing Table as long as they remain online. +Therefore, the bucket replacement policy is based on seniority, ensuring that +the most stable peers are eventually retained in the Routing Table. -Implementation specific. Recommendation is `10` +#### IP Diversity Filter -### Resiliency +SHOULD implement -Implementation specific. Recommendation is `3` +FIXME: -### Routing Table Refresh Interval +### Routing Table Refresh -SHOULD `10min`. Only peers that have been seen in the last 10 minutes should remain in the routing table. If peer hasn't been seen recently, try to ping it to see if it's still alive. +There are several strategies a DHT Server can use to verify that nodes in its +Routing Table remain reachable. Implementations may choose their own methods, +provided they avoid serving unresponsive nodes. One recommended strategy is to +periodically refresh the Routing Table. -## DHT Swarm +DHT Servers SHOULD perform a Routing Table Refresh every `10` minutes. During +this process, the server sends a ping request to all nodes it hasn’t heard from +recently (e.g in the last 5 minutes). Any peer that fails to respond MUST be +removed from the Routing Table. -## Routing Table +After removing unresponsive peers, any buckets that are not full MUST be +replenished with fresh, online peers. This can be accomplished by either adding +recently connected peers or by executing a `FIND_NODE` request with a randomly +generated Peer ID matching the bucket. 
`FIND_NODE` requests should only be run +for buckets up to the last non-empty bucket. -### Routing Table Refresh +Finally, the refresh process concludes by executing a `FIND_NODE` request for +the local node's Peer ID, ensuring the DHT Server maintains up-to-date +information on its closest peers. -### Public addresses +## Lookup Process -### IP Diversity Filter +Iterative vs Recursive -SHOULD implement. +### Server behavior -## Lookup Process +In public DHT swarms, DHT Servers MUST never respond with private or loopback multiaddresses. + +Should Server tell Client about Server? And about Client? + +### Concurrency + +Implementation specific. Recommendation is `10` ### Lookup termination This is hard +#### Resiliency + +Implementation specific. Recommendation is `3` + ## Peer Routing DHT Clients that want to be routable must make sure they are in the peerstore of the closest DHT servers to their own PeerID. When performing a `FIND_NODE` lookup, the client will converge to the closest nodes in XOR distance to the requested PeerID. These nodes are expected to know the multiaddrs of the target peer. The +### Routing to non-DHT Servers + ### Signed Peer Records ## Content Routing +### Content Kademlia Identifier + +sha256 + ### Provider Records +#### Provide Validity + +Provide Validity defines the time-to-live (TTL) of a Provider Record on a DHT +Server. DHT Servers MUST implement a Provide Validity of `48h`. + +#### Provider Record Republish Interval + +Because of the churn in the network, Provider Records need to be republished +more often than their validity period. DHT Clients SHOULD republish Provider +Records every `22h` +([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17-provider-record-liveness.md#42-alternative-k-values-and-their-performance-comparison)). + +#### Provider Addresses TTL + +DHT Servers SHOULD persist the multiaddresses of providers for `24h` after the +`PROVIDE` operation. 
This allows DHT Servers to serve the multiaddresses of the +content provider alongside the provide record, avoiding an additional DHT walk +for the Client +([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17.1-sharing-prs-with-multiaddresses.md)). + ### IPNS ### Validators @@ -161,6 +272,10 @@ TBD ## Client Optimizations +### LAN DHT Swarms + +Fine to store private multiaddresses in the routing table and serve them to other nodes in the same LAN DHT swarm. + ### Checking peer behaviour before adding to routing table Make a `FIND_NODE` request and inspect response before adding node to RT. Followed https://blog.ipfs.tech/2023-ipfs-unresponsive-nodes/ From 4d2b7895ed3429e23f08be7754b95e7de29a2d34 Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Mon, 17 Mar 2025 11:49:54 +0100 Subject: [PATCH 03/21] lookup process --- src/routing/kad-dht.md | 93 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 80 insertions(+), 13 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 45785573..0045029b 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -27,12 +27,18 @@ requirements. ## Introduction -FIXME: +`FIXME:` Distributed Key-Value Store Goal of DHT is to find the closest peers to some key (in a specific geometry). Once this routing to the closest nodes is possible, nodes can interact with these nodes in various ways, including in asking them to store and serve data. +### DHT Operations + +* Peer Routing +* Value storage and retrieval +* Content provider advertisement and dsicovery + ### Relation to [libp2p kad-dht](https://github.com/libp2p/specs/tree/master/kad-dht) The IPFS Kademlia DHT specification is a specialization of the libp2p Kademlia DHT. @@ -170,9 +176,7 @@ the most stable peers are eventually retained in the Routing Table. 
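The seniority-based replacement policy — never evicting a peer while it remains online — can be illustrated as follows. This is a hypothetical sketch (types and names are ours; liveness is assumed to be tracked separately, e.g. by pinging):

```go
package main

import "fmt"

const k = 20 // bucket size

type peerEntry struct {
	id    string
	alive bool
}

// tryAdd inserts a newcomer into a bucket following a seniority-based
// replacement policy: peers are never evicted while online, so a full
// bucket only admits the newcomer in place of a peer known to be offline.
func tryAdd(bucket []peerEntry, newcomer peerEntry) ([]peerEntry, bool) {
	if len(bucket) < k {
		return append(bucket, newcomer), true
	}
	for i, p := range bucket {
		if !p.alive {
			bucket[i] = newcomer // replace a dead peer, keep the seniors
			return bucket, true
		}
	}
	return bucket, false // all k peers are online: newcomer is dropped
}

func main() {
	bucket := []peerEntry{{"senior", true}}
	bucket, ok := tryAdd(bucket, peerEntry{"newcomer", true})
	fmt.Println(ok, len(bucket))
}
```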
#### IP Diversity Filter -SHOULD implement - -FIXME: +`FIXME:` DHT Servers SHOULD implement an [IP Diversity Filter](https://github.com/libp2p/go-libp2p-kbucket/blob/ddb36fa029a18ea0fd5a2b61eeb7235913749615/peerdiversity/filter.go#L45). ### Routing Table Refresh @@ -198,25 +202,82 @@ information on its closest peers. ## Lookup Process -Iterative vs Recursive +When performing a lookup for a Kademlia Identifier in the DHT, a node begins by +sending requests to known DHT servers whose identifiers are close to the +target. Each response provides information on peers that are even closer to the +target identifier, and the process continues iteratively until the absolute +closest peers are discovered. + +### Iterative vs Recursive Lookup + +In an iterative lookup, the querying node sends requests to several known DHT +servers. Each server returns a list of peers that are closer to the target +Kademlia Identifier, but does not continue the lookup process. The querying +node then directly contacts these closer peers, repeating the process until the +closest nodes are found. + +In a recursive lookup, the querying node delegates the task to a peer that is +closer to the target. That peer then queries its own closer peers on behalf of +the original node, and this delegation continues recursively until the target +is reached. + +The IPFS Kademlia DHT uses an iterative lookup approach because recursive +lookups can enable [amplification +attacks](https://en.wikipedia.org/wiki/Denial-of-service_attack#Amplification) +and make error handling more complex. ### Server behavior -In public DHT swarms, DHT Servers MUST never respond with private or loopback multiaddresses. +Upon receiving a lookup request for a Kademlia Identifier, a DHT Server MUST +return the Peer ID and multiaddresses of the `k` closest nodes to the requested +Kademlia Identifier that are stored in its Routing Table. DHT Servers SHOULD +NOT return any information about unresponsive nodes. 
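Only publicly routable addresses are useful to remote requesters, so a server filters the addresses it stores and returns. A simplified sketch of such a check using only the standard library (real implementations operate on multiaddrs rather than bare IP strings; the helper name is ours):

```go
package main

import (
	"fmt"
	"net/netip"
)

// hasPublicAddr reports whether at least one of the given IP addresses is
// publicly routable, i.e. neither private, loopback, link-local nor
// unspecified. A peer advertising only non-public addresses would be
// discarded.
func hasPublicAddr(addrs []string) bool {
	for _, s := range addrs {
		ip, err := netip.ParseAddr(s)
		if err != nil {
			continue
		}
		if !ip.IsPrivate() && !ip.IsLoopback() &&
			!ip.IsLinkLocalUnicast() && !ip.IsUnspecified() {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasPublicAddr([]string{"192.168.1.10", "127.0.0.1"})) // private/loopback only
	fmt.Println(hasPublicAddr([]string{"10.0.0.2", "203.0.113.7"}))   // includes a public address
}
```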
+
In public DHT swarms, DHT Servers MUST filter out private and loopback
+multiaddresses, and MUST NOT include peers whose only addresses are private or
+loopback.

-Should Server tell Client about Server? And about Client?
+`FIXME:` Define whether DHT Server should return information about itself and
+about the requester.

-### Concurrency
+### Client behavior

-Implementation specific. Recommendation is `10`
+When a client (or a DHT server acting as a client) initiates a lookup for a
+Kademlia Identifier `kid`, it begins by selecting the known nodes closest to
+`kid` in terms of XOR distance, and adds them to a candidate list. It then
+sends lookup requests to the closest nodes from that list (see
+[concurrency](#concurrency)).

-### Lookup termination
+As responses are received, any newly discovered peers are added to the
+candidate list. The client proceeds by sending a request to the nearest peer to
+`kid` that has not yet been queried. Invalid responses and timeouts are simply
+discarded.

-This is hard
-
-#### Resiliency
-
-Implementation specific. Recommendation is `3`
+#### Termination
+
+The lookup process continues until the `k` closest reachable peers to `kid`
+have been successfully queried. The process may also be terminated early if the
+request-specific success criteria are met. Additionally, if every candidate
+peer has been queried without discovering any new ones, the lookup will
+terminate.
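The client behavior and termination rule can be sketched as a sequential loop over a mocked network. This is an illustrative sketch only (hypothetical helpers, 1-byte identifiers, no concurrency, timeouts or failure handling):

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// k is the replication factor (20 on the Amino DHT, small here for the demo).
const k = 3

// queryFn models a FIND_NODE RPC: given the queried peer's identifier,
// it returns the closer peers that this peer knows about.
type queryFn func(peer []byte) [][]byte

func xor(a, b []byte) []byte {
	d := make([]byte, len(a))
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

func sortByDistance(target []byte, peers [][]byte) {
	sort.Slice(peers, func(i, j int) bool {
		return bytes.Compare(xor(target, peers[i]), xor(target, peers[j])) < 0
	})
}

// firstUnqueried returns the closest not-yet-queried candidate among the
// k closest candidates, or nil if they have all been queried.
func firstUnqueried(sorted [][]byte, queried map[string]bool) []byte {
	n := len(sorted)
	if n > k {
		n = k
	}
	for _, c := range sorted[:n] {
		if !queried[string(c)] {
			return c
		}
	}
	return nil
}

func contains(peers [][]byte, p []byte) bool {
	for _, q := range peers {
		if bytes.Equal(q, p) {
			return true
		}
	}
	return false
}

// lookup runs a sequential iterative lookup for target, starting from the
// seed peers, and returns the k closest peers that were successfully queried.
func lookup(target []byte, seeds [][]byte, query queryFn) [][]byte {
	candidates := append([][]byte{}, seeds...)
	queried := map[string]bool{}
	for {
		sortByDistance(target, candidates)
		next := firstUnqueried(candidates, queried)
		if next == nil {
			break // the k closest candidates have all been queried
		}
		queried[string(next)] = true
		// Newly discovered peers join the candidate list.
		for _, p := range query(next) {
			if !contains(candidates, p) {
				candidates = append(candidates, p)
			}
		}
	}
	// Keep the k closest peers that answered.
	var out [][]byte
	for _, c := range candidates {
		if queried[string(c)] {
			out = append(out, c)
		}
		if len(out) == k {
			break
		}
	}
	return out
}

func main() {
	// Fully connected mock network: every peer knows all peers.
	peers := [][]byte{{0x01}, {0x02}, {0x10}, {0x20}}
	query := func(peer []byte) [][]byte { return peers }
	fmt.Println(lookup([]byte{0x00}, [][]byte{{0x20}}, query))
}
```

A real client additionally issues up to `α` concurrent requests and discards timeouts and invalid responses, as described above.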
+ +#### Concurrency + +A client MAY have multiple concurrent in-flight queries to distinct nodes for +the same lookup. This behavior is specific to the client and does not affect +how DHT servers operate. + +It is recommended that the maximum number of in-flight requests (denoted by +`α`) be set to `10`. ## Peer Routing @@ -224,6 +285,8 @@ DHT Clients that want to be routable must make sure they are in the peerstore of When performing a `FIND_NODE` lookup, the client will converge to the closest nodes in XOR distance to the requested PeerID. These nodes are expected to know the multiaddrs of the target peer. The +### `FIND_NODE` Termination + ### Routing to non-DHT Servers ### Signed Peer Records @@ -234,6 +297,10 @@ When performing a `FIND_NODE` lookup, the client will converge to the closest no sha256 +### Lookup Termination and Resiliency + +Resiliency: Implementation specific. Recommendation is `3` + ### Provider Records #### Provide Validity From f1a6ef353e827a771f4c87965743de54ff3b5b5c Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Mon, 17 Mar 2025 13:45:43 +0100 Subject: [PATCH 04/21] rpc messages --- src/routing/kad-dht.md | 136 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 133 insertions(+), 3 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 0045029b..c76e9c68 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -327,11 +327,141 @@ for the Client ### Validators -## Wire format +## RPC Messages -Currently same as libp2p kad-dht +Remote procedure calls are performed by: + +1. Opening a new stream. +2. Sending the RPC request message. +3. Listening for the RPC response message. +4. Closing the stream. + +On any error, the stream is reset. + +Implementations MAY re-use streams by sending one or more RPC request messages +on a single outgoing stream before closing it. Implementations MUST handle +additional RPC request messages on an incoming stream. 
+ +All RPC messages sent over a stream are prefixed with the message length in +bytes, encoded as an unsigned variable length integer as defined by the +[multiformats unsigned-varint +spec](https://github.com/multiformats/unsigned-varint). + +All RPC messages conform to the following protobuf: + +```protobuf +syntax = "proto3"; + +// Record represents a dht record that contains a value +// for a key value pair +message Record { + // The key that references this record + bytes key = 1; + + // The actual value this record is storing + bytes value = 2; + + // Note: These fields were removed from the Record message + // + // Hash of the authors public key + // optional string author = 3; + // A PKI signature for the key+value+author + // optional bytes signature = 4; + + // Time the record was received, set by receiver + // Formatted according to https://datatracker.ietf.org/doc/html/rfc3339 + string timeReceived = 5; +}; + +message Message { + enum MessageType { + PUT_VALUE = 0; + GET_VALUE = 1; + ADD_PROVIDER = 2; + GET_PROVIDERS = 3; + FIND_NODE = 4; + PING = 5; + } + + enum ConnectionType { + // sender does not have a connection to peer, and no extra information (default) + NOT_CONNECTED = 0; + + // sender has a live connection to peer + CONNECTED = 1; + + // sender recently connected to peer + CAN_CONNECT = 2; + + // sender recently tried to connect to peer repeatedly but failed to connect + // ("try" here is loose, but this should signal "made strong effort, failed") + CANNOT_CONNECT = 3; + } + + message Peer { + // ID of a given peer. + bytes id = 1; + + // multiaddrs for a given peer + repeated bytes addrs = 2; + + // used to signal the sender's connection capabilities to the peer + ConnectionType connection = 3; + } + + // defines what type of message it is. + MessageType type = 1; + + // defines what coral cluster level this query/response belongs to. + // in case we want to implement coral's cluster rings in the future. 
+ int32 clusterLevelRaw = 10; // NOT USED + + // Used to specify the key associated with this message. + // PUT_VALUE, GET_VALUE, ADD_PROVIDER, GET_PROVIDERS + bytes key = 2; + + // Used to return a value + // PUT_VALUE, GET_VALUE + Record record = 3; + + // Used to return peers closer to a key in a query + // GET_VALUE, GET_PROVIDERS, FIND_NODE + repeated Peer closerPeers = 8; + + // Used to return Providers + // GET_VALUE, ADD_PROVIDER, GET_PROVIDERS + repeated Peer providerPeers = 9; +} +``` + +These are the requirements for each `MessageType`: + +* `FIND_NODE`: In the request `key` must be set to the binary `PeerId` of the +node to be found. In the response `closerPeers` is set to the DHT Server's `k` +closest `Peer`s. + +* `GET_VALUE`: In the request `key` is an unstructured array of bytes. +`closerPeers` is set to the `k` closest peers. If `key` is found in the +datastore `record` is set to the value for the given key. + +* `PUT_VALUE`: In the request `record` is set to the record to be stored and +`key` on `Message` is set to equal `key` of the `Record`. The target node +validates `record`, and if it is valid, it stores it in the datastore and as a +response echoes the request. + +* `GET_PROVIDERS`: In the request `key` is set to the multihash contained in +the target CID. The target node returns the known `providerPeers` (if any) and +the `k` closest known `closerPeers`. + +* `ADD_PROVIDER`: In the request `key` is set to the multihash contained in the +target CID. The target node verifies `key` is a valid multihash, all +`providerPeers` matching the RPC sender's PeerID are recorded as providers. + +* `PING`: Deprecated message type replaced by the dedicated [ping +protocol](https://github.com/libp2p/specs/blob/master/ping/ping.md). -Profobuf +If a DHT server receives an invalid request, it simply closes the libp2p stream +without responding. 
## Backpressure From 3bc396d2e31ec6636540e39f9125d36e91f2ee7e Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Mon, 17 Mar 2025 13:52:07 +0100 Subject: [PATCH 05/21] format --- src/routing/kad-dht.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index c76e9c68..11477b91 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -113,8 +113,8 @@ identifier for incoming streams. ## Kademlia Keyspace -Kademlia [0] operates on a binary keyspace defined as $\{0, 1\}^m$. In -particular, the IPFS Kademlia DHT uses a keyspace of length $m=256, containing +Kademlia [0] operates on a binary keyspace defined as $\lbrace 0,1 \rbrace^m$. In +particular, the IPFS Kademlia DHT uses a keyspace of length $m=256$, containing all bitstrings of 256 bits. The distance between any pair of keys is defined as the bitwise XOR of the two keys, resulting in a new key representing the distance between the two keys. This keyspace is used for indexing both nodes From a5eefc39d8fc802691cc6a63e0ef67f5942dbc4c Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Mon, 17 Mar 2025 14:48:02 +0100 Subject: [PATCH 06/21] peer routing --- src/routing/kad-dht.md | 35 +++++++++++++++++++++++++++++++---- 1 file changed, 31 insertions(+), 4 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 11477b91..58712b52 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -281,16 +281,43 @@ It is recommended that the maximum number of in-flight requests (denoted by ## Peer Routing -DHT Clients that want to be routable must make sure they are in the peerstore of the closest DHT servers to their own PeerID. +Implementations typically provide two interfaces for peer routing using the +`FIND_NODE` RPC: `FindPeer`, which locates a specific Peer ID, and +`GetClosestPeers`, which finds the `k` closest peers to a given key. 
-When performing a `FIND_NODE` lookup, the client will converge to the closest nodes in XOR distance to the requested PeerID. These nodes are expected to know the multiaddrs of the target peer. The +### `FindPeer` -### `FIND_NODE` Termination +`FindPeer` is the process of discovering the multiaddresses of a given Peer ID. +The requester uses the `FIND_NODE` RPC, including the bytes representation of +the target Peer ID in the `key` field. The lookup eventually converges on the +target Peer ID. The lookup process terminates early if the requester has +established a connection to the target Peer ID. -### Routing to non-DHT Servers +#### Discovering non-DHT Servers + +DHT clients that want to remain routable must ensure their multiaddresses are +stored in the peerstore of the DHT Servers closest to them in XOR distance. +Since peerstore entries expire over time, DHT Clients SHOULD periodically +reconnect to their closest DHT servers to prevent their information from being +removed. It is recommended to perform this reconnection every 10 minutes. + +When receiving a `FIND_NODE` request for a given Peer ID, DHT Servers MUST +always respond with the information of that Peer ID, if it is included in their +peerstore, even if the target node isn't a DHT Server or only advertises +private addresses. + +### `GetClosestPeers` + +`GetClosestPeers` also makes use of the `FIND_NODE` RPC, but allows the sender +to look for the `k` closest peers to any key. The `key` provided to `FIND_NODE` +corresponds to the preimage of the Kademlia Identifier. + +`GetClosestPeers` is used for Content Routing. ### Signed Peer Records +`FIXME`: Signed Peer Records are not yet implemented in the IPFS Kademlia DHT. 
+ ## Content Routing ### Content Kademlia Identifier From b811eaf95580bfca86382a2bbd26c1eba02d2c67 Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Tue, 18 Mar 2025 10:52:06 +0100 Subject: [PATCH 07/21] provider records --- src/routing/kad-dht.md | 87 +++++++++++++++++++++++++++++++++++------- 1 file changed, 74 insertions(+), 13 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 58712b52..92618c50 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -264,11 +264,15 @@ discarded. #### Termination -The lookup process continues until the `k` closest reachable peers to `kid` -have been successfully queried. The process may also be terminated early if the -request-specific success criteria are met. Additionally, if every candidate -peer has been queried without discovering any new ones, the lookup will -terminate. +The resilience parameter (`β`) defines the number of closest reachable peers +that must be successfully queried before a lookup is considered complete. It is +recommended to set `β` to `3`, ensuring that multiple nodes confirm the lookup +result for increased reliability. + +The lookup process continues until the `β` closest reachable peers to `kid` +have been queried. However, the process MAY terminate earlier if the +request-specific success criteria are met. Additionally, if all candidate peers +have been queried without discovering any new ones, the lookup MUST terminate. #### Concurrency @@ -310,30 +314,71 @@ private addresses. `GetClosestPeers` also makes use of the `FIND_NODE` RPC, but allows the sender to look for the `k` closest peers to any key. The `key` provided to `FIND_NODE` -corresponds to the preimage of the Kademlia Identifier. +corresponds to the preimage of the Kademlia Identifier, as described +[below](#content-kademlia-identifier). -`GetClosestPeers` is used for Content Routing. +`GetClosestPeers` is used for the purpose of Content Routing. 
### Signed Peer Records `FIXME`: Signed Peer Records are not yet implemented in the IPFS Kademlia DHT. -## Content Routing +## Provider Record Routing + +Provider Record Routing is the process of locating peers that provide a +specific piece of content, identified by its CID. This is achieved by storing +and retrieving Provider Records in the DHT. + +### Provider Records + +A Provider Record is an entry stored in the DHT associating a CID with one or +more Peer IDs providing the corresponding content. Instead of storing the +content itself, the DHT stores provider records pointing to the peers hosting +the content. + +A Provider Record is identified by the multihash contained by the CID. ### Content Kademlia Identifier -sha256 +The Kademlia Identifier associated with a CID is derived from the multihash +contained by the CID, by hashing it with the SHA256 hash function. The +resulting 256-bit digest is used as the Kademlia Identifier for the content. -### Lookup Termination and Resiliency +Example: -Resiliency: Implementation specific. Recommendation is `3` +```sh +CIDv1 (base32) : bafybeihfg3d7rdltd43u3tfvncx7n5loqofbsobojcadtmokrljfthuc7y +CID contained hash (hex) : 1220e536c7f88d731f374dccb568aff6f56e838a19382e488039b1ca8ad2599e82fe +Kademlia Identifier (hex): d623250f3f660ab4c3a53d3c97b3f6a0194c548053488d093520206248253bcb +``` -### Provider Records +### Content Provider Advertisement + +When a node wants to indicate that it provides the content associated with a +given CID, it first finds the `k` closest DHT Servers to the Kademlia +Identifier associated with the CID using [`GetClosestPeers`](#getclosestpeers). +The `key` in the `FIND_NODE` payload is set to the multihash contained in the +CID. + +Once the `k` closest DHT Servers are found, the node sends each of them an +`ADD_PROVIDER` RPC, using the same `key` and setting its own Peer ID as +`providerPeers`. + +The DHT Servers MUST make 2 checks before adding the provided `record` to their +datastore: +1. 
Verify that `key` is set, and doesn't exceed `80` bytes in size +2. Discard `providerPeers` whose Peer ID is not matching the sender's Peer ID + +Upon successful verification, the DHT Server stores the Provider Record in its +datastore, and responds by echoing the request to confirm success. If +verification fails, the server MUST close the stream without sending a +response. #### Provide Validity Provide Validity defines the time-to-live (TTL) of a Provider Record on a DHT -Server. DHT Servers MUST implement a Provide Validity of `48h`. +Server. DHT Servers MUST implement a Provide Validity of `48h`, and discard the +record after expiration. #### Provider Record Republish Interval @@ -350,8 +395,24 @@ content provider alongside the provide record, avoiding an additional DHT walk for the Client ([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17.1-sharing-prs-with-multiaddresses.md)). +### Content Provider Lookup + +To find providers for a given CID, a node initiates a lookup using the +GET_PROVIDERS RPC. This process follows the same approach as a FIND_NODE +lookup, but with one key difference: if a DHT server holds a matching provider +record, it MUST include it in the response. + +Clients MAY terminate the lookup early if they are satisfied with the returned +providers. If a node does not find any provider records and is unable to +discover closer DHT servers after querying the β closest reachable servers, the +request is considered a failure. 
+ +## DHT Record Storage + ### IPNS +### Resiliency + ### Validators ## RPC Messages From ee9c473266a93c0e580620b2c60857ed2a5ee99a Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Tue, 18 Mar 2025 15:29:14 +0100 Subject: [PATCH 08/21] ipns --- src/routing/kad-dht.md | 96 +++++++++++++++++++++++++++++++++++++++--- 1 file changed, 89 insertions(+), 7 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 92618c50..877be759 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -336,7 +336,12 @@ more Peer IDs providing the corresponding content. Instead of storing the content itself, the DHT stores provider records pointing to the peers hosting the content. -A Provider Record is identified by the multihash contained by the CID. +A Provider Record is identified by the multihash contained by the CID. It +functions as an append-only list, where multiple providers can add themselves +as content hosts. Since strict consistency across the network is not required, +different DHT servers MAY store slightly different sets of providers, but the +lookup mechanism ensures that clients can still discover multiple sources +efficiently. ### Content Kademlia Identifier @@ -398,22 +403,97 @@ for the Client ### Content Provider Lookup To find providers for a given CID, a node initiates a lookup using the -GET_PROVIDERS RPC. This process follows the same approach as a FIND_NODE +`GET_PROVIDERS` RPC. This process follows the same approach as a `FIND_NODE` lookup, but with one key difference: if a DHT server holds a matching provider record, it MUST include it in the response. Clients MAY terminate the lookup early if they are satisfied with the returned providers. If a node does not find any provider records and is unable to -discover closer DHT servers after querying the β closest reachable servers, the -request is considered a failure. 
+discover closer DHT servers after querying the `β` closest reachable servers, +the request is considered a failure. -## DHT Record Storage +## Value Storage and Retrieval + +The IPFS Kademlia DHT allows users to store and retrieve records directly +within the DHT. These records serve as key-value mappings, where the key and +value are defined as arrays of bytes. Each record belongs to a specific +keyspace, which defines its type and structure. + +The IPFS Kademlia DHT supports two types of records, each stored in its own +keyspace: + +1. **Public Key Records** (`/pk/`) – Used to store public keys that cannot be + derived from Peer IDs. +2. **IPNS Records** (`/ipns/`) – Used for decentralized naming and content + resolution. + +Records MUST meet validity criteria specific to their record type before being +stored or updated. DHT Servers MUST verify the validity of each record before +accepting it. + +### Routing + +The Kademlia Identifier of a record is derived by applying the SHA256 hash +function to the record’s key and using the resulting digest in binary format. + +To store a value in the DHT, a client first finds the `k` closest peers to the +record’s Kademlia Identifier using `GetClosestPeers`. The client then sends a +`PUT_VALUE` RPC to each of these peers, including the `key` and the `record`. +DHT servers MUST validate the record based on its type before accepting it. + +Retrieving values from the DHT follows a process similar to provider record +lookups. Clients send a `GET_VALUE` RPC, which directs the search toward the +`k` closest nodes to the target `key`. If a DHT Server holds a matching +`record`, it MUST include it in its response. The conditions for terminating +the lookup depend on the specific record type. + +### Public Keys + +Some public keys are too large to be embedded within libp2p Peer IDs ([keys +larger than 42 +bytes](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#peer-ids)). 
In such cases, the Peer ID is derived from the hash of the public key, but the
full key still needs to be accessible. To facilitate retrieval, public keys MAY
be stored directly in the DHT under the `/pk/` keyspace.

1. Key: `/pk/` (binary Peer ID format).
2. Value: The full public key (in binary format).

#### Validation

DHT servers MUST verify that the Peer ID derived from the full public key
matches the Peer ID encoded in the key. If the derived Peer ID does not match,
the record MUST be rejected.

### IPNS

IPNS (InterPlanetary Naming System) allows peers to publish mutable records
that point to content in IPFS. These records MAY be stored in the DHT under the
`/ipns/` namespace.

Record format and validation are documented in the [IPNS
specification](https://specs.ipfs.tech/ipns/ipns-record/). IPNS records are
limited in size to
[10KiB](https://specs.ipfs.tech/ipns/ipns-record/#record-size-limit).

#### Quorum

A quorum is the minimum number of distinct responses a client must collect from
DHT Servers to determine a valid result. Since different DHT Servers may store
different versions of an IPNS record, a client fetches the record from multiple
DHT Servers to increase the likelihood of retrieving the most recent version.

For IPNS lookups, the default quorum value is `16`, meaning the client attempts
to collect responses from at least `16` DHT Servers out of `20` before
determining the best available record.

#### Entry Correction

Because some DHT servers may store outdated versions of a record, clients need
to ensure that the latest valid version is propagated. After obtaining a
quorum, the client MUST send the most recent valid record to any of the `k`
closest DHT Servers to the record’s Kademlia Identifier that did not return the
latest version.

## RPC Messages

@@ -571,6 +651,8 @@ Make a `FIND_NODE` request and inspect response before adding node to RT.

Follow * JS: [libp2p/kad-dht](https://github.com/libp2p/js-libp2p/tree/main/packages/kad-dht) * Rust: [libp2p-kad](https://github.com/libp2p/rust-libp2p/tree/master/protocols/kad) +--- + ## References [0]: Maymounkov, P., & Mazières, D. (2002). Kademlia: A Peer-to-Peer Information System Based on the XOR Metric. In P. Druschel, F. Kaashoek, & A. Rowstron (Eds.), Peer-to-Peer Systems (pp. 53–65). Berlin, Heidelberg: Springer Berlin Heidelberg. [DOI](https://doi.org/10.1007/3-540-45748-8_5) [pdf](https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf) From 921530018aa0d3e9b0cfa6bc3a5ab31314a5094b Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Tue, 18 Mar 2025 15:42:08 +0100 Subject: [PATCH 09/21] format --- src/routing/kad-dht.md | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 877be759..5bc0622c 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -6,7 +6,7 @@ description: > InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT specification, adapting and adding features to support IPFS-specific requirements. -date: FIXME +date: 2025-03-18 maturity: reliable editors: - name: Guillaume Michel @@ -639,20 +639,28 @@ TBD ### LAN DHT Swarms -Fine to store private multiaddresses in the routing table and serve them to other nodes in the same LAN DHT swarm. +Fine to store private multiaddresses in the routing table and serve them to +other nodes in the same LAN DHT swarm. ### Checking peer behaviour before adding to routing table -Make a `FIND_NODE` request and inspect response before adding node to RT. Followed https://blog.ipfs.tech/2023-ipfs-unresponsive-nodes/ +Make a `FIND_NODE` request and inspect response before adding node to RT. 
+Followed https://blog.ipfs.tech/2023-ipfs-unresponsive-nodes/ ## libp2p Kademlia DHT Implementations * Go: [`libp2p/go-libp2p-kad-dht`](https://github.com/libp2p/go-libp2p-kad-dht) -* JS: [libp2p/kad-dht](https://github.com/libp2p/js-libp2p/tree/main/packages/kad-dht) -* Rust: [libp2p-kad](https://github.com/libp2p/rust-libp2p/tree/master/protocols/kad) +* JS: +[libp2p/kad-dht](https://github.com/libp2p/js-libp2p/tree/main/packages/kad-dht) +* Rust: +[libp2p-kad](https://github.com/libp2p/rust-libp2p/tree/master/protocols/kad) --- ## References -[0]: Maymounkov, P., & Mazières, D. (2002). Kademlia: A Peer-to-Peer Information System Based on the XOR Metric. In P. Druschel, F. Kaashoek, & A. Rowstron (Eds.), Peer-to-Peer Systems (pp. 53–65). Berlin, Heidelberg: Springer Berlin Heidelberg. [DOI](https://doi.org/10.1007/3-540-45748-8_5) [pdf](https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf) +[0]: Maymounkov, P., & Mazières, D. (2002). Kademlia: A Peer-to-Peer +Information System Based on the XOR Metric. In P. Druschel, F. Kaashoek, & A. +Rowstron (Eds.), Peer-to-Peer Systems (pp. 53–65). Berlin, Heidelberg: Springer +Berlin Heidelberg. 
[DOI](https://doi.org/10.1007/3-540-45748-8_5) +[pdf](https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf) From 59b0ff18f3da85f7c2e3263ad1234c1ec498067f Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Wed, 19 Mar 2025 10:15:50 +0100 Subject: [PATCH 10/21] client optimizations --- src/routing/kad-dht.md | 44 ++++++++++++++++++++++++++++-------------- 1 file changed, 29 insertions(+), 15 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 5bc0622c..7cf41680 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -2,10 +2,9 @@ title: Kademlia DHT description: > The IPFS Distributed Hash Table (DHT) specification defines a structured - overlay network used for peer routing and content routing in the - InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT - specification, adapting and adding features to support IPFS-specific - requirements. + overlay network used for peer and content routing in the InterPlanetary File + System (IPFS). It extends the libp2p Kademlia DHT specification, adapting and + adding features to support IPFS-specific requirements. date: 2025-03-18 maturity: reliable editors: @@ -20,10 +19,9 @@ order: 1 --- The IPFS Distributed Hash Table (DHT) specification defines a structured -overlay network used for peer routing and content routing in the -InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT -specification, adapting and adding features to support IPFS-specific -requirements. +overlay network used for peer and content routing in the InterPlanetary File +System (IPFS). It extends the libp2p Kademlia DHT specification, adapting and +adding features to support IPFS-specific requirements. ## Introduction @@ -36,8 +34,8 @@ Goal of DHT is to find the closest peers to some key (in a specific geometry). 
O ### DHT Operations * Peer Routing +* Content provider advertisement and discovery * Value storage and retrieval -* Content provider advertisement and dsicovery ### Relation to [libp2p kad-dht](https://github.com/libp2p/specs/tree/master/kad-dht) @@ -58,7 +56,10 @@ used in other DHT swarms as well. ## DHT Swarms -A DHT swarm is a group of interconnected nodes running the IPFS Kademlia DHT protocol, collectively identified by a unique protocol identifier. IPFS nodes MAY participate in multiple DHT swarms simultaneously. DHT swarms can be either public or private. +A DHT swarm is a group of interconnected nodes running the IPFS Kademlia DHT +protocol, collectively identified by a unique protocol identifier. IPFS nodes +MAY participate in multiple DHT swarms simultaneously. DHT swarms can be either +public or private. ### Protocol Identifier @@ -639,13 +640,26 @@ TBD ### LAN DHT Swarms -Fine to store private multiaddresses in the routing table and serve them to -other nodes in the same LAN DHT swarm. +Implementations MAY support private or LAN-specific DHT swarms, which operate +within a local network and remain isolated from the public DHT. Nodes MAY +participate in multiple DHT swarms simultaneously, provided that each swarm has +a unique protocol identifier. -### Checking peer behaviour before adding to routing table +Private DHT swarms MAY store and serve private multiaddresses, as they are not +exposed to the public network. -Make a `FIND_NODE` request and inspect response before adding node to RT. -Followed https://blog.ipfs.tech/2023-ipfs-unresponsive-nodes/ +### Verifying DHT Server + +Implementations MAY perform additional checks to ensure that DHT servers behave +correctly before adding them to the routing table. In the past, misconfigured +nodes have been added to routing tables, leading to [network +slowdowns](https://blog.ipfs.tech/2023-ipfs-unresponsive-nodes/). 
+ +For example, kubo verifies a DHT server by sending a FIND_NODE request for its +own Peer ID before adding it to the routing table +([reference](https://github.com/libp2p/go-libp2p-kad-dht/blob/master/optimizations.md#checking-before-adding)). +The server is only added if its response contains at least one peer. This check +is skipped during the initial routing table setup. ## libp2p Kademlia DHT Implementations From 53fc4a7c6f0b7baaffb6b686c225571e0b640260 Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Wed, 19 Mar 2025 11:01:58 +0100 Subject: [PATCH 11/21] diversity filter --- src/routing/kad-dht.md | 33 ++++++++++++++++++++++++++++++--- 1 file changed, 30 insertions(+), 3 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 7cf41680..cd4f27ad 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -177,7 +177,28 @@ the most stable peers are eventually retained in the Routing Table. #### IP Diversity Filter -`FIXME:` DHT Servers SHOULD implement an [IP Diversity Filter](https://github.com/libp2p/go-libp2p-kbucket/blob/ddb36fa029a18ea0fd5a2b61eeb7235913749615/peerdiversity/filter.go#L45). +DHT servers SHOULD implement an IP Diversity Filter to ensure that nodes in +their routing table originate from a diverse set of Autonomous System Numbers +(ASNs). This measure helps mitigate Sybil attacks and enhances the network’s +resilience. + +A recommended approach is to impose the following limits: + +* **Globally**, a maximum of `3` nodes sharing the same IP grouping should be +allowed in the routing table. +* **Per routing table bucket**, a maximum of `2` nodes from the same IP +grouping should be permitted. + +For IP grouping: + +* **IPv6 addresses** are grouped by ASN. +* **IPv4 addresses** are grouped by `/16` prefixes, except for [legacy Class A +blocks](https://en.wikipedia.org/wiki/List_of_assigned_/8_IPv4_address_blocks), +which are grouped by `/8` prefixes. 
+ +Since a single node can advertise multiple addresses, a peer MUST NOT be added +to the routing table if any of its addresses already exceed the allowed +representation within the table. ### Routing Table Refresh @@ -238,8 +259,14 @@ In public DHT swarms, DHT Servers MUST filter out private and loopback multiaddresses, and MUST NOT include peers whose only addresses are private or loopback. -`FIXME:` Define whether DHT Server should return information about itself and -about requester. +DHT Servers SHOULD NOT return their own Peer ID in responses to `FIND_NODE` +queries. However, they MUST include information about the requester, if and +only if the requester is a DHT Server in its routing table and it is among the +`k` closest nodes to the target key. + +A DHT Server SHOULD always return information about its known `k` closest +peers, provided its routing table contains at least `k` peers, even if those +peers are not closer to the target key than itself. ### Client behavior From dc718c72a273f8852b7b6fe2529631d00e891391 Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Wed, 19 Mar 2025 11:28:32 +0100 Subject: [PATCH 12/21] intro --- src/routing/kad-dht.md | 29 +++++++++++------------------ 1 file changed, 11 insertions(+), 18 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index cd4f27ad..802af3a7 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -5,7 +5,7 @@ description: > overlay network used for peer and content routing in the InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT specification, adapting and adding features to support IPFS-specific requirements. -date: 2025-03-18 +date: 2025-03-19 maturity: reliable editors: - name: Guillaume Michel @@ -25,17 +25,18 @@ adding features to support IPFS-specific requirements. ## Introduction -`FIXME:` +The Kademlia Distributed Hash Table (DHT) is a decentralized key-value store +designed to enable efficient and scalable peer-to-peer routing. 
It provides a +structured overlay network that allows nodes to locate peers and content in a +distributed system without relying on centralized servers. -Distributed Key-Value Store +The primary goal of the Kademlia routing algorithm is to progressively discover +and interact with nodes that are closest to a given key based on the network's +distance metric. Once a node has identified the closest peers, it can either: -Goal of DHT is to find the closest peers to some key (in a specific geometry). Once this routing to the closest nodes is possible, nodes can interact with these nodes in various ways, including in asking them to store and serve data. - -### DHT Operations - -* Peer Routing -* Content provider advertisement and discovery -* Value storage and retrieval +* **Locate a specific peer** in the network +* **Find content providers** serving content associated with a CID +* **Store and retrieve values** directly within the DHT, such as IPNS names ### Relation to [libp2p kad-dht](https://github.com/libp2p/specs/tree/master/kad-dht) @@ -347,10 +348,6 @@ corresponds to the preimage of the Kademlia Identifier, as described `GetClosestPeers` is used for the purpose of Content Routing. -### Signed Peer Records - -`FIXME`: Signed Peer Records are not yet implemented in the IPFS Kademlia DHT. - ## Provider Record Routing Provider Record Routing is the process of locating peers that provide a @@ -659,10 +656,6 @@ protocol](https://github.com/libp2p/specs/blob/master/ping/ping.md). If a DHT server receives an invalid request, it simply closes the libp2p stream without responding. 
-## Backpressure - -TBD - ## Client Optimizations ### LAN DHT Swarms From 7d97c198497e0e68bc715f012a1fe75ba83746d4 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Mon, 24 Mar 2025 23:17:37 +0100 Subject: [PATCH 13/21] chore: cosmetic editorials --- .gitignore | 1 + src/bitswap-protocol.md | 2 +- src/routing/http-routing-v1.md | 2 +- src/routing/kad-dht.md | 78 ++++++++++++++++++++-------------- 4 files changed, 50 insertions(+), 33 deletions(-) diff --git a/.gitignore b/.gitignore index d5b51159..1a1c0879 100644 --- a/.gitignore +++ b/.gitignore @@ -1,2 +1,3 @@ out/ super-linter.log +node_modules/ diff --git a/src/bitswap-protocol.md b/src/bitswap-protocol.md index 11d943ec..cc7ee19f 100644 --- a/src/bitswap-protocol.md +++ b/src/bitswap-protocol.md @@ -29,7 +29,7 @@ editors: name: Protocol Labs url: https://protocol.ai/ tags: ['exchange', 'routing'] -order: 1 +order: 2 --- Bitswap is a libp2p data exchange protocol for sending and receiving content diff --git a/src/routing/http-routing-v1.md b/src/routing/http-routing-v1.md index ea4db1de..2fac60f3 100644 --- a/src/routing/http-routing-v1.md +++ b/src/routing/http-routing-v1.md @@ -36,7 +36,7 @@ editors: url: https://ipshipyard.com xref: - ipns-record -order: 0 +order: 3 tags: ['routing'] --- diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 802af3a7..f6d77a74 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -1,11 +1,11 @@ --- -title: Kademlia DHT +title: IPFS Kademlia DHT description: > The IPFS Distributed Hash Table (DHT) specification defines a structured overlay network used for peer and content routing in the InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT specification, adapting and adding features to support IPFS-specific requirements. -date: 2025-03-19 +date: 2025-03-24 maturity: reliable editors: - name: Guillaume Michel @@ -38,9 +38,9 @@ distance metric. 
Once a node has identified the closest peers, it can either: * **Find content providers** serving content associated with a CID * **Store and retrieve values** directly within the DHT, such as IPNS names -### Relation to [libp2p kad-dht](https://github.com/libp2p/specs/tree/master/kad-dht) +### Relation to libp2p kad-dht -The IPFS Kademlia DHT specification is a specialization of the libp2p Kademlia DHT. +The IPFS Kademlia DHT specification is a specialization of the [libp2p Kademlia DHT](https://github.com/libp2p/specs/tree/master/kad-dht). It is possible to use an alternative DHT specification alongside an IPFS implementation, rather than the one detailed here. This document specifically @@ -49,20 +49,20 @@ in the [Amino DHT](#relation-to-the-amino-dht). If you're designing a new Kademlia-based DHT for use with IPFS, some details in this specification may appear overly specific or prescriptive. -### Relation to the [Amino DHT](#amino-dht) +### Relation to the Amino DHT -Nodes participating in the [Amino DHT Swarm](#amino-dht) MUST implement the +Nodes participating in the public [Amino DHT Swarm](#amino-dht) MUST implement the IPFS Kademlia DHT specification. The IPFS Kademlia DHT specification MAY be used in other DHT swarms as well. ## DHT Swarms A DHT swarm is a group of interconnected nodes running the IPFS Kademlia DHT -protocol, collectively identified by a unique protocol identifier. IPFS nodes +protocol, collectively identified by a unique libp2p protocol identifier. IPFS nodes MAY participate in multiple DHT swarms simultaneously. DHT swarms can be either public or private. -### Protocol Identifier +### libp2p Protocol Identifier All nodes participating in the same DHT swarm MUST use the same libp2p protocol identifier. The libp2p protocol identifier uniquely identifies a DHT swarm. It @@ -73,6 +73,7 @@ DHT swarm. 
Note that there could be multiple distinct DHT swarms using the same libp2p protocol identifier as long as they don't have any common peers. This practice is discouraged as networks will immediately merge if they enter in contact. + Each DHT swarm SHOULD have a dedicated protocol identifier. ### Amino DHT @@ -80,12 +81,14 @@ Each DHT swarm SHOULD have a dedicated protocol identifier. The [Amino DHT](https://blog.ipfs.tech/2023-09-amino-refactoring/#why-amino) is the swarm of peers also referred to as the _Public IPFS DHT_. It implements the IPFS Kademlia DHT specification and uses the protocol identifier -`/ipfs/kad/1.0.0`. The Amino DHT can be joined by using the [Amino DHT -Bootstrappers](https://docs.ipfs.tech/concepts/public-utilities/#amino-dht-bootstrappers). +`/ipfs/kad/1.0.0`. +:::note The Amino DHT is utilized by multiple IPFS implementations, including [`kubo`](https://github.com/ipfs/kubo) and -[`helia`](https://github.com/ipfs/helia). +[`helia`](https://github.com/ipfs/helia) +and can be joined by using the [public good Amino DHT Bootstrappers](https://docs.ipfs.tech/concepts/public-utilities/#amino-dht-bootstrappers). +::: ### Client and Server Mode @@ -103,18 +106,21 @@ operate in Server Mode if they are publicly reachable and have sufficient resources. Conversely, nodes behind NATs or firewalls, or with intermittent availability, low bandwidth, or limited CPU, RAM, or storage resources, SHOULD operate in Client Mode. Operating a DHT server without the capacity to respond -quickly to queries negatively impacts network performance. +quickly to queries negatively impacts network performance and SHOULD be avoided. -DHT Servers advertise the libp2p Kademlia protocol identifier via the [libp2p +DHT Servers MUST advertise the libp2p Kademlia protocol identifier via the [libp2p identify protocol](https://github.com/libp2p/specs/blob/master/identify/README.md). 
In -addition DHT Servers accept incoming streams using the Kademlia protocol -identifier. DHT Clients do not advertise support for the libp2p Kademlia -protocol identifier. In addition they do not offer the Kademlia protocol +addition DHT Servers MUST accept incoming streams using the libp2p Kademlia protocol +identifier. + +DHT Clients MUST NOT advertise support for the libp2p Kademlia +protocol identifier nor offer the libp2p Kademlia protocol identifier for incoming streams. ## Kademlia Keyspace + Kademlia [0] operates on a binary keyspace defined as $\lbrace 0,1 \rbrace^m$. In particular, the IPFS Kademlia DHT uses a keyspace of length $m=256$, containing all bitstrings of 256 bits. The distance between any pair of keys is defined as @@ -122,7 +128,7 @@ the bitwise XOR of the two keys, resulting in a new key representing the distance between the two keys. This keyspace is used for indexing both nodes and content. -The Kademlia node identifier is derived from the node's [Peer +The Kademlia node identifier is derived from the libp2p node's [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md). The Kademlia node identifier is computed as the digest of the SHA256 hash function of the binary representation of the Peer ID. The Kademlia identifier is a @@ -133,6 +139,7 @@ Example: ```sh PeerID b58 representation: 12D3KooWKudojFn6pff7Kah2Mkem3jtFfcntpG9X3QBNiggsYxK2 +PeerID CID representation: k51qzi5uqu5djx47o56x8r9lvy85co0sdf1yfbzxlukdq4irr8ssn3o7dpfasp PeerID hex representation: 0024080112209e3b433cbd31c2b8a6ebbdca998bd0f4c2141c9c9af5422e976051b1e63af14d Kademlia identifier (hex): e43d28f0996557c0d5571d75c62a57a59d7ac1d30a51ecedcdb9d5e4afa56100 ``` @@ -144,6 +151,8 @@ Servers in the network. It has knowledge about all nearby nodes and progressively fewer nodes as the XOR distance increases. This structure allows efficient and rapid navigation of the network during lookups. 
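The keyspace mechanics described above — deriving the 256-bit Kademlia identifier as the SHA-256 digest of the Peer ID's binary representation, and measuring distance by bitwise XOR — can be sketched as follows. This is an illustrative, non-normative sketch; the function names are ours, not part of the spec:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
)

// kadID derives the 256-bit Kademlia identifier: the SHA-256 digest of the
// binary representation of a Peer ID.
func kadID(peerIDBytes []byte) [32]byte {
	return sha256.Sum256(peerIDBytes)
}

// xorDistance is the Kademlia distance metric: the bitwise XOR of two
// identifiers, itself a key in the 256-bit keyspace.
func xorDistance(a, b [32]byte) (d [32]byte) {
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// commonPrefixLen returns the number of leading bits two identifiers share,
// the prefix length l used to group peers in the routing table.
func commonPrefixLen(a, b [32]byte) int {
	for i := range a {
		if x := a[i] ^ b[i]; x != 0 {
			return i*8 + bits.LeadingZeros8(x)
		}
	}
	return 256
}

func main() {
	self := kadID([]byte("example-peer-id-A")) // hypothetical Peer ID bytes
	other := kadID([]byte("example-peer-id-B"))
	fmt.Printf("distance: %x\n", xorDistance(self, other))
	fmt.Println("shared prefix bits:", commonPrefixLen(self, other))
}
```

Note the usual XOR-metric properties follow directly: a node is at distance zero from itself, and distance is symmetric.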
+### Bucket Size
+
The Routing Table MUST contain information about at least `k` DHT Servers whose
Kademlia Identifier shares a common prefix of length `l` with the local node,
for every `l` in `[0, 255]`, provided such nodes exist. The set of `k` peers
@@ -167,7 +176,7 @@ Note that DHT Clients are never included in a Routing Table.

Each DHT Server MUST store the public
[multiaddresses](https://github.com/libp2p/specs/blob/master/addressing/README.md)
for every node in its Routing Table. DHT Servers MUST discard nodes with only
-private and/or relay multiaddresses. Additionally, DHT Servers must verify that
+private and/or relay multiaddresses. Additionally, DHT Servers MUST verify that
these nodes are reachable and replace any nodes that are no longer accessible.

### Replacement Policy

@@ -315,8 +324,9 @@ It is recommended that the maximum number of in-flight requests (denoted by

## Peer Routing

Implementations typically provide two interfaces for peer routing using the
-`FIND_NODE` RPC: `FindPeer`, which locates a specific Peer ID, and
-`GetClosestPeers`, which finds the `k` closest peers to a given key.
+`FIND_NODE` RPC:
+- [`FindPeer`](#findpeer), which locates a specific Peer ID, and
+- [`GetClosestPeers`](#getclosestpeers), which finds the `k` closest peers to a given key.

### `FindPeer`

@@ -346,11 +356,12 @@ to look for the `k` closest peers to any key.

The `key` provided to `FIND_NODE`
corresponds to the preimage of the Kademlia Identifier, as described
[below](#content-kademlia-identifier).

-`GetClosestPeers` is used for the purpose of Content Routing.
+`GetClosestPeers` is used for the purpose of Content Routing
+([Provider Record Routing](#provider-record-routing)).

## Provider Record Routing

-Provider Record Routing is the process of locating peers that provide a
+Provider Record Routing is an IPFS-specific process of locating peers that provide a
specific piece of content, identified by its CID.
This is achieved by storing and retrieving Provider Records in the DHT. @@ -361,6 +372,7 @@ more Peer IDs providing the corresponding content. Instead of storing the content itself, the DHT stores provider records pointing to the peers hosting the content. + A Provider Record is identified by the multihash contained by the CID. It functions as an append-only list, where multiple providers can add themselves as content hosts. Since strict consistency across the network is not required, @@ -450,13 +462,15 @@ keyspace: 1. **Public Key Records** (`/pk/`) – Used to store public keys that cannot be derived from Peer IDs. 2. **IPNS Records** (`/ipns/`) – Used for decentralized naming and content - resolution. + resolution. See + [IPNS Routing Record](https://specs.ipfs.tech/ipns/ipns-record/#routing-record) + and [IPNS Record Verification](https://specs.ipfs.tech/ipns/ipns-record/#record-verification). Records MUST meet validity criteria specific to their record type before being stored or updated. DHT Servers MUST verify the validity of each record before accepting it. -### Routing +### Record Routing The Kademlia Identifier of a record is derived by applying the SHA256 hash function to the record’s key and using the resulting digest in binary format. @@ -497,9 +511,11 @@ that point to content in IPFS. These records MAY be stored in the DHT under the `/ipns/` namespace. Record format and validation is documented in the [IPNS -specification](https://specs.ipfs.tech/ipns/ipns-record/). IPNS records are -limited in size to -[10KiB]((https://specs.ipfs.tech/ipns/ipns-record/#record-size-limit)). +specification](https://specs.ipfs.tech/ipns/ipns-record/). + +IPNS implementations MUST follow [IPNS Routing Record](https://specs.ipfs.tech/ipns/ipns-record/#routing-record), +[IPNS Record Verification](https://specs.ipfs.tech/ipns/ipns-record/#record-verification), +and [IPNS Record Size Limit](https://specs.ipfs.tech/ipns/ipns-record/#record-size-limit). 
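The record-type dispatch described above — only the `/pk/` and `/ipns/` namespaces are defined, each with its own validity criteria, and IPNS records are subject to the 10 KiB size limit from the linked IPNS spec — could look roughly like this. A hedged sketch: the helper name is ours, and the real public-key and IPNS verification steps from the linked specifications are reduced to comments:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxIPNSRecordSize mirrors the 10 KiB limit from the IPNS record spec.
const maxIPNSRecordSize = 10 * 1024

// validateRecord dispatches on the record key's namespace prefix and applies
// per-type validity checks before a record may be stored or updated.
func validateRecord(key string, value []byte) error {
	switch {
	case strings.HasPrefix(key, "/pk/"):
		// A real DHT Server verifies here that the value is the public key
		// whose hash matches the key suffix.
		return nil
	case strings.HasPrefix(key, "/ipns/"):
		if len(value) > maxIPNSRecordSize {
			return errors.New("IPNS record exceeds the 10 KiB size limit")
		}
		// A real DHT Server runs full IPNS record verification here.
		return nil
	default:
		// Only the two namespaces above are defined in this spec.
		return errors.New("unknown record namespace")
	}
}

func main() {
	fmt.Println(validateRecord("/ipns/example-id", make([]byte, 512)))
	fmt.Println(validateRecord("/foo/example-id", nil))
}
```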
#### Quorum @@ -656,6 +672,8 @@ protocol](https://github.com/libp2p/specs/blob/master/ping/ping.md). If a DHT server receives an invalid request, it simply closes the libp2p stream without responding. +# Appendix: Notes for Implementers + ## Client Optimizations ### LAN DHT Swarms @@ -689,12 +707,10 @@ is skipped during the initial routing table setup. * Rust: [libp2p-kad](https://github.com/libp2p/rust-libp2p/tree/master/protocols/kad) ---- - -## References +# Bibliography [0]: Maymounkov, P., & Mazières, D. (2002). Kademlia: A Peer-to-Peer Information System Based on the XOR Metric. In P. Druschel, F. Kaashoek, & A. Rowstron (Eds.), Peer-to-Peer Systems (pp. 53–65). Berlin, Heidelberg: Springer Berlin Heidelberg. [DOI](https://doi.org/10.1007/3-540-45748-8_5) -[pdf](https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf) +[PDF](https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf) From 9eafd1583ec84b3c7d37430e2d1817d232a361eb Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Mon, 24 Mar 2025 23:22:44 +0100 Subject: [PATCH 14/21] chore: manual link to bibliography will do for now, we should support this in generator at some point --- src/routing/kad-dht.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index f6d77a74..cd491faf 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -121,7 +121,7 @@ identifier for incoming streams. ## Kademlia Keyspace -Kademlia [0] operates on a binary keyspace defined as $\lbrace 0,1 \rbrace^m$. In +Kademlia [`[0]`](#bibliography) operates on a binary keyspace defined as $\lbrace 0,1 \rbrace^m$. In particular, the IPFS Kademlia DHT uses a keyspace of length $m=256$, containing all bitstrings of 256 bits. The distance between any pair of keys is defined as the bitwise XOR of the two keys, resulting in a new key representing the @@ -167,7 +167,7 @@ prefix with the local node. 
For more information see [bucket population measurements](https://github.com/probe-lab/network-measurements/blob/master/results/rfm19-dht-routing-table-health.md#peers-distribution-in-the-k-buckets). The IPFS Kademlia DHT uses a bucket size of `k = 20`. This corresponds to the -`k` value as defined in the original Kademlia paper [0]. The `k` value is also +`k` value as defined in the original Kademlia paper [`[0]`](#bibliography). The `k` value is also used as a replication factor and defines how many peers are returned to a lookup request. From 51690fc543701d817b19e02c84742285666564fd Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Wed, 26 Mar 2025 14:26:48 +0100 Subject: [PATCH 15/21] addressing review --- src/routing/kad-dht.md | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index cd491faf..1c52bfde 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -130,9 +130,9 @@ and content. The Kademlia node identifier is derived from the libp2p node's [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md). The -Kademlia node identifier is computed as the digest of the SHA256 hash function -of the binary representation of the Peer ID. The Kademlia identifier is a -256-bit number, which is used as the node's identifier in the Kademlia +Kademlia node identifier is computed as the digest of the SHA2-256 hash +function of the binary representation of the Peer ID. The Kademlia identifier +is a 256-bit number, which is used as the node's identifier in the Kademlia keyspace. Example: @@ -224,9 +224,10 @@ removed from the Routing Table. After removing unresponsive peers, any buckets that are not full MUST be replenished with fresh, online peers. This can be accomplished by either adding -recently connected peers or by executing a `FIND_NODE` request with a randomly -generated Peer ID matching the bucket. 
`FIND_NODE` requests should only be run -for buckets up to the last non-empty bucket. +recently connected peers or by executing a `FIND_NODE` [RPC +message](#rpc-messages) with a randomly generated Peer ID matching the bucket. +`FIND_NODE` requests should only be run for buckets up to the last non-empty +bucket. Finally, the refresh process concludes by executing a `FIND_NODE` request for the local node's Peer ID, ensuring the DHT Server maintains up-to-date @@ -383,14 +384,14 @@ efficiently. ### Content Kademlia Identifier The Kademlia Identifier associated with a CID is derived from the multihash -contained by the CID, by hashing it with the SHA256 hash function. The +contained by the CID, by hashing it with the SHA2-256 hash function. The resulting 256-bit digest is used as the Kademlia Identifier for the content. Example: ```sh CIDv1 (base32) : bafybeihfg3d7rdltd43u3tfvncx7n5loqofbsobojcadtmokrljfthuc7y -CID contained hash (hex) : 1220e536c7f88d731f374dccb568aff6f56e838a19382e488039b1ca8ad2599e82fe +Multihash from CID (hex) : 1220e536c7f88d731f374dccb568aff6f56e838a19382e488039b1ca8ad2599e82fe Kademlia Identifier (hex): d623250f3f660ab4c3a53d3c97b3f6a0194c548053488d093520206248253bcb ``` @@ -466,13 +467,14 @@ keyspace: [IPNS Routing Record](https://specs.ipfs.tech/ipns/ipns-record/#routing-record) and [IPNS Record Verification](https://specs.ipfs.tech/ipns/ipns-record/#record-verification). -Records MUST meet validity criteria specific to their record type before being -stored or updated. DHT Servers MUST verify the validity of each record before -accepting it. +Records with the above prefixes MUST meet validity criteria specific to their +record type before being stored or updated. DHT Servers MUST verify the +validity of each record before accepting it. Records with other prefixes are +not supported by the IPFS Kademlia DHT and MUST be rejected. 
### Record Routing -The Kademlia Identifier of a record is derived by applying the SHA256 hash +The Kademlia Identifier of a record is derived by applying the SHA2-256 hash function to the record’s key and using the resulting digest in binary format. To store a value in the DHT, a client first finds the `k` closest peers to the From fb43c865cac24891117625552d386efd7f94861e Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Thu, 27 Mar 2025 14:18:15 +0100 Subject: [PATCH 16/21] provide clarifications --- src/routing/kad-dht.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 1c52bfde..30d08ce8 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -405,7 +405,8 @@ CID. Once the `k` closest DHT Servers are found, the node sends each of them an `ADD_PROVIDER` RPC, using the same `key` and setting its own Peer ID as -`providerPeers`. +`providerPeers`. Providers MUST indicate their listen multiaddresses to be +cached and served with the provider record. The DHT Servers MUST make 2 checks before adding the provided `record` to their datastore: @@ -413,9 +414,9 @@ datastore: 2. Discard `providerPeers` whose Peer ID is not matching the sender's Peer ID Upon successful verification, the DHT Server stores the Provider Record in its -datastore, and responds by echoing the request to confirm success. If -verification fails, the server MUST close the stream without sending a -response. +datastore, and caches the provided public multiaddresses. It responds by +echoing the request to confirm success. If verification fails, the server MUST +close the stream without sending a response. 
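Check 2 above — discarding `providerPeers` entries whose Peer ID does not match the sender's — can be sketched as a simple filter. The struct and function names are illustrative, not the wire format:

```go
package main

import "fmt"

// providerEntry stands in for one entry of the ADD_PROVIDER message's
// providerPeers field (fields are illustrative, not the wire format).
type providerEntry struct {
	peerID string
	addrs  []string
}

// filterProviders discards every providerPeers entry whose Peer ID does not
// match the sender's Peer ID, as required by check 2 above.
func filterProviders(senderPeerID string, entries []providerEntry) []providerEntry {
	var kept []providerEntry
	for _, e := range entries {
		if e.peerID == senderPeerID {
			kept = append(kept, e)
		}
	}
	return kept
}

func main() {
	// Hypothetical, truncated Peer IDs for illustration only.
	entries := []providerEntry{
		{peerID: "sender-peer", addrs: []string{"/ip4/203.0.113.7/udp/4001/quic-v1"}},
		{peerID: "other-peer"},
	}
	kept := filterProviders("sender-peer", entries)
	fmt.Println(len(kept)) // 1: only the sender's own entry survives
}
```

On success the surviving entries (and their cached public multiaddresses) are stored; if nothing valid remains, the server closes the stream without responding.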
#### Provide Validity From 68599776b80ea83d4fa53db1c93d1b20148130a1 Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Fri, 28 Mar 2025 17:53:15 +0100 Subject: [PATCH 17/21] specify transports requirements --- src/routing/kad-dht.md | 53 +++++++++++++++++++++++++++++++++++------- 1 file changed, 44 insertions(+), 9 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 30d08ce8..9ccdc3f0 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -40,7 +40,8 @@ distance metric. Once a node has identified the closest peers, it can either: ### Relation to libp2p kad-dht -The IPFS Kademlia DHT specification is a specialization of the [libp2p Kademlia DHT](https://github.com/libp2p/specs/tree/master/kad-dht). +The IPFS Kademlia DHT specification is an instantiation of the [libp2p Kademlia +DHT](https://github.com/libp2p/specs/tree/master/kad-dht). It is possible to use an alternative DHT specification alongside an IPFS implementation, rather than the one detailed here. This document specifically @@ -58,9 +59,9 @@ used in other DHT swarms as well. ## DHT Swarms A DHT swarm is a group of interconnected nodes running the IPFS Kademlia DHT -protocol, collectively identified by a unique libp2p protocol identifier. IPFS nodes -MAY participate in multiple DHT swarms simultaneously. DHT swarms can be either -public or private. +protocol, collectively identified by a unique libp2p protocol identifier. IPFS +nodes MAY participate in multiple DHT swarms simultaneously. DHT swarms can be +either public or private. ### libp2p Protocol Identifier @@ -97,10 +98,10 @@ to lookup queries from other nodes and storing records. It stores a share of the global DHT state, and needs to ensure that this state is up-to-date. A node operating in Client Mode (or DHT Client) is simply a client able to make -requests to DHT Servers. DHT Client don't answer to queries and don't store +requests to DHT Servers. 
DHT Clients don't answer queries and don't store
records.

-Having a large number of reliable DHT servers benefits the network by
+Having a large number of reliable DHT Servers benefits the network by
distributing the load of handling queries and storing records. Nodes SHOULD
operate in Server Mode if they are publicly reachable and have sufficient
resources. Conversely, nodes behind NATs or firewalls, or with intermittent
availability, low bandwidth, or limited CPU, RAM, or storage resources, SHOULD
operate in Client Mode. Operating a DHT server without the capacity to respond
quickly to queries negatively impacts network performance and SHOULD be avoided.

DHT Servers MUST advertise the libp2p Kademlia protocol identifier via the [libp2p
identify
protocol](https://github.com/libp2p/specs/blob/master/identify/README.md). In
addition DHT Servers MUST accept incoming streams using the libp2p Kademlia
protocol identifier.

DHT Clients MUST NOT advertise support for the libp2p Kademlia protocol
identifier nor offer the libp2p Kademlia protocol identifier for incoming
streams.

+DHT Clients MAY Provide [Content](#provider-record-routing) and
+[Records](#value-storage-and-retrieval) to the network; content providing is
+not exclusive to DHT Servers.
+
+### Transports
+
+All nodes MUST run the libp2p network stack.
+
+DHT Servers MUST support both
+[`QUIC`](https://github.com/libp2p/specs/blob/master/quic/README.md) and
+`TCP`+[`Yamux`](https://github.com/libp2p/specs/blob/master/yamux/README.md)+[`Noise`](https://github.com/libp2p/specs/blob/master/noise/README.md).
+It is essential that all DHT Servers are able to open a connection to each
+other.
Additionally, DHT Servers SHOULD support
+[`TLS``](https://github.com/libp2p/specs/blob/master/tls/tls.md) as an
+alternative to Noise, [`WebRTC
+direct`](https://github.com/libp2p/specs/blob/master/webrtc/webrtc-direct.md),
+[Secure
+`WebSockets`](https://github.com/libp2p/specs/blob/master/websockets/README.md)
+and
+[`WebTransport`](https://github.com/libp2p/specs/blob/master/webtransport/README.md).
+Adoption of browser-based transports by DHT Servers is encouraged to allow
+browser-based DHT Clients to interact with the DHT.
+
+DHT Clients SHOULD support
+[`QUIC`](https://github.com/libp2p/specs/blob/master/quic/README.md) and
+`TCP`+[`Yamux`](https://github.com/libp2p/specs/blob/master/yamux/README.md)+[`Noise`](https://github.com/libp2p/specs/blob/master/noise/README.md)
+whenever possible. They MAY also support additional libp2p transports. However,
+to guarantee discovery of existing records in the DHT, a client MUST implement
+at least one of these: `QUIC` or `TCP`+`Yamux`+`Noise`.
+
+Clients that cannot support either `QUIC` or `TCP`+`Yamux`+`Noise` (e.g.,
+browser-based nodes) MAY still act as DHT Clients, but their ability to find
+records in the DHT will be limited.

## Kademlia Keyspace

From cd85a7f92069fc41210e1cb2dfaabaf1d14d6507 Mon Sep 17 00:00:00 2001
From: guillaumemichel
Date: Tue, 1 Apr 2025 11:20:12 +0200
Subject: [PATCH 18/21] swarm clarifications

---
 src/routing/kad-dht.md | 76 +++++++++++++++++++++++++++++-------------
 1 file changed, 52 insertions(+), 24 deletions(-)

diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md
index 9ccdc3f0..2b48cf0d 100644
--- a/src/routing/kad-dht.md
+++ b/src/routing/kad-dht.md
@@ -63,26 +63,21 @@ protocol, collectively identified by a unique libp2p protocol identifier. IPFS
nodes MAY participate in multiple DHT swarms simultaneously. DHT swarms can be
either public or private.
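A swarm's libp2p protocol identifier, as used in the examples in this spec (`/ipfs/kad/1.0.0` for the Amino DHT, `/ipfs/lan/kad/1.0.0` for local swarms), can be assembled as a plain string. A minimal sketch; the helper name and the segment names `prefix`/`version` are ours:

```go
package main

import "fmt"

// kadProtocolID builds a DHT swarm protocol identifier of the form
// /<prefix>/kad/<version>, matching the examples in this spec.
func kadProtocolID(prefix, version string) string {
	return fmt.Sprintf("/%s/kad/%s", prefix, version)
}

func main() {
	fmt.Println(kadProtocolID("ipfs", "1.0.0"))     // /ipfs/kad/1.0.0
	fmt.Println(kadProtocolID("ipfs/lan", "1.0.0")) // /ipfs/lan/kad/1.0.0
}
```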
-### libp2p Protocol Identifier
+### Identifiers & Existing Swarms

-All nodes participating in the same DHT swarm MUST use the same libp2p protocol
-identifier. The libp2p protocol identifier uniquely identifies a DHT swarm. It
-follows the format `//kad/`, e.g `/ipfs/kad/1.0.0` for
-the Amino DHT protocol version `1.0.0`, or `/ipfs/lan/kad/1.0.0` for a local
-DHT swarm.
+Every DHT swarm is associated with a specific libp2p protocol identifier, and
+all nodes within that swarm MUST use it. Public DHT swarms MUST use a unique
+libp2p protocol identifier, whereas private swarms SHOULD use a distinct
+identifier. Although private swarms may reuse an identifier if their networks
+remain isolated, they will merge upon interaction. Therefore, unique
+identifiers are recommended.

#### Amino DHT

[_Amino DHT_](https://blog.ipfs.tech/2023-09-amino-refactoring/#why-amino) is
+a public instance of the _IPFS Kademlia DHT_ spec mounted under the
+`/ipfs/kad/1.0.0` libp2p protocol; it is also referred to as the _Public IPFS
+DHT_.

:::note
The Amino DHT is utilized by multiple IPFS implementations, including
[`kubo`](https://github.com/ipfs/kubo) and
[`helia`](https://github.com/ipfs/helia)
and can be joined by using the [public good Amino DHT Bootstrappers](https://docs.ipfs.tech/concepts/public-utilities/#amino-dht-bootstrappers).
:::

+#### IPFS LAN DHTs
+
+_IPFS LAN DHTs_ are DHT swarms operating exclusively within a local network.
+They are accessible only to nodes within the same network and are identified by
+the libp2p protocol `/ipfs/lan/kad/1.0.0`.
+
+In a LAN DHT:
+* Only hosts on the local network MAY be added to the routing table.
+* By default, all hosts operate in [server mode](#client-and-server-mode).
+
+Although many IPFS LAN DHTs use the same protocol identifier, each swarm is
+distinct because its scope is limited to its own local network.
+
+Nodes MAY participate in LAN DHTs, enabling fast peer and content discovery in
+their local network.
+
+#### Creating a Custom DHT Swarm
+
+Custom DHT swarms can be created to serve specific use cases by meeting these
+requirements:
+* **Unique libp2p Protocol Identifier**: All nodes in a DHT swarm MUST use the
+same libp2p protocol identifier. A suggested format is
+`/<prefix>/kad/<version>`. Note that if two public swarms share the same
+protocol identifier and encounter each other, they will merge.
+* **Consistent Protocol Implementation**: All nodes participating in the swarm
+MUST implement the same DHT protocol, including support for all defined RPC
+messages and behaviors.
+* **Bootstrapper Nodes**: To join a swarm, a new node MUST know the
+multiaddresses of at least one existing node participating in the swarm.
+Dedicated bootstrapper nodes MAY be used to facilitate this process. They
+SHOULD be publicly reachable, maintain high availability and possess sufficient
+resources to support the network.
+
### Client and Server Mode

A node operating in Server Mode (or DHT Server) is responsible for responding
@@ -714,15 +742,15 @@ without responding.

## Client Optimizations

-### LAN DHT Swarms
+### Dual DHTs

-Implementations MAY support private or LAN-specific DHT swarms, which operate
-within a local network and remain isolated from the public DHT.
Nodes MAY -participate in multiple DHT swarms simultaneously, provided that each swarm has -a unique protocol identifier. +Implementations MAY join multiple DHT swarms simultaneously—for example, both a +local and a public swarm. Typically, write operations are executed on both +swarms, while read operations are performed in parallel, returning the result +from whichever responds first. -Private DHT swarms MAY store and serve private multiaddresses, as they are not -exposed to the public network. +Using a local DHT alongside a global one enables faster discovery of peers and +content within the same network. ### Verifying DHT Server From 27d2a5825816422c50ca1f462ddb4355784fc982 Mon Sep 17 00:00:00 2001 From: guillaumemichel Date: Tue, 1 Apr 2025 11:38:00 +0200 Subject: [PATCH 19/21] specify relation to libp2p kad-dht --- src/routing/kad-dht.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md index 2b48cf0d..b380bee2 100644 --- a/src/routing/kad-dht.md +++ b/src/routing/kad-dht.md @@ -40,8 +40,9 @@ distance metric. Once a node has identified the closest peers, it can either: ### Relation to libp2p kad-dht -The IPFS Kademlia DHT specification is an instantiation of the [libp2p Kademlia -DHT](https://github.com/libp2p/specs/tree/master/kad-dht). +The IPFS Kademlia DHT specification extends the [libp2p Kademlia +DHT](https://github.com/libp2p/specs/tree/master/kad-dht), with practical +details related to CID, IPNS, and content providing. It is possible to use an alternative DHT specification alongside an IPFS implementation, rather than the one detailed here. 
This document specifically

From a75b7b48e132d20a67b9198c1f8747787190c8c2 Mon Sep 17 00:00:00 2001
From: guillaumemichel
Date: Wed, 9 Apr 2025 11:10:54 +0200
Subject: [PATCH 20/21] mark protobuf ping message as deprecated

---
 src/routing/kad-dht.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md
index b380bee2..bbae8659 100644
--- a/src/routing/kad-dht.md
+++ b/src/routing/kad-dht.md
@@ -656,7 +656,7 @@ message Message {
    ADD_PROVIDER = 2;
    GET_PROVIDERS = 3;
    FIND_NODE = 4;
-    PING = 5;
+    PING = 5; // DEPRECATED
  }

  enum ConnectionType {

From 48e545849a40f64e18d53e2fe22bd3d1de4a73dc Mon Sep 17 00:00:00 2001
From: guillaumemichel
Date: Thu, 10 Apr 2025 09:46:32 +0200
Subject: [PATCH 21/21] require libp2p ping

---
 src/routing/kad-dht.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/routing/kad-dht.md b/src/routing/kad-dht.md
index bbae8659..d7018b63 100644
--- a/src/routing/kad-dht.md
+++ b/src/routing/kad-dht.md
@@ -152,16 +152,20 @@ DHT Clients MAY Provide [Content](#provider-record-routing) and
[Records](#value-storage-and-retrieval) to the network; content providing is
not exclusive to DHT Servers.

-### Transports
+### Networking

All nodes MUST run the libp2p network stack.

+DHT Servers MUST support the [libp2p ping
+protocol](https://github.com/libp2p/specs/blob/master/ping/ping.md) to allow
+probing by other DHT nodes.
+
DHT Servers MUST support both
[`QUIC`](https://github.com/libp2p/specs/blob/master/quic/README.md) and
`TCP`+[`Yamux`](https://github.com/libp2p/specs/blob/master/yamux/README.md)+[`Noise`](https://github.com/libp2p/specs/blob/master/noise/README.md).
It is essential that all DHT Servers are able to open a connection to each
other.
Additionally, DHT Servers SHOULD support -[`TLS``](https://github.com/libp2p/specs/blob/master/tls/tls.md) as an +[`TLS`](https://github.com/libp2p/specs/blob/master/tls/tls.md) as an alternative to Noise, [`WebRTC direct`](https://github.com/libp2p/specs/blob/master/webrtc/webrtc-direct.md), [Secure