Distributed Storage System Project Wiki
Advertisement

x.0  Distributed Hash Table[]

Distributed hash tables (DHTs) are a class of decentralized distributed systems that provide a lookup service similar to a hash table: (key, value) pairs are stored in the DHT, and any participating node can efficiently retrieve the value associated with a given key. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows DHTs to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

DHTs form an infrastructure that can be used to build more complex services, such as distributed file systems, peer-to-peer file sharing and content distribution systems, application layer multicast, instant messaging. Notable distributed networks that use DHTs include BitTorrent's distributed tracker, the Kad network, Limewire, Freenet, etc.


x.1    Structure[]

The structure of a DHT can be decomposed into several main components.

The foundation is an abstract keyspace, such as the set of 160-bit strings. A keyspace partitioning scheme splits ownership of this keyspace among the participating nodes.

It helps bring dowm  the node identification(ip address) and file identification to the same name-space. This means that a node and a file has to be represented as a number such that one directly cannot any one of them.

For example : if n0's id is 12.133.64.41 and f0 's id is 0101010100 then hash of them will bring it down to the same namespace,
i.e.  hash(n0) -> 23434242   and  hash(f0) -> 5435333.

As a standard, 160-bit SHA-1 Hashing technique is used. Thus the total number of files and total nodes cannot cross 2^160. An overlay network then connects the nodes, allowing them to find the owner of any given key in the keyspace.

Once these components are in place, a typical use of the DHT for storage and retrieval might proceed as follows. To store a file with given filename and data in the DHT, the hash of filename is found (hash=k), and a message put(k,data) is sent to any node participating in the DHT. The message is forwarded from node to node through the overlay network until it reaches the single node responsible for key k as specified by the keyspace partitioning, where the pair (k,data) is stored. Any other client can then retrieve the contents of the file by again hashing filename to produce k and asking any DHT node to find the data associated with k with a message get(k). The message will again be routed through the overlay to the node responsible for k, which will reply with the stored data.

The keyspace partitioning and overlay network components are described below with the goal of capturing the principal ideas common to most DHTs; many designs differ in the details.

 

x.1.1    Keyspace partitioning

Most DHTs use some variant of consistent hashing to map keys to nodes. This technique employs a function δ(k1,k2) which defines an abstract notion of the distance from key k1 to key k2, which is unrelated to geographical distance or network latency. Each node is assigned a single key called its identifier (ID). A node with ID i owns all the keys for which i is the closest ID, measured according to δ.

Example. The Chord DHT treats keys as points on a circle, and δ(k1,k2) is the distance traveling clockwise around the circle from k1 to k2. Thus, the circular keyspace is split into contiguous segments whose endpoints are the node identifiers. If i1 and i2 are two adjacent IDs, then the node with ID i2 owns all the keys that fall between i1 and i2.

Consistent hashing has the essential property that removal or addition of one node changes only the set of keys owned by the nodes with adjacent IDs, and leaves all other nodes unaffected.


x.1.2    Overlay network
Each node maintains a set of links to other nodes (its neighbors or routing table). Together these links form the overlay network. A node picks its neighbors according to a certain structure, called the network's topology. This basically decouples the structure from physical network topology.

All DHT topologies share some variant of the most essential property: for any key k, the node either has a node ID which owns k or has a link to a node whose node ID is closer to k, in terms of the keyspace distance defined above. It is then easy to route a message to the owner of any key k using the following greedy algorithm (that is, not necessarily globally optimal): at each step, forward the message to the neighbor whose ID is closest to k. When there is no such neighbor, then we must have arrived at the closest node, which is the owner of k as defined above. This style of routing is sometimes called key based routing.

Beyond basic routing correctness, two important constraints on the topology are to guarantee that the maximum number of hops in any route (route length) is low, so that requests complete quickly; and that the maximum number of neighbors of any node (maximum node degree) is low, so that maintenance overhead is not excessive. Of course, having shorter routes requires higher maximum degree. Some common choices for maximum degree and route length are as follows, where n is the number of nodes in the DHT, using Big O notation:

    * Degree 0(1), route length 0(n)

    * Degree 0(logn), route length 0(logn / loglogn)

    * Degree 0(logn), route length 0(logn)

    * Degree 0(sqrt{n}), route length 0(1)

The third choice is the most common, even though it is not quite optimal in terms of degree/route length tradeoff, because such topologies typically allow more flexibility in choice of neighbors. Many DHTs use that flexibility to pick neighbors which are close in terms of latency in the physical underlying network.

Maximum route length is closely related to diameter: the maximum number of hops in any shortest path between nodes. Clearly the network's route length is at least as large as its diameter, so DHTs are limited by the degree/diameter tradeoff[5] which is fundamental in graph theory. Route length can be greater than diameter since the greedy routing algorithm may not find shortest paths. 

Advertisement