A Brief Introduction to Ethereum Swarm

With the Ethereum blockchain as the CPU of the world computer, Swarm is best thought of as its “hard disk”. — Viktor Trón, Team Lead at SWARM

The idea for Swarm came from Gavin Wood, one of the founders of Ethereum. When Viktor Trón and Daniel Nagy joined the Ethereum Foundation in 2015, they took over the project within the Foundation’s Geth team. That was more than five years ago. Now, in 2022, Swarm is more or less feature complete and has started its own long journey as an autonomous project. Swarm has its own foundation and has finished its initial token offering, raising more than $15 million.

In this article, I will show you why Swarm is an exciting project, and what unique features (like feeds and direct messaging) make it different from other storage solutions.

Ethereum Swarm is basically a giant distributed hash table (DHT). DHTs are used in many storage solutions (like BitTorrent or IPFS) as a database where peers can look up where the data is stored. Swarm uses the DHT differently: it stores the data itself in the DHT. Swarm calls this solution DISC, which stands for Distributed Immutable Store for Chunks. When you store anything in Swarm, it is split into 4 KB chunks. Each chunk is addressed by its hash and stored on the nodes whose addresses are nearest to the chunk address. Every chunk is stored redundantly on the 4 nearest nodes.
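To make the chunk placement concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Swarm's implementation: SHA-256 stands in for Swarm's own chunk hash, "nearest" is modeled as the smallest XOR distance (the usual Kademlia metric), and the function names are made up for this example.

```python
import hashlib

CHUNK_SIZE = 4096   # 4 KB chunks, as described in the text
REDUNDANCY = 4      # each chunk is kept by the 4 nearest nodes


def chunk_address(payload: bytes) -> bytes:
    # Assumption: SHA-256 is only a stand-in for Swarm's real chunk hash.
    return hashlib.sha256(payload).digest()


def split_into_chunks(data: bytes) -> list[bytes]:
    # Cut the data into fixed-size pieces; the last one may be shorter.
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def xor_distance(a: bytes, b: bytes) -> int:
    # Kademlia-style distance: XOR of the two addresses read as integers.
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def nearest_nodes(address: bytes, nodes: list[bytes], k: int = REDUNDANCY) -> list[bytes]:
    # The k node addresses closest to the given chunk address.
    return sorted(nodes, key=lambda n: xor_distance(n, address))[:k]


def place(data: bytes, nodes: list[bytes]) -> dict[bytes, list[bytes]]:
    # Map every chunk address to the nodes that should store that chunk.
    return {
        chunk_address(chunk): nearest_nodes(chunk_address(chunk), nodes)
        for chunk in split_into_chunks(data)
    }
```

Because the address depends only on the chunk content, two identical chunks end up at the same address and are stored only once.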

The network structure of Swarm is stable and highly optimized for the DHT; Swarm calls this Kademlia connectivity. Because of this optimized structure, content retrieval in Swarm is very efficient. If you need a more detailed explanation of the network structure, you can read about it in the Book of Swarm.

Swarm is based on libp2p (the underlay network), but each node is addressed by an address derived from its Ethereum address (the overlay network). The signature scheme is also the same as the one used by Ethereum, so signatures can be easily validated by smart contracts. This makes Swarm an ideal storage solution for Ethereum.

Anonymity is one of the key points of Swarm. Because of Kademlia connectivity, every node knows only a limited number of neighbors. If a node does not have a requested chunk, it asks a neighbor that is closer to the chunk address, which asks its own neighbors in turn, and so on; the content is then forwarded back along the same path. Since every node sees only the previous hop, it is not possible to find out who the original requestor is or who the source of a given chunk is. Swarm calls this method forwarding Kademlia.
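The forwarding idea can be sketched in a few lines of Python. This is a toy model, not the real protocol: addresses are small integers, each node forwards to the known peer with the smallest XOR distance to the chunk, and the `Node` class with its `seen_from` log is hypothetical, added only to show what each hop can observe.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    address: int
    peers: list[Node] = field(default_factory=list)
    store: dict[int, bytes] = field(default_factory=dict)
    seen_from: list[int | None] = field(default_factory=list)  # previous hops only

    def retrieve(self, chunk_addr: int, sender: int | None = None) -> bytes | None:
        # A node learns only who handed it the request, not who started it.
        self.seen_from.append(sender)
        if chunk_addr in self.store:
            return self.store[chunk_addr]
        # Forward only to peers strictly closer to the chunk (XOR distance),
        # so every hop makes progress and the recursion terminates.
        closer = [p for p in self.peers
                  if (p.address ^ chunk_addr) < (self.address ^ chunk_addr)]
        if not closer:
            return None
        next_hop = min(closer, key=lambda p: p.address ^ chunk_addr)
        # The answer travels back along the same path, hop by hop.
        return next_hop.retrieve(chunk_addr, sender=self.address)
```

In a chain of three nodes, the node holding the chunk sees only the middle node as the sender, and the first node cannot tell whether the middle node had the chunk itself or fetched it from further away.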

Bandwidth is a scarce resource in Swarm. Because of this, when you retrieve a chunk, you have to pay a small amount of BZZ (the native token of Swarm) for the retrieval. It is something like the gas cost on the Ethereum network. If your neighbor has the chunk, it can keep the BZZ. If not, it has to pay one of its neighbors for it. This incentivizes nodes to cache frequently requested content and keep it near the consumers, which makes Swarm a really efficient CDN. (Swarm would be an ideal solution for decentralized multimedia streaming.)

Swarm has a very interesting solution for accounting. It is something like the Lightning Network, where microtransactions happen off-chain. In the case of Swarm, each node keeps track of a bandwidth balance with each of its peers. If a node serves about as many chunks to a peer as it retrieves from it, the balance stays near 0. If the balance goes over a limit, the node in debt has to pay with a cheque.
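A rough sketch of this bookkeeping, from the point of view of one node looking at one peer, could look like the following. The class name, the unit of account, and the threshold value are assumptions made for illustration; the real accounting rules and cheque format are more involved.

```python
from dataclasses import dataclass, field


@dataclass
class PeerAccount:
    payment_threshold: int      # hypothetical debt limit, in accounting units
    balance: int = 0            # > 0: the peer owes us, < 0: we owe the peer
    cheques_sent: list[int] = field(default_factory=list)

    def served_chunk(self, price: int) -> None:
        # We delivered a chunk to the peer, so the peer's debt to us grows.
        self.balance += price

    def received_chunk(self, price: int) -> None:
        # The peer delivered a chunk to us, so our debt to the peer grows.
        self.balance -= price
        if -self.balance >= self.payment_threshold:
            self.settle()

    def settle(self) -> None:
        # Pay the whole debt with one cheque and reset the balance.
        self.cheques_sent.append(-self.balance)
        self.balance = 0
```

As long as traffic in both directions is roughly equal, the balance hovers around zero and no cheque is ever written, so nothing has to touch the blockchain.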

Cheques are digitally signed data packets that nodes can collect and later use to withdraw BZZ tokens from the chequebook contract once the amount is worth cashing out. This is Swarm’s own gasless micropayment solution, but theoretically it could be used in any case where micropayments are needed.

There are two ways of storing data on Swarm. The first is global pinning. In this case, you store your data on your own drive, and the DHT contains only a reference to it. This is very similar to the method IPFS uses to store data. It is free of charge because you store your own data, but it is not anonymous, and you have to deal with redundancy, etc. yourself. The other way is to use a postage stamp. A postage stamp is something like a cheque that storage nodes can use to withdraw BZZ if they can prove that they are storing your content. In this case, the content is stored in the correct place in the DHT, the source of the data is untraceable, and redundant storage is provided by the Swarm network.

Content-addressed chunks are only one type of Swarm chunk. Another type is the Single Owner Chunk (SOC). In this case, the chunk key is a hash of the owner’s Ethereum address and a chunk identifier. A SOC is valid only if it is signed by the owner. This type of chunk makes it possible to build a clever data structure on Swarm called a feed.
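The difference from a content-addressed chunk is that the address no longer depends on the payload. A minimal sketch of the key derivation, assuming SHA-256 as a stand-in for the hash Swarm actually uses and a simple concatenation order chosen for this example (signature checking is left out because it needs an ECDSA library):

```python
import hashlib


def soc_address(identifier: bytes, owner: bytes) -> bytes:
    # Assumption: SHA-256 over identifier + owner stands in for Swarm's real hash.
    # `owner` is the 20-byte Ethereum address, `identifier` is chosen by the owner.
    return hashlib.sha256(identifier + owner).digest()


# Hypothetical example values, not real accounts.
alice = bytes.fromhex("11" * 20)
bob = bytes.fromhex("22" * 20)
identifier = hashlib.sha256(b"my-profile").digest()

# Anyone who knows the owner and the identifier can compute where to look,
# before the content even exists.
address_for_alice = soc_address(identifier, alice)
address_for_bob = soc_address(identifier, bob)
```

The same identifier under two different owners yields two unrelated addresses, so nobody else can occupy "your" slot in the address space.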

A feed is represented by a series of SOCs whose identifiers are hashed from a topic and a sequence number. A feed is a “virtually mutable” data structure on immutable storage. When the owner wants to change the content of her feed, she increases the sequence number and publishes a new SOC. The Swarm client can heuristically find the highest sequence number and pull the latest content. This mutable content can be used in many ways. For example, you can assign a feed to your ENS name and then update the content without any blockchain interaction.
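Here is a small sketch of how a feed could be written and read on top of an immutable key-value store. Several things are assumptions for the sake of the example: a plain `dict` plays the role of the network, SHA-256 replaces Swarm's hash, signatures are omitted, and the reader finds the latest update by simply probing sequence numbers one by one, which is the most naive form of the lookup.

```python
import hashlib


def feed_address(owner: bytes, topic: bytes, index: int) -> bytes:
    # Identifier = hash(topic, sequence number); address = hash(identifier, owner).
    identifier = hashlib.sha256(topic + index.to_bytes(8, "big")).digest()
    return hashlib.sha256(identifier + owner).digest()


def publish(store: dict, owner: bytes, topic: bytes, index: int, payload: bytes) -> None:
    address = feed_address(owner, topic, index)
    if address in store:
        # Storage is immutable: an update needs a new sequence number.
        raise ValueError("sequence number already used")
    store[address] = payload


def latest(store: dict, owner: bytes, topic: bytes):
    # Probe 0, 1, 2, ... until an update is missing; return the last one found.
    index, payload = 0, None
    while (address := feed_address(owner, topic, index)) in store:
        payload = store[address]
        index += 1
    return None if payload is None else (index - 1, payload)
```

Nothing is ever overwritten here. The "mutation" is only the reader's convention of treating the highest existing sequence number as the current value.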

Another type of chunk is the trojan chunk. As I wrote at the beginning, chunks are stored in the DHT on the nodes whose addresses (derived from their Ethereum addresses) are nearest to the chunk key. This property of Swarm can be used to send direct messages to nodes. In this case, the message is packed into a chunk (this is the trojan chunk) and sent to the network. The network tries to store the chunk in the correct place, so nodes will forward it to the target node. The message is encrypted, so only the target node is able to read it. If the target node is offline, its neighbors will store the chunk like a normal chunk, provided it has a postage stamp. Swarm has a high-level API for managing trojan chunks, called PSS. PSS is something like an anonymous mail service, and it is a full-featured replacement for Whisper, Ethereum’s dead messaging service.
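One way to get a content-addressed chunk delivered to a chosen node is to keep changing a nonce inside the chunk until its hash lands near the target's address. The sketch below illustrates that idea only; the function name, the payload layout, the prefix length, and the use of SHA-256 are assumptions for this example, and the encryption step is left out entirely.

```python
import hashlib
from itertools import count


def mine_trojan_chunk(message: bytes, target: bytes, prefix_bits: int = 12):
    # Search for a nonce such that the chunk address shares its first
    # `prefix_bits` bits with the target node address (32 bytes here).
    shift = len(target) * 8 - prefix_bits
    wanted_prefix = int.from_bytes(target, "big") >> shift
    for nonce in count():
        payload = nonce.to_bytes(8, "big") + message
        address = hashlib.sha256(payload).digest()  # stand-in for Swarm's hash
        if int.from_bytes(address, "big") >> shift == wanted_prefix:
            return address, payload


# Hypothetical 32-byte overlay address of the recipient.
target_node = hashlib.sha256(b"recipient").digest()
address, payload = mine_trojan_chunk(b"hello", target_node, prefix_bits=8)
```

To everyone else the result looks like an ordinary chunk that happens to belong in the recipient's neighborhood. A longer prefix narrows the delivery area but takes exponentially more attempts to mine.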

As you can see, Swarm is a really exciting storage solution, and much more. The unique features like Feeds, PSS messaging, and the micropayment system make it a complex ecosystem that can be used in many ways.

To close the article, I would like to say a big “Thank You!” to Viktor (Trón), who patiently answered my questions and helped me understand the basics of Swarm.

This article was originally posted on HackerNoon.