The transition to Hivenet v2 and why it matters

Learn how Hivenet’s v2 infrastructure upgrade improves performance, reliability, and scalability with faster updates and better service.

Written by Daniel Awbery
Updated over 2 months ago

In 2024, Hivenet began a series of foundational changes as part of adopting a new “v2” model for our underlying infrastructure. While these changes are not expected to impact the end-user experience, they are a perfect opportunity to continue being more transparent with our users about our technology. Before we start, if you’ve not already read our primer on How Hivenet Works, we suggest heading there first - we will assume you’re already familiar with the concepts and terminology covered in that article. With all that said, let’s get started and discover more about the incoming changes to Hivenet’s agents and how we keep track of data across our network.

On the rational fear of horse-sized ducks

You may be familiar with a famous internet thought experiment that asks:

“Would you rather fight 100 duck-sized horses or one horse-sized duck?”

Of course, there is no “right” answer to this question (if you ask us, 100 duck-sized horses sounds like the safest bet. Be honest: a ~1.6m tall, angry duck sounds … terrifying), but what on earth does this have to do with Hivenet v2? In our overview of Hivenet’s architecture, we compared our agents to the “worker bees” that perform a range of critical tasks. In Hivenet’s original iteration, there was a single hivenet-agent with a huge range of responsibilities - this is our “horse-sized duck”. Figure 1 illustrates the range of functions performed by hivenet-agent: managing uploads and downloads; accessing the filesystem on host resources; handling connectivity with nodes in the network; and so on.

Figure 1: the many roles of hivenet-agent

There are advantages to having a single, “monolithic” agent - in particular, they can be quicker to build and deploy in the initial stages of development. However, over time, drawbacks can emerge as new features are added and the codebase grows in size:

  • The codebase becomes more complex and harder for parallel teams to work on

  • More code means a larger compiled executable, so deployments take longer and can be more cumbersome

  • A higher potential for unforeseen consequences increases the risk of adding or changing features

You may also be asking: “I get what hivenet-agent is, but … where was it deployed?” The answer is that hivenet-agent was deployed on every desktop client and storage peer in the network. So, for example, if we wanted to change the code for handling file management on your desktop, we would have to update every peer as well… even if the change did not impact them. We won’t get into the theory of monolithic versus microservice architectures beyond this, but here is a good starter if you want more.

Now that we have a proper understanding of oversized ducks, let’s delve deeper into how Hivenet v2 addresses these challenges.

Seeing the wood for the duck-sized horses

We’ve picked our battle - we are no longer faced with a single, massive duck ✺◟(^∇^)◞✺. But we still have a problem: turning an enormous mallard into a herd of tiny ponies is more complicated than waving a magic wand. Breaking a monolithic service into pieces requires detailed analysis to identify an optimal approach, a lot of development, and a careful rollout to ensure everything works as expected.

Figure 2 summarises how hivenet-agent has evolved into multiple, smaller services - we will share more detail on each agent and how they cooperate to drive our service throughout the rest of this article.

Figure 2: the many agents of Hivenet v2

A note on connectivity methods: networking is a deep topic (so we will keep this brief), but for now, just understand that sometimes direct connectivity to a node is not possible, so we must leverage a relay service. If you want to learn more, please see the LibP2P connectivity page.
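To make the fallback idea concrete, here is a minimal sketch in Python (not Hivenet’s actual code - the relay address and handshake message are hypothetical): try to reach a peer directly, and only route through a relay if the direct connection fails.

```python
# Minimal sketch of "direct first, relay as fallback" connectivity.
# The relay endpoint and the RELAY handshake below are hypothetical.
import socket

RELAY_ADDRESS = ("relay.example.invalid", 4001)  # hypothetical relay endpoint


def connect_to_peer(host: str, port: int, timeout: float = 3.0) -> socket.socket:
    """Attempt a direct TCP connection; fall back to a relayed one."""
    try:
        return socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Direct dial failed (NAT, firewall, peer offline), so ask the relay
        # to forward our traffic to the target peer instead.
        relay = socket.create_connection(RELAY_ADDRESS, timeout=timeout)
        relay.sendall(f"RELAY {host}:{port}\n".encode())  # hypothetical handshake
        return relay
```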

As Figure 2 shows, Hivenet v2 includes specialized services tailored to fundamentals like networking, filesystem management, tracking down required data blocks, etc. There are many advantages to these changes behind the scenes, but from a user perspective the main benefits are: increased reliability; improved performance; and faster development that enables us to deliver improvements more quickly.

See Table 1 below for an overview of each agent and its role; a short sketch of how a client might talk to one of these agents follows the table.

| Agent name | Where present | Purpose |
| --- | --- | --- |
| hivedisk-agent | Desktop application | Interacts with the desktop filesystem. Delegates management of local contributed resources to a local instance of hivenet-resource-agent. |
| hivenet-resource-agent | Contributed nodes | Exposes upload and download APIs for storing and retrieving immutable blocks to/from local storage. |
| hivenet-http-agent | Non-user nodes | Exposes an HTTP API to support uploading and downloading of immutable blocks on the network. Interacts with hivenet-cids to identify stored content and with hivenet-peers for peer selection. |
| hivenet-relay-agent | Subset of nodes selected by the swarm operator | Provides relay and connectivity services for inter-agent communication when direct connectivity is not possible. (We won’t cover more detail in this article - see here for more on relays in this context.) |
| hivenet-peers | Peer lookup service | Exposes an HTTP API for node status reporting, node metadata, and peer selection. Nodes are expected to report their status to this service periodically. |
| hivenet-cids | cids lookup service | Exposes an HTTP API to map cids to the nodes that store them. |

Table 1: core Hivenet v2 agents and their purpose
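To make the table a little more concrete, here is a minimal sketch of how a client might talk to an agent such as hivenet-http-agent. The endpoint paths and response fields are hypothetical - this illustrates the upload/download pattern described above, not the real API.

```python
# Minimal sketch of a hypothetical block upload/download client.
# AGENT_URL, the /blocks paths, and the "cid" field are assumptions for illustration.
import requests

AGENT_URL = "http://node.example.invalid:8080"  # hypothetical agent address


def upload_block(data: bytes) -> str:
    """Store an immutable block and return its content identifier (cid)."""
    resp = requests.post(f"{AGENT_URL}/blocks", data=data)
    resp.raise_for_status()
    return resp.json()["cid"]


def download_block(cid: str) -> bytes:
    """Retrieve a previously stored block by its cid."""
    resp = requests.get(f"{AGENT_URL}/blocks/{cid}")
    resp.raise_for_status()
    return resp.content
```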

A note on “peers”: in a network the size of Hivenet, it’s not practical for each node to associate with every other node. So, in this context, a peer is simply the subset of all nodes that a specific node is aware of/connected to.

If you think splitting hivenet-agent into so many pieces seems like a big change - it is! However, in practice, the only impact you should notice is an improved quality of service. That’s it for Hivenet’s growing family of agents, so let’s move on to our next significant change: it’s a conceptual shift, but it is expected to deliver non-trivial performance and latency improvements for our service.

A little of column A, a little of column B: a hybrid architecture

First, let’s cover some theory to help lay the foundations for understanding what’s changing in Hivenet v2. In our introduction to How Hivenet Works we explain that, at its simplest, Hivenet provides a service that maps a stable reference to an object (e.g., a user’s file) to its associated “blocks” of data distributed across the network. Hivenet v1 achieved this using a “Distributed Hash Table” (DHT). We don’t have space to cover all the details here (see this - or this for an academic perspective), but here is a very simple summary that provides enough context for our needs:

  1. A DHT is a large table that contains a mapping of who stores what data

  2. Both the table and the data it references are distributed across nodes in the network

  3. If a node needs to find a block of data - and holds neither the mapping nor the data itself - it uses a routing algorithm to iterate through its peers (and their peers) until it finds what it needs (see the sketch below)
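To make that routing idea concrete, here is a toy model of a DHT lookup (far simpler than a real implementation such as Kademlia, and not Hivenet’s code): each node holds a slice of the key-to-block mapping plus a list of peers, and forwards queries it cannot answer to the peer that looks “closest” to the key.

```python
# Toy model of an iterative DHT lookup, for illustration only.
class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.store = {}   # the slice of the key -> block mapping this node holds
        self.peers = []   # the subset of nodes this node knows about

    def lookup(self, key, visited=None):
        """Return the block for `key`, routing through peers if needed."""
        visited = visited if visited is not None else set()
        visited.add(self.node_id)
        if key in self.store:        # this node holds the block itself
            return self.store[key]
        # Otherwise route to the unvisited peer whose id is "closest" to the key,
        # a stand-in for a real routing metric such as Kademlia's XOR distance.
        candidates = [p for p in self.peers if p.node_id not in visited]
        if not candidates:
            return None              # nowhere left to look
        closest = min(candidates, key=lambda p: abs(p.node_id - key))
        return closest.lookup(key, visited)
```

In a real DHT the routing metric, replication, and table maintenance are considerably more involved, but the “ask the closest peers you know, and let them ask theirs” loop above is the core idea.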

When exploring a new technology, it’s natural to ask: “What problem does this exist to solve?” In the case of DHTs, they were a response to the problems faced by early peer-to-peer file-sharing networks (Napster, BitTorrent, etc.) as they tried to improve the efficiency of their services. DHTs offered a few key benefits over other approaches, as shown in Table 2:

| Benefit | Why it matters |
| --- | --- |
| Scalability | Supports large networks while handling dynamic (“ephemeral”) nodes |
| Efficient routing | Locates content without querying every single node in the network |
| Fault tolerance | A decentralized network removes single points of failure |

Table 2: Benefits of Distributed Hash Tables for a distributed file system

All of that sounds amazing, right? But there must be some downsides… Well, yes - and in practice, at Hivenet, we’ve come to the conclusion that, for our requirements, the constraints introduced by DHTs outweigh their benefits. Table 3 summarises the key areas of concern.

| Limitation | Why it matters for Hivenet |
| --- | --- |
| Latency and inefficiency | Our distributed service includes nodes that can “churn,” connecting and disconnecting at random. When this happens, the network must dynamically adjust to use other nodes - this takes time and can slow connections or delay lookup queries. User feedback tells us that this impacts the overall experience of our service. |
| Constraints on scalability | Hivenet’s user base is scaling, which means the DHT has also grown to accommodate additional nodes. With a larger DHT, nodes must allocate more memory to store routing tables and more CPU to query them, and more network traffic is needed to keep the table current and avoid “stale” or invalid entries. Our users have told us they are experiencing increased network and resource utilization. |
| Security concerns | The decentralized nature of the DHT offers security benefits because there is no single, vulnerable “authority.” However, the lack of a central authority creates other issues - such as the “Sybil” attack, which uses many fake “identities” to exert influence on the network. Mitigating these attacks is complex and resource-intensive, and the benefits need to be weighed against these challenges in practice. |

Table 3: limitations of Distributed Hash Tables for Hivenet

Ultimately, as Hivenet is a user-facing company, we have to take into account our responsibility for 1) our users’ private data and 2) service availability expectations that may not apply in other scenarios. Taking all of this into account, in Hivenet v2 we will be deprecating the use of the DHT and migrating two capabilities to our core platform (see Figure 3 for an illustration, and the short sketch after it for how a lookup might flow):

  1. The cids lookup service that tells us which nodes host which blocks of data.

  2. The peer lookup service that tells us the status of every node in the network and allows for optimal selection of peers for storing and retrieving data.

Figure 3: separation of responsibilities between Hivenet services and node-hosted agents
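For a sense of how this changes a lookup in practice, here is a minimal sketch of the v2 flow. The service endpoints, paths, and response fields below are hypothetical and purely illustrative - the point is that a client asks two central services instead of walking a DHT.

```python
# Minimal sketch of the v2 lookup flow (hypothetical endpoints and fields):
# 1) ask the cids lookup service which nodes hold a block,
# 2) ask the peer lookup service for each node's current status,
# 3) pick the best candidate directly, with no DHT iteration.
import requests

CIDS_URL = "https://cids.example.invalid"    # hypothetical cids lookup service
PEERS_URL = "https://peers.example.invalid"  # hypothetical peer lookup service


def pick_node_for(cid: str) -> str:
    node_ids = requests.get(f"{CIDS_URL}/cids/{cid}").json()["nodes"]
    statuses = [requests.get(f"{PEERS_URL}/nodes/{n}").json() for n in node_ids]
    best = min(statuses, key=lambda s: s["latency_ms"])  # e.g. lowest latency wins
    return best["node_id"]
```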

Straight from the (duck-sized) horse's mouth

This new approach represents a conceptual shift in our architecture. However, these changes will offer some key benefits for our users and contributors:

  1. Hivenet will be able to adjust peer selection criteria without the need to deploy new agents. This will make it faster to optimize and improve query speeds.

  2. Less bandwidth and CPU will be required to maintain connectivity, placing less burden on your resources and improving your experience of the service.

  3. Decomposing the monolithic v1 agent into dedicated services will streamline our development, meaning you get new features and capabilities faster.

There are other benefits, but overall, our goal is always to deliver an excellent user experience. We’re confident we’ve identified the best path to achieve that - and we’re investing now to maintain our quality of service. Making these changes will take time: the foundational updates will happen over the next few months, while the longer-term functionality will roll out after that. These changes should be invisible to you, but we will let you know if anything will impact you. As always, if you have any concerns, you can reach out via our Support page and let us know.
