In this post, we compare Bitcoin and Ethereum in terms of blockchain size and data storage requirements. Perhaps surprisingly, Bitcoin’s blockchain is still larger than Ethereum’s, but Ethereum’s is growing faster and will soon surpass it. That said, blockchain size is not a particularly useful comparison metric, because far more computation and data are needed to extract useful information about the Ethereum network.

Bitcoin vs Ethereum – An Overview

Bitcoin journalist and podcaster Peter McCormack recently tweeted a comparison of the storage space required for a full Bitcoin node with that required for a full Ethereum node.

The first question this raises is how the Ethereum chain could be so small, at only 175 GB. After all, Ethereum has higher transaction throughput than Bitcoin, and it has grown rapidly of late thanks to huge demand for DeFi and NFT-related activity. It seems odd that all of this data amounts to only a few hundred GB, especially since almost every action a user takes on Ethereum requires a digital signature, and signatures take up space.

Ethereum does currently produce far more data than Bitcoin, but Bitcoin still has more accumulated blockchain data, as shown in the graph below. From 2015 to 2018, Bitcoin’s blockchain grew faster than Ethereum’s, and from 2018 to 2020 the two grew roughly in parallel. Since the end of 2020, the growth of the Ethereum blockchain has accelerated and is now much faster than Bitcoin’s. Ethereum’s cumulative blockchain size looks set to surpass Bitcoin’s soon and then pull further ahead. While it may be surprising to learn that the Ethereum blockchain is smaller than the Bitcoin blockchain, in our view the surprise stems mainly from a bias toward the current state of affairs: it is easy to forget how small Ethereum was just a few years ago.

Invest in Cryptocurrencies with BitMEX

[Figure: Blockchain size (GB) of Bitcoin and Ethereum]

For both Bitcoin and Ethereum, the total blockchain size in the figure above covers all transaction data, which is everything a person needs to download from peers in order to fully synchronize and validate the chain. This includes the digital signatures authorizing every transaction. In the case of Ethereum and Geth, BitMEX tested that the signatures are included by disconnecting Geth nodes from the internet and successfully retrieving digital signatures from various sample transactions, including some from 2016 and 2017. The Ethereum blockchain data also includes the code used to deploy each smart contract, which BitMEX likewise confirmed was present on the local machine, with its few hundred GB of stored data.

What is the 9 TB Ethereum blockchain?

This large dataset is used by so-called “archive nodes”. As far as we know, the dataset is so large because the node stores and indexes the results of the network’s historical states. All of these results can be recalculated from the smaller blockchain dataset. We can think of the 9 TB dataset as the amount of data needed to track and audit the flow of funds at every point since the birth of Ethereum. In this sense, it is a noteworthy indicator.

We got a successful result when looking up a recent transaction hash on our non-archive Geth node with the following command:

eth.getTransaction("TXHash")

However, if we try the same command on an older transaction, the result is “null”, probably because the transaction was not indexed. We can still retrieve these older transactions by specifying the block and the transaction’s position within it:

eth.getTransactionFromBlock("Block number", "Transaction Index")

Running the above command returns results even for very old Ethereum transactions on a non-archive node with only a few hundred GB of stored data, and the transaction signatures are displayed as well. For reference, our Geth nodes had 528 GB of data in the chaindata directory as of November 21, 2021, of which 267 GB was in the “ancient” folder, which holds data for older blocks.
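The two lookups described above can be combined into a single fallback. The following is a minimal sketch, not something BitMEX ran: `lookupTransaction` is a hypothetical helper, and `fakeEth` is a made-up stand-in for a node so that the example runs outside the Geth console. Inside the console, the built-in `eth` object could be passed instead, assuming both calls behave as described in the text.

```javascript
// Hypothetical helper: `eth` is any object exposing the two calls used above.
function lookupTransaction(eth, txHash, blockNumber, txIndex) {
  // First try the hash index, which a non-archive node may only hold
  // for recent transactions.
  const byHash = eth.getTransaction(txHash);
  if (byHash !== null && byHash !== undefined) {
    return { source: "hash index", tx: byHash };
  }
  // Fall back to the positional lookup, which needs the block number and
  // the transaction's index within that block.
  const byPosition = eth.getTransactionFromBlock(blockNumber, txIndex);
  if (byPosition !== null && byPosition !== undefined) {
    return { source: "block position", tx: byPosition };
  }
  return { source: "not found", tx: null };
}

// Stand-in for a node with made-up data (an assumption for illustration).
const fakeEth = {
  getTransaction: (hash) =>
    hash === "0xrecent" ? { hash, blockNumber: 100 } : null,
  getTransactionFromBlock: (block, index) =>
    block === 5 && index === 0 ? { hash: "0xold", blockNumber: 5 } : null,
};

console.log(lookupTransaction(fakeEth, "0xrecent", 100, 0).source); // hash index
console.log(lookupTransaction(fakeEth, "0xold", 5, 0).source); // block position
```

The catch is that the fallback only works if the block number and transaction index are already known from some other source, such as a block explorer.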

Bitcoin UTXO set and Ethereum head state

The next question arose after Peter Szilagyi commented that Ethereum’s head state requires 130 GB of data. We were asked why this number is so large compared with the roughly equivalent metric in Bitcoin, the size of the UTXO set (the set of unspent Bitcoin transaction outputs). For both networks, the latest block plus the head state (Ethereum) or the UTXO set (Bitcoin) is all the data a node needs to evaluate the validity of incoming blocks.

At the time of writing, Bitcoin’s UTXO set contains about 76 million outputs and takes up 4.6 GB of disk space. Bitcoin Core supports pruning, where a node discards older blockchain data and keeps only the most recent blocks and the UTXO set. This means one can fully verify the entire Bitcoin blockchain and check the validity of new blocks while retaining far less than 10 GB on disk. This is a useful and highly efficient feature: 4.6 GB is only about 1.2% of the size of the entire Bitcoin blockchain.

This efficiency does not seem to apply to Ethereum. According to the figure cited by Peter Szilagyi, Ethereum’s head state takes up 130 GB, which is about 43% of the size of the blockchain and far higher than Bitcoin’s 1.2%. Ethereum also has old transactions and accounts, so why can’t these be pruned, at least in theory, to achieve similar savings? As far as we know, Ethereum developers are not currently trying to make this more efficient because they have other priorities, but even if they did, they would be unlikely to achieve the savings seen in Bitcoin.
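As a quick sanity check on these percentages, the chain sizes they imply can be backed out from the quoted state sizes. This is only a rough sketch: the inputs are the rounded figures from the text, the function name is ours, and the resulting chain sizes are implied approximations rather than measured values.

```javascript
// Figures quoted in the text (rounded, so the results are approximate).
const bitcoin = { stateGB: 4.6, shareOfChain: 0.012 }; // UTXO set
const ethereum = { stateGB: 130, shareOfChain: 0.43 }; // head state

// If state = share * chain, then chain = state / share.
function impliedChainGB({ stateGB, shareOfChain }) {
  return stateGB / shareOfChain;
}

console.log(Math.round(impliedChainGB(bitcoin))); // 383
console.log(Math.round(impliedChainGB(ethereum))); // 302

// How many times larger Ethereum's state share is than Bitcoin's.
const shareRatio = ethereum.shareOfChain / bitcoin.shareOfChain;
console.log(Math.round(shareRatio)); // 36
```

In other words, relative to the size of its chain, the data Ethereum needs to validate new blocks is roughly 36 times heavier than Bitcoin's.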

Ethereum’s state

In Ethereum, nodes store two main types of data: the blockchain (all transactions plus block headers) and the state. The state is calculated from the transaction history and essentially contains all Ethereum account balances, the account nonces, and the storage associated with each deployed smart contract. After each block, the state is recomputed from the previous state and the new transactions in the block. The Merkle root hash of the state is included in each block header, ensuring consensus on the state of the network. As Ethereum grows, the state continues to grow, and as mentioned above, the latest state is now comparable in size to the blockchain itself. If a node were to store the full state as of every block, this would be a huge amount of data, possibly significantly larger even than the 9 TB archive node.
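The relationship between blocks, state, and the state root can be illustrated with a toy model. The sketch below is a deliberate simplification and not how Geth implements it: accounts hold only a balance and a nonce, contract storage is omitted, and a SHA-256 hash over the sorted accounts stands in for the Merkle root. All function and account names are hypothetical.

```javascript
const { createHash } = require("crypto");

// Toy state: address -> { balance, nonce }. Contract storage is omitted.
function applyTransfer(state, tx) {
  const next = { ...state };
  const sender = next[tx.from] || { balance: 0, nonce: 0 };
  // Invalid transfers leave the state unchanged in this sketch.
  if (sender.balance < tx.value || sender.nonce !== tx.nonce) return state;
  next[tx.from] = { balance: sender.balance - tx.value, nonce: sender.nonce + 1 };
  const recipient = next[tx.to] || { balance: 0, nonce: 0 };
  next[tx.to] = { balance: recipient.balance + tx.value, nonce: recipient.nonce };
  return next;
}

// New state = previous state + the block's transactions, applied in order.
function applyBlock(prevState, transactions) {
  return transactions.reduce(applyTransfer, prevState);
}

// Stand-in for the state root: a hash over the sorted account entries.
// (An assumption for illustration; the real structure is a Merkle tree.)
function stateRoot(state) {
  const entries = Object.keys(state)
    .sort()
    .map((addr) => `${addr}:${state[addr].balance}:${state[addr].nonce}`);
  return createHash("sha256").update(entries.join("|")).digest("hex");
}

const genesis = { alice: { balance: 10, nonce: 0 } };
const afterBlock1 = applyBlock(genesis, [
  { from: "alice", to: "bob", value: 4, nonce: 0 },
]);

console.log(afterBlock1.alice.balance, afterBlock1.bob.balance); // 6 4
console.log(stateRoot(genesis) !== stateRoot(afterBlock1)); // true
```

Because the root changes whenever any account changes, two nodes that agree on the root in a block header agree on the whole state, without the state itself having to be stored in the block.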

The effect of a single Ethereum transaction on the state can be very small or very large. For example, a “regular” transaction that just sends ether from one address to another has little effect on the state. Similarly, a transaction that fails because it runs out of gas has little effect on the state. In contrast, other types of transactions may have a small data footprint on the blockchain itself but a large impact on the state: a transaction may interact with a smart contract, for example, which may in turn change many account balances. If the Ethereum blockchain contained only transactions with minimal impact on the state, the state would be much smaller and could approach the roughly 1% efficiency level of the Bitcoin UTXO set.

This is the key difference between Bitcoin transactions and Ethereum transactions. Just by looking at a single Bitcoin transaction, one can tell its impact on the state of the Bitcoin network, namely which outputs it spends and which it creates. This is not necessarily the case with Ethereum, where the impact of a transaction can often be understood only by computing the state of the entire network at the time the transaction executes.
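The contrast can be made concrete with two toy models. This is a rough sketch with hypothetical names and data throughout: the UTXO function skips signature and input validation, and the `faucet` contract is invented purely to show that the same call can produce different results depending on the state it meets.

```javascript
// Bitcoin-style: the transaction itself names the outputs it spends and the
// outputs it creates, so its effect on the UTXO set is explicit.
function applyUtxoTransaction(utxoSet, tx) {
  const next = new Map(utxoSet);
  for (const outpoint of tx.inputs) next.delete(outpoint);
  tx.outputs.forEach((out, i) => next.set(`${tx.id}:${i}`, out));
  return next;
}

// Ethereum-style: the transaction only says "call this contract". What
// changes depends on the contract code and the state at execution time.
function applyContractCall(state, tx, contracts) {
  return contracts[tx.to](state, tx);
}

// Invented contract: pays the caller one unit only while the pool has funds.
const contracts = {
  faucet: (state, tx) =>
    state.pool > 0
      ? { ...state, pool: state.pool - 1, [tx.from]: (state[tx.from] || 0) + 1 }
      : state,
};

const utxos = new Map([["a:0", { owner: "alice", value: 5 }]]);
const payment = {
  id: "b",
  inputs: ["a:0"],
  outputs: [
    { owner: "bob", value: 3 },
    { owner: "alice", value: 2 },
  ],
};
const utxosAfter = applyUtxoTransaction(utxos, payment);
console.log([...utxosAfter.keys()]); // [ 'b:0', 'b:1' ]

// The same call, applied to two different states, has different effects.
const call = { from: "carol", to: "faucet" };
const fundedResult = applyContractCall({ pool: 1 }, call, contracts);
const emptyResult = applyContractCall({ pool: 0 }, call, contracts);
console.log(fundedResult); // { pool: 0, carol: 1 }
console.log(emptyResult); // { pool: 0 }
```

In the first case the transaction alone is enough to update the set. In the second, the transaction data is identical in both calls, so its effect cannot be read off the blockchain without also having the state.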

You might be thinking: fine, Ethereum works differently from Bitcoin and there is no clear relationship between head state size and transaction count, but surely the same principle of pruning still applies. Why can’t old, unused, or expired state be pruned and excluded from the head state? Ethereum doesn’t work that way. Once a smart contract is deployed, there is no mechanism that closes or ends it, and it continues to exist forever even if it is no longer used. Part of the core idea behind Ethereum is that it is an interactive system of composable contracts: any account can interact with any smart contract, or any part of the state, at any time. To validate a new block, a node must therefore hold the latest state of all smart contracts and of the entire system. This limits the pruning or efficiency gains available for the head state, which is likely to continue to grow over time.

Bitcoin vs Ethereum – Conclusion

Comparing the sizes of the Ethereum and Bitcoin blockchains is not always meaningful. Bitcoin’s blockchain is essentially enough to tell people everything about the Bitcoin network. In contrast, the Ethereum blockchain on its own is by no means sufficient to learn much about Ethereum: to understand its state, more data needs to be calculated and stored, otherwise there is no way of knowing what effect many transactions actually had. To be fair, blockchain size comparisons do have some meaning, for instance as a measure of the minimum amount of data that must be downloaded over the internet to perform an initial sync. By this metric the two coins are very close, and Ethereum is about to take the lead, or fall behind, depending on how one looks at it.
