This is a three-part post in which I explain concepts that the famous whitepaper uses without introducing or describing them, and that are important for getting a better grasp of the problems tackled by Bitcoin. The parts follow the same sections as the paper.
This is not a one-stop shop for Bitcoin or crypto, but a good catch-all for most of the under-explained concepts in the original whitepaper. Also, this is not financial advice, nor am I trying to evangelize this to you. :)
I believe the best way to understand the original paper is to have this post by your side, stop here for the tricky and technical concepts, and return to reading the paper. There is no specific order, and you can skip any section (especially the mathy ones).
- Table of Contents
- P2P distributed timestamp server
- Proof of Work: Adam Back’s Hashcash
The paper starts with an introduction on the trust inherent to the commerce on the Internet, particularly on the costs of transactions. At one point, Nakamoto talks about the impossibility of non-reversible transactions, and the cost of their mediation. Let’s stop here and dig deeper into what this means.
Trusted third parties like financial institutions, banks and credit card issuers are responsible for resolving disagreements between parties. In order to resolve them, they need to do extra checks which have a cost. Consequently, any transaction whose value is lower than the cost of validating it is simply not profitable and won’t be done.
Small transactions like buying a can of soda or a digital copy of a song worth a couple of dollars are thus not viable, and the client is only left with more expensive options, like subscribing to a music service with a greater monthly cost.
But money is not the only cost in this. On the one hand, in order to be able to make complaints and reverse transactions, merchants/sellers first need information from their customers. On the other hand, if transactions were non-reversible they could be cleared without collecting any private information.
The non-reversibility of payments implies two things:
- merchants are protected since buyers can’t cancel their payments
- buyers are protected through escrow mechanisms (delaying payment until the goods or services have been received)
Bitcoin transactions are irreversible and can only be refunded by the receiving party—a key difference from credit card transactions that can be canceled, through mechanisms which add costs to transactions and set a minimum viable transaction value.
The escrow mechanisms are not part of the solution proposed by the paper, but have been implemented by third parties, for a fee of course. Indeed, a middle man charging for escrow. Kind of contradictory, perhaps?
An escrow is a financial and legal agreement that protects both buyers and sellers in a transaction in exchange for a fee. This fee is paid to an independent third party that holds the payment until everyone carries out their responsibilities in the agreement.
As an example, if buyer B wants to buy something from seller S, B is at risk of being scammed by S if the latter does not deliver the goods or services of the transaction. How can B recover the BTC spent, given the fact that bitcoin is non-refundable?
Bitcoin escrow mechanisms exist for this purpose, most only accepting legal merchandise.
For illegal activities, which I hope the reader is far from being involved in, some BTC escrows, such as the Dark Bitcoin Escrow, accept illegal goods and services (stolen credit cards, drugs, unregistered guns, fake currency).
Naturally, these mechanisms are frequented by thieves and gang members who don't want to get scammed by other thieves and gang members.
Later on, the authors propose a “solution to the double-spending problem using a peer-to-peer distributed timestamp server to generate computational proof of the chronological order of transactions.” There’s a lot to unpack here, so let’s go step by step.
Firstly, what’s the double-spending problem?
My favorite subreddit, r/explainlikeimfive, gives many wise surfers of the web the chance to describe this sort of concept in a very didactic manner. The following explanation is a digest of two explanations posted there.
Double spending is essentially the digital-money version of counterfeiting. With a digital currency like bitcoin, your money exists as a file. So what is stopping you from copying that file (money) and using both copies to buy things? It’s like buying a chocolate bar with one dollar bill, and a pen with another bill, but the bills are actually identical.
The blockchain comes to solve this. It is basically a history of all transactions of that digital currency. It works via agreement among all people who have copies of that blockchain. The key is that transactions on the blockchain are only considered valid if a majority of the people have agreed to that specific blockchain.
If you attempted to use a single unit of digital currency in two different transactions, it is possible that both transactions end up in different blockchains, but eventually one of them will be rejected and only the other one will be valid according to the majority of the network, so the one and only digital currency is "spent" and can't be used again.
As you may have noticed, the double spending problem in a digital currency system is fundamentally about the chronological order in which transactions happen. Let’s go over it once again with an example.
If A has $100 in their account, and tries to send $100 to each of B and C, obviously only B or C would get the money. Which of them gets it first depends on the order in which the transactions happen. If A sends $100 to B first, and then tries to send $100 to C, B would get the money and the second transaction would be rejected.
In centralized digital currency systems (e.g. e-banking), there is a single entity (the bank) tracking transactions, so the double spending problem is “easy” to stop. Whichever transaction that entity receives first goes through and the others are rejected. The chronological order of transactions (and thus which one is "first") is well-defined here since there is only a single entity responsible for tracking and ordering them.
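Here is a minimal sketch of that centralized case, assuming a single ledger that processes transfers strictly in the order they arrive. The `balances` dictionary and the `send` function are made up for illustration and mirror the A, B and C example above.

```python
# Toy centralized ledger: one entity sees every transfer, in order.
balances = {"A": 100, "B": 0, "C": 0}

def send(sender: str, receiver: str, amount: int) -> bool:
    """Apply a transfer if the sender can afford it; return whether it was accepted."""
    if balances.get(sender, 0) < amount:
        return False  # rejected: the money has already been spent
    balances[sender] -= amount
    balances[receiver] = balances.get(receiver, 0) + amount
    return True

first = send("A", "B", 100)   # arrives first, so it goes through
second = send("A", "C", 100)  # arrives second, so it is rejected
print(first, second, balances)
```

Because there is only one ledger, "first" simply means "whichever call ran first", and the double spend never gets a chance.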
However, in decentralized digital currency systems (e.g. Bitcoin), the order is not well-defined. If A broadcasts both transactions near-simultaneously, one person (or a validator node in the network of the blockchain) may receive the transaction to B first, while another person may receive the transaction to C first.
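The disagreement is easy to reproduce in a toy model. In the sketch below, each node applies the same two transactions from A, but in the order it happened to receive them. The function name and the starting balance of 100 are assumptions for illustration only.

```python
def apply_in_order(transactions, starting_balance=100):
    """Return the receivers whose payments a node accepts, given its arrival order."""
    balance = starting_balance
    accepted = []
    for receiver, amount in transactions:
        if amount <= balance:
            balance -= amount
            accepted.append(receiver)
        # otherwise the node rejects the payment as a double spend
    return accepted

# Same two transactions, different arrival order on each node.
node_1_view = apply_in_order([("B", 100), ("C", 100)])
node_2_view = apply_in_order([("C", 100), ("B", 100)])
print(node_1_view, node_2_view)
```

Each node is individually consistent, yet the two end up with different ideas of who got paid. That is exactly the gap an agreed-upon ordering has to close.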
Thus, Bitcoin ends up being the first successful system to solve the double spending problem with blockchain technology.
The blockchain allows a decentralized crowd to record and order information (in this case, transaction data) that is extremely difficult to modify once recorded. This allows all Bitcoin nodes to agree as to which of the two conflicting transactions happened "first".
Secondly, what’s a peer-to-peer network?
In a P2P network, every participant plays the same role; they are all peers. Each peer is both a client - requesting data, and a server - providing data. When you compare it to a regular website, this makes it a lot more robust.
A peer is usually connected to several other peers and if one goes offline, everything still works as usual. This makes blockchain networks very robust.
The P2P nature of the network allows it to avoid any third party, any financial institution, and instead depend on the effort of each and every one of the peers involved.
Thirdly, what is a timestamp server?
In Bitcoin’s P2P network, all the transactions are public to all the participants of the network, namely, the nodes.
This public record attempts to solve the double spending problem. But getting all nodes to agree at all times on a single source of truth for the transactions is not a trivial task.
The solution proposed includes making the nodes agree on the order of transactions. This means that if two transactions involve the same coin (double spending it), the one that came first is kept and the later one is rejected.
So what is the criterion used to decide which one comes first? Nakamoto proposes a timestamp server.
A timestamp server works by taking a hash of a block of items to be timestamped and widely publishing the hash, such as in a newspaper or Usenet post [2–5]. The timestamp proves that the data must have existed at the time, obviously, in order to get into the hash. Each timestamp includes the previous timestamp in its hash, forming a chain, with each additional timestamp reinforcing the ones before it.
Firstly, a timestamp is just a string of characters that identifies a specific date and time of day, often down to the second and with a timezone indication. For example, “20/12/2021 12:00:00 -0400”.
Generally, a timestamp is added to something else. In this case, a collection of transactions, a block, will be timestamped. The transactions in it don’t compete with each other to be the last one standing in the chain, so grouping them together is more efficient than having each transaction hold one timestamp. The timezone of reference is UTC.
Secondly, the block of transactions or items is then “hashed” and made public to the rest of the network. But what is a hash?
In plain terms, a hash is a function that receives an input A (of arbitrary length, mostly alphanumeric text) and returns an output B (often of a fixed size, like 256 bits) in such a way that it is extremely hard to create an inverse function, that is, a function that takes B and outputs A.
A ‘good’ hash:
- Is easy to calculate for any input.
- Makes it extremely computationally difficult to calculate or find an input that has a given hash.
- Has an extremely low probability of having two different messages with the same hash.
So to put it shortly, a hash is a short string of characters that can be computed easily, but cannot be reversed. It’s a small pseudo-random summary or fingerprint of the input. Because the block’s timestamp is part of that input, the fingerprint is tied to the moment in which the block of transactions was mined.
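These properties are easy to poke at with SHA-256, the hash function Bitcoin relies on, using Python's standard `hashlib` module. The helper name `sha256_hex` and the sample inputs are just illustrative choices.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short = sha256_hex("a")
long_ = sha256_hex("a" * 10_000)  # a much longer input
tweaked = sha256_hex("b")         # a one-character change

# The digest length is the same no matter how long the input is.
print(len(short), len(long_))
# A tiny change in the input gives a completely different fingerprint.
print(short)
print(tweaked)
```

Computing the fingerprint is instant, while going the other way (finding an input for a given fingerprint) has no known shortcut other than guessing.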
In relation to the timestamp, the hash is computed from it, among other things (previous block header hash, difficulty, nonce and Merkle root). Once the hash is published, it proves that the block’s data (including the timestamp that the hash sums up) must have existed at that time.
Thirdly, but most importantly, as new blocks of transactions are put together, each new hash is calculated with the previous block’s hash. If you trace the path from the latest block in the blockchain, jumping blocks one hash at a time backwards, you will eventually reach the genesis block, the very first block that was produced.
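The chaining can be sketched in a few lines. This is a toy model, not Bitcoin's real block format: the dictionary fields, the JSON serialization and the all-zero placeholder for the first block's predecessor are assumptions made to keep the example short.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's full contents, including the previous block's hash."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def build_chain(batches):
    """Turn (timestamp, items) pairs into blocks that each point at their predecessor."""
    chain = []
    prev = "0" * 64  # placeholder: the first block has nothing before it
    for timestamp, items in batches:
        block = {"timestamp": timestamp, "items": items, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain

def walk_back(chain):
    """Follow prev_hash links from the newest block back to the first one."""
    by_hash = {block_hash(b): b for b in chain}
    block = chain[-1]
    visited = [block["timestamp"]]
    while block["prev_hash"] in by_hash:
        block = by_hash[block["prev_hash"]]
        visited.append(block["timestamp"])
    return visited

chain = build_chain([(1, ["tx-a"]), (2, ["tx-b", "tx-c"]), (3, ["tx-d"])])
print(walk_back(chain))
```

Walking backwards visits the timestamps in reverse order and stops at the first block, which plays the role of the genesis block here.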
I will spare you the search for this block’s timestamp: January 3rd 2009, 18:15:05 UTC.
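Inside a block header the timestamp is not stored as readable text but as Unix time, that is, the number of seconds since January 1st 1970 UTC. A small sketch with Python's standard `datetime` module converts the genesis block's value back into a human-readable date (the constant name is mine).

```python
from datetime import datetime, timezone

# Timestamp of the genesis block, in seconds since 1970-01-01 00:00:00 UTC.
GENESIS_UNIX_TIME = 1231006505

genesis_time = datetime.fromtimestamp(GENESIS_UNIX_TIME, tz=timezone.utc)
print(genesis_time.isoformat())  # 2009-01-03T18:15:05+00:00
```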
This traceability guarantees a linear order of blocks that the entire network agrees upon. To mess with this chain, an attacker would have to recompute the hash of the target block and of every single block after it; a practically impossible task (at least as of the time of writing). So if a block B’s hash was computed from a block A’s hash, this proves that block A came before block B.
As hard as it is to rewrite the chain’s history, it is computationally easy to verify a block’s hash: anyone can check that block B’s hash results from block A’s hash by recomputing it. This verification is key for network nodes to rest assured that any incoming blocks (and their transactions) have not been modified in any way.
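A rough sketch of that check, again with a toy block format of my own (plain dictionaries hashed through JSON, not Bitcoin's real serialization): every block must carry the recomputed hash of the block before it, so editing an old block breaks the link.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's full contents."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def is_valid_chain(chain: list) -> bool:
    """Check that each block stores the recomputed hash of its predecessor."""
    for prev, current in zip(chain, chain[1:]):
        if current["prev_hash"] != block_hash(prev):
            return False
    return True

block_a = {"items": ["A pays B 1"], "prev_hash": "0" * 64}
block_b = {"items": ["B pays C 1"], "prev_hash": block_hash(block_a)}
chain = [block_a, block_b]

ok_before = is_valid_chain(chain)
block_a["items"] = ["A pays Mallory 1"]  # someone rewrites history
ok_after = is_valid_chain(chain)
print(ok_before, ok_after)
```

The check costs one hash per block, which is why any node can afford to run it on everything it receives.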
By introducing the chain of digital signatures, Nakamoto states that the payee can verify the signatures to verify the chain of ownership. But a problem is then posed. How can a payee verify that the received coin was not double-spent?
A common solution is to introduce a trusted central authority, or mint.
Investopedia defines mint concisely:
A mint is a primary producer of a country's coin currency, and it has the consent of the government to manufacture coins to be used as legal tender.
Along with production, the mint is also responsible for the distribution of the currency, protection of the mint's assets, and overseeing its various production facilities.
Now that we saw what a P2P distributed timestamp server is and how blocks of transactions are hashed, let's dive deep into how this all works together.
Hashcash uses a proof-of-work technique to prove an email is not spam. Bitcoin mining also uses a proof-of-work system, but in its case, to prove a block of transactions is valid. In order to implement a distributed timestamp server on a peer-to-peer basis, Satoshi mentions the need for a mechanism that validates transactions on the network.
The idea behind Hashcash is pretty simple. In order to prove an email is not spam, the sending software must solve a small but CPU-intensive math problem. The answer to the math problem is included in the email header, which is read by the receiving email software.
This prevents spam because spammers send millions of emails each day. Having to solve a small (but CPU-intensive, and therefore electricity-hungry) math problem for each of those millions of emails would require a lot of very expensive hardware, which makes sending spam unprofitable.
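To make the "math problem" concrete, here is a simplified sketch in the spirit of Hashcash. It is not the real thing: actual Hashcash has its own stamp format and is based on SHA-1, whereas this toy version just looks for a counter that makes the SHA-256 hash of `message:counter` start with a given number of zero bits. The function names and the 12-bit difficulty are my own choices.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def mint(message: str, bits: int) -> int:
    """Sender side: brute-force a counter until the hash has enough leading zero bits."""
    for counter in count():
        digest = hashlib.sha256(f"{message}:{counter}".encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return counter

def check(message: str, counter: int, bits: int) -> bool:
    """Receiver side: a single hash is enough to verify the work."""
    digest = hashlib.sha256(f"{message}:{counter}".encode()).digest()
    return leading_zero_bits(digest) >= bits

stamp = mint("to:alice@example.com", bits=12)
print(stamp, check("to:alice@example.com", stamp, bits=12))
```

The asymmetry is the whole point: the sender needs thousands of attempts on average at this difficulty, while the receiver checks the result with one hash.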
The whole point of Bitcoin mining is verifying transactions. Mining doesn't solely exist to generate more coins. The coins that are generated are given to the miners as a reward for verifying transactions. People will still have to mine even when there are no more coins to generate, because transactions still need to be verified forever and ever.
A transaction occurs anytime someone sends Bitcoins to another person. Each transaction needs to be verified to ensure it's legit, e.g. ensuring the sender did in fact send the coins, and ensuring the sender had the coins to send.
Without verifying transactions, people could create their own fake coins and send them to themselves, or they could pull a double spending attack, which, as we have seen, is sending the same coins to more than one person.
So someone (or something) needs to verify each transaction. But Bitcoin is a distributed system without any central authority, so who/what exactly verifies transactions?
The answer is the miner who did the most work to prove the transaction. This is where proof-of-work comes into play. In order for a block of transactions to be verified and added to the blockchain, a miner must solve a very complicated math problem. The problem is so complicated that miners need to buy prohibitively expensive hardware, and it's impossible to predict which miner will solve it.
The verification process prevents double spending because the person trying to execute the double spend would have to verify their own fake transaction, which, because of proof-of-work, is practically impossible.
It's practically impossible to verify your own fraudulent transactions because of the random nature of the math problem that needs to be solved. A huge number of other miners are trying to verify the same transactions, and the chances of your miner being the one that verifies them are practically zero. You are competing with every other miner on the planet to find the next block. If you "win", you get the block reward plus you get to choose which transactions are included in your block. If somebody else wins, you get no say in the matter. Unless you have a very large mining operation, the chance of you winning is so small that it's practically never going to happen.
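A back-of-the-envelope sketch puts a number on "so small". It assumes that your chance of winning any given block equals your share of the network's total hashing power, and that blocks are independent of each other. The share used below is a made-up figure, not a measurement.

```python
def chance_of_any_win(share: float, blocks: int) -> float:
    """Probability of winning at least one of the next `blocks` blocks."""
    return 1.0 - (1.0 - share) ** blocks

BLOCKS_PER_DAY = 24 * 6       # Bitcoin aims for one block roughly every ten minutes
small_miner_share = 1e-6      # hypothetical: one millionth of the network's hash power

daily_chance = chance_of_any_win(small_miner_share, BLOCKS_PER_DAY)
print(daily_chance)
```

With a share that small, even a full day of mining leaves the miner with only a tiny chance of finding a single block, which is why a lone attacker cannot count on confirming their own double spend.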
So miners verify transactions by solving complicated math problems, and the Bitcoin network rewards the miner who verified a block of transactions with a set number of coins.
In fact, the miner who solved the problem creates the coins and gives them to themselves!
When no more Bitcoins can be generated, miners will continue to mine to collect the fee attached to each transaction, paid by the sender.
Nonce, or a "number only used once," refers to the first number a blockchain miner needs to discover before solving for a block in the blockchain.
Adding transactions to the blockchain requires substantial computer processing power. The individuals and companies who process blocks, the miners, are compensated only if they are the first to create a hash that meets a certain set of requirements, called the target hash.
The process of guessing the hash starts in the block header. It contains the block version number, a timestamp, the hash of the previous block, the Merkle root (a single hash that summarizes all the transactions in the block), the nonce, and the target hash.
If the hash meets the requirements set forth in the target, then the block is added to the blockchain. Cycling through solutions in order to guess the nonce is the famous proof of work, and the miner who is able to find the value is awarded the block and paid in the cryptocurrency.
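The nonce search can be sketched as follows. This is a simplified model under several assumptions: the header layout is a reduced version of the real one (the encoded target field is left out), the hash is compared to the target as a big-endian number for readability, and the target is made absurdly easy so the loop finishes in a moment. The names `header_bytes`, `mine` and `EASY_TARGET` are mine.

```python
import hashlib
import struct

def double_sha256(data: bytes) -> bytes:
    """Bitcoin hashes block headers by applying SHA-256 twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def header_bytes(version, prev_hash, merkle_root, timestamp, nonce) -> bytes:
    """Pack a simplified header; the nonce is the only field the miner varies here."""
    return (struct.pack("<I", version) + prev_hash + merkle_root
            + struct.pack("<II", timestamp, nonce))

def mine(version, prev_hash, merkle_root, timestamp, target, max_nonce=2**32):
    """Try nonces one by one until the header hash falls below the target."""
    for nonce in range(max_nonce):
        digest = double_sha256(
            header_bytes(version, prev_hash, merkle_root, timestamp, nonce))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no nonce in range worked; a real miner would change other fields

EASY_TARGET = 1 << 244  # toy difficulty: about 1 in 4096 hashes qualifies
prev_hash = bytes(32)   # placeholder for the previous block's hash
merkle_root = hashlib.sha256(b"toy transactions").digest()  # stand-in Merkle root

result = mine(1, prev_hash, merkle_root, 1231006505, EASY_TARGET)
print(result)
```

Lowering the target makes qualifying hashes rarer, so more nonces have to be tried on average. Verifying the winner's claim still takes a single double hash of the header.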