"BTC White Paper" Study and Summary

What is BTC

Before I read this white paper, I naturally assumed that each Bitcoin was a piece of code or data representing one coin. But that's not the case. In the entire Bitcoin system, there is no object that stands for a Bitcoin itself.

The white paper mentions:

We define an electronic coin as a chain of digital signatures.

When I first read this, it was easy to think that one Bitcoin equals one digital signature. This leads to a huge question: how do you represent 0.1 BTC?

In the subsequent section "9. Combining and Splitting Value," it states:

Although it would be possible to handle coins individually, it would be unwieldy to make a separate transaction for every cent in a transfer. To allow value to be split and combined, transactions contain multiple inputs and outputs. Normally there will be either a single input from a larger previous transaction or multiple inputs combining smaller amounts, and at most two outputs: one for the payment, and one returning the change, if any, back to the sender.

In the entire blockchain system, only transaction records are stored; Bitcoin is merely a unit of measurement, recorded in the transaction records to indicate how much BTC was transferred. The wallet address corresponds to the transaction records in the entire blockchain, not to the BTC itself.

How is a transaction carried out?

In "2. Transactions," the original text is as follows.

We define an electronic coin as a chain of digital signatures. Each owner transfers the coin to the next by digitally signing a hash of the previous transaction and the public key of the next owner and adding these to the end of the coin. A payee can verify the signatures to verify the chain of ownership.

According to the definition of BTC we just mentioned: BTC is just a unit of measurement; what is actually stored on the blockchain is the transaction records.

For example:

Under my wallet address, I have the following three transaction records:

0.6 BTC transferred to my address.
0.8 BTC transferred to my address.
0.3 BTC transferred to my address.

These three transaction records indicate that I have 1.7 BTC at my address. Now, I need to send 1.5 BTC to my friend B.

First, the inputs for this transaction are the three transaction records I own (because 1.5 BTC requires all three transaction records).

Therefore, the first thing to do is prove that these three transaction records belong to me. Each record names my public key as its recipient, so I sign the new transaction with the matching private key, and anyone can check that signature against the public key. Then the transaction can begin.

Next, the hash of the previous transaction refers to the three hashes corresponding to the three transaction records.

Then, this transaction has two outputs:

1.5 BTC transferred to my friend's address.
0.2 BTC transferred back to my address (change).

Following the white paper's description, each of these two records ends like this:

For the 1.5 BTC record, I use my private key to sign the hashes of the three previous transaction records together with friend B's public key, and the resulting digital signature is attached to the end of the record.
For the 0.2 BTC record, I use my private key to sign the hashes of the three previous transaction records together with my own public key, and the resulting digital signature is attached to the end of the record.

This completes a transfer: ownership of one transaction record passes to friend B, while the other returns the change to me. Anyone can verify ownership by checking the signatures against the public keys; no private key is needed for verification. This is what happens during a BTC transfer.
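The example above can be sketched in a few lines of Python. This is only an illustration of how inputs are combined and change is returned: the `Output` class and `build_transaction` function are hypothetical names, amounts are kept in satoshi (1 BTC = 100,000,000 satoshi) to avoid floating-point rounding, and signatures and fees are left out entirely.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Output:
    owner: str   # hypothetical stand-in for the recipient's public key
    amount: int  # value in satoshi

def build_transaction(my_outputs, me, payee, amount):
    """Collect unspent outputs until they cover `amount`, then pay and return change."""
    inputs, total = [], 0
    for out in my_outputs:
        if total >= amount:
            break
        inputs.append(out)
        total += out.amount
    if total < amount:
        raise ValueError("insufficient funds")
    outputs = [Output(payee, amount)]
    if total > amount:
        outputs.append(Output(me, total - amount))  # change back to the sender
    return inputs, outputs

# The three records from the example: 0.6, 0.8 and 0.3 BTC.
mine = [Output("me", 60_000_000), Output("me", 80_000_000), Output("me", 30_000_000)]
inputs, outputs = build_transaction(mine, "me", "friend_B", 150_000_000)
for out in outputs:
    print(out.owner, out.amount / 100_000_000)
```

All three records are consumed as inputs, and the two outputs are 1.5 BTC to friend B and 0.2 BTC of change.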

How to avoid double spending?

What is double spending? Let's take an example:

Suppose I have 10 BTC, and I plan to send 8 BTC to friend B and 9 BTC to friend C. I follow the transaction process described above, construct two transactions, and initiate them simultaneously. A payee can easily verify that the BTC belonged to the sender, but cannot tell whether the sender is also sending the same BTC to someone else at the same time.

The problem arises because B and C blindly trust my word, and there is no communication between B and C, making it impossible to determine the order in which the transactions occurred.

In reality, the solution to this problem is to introduce a third-party institution, similar to a bank. B and C only trust transfers initiated by this institution, and I can only initiate transfer requests to this institution. Since all transfer actions must go through a central institution, there is no issue of information not being communicated; the institution knows how much money I have, how much I can transfer, and will not allow me to initiate double spending at will, while also clarifying the order of all transfers.

However, the problem is that the introduced third-party institution requires us to trust it completely. Trust is a difficult behavior to regulate; in the example above, we cannot guarantee that I am not colluding with the third-party institution. Even though there are many procedures and protocols in reality to help people trust these third-party institutions, the fact remains: as long as there are human participants in the process, shady behavior is inevitable; it's just a matter of time.

What BTC aims to do is eliminate this third-party institution and design a transaction system that guarantees security without requiring third-party trust.

"Timestamp Server"

The first measure is a design the white paper calls a Timestamp Server. Like a newspaper, it publishes events together with a time, which fixes the order in which all transactions occurred.

Each BTC block includes a timestamp, and all blocks are linked together by hashes.

The hash is calculated as follows: hash of the current block = Hash(hash of the previous block + data of the current block). In other words, every block used to record transactions is linked to the one before it by a hash. ~~This is why it is called a blockchain~~.

In this calculation method, the hash value depends on the specific value of the previous hash.

A key property of hash algorithms is that even a slight change in the input produces a completely different output.

So, if we try to modify the first block, the entire hash value will change dramatically. The hash value corresponding to the second block will not match at all. To successfully tamper with the first block, you would have to recalculate the second block. This continues for all subsequent blocks. Whether you want to delete or modify history, you must update the data of all blocks.

Every time a new block is added, it enhances the immutability of all previous blocks, ensuring the order of time.
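To see the chaining effect concretely, here is a minimal sketch using Python's `hashlib`. The block layout (a dict with `prev`, `data` and `hash`) and the helper names are my own simplifications, not Bitcoin's real block format.

```python
import hashlib

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(prev_hash: str, data: str) -> str:
    """Hash of a block = SHA-256 over the previous hash plus this block's data."""
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()

def build_chain(records):
    chain, prev = [], GENESIS_PREV
    for data in records:
        h = block_hash(prev, data)
        chain.append({"prev": prev, "data": data, "hash": h})
        prev = h
    return chain

def first_invalid(chain):
    """Return the index of the first block that breaks the chain, or None."""
    prev = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["prev"] != prev or block_hash(block["prev"], block["data"]) != block["hash"]:
            return i
        prev = block["hash"]
    return None

chain = build_chain(["tx set 1", "tx set 2", "tx set 3"])
intact = first_invalid(chain)                 # None: nothing is broken

chain[0]["data"] = "tampered"
after_edit = first_invalid(chain)             # block 0 no longer matches its hash

chain[0]["hash"] = block_hash(chain[0]["prev"], chain[0]["data"])
after_rehash = first_invalid(chain)           # now block 1 points at a stale hash
print(intact, after_edit, after_rehash)
```

Recomputing the hash of the tampered block only moves the mismatch one block further along, which is why every later block has to be redone as well.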

"Proof-of-Work"

This is the proof-of-work mechanism in BTC: each miner expends considerable computational effort to gain the right to write a new block. Only after providing proof of work can a node (miner) write content to a new block.

How is proof of work provided? BTC is designed as follows:

Each block consists of the following parts:

The hash of the previous block (prev hash)
All transaction data of the current block (in practice, it should store the Merkle Root)
Timestamp
Difficulty target (determines how many leading zero bits the hash must have for the proof of work to count)
Random number (nonce)

Of these fields, the only value miners have to search for is the "random number (nonce)." A nonce that makes the block's hash meet the difficulty target is what is referred to as "proof of work."

In "Proof-of-Work," miners need to solve a puzzle: with every other field fixed, find a nonce such that the hash of the block begins with n zero bits. The remaining bits can be anything. There is no shortcut; the only way is to try nonce values one after another. The specific number of leading zeros, n, is determined by the difficulty target.

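As for how the difficulty target is generated, I searched for information but have not yet understood the details. The white paper only says that it is determined by a moving average targeting an average number of blocks per hour: if blocks are generated too fast, the difficulty increases. In other words, the target is adjusted dynamically according to how quickly the network is currently producing blocks.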
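For reference, here is a rough sketch of the adjustment rule as I understand it from the deployed Bitcoin network, not from the white paper: the target is recalculated every 2016 blocks, aiming at one block per 10 minutes, and a single adjustment is limited to a factor of 4 in either direction. The function name `retarget` is hypothetical, and details such as the maximum allowed target are omitted. Note that a smaller target means a harder puzzle.

```python
TARGET_SPACING = 10 * 60          # desired seconds per block
RETARGET_INTERVAL = 2016          # blocks between adjustments
EXPECTED_TIMESPAN = TARGET_SPACING * RETARGET_INTERVAL  # two weeks in seconds

def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last 2016 blocks actually took."""
    # Limit the change to a factor of 4 in either direction.
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN

old = 1 << 200
on_schedule = retarget(old, EXPECTED_TIMESPAN)        # unchanged
twice_as_fast = retarget(old, EXPECTED_TIMESPAN // 2)  # target halves, so mining gets harder
far_too_fast = retarget(old, 1)                        # change is capped at a factor of 4
```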

Once a computer finds such a nonce, it serves as the node's "proof of work," proving that the node has the right to write the new block, which it then broadcasts to all nodes. Although finding the nonce is very difficult, verifying it is simple: other nodes only need to hash the block once with that nonce and check that the result meets the difficulty target.

Calculating the nonce is not an easy task for a computer; it requires considerable computational power to arrive at the result. Therefore, combined with the earlier "Timestamp Server," if someone wishes to modify a block, the calculation of this nonce cannot be avoided.

If you want to modify a block in history, you must redo the proof of work for that block and for every block after it, and then catch up with and overtake the honest chain, which keeps growing in the meantime. The more blocks that have been added on top, the harder it becomes to tamper with history.
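The search for a nonce and the cheap verification can be sketched as follows. This is a toy version under my own assumptions: the header is a plain string, a single SHA-256 is used, and difficulty is counted in leading zero hex digits, whereas real Bitcoin hashes a binary header and compares the result against a numeric target. The field values in the example are arbitrary placeholders.

```python
import hashlib

def header_hash(prev_hash, merkle_root, timestamp, nonce):
    """Hash all header fields, including the nonce, into one hex digest."""
    payload = f"{prev_hash}|{merkle_root}|{timestamp}|{nonce}".encode()
    return hashlib.sha256(payload).hexdigest()

def mine(prev_hash, merkle_root, timestamp, zeros):
    """Try nonces 0, 1, 2, ... until the hash starts with `zeros` zero digits."""
    nonce = 0
    while not header_hash(prev_hash, merkle_root, timestamp, nonce).startswith("0" * zeros):
        nonce += 1
    return nonce

def verify(prev_hash, merkle_root, timestamp, nonce, zeros):
    """Verification is a single hash computation."""
    return header_hash(prev_hash, merkle_root, timestamp, nonce).startswith("0" * zeros)

prev_hash, merkle_root, timestamp = "00" * 32, "ab" * 32, 1231006505
nonce = mine(prev_hash, merkle_root, timestamp, zeros=4)
print(nonce, header_hash(prev_hash, merkle_root, timestamp, nonce))
```

With four zero hex digits the loop needs tens of thousands of attempts on average, while `verify` needs exactly one hash. Each additional zero digit multiplies the expected work by 16.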

"Network"

New transactions are broadcast to all nodes.

Each node collects new transactions into a block.

Each node works on finding a difficult proof-of-work for its block.

When a node finds a proof-of-work, it broadcasts the block to all nodes.

Nodes accept the block only if all transactions in it are valid and not already spent.

Nodes express their acceptance of the block by working on creating the next block in the chain, using the hash of the accepted block as the previous hash.

Nodes always consider the longest chain to be the correct one and will keep working on extending it. If two nodes broadcast different versions of the next block simultaneously, some nodes may receive one or the other first. In that case, they work on the first one they received, but save the other branch in case it becomes longer. The tie will be broken when the next proof-of-work is found and one branch becomes longer; the nodes that were working on the other branch will then switch to the longer one.
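The tie-breaking rule in this paragraph can be sketched like this. Branches are represented as plain lists of block names, which is my own simplification; as far as I know, real nodes compare the accumulated proof of work rather than the raw block count.

```python
def best_chain(branches):
    """Return the longest branch; on a tie, keep the one that was received first."""
    best = branches[0]
    for branch in branches[1:]:
        if len(branch) > len(best):
            best = branch
    return best

# Two competing versions of the next block arrive; branch_a was received first.
branch_a = ["genesis", "b1", "b2a"]
branch_b = ["genesis", "b1", "b2b"]
before = best_chain([branch_a, branch_b])   # tie: keep working on branch_a

# The next proof-of-work is found on top of branch_b, so it becomes longer.
branch_b.append("b3b")
after = best_chain([branch_a, branch_b])    # switch to branch_b
```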

Combining the above points, the problem of double spending is resolved.

If I initiate two transactions at the same time in an attempt to double spend, proof of work plus the timestamp design puts all transactions into a single chronological order that every node can see. Once the first transaction is in the chain, the second is rejected because its input has already been spent.
If I broadcast the two transactions from different geographical locations, different miners may pick up different ones, but the difficulty of proof of work makes it unlikely that both are written into blocks at the same moment.
Even if two geographically separated nodes manage to write both transactions into blocks at the same time, the chain will fork, and the longer fork will replace the shorter one. This ensures that only one of the transactions can succeed.

Summary

The double spending problem refers to the same Bitcoin being used for two payments. Bitcoin solves this problem through proof of work and the longest chain principle. When new transactions are broadcast to the network, miners package them into blocks, find a hash that meets the difficulty requirements through proof of work, and add the block to the blockchain.

If two transactions attempt to spend the same UTXO, only one transaction will be included in a block first, while the other transaction will be invalid due to the UTXO already being spent. The longest chain principle ensures that the entire network reaches a consensus, preventing double spending.
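The "already spent" check can be sketched with a dictionary of unspent outputs. The data layout and the name `apply_transaction` are hypothetical, and the sketch skips signature and amount checks; it only shows why the second of two conflicting transactions fails.

```python
def apply_transaction(utxo_set, tx):
    """Spend tx's inputs and create its outputs, or reject it if any input is gone."""
    if any(ref not in utxo_set for ref in tx["inputs"]):
        return False  # at least one input was already spent (or never existed)
    for ref in tx["inputs"]:
        del utxo_set[ref]
    for index, (owner, amount) in enumerate(tx["outputs"]):
        utxo_set[(tx["id"], index)] = (owner, amount)
    return True

# I own one unspent output worth 10 BTC, identified by (transaction id, output index).
utxo_set = {("tx0", 0): ("me", 10)}

pay_b = {"id": "tx1", "inputs": [("tx0", 0)], "outputs": [("B", 8), ("me", 2)]}
pay_c = {"id": "tx2", "inputs": [("tx0", 0)], "outputs": [("C", 9), ("me", 1)]}

first = apply_transaction(utxo_set, pay_b)    # accepted
second = apply_transaction(utxo_set, pay_c)   # rejected: ("tx0", 0) is already spent
print(first, second, utxo_set)
```

Whichever transaction reaches a block first wins; the other one refers to an output that no longer exists in the set.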

Merkle Tree

The Merkle Tree mentioned earlier is what lets nodes save disk space. The hashes of all transactions in a block are combined pairwise, level by level, until a single root hash remains, and only that root goes into the block header. Old transactions can then be discarded without changing the block's hash, and the root hash can still be used to verify that data is consistent.

Even if we want to verify a specific transaction, we only need to keep the block headers of the longest chain, ask nodes that store the full data for the Merkle branch linking our transaction to its block, and check that the branch hashes up to the root in the header.
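Here is a small sketch of computing a Merkle root and checking one transaction against it. It assumes at least one transaction, uses a single SHA-256 for brevity, and duplicates the last hash when a level has an odd number of entries; the function names are my own.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def next_level(level):
    """Pair up hashes (duplicating the last one if the count is odd) and hash each pair."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(transactions):
    level = [h(tx) for tx in transactions]
    while len(level) > 1:
        level = next_level(level)
    return level[0]

def merkle_branch(transactions, index):
    """Collect the sibling hashes needed to rebuild the root from one transaction."""
    level, branch = [h(tx) for tx in transactions], []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        branch.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        level = next_level(level)
        index //= 2
    return branch

def verify_branch(transaction, branch, root):
    node = h(transaction)
    for sibling, sibling_is_left in branch:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

txs = [b"tx-a", b"tx-b", b"tx-c"]
root = merkle_root(txs)
branch = merkle_branch(txs, 2)
print(verify_branch(b"tx-c", branch, root))
```

The verifier needs only the transaction, a handful of sibling hashes, and the root from the block header, not the other transactions in the block.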

Where did the first BTC come from?

One block is special: it has no previous block to link to. That is the genesis block. The first BTC was born in this genesis block, which carries a reward of 50 BTC.

As mentioned in "Proof-of-Work," every miner node competes to earn the right to write the next block.

First, a miner who finds a valid proof of work receives the block reward. Initially, a block rewards 50 BTC, which halves about every four years. This continues until nearly 21 million BTC have been mined. This is the origin of mining and the primary source of BTC.

The Bitcoin block reward is halved approximately every 210,000 blocks (about every four years). This design is intended to control the total supply of Bitcoin.

The first halving (2012): the reward decreased from 50 BTC to 25 BTC.

The second halving (2016): the reward decreased from 25 BTC to 12.5 BTC.

The third halving (2020): the reward decreased from 12.5 BTC to 6.25 BTC.

Future halvings will continue until all 21 million Bitcoins are mined.
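The halving schedule is simple enough to compute directly. The sketch below works in satoshi (1 BTC = 100,000,000 satoshi) and uses integer halving; the function names are my own.

```python
HALVING_INTERVAL = 210_000
INITIAL_SUBSIDY = 50 * 100_000_000  # 50 BTC in satoshi

def block_subsidy(height: int) -> int:
    """Reward for the block at `height`: 50 BTC halved once per 210,000 blocks."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_SUBSIDY >> halvings if halvings < 64 else 0

def total_supply() -> int:
    """Sum the rewards of every era until the reward rounds down to zero."""
    total, era = 0, 0
    while INITIAL_SUBSIDY >> era > 0:
        total += (INITIAL_SUBSIDY >> era) * HALVING_INTERVAL
        era += 1
    return total

print(block_subsidy(0) / 1e8, block_subsidy(630_000) / 1e8)
print(total_supply() / 1e8)
```

Because each halving rounds down to a whole satoshi, the total comes out slightly below 21 million BTC.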

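Secondly, the miner who finds the proof of work also collects the transaction fees of every transaction in the block ("gas" is Ethereum's term; Bitcoin simply calls them fees). The main sources of income for miners are the block reward and transaction fees.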

About Wallets

In reality, a wallet does not store Bitcoin. It stores the keys that control unspent transaction outputs (UTXOs) recorded on the blockchain. If you want to know how much money is at your address, you need to go through the blockchain, find all UTXOs that belong to you, and add up their amounts.

This operation sounds slow, but in practice, most wallet software utilizes memory and indexing optimizations to speed up queries. Specific implementation methods can include Bloom filters, address indexing, and other technologies.
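As a rough illustration of the address-indexing idea, the sketch below groups unspent outputs by address once, so that a balance lookup no longer scans everything. The record layout and function names are hypothetical, and amounts are in satoshi.

```python
from collections import defaultdict

def build_address_index(utxos):
    """Group unspent outputs by address in a single pass."""
    index = defaultdict(list)
    for utxo in utxos:
        index[utxo["address"]].append(utxo)
    return index

def balance(index, address):
    """Balance of an address = sum of its unspent outputs."""
    return sum(utxo["amount"] for utxo in index.get(address, []))

utxos = [
    {"address": "me", "amount": 60_000_000},
    {"address": "me", "amount": 80_000_000},
    {"address": "friend_B", "amount": 5_000_000},
    {"address": "me", "amount": 30_000_000},
]
index = build_address_index(utxos)
print(balance(index, "me") / 100_000_000)
```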

Is there more than one transaction in a block?

Yes.

Transactions are not immediately written into blocks; after they are received and broadcast, they are first placed in the memory pool (mempool).

Then miners select transactions from the memory pool, package them into a candidate block, compute the Merkle root, and start trying nonce values to obtain proof of work (Proof-of-Work).

If miners successfully find the nonce value, these transactions will then be confirmed and written into the block.
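How a miner chooses which transactions to take from the mempool is not covered by the white paper. One plausible strategy, shown here purely as an assumption, is to prefer transactions that pay the highest fee per byte until the block is full. The field names and the size limit are made up for the example, and dependencies between transactions are ignored.

```python
def select_transactions(mempool, max_block_size):
    """Greedily pick transactions by fee per byte until no more fit."""
    chosen, used = [], 0
    for tx in sorted(mempool, key=lambda t: t["fee"] / t["size"], reverse=True):
        if used + tx["size"] <= max_block_size:
            chosen.append(tx)
            used += tx["size"]
    return chosen

mempool = [
    {"id": "a", "fee": 500, "size": 250},  # 2.0 per byte
    {"id": "b", "fee": 900, "size": 300},  # 3.0 per byte
    {"id": "c", "fee": 100, "size": 200},  # 0.5 per byte
    {"id": "d", "fee": 400, "size": 400},  # 1.0 per byte
]
block_txs = select_transactions(mempool, max_block_size=800)
print([tx["id"] for tx in block_txs])
```

Here "d" is skipped because it no longer fits after "b" and "a", while the smaller "c" still does.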