Bitcoin L2: Blockchain Database Best Practices
We’ve reviewed the security of numerous projects that track the Bitcoin blockchain and build a database of blocks and transactions. This database typically provides search capabilities, links transactions to user accounts, stores auxiliary data for higher-level protocols, or feeds blockchain intelligence algorithms. Almost all services that Build on bitcoin do it: web wallets, blockchain explorers, and back-end APIs for web apps. But nearly all do it wrong. They assume that the blockchain grows at regular 10-minute intervals and that blockchain reorganizations can only extend the length of the best chain. These assumptions are not true, as several factors, either probabilistic, game-theoretic, protocol-mandated, or network-connectivity dependent, may cause the miners to perform sudden changes, such as adding multiple blocks at once or performing long reorganizations.
blockchain reorganization race condition
The most common assumption of blockchain cloning applications is that the Bitcoin node’s blockchain is “locked” somehow while the polling application makes several requests to identify the changes. The blockchain can asynchronously change its state and reorganize between RPC calls, and hidden race conditions may appear. Some polling algorithms try to detect a change on the tip and then use the tip height to gather more information, assuming the blockchain hasn’t changed in between. They end up building a blockchain that is generally coherent but erratically may become incoherent.
blockchain branch choosing criteria
The second most common misconception is that the Bitcoin protocol chooses the most extended blockchain branch. This assumption is false; the Bitcoin protocol chooses the blockchain branch with higher accumulated expected difficulty, often called chain-work (but don’t confuse it with the actual work done). So, the best chain may switch from a first branch to a second branch, which is a shorter branch, if the accumulated difficulty of the second branch is higher. In practice, since reorganizations of no more than 53 blocks have ever occurred, this can only happen just after a difficulty change boundary, if there are two competing blocks at the boundary such as the time-stamps of the competing boundary block establishes two different expected difficulties for the following block. We call this event a difficulty-split event. The difficulty field (nBits
) has enough precision to distinguish two interval values with a 1-minute temporal difference. The rate of competing blocks (orphan rate) is about 3%, so the expected rate of a difficulty-split event (by chance) is about 1/2016 * 3/100. This is approximately 1 difficulty-split event every 450 days. Not very likely, but it can occur. But if we take into account that a party may maliciously try to generate a difficulty-split event, then the cost of the attack will be approximately equal to the current block reward, with a success probability that depends on the attacker’s total hashing power.
First-time blockchain download and database incoherence
Another moment where an attack can take place is during the first time a system downloads the blockchain: an attacker can easily build branches that reorganize at will during the interval where the difficulty is low, before the first checkpoint. Suppose that an attacker is connected to a Bitcoin node, which is the source node of a blockchain cloning application, and suppose that the attacker can force the application to rebuild its blockchain database. If the cloning algorithm does not take into account arbitrary asynchronous reorganization, then the attacker will be able to force the application to build a non-coherent blockchain. In an application we reviewed, we managed to build a non-coherent blockchain that could not be undone because the application could not travel the blockchain backward, so the attack was persistent, even if the Bitcoin node later synchronized with the network.
Ending block
The most realistic assumption applications take is that a blockchain branch of length N ending with block X cannot instantaneously mutate into a blockchain branch of length N ending with block Y (different from X) but with all the remaining blocks equal. This assumption is currently correct. In fact, to replace a block at height N with another block at the same height would require that the accumulated difficulty of the second blockchain branch is higher, but since the expected difficulty of block N is known at block (N-1), both X and Y have the same difficulty. Nevertheless, this situation can happen in more sophisticated setups:
For example, secure setup would be that the cloning software polls different instances of bitcoind and use a voting system to decide the best chain, in order to avoid targeted attacks to single bitcoind or a single point of failure. Then the voting result may change as if it were a last tip reorganization even if each instance will never perform such a reorganization.
Also there is the possibility that future versions of bitcoind decide which block to add to the best chain in a different way (e.g. by using the GHOST method). There is the possibility that a node being connected is a modified node. Suppose that a pool mining server node detects that two client miners solved different blocks at the same height and the second solved block received has a higher reward that the first (because of transaction fees). Then the mining node may switch the tip from the first to the second, for his own benefit, even if the best chain has the same accumulated difficulty.
Human intervention
Last, some cloning applications assume only reorganization at a maximum height will occur, such as only the last 10 blocks. This can be violated if a blockchain forking bug occurs and human intervention needs to resolve the conflict, such as the event that took place in August 2010 or in March 2013. The length of the reorganization in the first case was 53 blocks, and 31 blocks in the second case. Luckily, the cloning method can be made generic, so it doesn’t need to be bounded in the number of blocks.
blockchain cloning Java source code snippet
This is our proposed and simple method to clone the blockchain, which does not suffer from race conditions. We assume the MyDatabase class allows the application to manage the blockchain stored in a database, and bitcoindRPC allows the application to access a Bitcoin node using the JSON-RPC interface.
tipBlock = MyDatabase.GetBestChainTipBlock()
while(1)
{
nextblockHash = bitcoindRPC.getBlockHash(tipBlock.getHeight()+1)
if (nextblockHash != null)
{ // there is a next block
nextBlock = bitcoindRPC.getBlock(nextblockhash);
if (nextBlock.getParentHash() != tipBlock.getHash())
{
previousBlock = tipBlock.getPreviousBlock();
previousBlock.setNextBlock(null);
MyDatabase.destroy(tipBlock);
tipBlock = previousBlock;
}
else
{
MyDatabase.add(nextBlock);
tipBlock.setNextBlock(nextBlock);
tipBlock = nextBlock;
}
}
[*]
}
This code is both simple and easy to check for correctness. The only drawback is that it will detect one-block reorganizations (which we already said practically never happen) only when the following block is included (a delay of approximately 10 minutes). To prevent this delay we added a second check of the tip block, inserted in the position marked by [*] in the previous snippet.
else
{
// check if tip block has changes or new best chain is shorter (very rare)
tipBlockHash = bitcoindRPC.getBlockHash(tipBlock.getHeight())
if (tipBlockHash != tipBlock.getBlockHash())
{
previousBlock = tipBlock.getPreviousBlock();
previousBlock.setNextBlock(null);
Mydatabase.destroy(tipBlock);
tipBlock = previousBlock;
}
}
Conclusion
By understanding these common pitfalls and best practices, developers who build on Bitcoin can create more secure, reliable blockchain databases that effectively manage reorganizations and other challenges. To ensure your blockchain application is robust and secure, consider hiring our expert source code review services. Our team specializes in identifying vulnerabilities and optimizing blockchain implementations so you can build on Bitcoin with confidence.