The Bitcoin Algorithm

You’ve probably heard of Bitcoins. They’re virtual currency, meaning they exist nowhere but on a network of accounts owned by people all over the world. The system is decentralized, so it’s not under the control of any government or, arguably, any single authority at all. Bitcoin values are set not by a government authority but partially through a complicated mathematical algorithm and partially by what people think they should be worth at any moment. It was reported on Wednesday that the price of a single Bitcoin rose to US$1,000, a record high.

I’ve been fascinated by the Bitcoin algorithm for some time and I’ve been thinking of ways to use machine learning to analyze its price movements. Then I hear news that Bitcoin has an inherent flaw that could allow a powerful few to wrest control of the now-decentralized currency. All it would take is a group of cheaters according to a research paper released Monday by Cornell University post-doctoral fellow Ittay Eyal and Professor Emin Gün Sirer.

The flaw is due to the nature of how Bitcoins are created — people “mine” them by scanning for a value that when hashed twice with is a set of cryptographic hash functions called SHA-256, begins with a number of zero bits. While the average work required increases exponentially with the number of leading zero bits required, a hash can always be verified by executing a single round of double SHA-256. There’s a whole market in specialized computers, or mining rigs, architected just to mine Bitcoins. If used correctly, the system is set up so that someone guesses correctly every 10 minutes, and the winner gets 25 Bitcoins. Because people compete against one another for the digital currency, Bitcoins are mostly evenly distributed.

But Bitcoin miners could exploit a weakness in the system that would give them a greater chance of getting Bitcoins than rival miners: solving the mining task gives miners a much higher chance of solving the next one, and those solutions are typically stored in a public log called a “blockchain.” But solutions don’t have to be publicized. If you solve a mining task and keep it secret, you can start working on the next one and let everyone else keep mining in the wrong spot. That unfair advantage becomes even more apparent when selfish, secretive miners group together and pool computing resources to solve mining tasks. The bigger the group, the more frequently they win. If a group gets large enough, it could take control of the currency.

The flaw could disrupt the very reason many have decided to use the four-year old currency, which represented a $2.6 billion market as of Monday morning.

bitcoinpoolThe new research paper explains that for the Bitcoin system to be safe it isn’t good enough that just a majority of miners are honest. The only option is to modify the Bitcoin algorithm and this is what the researchers propose. The probability that the dishonest miner’s new chain will be accepted depends on the way a forked chain is handled, i.e. a chain with two proposed solutions. At the moment what happens is that the shortest branch is ignored (corresponding to gamma=1 in the chart above). Normally branching doesn’t occur very often, but if a dishonest pool is at work it would become much more common.

The suggested change allows miners to select the branch to work on with a probability of 0.5, so reducing the chance that the dishonest branch is established. However, this simply increases the size of the dishonest pool needed to make a profit to 25% of the total number of miners. What this means is that even after the fix, 75% of the miners have to honest for Bitcoin to be reasonably safe. As the paper points out, there is a pool that commands 25% of the mining resources at this time and in the past there have been pools of more than 33%.

I’m looking forward to see how this new revelation plays out. In the meantime I shall continue toying with machine learning techniques and how they might apply to the Bitcoin ecosystem. For those new to Bitcoin, here is a brief introduction video that describes the concept: