Thursday, August 30, 2018

Raising op code limits while holding up the sky

Introduction

In the original Bitcoin client created by Satoshi (version 0.1.0), the script engine differed significantly from the one we have today. Notable differences included math operations on integers of arbitrary size (crypto math, anyone?) and the absence of limits on things like script size. At some point in Bitcoin’s history, a limit of 201 op codes per script was imposed.
The team behind the Bitcoin SV implementation is proposing to remove altogether (or alternatively, raise) this limit of 201 op codes per script, which has caused concern about attacks invoking high computational load.
In this post I will seek to quantify this risk and explain mitigating measures that make it quite possible to do this without the sky falling in.

But why remove the limit?

Leaving aside the possibilities that large scripts create, like unrolled loops and complex algorithmic constructs, allow me for a moment to indulge in a bit of musing.
Bitcoin SV’s remit is to restore the Bitcoin protocol and lock it down (with the exception of security fixes and those changes absolutely necessary to meet scaling goals). What constitutes the ‘protocol’ in our view is a common-sense combination of:
  1. The whitepaper
  2. The original code
  3. The obvious example Satoshi set by fixing bugs
  4. His subsequent musings on protocol matters
Whilst not a technical argument, perhaps the reason I personally believe in this goal is this: while Satoshi was around, Bitcoin was in its infancy. It was a system that relied on critical mass for many of its properties to begin being expressed. Limits began being imposed even while he was still around (although with the express intent of lifting them later, once critical mass HAD been achieved), and this trend continued after he was gone, with more and more divergence from the original vision. This culminated in RBF and SegWit, the straws that finally broke the camel’s back and ultimately triggered the rebirth of Bitcoin as Bitcoin Cash.
Now that the yoke of Core is lifted and we have the potential for some real critical mass, Bitcoin is finally on the verge of becoming, for the first time in history, what it was originally intended to be. The thing is, we don’t know what that is. Bitcoin is a system that relies on a complex interplay between game theory, economic incentives, cryptography, and more. We know that Satoshi didn’t think we needed any artificial limits and that a central pillar of his plan was to give miners strong, profit-based incentives to figure it out. Given the impact that Bitcoin has already had on the world and the profound potential for change it represents, I personally feel it’s worth returning the protocol to the way it was always supposed to be and leaving it alone for a few years - to let Bitcoin become what it has never yet had a chance to be. Only then will we have a chance to see what comes of that and what we can learn about the interplay of technology and sociology that Bitcoin represents. We owe Satoshi that.
The mere existence of Bitcoin script, and the fact that it did not originally have limits, strongly suggests that Satoshi envisaged this language being used in ways we haven’t yet conceived.
Now back to the fun technical detail.

What constrains script sizes?

In current implementations, there are actually a few limits that come into play with script sizes:
  1. 201 op codes per script
  2. 10000 bytes per script
  3. 20000 sigops per transaction
  4. 20000 sigops per MB (3 and 4 are widely acknowledged to have completely broken implementations, however)
  5. 1 MB per transaction
Point 2 is important to note, because there is a common misconception that lifting the 201 op code limit per script equates to unlimited script size. It does not, because the 10000 byte limit then comes into play. Also important to note is that this byte limit counts both op codes AND data.
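To make the interplay concrete, here is a minimal Python sketch of how the two per-script limits combine. It follows this post’s convention of counting every op code, including data pushes, toward the op limit; the function name is mine, not from any client codebase:

MAX_OPS_PER_SCRIPT = 201    # the op code limit proposed for removal
MAX_SCRIPT_SIZE = 10000     # the byte limit: counts op codes AND data

def within_script_limits(num_opcodes: int, num_bytes: int) -> bool:
    # A script must satisfy BOTH limits under today's rules.
    return num_opcodes <= MAX_OPS_PER_SCRIPT and num_bytes <= MAX_SCRIPT_SIZE

# Lifting only the op code limit does not mean unlimited scripts;
# the byte limit still binds:
print(within_script_limits(num_opcodes=500, num_bytes=9000))   # False today, True with the op limit lifted
print(within_script_limits(num_opcodes=150, num_bytes=12000))  # False either way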
The other, more important constraint on script sizes is one that was built into Bitcoin from the very beginning: transaction fees. It is OK to run a really long script if you're willing to PAY a miner for it.

100 * 10 = 10 * 100

The commonly cited concern with raising the limit on op codes per script is computational cost. Won’t a 10000 op code script cost a lot more CPU cycles to run than a 201 op code script? Yes, of course it will (although, as we are about to see, the worst case grows only marginally), but it will also be commensurately more lucrative for the miner to mine. From the miner’s point of view, the revenue gained from 100 lots of 10 satoshis is the same as from 10 lots of 100 satoshis.
This is a simplistic view because it doesn’t account for the variance in computational cost of different op codes, but it’s a good starting point. Let’s take a look at how much that variance changes the game.

Sigops and why they matter

Note that one of the limits above is max sigops per megabyte. ‘Sigop’ is a generic term for an ECDSA signature verification operation invoked by the OP_CHECKSIG and OP_CHECKMULTISIG op codes (and their VERIFY variants). These op codes are orders of magnitude more expensive to process than any other op code, and until recently most clients processed them in a single thread.
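As a rough illustration (not a rigorous benchmark), the sketch below compares one ECDSA verification against a trivial stack operation, assuming the pure-Python ecdsa package is installed. The exact ratio varies by machine and library, but it is reliably orders of magnitude:

import time
from ecdsa import SigningKey, SECP256k1   # pip install ecdsa

sk = SigningKey.generate(curve=SECP256k1)
vk = sk.verifying_key
msg = b"stand-in for a transaction digest"
sig = sk.sign(msg)

start = time.perf_counter()
vk.verify(sig, msg)                 # roughly what one sigop costs
verify_time = time.perf_counter() - start

stack = [b"\x01"]
start = time.perf_counter()
for _ in range(1000):
    stack.append(stack[-1])         # trivial ops, akin to OP_DUP
dup_time = (time.perf_counter() - start) / 1000

print(f"one verify ~ {verify_time / dup_time:,.0f}x one dup")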
So how could this fact be exploited to try and break Bitcoin? In examining the potential impact of a change, it isn’t really sufficient to compare normal usage against the post-change attack scenario. It is more informative to compare the pre-change attack scenario with the post-change one.

Sigop stuffing for fun and not-for-profit

In order to analyse different attack scenarios, I will use a metric called sigop density: a measure of how many sigops you can pack into a fixed number of bytes. This is useful because byte count is proportional to transaction fees (under the current fee model), so sigop density is really a measure of how much computational load an attacker gets per satoshi of fees.
Because sigops are currently so computationally expensive, a block packed with as many sigops as possible could plausibly take miners a long time to validate, in part due to the inefficient signature verification code in bitcoin client software. Executing this attack requires making a block as sigop-dense as possible, which means filling it with non-standard transactions in place of standard ones. The attacker would likely have to mine the block themselves, unless a miner were willing to accept any type of transaction.
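A minimal sketch of the metric (the helper name is mine, not from any client codebase):

def sigop_density(sigops: int, script_bytes: int) -> float:
    # Sigops per byte. Since fees are roughly proportional to bytes
    # under the current fee model, this approximates computational
    # load per satoshi of fees paid.
    return sigops / script_bytes

# The standard <sig> <pubkey> CHECKSIG pattern costs about 107 bytes,
# as broken down in the next section:
print(sigop_density(1, 107))   # ~0.0093 sigops per byte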

The worst case attack that no one did in 10 years

I’m going to make a few assumptions here to simplify this examination. They have very little impact on the final numbers, but removing them would complicate the explanation significantly:
  1. I will assume that we are talking about just one script, as opposed to two separate ones (scriptSig and scriptPubKey)
  2. In calculating data sizes, I will ignore the rest of the transaction and only consider the bytes in the script itself (doing this actually makes the numbers look slightly worse, so it’s a bias AGAINST the case I’m making)
Invoking a sigop requires 3 op codes and about 107 bytes in the script: two data pushes (signature and public key) and OP_CHECKSIG itself.
<sig> = 1 + 71 bytes (avg)
<pubkey> = 1 + 33 bytes
CHECKSIG = 1 byte
Total = 107 bytes (avg)
There are a couple of alternative methods for achieving even higher sigop density:
  1. Only pushing one signature and public key, then using OP_2DUP to repeatedly copy them for each new sigop. This is easily defeated with a signature cache, however, reducing the cost of each repeated sigop to a cache lookup.
  2. Pushing many signatures but reusing the same public key. This requires 5 op codes per sigop rather than 3 but saves about 32 bytes. The script looks something like this: <sig> <pk> DUP TOALTSTACK CHECKSIG <sig> FROMALTSTACK DUP TOALTSTACK CHECKSIG… (a small generator for this pattern follows below)
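For concreteness, here is a sketch that generates method 2’s script, where <sig> and <pk> stand in for the actual 72- and 34-byte pushes:

def altstack_reuse_script(n_sigops: int) -> str:
    # The first sigop parks the public key on the alt stack...
    ops = ["<sig>", "<pk>", "DUP", "TOALTSTACK", "CHECKSIG"]
    # ...and every later sigop retrieves it, re-parks it, and checks.
    for _ in range(n_sigops - 1):
        ops += ["<sig>", "FROMALTSTACK", "DUP", "TOALTSTACK", "CHECKSIG"]
    return " ".join(ops)

print(altstack_reuse_script(3))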
As it happens, with the 201 op code limit in place, neither of these methods achieves greater sigop density than simply repeating <sig> <pubkey> CHECKSIG over and over.
You can fit in 67 sigops (67 * 3 = 201 op codes) before you hit the limit, which consumes about 67 * 107 = 7169 bytes.
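The arithmetic, as a quick sanity check (byte sizes are the averages given above, and data pushes count as op codes per this post’s convention):

SIG_PUSH = 1 + 71       # push op code + ~71-byte signature (avg)
PUBKEY_PUSH = 1 + 33    # push op code + 33-byte compressed public key
CHECKSIG = 1            # OP_CHECKSIG itself

OPS_PER_SIGOP = 3
BYTES_PER_SIGOP = SIG_PUSH + PUBKEY_PUSH + CHECKSIG      # 107 bytes

MAX_OPS = 201
old_max_sigops = MAX_OPS // OPS_PER_SIGOP                # 67 sigops
print(old_max_sigops, old_max_sigops * BYTES_PER_SIGOP)  # 67, 7169 bytes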

The new worst case attack

With the 201 op code limit per script raised, we are now free to use the second alternative method, since we are no longer worried about the higher number of op codes, only the number of bytes.
In this case, each repetition uses about 76 bytes, and in the same 7169 bytes you can fit 94 sigops. This equates to an increase in maximum sigop density of (94/67 - 1) * 100% = 40%.
Interestingly, removing the 10000 byte limit as well would have little effect on this sigop density. It costs only an extra 34 bytes or so to set this up, which contributes less than 1% to the efficiency of this attack.
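Putting the numbers together (a sketch using the figures above; the fixed setup overhead is ignored here, as it amounts to less than 1%):

# The altstack-reuse unit: <sig> FROMALTSTACK DUP TOALTSTACK CHECKSIG
ALT_UNIT_BYTES = (1 + 71) + 1 + 1 + 1 + 1    # 76 bytes, 5 op codes

BYTE_BUDGET = 7169          # same byte count as the old worst case
OLD_MAX_SIGOPS = 67

new_max_sigops = BYTE_BUDGET // ALT_UNIT_BYTES            # 94 sigops
increase = (new_max_sigops / OLD_MAX_SIGOPS - 1) * 100    # ~40%
print(new_max_sigops, f"{increase:.0f}% more sigop density")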
So to sum up: in the absolute worst case of someone constructing and mining a full attack block, we have a computational cost 40% higher than the previous maximum. Nothing to sneeze at, but in a system that is supposed to be resilient to peak loads, also not the end of the world.

Defense against the dark arts

Given that we haven’t seen this attack carried out in Bitcoin’s lifetime, and particularly not in the year that Bitcoin Cash has existed with a plethora of hostile actors arrayed against it, it is questionable whether this is something we even need to worry about. However, it pays to be prepared, so what would be the impact, and what could we do to mitigate it?
The first mitigation is obvious: know your peak capacity. As explained in my previous post about block size, Bitcoin SV is firing up our own version of the Gigablock Testnet (more than one of them, in fact) and we will be performing stress tests of the SV client (and other clients) on them. This will include simulating this attack, and work on the attack scripts is already underway. Capacity information will be made available to miners and will also be used to inform our development priorities. The impact of this change is likely limited to increasing block validation times by a maximum of 40%, and only in a full-blown (expensive) attack scenario.
Should it turn out that lifting this op code limit per script creates a viable attack, then of course we will revisit it. For miners, a simple mitigation tool already exists: choose a block size hard cap that results in acceptable validation times.
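As a purely hypothetical illustration of that tool (every number below is made up; real inputs would come from the stress-test metrics described above):

# Hypothetical worst-case validation rate measured under the attack scenario:
worst_case_seconds_per_mb = 2.5
# The longest a miner is willing to spend validating a competitor's block:
acceptable_validation_seconds = 30.0

hard_cap_mb = acceptable_validation_seconds / worst_case_seconds_per_mb
print(f"accept blocks up to {hard_cap_mb:.0f} MB")   # 12 MB in this example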
An additional defensive measure is parallel block validation (so that a miner can switch to a competing block that validates more quickly if one comes in), which we’ll prioritise based on miner demand for the feature.
The real mitigation is improving the speed of signature verification, which is one of the first priorities on our roadmap.
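One common approach is to spread a block’s signature checks across worker processes. The sketch below again assumes the pure-Python ecdsa package; production clients do this with optimised C libraries such as libsecp256k1:

from concurrent.futures import ProcessPoolExecutor
from ecdsa import SigningKey, VerifyingKey, SECP256k1

def make_batch(n: int):
    # Build n (pubkey, signature, message) triples to stand in for
    # the sigops in a block.
    sk = SigningKey.generate(curve=SECP256k1)
    msg = b"stand-in for a transaction digest"
    return [(sk.verifying_key.to_pem(), sk.sign(msg), msg) for _ in range(n)]

def verify_one(item) -> bool:
    pem, sig, msg = item
    return VerifyingKey.from_pem(pem).verify(sig, msg)

if __name__ == "__main__":
    batch = make_batch(64)
    with ProcessPoolExecutor() as pool:
        # All checks must pass for the block to be valid.
        print(all(pool.map(verify_one, batch)))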

Final thoughts: to risk or not to risk?

If Bitcoin is going to grow its capacity, there is no choice but to embrace the increased loads this will place on network, storage and computational infrastructure. Our challenge is not to avoid this load but to take it on safely, with a professional enterprise approach to capacity management and testing. This is what has been requested of us by the miners that commissioned the SV project and is precisely what we have committed to do. Additional sigop load is a natural consequence of growing transaction volume, and significant gains can be made with efficiency increases. These are some of the low-hanging fruit which the Bitcoin SV team will be focussing on early. In my previous post I mentioned several items on our roadmap directly related to accelerating signature verification, but there is a natural limit to efficiency gains, after which we must turn to horizontal scaling strategies.
In the end, these limits are not for the SV team to decide but for miners to choose, using their knowledge and accurate performance metrics to inform their decisions. It is our job to provide those metrics and to improve the tools miners use to exercise their ability to optimise the mining process.
