Verifying Rust: Exploring Verification Options for Substrate

February 20, 2023 · 7 min read

CTO Interlay

Security is hard to get right in blockchains. It's even harder to make sure that code is correct while still actively developing the product: code changes are frequent, requirements are changing, and the codebase is growing.

In this post, we'll explore options for verifying blockchain code, specifically for the Substrate blockchain framework to try to verify the code of the Interlay blockchain.

My requirements for the test were simple: the tool should be integrated into the Rust toolchain (no DSLs) and I set myself a time limit of a single day to get started and get a useful result.

Spoiler alert: I managed to get two tools running and produce output. This post covers only the setup and I will follow up with more details in a future post.

Substrate Runtimes as Verification Targets

We are building the Interlay and Kintsugi blockchains based on the Rust Substrate framework. For verifying that the "code is correct", we are looking for two targets:

Runtime: The WASM runtime encodes the rules of the blockchain subject to consensus verification. This is where we can potentially introduce logical errors in how the protocols are working or in their implementation such that we either impact the safety or liveness of the system.
Node Implementation: The node implementation wraps around the runtime. Bugs introduced here can impact the liveness of the system.

Diving into Verification

Software verification is most often quite an academic exercise. With the adverse environment of blockchain development, there's a clear benefit of increased security assurance in the software development process. This goes hand in hand with the desire of the verification community to integrate verification into the development process.

I'll not cover the usual testing and fuzzing tools but rather explore the three tools I found most accessible for verification in Rust.

My criteria for selecting tools were:

Actively maintained: The tool should be actively maintained and have a community around it. I checked the GitHub repositories for current contributions and when the last commit was made.
Clear documentation: The tool should have clear documentation on how to use it and get started. If after 10 minutes of parsing the available documentation, I could not get a good understanding of how to install or use the tool, I would not use it.
Rust integration: Since the idea is to integrate the tool into the normal development workflow, I was looking for a tool that integrates well with cargo and also does not require writing verification code in a language other than Rust.

Selected Tools

From the tools I found, I selected the following three tools for further exploration:

More tools are available on the Rust Formal Methods Interest Group website and on Alastair Reid's blog.

For all tools, I installed them and ran them without any additional configuration against the chain implementation of the Interlay blockchain.

A Note on Expectations

I would be happy if a tool runs on the first try: installing the tool without issues and scanning the code base without setting up verification rules. Overall, the code is well-tested, but it is very complex: more than 1,000 dependencies, almost 100,000 lines of code, and invoking external C libraries.

kani

From the developer's website:

Kani is an open-source verification tool that uses model checking to analyze Rust programs. Kani is particularly useful for verifying unsafe code in Rust, where many of the Rust’s usual guarantees are no longer checked by the compiler.

Getting started

I found getting started very easy with just two commands to execute:

cargo install --locked kani-verifier
cargo kani setup

Next, I ran kani against the runtime of the Interlay blockchain:

# From the interbtc repo root
cargo kani

Compilation Errors

That produced the following error during the compilation of the schnorrkel crate:

error: format argument must be a string literal
   --> /home/nud3l/.cargo/registry/src/github.com-1ecc6299db9ec823/schnorrkel-0.9.1/src/batch.rs:165:47
    |
165 |     assert!(hrams.len() == public_keys.len(), ASSERT_MESSAGE);
    |                                               ^^^^^^^^^^^^^^
    |
help: you might be missing a string literal to format with
    |
165 |     assert!(hrams.len() == public_keys.len(), "{}", ASSERT_MESSAGE);
    |                                               +++++

At this point, I wasn't sure how to convince kani to ignore these errors in the dependencies. I tried running kani only on a single crate but it resulted in the same compiler issue.

The authors have a guide on how to get started with real code but it does not include how to handle compiler errors in dependencies.

Prusti

From the developer's website:

Prusti is a verification tool for Rust programs. It is based on the Rust compiler and uses the Rust type system to verify Rust programs. Prusti is a research prototype and is not yet ready for production use.

Getting started

Getting started with Prusti required downloading the Prusti Assistant VS Code extension. I already had the required Java SDk and rustup versions installed so the process of getting started only involved opening a rust file and hitting the "Verify with Prusti" button on the VS Code status bar.

Prusti

As the first target, the fee pallet seemed interesting as it's mildly complex, has few dependencies, and with its fixed point math might be subject to issues.

Errors

On the first run on the fee crate, Prusti found 358 errors. That seemed quite a lot but after initial inspection, most of the errors were:

Unsupported features (313 errors): I was expecting Prusti with its current feature set to run into these issues as substrate makes heavy use of macros, traits, and other advanced features of Rust.
Internal errors (24 errors): Several internal errors occured.
Unexpected verification error (9 errors): Some verification failed.
Verification errors (14 errors): These seem to be the ones worth investigating.

Success

Prusti found potential overflow and underflow errors in the bitcoin crate:

// bitcoin/src/parser.rs
if position + 4 > raw_bytes.len() {
    return Err(Error::EndOfFile);
}

It also found possible issues with unbounded arrays:

// bitcoin/src/script.rs
pub fn op_return(return_content: &[u8]) -> Script {
    let mut script = Script::new();
    script.append(OpCode::OpReturn);
    script.append(return_content.len() as u8);
    script.append(return_content);
    script
}

MIRAI

From the developer's website:

MIRAI is an abstract interpreter for the Rust compiler's mid-level intermediate representation (MIR). It is intended to become a widely used static analysis tool for Rust.

Getting started

Installation of MIRAI was straightforward, following the guide:

git clone https://github.com/facebookexperimental/MIRAI.git
cd MIRAI
cargo install --locked --path ./checker

Next, I ran MIRAI in the interbtc root directory.

cargo mirai

Compilation Errors

MIRAI produces compilation errors on the wasm builds:

     Compiling wasm-test v1.0.0 (/tmp/.tmpfNz4QS)
  error[E0463]: can't find crate for `std`
    |
    = note: the `wasm32-unknown-unknown` target may not be installed
    = help: consider downloading the target with `rustup target add wasm32-unknown-unknown`
    = help: consider building the standard library from source with `cargo build -Zbuild-std`

  error: requires `sized` lang_item

  For more information about this error, try `rustc --explain E0463`.
  error: could not compile `wasm-test` due to 2 previous errors
  warning: build failed, waiting for other jobs to finish...
  error: cannot find macro `println` in this scope
   --> src/main.rs:3:5
    |
  3 |                 println!("{}", env!("RUSTC_VERSION"));
    |                 ^^^^^^^

  error: could not compile `wasm-test` due to 3 previous errors
  ------------------------------------------------------------

Success

As the wasm build is done on the entire runtime, I decided to try my luck against a single pallet instead. Similar to before, I tried the fee crate. This worked and to my surprise, MIRAI did not print any warnings or errors.

Next, I tried the bitcoin create that implements parsing and other somewhat error prone code. MIRAI gave me results here:

warning: possible attempt to subtract with overflow
   --> crates/bitcoin/src/parser.rs:239:24
    |
239 |     let target: U256 = parser.parse()?;
    |                        ^^^^^^^^^^^^^^
    |
note: related location
   --> crates/bitcoin/src/parser.rs:177:40
    |
177 |         let (result, bytes_consumed) = T::parse(&self.raw_bytes, self.position)?;
    |                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: related location
   --> crates/bitcoin/src/parser.rs:151:23
    |
151 |         let compact = U256::set_compact(bits).ok_or(Error::InvalidCompact)?;
    |                       ^^^^^^^^^^^^^^^^^^^^^^^
note: related location
   --> crates/bitcoin/src/math.rs:53:25
    |
53  |             word << 8 * (size - 3)
    |                         ^^^^^^^^^^

Summary

Overall, I was very happy with the results. I was able to run two tools against the crates without additional configuration and they found potential issues. Next up will be trying to implement custom verification rules and dive deeper into the identified issues.

Lean Execution: How to Stay Focused

February 21, 2022 · 4 min read

Dominik Harz

CTO Interlay

When building a company and shipping a product, it's both critical and extremely hard to stay focused. Much has been written about this subject before. I'm sharing here which three questions I'm trying to ask myself to help to stay lean.

Why do we need this feature?

Most of the time, you or a colleague got a great idea to improve the product or the company. Especially, when you have not found product-market fit yet or the company is growing, you might feel the urge to add things, change things, and generally make sure you are set up for success. However, I find myself often in discussions bout how to implement said change/feature instead of why? Before going into the actual implementation, I try to understand two critical aspects:

What benefit does this change bring to our users? This can be the users of our product, users of internal or external processes, or anyone that stands to benefit from this change. It's helpful to classify the using T-shirt sizes from S to XL and later go into a more concrete assessment (what % of our user base profits from this, are these the critical users, ...).
Does this support our company goals in the mid/long-term? Sometimes a new feature can be beneficial to a user, but might not be in the long-term strategical interest of the company. I like to also use the T-shirt sizes here to gauge this question and later analyze this in more detail.

When do we need this feature?

Before talking about how to implement a feature, it's critical to understand where it fits in the roadmap. If you anyway decided that the feature does not add any benefit (see question 1), it's likely not worthwhile to specify a concrete timeline but rather give it some sort of medium or low priority on the backlog. However, if it's an important feature, the next step is to understand the dependencies of where it fits in the roadmap.

Some key questions I like to think about:

Does the feature require another feature to be launched first to make it useful? Understand if there is a required sequence of steps.
Can we launch without the feature? Is the feature critical and we cannot ship to the user without it. Or would it be great to have at launch but not critical.

How should we launch the feature?

After understanding the added value of the feature and its order in the timeline, the next step is to understand how the feature should be launched. Sometimes this process is overlooked but I find it helpful to acknowlegdge that you can and often should simplify features to their core value proposition. In crypto, the first iteration of Uniswap was simple but it worked. MakerDAO launched with a single collateral asset and it worked as well. Identifying the critical feature set is a hard exercise and requires an excellent understanding of user demand and competitor position.

I'm an engineer and researcher at heart and we tend to often strive for a perfect, over-engineered solution that covers every edge case. It's even worse coming from a security-focused background where your first step is to ask: How do we break this protocol?

In the product world, I found it often acceptable to tolerate limited sub-optimal edge cases in the product, acknowledge them, and fix them later. The point here is that you want to get a product out at the right time. And more often than not, getting the feature out with its essential benefit and have it "good enough" saves significant effort.

This last point I find often overlooked. Additional complexity in the feature costs us often 3x more time investment down the line since it's not just the design that is more complicated. It's also the implementation, the documentation, and the testing that are proportionally more involved.

Stealing All of Maker's Collateral

February 20, 2020 · 9 min read

Dominik Harz

CTO Interlay

Or why Oracle Vulnerabilities are not the Worst Application of Flash Loans.

tl;dr

You can potentially steal all of Maker’s collateral ($700m) and issue arbitrary amounts of new Dai using flash loans if no delay for new governance contracts is introduced.
Anyone can execute the attack and only needs to pay the transaction fees (few $) without holding any MKR.
If Maker does not introduce a delay until the liquidity pools for flash loan pools gets above a threshold, there is little chance to prevent the attack (race condition).
We have reached out to Maker February 8, 2020 about our research and had a call with them on February 14, 2020 to discuss our findings. Maker is aware of the attack vector and is holding a vote this Friday 12pm PST that would prevent the attack.

Introduction

Maker and its Dai stablecoin is the most popular so-called decentralized finance (DeFi) project on Ethereum with around of $700m locked in its smart contracts. The Maker protocol relies on a governance process encoded in a smart contract. MKR token holders can vote to replace an existing governance contract. Votes are proportional to the amount of MKR. The total number of MKR tokens is around 987,530 where selected wallets or contracts hold large quantities of the token:

[Maker Governance Contract: 192,910 MKR]
Maker Foundation: 117,993 MKR
a16z: 60,000 MKR
0xfc7e22c6afa3ebb723bdde26d6ab3783aab9726b: 51,291 MKR
0x000be27f560fef0253cac4da8411611184356549: 39,645 MKR

Note: the Maker Governance Contract contains MKR tokens of multiple parties.

Governance Attack

In a post from December 2019, Micah Zoltu pointed out how to attack the Maker governance contract. The basic idea is to accumulate enough MKR tokens to replace the existing governance contract with the attackers, malicious, governance contract. The malicious governance contract is then able to give the attacker full control over the system and withdraw any collateral in the system as well as create arbitrary amounts of new Dai.

To reduce the number of required MKR tokens, he proposed to execute the attack when new governance contracts are being voted on. At the moment, 192,910 MKR tokens are locked in the governance contract. However, an attacker would need less tokens if, say two or three contracts, would be voted on in parallel with similar token distributions. This frequently happened in the past as we can observe in the graph below:

The obvious attack strategy would be to crowd-fund the required MKR tokens through a smart contract and pay each attacker a share of the of the winnings. However, the attackers need to amass probably around 50k MKR tokens to have a chance to attack the system without Maker noticing these movements. We do show a strategy for this in our latest paper.

A Brave New Attack Strategy: Flash Loans

However, we can completely lift the requirement to amass MKR tokens if we consider using flash loans instead. Flash loans are a fairly new concept and therefore we give a brief explanation here. Normally, an agent has to provide collateral to take out a loan in the DeFi space. For example, in Maker, Alice can borrow Dai by depositing ETH. This is required since we operate under a model of weak identities and economically rational agents.

A flash loan lifts this requirement as it only takes place in a single transaction:

Alice takes out the loan from a flash loan liquidity provider (e.g. Aave or dYdX)
Alice executes some actions (e.g. arbitrage trading on Uniswap, Fulcrum, Kyber etc.)
Alice pays back the flash loan with interest

A flash loan works in three steps within a single transaction.

The flash loan works because of how the Ethereum Virtual Machine is designed: if at any point during this transaction the flash loan fails, the whole transaction is reverted. Hence, Alice can take the loan risk free, i.e. if she cannot pay it back it would be like she would have never taken it. The liquidity providers are also winning: they have only lent their funds if Alice was able to pay them back.

Arbitrage and Oracle Manipulation with Flash Loans

On February 14 and February 18, two incidents involving flash loans occurred that caused bZx to halt its platform. In the first transaction, a single flash loan was able to make a profit of 1,193 ETH (approx. $298,250). This transaction was executed using a smart contract that opened a short position on Fulcrum on wBTC. In the same transaction, the transaction took out a wBTC loan on Compound and executed a trade of wBTC on Kyber’s Uniswap reserve resulting in slippage ultimately bringing down the price on Fulcrum as well. The full details can be found in the post-mortem by bZx and a detailed analysis of the involved calls by PeckShield.

Similarly, a second incident occurred on February 18 where in a single transaction a profit of 2,378 ETH (approx. $600,000) was made. This transaction involved an initial borrowing of 7,500 ETH to take a long position on Synthetix' sUSD. More details are found in this informal analysis.

Oracle Manipulation to Reduce Required Liquidity

For our attack, we assume that 50k MKR are sufficient. Even if in practice the number of tokens could be higher, we point out how the concept of flash loans makes securing Maker, without a governance delay, hard. In a naive approach, the attacker could take out a flash loan to buy 50k MKR tokens.

At current exchange rates, the attacker needs around 485,000 ETH to buy that amount of MKR as only a single exchange, Kyber, has enough volume available. However, the attacker can also leverage multiple exchanges and buy 38k MKR from Kyber, 11.5k from Uniswap, and 500 MKR from Switcheo for a total of 378,940 ETH. That number is still high but already a reduction by almost 100,000 ETH!

An attacker can use the oracle manipulation strategies above to effectively bring down the price of MKR on Kyber and Uniswap. These are the two largest providers of MKR and have shown to be vulnerable to oracle price manipulation. Further analysis is required to determine how much the MKR price can be reduced. However, on a less liquid token like wBTC an attacker was able manipulate the exchange rate by around 285%.

Obtaining enough Liquidity

Even with oracle manipulation, a large number of ETH is required to execute the attack on Maker. However, an attacker can increase its liquidity by taking out two flash loans within the same transaction. Aave and dYdX protect themselves against reentrancy and allow only a single flash loan within a single transaction. But the attacker can borrow ETH from both protocols within the same transaction.

Thus, as of February 18, the attacker has a pool of around 90k ETH on dYdX plus 17k ETH on Aave available. Hence, at current liquidity rates an attacker could take out a combined ~107k ETH loan from both dYdX and Aave to try to manipulate the MKR token price and obtain enough MKR tokens to replace the current Maker governance contract with its own.

For this to succeed the attacker must be able to reduce average MKR price by at least 3.54 times. Alternatively, the attacker can also wait for dYdX and Aave to increase its liquidity pools. With the current growth rate of around 5% of both protocols, it does not seem unlikely that the attack becomes possible within two months.

Combination?

Obviously, it is possible to combine the crowd funding approach and the flash loan together. Using the available liquidity of ~107k ETH, it is possible to obtain around 10,800 MKR from Kyber. This would allow multiple attackers to reduce the required amount of pooling together 50k MKR to just around 39.2k MKR. It seems also that some people are indeed interested in such an attack as this informal Twitter poll shows:

It can also be noted that the top four account holders (in fact five, but not considering the current Maker Governance Contract) are able to execute the attack without crowdfunding.

No Time to Wait

Once enough liquidity is available through the flash loan lending pools (with or without the combination of using oracle manipulation), anyone is able to take over the Maker governance contract. Once Maker starts a vote when the liquidity pools have reached that threshold, Maker needs to ensure that MKR tokens are as little distributed as possible. If the distribution of MKR at any point in this voting procedure allows for an exploit of the vulnerability, any collateral can be taken away.

That attacker would be able to steal $700m worth of ETH collateral and be able to print new Dai at will. This attack would spread throughout the whole DeFi space as Dai is used as backing collateral in other protocols. Further, the attacker can use his Dai to trade other currencies worth around $230m. We give a more detailed analysis in our paper.

Counter measure

Maker should put a new governance contract in place that prevents flash loans from attacking their system. Specifically, a new governance contract should be able to be checked by the Maker Foundation for malicious code and give enough time to react. At the bare minimum, a new governance contract should not be able to go live within a single transaction. That way, the attacker can likely not profit from the attack and thus not repay the flash loan. If the attacker cannot repay the flash loan, the attack is like it would have never happened.

Maker is putting such a contract to vote on Friday, 12pm PST, February 21, 2020. The proposed contract would activate the Governance Security Module (GSM) and prevent such a flash loan attack.

Analysing Ethereum contract gas costs during development

November 6, 2018 · 5 min read

Dominik Harz

CTO Interlay

Updating the state of a smart contract in Ethereum costs money. In this post I will go a bit into detail why this is necessary in Ethereum and how to check easily (with truffle test) your gas costs during local development. For this we will use a little pet example, you can use for your own deployments.

Gas cost - why?

Ethereum smart contracts are deployed at an address in the Ethereum network. They are identifiable by the fact that the code field in the address is occupied by the EVM bytecode. One example is the Maker Dai contract. Any contract in Ethereum is deployed that way and implements a range of functions that can update its own state and the state of the blockchain.

The own state is defined within the contract and refers to any contract wide variables you define like uint, mapping, or string. The global state of the network is affected when you make a transaction to a contract, which, if everything goes right, will include one more transaction in the ever-growing list of transactions.

Everytime somebody invokes a state update (i.e. sends a transaction), the network reaches consensus over this operation. If 50% + 1 decide that the state transition is valid, the network state gets updated. However, this means that if you have a function that does 5 + 4 and stores the result to a contract variable, every node in the network need to verify that the result is in fact 9. Assuming that Ethereum has more than 12,000 nodes, this is quite costly.

Moreover, assume that any of these operations are free. One could simply create a function that runs indefinitely to block all nodes verifying the network. This is the equivalent to a distributed denial-of-service attack. It would be nice if we would know in advance if a function terminates. However, the Halting problem prevents us from knowing this. Hence, we have to charge partial function execution and forcefully terminate execution when it becomes to burdensome for the network.

Practical implications

Operations executed by the EVM cost gas depending how "heavy" they are on the network. Gas is a way of determining the cost for an operation. The current gas price on the other hand determines how much 1 gas costs in Ether. That way, the gas cost can be adjusted depending on the network load. For example, while the gas cost for a single operation is constant, the gas price can go up when the network is congested. An overview of the gas costs are found in appendix G of the yellow paper.

Local execution and global execution

Any operation on a smart contract costs gas. It is important to note a difference though. Calls are only executed locally. So for example the function below measures its gas, but the you will not be charged executing such a function as it is a read-only function.

function balanceOf(address owner) public view returns (uint256) {
    return _balances[owner];
}

However, transactions will deduct Ether from your balance as those are executed by the global network and miners need to have an incentive to include your transaction in a block. So the function below is from an ERC20 and executes a transfer. This updates the contract state, and thus requires a transaction.

function _transfer(address from, address to, uint256 value) internal {
    require(value <= _balances[from]);
    require(to != address(0));

    _balances[from] = _balances[from].sub(value);
    _balances[to] = _balances[to].add(value);
    emit Transfer(from, to, value);
}

Measuring gas costs

I will assume here that you are familiar with Truffle and Ganache. When you develop your new great smart contracts, I like to create end-to-end tests that also check for gas costs.

Assuming your project folder looks something like this:

contracts/
migrations/
test/

I am creating a folder mkdir experiments where I will store the number of transactions each of my interactions require and the amount of gas that this operation is using.

Next, I'm creating a test script demo.js in the test folder. I like to store my experiments in a CSV format so I can import it to other tools. First, the required imports:

var fs = require("fs");
var csvWriter = require('csv-write-stream');
var writer = csvWriter();

// your contract
const ERC20 = artifacts.require("./ERC20.sol");

You will write your tests based on the Truffle/Mocha/Chai interface. For this, I'm creating the variables I am using to store the measurements. Also, I am using the before construct to setup the CSV writer. During tests I would update the gas and number of transaction related counters.

After the tests are finished, the data is written to a CSV file. To USD conversion uses a helper function.

contract('ERC20', async (accounts) => {
    // gas price conversion
    const gas_limit = 7988288;

    // experiment related vars
    var transfer_success_gas = 0;
    var transfer_fail_gas = 0;
    var transfer_success_tx = 0;
    var transfer_fail_tx = 0;

    // any other const or vars
    // ...

    before('Create writer for experiments', async () => {
        writer.pipe(fs.createWriteStream(('./experiments/ERC20.csv')));
    })

    after('Write experiment data to file', async () => {
        let transfer_success_usd = convertToUsd(transfer_success_gas);
        let transfer_fail_usd = convertToUsd(transfer_fail_gas);

        writer.write(
                {
                    Transfer: transfer_success_gas,
                    TransferFail: transfer_fail_gas,
                });
        writer.write(
            {
                Transfer: transfer_success_usd,
                TransferFail: transfer_fail_usd,
            });
        writer.write(
            {
                Transfer: transfer_success_tx,
                TransferFail: transfer_fail_tx,
            });
        writer.end();
    })

    // your tests
    // it("Transfer Success")
    // it("Transfer Fail")
})

The USD conversion function works like this. I have this in a separate file called helpers.js that I export/import as a module.

convertToUsd: function (gasCost) {
    // gas price conversion
    const gas_price = web3.toWei(5, "gwei");
    const eth_usd = 200; // USD

    return gasCost * web3.fromWei(gas_price, "ether") * eth_usd;
}

That's it. You can run the experiment with truffle test path/to/the/file.js.

Analyzing Bitcoin smart contracts from a mechanism design perspective

September 12, 2018 · 11 min read

Dominik Harz

CTO Interlay

Contracts can be used to enforce agreements between entities. To this extent, smart contracts have been proposed by Nick Szabo and implemented for example in Bitcoin. This article covers the basics of mechanism design of smart contracts in the context of Bitcoin. Mechanism design is concerned with creating or encoding preferences in an agreement. Hence, an author of a smart contract can create a mechanism that enforces certain behaviour of the agents (or humans) interacting with the contract. Essentially, the author of the contract wants to reward desired behaviours and punish undesired behaviours. This is widely used in cryptocurrency protocols including the Lightning Network or Ethereum-based protocols such as Casper or TrueBit.

Use Case: Agents having access to sensors could sell the sensor data they are collecting to agents willing to buy this data. This represents a simple contract where a resource is exchanged for a price. Bitcoin in combination with payment channels secure the payments and allow for micro-payments. Thus, one has to design protocols integrating with these existing technologies to achieve for example secure decentralised data exchanges.

Bitcoin and smart contracts

Nakamoto introduced Bitcoin as a way to send money between peers without trusted third parties [2]. In Nakamoto consensus the heaviest chain is considered having correct transactions as most miners have contributed to this specific chain by solving a proof of work (PoW) puzzle. Assuming that the majority of players in the system are honest (i.e. $50\% + 1$ ), the heaviest chain represents the "true" state of the network.

Bitcoin also included the capability of executing smart contracts with Bitcoin Script [3]. A smart contract is a piece of software which formalises a relationship between entities much like a legal contract [4]. However, the idea is to use secure computer protocols and include game theoretic elements to cover aspects that cannot be adequately verified in the protocol. Szabo argues that smart contracts cover interactions including search, negotiation, commitment, performance, and adjudication.

Rational agents

Agents in decentralised systems are self-interested [5]. Hence, protocols facilitating (economic) interactions between agents need to account for autonomous agents to behave rationally. A set of agents $P = \{1 ... n\}$ is considered to be rational if they seek maximise their utility $u_i$ depending on some function [5] [6] [1]. This is especially relevant in decentralised systems, where we have to assume agents act only in their interest. Agents can apply multiple strategies $s \in S$ within a negotiation [1]. We have to assume that agents will execute actions $\theta \in \Theta$ based on $s$ that optimises their utility including cheating, lying, breaking contracts, and hiding their intentions. However, we also assume that agents that interact can find an outcome $\omega \in \Omega$ that increases or optimises $u_i$ .

Mechanism design

Mechanism design is used to encode preferences in an agreement. Essentially, a "designer" defines a preferred social choice which, under the assumption of rational agents, the mechanism is intended to favour [6]. Considering a contract $C$ as an agreement between agents, $C$ implements a mechanism $M = (\Sigma, g)$ [7]. The outcome of the mechanism $g$ depends on the actions of agents $\Sigma$ . $M$ implements a rule $f$ . Rational agents would only enter $C$ or follow $C$ if $f$ is the result of the dominant strategies or Nash equilibrium of the agents.

An agent will have multiple options to enter contracts with other agents. To ensure that the proposed contract $C$ is chosen, we want to have it incentive compatible. Loosely speaking incentive compatibility of a mechanism occurs when the utility $u_i$ of each agent $i$ is maximised when an agent is "truth-telling" (i.e. the dominant strategy). In games without money, non-trivial choices (e.g. more than two contracts, multiple potential agents, multiple different terms) lead either to incentive incompatibility, or the social choice function is a dictatorship according to the Gibbard--Satterthwaite theorem. However, in Bitcoin, we can construct games with money.

Games in Bitcoin

In games with money, a Vickery-Clarke-Groves (VCG) mechanism maximises social welfare. A VCG mechanism is incentive compatible. Hence an agent strives to implement a contract as such a mechanism. To construct such games the mechanism the Clarke pivot rule can be applied [6]. Agents in Bitcoin are playing a game with strict incomplete information. One could argue that an agent could potentially analyse actions of an agent as transactions are public. However, as agents might take multiple identities (Sybil), this is a non-trivial task, and thus, we will not consider it in this case. We further assume that agents have independent private values, i.e. resources subject to trade do not depend on the other agents.

Such a game can be defined as follows [6]:

A set of players $P = \{1, .., n\}$ exists.
Every player $i$ has a set of actions $\Sigma_i$ .
Every player $i$ has a set of types $T_i$ . A value $t_i \in T_i$ is a private input to the player.
Every player $i$ has a utility function $u_i$ . The utility function depends on its private input and the actions of all players $u_i = (t_i, \sigma_1, .., \sigma_n)$ .

Hence, a player must choose an action only based on his private input but without knowing the private inputs of other agents. A strategy $s$ describes an agent's decision to execute a specific action based on a private input. The agent will choose $s$ that is the best response to any $\sigma$ by other agents.

However, an agent needs to optimise a specific social choice. In Bitcoin, we are not concerned with social welfare as a social choice as this is unknown to the agents and they are not naturally interested in the utility of others. In the case that only two agents are involved and we have a simple contract concerning one resource, single-parameter domains can be applied [6]. In such domains, the valuation $v_i$ of an agent depends only on one parameter, say obtaining the resource. Moreover, the price an agent is willing to pay is already included in the incentive mechanism.

Yet, this would only hold for the case where two agents compete for obtaining a single resource. In more complex cases, for example, adding the quality of the data, we need to apply a more complex social choice function. In these cases, variations of VCG are the only incentive compatible mechanisms [6].

Assumptions

Using the resource exchange formalisation described in [1] the following tuple defines the exchange setting:

$(P,\mathcal{Z}, (v_{i})_{i \in P})$

We are assuming that two agents exist, i.e. two players $P = \{A, B\}$ . Those agents are exchanging a purely digital resource for a specific price. The agents negotiate over a single resource $z \in \mathcal{Z}$ . Moreover, they use a valuation function $v_i : 2^{\mathcal{Z}} \to \mathbb{R}$ . Agent $A$ hence defines the value $v_A(z)$ individually as well as agent $B$ in $v_B(z)$ .

This valuation expresses the price an agent is willing to exchange the resource for, whereby we assume that $v_A \leq v_B$ if $A$ is offering the resource and $B$ , is willing to pay for it. Otherwise, there would be no price for them agree on. In this simple case, we assume that the agents follow the negotiation protocol truthfully and that their utility $U$ increases when making a deal and exchanging the resource. We further assume that the agents have a way to enter such an agreement, i.e. one of them can prepare a contract that the other one can interpret and willing to commit to.

Contract

The contract $C$ is implemented as an Hashed Timelock Contract (HTLC) [8] with multiple transactions in the Lightning Network. Note that $A$ sells data $z$ , and $B$ buys it for the agreed price $p$ in Satoshi. To allow agents to exchange $z$ and atomic swap is used. This protocol has been described for cross-chain transactions but is equally useful within a single chain [9].

$A$ stores $z$ on IPFS receiving $H(z)$ . $A$ uses this as the secret for the HTLC and sends this to $B$ . As $A$ takes the Merkle root of $z$ , the data can be of (almost) any size.
$B$ prepares a transaction transferring $p$ to $A$ spendable with the signature of $A$ . Also, $B$ includes the spending condition $H(input) == H(z)$ based on the IPFS hash of the file. $B$ is also setting a time $t$ after which the transaction must be spent, otherwise, $B$ can redeem the locked funds. Last, $B$ sends the transaction to $A$ to sign it.
$A$ signs the transaction to agree to the trade and commits it to the channel.
$A$ reveals the $H(z)$ to issue the payment, which gives $B$ $H(z)$ . $B$ obtains $z$ through IPFS.

This contract allowed an atomic swap of the file without the need to upload the file to Bitcoin. It requires both agents to be online which is a reasonable assumption for autonomous agents. The protocol does not handle the security of the file. Any party observing $H(z)$ can access $z$ without paying any price. Hence, the protocol could be extended using GPG or other asymmetric encryption schemes. In that case, $A$ could take $B$ 's public RSA key to encrypt $z$ and then store it on IPFS. This would allow private trading of the file.

Mechanism analysis

Players: $A$ and $B$ are the only players in the game. They have the same capabilities and pre-contract the same norms. Post-contract one agent has the obligation to pay for $z$ and the other has an obligation to provide $z$ . The contract is atomic. Hence, payment and delivery occur at the same time. However, there is a possible weakness in the contract. If $A$ reveals $H(z)$ without having the file stored on IPFS any more, $B$ will not be able to retrieve the file.

Types: Both players have a valuation $v(z)$ . $A$ has private access to $z$ .

Strategy: $A$ has three actions: not reveal $H(z)$ , reveal $H(z)$ and have $z$ accessible, and reveal $H(z)$ and have $z$ inaccessible. $B$ has two actions: prepare HTLC with a payment $p$ for $A$ or not. Assuming $B$ 's valuation of $v_B(z) > 0$ , $B$ will propose a minimum payment. Since there is no proof of $z$ being available and the desired resource, $B$ has to account for this risk. The protocol could be improved by using concepts such as proof of replication [10]. $A$ will only consider revealing $H(z)$ in the proposed HTLC if $p \geq v_A(z)$ . Moreover, $A$ might cheat $B$ by not making $z$ available. In case $A$ expects future trades with $B$ , it has a motivation to actually provide $z$ . $B$ might promise a higher pay for future interactions and adjust his valuation function to the resource $z$ provided by $A$ to a higher value since $A$ is based on previous direct experience trusted.

Utility: Utilities depend on the combination of strategies.

$B$ proposes HTLC with payment $p$ and $A$ reveals $H(z)$ with $z$ available: $u_A = p - v_A(z)$ and $u_B = v_B(z) - p$ . Under the assumption that $A$ does not loose anything from selling $z$ .
$B$ proposes HTLC with payment $p$ and $A$ reveals $H(z)$ with $z$ unavailable: $u_A = p - v_A(z)$ and $u_B = 0 - p$ .
$B$ proposes HTLC with payment $p$ and $A$ does not reveal $H(z)$ : $u_A = 0$ and $u_B = 0$ .
$B$ does not propose HTLC: $u_A = 0$ and $u_B = 0$ .

Social choice: Since $B$ proposes the contract, $B$ can set the social choice function. In this case, a single parameter domain is useful. The utility analysis shows that $p$ has the condition $v_A(z) \leq p \leq v_B(z)$ . Since $B$ is not able to be sure that $z$ is available, $p$ will be low in the first tries as $B$ tries to manage his risk exposure. However, $B$ could argue that $A$ might be interested in future trades and include this as part of his utility function as an expected value. Hence, $B$ would only propose a contract if $p << v_B(z)$ since otherwise the risk is too high.

References

[1] S. Fatima, S. Kraus, and M. Wooldridge, Principles of Automated Negotiation. Cambridge: Cambridge University Press, 2014 [Online]. Available: http://ebooks.cambridge.org/ref/id/CBO9780511751691

[2] S. Nakamoto, "Bitcoin: A peer-to-peer electronic cash system," 2008.

[3] Bitcoin Wiki, "Script." 2018 [Online]. Available: https://en.bitcoin.it/wiki/Script{\#}Opcodes. [Accessed: 27-Jun-2018]

[4] N. Szabo, "Formalizing and Securing Relationships on Public Networks." 1997 [Online]. Available: http://ojphi.org/ojs/index.php/fm/article/view/548/469. [Accessed: 07-Apr-2017]

[5] T. W. Sandholm and V. R. Lesser, "Leveled Commitment Contracts and Strategic Breach," Games and Economic Behavior, vol. 35, nos. 1-2, pp. 212--270, 2001.

[6] N. Nisan, T. Roughgarden, E. Tardos, and V. V. Vazirani, Algorithmic Game Theory, vol. 1. Cambridge: Cambridge University Press, 2007, pp. 1--754 [Online]. Available: [http://portal.acm.org/citation.cfm?doid=1785414.1785439 http://ebooks.cambridge.org/ref/id/CBO9780511800481](http://portal.acm.org/citation.cfm?doid=1785414.1785439 http://ebooks.cambridge.org/ref/id/CBO9780511800481)

[7] N. Nisan and A. Ronen, "Algorithmic Mechanism Design," Games and Economic Behavior, vol. 35, nos. 1-2, pp. 166--196, Apr. 2001 [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S089982569990790X

[8] Bitcoin Wiki, "Hashed Timelock Contracts." 2018 [Online]. Available: https://en.bitcoin.it/wiki/Hashed{\_}Timelock{\_}Contracts. [Accessed: 28-Jun-2018]

[9] I. Bentov et al., "Tesseract: Real-Time Cryptocurrency Exchange using Trusted Hardware," 2017 [Online]. Available: http://www.cs.cornell.edu/{~}iddo/RTExchSGX.pdf

[10] A. Juels and B. Kaliski Jr., "Pors: Proofs of retrievability for large files," Proceedings of the ACM Conference on Computer and Communications Security, pp. 584--597, 2007 [Online]. Available: http://www.scopus.com/inward/record.url?eid=2-s2.0-74049101079{\&}partnerID=40{\&}md5=83cf075b3704d4fe5bfb2ccf38c39362

Integrating Python and Ethereum

February 14, 2017 · 5 min read

Dominik Harz

CTO Interlay

In Ethereum and other blockchains there are still a lot of proof of concept implementation and developers trying out how to cope with the new concepts. As part of the dInvest post series I was also looking into Ethereum and trying to implement a hedge fund in a blockchain. In a previous post I discussed how to get a quantitative framework in python up and running. In this post I will write how to integrate python programs with Ethereum smart contracts. For one reason or another you might be also faced with the issue, that although Ethereum offers a Turing-complete language not everything is actually doable there.

Let's say you have created one of the simple tutorial contracts in Ethereum and now want to look at something more advanced. I personally liked the Hitchhiker's Guide to Smart Contracts by Manuel Aráoz to get started with more complex code, setup testrpc, and truffle. Take a look at it.

dInvest smart contract

dInvest is composed of one smart contract that is responsible for making investments, verifying investment criteria and distribution of returns. The contract exposes public functions to create new investments and for withdrawal which will act as main functions of a hedge fund. Users of the hedge fund are identified by their Ethereum address which is equivalent for the public key. Suggestion of investment strategies and strategy execution are done in different agents that also have Ethereum addresses. These agents are set by the contract creator only. When a user is creating an investment it is possible to specify a list of industry sectors identified by a two digit number based on the Standard Industrial Classification codes. These sectors will be identified as a black list when making the investments. Therefore user have the ability control the sectors which the hedge fund will invest on.

The contract can be found in the GitHub repo.

Interaction with smart contracts

To interact with smart contracts, there are a couple of option including RPC or a JavaScript API. I found the easiest way to interact with Ethereum smart contracts from other programs (like python programs) was using their web3 JavaScript API. As the majority of dInvest is written in python, I wanted to stick to the language and not include JS as well. Luckily, there is a web3 implementation in python. To get it up and running for the dInvest setting I switched to the virtualenv, where I also installed zipline and then install web3 simply with pip install web3.

Using web3, there are three steps to get you up and running to interact with your smart contract:

Getting your ABI
Setup the RPC connection
Interact with the smart contract

In the next sections, I will go into detail how to achieve the three steps. I am using this mostly as a python module for other programs. In the end our python module structure might look like this:

contract
|-- __init__.py
|-- ContractHandler.py
|-- your-contract-name.json

Getting your ABI

Now, to interact with any smart contract you need the Application Binary Interface(ABI) defined by the contract. The ABI is a static, strongly typed interface. Whenever you create a new contract or change an existing one, chances are your ABI changes as well. In my experience the easiest way to get the current ABI of a smart contract (which might be yours or any contract you have the source code available) is to go to https://ethereum.github.io/browser-solidity/ and copy/paste your code there. Then press the "Compile" button on the upper right side and copy the entire string in the "Interface" field into a your-contract-name.json file. Once you have that JSON, your web3 interface will know how to interact with the contract.

Setting up the RPC provider

As a next step you will need to connect to the RPC provider. In your python file (e.g. ContractHandler.py) include those lines of code:

from web3 import Web3, TestRPCProvider

class ContractHandler:
  def __init__(self):
    self.web3 = Web3(RPCProvider(host='localhost', port='8545'))
    with open(str(path.join(dir_path, 'contract_abi.json')), 'r') as abi_definition:
      self.abi = json.load(abi_definition)
    self.contract_address = your_contract_address
    self.contract = self.web3.eth.contract(self.abi, self.contract_address)

I prefer having my configurations in a separate file. There are many ways to do it and it seems like there is no standard in python. I guess using a txt file is not the best option though and I plan to switch to yml soon. See also here https://martin-thoma.com/configuration-files-in-python/. Make sure to run your favorite Ethereum client before starting your program (e.g. geth --rpc).

Interacting with the smart contract

Note: Before interacting with your own account you need to unlock it first. This is achieved in web3 via:

self.web3.personal.unlockAccount(your_ethereum_account, your_ethereum_password)

There are some standard web3 calls you can make, like getting the current balance of an account in wei:

wei_balance = self.web3.eth.getBalance(some_ethereum_address)

In case you want to call a function in the contract you can do this by calling the command as defined by the contract ABI. In our dInvest example there is a contract call which returns the blacklisted companies for our sustainable investment. It is callable with:

blacklist = self.contract.call().blackListCompanies()

There are some more examples in the GitHub code available.

Final note

As a final note, I would like to point out that there are other blockchain solutions like Hyperledger Fabric or Tendermint that aim to solve issues around compatibility with other programming language, transaction throughput etc. As they are permissioned blockchains I haven't yet given them a try, but might be interesting to take a look at.

Quantitative finance with zipline

February 10, 2017 · 10 min read

Dominik Harz

CTO Interlay

As stated in the [dInvest post series]({% post_url 2017-01-10-dinvest %}) the idea is to build a hedge fund in a blockchain. Due to computational limitations, it is not feasible to implement investment agents in the blockchain. In dInvest an investment agent should do the following: (1) Get a list of all available financial assets to trade; (2) based on the data given (i.e. financial data and fundamentals data) make a recommendation which assets to buy; (3) keep track of which assets the agent is currently holding; and (4) send the recommended assets to buy to the blockchain. In this blog post I will cover the first three tasks. But what does a hedge fund actually do and how do financial investment strategies look like? I had taken some courses in my undergrad on international finance, but as a computer scientist I had to learn some new concepts while doing this project.

Financial investments can take different forms. A hedge fund offers multiple individuals to acquire a part of the pooled investment, which itself invests in publicly traded financial assets. It is measured according to its absolute returns [1]. A manager administrates the investment of the hedge fund and takes the decisions to buy, sell, or hold financial assets to balance absolute returns and risks. A hedge fund may apply different strategies. Hedge funds can follow one specific strategy, mix established strategies, or create a fund out of hedge funds to diversify [2]. The majority of hedge fund strategies try to optimize the absolute return and risk based on financial indicators of the assets to be invested in. An overview of the different strategies that are currently applied by different hedge funds is provided by HFR, Inc in [3]. Measuring the performance of hedge funds as well as specific influence factors is a wide research area, which is covered in more detail in our report.

Measuring success of investments

A common standard in literature and the investment practice to compare hedge funds is based on their absolute returns, Sharpe ratio, alpha and beta as well as a comparison to benchmarks [2].

Returns: The return of the fund is influenced by how well the strategy is able to determine assets that are increasing (for long buys) or decreasing (for short buys) in value over time.
Sharpe: With the Sharpe ratio one can determine the return in respect to the risk involved. The Sharpe ratio is calculated by dividing the sum of the asset return and a benchmark return by the standard deviation of the asset return. The higher the Sharpe ratio, the higher the return with the same risk or the lower the risk with same return.
Alpha: The alpha value expresses the risk-adjusted performance of the fund in comparison to a benchmark. An alpha of zero means exact performance as the benchmark. A positive value indicates outperforming the benchmark and a negative value represents falling behind the benchmark.
Beta: The beta value shows the volatility of the fund in comparison to a benchmark. The beta value baseline is one and represents the same volatility as the benchmark. A value below one indicates a lower volatility and consequently a value above one a higher volatility than the benchmark.

Quantitiative finance

To implement an investment strategy, certain rule sets are developed and applied. Quantitative finance ”translates” economic models into the mathematical world and allows to apply for example stochastic analysis over shares or calculate the value of derivatives [4]. Quantitative finance techniques express investment strategies in mathematical formulas and thus enables a translation into an algorithm [4]. When starting out with dInvest, we were looking into existing quantitative frameworks, that offer us (1) a simulation environment to test our algorithms without using actual money, (2) realistic conditions of the trading market, such as fees and process delays, (3) is able to trade shares and derivatives based on historical data to test our algorithm, (4) provide little delay (lower than 10 minutes) in trading current assets, and (5) is able to be controlled using a programming language (such as Python, Java, or Scala). We evaluated several solutions decided to use the open source implementation of the Quantopian framework zipline.

Getting started with zipline

Zipline is based on python and is able to run with python 2 and 3. Quantopian has created a beginners tutorial on their website. Before going into the details of zipline, let us look into how to set it up locally. Note: I will be using python3 throughout this post. To install zipline you first need all the required C extensions. For Debian based distributions use the command shown below, otherwise there are some more details provided here.

sudo apt-get install libatlas-base-dev python-dev gfortran pkg-config libfreetype6-dev

I then created a virtual environment for Zipline. If you have never setup a virtualenv follow this tutorial. In case you want to use pyhton3 like me, make sure to use the python3 executable. Also, when trying python2 I ran into a couple of issues as Linux Mint uses older versions of python2, which seem to be incompatible with the required version of numpy. When you setup your virtualenv, run pip install zipline from the terminal with your virtualenv . This will install all the required packages including numpy and most likely take some time.

The first investment algorithm

In the /zipline/examples folder you will find some example algorithms to try out. In their beginners tutorial the zipline authors describe some of the algorithms, how they are generally structured and how to execute them. I will go into more detail about having a value investment algorithm and not take one of the example algorithms.

The algorithm to be implemented is inspired by Benjamin Graham [6]

Select shares and derivatives of companies with a market capitalization of minimum 100 million dollars
Determine the SIC code of each asset [5] and exclude the ones contained in the blacklist from the hedge fund
Based on the sectors of the companies find the two sectors with the price of assets per earning ratio (PE ratio)
Invest every month in all the companies listed in the two sectors

Let's take a look at our algorithm. It is basically taken from the web-based version of zipline called Quantopian and can be found here. I have tried to include quite some comments, so if you are familiar with python then the code should be quite readable. Zipline does not include the full functionality and data to realize the investment algorithm. The market capitalization data, sector codes, and PE ratio are not accessible from zipline. For a quick overview of the current code and the changes I made, I will go through the main functions. Also, please note that I only used free sources of financial data. There seem to be a lot of other sources available, but they have their price tag.

initialize(context) Defines the number of sectors we are interested in and also schedules to execution of our algorithm for back-testing.

rebalance(context, data) This essentially takes to selected assets and distributes them evenly into our portfolio.

get_fundamentals(context, data) In Quantopian this is fairly easy, since the Morningstar fundamentals dataset is included. However, in zipline I had to find a workaround. Quandl offers the SF0 fundamentals for free: www.quandl.com/data/SF0-Free-US-Fundamentals-Data. To get a full view of the data, I choose to download the whole data as a CSV file and automated the process. An example of how to do this can be found in the TradeHandler.getData(self) function here.

get_sectors(key) Again, this is included in Quantopian, but needs a manual workaround. To get the sector codes it is quite simple. You need to download a txt file, which includes the sector codes and the function will read them into a dict.

get_sector_code() Also, this part is included in Quantopian. Here we download a JSON file to get a mapping of sector codes to the financial assets we are trading.

To analyse the performance of the investment algorithm the absolute returns, Sharpe, alpha, and beta for the period of January 1, 2009 to November 15, 2016 is compared. The investment algorithm uses the Quandl financial data i.e. opening and closing prices of specific assets during that period [8]. The algorithm is executed multiple times. First, no sectors are excluded and second, an exclusion list based on sectors is used. Defense, beer and alcohol, tobacco, and coal industry are excluded based on ESG criteria and common practices [9]. The first graph in the figure below presents the returns achieved by the unrestricted algorithm, the algorithm with single excluded sectors, and the benchmark return. These returns show that unrestricted and single excluded algorithms have overall a higher profit than the benchmark. The second graph in the figure displays a combination of two excluded sectors and the benchmark return, while the third graph elaborates on the combinations of three and all four sectors excluded. A single or combination of exclusions constantly perform better than the benchmark until beginning of 2012. Then the unrestricted algorithm as well as the single exclusion of defense and alcohol exclusion and also the combination of tobacco and coal, defense and alcohol, defense and coal exclusions drop until they recover in 2014. The other single and combination algorithms do not show this drop, but also have a strong increase in returns in 2014. Then all variations show high volatility. The drop in 2012 in the unrestricted algorithm is caused by three of the sectors it is investing in. When tobacco is restricted, there is no strong drop in 2012. However, if not also in combination defense and/or coal is restricted the algorithm selects either of the two industries instead and another drop occurs. In the third graph of the figure these exclusions are combined and no significant drop occurs, however the volatility remains.

There is further analysis and discussion of the results in our report.

References

D. Harper. (2016) Hedge funds hunt for upside, regardless of the market. [Online]. Available: http://www.investopedia.com/articles/03/112603.asp
M. Agarwal, Hedge Fund Strategies. John Wiley & Sons, Inc., 2009, pp. 45–55. ISBN 9781118258187. [Online]. Available: http://dx.doi.org/10.1002/9781118258187.ch4
Hedge Fund Research. (2016) HFR hedge fund strategy classification system. [Online]. Available: https://www.hedgefundresearch.com/hfr-hedge-fund-strategy-classification-system
P. Wilmott, Paul Wilmott on quantitative finance. John Wiley & Sons, 2013.
U.S.D. of Labor. (2016) SIC division structure. [Online]. Available: https://www.osha.gov/pls/imis/sic_manual.html
B. Graham and D. Dodd, Security Analysis: Sixth Edition, ser. Security Analysis Prior Editions. McGraw-Hill Education, 2008.
Quantopian. (2016) zipline: Pythonic algorithmic trading library. [Online]. Available: https://github.com/quantopian/zipline/
Quandl. (2016) Quandl: financial database. [Online]. Available: https://www.quandl.com/browse?idx=database-browser
J. R. Evans and D. Peiris, “The relationship between environmental social governance factors and stock returns,” UNSW Australian School of Business Research Paper No. 2010ACTL02, 2010.

Breaking captchas: Using deep learning to automatically break CAPTCHAs

January 12, 2017 · 13 min read

Tharidu Fernando

Akash Singh

Dominik Harz

CTO Interlay

Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is a way of differentiating humans and machines and was coined by von Ahn, Blum, Hopper, and Langford [5]. The core idea is that reading distorted letters, numbers, or images is achievable for a human but very hard or impossible for a computer. CAPTCHAs might look like the one below. Most likely the reader has already seen one, when trying to register at a website or write a comment online.

Penguin-Pal_Captcha

There are several use cases for CAPTCHAs, which includes the ones presented in [6]: Preventing comment spam, protect website registration, protect e-mail addresses from scrappers, protect online polls, preventing dictionary attacks, and block/hinder search engine bots.

CAPTCHAs do not give a guarantee that it prevents these cases every time as there are known attack vectors. These include cheap or unwitting human labor, insecure implementation, and machine learning based attacks. We will not go into detail on insecure implementations, as the focus of this article are deep learning based approaches.

Human-based CAPTCHA breaking

Out of curiosity and to compare results achieved by machine learning approaches, we take a look at the human based approach. For example BypassCAPTCHA offers breaking CAPTCHAs with cheap human labor in packages (e.g. 20,000 CAPTCHAs for 130$). There are also other services including Image Typerz, ExpertDecoders, and 9kw.eu. There are also hybrid solutions that use both OCR and human labor like DeathByCAPTCHA. These vendors list the following accuracies and response times (averages):

Service	Accuracy (daily average)	Response Time (daily average)
BypassCAPTCHA	N/A	N/A
Image Typerz	95%	10+ sec
ExpertDecoders	85%	12 sec
CAPTCHABOSS (premium version of ExpertDecoders)	99%	8 sec
9kw.eu	N/A	30 sec
DeathByCAPTCHA	96.8%	10 sec

The values are advertised and self-reported. We did not conduct any verification of the stated numbers, but it can give an orientation on human performance and serves as a reference for our machine learning algorithms.

Learning-based CAPTCHA breaking

CAPTCHAs are based on unsolved or hard AI problems. However, with the progress of AI techniques and computing power, sequences of characters or CAPTCHAs can be recognized as shown by Goodfellow et al. in [1], Hong et al. in [2], Bursztein et al. in [3] and [7], and Stark et al. in [4] using deep learning techniques. Goodfellow et al. predict numbers from Goolge Street View images directly (without pre-processing) utilizing a CNN. They make use of DistBelief by Dean et al. to scale the learning to multiple computers and to avoid out of memory issues [1]. This technique was later on used to solve CAPTCHAs, whereby the researched achieved an accuracy of up to 99.8%. Hong et al. pre-process CAPTCHAs to rotate them and segment the characters. Afterwards they apply a CNN with three convolutional layers and two fully connected layers [2]. Bursztein et al. use pre-processing, segmentation, and recognition techniques (based on KNN) and later on various CNNs to detect CAPTCHAs from multiple websites including Baidu, Wikipedia, reCAPTCHA, and Yahoo [3],[7]. Stark et al. researched a way of detecting CAPTCHAs with limited testing data. They use a technique called Active Learning to feed the network with new training data, where the added data has a high classification uncertainty, to improve the performance overall [4]. The below table gives an overview of the reported accuracies in the different papers and blog posts.

Researcher	Dataset	Technique	Accuracy (maximum)	Reference
Goodfellow et al.	Google Street View image files	CNN with DistBelief	96%	[1]
Hong et al.	Microsoft CAPTCHAs	Preprocessing, segementation and CNN	57%	[2]
Stark et al.	Cool PHP CAPTCHA generated CAPTCHAs	CNN with Active Deep Learning	90%	[4]
Bursztein et al.	Baidu, CNN, eBay, ReCAPTCHA, Wikipedia, Yahoo CAPTCHAs	Reinforcement Learning, k-Nearest Neighbour	54% (on Baidu)	[7]
Bursztein	Simple CAPTCHA	CNN	92%	[3]
Bursztein	Simple CAPTCHA	RNN	96%	[8]

Current state of CAPTCHAs

Google has introduced NoCAPTCHA in December 2014. This introduces multiply new features including evaluation based on cookies, movement of the mouse, and recognition of multiple images. Google announced to introduce an invisible CAPTCHA to get rid of the checkbox.

The previous version of reCAPTCHA was very popular on many websites. It included typically two words with rotation and had an audio option. Further CAPTCHA techniques can include simple logic or math questions, image recognition, recognition of friends (social CAPTCHA), or user interaction (like playing a game) [9].

Our objectives and motivation

The aim of the project is to break CAPTCHAs using deep learning technologies without pre-segmentation. Initially we focus on simple CAPTCHAs to evaluate the performance and move into more complex CAPTCHAs. The training dataset is generated from an open source CAPTCHA generation software. Tensorflow is used to create and train a neural network.

Creating the datasets

We are generating the datasets using a Java based CAPTCHA generator (SimpleCAPTCHA). We have created the following datasets.

Description	Size	Training samples	Test samples
Digits only	38 MB	9502	100
Digits and characters	197 MB	49796	100
Digits and characters with rotation	39 MB	10000	100
Digits and characters with rotation	198 MB	49782	500
Digits and characters with rotation	777 MB	196926	500

Each dataset contains jpeg images containing a CAPTCHA with five characters. The characters are lowercase (a-z) or numbers (0-9). We used the fonts "Arial" and "Courier" with noise. An example of the created CAPTCHAs is displayed below. Our intention was to mimic the CAPTCHAs created by Microsoft. We have extended SimpleCAPTCHA library in order to get character rotation, outlines in CAPTCHAs to achieve the same look of Microsoft CAPTCHAs.

54563 5p23r ycn2m

Generated CAPTCHAs will be 152x80 greyscale images. This resolution is chosen because it is small enough to reduce memory footprint when training the CNN and it is also enough to recognize the CAPTCHA easily.

Deep CNN model

Based on the research in [1], [3] and [4] we use a deep CNN with three ReLU layers and two fully connected layers to solve the CAPTCHAs. Each digit is represented by 36 neurons in the output layer. The three convolutional layers with ReLU activation function have the sizes of 32, 64, and 128. 5x5 filter size was used in all layers. After each convolutional layer there is a max pooling of 2. After the last convolutional layer, there is a fully connected layer with ReLU of size 1024 and finally another fully connected layer that has an output size of 180. In the ReLU layers, a dropout of 0.75 is applied.

In the output layer, digits 0-9 will be represented by 1 to 10 neurons and, characters a to z will be represented by 11 to 36 neurons. Therefore, there are 5 x 36 neurons which will identify the CAPTCHA. The network will output the predictions for all 5 digits and each digit is calculated by the max probability of its 36 neurons. We have set the learning rate as 0.001.

CNN

Results and discussion

First, we trained the CNN with 10000 five letter and digit CAPTCHAs without rotation on a GTX660M. We had 100 batches with a batch size of 100 and ran it for 20 epochs. The hyperparameters were set as described in the previous section. The figure below shows that the network did not perform well with these settings. We then increased the training size to 50000 CAPTCHAs, but the results stayed the same. We then tried with 10000 simplified CAPTCHAs with only five digits without rotation. However, this still did not improve our situation. We noted that the loss function reduced quite quickly and stayed constant. We hence introduced another convolutional layer to the network to allow it to further differentiate the digits. Again, this resulted in almost the same result.

DigitsOnly660M
CNN with three conv. layers and two fully connected layers accuracy of CAPTCHAs with five digits or lowercase letters without rotation. Training in 100 batches and 10000 training samples.

These results match the ones presented in [4]. The authors then introduce Active Learning to circumvent the problem. In [1] a larger amount of samples is used. However, we do not have sufficient computing power available to use millions of images as training data. Also, in [3] the batch size is larger, resulting in a considerably higher accuracy. We decided to change our batch size, but required a more powerful GPU for that. Hence, we used a Nvidia Tesla K80 from AWS to conduct our training. We also changed the CNN back to three conv. layers and two fully connected layers. On the simple case with five digit CAPTCHAs without rotation we used 39250 CAPTCHAs in 157 batches and 10 epochs. We conducted testing with a very small dataset of 100 CAPTCHAs. The results did improve considerably as shown in the figure below. We managed to achieve a best training error of 94.9% and a test error of 99.2%. The very high test accuracy is quite likely also caused by the small amount of test items. However, running a K80 in AWS is quite costly and therefore we kept the test set to a minimum also in the upcoming experiments.

DigitsOnly
CNN with three conv. layers and two fully connected layers accuracy of CAPTCHAs with five digits without rotation. Training in 157 batches, 39250 training samples, and testing with 100 CAPTCHAs.

We then tried with a bit more complex CAPTCHAs with digits and lowercase characters. We used 49750 training and 100 test CAPTCHAs with the same CNN used in the simple case above. The figure below presents our results and shows that we can achieve a training accuracy of 65% and a test accuracy above 80% in these cases. We stopped the CNN prematurely after 10 epochs to try more complex use cases.

DigitsChar
CNN with three conv. layers and two fully connected layers accuracy of CAPTCHAs with five digits or lowercase letters without rotation. Training in 199 batches, 49750 training samples, and testing with 500 CAPTCHAs.

Next, we added rotation to our CAPTCHAs and trained the CNN on 49782 samples. This resulted in almost random result with an accuracy around 10%. We thus increased the samples to 196926 and tested it on 500. Again, we kept the same hyperparameters as presented in the model section. This time we trained for 15 epochs, to prevent premature interruption of the training. Our results are presented in the figure below. With the increased training size we achieve a training accuracy of 97.1% and a test accuracy of 99.5%. However, this is again on a very small test set. DigitsCharRot
CNN with three conv. layers and two fully connected layers accuracy of CAPTCHAs with five digits or lowercase letters with rotation. Training in 787 batches, 196926 training samples, and testing with 500 CAPTCHAs.

From the tests conducted above, we have a few examples to show correct and false predictions.

Correctly classified	Incorrectly classified
Prediction: 54563 Image:	Prediction: 82298 Image:
Prediction: grh56 Image:	Prediction: k76ap Image:
Prediction: fb2x4 Image:	Prediction: fffgr Image:

Conclusion

With this project we have shown that it is possible to create large enough datasets automatically to mimic certain CAPTCHAs (i.e. Microsoft). This provides large labeled datasets, which serve as a foundation to train neural networks. We have chosen to use two different CNNs with three and four convolutional layer and two fully connected layers. Our experiments show however, that adding more layers to the network did not increase the accuracy. We noticed that optimizing a CNN can be cumbersome. While running the CNN on a GTX 660M, we were not able to manage to get satisfying results. Most likely we would have needed more training time on the batches to receive better results. When we switched to a Tesla K80 we managed to train the network with larger amounts of data and a higher batch size. This resulted in higher accuracy on the simple and more complex datasets. We realized that memory poses a quite severe limitation towards applying large scale machine learning. Our approach based on a CNN is limited to CAPTCHAs with exactly the length defined in the network. Hence, classifying CAPTCHAs with any other length than five would fail. As an alternative a RNN could be used to resolve this issue. In [8] a use of RNN to break CAPTCHAs is discussed with fairly good results. However, also in this approach a powerful GPU is required. Moreover, using a CNN requires large datasets to be trained on. For a combination of digits, characters, and rotation we required a dataset of around 200000 CAPTCHAs (~780MB). On small sized GPU this datasets cause either out of memory errors or require a quite long training time. Even with the Tesla K80 the training time takes around 2 hours and 30 minutes.

References

Goodfellow, Ian J., et al. "Multi-digit number recognition from street view imagery using deep convolutional neural networks." arXiv preprint arXiv:1312.6082 (2013).
Hong, Colin et al. "Breaking Microsoft’s CAPTCHA." (2015).
Using deep learning to break a CAPTCHA system in Deep Learning. 3 Jan. 2016, https://deepmlblog.wordpress.com/2016/01/03/how-to-break-a-CAPTCHA-system/. Accessed 6 Dec. 2016.
Stark, Fabian, et al. "CAPTCHA Recognition with Active Deep Learning." Workshop New Challenges in Neural Computation 2015. 2015.
Von Ahn, Luis, et al. "CAPTCHA: Using hard AI problems for security." International Conference on the Theory and Applications of Cryptographic Techniques. Springer Berlin Heidelberg, 2003.
"CAPTCHA: Telling Humans and Computers Apart Automatically" 2010, http://www.CAPTCHA.net/. Accessed 7 Jan. 2017.
Elie Bursztein et al., "The end is nigh: generic solving of text-based CAPTCHAs". Proceedings of the 8th USENIX conference on Offensive Technologies, p.3-3, August 19, 2014, San Diego, CA
Recurrent neural networks for decoding CAPTCHAs in Deep Learning. 12 Jan. 2016, https://deepmlblog.wordpress.com/2016/01/12/recurrent-neural-networks-for-decoding-CAPTCHAs/. Accessed 9 Jan. 2017.
CAPTCHA Alternatives and thoughts. 15 Dec. 2015, https://www.w3.org/WAI/GL/wiki/CAPTCHA_Alternatives_and_thoughts. Accessed 9 Jan. 2017.

dInvest - hedge fund on a blockchain

January 10, 2017 · 4 min read

Dominik Harz

CTO Interlay

As public reputation becomes one of the most important success factors beyond financial success, investment opportunity should ensure ethical decisions and keep sustainable investment as core of their strategy. These long-term investment strategies based on criteria other than pure financial data are summarized under "Value Investment". Value investment is based on the assumption that the current price of an asset might not be the actual value of an asset (e.g. stocks and derivatives). Thus, criteria other than their current prices can be taken into account to calculate their value. The value investing approach offers thereby methods to select potential investment based on social or sustainability criteria. Furthermore, value investment includes diversification of the portfolio to achieve a desired level of risk.

In this blog post I will introduce dInvest, a project Tharidu and me are working on. With the development of blockchain technologies and the Ethereum blockchain, an autonomous and decentralized approach can be taken to create an investment opportunity. Users are able to invest directly with the cryptocurrency Ether into a company which exists in the Ethereum blockchain. Alternatively, they can invest with other cryptocurrencies (e.g. Bitcoin) or fiat currencies by utilizing currency exchanges. An autonomous organization build on Smart Contracts exists solely as code and can execute the details as specified in the contract. As an example, The DAO enabled users to act as venture capitalists by suggesting investments to other users. If a certain amount of investments was raised the contracts would be executed inside the blockchain. However, this approach failed as The DAO suffered security flaws, which led to a severe attack on its system. Furthermore, The DAO failed to give incentives to users proposing investments. Investments are voted for two weeks, while a vote can only be given by bounding a user’s capital into the vote. Thus, late voting and waiting for other’s resulted in available capital.

Problem

Certain investors want to invest their money in order to gain returns while being socially responsible and sustainable to e.g. achieve corporate sustainability objectives. However, financial intermediaries, such as investment banks or hedge funds, do not always adhere to investors’ requirements as there might be conflicting interests. Furthermore, the investment strategy applied by the financial intermediary can change through time and is highly influenced by social factors. In addition to that, investors do not have complete transparency over the transactions.

Our work

We decided to develop an autonomous hedge fund in the Ethereum blockchain. Users are represented in the Ropsten network of the Ethereum blockchain as addresses. Users can hold certain amounts of test Ether, which they can invest into dInvest. The investment will contain an amount of Ether and a time period defined by the user. The details of the investment will be defined in a smart contract between the user and dInvest. Selecting stocks for different investment amounts and time periods is comparably computation intensive and will require the application of investment algorithms. Thus, their executing is costly in the blockchain and will therefore be executed by an autonomous agent (invest agent) outside of the blockchain. The required information will be passed on to dInvest on demand and therefore keeping the operational cost to a minimum. The invest agent will suggest investment strategies to the hedge fund according to the current Ether value. The hedge fund will decide to approve or reject the offer based on the sustainability criteria of users’ investments. Only if the criteria is fulfilled, the Ether will be transferred to the buy agent and make the investment. The invest agent will transfer the investment return to the hedge fund where the profit/loss will be divided to investors according to their investment value.

We managed to achieve a linear cost function of user investments and their sustainability criteria in the smart contract. Based on a value investment investment strategy we managed to achieve around 360% increase of the portfolio in a back-testing simulation over around 6 years. The sustainability criteria was implemented using exclusion of assets based on industry codes (i.e. exclusion of defense, alcohol, tobacco, or coal industries). If you are interested in the academic side and the results of our implementation, you can check out our report. As the paper does not cover the implementation in detail, I will post the details in separate blog posts and link them here. You can also check out our source code on GitHub.

Substrate Runtimes as Verification Targets​

Diving into Verification​

Selected Tools​

A Note on Expectations​

kani​

Getting started​

Compilation Errors​

Prusti​

Getting started​

Errors​

Success​

MIRAI​

Getting started​

Compilation Errors​

Success​

Summary​

Why do we need this feature?

When do we need this feature?

How should we launch the feature?

tl;dr​

Introduction​

Governance Attack​

A Brave New Attack Strategy: Flash Loans​

Arbitrage and Oracle Manipulation with Flash Loans​

Oracle Manipulation to Reduce Required Liquidity​

Obtaining enough Liquidity​

Combination?​

No Time to Wait​

Counter measure​

Gas cost - why?​

Practical implications​

Local execution and global execution​

Measuring gas costs​

Bitcoin and smart contracts​

Rational agents​

Mechanism design​

Games in Bitcoin​

Assumptions​

Contract​

Mechanism analysis​

References​

dInvest smart contract​

Interaction with smart contracts​

Getting your ABI​

Setting up the RPC provider​

Interacting with the smart contract​

Final note​

Measuring success of investments​

Quantitiative finance​

Getting started with zipline​

The first investment algorithm​

References​

Human-based CAPTCHA breaking​

Learning-based CAPTCHA breaking​

Current state of CAPTCHAs​

Our objectives and motivation​

Creating the datasets​

Deep CNN model​

Results and discussion​

Conclusion​

References​

Problem​

Our work​

Substrate Runtimes as Verification Targets

Diving into Verification

Selected Tools

A Note on Expectations

kani

Getting started

Compilation Errors

Prusti

Getting started

Errors

Success

MIRAI

Getting started

Compilation Errors

Success

Summary

tl;dr

Introduction

Governance Attack

A Brave New Attack Strategy: Flash Loans

Arbitrage and Oracle Manipulation with Flash Loans

Oracle Manipulation to Reduce Required Liquidity

Obtaining enough Liquidity

Combination?

No Time to Wait

Counter measure

Gas cost - why?

Practical implications

Local execution and global execution

Measuring gas costs

Bitcoin and smart contracts

Rational agents

Mechanism design

Games in Bitcoin

Assumptions

Contract

Mechanism analysis

References

dInvest smart contract

Interaction with smart contracts

Getting your ABI

Setting up the RPC provider

Interacting with the smart contract

Final note

Measuring success of investments

Quantitiative finance

Getting started with zipline

The first investment algorithm

References

Human-based CAPTCHA breaking

Learning-based CAPTCHA breaking

Current state of CAPTCHAs

Our objectives and motivation

Creating the datasets

Deep CNN model

Results and discussion

Conclusion

References

Problem

Our work