How a Little-Known Solana Feature Made Program Vaults Unsafe - Exploring Solana Core Part 1

Intro

Over the past year and a half, we have spent a lot of time looking at the Solana core code, reporting over 80 bugs of varying severity. This blog post is the first in a series detailing the most interesting vulnerabilities we found and reported in Solana core, hopefully inspiring more whitehats to keep the ecosystem safe. All bugs presented here were responsibly disclosed under the Solana bug bounty program and are now fixed.

The specific bug we present here is one we found in June last year during a smart contract audit. The contract used seeded accounts, a little-known Solana feature. We hadn't encountered this feature in production before, and instead of relying on docs alone, we did a deep dive on how it works. While doing this, we found a bug that could be used as a powerful rug pull mechanism, even though the immediate impact was likely limited. Solana's developers reacted quickly and had it fixed internally within a day.

To understand the bug and its impact, we first need to look at program-derived addresses (PDAs) and seeded accounts, the intersection of which leads to the bug.

First, let's talk about PDAs.

What are PDAs?

Ownership on Solana is represented through signatures. Whether it is SOL or token transfers, staking or lending, authorization is always done with the owner's signature. That creates a problem: How can a smart contract own funds? Everything about the contract is out in the open, and it cannot hold a private key to sign with.

This is solved with program-derived addresses (PDAs). As the name suggests, they are public keys that are derived from the public key of a program. Through the magic of cryptography, they have no accompanying private key. When a program calls another via cross-program invocation (CPI), it can attach signatures of its own PDAs.

The runtime enforces that signatures can only be added for a program's own PDAs, which allows PDAs to be used as authorities for escrow accounts (for example, the vaults that hold the liquidity in AMMs like Orca).

A program can derive an unlimited number of these PDAs, differentiated by their seed, which in essence is a fully controllable sequence of bytes.

To compute a PDA, a program calls the function Pubkey::create_program_address(seeds: &[&[u8]], program_id: &Pubkey). This takes a list of byte slices as seeds and the public key of the program, and computes a hash from them as follows:

let mut hasher = crate::hash::Hasher::default();
for seed in seeds.iter() {
    hasher.hash(seed);
}
hasher.hashv(&[program_id.as_ref(), "ProgramDerivedAddress".as_ref()]);
let hash = hasher.result();

if bytes_are_curve_point(hash) {
    return Err(PubkeyError::InvalidSeeds);
}

Ok(Pubkey::new(hash.as_ref()))

So, what does this do exactly? This code first hashes all seeds, then the program_id and after that the static string ProgramDerivedAddress:

pda(seed, program_id) = hash(seed + program_id + "ProgramDerivedAddress")

After this comes a technical check that asserts that the public key is not on the curve (i.e., it does not have an associated private key). The resulting PDA is just the bytes of the hash interpreted as a public key.
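
To make the hashing order concrete, here is a minimal sketch that builds the byte string fed into the hash, without computing the hash or performing the curve check. The helper name pda_preimage is ours and is not part of the Solana SDK; the sketch only relies on the fact that feeding a hasher piece by piece is equivalent to hashing the concatenation of the pieces.

```rust
/// Marker appended to every PDA hash input (the static string from the snippet above).
const PDA_MARKER: &[u8] = b"ProgramDerivedAddress";

/// Hypothetical helper, not part of the Solana SDK: returns the bytes that
/// end up being hashed for a PDA, in order: all seeds, the program id, the marker.
fn pda_preimage(seeds: &[&[u8]], program_id: &[u8; 32]) -> Vec<u8> {
    let mut preimage = Vec::new();
    for seed in seeds {
        // Seeds are appended back to back, without any separator or length prefix.
        preimage.extend_from_slice(seed);
    }
    preimage.extend_from_slice(program_id);
    preimage.extend_from_slice(PDA_MARKER);
    preimage
}
```

Under these assumptions, the hash input always ends with the 32 bytes of the program id followed by the 21 bytes of the marker, and only the part in front of them is chosen by the caller.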

Next up, we have to talk about seeded accounts.

What are Seeded Accounts?

This is a seldom-used feature of the system program that is similar to PDAs. Let's first understand its use case by looking at an example:

To trade on Serum, every user needs to have an OpenOrders account, which stores metadata about the open orders. This account has to be created somehow, which requires at least one signature. To create one without using any additional features, a user first has to generate a new keypair and then sign a transaction with it. In this transaction, the account associated with the keypair is created and initialized. This is cumbersome, because we generate a private key only to discard it right away. It also costs more in fees, as the transaction now carries two signatures.

This is where seeded accounts come in: the system program allows you to operate on accounts whose addresses are derived from your signing key, while only requiring the signature of the base account. The address derivation is done by create_with_seed. Because this is a feature of the system program, it can only be used to create accounts, assign them to arbitrary programs as owners, and transfer SOL from system accounts created in this way.

So similar to how PDAs associate additional accounts with programs, seeded accounts associate additional accounts with normal users.

This similarity in function already hints at a parallel in implementation. Let's dig into the code to find out how it's actually done:

The address of a seeded account is calculated in Pubkey::create_with_seed(base: &Pubkey, seed: &str, owner: &Pubkey):

Ok(Pubkey::new(
    hashv(&[base.as_ref(), seed.as_ref(), owner.as_ref()]).as_ref(),
))

The base, which you can think of as the authority, is the account that signs the transaction and is able to sign further calls to transfer_with_seed. The seed is an arbitrary but valid UTF-8 string, and the owner is the program that will own this account.

Or, more concisely:

seeded(base, seed, owner) = hash(base + seed + owner)

The Bug

Given that these two methods of computing hashes that result in account keys are so similar, wouldn't it be a shame if we could collide them? Let's look at the formulas for PDAs and seeded accounts side-by-side:

pda(seed1, program_id)      =  hash(seed1  +  program_id + "ProgramDerivedAddress")
seeded(base, seed2, owner)  =  hash(base   +  seed2      +  owner                 )

These lines look eerily similar! Since a cryptographic hash function is used, we'd need to provide the same input in both cases to create a collision. In the case of PDAs, we can only control seed1. For seeded accounts, there is a lot of flexibility: both seed2 and owner are fully controlled if we invoke transfer_with_seed, and base can be any public key that we can sign for.

The restriction on base is a bit annoying, so we have to make an assumption: base == seed1. Further, we'd need to be able to sign for base to make anything interesting happen with the *_with_seed instructions. So let's make it a user_key as well: base == seed1 == user_key. More on this assumption later.

With this assumption, we can force a collision by ensuring that

seed2 + owner == program_id + "ProgramDerivedAddress"

Given that owner and program_id are account keys that are always 32 bytes long, and that the string "ProgramDerivedAddress" is 21 bytes long, we can satisfy this equation by choosing

seed2 = program_id[:21]
owner = program_id[21:32] + "ProgramDerivedAddress"

Putting it all together, we see that there are cases where a user can modify a PDA via the *_with_seed system instructions, which should never be possible.
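
The split can be checked mechanically. The following sketch compares the two hash inputs byte for byte instead of hashing them, since identical inputs necessarily produce identical hashes. All helper names are hypothetical and not part of the Solana SDK, and seed2 is modeled as raw bytes, so the UTF-8 requirement on the seed is deliberately ignored here.

```rust
const PDA_MARKER: &[u8] = b"ProgramDerivedAddress"; // 21 bytes

/// Hash input of a PDA derived from a single seed (hypothetical helper).
fn pda_preimage(seed1: &[u8], program_id: &[u8; 32]) -> Vec<u8> {
    [seed1, program_id.as_slice(), PDA_MARKER].concat()
}

/// Hash input of a seeded account (hypothetical helper).
fn seeded_preimage(base: &[u8; 32], seed2: &[u8], owner: &[u8; 32]) -> Vec<u8> {
    [base.as_slice(), seed2, owner.as_slice()].concat()
}

/// Splits `program_id` into the (seed2, owner) pair from the formulas above:
/// seed2 = program_id[:21], owner = program_id[21:32] + marker.
fn colliding_seed_and_owner(program_id: &[u8; 32]) -> ([u8; 21], [u8; 32]) {
    let mut seed2 = [0u8; 21];
    seed2.copy_from_slice(&program_id[..21]);

    let mut owner = [0u8; 32];
    owner[..11].copy_from_slice(&program_id[21..]); // remaining 11 key bytes
    owner[11..].copy_from_slice(PDA_MARKER); // followed by the 21 marker bytes
    (seed2, owner)
}
```

With base == seed1 == user_key, calling pda_preimage(&user_key, &program_id) and seeded_preimage(&user_key, &seed2, &owner) on the output of colliding_seed_and_owner yields the same 85 bytes, which is exactly the collision described above.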

The Constraints

We are now able to collide a seeded account with a PDA. To evaluate the impact of this, let's list all constraints:

  • A program has to use a single key as PDA seeds, which the user can sign for (base == seed1 == user_key)
    • Many existing programs use account keys as PDA seeds. Say I want to store some metadata about each user. It makes a lot of sense to just set seed1 == user_key. This instantly gives you the guarantee that at most one metadata account can exist for each user, and also gives frontends an easy way to determine its address.
  • At the time of the attack, the account is system-owned.
    • From an attacker's perspective, this is quite annoying. System-owned means the account can't have any data. It could be a native SOL vault, though. Or the attack could be run before the contract initializes the account. But since owner has to be very specific, we'll also not be able to write any data. In that case this is a DoS only.
  • The first 21 bytes of program_id have to decode to valid UTF-8.
    • Empirical tests have shown that the probability of this happening is about 1 in 180,000 for a randomly generated sample of public keys. Not likely, but not impossible either.
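
The last constraint is easy to test for a given program id. A minimal sketch, assuming the key is available as raw bytes; the helper name seed_prefix_is_utf8 is ours and not part of the Solana SDK:

```rust
/// Hypothetical helper: returns true if the first 21 bytes of `program_id`
/// decode as valid UTF-8 and could therefore be passed as the string seed.
fn seed_prefix_is_utf8(program_id: &[u8; 32]) -> bool {
    std::str::from_utf8(&program_id[..21]).is_ok()
}
```

Only the first 21 bytes matter here: the remaining 11 bytes end up in owner, which is a raw public key and carries no encoding requirement.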

The Impact

Given the bug's constraints, we think it's unlikely that it could have been used to cause loss of funds in a typical Solana contract. But it might well have been feasible to run a denial-of-service attack against certain user accounts by assigning a wrong owner via the seeded account mechanism before the PDA is initialized. That way, the contract could never create the account.

The primary way we think a malicious actor could have exploited the bug is as a powerful rug pull mechanism, one that even the most thorough audit or formal verification of the smart contract could not have found. Even if the underlying program is audited, not upgradable, or managed by a DAO, a malicious deployer could drain a contract's funds, provided they went through the effort of grinding a program id that fulfills the above-mentioned constraint and chose their PDA derivations accordingly.

The Fix

We reported this vulnerability to the Solana team right away through their bug bounty program on 18/06/2021. This issue was fixed promptly in this commit. Note that an internal version of the fix was already complete on the same day we reported the vulnerability. A check was added to make sure the owner of a seeded account cannot end with the ProgramDerivedAddress marker:

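// View the owner key as raw bytes.
let owner = owner.as_ref();
if owner.len() >= PDA_MARKER.len() {
    // Compare the trailing bytes of the owner against the PDA marker.
    let slice = &owner[owner.len() - PDA_MARKER.len()..];
    if slice == PDA_MARKER {
        return Err(PubkeyError::IllegalOwner);
    }
}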
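
To see why this closes the hole, the same comparison can be applied to the owner value the collision requires. This is a standalone sketch on raw bytes: the function owner_is_illegal is hypothetical and returns a bool, whereas the real check returns PubkeyError::IllegalOwner.

```rust
const PDA_MARKER: &[u8] = b"ProgramDerivedAddress";

/// Hypothetical stand-in for the check above: true if `owner` ends with the
/// PDA marker and would therefore be rejected as an owner of a seeded account.
fn owner_is_illegal(owner: &[u8]) -> bool {
    if owner.len() >= PDA_MARKER.len() {
        let tail = &owner[owner.len() - PDA_MARKER.len()..];
        return tail == PDA_MARKER;
    }
    false
}
```

The colliding owner is always program_id[21:32] followed by the marker, so it ends with the marker and is rejected, regardless of which program id was chosen.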