How Elixir Solves a Difficult Security Problem

Michael Lubas, 2023-05-03

The authentication mechanism had been subject to numerous design reviews and penetration tests. The owners were confident that no feasible means existed of attacking the mechanism to gain unauthorized access.

In fact, the authentication mechanism contained a subtle flaw. Occasionally, when a customer logged in, he gained access to the account of a completely different user, enabling him to view all that user’s financial details, and even make payments from the other user’s account. The application’s behavior initially appeared to be random: the user had not performed any unusual action to gain unauthorized access, and the anomaly did not recur on subsequent logins.

After some investigation, the bank discovered that the error was occurring when two different users logged in to the application at precisely the same moment. It did not occur on every such occasion—only on a subset of them. The root cause was that the application was briefly storing a key identifier about each newly authenticated user within a static (nonsession) variable. After being written, this variable’s value was read back an instant later. If a different thread (processing another login) had written to the variable during this instant, the earlier user would land in an authenticated session belonging to the subsequent user.

The Web Application Hacker’s Handbook, 2nd Ed, Stuttard and Pinto, Pg 426

The Elixir programming language does not suffer from the above security problem, known as a data race, because it is built on top of the Erlang virtual machine, the gold standard for concurrent programming. Concurrently running code in Elixir cannot access the same variable, which provides a strong security guarantee against this kind of unexpected behavior. Contrast this with Go, where two goroutines may access the same variable, which is why Go includes a data race detector.

The terms “data race” and “race condition” have different meanings. A data race occurs when two different OS threads access the same memory location concurrently, and at least one of these accesses is a write. The insecure banking example illustrates the danger of data races. The term “race condition” is broader, and refers to an error that occurs due to the timing of events in an application. For example, consider the following pseudo-code that handles banking withdrawals:

def withdrawal(account, amount) do
  balance = get_balance(account)
  if amount <= balance do
    # Simulated delay between the balance check and the withdrawal
    :timer.sleep(1000)
    perform_withdrawal(account, amount)
  end
end

The example account has $50 in it, and the owner quickly submits two requests to withdraw $40. The :timer.sleep(1000) is included to illustrate the timing gap between the balance check and the account balance being reduced. Both requests pass the balance check before either one reduces the balance, so the owner is able to successfully withdraw $80 from a balance of only $50. Race conditions are due to errors in business logic, and there is no silver bullet to completely eliminate them.
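One way to close the gap in this particular example is to let a single process own the balance, so the check and the update happen as one step. The following is a minimal sketch under that assumption, not a recommendation for a real banking system, where the balance would normally live in a database and need a transaction or lock. The module name AccountSketch and its functions are hypothetical and used only for illustration.

```elixir
defmodule AccountSketch do
  # Hypothetical module: a single Agent process owns the balance.
  def start_link(initial_balance) do
    Agent.start_link(fn -> initial_balance end)
  end

  # The check and the update run inside the Agent process as one step,
  # so a second request cannot slip in between them.
  def withdraw(account, amount) do
    Agent.get_and_update(account, fn balance ->
      if amount <= balance do
        {{:ok, amount}, balance - amount}
      else
        {{:error, :insufficient_funds}, balance}
      end
    end)
  end

  def balance(account), do: Agent.get(account, fn balance -> balance end)
end

{:ok, account} = AccountSketch.start_link(50)

# Two concurrent requests for $40 against a $50 balance.
results =
  [40, 40]
  |> Enum.map(fn amount ->
    Task.async(fn -> AccountSketch.withdraw(account, amount) end)
  end)
  |> Task.await_many()

IO.inspect(Enum.sort(results))
IO.inspect(AccountSketch.balance(account))
```

Only one of the two requests succeeds, because the Agent handles them one at a time. This removes the timing gap for this one operation. It does not remove race conditions in general, which still depend on getting the business logic right.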

Returning to the problem of data races, where multiple threads accessing the same memory address leads to bugs, there is a solution. Erlang solved this problem decades ago by introducing a concurrency model where code runs in lightweight threads of execution called processes. An Erlang process is not an operating system process. It is fully managed by the Erlang virtual machine, has a small memory footprint, and is fast to create and terminate. Erlang processes do not share memory. They communicate through message passing, and data structures in Erlang are immutable.
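The properties described above can be seen in a few lines of Elixir. This is a small sketch, with the module name IsolationSketch chosen only for illustration: a spawned process builds a new list from data it received from its parent, and the only way to hand the result back is to send a message.

```elixir
defmodule IsolationSketch do
  # Hypothetical module name, used only for illustration.
  def run do
    parent = self()
    list = [1, 2, 3]

    spawn(fn ->
      # "Modifying" the list builds a new list. The parent's `list`
      # cannot be changed from this process.
      updated = [0 | list]

      # The result reaches the parent only through a message.
      send(parent, {:updated, updated})
    end)

    receive do
      {:updated, updated} -> {list, updated}
    after
      1_000 -> :timeout
    end
  end
end

IO.inspect(IsolationSketch.run())
# Prints {[1, 2, 3], [0, 1, 2, 3]}
```

The parent's list is unchanged after the child process runs, because nothing in the language lets one process write to another process's data.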

Erlang was created in 1986, designed for the real-world problem of building always-online telecommunication systems. Elixir was created in 2011, and has seen extraordinary adoption because it applies the decades of engineering wisdom from Erlang to help developers build high-uptime, low-latency applications in a variety of domains, one of which is web applications. Race conditions are still possible in Elixir and Erlang, but data races are not. The creator of Elixir, José Valim, mentions data races in a recent blog post, My Future with Elixir: set-theoretic types.

For example, Rust’s type system helps prevent bugs such as deallocating memory twice, dangling pointers, data races in threads, and more. But adding such type system to Elixir would be unproductive because those are not bugs that we run into in the first place, as those properties are guaranteed by the garbage collector and the Erlang runtime.

On March 8, 2021, all users of GitHub were logged out due to a security vulnerability related to thread safety in Ruby on Rails. These complex, critical security bugs are lurking in web applications everywhere, from banking to healthcare. Elixir and the Phoenix framework are modern, high-quality tools for building web applications that completely solve the thread safety problem. Every business that uses Elixir in production today enjoys this benefit.

Further Reading

Elixir is Safe by Nathan Long

Finding Race Conditions in Erlang with QuickCheck and PULSE