Elixir is (Still) Safe

Michael Lubas, 2023-07-24

In March 2021 Nathan Long published Elixir is Safe, a post about the memory and thread safety benefits of using Elixir. It is an excellent article for programmers and executives who want to understand the security benefits of the language. In July 2023 I googled “Elixir is Safe”, and the first result was a snippet from the paper “Vision for a Secure Elixir Ecosystem: An Empirical Study of Vulnerabilities in Elixir Programs”, which was published by the ACM in April 2022.

Several of the paper’s authors are students, and this post is not meant to criticize them as researchers. The paper is written as a reaction to Nathan’s article, which accurately describes the security benefits of Elixir. The paper’s claims are misleading, and the fact that it currently ranks above the original article on Google, and was published in a reputable journal, warrants a response.

“Practitioners perceive Elixir to be a ‘safe’ language, as the language allows practitioners to write fast software programs without introducing vulnerabilities, unlike other languages, such as C[16]. Positive perception of practitioners about the safety and security Elixir programs is subject to empirical validation. Practitioner perceptions are formed through personal experience, and not based on empirical evidence [9]. A systematic empirical investigation that quantifies reported security vulnerabilities can shed light on the state of security of Elixir programs. Such empirical investigation can also yield recommendations for practitioners and researchers on how to securely develop Elixir programs.”

[16] Nathan Long. 2021. Elixir is Safe. https://dockyard.com/blog/2021/03/30/elixir-is-safe

Nobody thinks Elixir “allows practitioners to write fast software programs without introducing vulnerabilities”. The article “Elixir is Safe” never makes that claim. Rather, it points out that Elixir is a memory safe language, which C is not, and discusses the thread safety benefits of Elixir. Both topics are important. For example, Elixir completely eliminates security issues due to data races, a severe class of vulnerability that can lead to users being logged into the wrong account, which would be a disaster for banking and medical portals.
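To make the thread safety point concrete, here is a minimal sketch using only the standard library. The module name IsolationSketch and the session map are hypothetical, chosen for illustration. It shows that a spawned process works on its own copy of immutable data, so "changing" the map in the child cannot alter what the parent sees.

```elixir
defmodule IsolationSketch do
  # Hypothetical module for illustration; not taken from Nathan's article.
  def run do
    session = %{user_id: 1}
    parent = self()

    # The closure's data is copied into the new process. Map.put/3 returns
    # a new map and leaves the original untouched.
    spawn(fn ->
      changed = Map.put(session, :user_id, 2)
      send(parent, {:child_saw, changed})
    end)

    # Wait for the child's message, then return both values side by side.
    receive do
      {:child_saw, changed} -> {session, changed}
    after
      1_000 -> :timeout
    end
  end
end
```

Calling IsolationSketch.run() returns the parent's unchanged map together with the child's modified copy. Processes exchange data by sending messages rather than by writing to shared memory.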

An empirical study showing that Elixir programs contain vulnerabilities is not surprising, because software written in every language has them. Imagine Elixir as a type of asphalt that reduces the rate of potholes in roads, and Nathan’s article as an explanation of this benefit. Then a paper is published saying, “roads paved with Elixir still have potholes”, ignoring that Elixir results in fewer potholes. The paper does not mention memory or thread safety at all, which are the specific security benefits Nathan discusses. Rather, it surveys popular open-source Elixir projects for vulnerability-related commits.

Which repos were selected for the dataset? The paper does not list them, so below is a list pulled from the dataset, generated with the Unix command awk -F "," '{print $1}' ELIXIR_FINAL_SECURITY_BUG_DATASET.csv | sort | uniq.

1. 30-days-of-elixir
2. absinthe
3. analytics
4. awesome-elixir
5. credo
6. curlconverter
7. distillery
8. edeliver
9. elixir
10. elixir-koans
11. guardian
12. httpoison
13. papercups
14. phoenix
15. phoenix-trello
16. poison
17. quantum-core
18. realtime
19. rustler

It is surprising to only see 19 repos, when the paper says 25. The inclusion of 30-days-of-elixir, awesome-elixir, and elixir-koans is debatable, considering these repos do not store projects intended to be run in production. The counts for “Elixir Files” and “Elixir-related Commits” in the dataset do match the paper.

Although these counts match the dataset, the paper’s definition of “program” is misleading. For example, commit 2d68e9c maps to two files in the dataset:

absinthe/test/absinthe/phase/document/complexity_test.exs
absinthe/lib/absinthe/phase/document/complexity/result.ex

This commit can be viewed on GitHub. Each of these files is counted as a separate “program” in the paper. It is misleading to say the dataset has “319 programs with vulnerability related commits”, because the paper arrived at that number by counting files per commit classified as vulnerable.
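The difference between the two ways of counting can be sketched as follows. The only column layout the awk command confirms is that the repo name comes first, so the repo,commit,file layout and the CountSketch module are assumptions made for illustration. The two rows are the absinthe files listed above.

```elixir
defmodule CountSketch do
  # Assumed layout: "repo,commit,file". Only the first column (repo) is
  # confirmed by the awk command; the other two columns are hypothetical.
  @rows """
  absinthe,2d68e9c,absinthe/test/absinthe/phase/document/complexity_test.exs
  absinthe,2d68e9c,absinthe/lib/absinthe/phase/document/complexity/result.ex
  """

  # Split the sample into one [repo, commit, file] list per row.
  def parsed do
    @rows
    |> String.split("\n", trim: true)
    |> Enum.map(&String.split(&1, ","))
  end

  # Counting one "program" per file row, as the paper appears to do.
  def file_rows, do: length(parsed())

  # Counting each {repo, commit} pair only once.
  def distinct_commits do
    parsed()
    |> Enum.map(fn [repo, commit, _file] -> {repo, commit} end)
    |> Enum.uniq()
    |> length()
  end
end
```

For this sample, file_rows/0 gives 2 while distinct_commits/0 gives 1: a single commit that touches a test file and a source file is reported as two “programs”.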

In the paper’s dataset, the repo for Plausible Analytics is included, with commit f7b37fe. You can view this commit on GitHub, but it belongs to a PR with 32 changed files, so the commit alone is not enough information to determine which vulnerability the dataset is referring to. The paper mentions that commits were found via a keyword search for terms often used to describe vulnerabilities, one of which is “sql”.

lib/plausible_web/controllers/api/external_controller.ex

postgres_health =
  case Ecto.Adapters.SQL.query(Plausible.Repo, "SELECT 1", []) do
    {:ok, _} -> "ok"
    e -> "error: #{inspect(e)}"
  end

Sobelow will flag this code as vulnerable, even though it is not, because no user input is being passed to the SQL query.
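For contrast, here is a rough sketch of the shape a real SQL injection risk takes. The QuerySketch module, the users table, and both helper functions are hypothetical and do not appear in Plausible's code. Each function only builds the query text and parameter list that would be handed to Ecto.Adapters.SQL.query/3, so the sketch runs without a database.

```elixir
defmodule QuerySketch do
  # Hypothetical helpers for illustration; not part of Plausible.

  # Risky shape: user input becomes part of the SQL text itself.
  def interpolated(user_input) do
    {"SELECT * FROM users WHERE name = '#{user_input}'", []}
  end

  # Safer shape: the SQL text is constant and the input travels separately
  # as a parameter ($1 is the PostgreSQL placeholder syntax).
  def parameterized(user_input) do
    {"SELECT * FROM users WHERE name = $1", [user_input]}
  end
end
```

With an input such as x' OR '1'='1, the first function produces query text that contains the attacker's OR clause, while the second leaves the query text unchanged. The health check above is closer to the second case, and safer still, because its query text is a constant and it takes no input at all.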

lib/plausible_release.ex

  defp do_create_ch_db() do
    db_to_create = Keyword.get(Application.get_env(:plausible, :clickhouse), :database)
    IO.puts("create #{inspect(db_to_create)} clickhouse database/tables if it doesn't exist")
    Clickhousex.query(:clickhouse, "CREATE DATABASE IF NOT EXISTS #{db_to_create}", [])
  end

The string interpolation in a ClickHouse DB query looks suspicious, but the db_to_create value comes from application configuration rather than user input, so there is no vulnerability. The keywords “security” and “vulnerability” also appear in the PR, in these sub-commits:

b3f783e - Adds a container security tool, not related to Elixir.

57171f8 - The text of this comment says, “fixing some vulnerabilities identified by the scanning tools”. However, it involved a Debian upgrade, not related to Elixir.

The paper counts f7b37fe as “28 programs with vulnerability related commits” because it involves 28 Elixir source code files. That is not correct. The lack of clarity around the vulnerable code in the dataset is concerning: it makes it impossible to truly replicate the paper’s findings, because you have to guess which specific lines the dataset considers vulnerable. The dataset contains no labels for the type of security issue found (DoS, XSS, CSRF, etc.), no information on the severity of each issue, and no discussion of how an attacker would use each vulnerability. This limits its utility in discussing the security of Elixir applications.

It is highly unlikely that the dataset from this paper includes vulnerabilities related to thread or memory safety, which are the main points Nathan correctly makes in his article Elixir is Safe.
