Elixir Security: Real World Sobelow

Michael Lubas, 2023-10-04

Sobelow is the best static code analysis security tool for Elixir and Phoenix. If you are using Elixir in production today, it is highly recommended that all code is scanned with Sobelow, because it can detect critical vulnerabilities that lead to data breaches. This check is a requirement in many regulated industries, including finance and healthcare. Getting started with Sobelow is easy, but using it effectively requires some work. Your first Sobelow scan will likely result in a high number of findings, some of which are false positives. Classifying all these findings is a significant project, daunting enough to stop many businesses from taking full advantage of Sobelow’s power.

The goal of this article is to provide a guide for effectively using Sobelow. It covers project planning, triaging findings, and handling false positives. Academic and business writing tends to use “we”, but I’m going to use “I” in this article because I have very personal experience with Sobelow. I’ve used it while working as a security engineer, and contributed to the project. The company I started, Paraxial.io, helps people use Sobelow effectively. The adoption of Elixir at many companies was made possible by Sobelow. Without static security analysis, hundreds of Elixir jobs would not exist. It’s a fantastic security tool, and I hope my experience will help you use it effectively.

1. Sobelow Project Planning

2. Understanding the Security Model

3. The First Sobelow Scan

4. Triaging Findings

5. False Positives

6. Limitations

7. Paraxial.io and Sobelow

8. Appendix A - Learning Web Security

1. Sobelow Project Planning

Every business running Elixir in production should be using Sobelow. What does “using Sobelow” mean though? It could be:

Sobelow is in the mix.exs file and was run one time, several years ago.
During the annual pentest engagement, the auditor uses Sobelow to search for vulnerabilities.
Every code change going into production must result in a Sobelow scan with zero findings. This process is automated in CI/CD on all code changes.

There is clearly a spectrum of maturity here. The final example, where Sobelow is automatically scanning all new code, is the ideal state for most businesses. Getting there is not an easy task. The project involves:

Running Sobelow for the first time.
Triaging all the initial findings. This requires a significant amount of work that must be performed by an experienced developer who also understands security. Many projects fail here.
Documenting true and false positives. This assumes a standard process for tracking security issues already exists. If it does not, that's more scope creep.
Putting Sobelow into the CI/CD pipeline. The code change is not difficult, compared to what comes next.
When Sobelow emits several new findings on a new branch, how are those findings triaged? Who is responsible for fixing the security issues? Is that process being followed?

Here’s a ranking of each task, from most work to least:

Documenting and fixing true positive vulnerabilities.
Triaging findings, documenting false positives.
Training developers on the correct procedures to handle Sobelow findings.
Installing and running Sobelow initially, then in CI/CD.

Why go through all this trouble? Your Elixir application may contain a number of security vulnerabilities. For example, the 2017 Equifax data breach started with a remote code execution (RCE) vulnerability in Apache Struts, which is to Java what Phoenix is to Elixir. Equifax's settlement over the breach included up to $425 million to help the people affected, so you can see why executives may be interested in the security of your Elixir code. Sobelow detects several types of vulnerabilities, including RCE.

2. Understanding the Security Model

When you run Sobelow, it will report findings such as “Your application is vulnerable to SQL injection”. If you are familiar with this type of vulnerability, you understand the very serious security incident it can lead to. To illustrate what a security model is, consider two Elixir applications:

SecurityCamAdmin - An Elixir application for managing security cameras. Created by a hobbyist, there is no commercial offering. Intended to be hosted on a private network, it can be exposed to the public internet with an authentication prompt. New users cannot sign up; accounts must be created by the admin.

BlackCatProjects - An Elixir application for project management, open sourced by the company that created it. Anyone on the internet can create an account, because each new signup is a potential customer.

Now consider a SQL injection vulnerability where the attacker must have a valid account to exploit it. For SecurityCamAdmin, an attacker scanning the public internet cannot register a new account. The attacker could attempt default credential pairs, such as “admin/admin”, then launch the attack from the compromised account. Most real world instances would not be vulnerable. For BlackCatProjects, an attacker is free to create an account, read the public source code, figure out the SQL injection issue, and hack the company. A much more severe problem!

Sobelow does not know any of this. It takes Elixir source code as input, and produces a list of findings as output. It does not have any context on how the application is hosted, or how a real attacker would exploit the vulnerability. Understanding the context of how the code is being used is necessary to effectively use Sobelow. Here are some questions to assist with this process:

Is the source code public or private?
Can users from the public internet create an account?
How would a real attacker find and exploit this vulnerability?
Does the vulnerability require source code access to exploit?
If an attacker does exploit this vulnerability, what is the business impact?

3. The First Sobelow Scan

Add Sobelow to your project’s mix.exs file:

This is a common pattern for development tools. The dependency is declared with only: [:dev, :test], so Sobelow is installed in the “dev” and “test” environments but not in production. The scan is performed before the code is deployed, so including the dependency in production is unnecessary, and excluding it saves space and compile time. runtime: false means Sobelow will not be started alongside your application. Sobelow analyzes the source code when you run the mix sobelow task, so starting the dependency at runtime would be pointless.
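For reference, the dependency entry described above typically looks something like the following sketch. The version requirement is an assumption; check Hex for the current release before copying it.

```elixir
# mix.exs (fragment)
defp deps do
  [
    # Version requirement is an assumption; use the latest release from Hex.
    # only: [:dev, :test] keeps Sobelow out of production builds.
    # runtime: false keeps it from being started with your application.
    {:sobelow, "~> 0.13", only: [:dev, :test], runtime: false}
  ]
end
```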

Run mix deps.get then mix sobelow:

Now that Sobelow has scanned your code, the real work begins.

4. Triaging Findings

Sobelow has 31 different finding types. I have published a guide, with context about each finding, on the Paraxial.io GitHub. Ideally, if you are tasked with triaging Sobelow findings, you have some background in web application security. You should be familiar with concepts including CSRF, XSS, and RCE. If these acronyms mean nothing to you, there is significantly more work ahead of you. To effectively classify a finding as a true or false positive, you need to understand the vulnerability being reported.

Potion Shop is an intentionally vulnerable Elixir application, where users can register for an account and leave reviews on different types of Potions. Installing and running Sobelow will report the following:

mix sobelow -v (Verbose output, will print relevant code with a finding)

Config.CSRF: Missing CSRF Protections - High Confidence
File: lib/carafe_web/router.ex
Pipeline: browser_auth
Line: 16

pipeline(:browser_auth) do
  plug(:accepts, ["html"])
  plug(:fetch_session)
  plug(:fetch_live_flash)
  plug(:put_root_layout, {CarafeWeb.LayoutView, :root})
  plug(:put_secure_browser_headers)
  plug(:fetch_current_user)
end

Is this enough information to confirm the finding is a true or false positive? No, because we cannot tell if the :browser_auth pipeline is actually being used. Triaging this finding requires the following:

The person investigating should be familiar with cross site request forgery (CSRF) attacks, understanding how a malicious website can trigger a POST request, and how an attacker would use this vulnerability. CSRF is a difficult vulnerability to understand. If my first introduction to this concept were the Sobelow findings guide, I would be totally lost. It takes time and experimentation to understand this concept.
An understanding of how Potion Shop works. While not a real e-commerce site, using the web interface it should be clear that userA should not be able to post a review from userB's account.
An understanding of Potion Shop's source code. The POST request going to the create_review action in PotionController is the true source of this vulnerability.

Would a scope with only GET requests still be considered vulnerable? Maybe, if those GET requests resulted in state changing actions. GET requests should not be used for state changing actions, because Plug's CSRF protection only checks the token on methods such as POST, PUT, PATCH, and DELETE, so a state changing GET request is never verified. The state changing POST request, which transfers money, creates a review, or adds an admin user, is the key to understanding what CSRF means.

Read the Sobelow Guide for UID 5, Config.CSRF: Missing CSRF Protections for additional context and how to fix this issue. If you have never heard the acronym CSRF before reading this article, you probably feel lost right now. This is not a good introduction to CSRF, and my intent here is not to teach the concept from scratch. Rather, it is to show why experience with these vulnerabilities is necessary to use Sobelow effectively. If you would like to learn more about web security for Elixir, see Appendix A for recommendations.
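As a rough sketch of the kind of fix involved, assuming the :browser_auth pipeline really does serve state changing requests, Phoenix's built-in protect_from_forgery plug can be added to the pipeline. Its placement after fetch_session mirrors the default Phoenix :browser pipeline; whether this is exactly how Potion Shop should be fixed is an assumption.

```elixir
# lib/carafe_web/router.ex (fragment, inside the router module)
pipeline :browser_auth do
  plug :accepts, ["html"]
  plug :fetch_session
  plug :fetch_live_flash
  plug :put_root_layout, {CarafeWeb.LayoutView, :root}
  # Added: rejects state changing requests that lack a valid CSRF token.
  # It reads the token from the session, so it must come after :fetch_session.
  plug :protect_from_forgery
  plug :put_secure_browser_headers
  plug :fetch_current_user
end
```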

Potion Shop has several security issues that Sobelow detects, and the majority of its findings are true positives. In a real world Elixir project many findings will be false positives. Changelog is a developer-focused media company that hosts several podcasts at changelog.com. The CMS for this site is an open source Elixir application. Installing and running Sobelow results in 85 findings:

@ changelog.com % mix sobelow --quiet      
Sobelow: 85 findings found. Run again without --quiet to review findings.

To determine if any of these findings are valid, we first must understand the security model of changelog.com. Anyone on the internet can create an account and use the site, but only privileged users can create new posts. The worst type of vulnerability is one that does not require user interaction, for example an SQL injection that can be triggered by a public attacker. A less severe example would be the same issue, SQL injection, but the attacker must first authenticate as an admin.

There are also vulnerabilities that require attacking a live user session, for example XSS. A stored XSS attack here would be high severity, if the page where the vulnerability exists is viewed by most users. A reflected XSS vulnerability, where the victim has to click a link created by the attacker, requires that additional step, reducing the impact. There are frameworks and checklists for assigning severity to a vulnerability, but none of that is useful if your understanding of the application is wrong.

Consider the finding:

SQL.Query: SQL injection - Low Confidence
File: lib/changelog/schema/news/news_item.ex
Line: 656
Function: query_recommendations:653
Variable: query

The source of this finding is Changelog.Repo.query/1, however there is not enough information to triage the finding. Searching for query_recommendations shows where the function is used.

When you see a SQL statement in code, that’s a good sign. We can now confidently say this function is not vulnerable to SQL injection. This might be surprising, since we did not even check if user input is being passed into recommendation_query. Even if an attacker can get malicious data into this query, it will be parameterized safely. Notice the “$1” and “$2” placeholders in the queries. The values are sent to the database separately from the SQL text, so malicious input is treated as data and cannot change the structure of the query. From the Sobelow findings guide:

Not safe:
Ecto.Adapters.SQL.query(Repo, "SELECT * FROM potions WHERE name = #{user_input}")

Not safe:
Ecto.Adapters.SQL.query(Repo, "SELECT * FROM potions WHERE name = " <> user_input)

Safe:
Ecto.Adapters.SQL.query(Repo, "SELECT * FROM potions WHERE name = $1", [user_input])

The SQL code is safe, so this finding is a false positive. Consider a different scenario, where the code uses the unsafe string interpolation. Then you would have to trace the flow of data to the query, and determine if the values can be set by user input. This takes more time. In our example the real query arguments are both integers, so the code is safe.
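To make the difference concrete, here is a small self-contained sketch that builds the query text without touching a database. The QueryDemo module and the quoted string column are illustrative assumptions, not code from Changelog or Potion Shop.

```elixir
defmodule QueryDemo do
  @moduledoc "Illustration only: builds query text without touching a database."

  # Unsafe: the input is spliced into the SQL text, so it can change
  # the structure of the query.
  def unsafe(user_input) do
    {"SELECT * FROM potions WHERE name = '#{user_input}'", []}
  end

  # Safe: the SQL text is constant and the input travels separately
  # as a parameter.
  def safe(user_input) do
    {"SELECT * FROM potions WHERE name = $1", [user_input]}
  end
end

malicious = "x' OR '1'='1"

{unsafe_sql, _params} = QueryDemo.unsafe(malicious)
{safe_sql, safe_params} = QueryDemo.safe(malicious)

# The attacker's quote characters have rewritten the WHERE clause.
IO.puts(unsafe_sql)

# The SQL text is unchanged; the input is only ever a value.
IO.puts(safe_sql)
IO.inspect(safe_params)
```

With the interpolated version the attacker controls part of the SQL itself. With the parameterized version the same input can only ever be compared against the name column.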

5. False Positives

The previous example of a Sobelow finding for SQL injection is known as a false positive. Sobelow has several options for filtering out false positives:

1. The `sobelow_skip` Comment

To view only the SQL injection finding, run:

@ changelog.com % mix sobelow --compact -i XSS,Config,Traversal,DOS --skip
[+] SQL.Query: SQL injection - lib/changelog/schema/news/news_item.ex:656

The -i flag ignores the provided finding types, and --skip does nothing right now, because no findings have been marked as false positive. The following comment will cause Sobelow to ignore the finding:
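A sketch of what that annotation looks like, following the format documented in the Sobelow README: a comment naming the finding type, placed directly above the function definition. The module name and function body here are placeholders, not the actual Changelog source.

```elixir
defmodule NewsItemSketch do
  # sobelow_skip ["SQL.Query"]
  def query_recommendations(query) do
    # Placeholder body; the real function lives in
    # lib/changelog/schema/news/news_item.ex.
    Changelog.Repo.query(query)
  end
end
```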

@ changelog.com % mix sobelow --compact -i XSS,Config,Traversal,DOS --skip
@ changelog.com %

2. The `--mark-skip-all` flag

From the documentation, “When integrating Sobelow into a new project, there can be a large number of false positives. To mark all printed findings as false positives, run sobelow with the --mark-skip-all flag.”

Running mix sobelow --mark-skip-all will produce a normal scan output, and create a new file, .sobelow-skips:

Each MD5 hash represents one finding. Note that when you mark a finding as false positive this has implications for future code changes.

With sobelow_skip, if the code of the function is changed, potentially introducing a vulnerability, Sobelow will skip over that function and not report a true positive.
With --mark-skip-all, if the code of the function is changed, Sobelow will scan it again, catching the true positive.

This illustrates why --mark-skip-all is preferable in some situations. As the documentation hints, getting started with Sobelow on a large codebase involves the following:

Get the initial findings
Classify each as true or false positive
Track fixing the true positives in your ticketing system
Mark all findings as false positive
Integrate Sobelow in your CI/CD pipeline, failing on new findings
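
One way to wire up the last step is a Mix alias that CI can call. This is a minimal sketch: the alias name sobelow.ci is hypothetical, and it assumes Sobelow's --skip flag (honor findings already marked as skipped) and --exit flag (return a non-zero exit status when findings at or above the given confidence remain).

```elixir
# mix.exs (fragment)
def project do
  [
    # ... existing project settings ...
    aliases: aliases()
  ]
end

defp aliases do
  [
    # Hypothetical alias name. Fails the build if any finding that has
    # not been marked as skipped is reported at low confidence or higher.
    "sobelow.ci": ["sobelow --skip --exit low"]
  ]
end
```

The CI job would then run mix sobelow.ci in an environment where Sobelow is installed, and treat a non-zero exit status as a failed build.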

6. Limitations

Do not use Sobelow to check for vulnerable dependencies. The README mentions it can check for known-vulnerable dependencies, however the data for these checks is hard-coded into the project. Instead, use MixAudit, which does the same thing, but uses the actively updated GitHub Advisory Database.

Sobelow only scans the top level code in an Elixir project. It does not scan the source code of a project's dependencies. For example, the Paginator library before 1.0.0 was vulnerable to remote code execution. Before the vulnerability was reported, if you were using Paginator in your project, the only way to detect this issue was doing a security assessment on the source code of Paginator. Now that the vulnerability is public, MixAudit detects if you are using a vulnerable version.
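
Adding MixAudit follows the same pattern as Sobelow. The entry below is a sketch; the version requirement is an assumption, so check Hex for the current release.

```elixir
# mix.exs (fragment)
defp deps do
  [
    # Version requirement is an assumption; check Hex for the current release.
    {:mix_audit, "~> 2.1", only: [:dev, :test], runtime: false}
  ]
end
```

After mix deps.get, running mix deps.audit checks the project's dependencies against known advisories.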

Why not run Sobelow against all the dependencies in a project? It could detect a serious security problem lurking a layer beneath the top level code. I’ve looked into this myself, and believe there’s some potential for this area of research.

7. Paraxial.io and Sobelow

Application Secure

So far all the software discussed is open source. Paraxial.io is an application security platform for Elixir, with an offering that provides metrics on the security of Elixir and Phoenix code. Leadership often has questions about application security:

How many security scans did we run last month? Last year?
How many vulnerabilities are currently outstanding?
Do we have a record of each scan and all findings going back one month? One year?
If we have a sensitive question about Elixir security, which cannot be posted publicly, how do we get help?

A pattern emerges across many companies: a dashboard and record keeping system for security scans. There is the option to build it yourself, uploading scan results to an S3 bucket and building out a dashboard in the metrics platform of your choice.

The Paraxial.io Application Secure product manages all this for you.

The guide to fixing Sobelow findings I wrote is inlined with each result:

This is just one feature of the Paraxial.io Application Secure product. If your company has questions about Elixir security, advice on how to triage and fix findings reported by Paraxial.io is available through a consulting package.

Elixir Developer Security Training

How long does it take to correctly triage and fix a Sobelow finding? It depends how familiar the person doing the work is with Elixir and web security. Paraxial.io offers Elixir Developer Security training, with small, private classes for businesses using Elixir. If you’re an individual, and your work cannot schedule a private class, I usually do an Elixir security training for the relevant conferences (ElixirConf, Code BEAM). Subscribing to the Paraxial.io mailing list is the best way to get the dates.

The training uses the open source vulnerable Elixir application Potion Shop, with interactive labs to help students understand the root cause and context of common security issues. Some companies do secure code training via PowerPoint, with examples in a completely different programming language. I’m very glad to be running a course on security using Elixir.

Security Consulting

You may want to contract out several types of Elixir security projects, including:

Triaging Sobelow findings
Secure Code Review
Penetration testing
SOC 2, HIPAA, and ISO 27001 compliance work

Paraxial.io also offers security consulting. Client testimonials can be found on the services page.

8. Appendix A - Learning Web Security

There are two domains of knowledge relevant here:

General web security topics: vulnerabilities (XSS, RCE, etc), access control, cryptography
Elixir specific web security topics: binary_to_term, action reuse CSRF, raw for HTML

For general web security, “The Web Application Hacker’s Handbook, 2nd edition” is my favorite book. The author and creator of Burp Suite, Dafydd Stuttard, said in a 2020 interview there will not be a 3rd edition, and instead recommends the PortSwigger Web Security Academy. The labs are free, constantly updated with new vulnerabilities, and are where I would start from scratch today.

For learning more about Elixir security, here are the most relevant resources:

