Extracting data from insecure Elasticsearch templates

Search functionality has gone from a cool feature to a commodity – everyone expects search to be there and just work. At the same time, the infrastructures, data models, and back-ends behind the friendly search bar can now be incredibly complex, requiring novel approaches that go far beyond sending a query to a database server. Elasticsearch was developed to meet that need and is now the world’s most popular enterprise search engine, so knowing how to use it properly is critical for data security – especially as it’s very easy to get this wrong.

Before we get into some examples of insecure usage, let’s take a step back to see how we got where we are today with Elasticsearch.

A silent paradigm shift in web development

Before the cloud era, web development was a lot less specialized. Sure, we had our fair share of frameworks that required specific knowledge to build a functioning application, but their job was mostly to abstract away all of the tedious work that came with building a web application, such as session, cookie, or user management. In practice, you could treat these solutions as extensions of existing programming languages, so learning them was a matter of remembering a few new functions and patterns.

The last few years, however, have seen a major (if silent) shift. While previous frameworks and technologies were mostly designed to make a developer’s job easier, more recent projects are increasingly focused on enabling programmers to write applications that are more technically robust and advanced under the hood. This is especially true for highly distributed systems where you need completely different approaches to ensure scalability, reliability, and performance.

All this was not really surprising given the success and expansion of cloud-based systems, especially with big technology companies releasing their highly specialized tooling for anyone to use. One motive was to attract an open-source community that would build upon – and of course improve – the technologies they were using for their own products.

Enter Elasticsearch

In a distributed world, many of these new technologies had to be dramatically different from what was used before, forcing a change to the way we write our applications. Elasticsearch is a good example of a technology that seems familiar but represents a completely different approach to searching data. Even though it’s been around for quite a while, its usefulness for all sorts of enterprise applications was not immediately apparent to a wider audience. Before, you would have mostly used a traditional SQL database, such as MySQL – or Microsoft SQL Server, as a lot of our readers are painfully aware.

But Elasticsearch had several features that would make it a lot more popular among developers, such as the fact that it sports a JSON-based API – a feature that anyone who’s ever had to write an SQL query of slightly above-average complexity would happily pay a large amount of money for. Another reason good old SQL queries fell a bit out of favor is that they have historically been a leading cause of unauthorized surprise backups. Clearly, there was a need for alternatives.

What are search templates?

The alternative we want to talk about today is search templates, specifically Mustache templates as used in Elasticsearch (Mustache itself has implementations for all major programming languages). Imagine you want to provide search functionality on your website, for example to search for blog posts. Specifically, you want to give users the ability to filter results by their own criteria, such as the ID of the post. Here’s one example of such a template:

{
    "script": {
        "lang": "mustache",
        "source": {
            "query": {
                "match": {
                    "postID": {
                        "query": "{{ID}}"
                    }
                }
            }
        }
    }
}

As you can see, we have specified our search criteria within the template and allow for variable user input between double braces. The template is conveniently written in JSON format and matches the postID field against an ID parameter that we want to get from the user. The {{ID}} placeholder will be expanded to reflect actual user input and automatically sanitized, so we can embed user input in the template knowing we are relatively safe from injection – that is, if the template is written correctly.

Mustache equals triple trouble

As you can see in the example above, a simple placeholder in a Mustache template uses double braces, so something like {{username}}. When the template is processed, everything will be automatically sanitized to not interfere with the syntax of whatever you’re putting the placeholder in.

So far, so good – but according to the Mustache documentation, this is not the only kind of placeholder:

All variables are HTML escaped by default. If you want to return unescaped HTML, use the triple mustache: {{{name}}}.

Depending on what templating system you are already familiar with and what you were using it for, you may be more accustomed to using such triple brace syntax by default, going for {{{username}}} instead of {{username}}. And that will work equally well for something like a standard username, so you might not notice the catch. The docs say you are now returning unescaped HTML, in effect using the raw input value within the template before it is passed to Elasticsearch for evaluation.
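
To make the difference concrete, here is a minimal Python sketch of the two placeholder styles. The render function is a hypothetical toy written for this illustration, not Elasticsearch’s Mustache engine, and it assumes that escaping means JSON string escaping, which is what a value embedded in a JSON query needs.

```python
import json
import re

# Toy renderer for illustration only; NOT Elasticsearch's Mustache engine.
# Assumption: "escaped" means JSON string escaping of the parameter value.
PLACEHOLDER = re.compile(r"\{\{\{\s*(\w+)\s*\}\}\}|\{\{\s*(\w+)\s*\}\}")


def render(template: str, params: dict) -> str:
    """Expand {{name}} (escaped) and {{{name}}} (raw) placeholders."""

    def substitute(match):
        raw_name, escaped_name = match.group(1), match.group(2)
        if raw_name is not None:
            # Triple braces: the value is inserted exactly as supplied.
            return str(params[raw_name])
        # Double braces: escape quotes and backslashes, then drop the
        # surrounding quotes that json.dumps adds.
        return json.dumps(str(params[escaped_name]))[1:-1]

    return PLACEHOLDER.sub(substitute, template)


value = 'Marley" and more'
safe = render('{"query": "{{name}}"}', {"name": value})
unsafe = render('{"query": "{{{name}}}"}', {"name": value})

print(safe)    # the quote inside the value is escaped, the JSON stays valid
print(unsafe)  # the quote ends the string early, the JSON structure changes
```

For a harmless value such as Marley both calls produce the same output, which is why the mistake is easy to miss in testing.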

Unsanitized user input being used directly in data queries? Sounds like a classic injection vulnerability. Let’s explore.

A vulnerable code example

Before diving into a vulnerable query, let’s define the data we’re querying. Say we have an index called kitties containing documents with name and age fields. Here’s what a search of the index returns, with just two documents:

{
    ...

    "hits": [
        {
            "_index": "kitties",
            "_id": "2",
            "_score": 1.0,
            "_source": {
                "name": "Mila",
                "age": 3
            }
        },
        {
            "_index": "kitties",
            "_id": "1",
            "_score": 1.0,
            "_source": {
                "name": "Marley",
                "age": 2
            }
        }
    ]

    ...
}

If we can find a vulnerable template somewhere that uses the triple brace syntax to search this index, we may be able to return all of the documents – even if we don’t know a single name to put in the query. A vulnerable search template might look something like this:

POST /_scripts/vuln-search-template HTTP/1.1
Host: example.com
Content-Length: 263

...

{
    "script": {
        "lang": "mustache",
        "source": {
            "query": {
                "match": {
                    "name": {
                        "query": "{{{name}}}"
                    }
                }
            }
        }
    }
}

You can see the {{{name}}} placeholder here. When a search uses this template, the placeholder is replaced with the supplied name value before the query is executed. In theory, the name in the index must match the input for a document to be returned. If we pass the name Marley as below, Elasticsearch will return the document with that feline’s name and age:

POST /kitties/_search/template HTTP/1.1
Host: example.com

...

{
  "id": "vuln-search-template",
  "params": {
    "name": "Marley"
  }
}

But now for the catch: the placeholder uses triple brace syntax, so it is not escaped or sanitized. This means we can easily change the meaning of the JSON query. To start with, we can break out of the string that contains {{{name}}} with just a double quote character, which leaves the query text empty. By then injecting "zero_terms_query":"all" into the query, we tell Elasticsearch to match every document when the query text yields no terms, effectively turning this into a match-all query (the equivalent of the old 'or 1 = 1' trick for SQL injection). The closing braces at the end of the payload complete the JSON object early. Here is an example exploit:

POST /kitties/_search/template HTTP/1.1
Host: example.com

...

{
  "id": "vuln-search-template",
  "params": {
    "name": "\", \"zero_terms_query\":\"all\"}}}}"
  }
}

In practical terms, if an Elasticsearch template uses a triple brace placeholder and you know the placeholder name, sending a query like the one above could expose all data from that index – not good.
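
The effect of that payload can be reproduced offline. The following Python sketch substitutes the raw value into a string form of the template and parses the result. It rests on two assumptions that are not stated in the text above: that the template source is handled as a string at render time, and that the JSON parser stops after the first complete object and ignores any leftover characters (modelled here with raw_decode). The render_unescaped helper is hypothetical and only mimics triple-brace substitution.

```python
import json

# String form of the vulnerable template source from the example above.
TEMPLATE_SOURCE = '{"query": {"match": {"name": {"query": "{{{name}}}"}}}}'


def render_unescaped(source: str, params: dict) -> str:
    """Mimic triple-brace behaviour: paste each value in without escaping."""
    rendered = source
    for key, value in params.items():
        rendered = rendered.replace("{{{" + key + "}}}", value)
    return rendered


# The same payload as in the exploit request, after JSON decoding.
payload = '", "zero_terms_query":"all"}}}}'
rendered = render_unescaped(TEMPLATE_SOURCE, {"name": payload})

# Assumption: a lenient parser reads the first complete JSON value and
# ignores whatever follows it. raw_decode behaves that way.
query, end = json.JSONDecoder().raw_decode(rendered)
leftover = rendered[end:]

print(json.dumps(query, indent=2))
print("ignored tail:", leftover)
```

The parsed query now carries an empty query string plus the injected zero_terms_query option, and the original closing quote and braces are left over as an ignored tail.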

Doubling down on search template security

So where does this leave us? The problem is not really in the way templates are used. In fact, search templates are a powerful mechanism for sending repetitive queries with changing parameters, such as when using a search bar on a website. Used correctly, they also add a layer of security through automatic input encoding and sanitization.

The security takeaway here is that when working with templates in Elasticsearch, you need to be very careful to use the right placeholder syntax. Any search template that uses a triple brace placeholder is vulnerable to injection and could reveal your entire search index and data to attackers, so it’s a good idea to enforce the more secure double brace syntax – and also make sure you’re not running legacy or third-party code that uses the insecure version.
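
One way to enforce this is a simple check that scans template sources for unescaped placeholders before they are deployed. The sketch below is an assumption about how such a check might look, not an Elasticsearch feature; the function name is hypothetical. It also flags the ampersand form {{& name}}, which the Mustache specification treats as unescaped; whether your Elasticsearch version honours that form is worth verifying.

```python
import re

# Matches {{{name}}} and {{& name}}; both skip escaping in Mustache.
UNESCAPED = re.compile(r"\{\{\{\s*[\w.]+\s*\}\}\}|\{\{&\s*[\w.]+\s*\}\}")


def find_unescaped_placeholders(template_source: str) -> list:
    """Return every unescaped placeholder found in a template source string."""
    return [match.group(0) for match in UNESCAPED.finditer(template_source)]


vulnerable = '{"query": {"match": {"name": {"query": "{{{name}}}"}}}}'
safe = '{"query": {"match": {"postID": {"query": "{{ID}}"}}}}'

print(find_unescaped_placeholders(vulnerable))
print(find_unescaped_placeholders(safe))
```

A check like this could run in a code review hook or CI job over every stored search template, with each hit treated as a prompt for manual review.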
