Introduction to GraphQL API security

GraphQL is a data query and manipulation language for building APIs that is quickly gaining popularity. While it comes with built-in validation and type-checking, it also has its share of security shortcomings that attackers can exploit to access sensitive data.

Sven Morgenroth on Paul’s Security Weekly #714

Invicti security researcher Sven Morgenroth appeared on the Paul’s Security Weekly cybersecurity podcast to talk about GraphQL and show some of the security pitfalls of working with GraphQL APIs. Read on for an overview of GraphQL security.

GraphQL: The new kid on the web API block

In a REST-dominated world, GraphQL represents a very different approach to designing and querying APIs. Originally developed internally by Facebook in 2012 to fetch data more efficiently for its mobile applications, it marks a move from operation-driven to data-driven queries. Compared to working with a REST API, the ability to ask for and receive only a specific data set is a fundamental change, as is working with a single endpoint.

A typical REST API provides a separate endpoint for each operation it exposes, and its users are limited to calling one of the available endpoints with its supported parameters and working with whatever data it returns. If there is no endpoint that exactly fits your needs, it is up to you to extract information from the results that are available. This can add overhead in terms of performance and bandwidth as your application is forced to fetch more data than it needs only to discard some of it.

Calling a GraphQL endpoint, on the other hand, is like executing a database query: you specify what data you need from the schema and then get only matching results. Apart from the obvious benefit of getting data that is ready to work with, this precision can also greatly reduce bandwidth requirements. Schema definitions add validation and type checking capabilities to minimize data errors. And because the GraphQL endpoint merely passes queries to resolver functions and returns the results, you don’t need to change the API every time you need different data or want to modify a back-end service.

Defining and using GraphQL APIs

Whereas a REST API specification is a list of endpoints, a GraphQL API has only a single endpoint with a defined data schema. Similar to a database schema, a GraphQL schema specifies the names, types, and structure of the data fields that the API provides. Put another way, where a REST API tells you what operations are available, a GraphQL API tells you what data is available. Because you have to define the schema and keep it updated, the API is self-documenting, which is a great boon for developers, especially combined with automatic type checking for both data and arguments.

Borrowing an example from the Apollo docs (Apollo being a popular GraphQL platform), a very simple schema for a book catalog might be:

type Book {
  title: String
  author: Author
}

type Author {
  name: String
  books: [Book]
}

So we have books and authors as custom types. Each book has one title (a string) and one author (an Author object), while each author has one name (a string) and any number of books (Book objects). A REST API providing such information would likely have two separate endpoints with different URLs to return a list of books and a list of authors. For GraphQL, all queries are sent to the same URL, and the results depend only on the query structure. You can also provide inputs and arguments to get exactly the data your application needs.
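To make the single-endpoint model concrete, here is a minimal Python sketch of how a client might package a query against this schema. GraphQL requests over HTTP are conventionally sent as a JSON body with a `query` key and an optional `variables` key. The endpoint URL and the `books` root field are assumptions for illustration only, since the schema excerpt above does not show a query type.

```python
import json
from typing import Optional

# Hypothetical endpoint: every query, whatever its shape, goes to this one URL.
GRAPHQL_ENDPOINT = "https://api.example.com/graphql"

# Assumes the server exposes a `books` root field (not shown in the schema excerpt).
# The client asks only for each book's title and its author's name.
BOOKS_QUERY = """
query {
  books {
    title
    author {
      name
    }
  }
}
"""


def build_request_body(query: str, variables: Optional[dict] = None) -> str:
    """Serialize a GraphQL query into the JSON body that would be POSTed."""
    payload = {"query": query.strip()}
    if variables:
        payload["variables"] = variables
    return json.dumps(payload)


# The resulting string would be sent as the body of a POST to GRAPHQL_ENDPOINT.
body = build_request_body(BOOKS_QUERY)
```

Changing which fields come back means changing only the query text, not the URL or the API itself.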

Mapping out a GraphQL API for attack

To test or attack an API, you need to map out its attack surface. For a REST API, this is just a matter of finding the endpoints because once you know the endpoints, you have your target URLs. But simply knowing a GraphQL endpoint doesn’t give you much information about how to use it or test it – you need to know the schema so you can build valid queries.

To make development easier, GraphQL includes an introspection feature that allows designers and developers to browse the schema and build valid queries interactively. While extremely useful for development, introspection can also allow attackers to map out the schema without access to the full schema definition, so it’s a good idea to disable it for private APIs once in production (wherever possible).
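As a rough illustration, the sketch below shows a minimal introspection query and a crude text check for spotting such queries. `__schema` and `__type` are the introspection entry points defined by the GraphQL specification. The helper name `uses_introspection` is hypothetical, and a plain pattern match like this can be fooled (for example, by string arguments containing those names), so in practice it is better to rely on the introspection setting of your GraphQL server.

```python
import re

# A minimal introspection query: asks the server for the names of all schema types.
INTROSPECTION_QUERY = "{ __schema { types { name } } }"

# Matches the introspection entry points `__schema` and `__type`,
# but not the harmless `__typename` meta field.
_INTROSPECTION_FIELDS = re.compile(r"\b__(schema|type)\b")


def uses_introspection(query: str) -> bool:
    """Return True if the query text appears to request schema introspection."""
    return bool(_INTROSPECTION_FIELDS.search(query))
```

A check like this could be used for logging or alerting on introspection attempts against a production API, not as the only line of defense.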

For the Apollo platform, attackers can also abuse another convenience feature, namely field name suggestions. By default, when a query uses a non-existent field name, Apollo GraphQL returns an error message that suggests the nearest matching valid names. By carefully choosing input strings, attackers can abuse these suggestions to discover valid field names by brute force, even when introspection is disabled.
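To illustrate how much a single error message can give away, the following sketch pulls suggested field names out of an error string. The sample message is modeled on the "Did you mean" wording commonly returned by JavaScript-based GraphQL servers; treat the exact format, and the helper name `extract_suggested_fields`, as assumptions.

```python
import re
from typing import List

# Captures everything between "Did you mean" and the closing question mark.
_SUGGESTION = re.compile(r"Did you mean (.+?)\?")
# Captures each double-quoted name inside the suggestion text.
_QUOTED = re.compile(r'"([^"]+)"')


def extract_suggested_fields(error_message: str) -> List[str]:
    """Return the field names that an error message suggests, if any."""
    match = _SUGGESTION.search(error_message)
    if not match:
        return []
    return _QUOTED.findall(match.group(1))


# Assumed error text for a query that misspells `title` as `titel`.
SAMPLE_ERROR = 'Cannot query field "titel" on type "Book". Did you mean "title"?'
leaked = extract_suggested_fields(SAMPLE_ERROR)
```

An attacker can repeat this with many guessed names and collect the suggestions, which is why it is worth checking whether your server lets you suppress them in production.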

Attack vectors for GraphQL APIs

Because it is fundamentally different from REST APIs, GraphQL has its own security challenges on top of issues common to all APIs. As with any other API, you need to consider the risk of second-level injections into underlying interfaces or data sources. Implementing proper authentication and authorization is also a common issue, though GraphQL authorization adds some unique challenges. Another pitfall that is specific to GraphQL is the risk of denial of service through recursive queries.

Authentication and authorization

Securely controlling automated access to any API (and indirectly to the data behind it) is a daunting task. With GraphQL, authorization is especially tricky because there can be multiple ways of getting to the same data through different queries – so how do you ensure that only authorized API users can access a specific field?

Unless implemented centrally and consistently, access authorization can leave gaping holes in data security, especially when bolting a GraphQL layer on top of a REST API. This is often the case in the early stages of GraphQL adoption, where the new API is added to an existing application. If authentication and authorization are not coordinated for both APIs, gaps can appear that allow attackers to access sensitive data. This is why authenticated vulnerability scanning is such an important part of security testing.
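One way to keep authorization central and consistent is to attach the check to the field resolver itself, so that every query path leading to the field passes through the same gate. The sketch below is framework-independent Python; the resolver signature, the `roles` key in the request context, and the `email` field are all hypothetical.

```python
from functools import wraps


class AuthorizationError(Exception):
    """Raised when the caller lacks the role required for a field."""


def requires_role(role):
    """Decorator that guards a resolver with a role check on the request context."""
    def decorator(resolver):
        @wraps(resolver)
        def wrapper(parent, context, *args, **kwargs):
            # `context` is assumed to carry the authenticated caller's roles.
            if role not in context.get("roles", ()):
                raise AuthorizationError(
                    f"role '{role}' required for {resolver.__name__}"
                )
            return resolver(parent, context, *args, **kwargs)
        return wrapper
    return decorator


@requires_role("admin")
def resolve_author_email(parent, context):
    """Hypothetical resolver for a sensitive `email` field on Author."""
    return parent["email"]
```

Because the check lives with the field rather than with a particular query, it applies no matter how a client navigates to the author.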

Denial of service through recursion

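The freedom to rearrange and restructure queries in multiple ways is one of the biggest advantages of GraphQL, but it is also a potential attack vector. There is nothing to stop API users from crafting and sending recursive queries that request the same fields over and over again. With the book catalog schema, for example, a query can ask for a book’s author, then that author’s books, then each of those books’ authors, and so on. Without safeguards such as rate limiting or limits on query depth and complexity, this can lead to denial of service (DoS), especially with large data sets.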

While extremely easy to abuse with only a few lines of query text, this DoS attack vector is hard to block because recursive queries are valid GraphQL syntax, and there could be legitimate reasons to use them. This makes it difficult to tell a resource-intensive query from an attack attempt.
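One common mitigation is to cap how deeply a query may nest. The sketch below approximates nesting depth by counting braces outside string literals, using the book catalog schema for the sample query. It is a deliberately crude approximation: it does not resolve fragments and would also count braces in input object arguments, so a real server should measure depth on the parsed query instead. The limit of 4 is an arbitrary, hypothetical value.

```python
# Hypothetical limit; a sensible value depends on your schema and clients.
MAX_DEPTH = 4

# Books -> author -> books -> author -> books: valid syntax, ever more work.
NESTED_QUERY = "{ books { author { books { author { books { title } } } } } }"


def query_depth(query: str) -> int:
    """Approximate nesting depth by tracking braces outside string literals."""
    depth = 0
    max_depth = 0
    in_string = False
    prev = ""
    for ch in query:
        if ch == '"' and prev != "\\":
            # Toggle string state on unescaped double quotes.
            in_string = not in_string
        elif not in_string:
            if ch == "{":
                depth += 1
                max_depth = max(max_depth, depth)
            elif ch == "}":
                depth -= 1
        prev = ch
    return max_depth


def exceeds_depth_limit(query: str, limit: int = MAX_DEPTH) -> bool:
    """Return True if the query nests more deeply than the allowed limit."""
    return query_depth(query) > limit
```

A depth limit does not distinguish intent, so legitimate deep queries would also be rejected; the limit has to be chosen with real client usage in mind.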

Vulnerabilities in secondary contexts

The final attack avenue discussed here is not limited to GraphQL specifically but applies wherever you are passing requests across multiple layers. Imagine that you have a GraphQL API on top of a REST API that interacts with a back-end database. Without proper input sanitization, attackers may be able to target the underlying REST API by injecting, for example, a path traversal payload as a field value in a GraphQL query. As far as GraphQL is concerned, it would be a valid string value, but if inserted unquestioningly into the corresponding REST API call, that string may allow the attacker to navigate to a different endpoint to perform unauthorized operations.
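A minimal sketch of the defensive side, assuming a resolver that forwards a book ID from a GraphQL argument to an internal REST API: validate the value against a strict allowlist pattern and percent-encode it before building the URL. The base URL, the `/books/` path, and the ID format are hypothetical.

```python
import re
from urllib.parse import quote

# Hypothetical internal REST API that the GraphQL layer calls.
BACKEND_BASE_URL = "https://internal.example.com/api"

# Assumed ID format: short alphanumeric identifiers, nothing else.
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def build_book_url(book_id: str) -> str:
    """Build the back-end URL for a book, rejecting anything that is not a plain ID."""
    if not _ID_PATTERN.fullmatch(book_id):
        raise ValueError("invalid book id")
    # Encoding with safe="" also escapes "/" as defense in depth.
    return f"{BACKEND_BASE_URL}/books/{quote(book_id, safe='')}"
```

With this check in place, a value such as `../users/1` is a valid GraphQL string but is rejected before it ever reaches the REST call.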

Is GraphQL less secure than REST APIs?

All that being said, GraphQL is not inherently less secure than other web API architectures. In fact, compared to REST APIs, which rely on convention rather than any well-defined standard, GraphQL comes with built-in validation and type checking features, so it has at least some security out-of-the-box. However, as a relatively young technology, GraphQL is often bolted onto existing applications and interfaces, which increases complexity and the potential for vulnerabilities.

Probably the biggest risk comes from users trying to implement GraphQL interfaces without understanding how they differ from REST APIs. Apart from the security issues discussed above, this also applies to other areas, like usability and performance. For example, typical per-URL caching on the client side is great for REST APIs but won’t work for GraphQL because there’s only one endpoint. And while designing a REST API is mostly a matter of exposing all the endpoints that might be useful, defining a GraphQL schema is more like designing a database.

GraphQL API security checks in Invicti

Beyond having a robust set of security checks for REST APIs, Invicti can now also apply many of these checks to GraphQL APIs. To start testing an application via its GraphQL API, you simply import an existing GraphQL schema from a file or URL, specify the endpoint URL, and let the scanner get to work. If you import the schema from a URL, Invicti can track changes to the schema and always use the latest available definition for each scan.

Web application vulnerabilities that can be automatically detected by Invicti via GraphQL APIs include:

For more information, see our support page on scanning GraphQL APIs with Invicti.

Zbigniew Banach
