CSIPE

Building Secure APIs with GraphQL


Introduction

GraphQL has revolutionized the way APIs are designed, offering flexibility and efficiency by allowing clients to request exactly the data they need. However, its dynamic nature introduces distinct security challenges, such as excessive or deeply nested queries, unbounded query complexity, and unauthorized access. This article is a guide to building secure GraphQL APIs, focusing on techniques like validation, authentication, and authorization.

Unlike REST APIs, where you can add security controls at the router level and rely heavily on HTTP conventions, GraphQL requires you to think about security throughout your entire stack — from schema design and query validation, through resolver execution, all the way to transport configuration. The earlier you build security into your GraphQL architecture, the less costly it becomes to maintain.

Why Security Is Essential for GraphQL APIs

1. Dynamic Query Execution

GraphQL allows clients to construct their queries dynamically, increasing the risk of abuse through malicious or overly complex queries.

2. Fine-Grained Data Access

Clients choose exactly which fields to retrieve from each type. While this flexibility is powerful, it requires careful control to prevent unauthorized data exposure.

3. Increased Attack Surface

The schema, resolvers, and underlying systems all need to be secured to protect the API.

Architecture of a Secure GraphQL API

  1. Frontend Client:
  • Sends structured GraphQL queries.
  • Handles authentication tokens for API access.
  2. GraphQL Server:
  • Validates incoming queries.
  • Enforces authentication and authorization rules.
  • Optimizes query resolution to prevent over-fetching.
  3. Database:
  • Ensures secure storage and retrieval of data.
  4. Middleware:
  • Adds layers for rate limiting, query analysis, and logging.

Understanding GraphQL’s Execution Pipeline

To write effective security controls, you need to understand exactly what happens between the moment a client sends a query and the moment data leaves your server. GraphQL processes every request through a well-defined pipeline, and security controls must hook into specific stages of that pipeline to be effective.

Stage 1: Parsing

The GraphQL engine receives a raw string — the query text — and parses it into an Abstract Syntax Tree (AST). This stage is purely syntactic: the engine checks that the query is valid GraphQL syntax. Parsing happens before any schema validation, which means a malformed or deliberately oversized query string can consume CPU cycles just by being parsed. This is one reason why setting a request body size limit at the HTTP layer (before GraphQL ever sees the request) is important — it prevents degenerate inputs from reaching the parser at all.
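A request body limit can be sketched without any framework. The collector below is a minimal illustration, assuming a plain Node.js HTTP handler; `createBodyCollector`, `handleRequest`, and the 100 KB ceiling are hypothetical names and values chosen for this example. In an Express application, `express.json({ limit: '100kb' })` has the same effect.

```javascript
// Collects request body chunks and refuses to grow past maxBytes.
function createBodyCollector(maxBytes) {
	const chunks = []
	let received = 0
	return {
		push(chunk) {
			received += chunk.length
			if (received > maxBytes) {
				const err = new Error('Payload too large')
				err.statusCode = 413
				throw err
			}
			chunks.push(chunk)
		},
		text() {
			return Buffer.concat(chunks).toString('utf8')
		}
	}
}

// Usage inside a plain http handler: oversized bodies never reach the GraphQL parser.
function handleRequest(req, res, onBody) {
	const body = createBodyCollector(100 * 1024)
	req.on('data', (chunk) => {
		try {
			body.push(chunk)
		} catch (err) {
			res.writeHead(err.statusCode).end(err.message)
			req.destroy() // Stop reading the rest of the upload
		}
	})
	req.on('end', () => onBody(body.text()))
}
```

The limit is enforced while the body is still streaming in, so a degenerate input is cut off before it is fully buffered.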

Stage 2: Validation

After parsing, the engine validates the AST against your schema. This is where custom validation rules — including depth limits and cost analysis rules — are executed. If a query fails validation, the engine returns validation errors immediately without executing any resolvers. This is your first and cheapest line of defense against malformed or malicious queries, because it rejects bad requests before touching your database or any external service.

The validationRules option in Apollo Server accepts an array of rule functions that run at this stage. The NoSchemaIntrospectionCustomRule rule from graphql-js and the rule returned by graphql-depth-limit both operate here. Custom rules you write can inspect the query’s AST to enforce application-specific constraints, such as blocking queries that request certain field combinations.

Stage 3: Execution

Once the query passes validation, the engine begins executing resolvers. Starting from the root query or mutation fields, it traverses the selection set and calls the corresponding resolver function for each requested field. Fields of a query may be resolved in parallel, while the top-level fields of a mutation are executed serially, in the order they appear, so that writes do not race each other.

This is the stage where authorization decisions must be made. By the time a resolver runs, the query is already committed to execution — there is no way to roll it back without returning an error from the resolver itself. This is why inserting authorization checks inside each resolver (or via permission middleware that wraps resolvers) is so important. A resolver that returns data without checking whether the current user is allowed to access it has already leaked that data to the execution result.

Stage 4: Response Formatting

After all resolvers complete, the engine assembles the result object and applies error formatting. Any errors thrown by resolvers are collected into the errors array in the response, while successfully resolved fields populate the data object. This separation of data and errors is characteristic of GraphQL: a response can contain both partial data and errors at the same time, which creates subtleties in authorization. A resolver that throws an authorization error for one field does not necessarily prevent sibling fields from resolving successfully. You must decide deliberately whether a partial response is safe, or whether an authorization failure on any field should abort the entire response.
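Whether a partial response is acceptable can be made an explicit policy rather than an accident of resolver behavior. The following is a minimal sketch of a fail-closed policy. It assumes authorization errors carry an `extensions.code` of `FORBIDDEN` or `UNAUTHENTICATED` (a convention chosen for this example), and `failClosed` is a hypothetical helper applied to the response body before it is sent.

```javascript
const AUTH_ERROR_CODES = new Set(['FORBIDDEN', 'UNAUTHENTICATED'])

// If any field failed authorization, drop all partial data and keep only the errors.
function failClosed(response) {
	const errors = response.errors || []
	const hasAuthFailure = errors.some((e) => AUTH_ERROR_CODES.has(e.extensions?.code))
	if (!hasAuthFailure) return response
	return { data: null, errors }
}

// A response where one field was denied but its sibling resolved
const partial = {
	data: { me: { name: 'Ada', salary: null } },
	errors: [{ message: 'Forbidden', path: ['me', 'salary'], extensions: { code: 'FORBIDDEN' } }]
}
const safe = failClosed(partial)
```

A stricter policy like this suits APIs where the mere presence of sibling data could be sensitive; more permissive APIs may prefer to return the partial result unchanged.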

Why This Pipeline Matters for Security

Understanding the execution pipeline reveals why there is no single silver-bullet control for GraphQL security. Depth limiting at the validation stage prevents expensive queries, but does nothing for a shallow query that requests sensitive data from a resolver without an auth check. JWT verification in middleware catches unauthenticated requests before they reach any resolver, but does nothing if the JWT is valid and the user simply lacks permission for the specific data requested. Complete GraphQL security requires controls at every stage of this pipeline, coordinated to work together.


Field-Level Security and Sensitive Data Exposure

One of GraphQL’s most powerful features is also one of its most dangerous from a security perspective: the ability for clients to request exactly the fields they want means your security controls must account for every field individually, not just the top-level query entry points.

What Field-Level Data Exposure Looks Like

Consider a User type that evolves over time as your application adds features. Developers add fields like internalScore, fraudFlags, passwordResetToken, and lastFailedLoginAttempt directly to the type because it is convenient. The schema grows organically. Six months later, an unauthenticated request can retrieve all of these fields if none of them have individual access controls. Because GraphQL clients request fields explicitly, some of this exposure may go unnoticed — a security researcher simply needs to request those fields to find out if they are accessible.

Designing Types to Minimize Exposure

The first line of defense is schema design. Split types into public and private variants when subsets of fields require different access levels:

  • Create a PublicUserProfile type with only the fields safe for any authenticated user to see
  • Create a PrivateUserProfile type that extends PublicUserProfile with sensitive fields, accessible only to admins or the user themselves
  • Expose user(id: ID!): PublicUserProfile! on the public-facing query type and a separate adminUser(id: ID!): PrivateUserProfile! on the admin query type

This approach makes the access control boundary visible at the schema level rather than buried in resolver logic. When a new developer reads the schema, they immediately understand which fields require elevated access.

Preventing Accidental Field Exposure

When using code-first schema generation (as opposed to SDL-first), it is easy to accidentally expose internal model properties by automatically mapping database columns to GraphQL fields. Adopt an explicit allowlist approach: only expose fields that you have consciously decided to expose, rather than automatically mirroring your data model.

This applies to error messages as well. GraphQL resolvers can inadvertently expose internal data through well-intentioned error messages. An error that says “User with email john@example.com not found” tells the caller which email addresses are and are not registered, which enables account enumeration, and it echoes user-supplied data back into the response. A properly sanitized error says only “Requested resource not found.”
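The allowlist approach and the sanitized error can be combined in a single mapping step. This is a minimal sketch, assuming a database row that mixes safe and internal columns; `PUBLIC_USER_FIELDS`, `toPublicUser`, and the row shape are hypothetical.

```javascript
// Only fields named here ever leave the server; new database columns stay private by default.
const PUBLIC_USER_FIELDS = ['id', 'displayName', 'avatarUrl']

function toPublicUser(row) {
	if (!row) {
		// Generic message: does not echo the lookup key back to the caller
		throw new Error('Requested resource not found')
	}
	const result = {}
	for (const field of PUBLIC_USER_FIELDS) {
		result[field] = row[field]
	}
	return result
}

// Example row with internal columns that must never be exposed
const row = {
	id: '42',
	displayName: 'Ada',
	avatarUrl: null,
	passwordResetToken: 'abc',
	fraudFlags: ['x']
}
const publicUser = toPublicUser(row)
```

Because exposure is opt-in, adding a column to the underlying table changes nothing in the API until someone edits the allowlist.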

The N+1 Problem and DataLoader

The N+1 query problem — where resolving a list of N items triggers N additional database queries for related data — is a performance issue, but it also has a security dimension. Without DataLoader or similar batching solutions, an attacker can construct a query that triggers thousands of database queries from a single request, effectively executing a DoS attack through legitimate resolver code. DataLoader batches and caches individual database calls within a single request, collapsing those N+1 calls into a single batched query. Security and performance align here: the same DataLoader implementation that improves response times also dramatically reduces the blast radius of a query-amplification attack.
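The batching idea can be illustrated with a small hand-rolled loader. This is a simplified stand-in for the `dataloader` package, not its implementation: `createBatchLoader` and `fetchAuthorsByIds` are hypothetical names, and the sketch only batches keys requested within the same tick.

```javascript
// Collects every key requested in the same tick and resolves them with ONE batch call.
function createBatchLoader(batchFn) {
	const cache = new Map() // key -> Promise; lives only as long as this loader (one request)
	let pending = []

	function flush() {
		const batch = pending
		pending = []
		Promise.resolve()
			.then(() => batchFn(batch.map((item) => item.key)))
			.then(
				(values) => batch.forEach((item, i) => item.resolve(values[i])),
				(err) => batch.forEach((item) => item.reject(err))
			)
	}

	return function load(key) {
		if (!cache.has(key)) {
			cache.set(
				key,
				new Promise((resolve, reject) => {
					// The first key of a batch schedules the flush
					if (pending.length === 0) queueMicrotask(flush)
					pending.push({ key, resolve, reject })
				})
			)
		}
		return cache.get(key)
	}
}

// Hypothetical data access that counts how many times the "database" is hit
let dbCalls = 0
async function fetchAuthorsByIds(ids) {
	dbCalls += 1
	return ids.map((id) => ({ id, name: `author-${id}` }))
}

// Create one loader per request so cached results never leak across users
const loadAuthor = createBatchLoader(fetchAuthorsByIds)
const authorsPromise = Promise.all([loadAuthor(1), loadAuthor(2), loadAuthor(1), loadAuthor(3)])
```

Four lookups, including a repeated key, reach the data layer as a single call, which is the property that limits how far one request can amplify database load.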


Common GraphQL Security Challenges

1. Over-Querying

  • Attackers or clients may request excessive amounts of data, overwhelming the server.

2. Injection Attacks

  • Malicious inputs in queries can exploit vulnerabilities in resolvers or underlying databases.

3. Authorization Gaps

  • Misconfigured access controls can expose sensitive data to unauthorized users.

4. Lack of Monitoring

  • Without proper logging, detecting and mitigating attacks becomes difficult.

GraphQL vs REST: Understanding the Security Tradeoffs

When evaluating which API paradigm to adopt, security implications must be weighed alongside developer experience. Both REST and GraphQL can be secured effectively, but they present fundamentally different threat models and require different defensive approaches.

| Security Concern | REST API | GraphQL API |
|---|---|---|
| Attack surface | Multiple endpoints with predictable, auditable routes | Single /graphql endpoint; all operations pass through one gate |
| Schema discovery | No built-in equivalent | Introspection exposes the full type system to any client by default |
| Response shape | Fixed per endpoint; server controls the shape | Clients define the shape; excess data exposure is client-driven |
| DoS via queries | Large payload size or URL length | Deeply nested or cyclically recursive queries can spike CPU/memory |
| Authorization granularity | Per-endpoint middleware covers everything at that route | Must be enforced per-resolver; one missed resolver leaks data |
| Batching | Requires separate HTTP requests for each operation | Native batching allows hundreds of operations in a single request |
| HTTP error semantics | Status codes (401, 403, 404) signal access failures | GraphQL almost always returns 200 OK; errors live in the body |
| Cache-friendliness | HTTP caching works natively via GET + URL structure | Requires persisted queries, DataLoader, or CDN-aware tooling |

The central takeaway: GraphQL is not inherently less secure than REST, but its flexibility shifts the security burden squarely onto your application layer. The same capabilities that make it powerful — dynamic query composition, deep nested relationships, batch operations — are precisely what attackers probe first.

The Single Endpoint Problem

Because GraphQL routes everything through POST /graphql, you cannot use URL patterns or HTTP verb restrictions to segment access. Every client — authenticated or not — reaches the same handler. This moves the security perimeter from the router into your resolvers, middleware stack, and schema design choices.

A web application firewall alone will not protect a GraphQL API. WAFs inspect traffic volume and match known attack signatures; they rarely parse GraphQL query structure. An alias-batching attack that embeds two hundred brute-force login attempts into a single query body arrives as one ordinary-looking HTTP request, so it slips past network-level rate limits and WAF heuristics.

REST vs GraphQL: Deciding Where to Focus Your Security Effort

| Layer | REST focus | GraphQL focus |
|---|---|---|
| Route security | Middleware per endpoint | Resolver-level middleware |
| Input validation | Body/query-param validation | Schema type validation + custom scalars |
| Rate limiting | Per endpoint per IP | Per operation type, per user, per field |
| DoS prevention | Payload size limits | Depth limits, cost analysis, query timeouts |
| Schema exposure | Non-issue (endpoints are the interface) | Disable introspection in production |

Request Lifecycle and Security Boundaries

Before writing a single line of security code, it helps to visualize where each control fits. Missing any layer creates a viable bypass path for an attacker.

   flowchart TD
    A[Client Request] --> B[HTTP / TLS Layer]
    B --> C{Rate Limiter\nIP · User}
    C -->|Blocked| D[429 Too Many Requests]
    C -->|Allowed| E[Authentication Middleware\nJWT / Session Validation]
    E -->|Invalid Token| F[401 Unauthorized]
    E -->|Valid Token| G[GraphQL Engine]
    G --> H{Query Validation\nDepth · Cost · Timeout}
    H -->|Exceeded Limits| I[400 Bad Request]
    H -->|Valid Query| J[Resolver Execution]
    J --> K{Per-Resolver Authorization\nRBAC / ABAC}
    K -->|Access Denied| L[Error in Response Body]
    K -->|Authorized| M[Data Layer\nDB / External APIs]
    M --> N[Formatted Response]

Each decision node in this diagram represents a security control. The most commonly missing layers in real-world GraphQL implementations are:

  • Per-resolver authorization — developers add auth checks to top-level query resolvers but leave nested type resolvers unguarded, allowing sensitive data to leak through indirect traversal patterns
  • Query validation (depth/cost) — depth limits and cost analysis are not enabled by default in most GraphQL libraries; they must be explicitly opted into
  • Structured logging — without logging the full query text alongside the authenticated user’s identity, detecting an attack in progress is nearly impossible

Where Apollo Server Fits in the Lifecycle

Apollo Server sits at step G in the diagram above — it handles parsing, validation, and resolution. The layers before it (rate limiting, authentication) are typically implemented as Express/Fastify middleware or as an API gateway. The layers inside it (authorization, cost limiting) are implemented as validation rules and resolver middleware.

This architecture makes it crucial that your middleware chain is ordered correctly:

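   // Correct middleware order.
// Assumes `server` (an already started ApolloServer), `limiterConfig`, and the
// `authenticate` Express middleware are defined elsewhere.
const express = require('express')
const helmet = require('helmet')
const rateLimit = require('express-rate-limit')
const { expressMiddleware } = require('@apollo/server/express4')

const app = express()
app.use(helmet()) // Security headers
app.use(rateLimit(limiterConfig)) // Rate limiting before auth, so unauthenticated abuse is throttled too
app.use('/graphql', express.json({ limit: '100kb' })) // Bounded body parsing (illustrative limit); expressMiddleware needs a parsed body
app.use('/graphql', authenticate) // Extract user from JWT
app.use(
	'/graphql',
	expressMiddleware(server, {
		context: async ({ req }) => ({
			user: req.user // Attach verified identity to context
		})
	})
)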

If rate limiting is placed after the GraphQL handler, an attacker can exhaust your resolvers before the limiter ever fires.


Implementing a Secure GraphQL API

Step 1: Schema Design Best Practices

1.1 Avoid Overly Broad Queries

  • Design schemas that enforce specificity and limit the amount of data a single query can fetch.

Example:

   type User {
	id: ID!
	email: String!
	profile: Profile!
}

type Profile {
	bio: String
	avatarUrl: String
}

1.2 Leverage Query Depth Limits

  • Set depth limits to prevent deeply nested queries.

Implementation (using graphql-depth-limit):

   const depthLimit = require('graphql-depth-limit')

const server = new ApolloServer({
	typeDefs,
	resolvers,
	validationRules: [depthLimit(5)]
})
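To make concrete what a depth limiter measures, the sketch below computes the nesting depth of a selection set. It assumes AST nodes shaped like those produced by graphql-js (`selectionSet.selections`), ignores fragments for brevity, and is not the implementation used by graphql-depth-limit, which may count levels differently. `selectionDepth` and `field` are hypothetical helpers.

```javascript
// Returns how many levels of nested fields sit beneath `node`.
function selectionDepth(node) {
	const selections = node.selectionSet ? node.selectionSet.selections : []
	if (selections.length === 0) return 0
	return 1 + Math.max(...selections.map(selectionDepth))
}

// Small helper to hand-build Field nodes for the example
const field = (name, children = []) => ({
	kind: 'Field',
	name: { kind: 'Name', value: name },
	selectionSet: children.length ? { kind: 'SelectionSet', selections: children } : undefined
})

// AST for: { user { profile { bio } email } }
const operation = {
	kind: 'OperationDefinition',
	selectionSet: {
		kind: 'SelectionSet',
		selections: [field('user', [field('profile', [field('bio')]), field('email')])]
	}
}

const MAX_DEPTH = 5
const depth = selectionDepth(operation) // user -> profile -> bio
const allowed = depth <= MAX_DEPTH
```

Because the check only walks the parsed query, it runs before any resolver and costs almost nothing compared with executing a deeply nested request.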

Step 2: Input Validation

2.1 Validate Query Inputs

  • Use tools like Joi or Zod to ensure user inputs meet expected criteria.

Example in Resolver:

   const Joi = require('joi')

const schema = Joi.object({
	email: Joi.string().email().required(),
	password: Joi.string().min(8).required()
})

const resolver = async (parent, args, context) => {
	const { error } = schema.validate(args.input)
	if (error) throw new Error(error.details[0].message)
	// Proceed with logic
}

Step 3: Authentication and Authorization

3.1 Use JSON Web Tokens (JWTs)

  • Authenticate users via JWTs in headers and validate them on every request.

Implementation Example:

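   const jwt = require('jsonwebtoken')

// Returns the verified token payload, or throws if the token is missing or invalid
const authenticate = (req) => {
	const header = req.headers.authorization || ''
	// Expect the conventional "Bearer <token>" format
	const [scheme, token] = header.split(' ')
	if (scheme !== 'Bearer' || !token) throw new Error('Unauthorized')
	// Pin the accepted algorithm (HS256 is assumed here because a shared secret is used)
	return jwt.verify(token, process.env.JWT_SECRET, { algorithms: ['HS256'] })
}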

3.2 Implement Role-Based Access Control (RBAC)

  • Assign roles to users and restrict access based on roles.

Example in Resolver:

   const resolver = async (parent, args, { user }) => {
	// Reject unauthenticated callers as well as non-admins
	if (!user || user.role !== 'admin') throw new Error('Access Denied')
	// Proceed with logic
}
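Repeating the role check in every resolver is easy to get wrong, so one option is to move it into a wrapper. This is a minimal sketch; `requireRole` and the `deleteProduct` mutation are hypothetical names, and the context is assumed to carry a verified `user` object.

```javascript
// Wraps a resolver so the role check cannot be forgotten inside the resolver body.
function requireRole(role, resolver) {
	return (parent, args, context, info) => {
		// Reject missing users explicitly instead of crashing on `undefined.role`
		if (!context.user || context.user.role !== role) {
			throw new Error('Access Denied')
		}
		return resolver(parent, args, context, info)
	}
}

const resolvers = {
	Mutation: {
		// Hypothetical admin-only mutation
		deleteProduct: requireRole('admin', (parent, { id }) => ({ deleted: id }))
	}
}
```

The wrapped resolver never runs for callers without the required role, which keeps the business logic free of authorization branches.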

Step 4: Preventing Over-Querying

4.1 Apply Query Cost Analysis

  • Assign a cost to each query field and reject overly expensive queries.

Implementation (using graphql-cost-analysis):

   const costAnalysis = require('graphql-cost-analysis')

const server = new ApolloServer({
	typeDefs,
	resolvers,
	validationRules: [
		costAnalysis({
			maximumCost: 100,
			onComplete: (cost) => console.log('Query cost:', cost)
		})
	]
})

4.2 Limit Query Complexity

  • Combine depth and cost limits for comprehensive protection.

Step 5: Monitoring and Rate Limiting

5.1 Enable Request Logging

  • Log incoming queries and their responses for auditing and debugging.

Example with Middleware:

   app.use((req, res, next) => {
	// req.body is only populated after a JSON body parser has run
	console.log(`Query: ${req.body?.query}`)
	next()
})
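For auditing, a structured entry that ties the query to the authenticated user is more useful than a bare string. The sketch below is one possible shape, assuming the request body, verified user, and client IP are already available; `buildAuditEntry`, `redact`, and the list of sensitive keys are hypothetical.

```javascript
const SENSITIVE_KEYS = new Set(['password', 'token', 'secret'])

// Replaces values of sensitive keys, recursing into nested objects and arrays.
function redact(value) {
	if (Array.isArray(value)) return value.map(redact)
	if (value && typeof value === 'object') {
		return Object.fromEntries(
			Object.entries(value).map(([key, inner]) => [
				key,
				SENSITIVE_KEYS.has(key) ? '[REDACTED]' : redact(inner)
			])
		)
	}
	return value
}

// Builds one log record per GraphQL request.
function buildAuditEntry({ body, user, ip }) {
	return {
		timestamp: new Date().toISOString(),
		userId: user ? user.id : null,
		ip,
		operationName: body.operationName || null,
		query: body.query,
		variables: redact(body.variables || {})
	}
}
```

Note that this only redacts variables. Arguments written inline in the query text are logged as-is, which is one more reason to require clients to pass secrets as variables.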

5.2 Apply Rate Limiting

  • Prevent abuse by limiting the number of requests per client (express-rate-limit keys on the client IP address by default).

Implementation (using express-rate-limit):

   const rateLimit = require('express-rate-limit')

const limiter = rateLimit({
	windowMs: 15 * 60 * 1000, // 15 minutes
	max: 100 // Limit each IP to 100 requests per windowMs
})

app.use(limiter)
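IP-based limits treat every user behind a shared address as one client, so per-user limits are often added alongside them. The following is a minimal in-memory sketch of a fixed-window counter keyed by user ID. `createUserRateLimiter` is a hypothetical helper, and a production deployment would typically keep the counters in a shared store rather than in process memory.

```javascript
// Fixed-window counter keyed by an arbitrary string (user ID, or IP for anonymous callers).
function createUserRateLimiter({ windowMs, max, now = Date.now }) {
	const windows = new Map() // key -> { start, count }

	return function isAllowed(key) {
		const currentTime = now()
		const current = windows.get(key)
		// Start a fresh window on first use or once the previous window has elapsed
		if (!current || currentTime - current.start >= windowMs) {
			windows.set(key, { start: currentTime, count: 1 })
			return true
		}
		current.count += 1
		return current.count <= max
	}
}

const isAllowed = createUserRateLimiter({ windowMs: 15 * 60 * 1000, max: 100 })

// Express-style usage, assuming an earlier middleware has set req.user:
// app.use('/graphql', (req, res, next) =>
// 	isAllowed(req.user ? req.user.id : req.ip) ? next() : res.status(429).end()
// )
```

The `now` parameter exists only so the clock can be replaced in tests.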

Step 6: Disabling Introspection in Production

GraphQL introspection is a built-in capability that allows any client to query your server’s full schema — every type, every query, every mutation, every field name, and every argument. During development this is incredibly useful; tools like GraphiQL and Apollo Studio depend on it. In production, however, leaving introspection open gives an attacker a complete, machine-readable map of your API surface.

What an Attacker Gains from Introspection

  • A list of every query and mutation, including those not documented publicly
  • Field names that hint at sensitive data (e.g., adminPanel, internalUserId, rawPasswordHash)
  • Deprecated fields that often have weaker or missing authorization guards
  • Type relationships that reveal how to compose maximum-cost nested queries

Disabling Introspection in Apollo Server

   import { ApolloServer } from '@apollo/server'
import { NoSchemaIntrospectionCustomRule } from 'graphql'

const server = new ApolloServer({
	typeDefs,
	resolvers,
	// graphql-js exports the introspection-blocking rule under this name
	validationRules: process.env.NODE_ENV === 'production' ? [NoSchemaIntrospectionCustomRule] : []
})

Conditional Introspection for Internal Teams

When your internal developer team needs schema discovery but external clients should not have it, gate introspection on the authenticated user’s role:

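   import { ApolloServer } from '@apollo/server'
import { GraphQLError, validate, NoSchemaIntrospectionCustomRule } from 'graphql'

const isInternalDeveloper = (user) => user?.role === 'admin' || user?.role === 'developer'

// validationRules are fixed per server and cannot see the request context,
// so the per-user check runs in a plugin hook that can.
const gatedIntrospection = {
	async requestDidStart() {
		return {
			async didResolveOperation({ schema, document, contextValue }) {
				if (isInternalDeveloper(contextValue.user)) return
				const errors = validate(schema, document, [NoSchemaIntrospectionCustomRule])
				if (errors.length > 0) {
					throw new GraphQLError('Introspection is not allowed', {
						extensions: { code: 'FORBIDDEN' }
					})
				}
			}
		}
	}
}

const server = new ApolloServer({
	typeDefs,
	resolvers,
	introspection: true, // Enabled at the server level; the plugin gates it per user
	plugins: [gatedIntrospection]
})
// The context function passed to expressMiddleware must attach the verified user,
// for example: context: async ({ req }) => ({ user: await getUser(req) })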

Suppressing Field Suggestions

Even after disabling introspection, the default GraphQL behavior returns “Did you mean user?” hints when a caller misspells a field name. These hints partially re-expose your schema. Remove them in production using a custom error formatter:

   const server = new ApolloServer({
	typeDefs,
	resolvers,
	formatError: (formattedError) => {
		if (process.env.NODE_ENV === 'production') {
			return {
				...formattedError,
				message: formattedError.message.replace(/Did you mean .+\?/g, '').trim()
			}
		}
		return formattedError
	}
})

This small change significantly reduces the information available to an attacker who is manually probing your schema.


Step 7: Protecting Against Batching and Enumeration Attacks

GraphQL supports two forms of batching that attackers can exploit to multiply the impact of a single request.

Array batching — the client sends an array of separate operation objects in one HTTP request body:

   [
	{ "query": "{ user(id: \"1\") { email } }" },
	{ "query": "{ user(id: \"2\") { email } }" },
	{ "query": "{ user(id: \"3\") { email } }" }
]

Alias batching — multiple parallel resolver calls embedded within a single query using GraphQL aliases:

   mutation BruteForce {
	attempt1: login(username: "admin", password: "password123") {
		token
	}
	attempt2: login(username: "admin", password: "qwerty") {
		token
	}
	attempt3: login(username: "admin", password: "letmein") {
		token
	}
	attempt4: login(username: "admin", password: "abc123") {
		token
	}
}

Both techniques bypass network-level rate limits because each attack looks like a single HTTP request. A WAF, nginx rate limit, or API gateway counting requests-per-second will see exactly one request — regardless of how many operations it contains.

Disabling Array Batching

If your API does not need to serve batched requests (the majority do not), disable the feature entirely:

   const server = new ApolloServer({
	typeDefs,
	resolvers,
	allowBatchedHttpRequests: false // Disabled by default in Apollo Server 4+
})

Request-Scoped Counters for Alias-Based Attacks

When alias batching targets a sensitive operation like authentication, add a request-scoped counter in the resolver:

   const resolvers = {
	Mutation: {
		login: async (parent, { username, password }, context) => {
			// Track attempts within this single request
			context.loginAttempts = (context.loginAttempts || 0) + 1
			if (context.loginAttempts > 3) {
				throw new Error('Too many login attempts within a single request')
			}

			return authenticate(username, password)
		}
	}
}

All-in-One Protection with graphql-armor

The graphql-armor library bundles protection against alias abuse, excessive depth, high query cost, and field suggestions into a single composable package:

   import { ApolloArmor } from '@escape.tech/graphql-armor'

const armor = new ApolloArmor({
	costLimit: { enabled: true, maxCost: 5000 },
	maxDepth: { enabled: true, n: 7 },
	maxAliases: { enabled: true, n: 15 }, // Caps alias batching within one operation
	blockFieldSuggestion: { enabled: true } // Suppress schema hints
})

const server = new ApolloServer({
	typeDefs,
	resolvers,
	...armor.protect()
})

This is one of the fastest ways to establish a solid baseline of protection on any new GraphQL project. Start with conservative defaults and adjust based on your application’s legitimate query complexity requirements.


Tools for Securing GraphQL APIs

  1. Apollo Server:
  • Provides built-in support for query validation and authentication middleware.
  2. GraphQL Shield:
  • Enables declarative authorization rules.
  3. DataDog:
  • Offers monitoring and alerting for API usage patterns.
  4. GraphQL Playground:
  • Securely test queries with restricted access to sensitive data.

Real-World Use Cases

Use Case 1: E-Commerce Application

An e-commerce platform uses role-based access control (RBAC) to restrict admin-only operations like modifying product inventory. In practice, this means that every mutation — updateProductPrice, deleteProduct, adjustInventory — checks the authenticated user’s role before executing. But RBAC alone is not enough. The platform also deals with a subtler challenge: customers browsing product recommendations should only see products visible in their region, respecting import restrictions and regional pricing rules.

The team addresses this by injecting the customer’s region and tier into the GraphQL context at authentication time, then using that context in every product resolver to filter results. This context-driven filtering ensures that queries like products(category: "electronics") automatically respect regional rules without requiring the client to specify filtering parameters that it could manipulate. Because the filter is applied server-side in the context, there is no way for a client to bypass it by omitting filter arguments.
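A resolver following this pattern might look like the sketch below. The in-memory `catalog` stands in for the product service, and the `region` property on the context is assumed to have been set during authentication; both are hypothetical.

```javascript
// Hypothetical in-memory catalog standing in for the product service
const catalog = [
	{ id: 'p1', category: 'electronics', regions: ['EU', 'US'] },
	{ id: 'p2', category: 'electronics', regions: ['US'] },
	{ id: 'p3', category: 'books', regions: ['EU'] }
]

const resolvers = {
	Query: {
		// The client chooses the category; the region comes only from the verified context
		products: (parent, { category }, context) =>
			catalog.filter(
				(product) => product.category === category && product.regions.includes(context.region)
			)
	}
}

// Even if a client sends an extra `region` argument, the resolver never reads it
const visible = resolvers.Query.products(
	null,
	{ category: 'electronics', region: 'US' },
	{ region: 'EU' }
)
```

The client-supplied `region: 'US'` has no effect here, because the filter reads only the server-derived context value.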

The platform also applies query cost analysis customized to their schema. Product searches involving nested variant and availability data are expensive — they retrieve data from three microservices. The team assigns higher weights to these resolvers in their cost analysis configuration, so a user cannot chain together requests for dozens of products with deep variant trees in a single query.

Use Case 2: Social Media Platform

A social media app applies query cost analysis to prevent users from fetching excessive amounts of data in a single request. Beyond query cost, the team faces a more nuanced challenge: privacy controls. Users can configure who sees their posts — public, followers only, or specific friend lists. Every query that returns posts must respect these privacy settings, regardless of how the posts are requested.

The team implements this with a privacy filter applied inside the Post resolver. Before returning any post, the resolver checks whether the requesting user falls into the post’s allowed audience. This check runs uniformly — whether a post is accessed through user(id).posts, through a feed query, or through a search result, the same privacy evaluation runs every time. By centralizing the check in the Post type resolver rather than in each calling query, the team avoids the risk of a future developer adding a new query path that accidentally bypasses privacy settings.

To prevent batching-based enumeration of user profile data — a common abuse vector for social platforms — the team also tracks how many distinct user profiles are accessed within a single request. A request that loads more than fifty distinct user objects triggers an automatic flag and rate limit, covering the alias-batching attack pattern that network-level tools cannot detect.
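The per-request profile counter can be sketched as a small helper called from the User resolver. `trackProfileAccess` and the `flaggedForEnumeration` property are hypothetical, and rejecting the request outright is only one possible response to crossing the threshold.

```javascript
const MAX_DISTINCT_PROFILES = 50

// Call from the User resolver; `context` is created fresh for every request.
function trackProfileAccess(context, userId) {
	if (!context.accessedProfiles) context.accessedProfiles = new Set()
	context.accessedProfiles.add(userId) // A Set counts each profile once, however often it is read
	if (context.accessedProfiles.size > MAX_DISTINCT_PROFILES) {
		context.flaggedForEnumeration = true // Hypothetical flag for rate limiting or alerting
		throw new Error('Too many distinct profiles requested')
	}
}

const resolvers = {
	Query: {
		// `loadUser` is assumed to be a data-access function supplied on the context
		user: (parent, { id }, context) => {
			trackProfileAccess(context, id)
			return context.loadUser(id)
		}
	}
}
```

Because the counter lives on the request context, two hundred aliased `user` fields in one operation are counted together, which a per-request network limit cannot do.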

Use Case 3: Healthcare API

A healthcare API illustrates the stakes of getting authorization wrong. Patient records are among the most sensitive data any system can handle, and the regulatory consequences of unauthorized disclosure are severe. The team uses a combination of field-level permissions and row-level security to enforce that a physician can only access records for patients under their care.

This means authorization is not simply a role check — it requires evaluating a relationship in the database (does this physician have an active care relationship with this patient?) on every request. Rather than duplicating this relationship query in every resolver, the team implements it once in a DataLoader that batches these relationship checks efficiently. The DataLoader result is cached for the duration of the request, so a resolver that checks authorization multiple times during a complex query only hits the database once.

The team also mandates comprehensive audit logging: every GraphQL query that touches a patient record is logged with the authenticated user’s identity, the full query text, and a timestamp. These logs feed into the platform’s SIEM system, where anomalous patterns — such as a single user accessing hundreds of distinct patient records in a short timeframe — generate alerts for the security team.



Common Mistakes and Anti-Patterns

Even experienced developers make predictable security mistakes when building GraphQL APIs. Recognizing these patterns early saves substantial debugging time — and prevents incidents that are embarrassing to explain.

Anti-Pattern 1: Authorization at the Wrong Layer

The most pervasive mistake is checking authorization only at the top-level query resolver while leaving nested type resolvers completely unguarded:

   // WRONG — only the root query resolver is gated
const resolvers = {
	Query: {
		adminDashboard: async (parent, args, { user }) => {
			if (!user || user.role !== 'admin') throw new Error('Forbidden')
			return getDashboardData() // Returns an object — nested resolvers run next
		}
	},
	DashboardData: {
		// No auth check here — accessible through ANY query that returns DashboardData
		financialRecords: async (parent) => {
			return getFinancialRecords(parent.id) // Data leak!
		}
	}
}

An attacker who can reach DashboardData through any other query — even an innocuous one — can call financialRecords without being an admin.

The fix: Apply authorization at every resolver that accesses sensitive data. Use graphql-shield to enforce this declaratively and prevent a single missed resolver from becoming a vulnerability:

   import { shield, rule } from 'graphql-shield'

const isAdmin = rule({ cache: 'contextual' })(
	async (parent, args, ctx) => ctx.user?.role === 'admin'
)

export const permissions = shield({
	Query: {
		adminDashboard: isAdmin
	},
	DashboardData: {
		financialRecords: isAdmin // Field-level protection too
	}
})

Anti-Pattern 2: Exposing Sequential Object IDs

Using database primary keys (sequential integers or guessable UUIDs) as GraphQL node IDs enables Insecure Direct Object Reference (IDOR) attacks. An attacker can iterate through IDs to enumerate resources they should not be able to access:

   # Trivially enumerates user records; aliases let one request probe many IDs
query {
	first: user(id: "1001") {
		email
		phoneNumber
		ssn
	}
	second: user(id: "1002") {
		email
		phoneNumber
		ssn
	}
}

The fix: Use opaque global IDs to obscure internal structure, and always explicitly verify ownership in the resolver:

   import { fromGlobalId } from 'graphql-relay'

const userResolver = async (parent, { id }, { currentUser }) => {
	const { type, id: rawId } = fromGlobalId(id)
	if (type !== 'User') throw new Error('Invalid ID type')

	const user = await User.findById(rawId)
	if (!user) throw new Error('Not found')

	// CRITICAL: verify the requester is allowed to access this resource
	if (user.id !== currentUser.id && currentUser.role !== 'admin') {
		throw new Error('Forbidden')
	}
	return user
}

Anti-Pattern 3: Verbose Error Messages in Production

By default many GraphQL servers return full stack traces in their error responses, leaking implementation details, file paths, dependency versions, and internal variable names to any caller who triggers an error:

   {
	"errors": [
		{
			"message": "Cannot read properties of undefined (reading 'id')",
			"extensions": {
				"exception": {
					"stacktrace": [
						"TypeError: Cannot read properties of undefined (reading 'id')",
						"    at UserResolver (/app/src/resolvers/user.js:42:18)",
						"    at field /app/node_modules/graphql/execution/execute.js:540:20"
					]
				}
			}
		}
	]
}

The fix: Use Apollo Server’s formatError hook to log full errors internally while returning sanitized messages to clients:

   import { unwrapResolverError } from '@apollo/server/errors'

class UserFacingError extends Error {
	constructor(message) {
		super(message)
		this.name = 'UserFacingError'
	}
}

const server = new ApolloServer({
	typeDefs,
	resolvers,
	formatError: (formattedError, error) => {
		// Apollo Server wraps errors thrown by resolvers; unwrap to reach the original
		const originalError = unwrapResolverError(error)

		// Always log the full error server-side (`logger` is your application logger)
		logger.error('GraphQL error', {
			message: originalError.message,
			stack: originalError.stack
		})

		// Return safe messages in production
		if (process.env.NODE_ENV === 'production') {
			if (!(originalError instanceof UserFacingError)) {
				return { message: 'An internal error occurred', extensions: { code: 'INTERNAL_ERROR' } }
			}
		}
		return formattedError
	}
})

Anti-Pattern 4: Unbounded List Queries

Returning all records from a list field with no pagination enables a single query to dump an entire database:

   # Can return millions of records with a single query
query {
	users {
		id
		email
		profile {
			bio
			avatarUrl
		}
		posts {
			title
			content
			tags
			createdAt
		}
	}
}

The fix: Require pagination arguments and enforce a hard maximum page size in the resolver:

   const resolvers = {
	Query: {
		users: async (parent, { first = 10, after }, context) => {
			const MAX_PAGE_SIZE = 100
			if (first > MAX_PAGE_SIZE) {
				throw new UserFacingError(`Cannot request more than ${MAX_PAGE_SIZE} users per page`)
			}
			return User.paginate({ limit: first, cursor: after })
		}
	}
}

Also define the schema to make the intent explicit:

   type Query {
	users(first: Int = 10, after: String): UserConnection!
}
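The resolver above checks only the upper bound. A zero, negative, or fractional first value would still reach the data layer. One way to close that gap is a small validation helper that every list resolver shares. This is a minimal sketch: resolvePageSize and the two constants are hypothetical names, and the limits are the ones used in the example above.

```javascript
const DEFAULT_PAGE_SIZE = 10
const MAX_PAGE_SIZE = 100

// Returns a safe page size, or throws for values the API should never accept.
function resolvePageSize(first) {
	if (first === undefined || first === null) return DEFAULT_PAGE_SIZE
	if (!Number.isInteger(first) || first < 1) {
		throw new RangeError('"first" must be a positive integer')
	}
	if (first > MAX_PAGE_SIZE) {
		throw new RangeError(`Cannot request more than ${MAX_PAGE_SIZE} items per page`)
	}
	return first
}

console.log(resolvePageSize(undefined), resolvePageSize(25))
```

A resolver could then call resolvePageSize(first) before querying, so the bounds live in one place instead of being repeated per field.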

Anti-Pattern 5: Client-Supplied Identity

Never accept identity information — user ID, role, email — from the request body or query arguments. Always derive identity server-side from a verified token. This is an obvious rule that is violated more often than you might expect, particularly when APIs are designed to be called by other internal services:

   // WRONG — the client claims its own role
const resolvers = {
	Mutation: {
		deleteUser: async (parent, { userId, callerRole }) => {
			if (callerRole !== 'admin') throw new Error('Forbidden')
			return User.delete(userId) // Client-supplied callerRole — completely bypassable
		}
	}
}

// CORRECT — role comes from the verified JWT injected into context
const resolvers = {
	Mutation: {
		deleteUser: async (parent, { userId }, { user }) => {
			if (!user || user.role !== 'admin') throw new Error('Forbidden')
			return User.delete(userId)
		}
	}
}

Anti-Pattern 6: Trusting GraphQL Variables for Authorization Logic

A subtle variant of the above: trusting a userId argument to decide who owns a new or modified record, instead of reading the authenticated user from the context:

   // WRONG: the caller can pass any userId and create comments attributed to another user
createComment: async (parent, { postId, userId, content }, context) => {
	return Comment.create({ postId, userId, content })
}

// CORRECT — tie the new record to the authenticated user, not a client-supplied value
createComment: async (parent, { postId, content }, { user }) => {
	if (!user) throw new Error('Unauthorized')
	return Comment.create({ postId, userId: user.id, content })
}

Advanced Authorization with GraphQL Shield

graphql-shield provides a declarative permission layer that wraps your entire schema. It separates authorization logic from business logic — resolvers stay focused on data fetching while all permission rules live in one auditable file.

Defining Reusable, Composable Rules

   // rules.js
import { rule } from 'graphql-shield'

// Reusable atomic rules: compose them like building blocks
export const isAuthenticated = rule({ cache: 'contextual' })(
	// Boolean() also rejects an undefined user, which `!== null` would let through
	async (parent, args, ctx) => Boolean(ctx.user)
)

export const isAdmin = rule({ cache: 'contextual' })(
	async (parent, args, ctx) => ctx.user?.role === 'admin'
)

export const isEditor = rule({ cache: 'contextual' })(
	async (parent, args, ctx) => ctx.user?.role === 'admin' || ctx.user?.role === 'editor'
)

// Ownership rule: needs 'strict' cache because the result depends on args
export const isResourceOwner = rule({ cache: 'strict' })(async (parent, { id }, ctx) => {
	// Without this guard, a missing user and a missing post compare as undefined === undefined
	if (!ctx.user) return false
	const resource = await Post.findById(id)
	return Boolean(resource) && resource.authorId === ctx.user.id
})

The cache option controls how often the rule function is called per request:

| Cache mode | When to use |
| --- | --- |
| 'contextual' | Rule depends only on context (same result for every field in the request) |
| 'strict' | Rule depends on resolver arguments (varies per field call) |
| 'no_cache' | Rule has side effects or must be evaluated fresh every time |

Applying a Permission Matrix

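   // permissions.js
import { rule, shield, and, or, allow, deny } from 'graphql-shield'
import { isAuthenticated, isAdmin, isEditor, isResourceOwner } from './rules.js'

// Field-level ownership: on Post fields the post is the parent object, not an id argument
const isPostAuthor = rule({ cache: 'strict' })(
	async (parent, args, ctx) => Boolean(ctx.user) && parent.authorId === ctx.user.id
)

export const permissions = shield(
	{
		Query: {
			publicPosts: allow, // Explicitly public
			me: isAuthenticated,
			userById: and(isAuthenticated, isAdmin),
			allUsers: isAdmin
		},
		Mutation: {
			createPost: isAuthenticated,
			updatePost: and(isAuthenticated, or(isResourceOwner, isEditor)),
			deletePost: and(isAuthenticated, or(isResourceOwner, isAdmin)),
			publishPost: and(isAuthenticated, isEditor),
			banUser: isAdmin
		},
		Post: {
			'*': allow, // Remaining Post fields stay readable
			// Field-level permissions: anonymous users cannot see draft content
			draftContent: and(isAuthenticated, or(isPostAuthor, isEditor))
		}
	},
	{
		allowExternalErrors: true, // Let errors thrown by resolvers (such as UserFacingError) pass through
		fallbackRule: deny, // Default rule for any field not explicitly listed
		fallbackError: 'Not authorized' // Message returned when a rule denies access
	}
)

The fallbackRule option acts as the security net. graphql-shield allows unlisted fields by default, so setting fallbackRule to deny makes any field not explicitly configured fail closed. The fallbackError option only controls the message clients see when access is denied. With a deny fallback, every object type needs either explicit field rules or a '*' wildcard entry, as shown for Post.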

Wiring Shield into Apollo Server

   import { ApolloServer } from '@apollo/server'
import { makeExecutableSchema } from '@graphql-tools/schema'
import { applyMiddleware } from 'graphql-middleware'
import { permissions } from './permissions.js'

// Build schema → apply permission middleware → pass to Apollo
const schema = makeExecutableSchema({ typeDefs, resolvers })
const protectedSchema = applyMiddleware(schema, permissions)

const server = new ApolloServer({ schema: protectedSchema })

This approach makes permissions testable in isolation — you can unit test each rule function independently without needing a running GraphQL server.
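One way to get that isolation is to keep the decision logic in plain predicate functions and let the shield rules delegate to them. The following is a minimal sketch under that assumption. The predicate names (isAuthenticatedUser, isAdminUser, ownsResource) are hypothetical and are not part of graphql-shield.

```javascript
// Plain predicates with no dependency on graphql-shield or a running server.
// A shield rule could delegate to one, for example:
//   rule({ cache: 'contextual' })((parent, args, ctx) => isAdminUser(ctx))
const isAuthenticatedUser = (ctx) => Boolean(ctx && ctx.user)
const isAdminUser = (ctx) => Boolean(ctx && ctx.user && ctx.user.role === 'admin')
const ownsResource = (ctx, resource) =>
	Boolean(ctx && ctx.user && resource && resource.authorId === ctx.user.id)

// Table-driven checks: [description, actual, expected], including failure paths.
const cases = [
	['anonymous is not authenticated', isAuthenticatedUser({ user: null }), false],
	['missing user is not authenticated', isAuthenticatedUser({}), false],
	['admin passes the admin check', isAdminUser({ user: { id: 'a', role: 'admin' } }), true],
	['editor fails the admin check', isAdminUser({ user: { id: 'e', role: 'editor' } }), false],
	['author owns the post', ownsResource({ user: { id: 'u1' } }, { authorId: 'u1' }), true],
	['missing post is never owned', ownsResource({ user: { id: 'u1' } }, undefined), false],
	['anonymous never owns a post', ownsResource({}, { authorId: undefined }), false]
]

const failures = cases.filter(([, actual, expected]) => actual !== expected)
console.log(failures.length === 0 ? 'all rule checks passed' : failures)
```

In a real suite each row would become its own test case in the runner you already use; the table form is only there to keep the sketch short.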


Testing GraphQL Security

Automated tests are the most reliable way to verify that security controls behave correctly. They should run on every pull request and explicitly test failure paths — not just the happy path.

Unit Testing Authorization

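Test every combination of user role against every sensitive operation. Four test cases for every protected mutation is a good baseline: the owner, a different authenticated user, an admin, and an anonymous caller. The examples use createTestClient from apollo-server-testing, which was written for Apollo Server 2; with the @apollo/server package, server.executeOperation plays the same role: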

   import { describe, it, expect, beforeEach } from 'vitest'
import { createTestClient } from 'apollo-server-testing'
import { buildTestServer } from './test-helpers.js'

const DELETE_POST = `
  mutation DeletePost($id: ID!) {
    deletePost(id: $id) { id }
  }
`

// Assume 'post-1' belongs to 'user-1'
describe('deletePost authorization', () => {
	it('allows the post author to delete their own post', async () => {
		const { mutate } = createTestClient(buildTestServer({ user: { id: 'user-1', role: 'user' } }))
		const res = await mutate({ mutation: DELETE_POST, variables: { id: 'post-1' } })
		expect(res.errors).toBeUndefined()
	})

	it('blocks a different authenticated user from deleting', async () => {
		const { mutate } = createTestClient(buildTestServer({ user: { id: 'user-2', role: 'user' } }))
		const res = await mutate({ mutation: DELETE_POST, variables: { id: 'post-1' } })
		expect(res.errors?.[0]?.message).toMatch(/not authorized/i)
	})

	it('allows an admin to delete any post', async () => {
		const { mutate } = createTestClient(buildTestServer({ user: { id: 'admin-1', role: 'admin' } }))
		const res = await mutate({ mutation: DELETE_POST, variables: { id: 'post-1' } })
		expect(res.errors).toBeUndefined()
	})

	it('rejects unauthenticated callers', async () => {
		const { mutate } = createTestClient(buildTestServer({ user: null }))
		const res = await mutate({ mutation: DELETE_POST, variables: { id: 'post-1' } })
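		// With graphql-shield in front, denials surface as the fallbackError ('Not authorized')
		expect(res.errors?.[0]?.message).toMatch(/not authorized|unauthorized/i)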
	})
})

Testing Depth and Cost Limits

Your query protection limits are only effective if they actually reject out-of-bounds queries. Write explicit tests that send queries designed to exceed your configured thresholds:

   describe('Query protection limits', () => {
	// This query nests 7 levels deep — assuming a depth limit of 5
	const DEEP_QUERY = `
    query {
      user(id: "1") {
        posts {
          comments {
            author {
              posts {
                comments { author { id } }
              }
            }
          }
        }
      }
    }
  `

	it('rejects queries that exceed the configured depth limit', async () => {
		const { query } = createTestClient(buildTestServer({ user: null }))
		const res = await query({ query: DEEP_QUERY })
		expect(res.errors).toBeDefined()
		expect(res.errors?.[0]?.message).toMatch(/depth/i)
	})

	it('rejects introspection in production mode', async () => {
		const previousEnv = process.env.NODE_ENV
		process.env.NODE_ENV = 'production'
		try {
			const { query } = createTestClient(buildTestServer({ user: null }))
			const res = await query({ query: '{ __schema { types { name } } }' })
			expect(res.errors).toBeDefined()
		} finally {
			// Restore the environment even when the assertion fails
			process.env.NODE_ENV = previousEnv
		}
	})
})
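A hand-written deep query exercises a single depth. To test exactly at the limit and one level above it, a small generator can help. This is a minimal sketch that assumes the same cyclic user, posts, comments, author relationship as the query above. buildNestedQuery is a hypothetical helper, and depth here counts nested object fields starting with user, which may differ by one from how your depth limiter counts.

```javascript
// Builds a query with `depth` nested object fields (user counts as level 1),
// cycling through posts -> comments -> author and ending in a scalar `id`.
function buildNestedQuery(depth) {
	const cycle = ['posts', 'comments', 'author']
	let selection = 'id'
	for (let level = depth - 1; level >= 1; level--) {
		selection = `${cycle[(level - 1) % cycle.length]} { ${selection} }`
	}
	return `query { user(id: "1") { ${selection} } }`
}

// Depth 7 reproduces the hand-written DEEP_QUERY shape.
console.log(buildNestedQuery(7))
```

With a helper like this, one test can assert that a query at the configured limit succeeds and a second that the next depth up is rejected, which catches off-by-one mistakes in the limiter configuration.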

Testing That Errors Do Not Leak Internal Information

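   describe('Error sanitization in production', () => {
	it('does not return stack traces in production mode', async () => {
		const previousEnv = process.env.NODE_ENV
		process.env.NODE_ENV = 'production'
		try {
			const { query } = createTestClient(buildTestServer({ user: { id: 'user-1', role: 'user' } }))
			// Trigger a resolver error by querying a non-existent resource
			const res = await query({ query: '{ user(id: "non-existent") { id } }' })
			expect(res.errors).toBeDefined()
			const errorExtensions = res.errors?.[0]?.extensions
			// Older Apollo Server versions nest the trace under exception; version 4 puts it on extensions
			expect(errorExtensions?.exception?.stacktrace).toBeUndefined()
			expect(errorExtensions?.stacktrace).toBeUndefined()
		} finally {
			// Restore the environment even when an assertion fails
			process.env.NODE_ENV = previousEnv
		}
	})
})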

Automated Security Scanning Tools

Pair unit tests with purpose-built GraphQL security scanners for broader coverage:

| Tool | Type | Best Used For |
| --- | --- | --- |
| InQL | Burp Suite plugin / standalone CLI | Generates full query sets from introspection; ideal for manual penetration tests |
| Clairvoyance | Python CLI | Extracts schema field names even when introspection is disabled using brute-force field guessing |
| graphql-cop | Lightweight Python CLI | Audits common misconfigurations (introspection, batching, field suggestions) in minutes |
| Escape | SaaS platform | Continuous testing with CI/CD integration; scans for business logic issues |
| graphql-voyager | Visual browser tool | Renders the schema as an interactive graph; useful for manual attack-surface review |

Run graphql-cop as part of your pre-deployment pipeline. It is a Python script, so run it from a checkout of its repository:

   python3 graphql-cop.py -t https://staging.api.example.com/graphql -o json

Security Headers and Transport Layer

Transport-level security is often overlooked because developers focus their attention on the query layer. The HTTP layer, however, carries its own set of exploitable weaknesses.

Apply Security Headers with Helmet

   import helmet from 'helmet'

app.use(
	helmet({
		hsts: {
			maxAge: 63072000, // 2 years in seconds
			includeSubDomains: true,
			preload: true // Signals eligibility for the HSTS preload list; submission is a separate step
		},
		contentSecurityPolicy: {
			directives: {
				defaultSrc: ["'self'"],
				scriptSrc: ["'self'"],
				connectSrc: ["'self'"]
			}
		},
		frameguard: { action: 'deny' }, // Prevent clickjacking
		noSniff: true, // Disable MIME sniffing
		xssFilter: true // Recent Helmet versions send X-XSS-Protection: 0 to switch off the legacy browser filter
	})
)

Restrict CORS to Known Origins

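Combining credentials: true with a permissive origin policy is a critical misconfiguration. Browsers ignore a literal * on credentialed requests, so the exploitable variant is usually a server that reflects whatever Origin header it receives. Always use an explicit allow-list: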

   import cors from 'cors'

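// Trim entries and drop empty strings so an unset variable yields an empty allow-list
const allowedOrigins = (process.env.ALLOWED_ORIGINS || '')
	.split(',')
	.map((origin) => origin.trim())
	.filter(Boolean)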

app.use(
	'/graphql',
	cors({
		origin: (origin, callback) => {
			// Allow server-to-server requests that carry no Origin header
			if (!origin || allowedOrigins.includes(origin)) {
				callback(null, true)
			} else {
				callback(new Error(`Origin ${origin} is not permitted by CORS policy`))
			}
		},
		credentials: true, // Required for cookie-based authentication
		methods: ['POST'] // GraphQL only needs POST
	})
)

Enforce JSON Content-Type

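Restricting the request content type blocks CSRF attacks delivered through HTML forms. A form can only submit application/x-www-form-urlencoded, multipart/form-data, or text/plain, and cross-origin JavaScript cannot send application/json without passing a CORS preflight, so this simple check closes off that attack class on your GraphQL endpoint:

   app.use('/graphql', (req, res, next) => {
	// Compare the media type exactly; a substring match would accept "text/plain; application/json"
	const mediaType = (req.headers['content-type'] || '').split(';')[0].trim().toLowerCase()
	if (req.method === 'POST' && mediaType !== 'application/json') {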
		return res.status(415).json({
			error: 'Unsupported Media Type — Content-Type must be application/json'
		})
	}
	next()
})

Request Body Size Limits

A deeply nested but otherwise valid JSON payload can consume significant memory before your depth limiter fires. Set a hard cap at the HTTP layer:

   import express from 'express'

// Before your GraphQL middleware. express.json() defaults to 100kb, so choose the cap deliberately
app.use(express.json({ limit: '1mb' }))

Transport Security Quick Reference

| Control | Recommendation |
| --- | --- |
| HTTPS | Mandatory everywhere; enforce HSTS with preload: true |
| CORS | Explicit origin allow-list; never use * with credentials: true |
| Content-Type | Restrict to application/json, which blocks most CSRF vectors |
| HTTP Methods | POST only for the GraphQL endpoint in most cases |
| Request Body Size | Hard limit (e.g., 1 MB) before GraphQL parsing begins |
| Cookies | Use httpOnly, secure, and sameSite=Strict flags |

GraphQL Security Checklist

Use this as a final verification gate before any GraphQL API ships to production.

Schema Design

| Check | Why It Matters |
| --- | --- |
| All list fields require pagination with a maximum page size | Prevents single-query data dumps |
| No sensitive internal fields on public response types | passwordHash, secretToken, etc., should never appear in schemas |
| Input types defined for all mutations | Enables schema-level type validation and cleaner code |
| Deprecated fields removed or explicitly guarded | Old fields accumulate security debt over time |
| No circular reference paths that enable infinite nesting | Depth limiting alone does not save you from schema-level cycles |

Query Protection

| Check | How to Verify |
| --- | --- |
| Depth limit configured | Send a query that exceeds the configured depth; expect a 400 error |
| Cost/complexity limit configured | Send a high-complexity query; expect rejection |
| Pagination with a max page size enforced | Request first: 999999; expect an error, not 999999 records |
| Query timeout set | Trigger a slow resolver; verify it stops within the timeout window |
| Array batching disabled or capped | Send a JSON array body; expect rejection or enforcement |

Authentication and Authorization

| Check | How to Verify |
| --- | --- |
| All non-public resolvers require authentication | Send requests with no token; expect 401 or authorization error |
| Authorization enforced at resolver level, not just entry point | Confirm nested type resolvers are covered in your shield rules |
| IDOR protection: ownership verified before returning data | Fetch another user’s object with your token; expect denial |
| Identity derived from the verified token, not request payload | Remove userId from request variables and confirm it still works correctly |

Configuration

| Check | How to Verify |
| --- | --- |
| Introspection disabled (or role-gated) in production | Run { __schema { types { name } } } against production; expect rejection |
| GraphiQL disabled in production | Navigate to /graphql in a browser; expect no interactive IDE |
| Stack traces absent from error responses | Trigger a known resolver error and inspect the response body |
| NODE_ENV=production set in the deployment environment | Verify in runtime configuration or deployment logs |

Transport and Infrastructure

| Check | How to Verify |
| --- | --- |
| HTTPS enforced with HSTS | Check response headers for Strict-Transport-Security |
| CORS restricted to known origins | Send a request from an unlisted origin; expect a CORS error |
| Security headers present | Use securityheaders.com or your test suite to verify |
| Rate limiting active | Exceed the configured threshold; expect 429 Too Many Requests |
| Request body size capped | Send a 10 MB body; expect rejection before GraphQL parsing |

GraphQL Security in Federated and Microservice Architectures

As organizations scale their GraphQL deployments, many move from a single monolithic GraphQL server to a federated architecture — sometimes called a supergraph — where multiple downstream subgraph services each own a slice of the overall schema. Apollo Federation and similar tools stitch these subgraphs together behind a single gateway. This architecture introduces distinct security challenges that a monolithic GraphQL API does not face.

The Gateway as a Trust Boundary

In a federated architecture, the gateway receives client requests and distributes them across multiple subgraph services. A common mistake is treating traffic that arrives at a subgraph from the gateway as inherently trusted. If a subgraph blindly accepts any request from the gateway without further verification, an attacker who compromises the gateway (or who can route traffic directly to a subgraph, bypassing the gateway entirely) gains full access to that subgraph’s data.

Every subgraph, not just the gateway, must independently verify the identity and authorization of each request it receives. The standard approach is to forward the original user’s JWT from the gateway to each subgraph as a request header, and let each subgraph validate and decode it independently. Never replace the user’s JWT with a gateway-issued service token that implicitly grants elevated access to all downstream subgraphs.
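The forwarding step can be sketched as a pure function that builds the headers the gateway sends downstream. This is an illustration under stated assumptions, not Apollo's implementation. buildSubgraphHeaders and the FORWARDED_HEADERS allow-list are hypothetical names, and x-request-id is just an example of a second header worth forwarding. In Apollo Gateway, a function like this could be called from a data source's willSendRequest hook; that wiring is not shown here.

```javascript
// Allow-list of headers the gateway passes through to subgraphs (lower-case).
const FORWARDED_HEADERS = new Set(['authorization', 'x-request-id'])

// Copies only allow-listed headers, so a client cannot smuggle identity
// claims such as X-User-Role past the gateway.
function buildSubgraphHeaders(incomingHeaders) {
	const outgoing = {}
	for (const [name, value] of Object.entries(incomingHeaders)) {
		const lower = name.toLowerCase()
		if (FORWARDED_HEADERS.has(lower)) outgoing[lower] = value
	}
	return outgoing
}

const forwarded = buildSubgraphHeaders({
	Authorization: 'Bearer <user-jwt>',
	'X-User-Role': 'admin', // spoofed by the client; must not reach the subgraph
	'X-Request-Id': 'req-123'
})
console.log(forwarded)
```

The subgraph then verifies the forwarded token itself and derives the user's role from its claims, never from a header the client could have set.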

Consistent Authorization Across Subgraphs

In a federated setup, a single GraphQL field might be assembled from data contributed by two or three different subgraphs. This means an authorization check that runs in subgraph A might not run in subgraph B, even though both contribute data to the same response. Teams that own individual subgraphs need to agree on and enforce consistent authorization standards — a security audit of one subgraph is not sufficient when the field it contributes is accessed through a join resolved across multiple services.

Shared authorization libraries, common middleware packages, and regular cross-subgraph security reviews help maintain consistency. Some organizations implement a dedicated authorization service that all subgraphs call to evaluate permissions, rather than implementing authorization logic independently in each.
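A shared library of this kind can be very small. The sketch below shows one possible shape: a single can(user, action, resource) entry point backed by a policy table that every subgraph imports. The action names and the roles are assumptions chosen to mirror the permission matrix earlier in this article; a real policy set would be defined by your own teams.

```javascript
// True when the user wrote the resource or holds one of the listed roles.
function isOwnerOrRole(user, resource, roles) {
	if (!user) return false
	return resource.authorId === user.id || roles.includes(user.role)
}

// Hypothetical shared policy table: action name -> predicate(user, resource).
// A Map avoids accidental matches on inherited object properties.
const policies = new Map([
	['post:read', (user, post) => post.published || isOwnerOrRole(user, post, ['editor', 'admin'])],
	['post:update', (user, post) => isOwnerOrRole(user, post, ['editor', 'admin'])],
	['post:delete', (user, post) => isOwnerOrRole(user, post, ['admin'])]
])

// Single entry point for every subgraph; unknown actions are denied by default.
function can(user, action, resource) {
	const policy = policies.get(action)
	return policy ? Boolean(policy(user, resource)) : false
}

const draft = { authorId: 'u1', published: false }
console.log(can({ id: 'u2', role: 'user' }, 'post:update', draft)) // false: not owner, no role
```

Because each subgraph calls the same function, a change to a policy takes effect everywhere once the shared package is updated, which is the consistency the paragraph above is after.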

Schema Composition and Breaking Changes

Federation adds a second dimension to schema security: the security of the composed schema itself. When multiple teams contribute subgraph schemas that are composed together, changes made by one team can inadvertently expose new fields through the gateway’s merged schema. Establish a schema review process, similar to a code review process, that flags any new field added to a publicly exposed type for security evaluation before it reaches production.

Tooling such as Apollo Schema Checks or GraphQL Inspector can catch breaking changes and flag newly introduced fields in a CI/CD pipeline before they are deployed to production. Integrating this check as a required pull-request gate ensures that no field ships without review.
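The core of such a check is a diff between the previously approved schema and the proposed one. The sketch below illustrates the idea on a deliberately simplified representation, where each schema is a plain object mapping type names to field-name arrays. findNewFields is a hypothetical helper; the tools named above work on the real schema and report far more than this.

```javascript
// Returns "Type.field" for every field present in nextSchema but not in previousSchema.
// Schemas are simplified to { TypeName: ['fieldA', 'fieldB'] } for illustration.
function findNewFields(previousSchema, nextSchema) {
	const added = []
	for (const [typeName, fields] of Object.entries(nextSchema)) {
		const hasType = Object.prototype.hasOwnProperty.call(previousSchema, typeName)
		const known = new Set(hasType ? previousSchema[typeName] : [])
		for (const field of fields) {
			if (!known.has(field)) added.push(`${typeName}.${field}`)
		}
	}
	return added
}

const approved = { User: ['id', 'email'] }
const proposed = { User: ['id', 'email', 'ssn'], AuditEntry: ['actor'] }
console.log(findNewFields(approved, proposed))
```

A CI job could fail the build, or request a security reviewer, whenever the returned list is non-empty.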

Network Isolation for Subgraphs

Subgraph services should not be exposed directly to the public internet. They should be reachable only from within your internal network, and only the gateway should be allowed to communicate with them, enforced through IP allowlisting or mutual TLS between the gateway and each subgraph. With this segmentation in place, an attacker who discovers a subgraph’s address still cannot send arbitrary queries to it without first passing through the gateway’s authentication and authorization controls.

Distributed Rate Limiting

Standard in-process rate limiting (using express-rate-limit or similar) counts requests per process. In a federated or horizontally scaled architecture, each process maintains its own counter, so a client whose requests are spread across N instances can send roughly N times the intended limit. Such deployments need distributed rate limiting backed by a shared store such as Redis. Libraries like rate-limiter-flexible support Redis-backed distributed counters and make this straightforward to configure.
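The difference between per-process and shared counters is easy to demonstrate. The sketch below is a fixed-window limiter whose counters live in an injected store. It is a simplified illustration, not the rate-limiter-flexible API: createLimiter and createMemoryStore are hypothetical names, and the in-memory Map stands in for a shared store such as Redis so the example runs without any infrastructure.

```javascript
// Fixed-window limiter; the store only needs increment(key) -> new count.
// In production that role could be played by an atomic Redis counter with an expiry.
function createLimiter(store, { limit, windowMs }) {
	return function isAllowed(clientId, now = Date.now()) {
		const windowIndex = Math.floor(now / windowMs)
		const count = store.increment(`${clientId}:${windowIndex}`)
		return count <= limit
	}
}

// In-memory stand-in for the store. It never expires old keys, which is
// acceptable for a sketch but not for a long-running process.
function createMemoryStore() {
	const counts = new Map()
	return {
		increment(key) {
			const next = (counts.get(key) || 0) + 1
			counts.set(key, next)
			return next
		}
	}
}

const options = { limit: 2, windowMs: 60_000 }

// Two "processes" with private stores: each one allows the full limit.
const isolatedA = createLimiter(createMemoryStore(), options)
const isolatedB = createLimiter(createMemoryStore(), options)

// Two "processes" sharing one store: the limit holds across both.
const sharedStore = createMemoryStore()
const sharedA = createLimiter(sharedStore, options)
const sharedB = createLimiter(sharedStore, options)
```

Alternating four requests between isolatedA and isolatedB lets all four through, while the same pattern against sharedA and sharedB stops after two, which is the behavior a distributed limiter is meant to guarantee.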


Looking ahead, three developments are likely to shape GraphQL security:

  1. Enhanced Tooling:
  • More frameworks will emerge to simplify query cost analysis and depth limiting.
  2. Federated Security:
  • Distributed GraphQL architectures will adopt standardized security practices.
  3. AI-Powered Monitoring:
  • Artificial intelligence will detect and mitigate suspicious query patterns in real time.

Conclusion

Securing a GraphQL API requires a holistic approach, combining schema design, authentication, query analysis, and monitoring. By implementing these best practices and leveraging the right tools, you can protect your API from common vulnerabilities while delivering a seamless experience to your users.

The most important shift in mindset for GraphQL security is acknowledging that it operates differently from REST. The single endpoint, the dynamic query model, and the introspective schema all change where attacks come from and how they succeed. Rate limiting at the network level is not enough. Middleware at the server entry point is not enough. Security must be woven through every layer — from the HTTP transport all the way down to individual field resolvers.

Start with the highest-impact controls first: disable introspection in production, set depth and cost limits, enforce authentication in context setup, and apply authorization in every resolver that returns sensitive data. These four steps alone address many of the most common real-world GraphQL vulnerabilities. Then layer on the defense-in-depth controls: rate limiting, batching protection, structured error handling, security headers, and comprehensive logging.

Treat your security testing with the same rigor you give to functional testing. Write explicit tests that verify authorization failures, not just authorization successes. Confirm that your depth limits actually reject deep queries. Run a tool like graphql-cop against your staging environment before every significant release. Build these checks into your CI/CD pipeline so that regressions are caught automatically rather than discovered in production.

GraphQL’s evolution toward federated architectures and increasingly complex permission models makes ongoing security investment more important, not less. The teams that build the most trustworthy GraphQL APIs are the ones that treat security as a continuous engineering discipline — not an afterthought applied before launch. Start applying these practices today, revisit your security posture regularly, and build applications that both users and auditors can trust.