graphql-deduplicator
A GraphQL response deduplicator. Removes duplicate entities from the GraphQL response.
Client support
graphql-deduplicator works with any GraphQL client that appends __typename and id fields to every resource. If your client does not automatically request the __typename and id fields, you can specify them in your GraphQL query.
graphql-deduplicator has been tested with apollo-client.
How does it work?
The __typename and id values are used to construct a resource identifier. The resource identifier is used to normalize data. As a result, when a GraphQL API response contains a resource with a repeating identifier, apollo-client reads only the first instance of the resource and ignores the duplicate entities. graphql-deduplicator strips the body (all fields other than __typename and id) from all the duplicate entities.
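The stripping step described above can be sketched roughly as follows. This is a simplified illustration and not the library's actual source: deflateSketch, seen, and path are hypothetical names, and scoping the identifier to the field path is an assumption made so that the same resource requested with different fields elsewhere in the query is not stripped by mistake.

```javascript
// Simplified sketch of body stripping; names are hypothetical.
// `seen` holds the identifiers of resources already emitted in full.
function deflateSketch(node, seen = new Set(), path = '') {
  if (Array.isArray(node)) {
    // List items share the path of the field that holds the list.
    return node.map((child) => deflateSketch(child, seen, path));
  }

  if (node === null || typeof node !== 'object') {
    return node; // scalars pass through unchanged
  }

  if (node.__typename && node.id) {
    // Resource identifier built from __typename and id (assumption: per path).
    const key = `${path}|${node.__typename}:${node.id}`;

    if (seen.has(key)) {
      // Duplicate: keep only the fields needed to identify the resource.
      return { __typename: node.__typename, id: node.id };
    }

    seen.add(key);
  }

  const result = {};

  for (const [field, value] of Object.entries(node)) {
    result[field] = deflateSketch(value, seen, `${path}.${field}`);
  }

  return result;
}
```

Calling deflateSketch on a response in which two events reference the same movie leaves the first movie intact and reduces the second to its __typename and id.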
Motivation
graphql-deduplicator is designed to reduce the GraphQL response size by removing the body of duplicate entities. This allows you to make queries that return large datasets of repeated data without worrying about the cost of the response body size, the time it takes to parse the response, or the memory the reconstructed object will consume.
Real-life example
Consider the following schema:
    interface Node {
      id: ID!
    }

    type Movie implements Node {
      id: ID!
      name: String!
      synopsis: String!
    }

    type Event implements Node {
      id: ID!
      movie: Movie!
      date: String!
      time: String!
    }

    type Query {
      events (
        date: String
      ): [Event!]!
    }
Using this schema, you can query events for a particular date, e.g.
    {
      events (date: "2017-05-19") {
        __typename
        id
        date
        time
        movie {
          __typename
          id
          name
          synopsis
        }
      }
    }
Note: If you are using apollo-client, then you do not need to include __typename when constructing the query.
The result of the above query will contain a lot of duplicate information.
I ran into this situation when building https://applaudience.co.uk. A query retrieving 300 events produced a response of 1.5MB. When gzipped, that number dropped to 100KB. However, upon receiving the response, the browser still needs to parse the entire JSON document. Parsing a 1.5MB JSON string is (a) time-consuming and (b) memory-expensive.
The good news is that we do not need to return the body of duplicate records (see How does it work?). For all duplicate records, we only need to return __typename and id. This information is enough for apollo-client to identify the resource as a duplicate and skip it. When a response includes large and often repeated fragments, this can reduce the response size by 10x, 100x, or more.
In the case of the earlier example, the synopsis and name fields are removed from every duplicate Movie entity in the response, leaving only __typename and id.
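To make the shape of such a response concrete, here is a small illustration with two events that share one movie. All values (ids, times, the movie name and synopsis) are hypothetical and chosen only for the example; they are not taken from a real API response.

```javascript
// Hypothetical data for illustration only.
const fullData = {
  events: [
    {
      __typename: 'Event',
      id: '1',
      date: '2017-05-19',
      time: '17:25',
      movie: {
        __typename: 'Movie',
        id: '1',
        name: 'Example Movie',
        synopsis: 'A long synopsis that would otherwise be repeated for every event.'
      }
    },
    {
      __typename: 'Event',
      id: '2',
      date: '2017-05-19',
      time: '20:30',
      movie: {
        __typename: 'Movie',
        id: '1',
        name: 'Example Movie',
        synopsis: 'A long synopsis that would otherwise be repeated for every event.'
      }
    }
  ]
};

// The same data after deduplication: the second Movie keeps only
// the fields that identify it.
const deflatedData = {
  events: [
    fullData.events[0],
    {
      __typename: 'Event',
      id: '2',
      date: '2017-05-19',
      time: '20:30',
      movie: { __typename: 'Movie', id: '1' }
    }
  ]
};

// Compare the serialized sizes of the two payloads.
const fullSize = JSON.stringify(fullData).length;
const deflatedSize = JSON.stringify(deflatedData).length;

console.log({ fullSize, deflatedSize });
```

With only two events the saving is small; it grows with the number of events that repeat the same movie.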
Usage
Server-side
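You need to format the final result of the query. If you are using graphql-server, configure formatResponse, e.g.

    import express from 'express';
    import { graphqlExpress } from 'graphql-server-express';
    import { deflate } from 'graphql-deduplicator';

    const app = express();

    app.use('/api', graphqlExpress(() => {
      return {
        // [..] schema and other options
        formatResponse: (response) => {
          // Leave introspection results untouched; deflate regular query data only.
          if (response.data && !response.data.__schema) {
            return deflate(response);
          }

          return response;
        }
      };
    }));

    // The port is chosen for illustration.
    app.listen(3000);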
Client-side
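Example usage with apollo-client
You need to modify the server response before it is processed by the GraphQL client. If you are using apollo-client, use the link configuration to set up an afterware, e.g.

    // @flow

    import { ApolloClient } from 'apollo-client';
    import { InMemoryCache } from 'apollo-cache-inmemory';
    import { ApolloLink, concat } from 'apollo-link';
    import { HttpLink } from 'apollo-link-http';
    import { inflate } from 'graphql-deduplicator';

    const httpLink = new HttpLink({
      credentials: 'include',
      uri: '/api'
    });

    // Afterware: restore the stripped fields before the response reaches the cache.
    const inflateLink = new ApolloLink((operation, forward) => {
      return forward(operation).map((response) => {
        return inflate(response);
      });
    });

    const apolloClient = new ApolloClient({
      cache: new InMemoryCache(),
      link: concat(inflateLink, httpLink)
    });

    export default apolloClient;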
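For intuition about what the client-side step has to do, the following sketch reverses the stripping by reusing the first complete instance of each resource. It is a simplified illustration and not the library's actual source: inflateSketch, index, and path are hypothetical names, and keying the index by field path is an assumption.

```javascript
// Simplified sketch of restoring stripped resources; names are hypothetical.
// `index` maps a resource identifier to its first, complete instance.
function inflateSketch(node, index = new Map(), path = '') {
  if (Array.isArray(node)) {
    return node.map((child) => inflateSketch(child, index, path));
  }

  if (node === null || typeof node !== 'object') {
    return node; // scalars pass through unchanged
  }

  let key = null;

  if (node.__typename && node.id) {
    key = `${path}|${node.__typename}:${node.id}`;

    if (index.has(key)) {
      // Seen before: substitute the complete instance for the stripped one.
      return index.get(key);
    }
  }

  const result = {};

  if (key !== null) {
    index.set(key, result);
  }

  for (const [field, value] of Object.entries(node)) {
    result[field] = inflateSketch(value, index, `${path}.${field}`);
  }

  return result;
}

// Usage with hypothetical data: the second movie arrives stripped.
const stripped = {
  data: {
    events: [
      { __typename: 'Event', id: '1', movie: { __typename: 'Movie', id: '1', name: 'Example Movie' } },
      { __typename: 'Event', id: '2', movie: { __typename: 'Movie', id: '1' } }
    ]
  }
};

const restored = inflateSketch(stripped);
```

In this sketch the restored duplicates share a reference with the first instance rather than being copied, which keeps memory use low but means they should be treated as read-only.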
Example usage with apollo-boost
It is not possible to configure link with apollo-boost. Therefore, it is not possible to use graphql-deduplicator with apollo-boost. Use the apollo-client setup instead.
Note: apollo-boost will be discontinued starting with Apollo Client v3.
Best practices
Enable compression conditionally
Do not break the integration of standard GraphQL clients that are unaware of graphql-deduplicator.
Use deflate only when the client requests to use graphql-deduplicator, e.g.
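    // Server-side
    app.use('/api', graphqlExpress((request) => {
      return {
        // [..] schema and other options
        formatResponse: (response) => {
          // Deflate only when the client opted in with `?deduplicate=1`.
          if (request.query.deduplicate && response.data && !response.data.__schema) {
            return deflate(response);
          }

          return response;
        }
      };
    }));

    // Client-side
    const httpLink = new HttpLink({
      credentials: 'include',
      uri: '/api?deduplicate=1'
    });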
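Example usage with Apollo Server

    import { ApolloServer } from 'apollo-server-express'
    import { GraphQLExtension } from 'graphql-extensions'
    import { deflate } from 'graphql-deduplicator'

    // [..]

    const createContext = ({ req }) => {
      return {
        req,
        // [..]
      }
    }

    class DeduplicateResponseExtension extends GraphQLExtension {
      public willSendResponse(o) {
        const { context, graphqlResponse } = o
        // Ensures `?deduplicate=1` is used in the request
        if (context.req.query.deduplicate && graphqlResponse.data && !graphqlResponse.data.__schema) {
          const data = deflate(graphqlResponse.data)
          return {
            ...o,
            graphqlResponse: {
              ...graphqlResponse,
              data,
            },
          }
        }

        return o
      }
    }

    const apolloServer = new ApolloServer({
      // [..]
      context: createContext,
      extensions: [() => new DeduplicateResponseExtension()],
    })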