Gatsby 5: How to query data from multiple GraphQL sources
Introduction
It's been a while since I migrated my blog from WordPress to Gatsby.
When I started the migration, I initially opted to use JavaScript and the gatsby-transformer-javascript-frontmatter plugin to build my blog pages.
A while ago, I decided to give MDX a try. MDX provided almost the same features I could get with JavaScript, plus the ability to have posts with a much cleaner and clearer syntax.
The main challenge now was to query data from both sources, JavaScript frontmatter and MDX frontmatter, to provide a consistent and clean developer experience.
Fortunately, both plugins use a similar frontmatter structure, which made the task straightforward.
This is my particular case and main motivation. However, I'm sure that there are many other use cases that might lead to this requirement. In this article, I will share my journey and guide you through customizing the Gatsby GraphQL schema to query data from multiple sources.
Customizing the GraphQL schema
The GraphQL schema consumed by Gatsby can be customized by leveraging the createSchemaCustomization callback in the gatsby-node.js file.
Here's a basic structure of the createSchemaCustomization function:
exports.createSchemaCustomization = ({actions}) => {
  const {createTypes} = actions;
  const typeDefs = `
    # type definitions
  `;
  createTypes(typeDefs);
};
To enable data consumption from multiple sources, we start by creating a GraphQL interface that both JavaScript and MDX frontmatter nodes will implement.
Here's the interface definition, which goes inside the typeDefs constant shown above:
interface BlogPost implements Node {
  id: ID!
  frontmatter: Frontmatter!
  fields: FrontmatterDerivedFields!
  allFrontmatter: JSON!
  allFields: JSON!
}
The BlogPost interface extends the base Node interface, which is common to all nodes in Gatsby.
We then define the fields that are common to both JavaScript and MDX frontmatter nodes.
In my case, these are the frontmatter field, which both official frontmatter plugins provide, and the fields field, which I generate in the onCreateNode callback for both node types.
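As a rough illustration of that onCreateNode step, here is a minimal sketch of how the derived values might be attached to both node types. The deriveFields helper, the date handling, and the way the file path is read from each node are assumptions made for this example, not the code used on my blog; only createNodeField and node.internal.type are standard Gatsby API.

```javascript
// Node types that implement the BlogPost interface.
const BLOG_POST_TYPES = ['Mdx', 'JavascriptFrontmatter'];

// Hypothetical helper: builds the FrontmatterDerivedFields values.
// Assumes `created` and `lastModified` are ISO 8601 date strings.
function deriveFields(nodeType, filePath, frontmatter) {
  const created = new Date(frontmatter.created).toISOString();
  const lastModified = new Date(frontmatter.lastModified).toISOString();
  const createdDate = created.slice(0, 10); // YYYY-MM-DD (UTC)

  return {
    type: nodeType,
    filePath,
    createdYear: Number(created.slice(0, 4)),
    createdMonth: created.slice(5, 7),
    createdDay: created.slice(8, 10),
    // The Date fields keep a full date; @dateformat lets each query pick a format.
    createdYearMonth: createdDate,
    createdDateShort: createdDate,
    lastModifiedDateShort: lastModified.slice(0, 10),
  };
}

function onCreateNode({node, actions}) {
  if (!BLOG_POST_TYPES.includes(node.internal.type)) return;

  // Assumption: where the source file path lives on each node type.
  const filePath = node.internal.contentFilePath || node.fileAbsolutePath || '';
  const fields = deriveFields(node.internal.type, filePath, node.frontmatter);

  // createNodeField stores each value under node.fields.<name>.
  for (const [name, value] of Object.entries(fields)) {
    actions.createNodeField({node, name, value});
  }
}

// In gatsby-node.js: exports.onCreateNode = onCreateNode;
```

Because the same function handles both node types, the fields object has the same shape no matter which plugin created the node, which is what the shared interface relies on.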
Next, we define the Frontmatter and FrontmatterDerivedFields types:
type Frontmatter {
  slug: String!
  langKey: String!
  author: String!
  title: String!
  created: String!
  lastModified: String!
  categories: [String!]!
  tags: [String!]!
  description: String!
  image: String!
  readingTime: String!
  comments: File @fileByRelativePath
  pageQuery: String
}
type FrontmatterDerivedFields {
  type: String!
  filePath: String!
  createdYear: Int!
  createdMonth: String!
  createdDay: String!
  createdYearMonth: Date! @dateformat
  createdDateShort: Date! @dateformat
  lastModifiedDateShort: Date! @dateformat
}
While not strictly necessary (Gatsby can infer types from the data), these type definitions declare every field explicitly, and the ! modifier marks the fields that each frontmatter node is required to provide.
Without them, Gatsby only adds a field to the schema if at least one node of each frontmatter type contains it.
Lastly, we define the Mdx and JavascriptFrontmatter types, both implementing the BlogPost interface:
type Mdx implements Node & BlogPost {
  id: ID!
  frontmatter: Frontmatter!
  fields: FrontmatterDerivedFields!
  allFrontmatter: JSON! @proxy(from: "frontmatter")
  allFields: JSON! @proxy(from: "fields")
}
type JavascriptFrontmatter implements Node & BlogPost {
  id: ID!
  frontmatter: Frontmatter!
  fields: FrontmatterDerivedFields!
  allFrontmatter: JSON! @proxy(from: "frontmatter")
  allFields: JSON! @proxy(from: "fields")
}
This is all the content that needs to be added to the typeDefs constant.
With that, we have defined the GraphQL schema that will be consumed by Gatsby and that will allow us to query data from the MDX and JavaScript frontmatters.
Note the @proxy directive on the allFrontmatter and allFields fields.
These are artificial fields that mirror the content of the frontmatter and fields fields.
I use them as a hack to query all the frontmatter fields without having to list each one in the GraphQL query.
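To make the hack more concrete, here is a small sketch of how the proxied fields could be consumed. The query constant and the toPosts helper are hypothetical names introduced for illustration, and the sketch assumes the query string is handed to Gatsby's graphql function elsewhere.

```javascript
// Assumption: this query string would be passed to Gatsby's graphql function.
// allFrontmatter and allFields are JSON scalars, so they take no sub-selection.
const ALL_FRONTMATTER_QUERY = `
  query {
    allBlogPost {
      edges {
        node {
          id
          allFrontmatter
          allFields
        }
      }
    }
  }
`;

// Hypothetical helper: flattens the query result into plain post objects.
function toPosts(data) {
  return data.allBlogPost.edges.map(({node}) => ({
    id: node.id,
    ...node.allFrontmatter,
    ...node.allFields,
  }));
}
```

The trade-off, as far as I can tell, is that JSON values come back as they are stored on the node, so per-field arguments such as formatString on the date fields are not available through the proxied fields.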
Let us now see how we can query data from both sources.
Querying data from multiple sources
With our BlogPost interface defined, we can now query data from both sources with a single query.
Here's a sample query that retrieves the title and slug fields from both sources:
query MyQuery {
  allBlogPost {
    edges {
      node {
        frontmatter {
          slug
          title
        }
      }
    }
  }
}
Rather than running two separate queries and merging the results on the client, this approach allows us to run a single query that retrieves the data from both sources. In addition, we'll also be able to sort the merged results by any field:
query MyQuery {
  allBlogPost(sort: {frontmatter: {created: ASC}}) {
    # ...
  }
}
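As a quick way to check that a single sorted query really returns nodes from both plugins, the sketch below adds the standard GraphQL __typename field to the selection and reports which concrete type each post came from. The listPosts helper and the query constant are hypothetical names for this example, and the sketch assumes the result has the {errors, data} shape that Gatsby's graphql function resolves to in gatsby-node.js.

```javascript
const SORTED_POSTS_QUERY = `
  query {
    allBlogPost(sort: {frontmatter: {created: ASC}}) {
      edges {
        node {
          __typename
          frontmatter {
            slug
            title
          }
        }
      }
    }
  }
`;

// Hypothetical helper: validates a query result and lists each post
// together with the concrete node type it came from.
function listPosts(result) {
  if (result.errors) {
    throw new Error('The allBlogPost query failed');
  }
  return result.data.allBlogPost.edges.map(({node}) => ({
    source: node.__typename, // 'Mdx' or 'JavascriptFrontmatter'
    slug: node.frontmatter.slug,
    title: node.frontmatter.title,
  }));
}

// Possible usage inside createPages in gatsby-node.js:
//   const posts = listPosts(await graphql(SORTED_POSTS_QUERY));
```

The order of the returned list is the order produced by the sort argument, so no extra merging or sorting is needed in JavaScript.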
Conclusion
In this article, I've demonstrated how to customize the Gatsby GraphQL schema to query data from multiple sources.
More specifically, I've shown how to query the frontmatter field provided by both the JavaScript and MDX frontmatter plugins.
This simple yet effective process ensures a consistent developer experience when working with data from diverse sources. I hope this article proves useful in addressing your unique use cases and streamlining your development workflow.