Gatsby 5: How to query data from multiple GraphQL sources
Introduction
It's been a while since I migrated my blog from WordPress to Gatsby.
When I started the migration process, I initially opted to use JavaScript and the gatsby-transformer-javascript-frontmatter plugin to build my blog pages.
A while ago, I decided to give MDX a try. MDX provided almost the same features I could get with JavaScript, plus the ability to have posts with a much cleaner and clearer syntax.
The main challenge now was to be able to query data from both sources, JavaScript frontmatter and MDX frontmatter,
to provide a consistent and clean developer experience.
Fortunately, both plugins used a similar frontmatter structure, making the task straightforward.
This is my particular case and main motivation. However, I'm sure that there are many other use cases that might lead to this requirement. In this article, I will share my journey and guide you through customizing the Gatsby GraphQL schema to query data from multiple sources.
Customizing the GraphQL schema
The GraphQL schema consumed by Gatsby can be customized by implementing the createSchemaCustomization callback in the gatsby-node.js file.
Here's a basic structure of the createSchemaCustomization function:
exports.createSchemaCustomization = ({actions}) => {
  const {createTypes} = actions;
  const typeDefs = `
    # type definitions
  `;
  createTypes(typeDefs);
};
To enable data consumption from multiple sources, we start by creating a GraphQL interface that both JavaScript and MDX frontmatter nodes will implement.
Here's a snippet of the interface definition that is part of the typeDefs constant defined in the previous structure:
interface BlogPost implements Node {
  id: ID!
  frontmatter: Frontmatter!
  fields: FrontmatterDerivedFields!
  allFrontmatter: JSON!
  allFields: JSON!
}
The BlogPost interface extends the base Node interface, which is common to all nodes in Gatsby.
We then define the fields that are common to both JavaScript and MDX frontmatter nodes.
In my case, these are the frontmatter field, which both official frontmatter plugins provide, and the fields field, which I generate during the onCreateNode callback for both node types.
Next, we define the Frontmatter and FrontmatterDerivedFields types:
type Frontmatter {
  slug: String!
  langKey: String!
  author: String!
  title: String!
  created: String!
  lastModified: String!
  categories: [String!]!
  tags: [String!]!
  description: String!
  image: String!
  readingTime: String!
  comments: File @fileByRelativePath
  pageQuery: String
}
type FrontmatterDerivedFields {
  type: String!
  filePath: String!
  createdYear: Int!
  createdMonth: String!
  createdDay: String!
  createdYearMonth: Date! @dateformat
  createdDateShort: Date! @dateformat
  lastModifiedDateShort: Date! @dateformat
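}
While not strictly necessary (Gatsby can infer types), these explicit definitions make every listed field part of the schema regardless of the data.
Because most fields are declared non-null (!), a node that lacks one of them causes an error when that field is queried.
Without these definitions, Gatsby relies on inference, so a field only appears in the schema if at least one node of each frontmatter type contains it.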
Lastly, we define the Mdx and JavascriptFrontmatter types, both implementing the BlogPost interface:
type Mdx implements Node & BlogPost {
  id: ID!
  frontmatter: Frontmatter!
  fields: FrontmatterDerivedFields!
  allFrontmatter: JSON! @proxy(from: "frontmatter")
  allFields: JSON! @proxy(from: "fields")
}
type JavascriptFrontmatter implements Node & BlogPost {
  id: ID!
  frontmatter: Frontmatter!
  fields: FrontmatterDerivedFields!
  allFrontmatter: JSON! @proxy(from: "frontmatter")
  allFields: JSON! @proxy(from: "fields")
}
This is all the content that needs to be added to the typeDefs constant.
With that, we have defined the GraphQL schema that will be consumed by Gatsby and that will allow us to query data from
the MDX and JavaScript frontmatters.
Note
Note the @proxy directive on the allFrontmatter and allFields fields.
These are artificial fields that mirror the content of the frontmatter and fields fields as JSON values.
I use this as a hack to query all frontmatter fields at once, without listing each one in the GraphQL query.
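To illustrate the note above, a query using the proxy fields might look like the one below, wrapped here in a JavaScript string next to a small helper that reads the JSON values. The query name, the ALL_POSTS_QUERY constant, and the toSummaries helper are hypothetical; this is a sketch, not the exact code used on the blog.

```javascript
// Query text that could be passed to Gatsby's graphql function.
// allFrontmatter and allFields are JSON scalars, so they are returned
// as whole objects and need no sub-selection.
const ALL_POSTS_QUERY = `
  query AllPostsRaw {
    allBlogPost {
      edges {
        node {
          id
          allFrontmatter
          allFields
        }
      }
    }
  }
`;

// Hypothetical helper: flattens the query result into plain objects.
const toSummaries = (data) =>
  data.allBlogPost.edges.map(({node}) => ({
    id: node.id,
    ...node.allFrontmatter,
    ...node.allFields,
  }));
```

The trade-off is that JSON fields bypass GraphQL's field selection, so every frontmatter value is returned whether the page needs it or not.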
Let us now see how we can query data from both sources.
Querying data from multiple sources
With our BlogPost interface defined, we can now query data from both sources through the allBlogPost root field.
Here's a sample query that retrieves the title and slug fields from both sources:
query MyQuery {
  allBlogPost {
    edges {
      node {
        frontmatter {
          slug
          title
        }
      }
    }
  }
}
Rather than running two separate queries and merging the results on the client, this approach allows us to run a single query that retrieves the data from both sources. In addition, we'll also be able to sort the merged results by any field:
query MyQuery {
  allBlogPost(sort: {frontmatter: {created: ASC}}) {
    # ...
  }
}
Conclusion
In this article, I've demonstrated how to customize the Gatsby GraphQL schema to query data from multiple sources, more specifically the frontmatter field provided by the JavaScript and MDX frontmatter plugins.
This simple yet effective process ensures a consistent developer experience when working with data from diverse sources. I hope this article proves useful in addressing your unique use cases and streamlining your development workflow.
