WordPress powers a huge share of the web, but teams outgrow it — whether it’s the monolithic architecture, plugin fatigue, or the need for structured content that works across channels. Sanity is a common destination for these teams, and the migration path is well-trodden but full of decisions.
This guide walks through the full process: exporting from WordPress, mapping your content model, handling media, and converting HTML to Portable Text.
Before You Start
Take stock of what you’re actually migrating. WordPress sites accumulate years of content, and not all of it needs to come over.
Audit your content:
- How many post types do you have? (Posts, Pages, custom post types via CPT UI or ACF)
- Which content is still relevant? Archive pages from 2014 may not be worth migrating.
- What custom fields exist? (ACF fields, Yoast SEO metadata, WooCommerce product data)
- How are images referenced? (Media Library, hotlinked external URLs, inline in post HTML)
- What taxonomies matter? (Categories, tags, custom taxonomies)
Tip
Run a content audit before writing any migration code. Export your WordPress database and query it to understand the true shape of your data — the admin UI often hides complexity that the database reveals.
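A first-pass audit can be as small as a tally of records per type and year. The sketch below assumes each record carries the `type` and `date` fields that the WordPress REST API returns for posts; `WpRecord`, `summarizeContent`, and the sample data are illustrative names, not part of any library.

```typescript
// Hypothetical record shape: only the fields this audit reads.
interface WpRecord {
  type: string; // "post", "page", or a custom post type slug
  date: string; // e.g. "2014-03-02T09:15:00"
}

// Count records per "type/year" so old or rarely used content stands out.
function summarizeContent(records: WpRecord[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const record of records) {
    const key = `${record.type}/${record.date.slice(0, 4)}`;
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Made-up sample data for illustration.
const auditSample: WpRecord[] = [
  { type: "post", date: "2014-03-02T09:15:00" },
  { type: "post", date: "2014-11-20T14:00:00" },
  { type: "page", date: "2023-06-01T08:30:00" },
];
console.log(summarizeContent(auditSample));
```

A bucket with hundreds of entries from a decade ago is a good prompt to ask whether that content needs to move at all.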
Step 1: Export Content from WordPress
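The WordPress REST API is the cleanest way to extract content programmatically. It gives you structured JSON rather than raw database dumps. Unauthenticated requests return only published content, so drafts and private posts need an authenticated request.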
// scripts/export-wordpress.ts
import { mkdirSync, writeFileSync } from "node:fs";

const WP_URL = "https://your-site.com/wp-json/wp/v2";

// Fetch every page of a collection, using the X-WP-TotalPages response header
async function fetchAllPages(endpoint: string) {
  const items: any[] = [];
  let page = 1;
  let totalPages = 1;
  while (page <= totalPages) {
    const res = await fetch(`${WP_URL}/${endpoint}?per_page=100&page=${page}`);
    if (!res.ok) {
      throw new Error(`Request for ${endpoint} page ${page} failed: ${res.status}`);
    }
    // Fall back to a single page if the header is missing
    totalPages = Number(res.headers.get("x-wp-totalpages")) || 1;
    const data = await res.json();
    items.push(...data);
    page++;
  }
  return items;
}

const posts = await fetchAllPages("posts");
const pages = await fetchAllPages("pages");
const categories = await fetchAllPages("categories");
const tags = await fetchAllPages("tags");
const media = await fetchAllPages("media");

// Write to disk for inspection (create the export directory if needed)
mkdirSync("export", { recursive: true });
writeFileSync("export/posts.json", JSON.stringify(posts, null, 2));
writeFileSync("export/pages.json", JSON.stringify(pages, null, 2));
writeFileSync("export/categories.json", JSON.stringify(categories, null, 2));
writeFileSync("export/tags.json", JSON.stringify(tags, null, 2));
writeFileSync("export/media.json", JSON.stringify(media, null, 2));

console.log(
  `Exported ${posts.length} posts, ${pages.length} pages, ${media.length} media items`
);
For custom post types, append them to the export:
// If you registered a "project" post type with show_in_rest = true
const projects = await fetchAllPages("project");
writeFileSync("export/projects.json", JSON.stringify(projects, null, 2));
Warning
The REST API only exposes post types with show_in_rest enabled. If a custom post type isn’t appearing, check its registration in your theme or plugin. You may need to temporarily enable it before exporting.
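One way to see which post types are exposed is the `/wp-json/wp/v2/types` endpoint, which returns an object keyed by post type slug. The sketch below assumes you have already fetched that response; `WpTypeInfo` models only the fields it reads, and `listRestEndpoints` and the sample data are hypothetical.

```typescript
// Hypothetical subset of one entry in the /wp-json/wp/v2/types response.
interface WpTypeInfo {
  slug: string;
  name: string;
  rest_base: string;
}

// Pair each exposed post type with the endpoint name to request.
function listRestEndpoints(
  types: Record<string, WpTypeInfo>
): Array<{ slug: string; endpoint: string }> {
  return Object.values(types).map((info) => ({
    slug: info.slug,
    endpoint: info.rest_base,
  }));
}

// Made-up sample response for illustration.
const typesSample: Record<string, WpTypeInfo> = {
  post: { slug: "post", name: "Posts", rest_base: "posts" },
  page: { slug: "page", name: "Pages", rest_base: "pages" },
  project: { slug: "project", name: "Projects", rest_base: "projects" },
};
console.log(listRestEndpoints(typesSample));
```

The `rest_base` value is what belongs in the request path. It defaults to the post type slug but can be overridden at registration, so it is worth checking rather than guessing.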
Handling ACF Fields
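If you use Advanced Custom Fields, version 5.11 and later can expose field values through the REST API natively: enable "Show in REST API" on each field group. On older versions, install the ACF to REST API plugin. Either way, each post's JSON response gains an acf object containing the custom field values.
// With ACF fields exposed over REST, each post includes: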
// post.acf.hero_image
// post.acf.subtitle
// post.acf.related_posts (array of post IDs)
Step 2: Design Your Sanity Schema
Don’t replicate your WordPress structure in Sanity. WordPress organizes content around posts and pages with metadata bolted on via custom fields. Sanity lets you model content as structured documents from the ground up.
Common mapping decisions:
| WordPress | Sanity |
|---|---|
| Posts | article document type |
| Pages | page document type (or multiple: landingPage, aboutPage) |
| Categories | category document type with references |
| Tags | Array of strings on the document, or a tag document type |
| Featured Image | image field with asset reference |
| ACF Repeater fields | Arrays of objects |
| ACF Flexible Content | Portable Text custom blocks or an array of typed objects |
| Yoast SEO fields | seo object field (title, description, ogImage) |
| Post content (HTML) | Portable Text (blockContent) |
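The "ACF Repeater fields to arrays of objects" row can be sketched as follows. Sanity expects every object in an array to carry a unique `_key`, so the sketch derives one from the row index. It assumes repeater rows arrive as plain objects and that an empty repeater may arrive as `false` or `null`; `repeaterToSanityArray` and the `feature` type name are hypothetical.

```typescript
type AcfRow = Record<string, unknown>;

// Convert ACF repeater rows into Sanity array members with _type and _key.
function repeaterToSanityArray(
  rows: AcfRow[] | false | null | undefined,
  typeName: string
): Array<AcfRow & { _type: string; _key: string }> {
  // Treat anything that is not an array as an empty repeater.
  if (!Array.isArray(rows)) return [];
  return rows.map((row, index) => ({
    ...row,
    _type: typeName,
    _key: `${typeName}-${index}`,
  }));
}

// Made-up sample rows for illustration.
const featureRows: AcfRow[] = [
  { heading: "Fast", body: "Loads quickly" },
  { heading: "Structured", body: "Typed content" },
];
console.log(repeaterToSanityArray(featureRows, "feature"));
```

The matching Sanity schema would need an object type (here `feature`) with fields that mirror the repeater's sub-fields.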
Example Schema
// schemas/article.ts
import { defineType, defineField, defineArrayMember } from "sanity";

export default defineType({
  name: "article",
  title: "Article",
  type: "document",
  fields: [
    defineField({
      name: "title",
      type: "string",
      validation: (rule) => rule.required(),
    }),
    defineField({
      name: "slug",
      type: "slug",
      options: { source: "title" },
      validation: (rule) =>
        rule.required().custom((slug) => {
          if (!slug?.current) return "Required";
          if (!/^[a-z0-9-]+$/.test(slug.current))
            return "Slug must be lowercase with hyphens only";
          return true;
        }),
    }),
    defineField({
      name: "publishedAt",
      type: "datetime",
    }),
    defineField({
      name: "excerpt",
      type: "text",
      rows: 3,
    }),
    defineField({
      name: "featuredImage",
      type: "image",
      options: { hotspot: true },
      // Alt text is written by the migration script and queried on the frontend
      fields: [defineField({ name: "alt", type: "string" })],
    }),
    defineField({
      name: "body",
      type: "blockContent",
    }),
    defineField({
      name: "categories",
      type: "array",
      of: [defineArrayMember({ type: "reference", to: [{ type: "category" }] })],
    }),
    defineField({
      name: "seo",
      type: "object",
      fields: [
        defineField({ name: "metaTitle", type: "string" }),
        defineField({ name: "metaDescription", type: "text", rows: 2 }),
      ],
    }),
    defineField({
      name: "wordpressId",
      type: "number",
      readOnly: true,
      hidden: true,
      description: "Original WordPress post ID for migration tracking",
    }),
  ],
});
Note
Keep a wordpressId field on migrated documents. It’s hidden from editors but invaluable for debugging, re-running migrations, and mapping references between WordPress IDs and Sanity document IDs.
Step 3: Migrate Images and Media
Images need to be uploaded to Sanity’s asset pipeline before you can reference them in documents. Do this as a separate step before migrating posts.
// scripts/migrate-media.ts
import { createClient } from "@sanity/client";
import { writeFileSync, readFileSync } from "node:fs";

const client = createClient({
  projectId: "your-project-id",
  dataset: "production", // use your cloned dataset name for trial runs
  apiVersion: "2024-01-01",
  token: process.env.SANITY_TOKEN,
  useCdn: false,
});

const media = JSON.parse(readFileSync("export/media.json", "utf-8"));

// Map WordPress media IDs to Sanity asset IDs
const assetMap: Record<number, string> = {};

for (const item of media) {
  const sourceUrl = item.source_url;
  // Skip PDFs, videos, and other attachments that are not images
  if (!sourceUrl || item.media_type !== "image") continue;
  try {
    const response = await fetch(sourceUrl);
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    const buffer = Buffer.from(await response.arrayBuffer());
    const filename = sourceUrl.split("/").pop() ?? "image";
    const asset = await client.assets.upload("image", buffer, {
      filename,
      title: item.title?.rendered,
      description: item.alt_text,
    });
    assetMap[item.id] = asset._id;
    console.log(`Uploaded: ${filename} -> ${asset._id}`);
  } catch (err) {
    // Log and continue so one bad file does not stop the run
    console.error(`Failed to upload media ${item.id}: ${err}`);
  }
}

// Save the map for use in post migration
writeFileSync("export/asset-map.json", JSON.stringify(assetMap, null, 2));
console.log(`Uploaded ${Object.keys(assetMap).length} of ${media.length} media items`);
Tip
Upload images before migrating posts. This gives you an assetMap that translates WordPress media IDs to Sanity asset references, which you’ll need when converting post content and featured images.
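Categories need the same treatment: the post migration in Step 5 reads an export/category-map.json file that translates WordPress category IDs to Sanity document IDs. A minimal sketch of building that map follows. It assumes a `category` document type with `title`, `slug`, and `wordpressId` fields, which this guide does not define, and the `category-wp-<id>` ID scheme is an illustrative choice.

```typescript
// Subset of the fields WordPress returns for each category.
interface WpCategory {
  id: number;
  name: string;
  slug: string;
}

// Assumed shape of a Sanity category document (hypothetical schema).
interface CategoryDoc {
  _id: string;
  _type: "category";
  title: string;
  slug: { _type: "slug"; current: string };
  wordpressId: number;
}

// Build Sanity documents plus a WordPress ID -> Sanity ID lookup.
function buildCategoryDocuments(categories: WpCategory[]): {
  docs: CategoryDoc[];
  categoryMap: Record<number, string>;
} {
  const categoryMap: Record<number, string> = {};
  const docs = categories.map((category): CategoryDoc => {
    // Deterministic IDs make re-runs overwrite instead of duplicate.
    const _id = `category-wp-${category.id}`;
    categoryMap[category.id] = _id;
    return {
      _id,
      _type: "category",
      title: category.name,
      slug: { _type: "slug", current: category.slug },
      wordpressId: category.id,
    };
  });
  return { docs, categoryMap };
}

// Made-up sample data for illustration.
const categorySample: WpCategory[] = [
  { id: 3, name: "News", slug: "news" },
  { id: 7, name: "Tutorials", slug: "tutorials" },
];
const categoryResult = buildCategoryDocuments(categorySample);
console.log(categoryResult.categoryMap);
```

Each entry in `docs` could then be written with the client's `createOrReplace`, and `categoryMap` saved to export/category-map.json alongside the asset map.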
Step 4: Convert HTML to Portable Text
This is the hardest part of any WordPress migration. WordPress stores post content as HTML (with shortcodes, Gutenberg blocks, and years of editor quirks). Sanity uses Portable Text — a structured, JSON-based rich text format.
Use the @portabletext/block-tools package to handle the conversion:
npm install @portabletext/block-tools @sanity/schema jsdom
// scripts/html-to-portable-text.ts
import { htmlToBlocks } from "@portabletext/block-tools";
import { JSDOM } from "jsdom";
import { Schema } from "@sanity/schema";

// Build a minimal schema for the converter
const schema = Schema.compile({
  name: "default",
  types: [
    {
      name: "blockContent",
      type: "array",
      of: [
        {
          type: "block",
          marks: {
            decorators: [
              { title: "Bold", value: "strong" },
              { title: "Italic", value: "em" },
              { title: "Code", value: "code" },
            ],
            annotations: [
              {
                name: "link",
                type: "object",
                fields: [{ name: "href", type: "url" }],
              },
            ],
          },
        },
        { type: "image" },
      ],
    },
  ],
});

const blockContentType = schema.get("blockContent");

export function convertHtmlToPortableText(html: string) {
  return htmlToBlocks(html, blockContentType, {
    parseHtml: (html) => new JSDOM(html).window.document,
    rules: [
      // Handle WordPress image blocks
      {
        deserialize(el, next, block) {
          if (el.tagName === "IMG") {
            return block({
              _type: "image",
              _sanity: { source: el.getAttribute("src") },
            });
          }
          return undefined;
        },
      },
      // Handle WordPress captions
      {
        deserialize(el, next, block) {
          if (
            el.tagName === "FIGURE" &&
            el.classList.contains("wp-block-image")
          ) {
            const img = el.querySelector("img");
            const caption = el.querySelector("figcaption");
            if (!img) return undefined;
            return block({
              _type: "image",
              _sanity: { source: img.getAttribute("src") },
              caption: caption?.textContent ?? undefined,
            });
          }
          return undefined;
        },
      },
    ],
  });
}
The _sanity.source field on images is a temporary marker. After conversion, you’ll replace these with proper Sanity asset references using the assetMap from Step 3.
Step 5: Run the Migration
Now bring it all together — map WordPress posts to Sanity documents, convert the HTML body, wire up images and categories.
// scripts/migrate-posts.ts
import { createClient } from "@sanity/client";
import { JSDOM } from "jsdom";
import { readFileSync } from "node:fs";
import { convertHtmlToPortableText } from "./html-to-portable-text.js";

const client = createClient({
  projectId: "your-project-id",
  dataset: "production", // use your cloned dataset name for trial runs
  apiVersion: "2024-01-01",
  token: process.env.SANITY_TOKEN,
  useCdn: false,
});

const posts = JSON.parse(readFileSync("export/posts.json", "utf-8"));
const media = JSON.parse(readFileSync("export/media.json", "utf-8"));
const assetMap = JSON.parse(readFileSync("export/asset-map.json", "utf-8"));
// Maps WordPress category IDs to Sanity document IDs; create it when you migrate categories
const categoryMap = JSON.parse(readFileSync("export/category-map.json", "utf-8"));

// Index media by URL so inline images can be matched to uploaded assets
const mediaIdByUrl = new Map<string, number>(
  media.map((m: any) => [m.source_url, m.id])
);

function findAssetByUrl(url: string, assets: Record<number, string>) {
  // Inline images often point at resized variants such as photo-300x200.jpg
  const original = url.replace(/-\d+x\d+(?=\.\w+$)/, "");
  const id = mediaIdByUrl.get(url) ?? mediaIdByUrl.get(original);
  return id === undefined ? undefined : assets[id];
}

// Strip tags and decode entities using JSDOM instead of fragile regex
function htmlToText(html: string) {
  return new JSDOM(html).window.document.body.textContent?.trim() ?? "";
}

function buildDocument(post: any) {
  // Convert HTML body to Portable Text
  let body = convertHtmlToPortableText(post.content.rendered);

  // Replace image source URLs with Sanity asset references
  body = body.map((block: any) => {
    if (block._type === "image" && block._sanity?.source) {
      const assetId = findAssetByUrl(block._sanity.source, assetMap);
      if (assetId) {
        return {
          _type: "image",
          _key: block._key,
          asset: { _type: "reference", _ref: assetId },
          // Keep the caption from Step 4 (needs a caption field on the image type)
          ...(block.caption ? { caption: block.caption } : {}),
        };
      }
      console.warn(`Post ${post.id}: no asset found for ${block._sanity.source}`);
    }
    return block;
  });

  // Build the Sanity document
  const doc: any = {
    // Deterministic ID so re-running the script replaces documents instead of duplicating them
    _id: `wp-post-${post.id}`,
    _type: "article",
    title: htmlToText(post.title.rendered),
    slug: { _type: "slug", current: post.slug },
    // date_gmt is UTC but has no timezone suffix
    publishedAt: `${post.date_gmt}Z`,
    excerpt: htmlToText(post.excerpt.rendered),
    body,
    wordpressId: post.id,
    categories: post.categories
      .map((id: number) => categoryMap[id])
      .filter(Boolean)
      .map((ref: string, i: number) => ({
        _type: "reference",
        _ref: ref,
        _key: `cat-${i}`,
      })),
  };

  // Add featured image with alt text if it exists
  if (post.featured_media && assetMap[post.featured_media]) {
    const mediaItem = media.find((m: any) => m.id === post.featured_media);
    doc.featuredImage = {
      _type: "image",
      asset: { _type: "reference", _ref: assetMap[post.featured_media] },
      alt: mediaItem?.alt_text || undefined,
    };
  }
  return doc;
}

// Batch writes to stay within Sanity's transaction size limit
const BATCH_SIZE = 100;
for (let i = 0; i < posts.length; i += BATCH_SIZE) {
  const batch = posts.slice(i, i + BATCH_SIZE);
  const transaction = client.transaction();
  for (const post of batch) {
    transaction.createOrReplace(buildDocument(post));
  }
  await transaction.commit();
  console.log(
    `Batch ${Math.floor(i / BATCH_SIZE) + 1} committed (${batch.length} posts)`
  );
}
console.log(`Migrated ${posts.length} posts`);
Step 6: Validate the Migration
After running the migration, verify the results before pointing your frontend at the new data.
// Count migrated articles
count(*[_type == "article" && defined(wordpressId)])

// Find articles missing body content
*[_type == "article" && (!defined(body) || length(body) == 0)]{
  _id, title, wordpressId
}

// Find articles missing featured images
*[_type == "article" && !defined(featuredImage)]{
  _id, title, wordpressId
}

// Check category references resolved
*[_type == "article" && length(categories) == 0]{
  _id, title, wordpressId
}
Run these queries in Sanity Vision against your staging dataset. Compare the counts against your WordPress export. Any discrepancies point to edge cases in your conversion scripts — malformed HTML, missing media, or unmapped categories.
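Counts show that something is missing, but not which posts. A small set difference on the IDs narrows it down. The sketch below assumes you have the exported post IDs and the `wordpressId` values of the migrated articles (for example from a GROQ query such as `*[_type == "article"].wordpressId`); `findMissingIds` and the sample numbers are illustrative.

```typescript
// Return the WordPress IDs present in the export but absent from Sanity.
function findMissingIds(exportedIds: number[], migratedIds: number[]): number[] {
  const migrated = new Set(migratedIds);
  return exportedIds.filter((id) => !migrated.has(id));
}

// Made-up IDs for illustration.
const exportedSample = [101, 102, 103, 104];
const migratedSample = [101, 103];
console.log(findMissingIds(exportedSample, migratedSample));
```

The IDs it returns are the posts to inspect in the export files for malformed HTML or unusual content.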
Querying Images on the Frontend
Sanity generates LQIP (Low Quality Image Placeholder) data for every uploaded image, but it’s not included automatically — you need to query it explicitly. Without it, blur-up placeholders won’t work and images will flash in without a smooth transition.
*[_type == "article"]{
  title,
  featuredImage {
    asset->{
      _id,
      url,
      metadata {
        lqip,
        dimensions { width, height }
      }
    },
    alt,
    hotspot,
    crop
  }
}
Common Pitfalls
Shortcodes. WordPress shortcodes ([gallery], [contact-form], plugin-specific ones) aren’t HTML — they’re placeholders that get rendered server-side. htmlToBlocks won’t know what to do with them. You need custom rules to either convert them to Portable Text custom blocks or strip them.
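Before writing custom rules, it helps to know which shortcodes actually survive in the exported HTML. The following is a rough sketch that counts opening shortcode tags with a regular expression. It is a heuristic, since ordinary bracketed text such as "[sic]" would also match, and `countShortcodes` is an illustrative name.

```typescript
// Count opening shortcode tags like [gallery ids="1,2"]; closing tags like [/caption] are skipped.
function countShortcodes(html: string): Record<string, number> {
  const pattern = /\[(?!\/)([a-zA-Z][\w-]*)(?:\s[^\]]*)?\]/g;
  const counts: Record<string, number> = {};
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const name = match[1] ?? "";
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}

// Made-up sample content for illustration.
const shortcodeSample =
  '<p>[gallery ids="1,2"]</p>[caption id="a"]Hi[/caption]<p>[contact-form]</p>[gallery]';
console.log(countShortcodes(shortcodeSample));
```

Running it across every exported post gives a list of shortcode names to handle, ordered by how often each appears.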
Gutenberg blocks. Gutenberg stores content as HTML comments (<!-- wp:paragraph -->) wrapping standard HTML. The HTML conversion handles the inner content, but Gutenberg-specific blocks (columns, cover images, reusable blocks) need custom deserializers.
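To see which Gutenberg blocks a site uses, one option is to count the opening block comments. Note the assumption: these comments appear in the raw post content (for example `content.raw` from an authenticated request with `context=edit`), not in `content.rendered`. `countGutenbergBlocks` is a hypothetical helper.

```typescript
// Count opening block comments such as <!-- wp:paragraph --> or <!-- wp:acme/hero {...} -->.
function countGutenbergBlocks(rawContent: string): Record<string, number> {
  const pattern = /<!--\s+wp:([a-z][a-z0-9_-]*(?:\/[a-z][a-z0-9_-]*)?)/g;
  const counts: Record<string, number> = {};
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(rawContent)) !== null) {
    const name = match[1] ?? "";
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}

// Made-up sample content for illustration.
const gutenbergSample = [
  "<!-- wp:paragraph --><p>Hi</p><!-- /wp:paragraph -->",
  "<!-- wp:columns --><!-- wp:column --><!-- /wp:column --><!-- /wp:columns -->",
  '<!-- wp:block {"ref":12} /-->',
  "<!-- wp:paragraph --><p>Bye</p><!-- /wp:paragraph -->",
].join("\n");
console.log(countGutenbergBlocks(gutenbergSample));
```

Block names that show up beyond paragraphs, headings, lists, and images are the ones likely to need a custom deserializer.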
Character encoding. WordPress content often contains HTML entities (&amp;, &rsquo;, &ldquo;, or numeric forms such as &#8217;). Make sure your conversion pipeline decodes these properly; otherwise they can slip through and appear as literal text in Sanity.
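A DOM parser's textContent decodes entities for you. For plain strings where that feels heavy, a small decoder is one option. This sketch handles numeric entities and a short, deliberately incomplete table of named ones. `decodeEntities` and `NAMED_ENTITIES` are illustrative names, and unknown entities are left untouched.

```typescript
// A deliberately small table; extend it with the named entities your content uses.
const NAMED_ENTITIES = new Map<string, string>([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'], ["apos", "'"],
  ["nbsp", "\u00a0"], ["lsquo", "\u2018"], ["rsquo", "\u2019"],
  ["ldquo", "\u201c"], ["rdquo", "\u201d"], ["hellip", "\u2026"],
  ["ndash", "\u2013"], ["mdash", "\u2014"],
]);

// Decode numeric (&#8217; or &#x2019;) and known named entities in a single pass.
function decodeEntities(text: string): string {
  return text.replace(/&(#x[0-9a-fA-F]+|#\d+|[a-zA-Z]+);/g, (match: string, body: string) => {
    if (body.startsWith("#")) {
      const code = body.startsWith("#x")
        ? parseInt(body.slice(2), 16)
        : parseInt(body.slice(1), 10);
      // Leave out-of-range code points as they are.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    return NAMED_ENTITIES.get(body) ?? match;
  });
}

const entitySample = "Tom &amp; Jerry&#8217;s &ldquo;show&rdquo; &#x2014; &unknown;";
console.log(decodeEntities(entitySample));
```

Because the replacement runs in one pass, a double-encoded value such as `&amp;lt;` becomes `&lt;` rather than `<`, which keeps the decoder from over-decoding.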
Internal links. WordPress posts link to each other with full URLs (https://your-site.com/2024/my-post). After migration, these point to your old WordPress site. You’ll need a post-migration step to rewrite internal links to either relative paths or Sanity document references.
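The relative-path option can be sketched as a pure function applied to each link's href. The sketch assumes links live in Portable Text `markDefs` with `_type: "link"`, as in the converter schema from Step 4. `toRelativeIfInternal` and `rewriteLinkMarkDefs` are hypothetical names, and the host comparison treats `www.` and bare domains as different sites.

```typescript
// Turn absolute links to the old site into relative paths; leave everything else alone.
function toRelativeIfInternal(href: string, siteOrigin: string): string {
  // Skip relative links, anchors, mailto:, and other non-HTTP schemes.
  if (!/^https?:\/\//i.test(href)) return href;
  let url: URL;
  try {
    url = new URL(href);
  } catch {
    return href;
  }
  if (url.host !== new URL(siteOrigin).host) return href;
  return url.pathname + url.search + url.hash;
}

interface MarkDef {
  _key: string;
  _type: string;
  href?: string;
}

// Apply the rewrite to every link annotation in a block's markDefs.
function rewriteLinkMarkDefs(markDefs: MarkDef[], siteOrigin: string): MarkDef[] {
  return markDefs.map((def) =>
    def._type === "link" && def.href
      ? { ...def, href: toRelativeIfInternal(def.href, siteOrigin) }
      : def
  );
}

console.log(
  toRelativeIfInternal("https://your-site.com/2024/my-post/?x=1#top", "https://your-site.com")
);
```

If the link annotation uses Sanity's `url` type, relative values may need `allowRelative: true` in its validation rule before the Studio accepts them.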
Migration Checklist
- Audit WordPress content (post types, custom fields, media, taxonomies)
- Design Sanity schemas — don’t replicate WordPress structure
- Export content via REST API
- Upload media to Sanity and save the asset map
- Migrate categories/taxonomies first (posts reference them)
- Convert HTML to Portable Text with custom rules for your content
- Run migration against a cloned dataset
- Validate with GROQ queries — compare counts, check for missing data
- Handle edge cases (shortcodes, Gutenberg blocks, internal links)
- Run against production
- Set up redirects from old WordPress URLs to new paths
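For the redirects item, the exported posts already contain what is needed: each REST record has a `link` (the old permalink) and a `slug`. A minimal sketch follows, assuming the new site serves articles under a single prefix; `buildRedirects` and the `/articles` prefix are illustrative.

```typescript
// Subset of the fields WordPress returns for each post.
interface WpPostLink {
  link: string; // absolute permalink on the old site
  slug: string;
}

// Map each old permalink path to its new location, skipping paths that do not change.
function buildRedirects(
  posts: WpPostLink[],
  newPrefix: string
): Array<{ from: string; to: string }> {
  return posts
    .map((post) => ({
      from: new URL(post.link).pathname,
      to: `${newPrefix}/${post.slug}`,
    }))
    .filter((redirect) => redirect.from !== redirect.to);
}

// Made-up sample data for illustration.
const redirectSample: WpPostLink[] = [
  { link: "https://your-site.com/2024/my-post/", slug: "my-post" },
  { link: "https://your-site.com/articles/kept", slug: "kept" },
];
console.log(buildRedirects(redirectSample, "/articles"));
```

The resulting list can be translated into whatever redirect format your hosting platform or framework expects.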