Solution to missing content after importing WordPress to Squarespace

If you have an old WordPress blog that you want to import into Squarespace, you’ll come across the import directions, which note that you can’t import content from custom fields. In a nutshell, the solution is to write a script that updates every post in WordPress’s export XML file so that its custom field content sits inside its primary content tag.

I recently had to do this and chose to write the following script in PHP. First, export an XML file with just a few posts, both for quick testing and as a reference for finding the name (the wp:meta_key value) of the custom field you want to target.
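
To find that meta key without scrolling through the XML by hand, a short script along these lines can list every key in the test export. This is only a sketch: the function name list_meta_keys is made up for illustration, and it assumes the same filename and the same WordPress namespace URI used in the main script.

```php
<?php
// Hypothetical helper (not part of the main script): counts how often
// each <wp:meta_key> value appears in a WordPress export.
function list_meta_keys(string $xml, string $wp_ns): array
{
    $dom = new DOMDocument;
    $dom->loadXML($xml);

    $counts = [];
    foreach ($dom->getElementsByTagNameNS($wp_ns, 'meta_key') as $key) {
        $name = trim($key->nodeValue);
        $counts[$name] = ($counts[$name] ?? 0) + 1;
    }
    arsort($counts); // most frequent keys first
    return $counts;
}

// Usage, assuming the small test export is named my_export_file_name.xml
$file = 'my_export_file_name.xml';
if (is_file($file)) {
    $keys = list_meta_keys(file_get_contents($file), 'http://wordpress.org/export/1.2/');
    foreach ($keys as $name => $count) {
        echo $name . ': ' . $count . "\n";
    }
}
```

Keys that start with an underscore are usually internal ones set by WordPress or plugins, so the field you want may be one of those (mine was).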

<?php
// ACTION: add your original XML filename here (without the .xml extension)
$xml_file_name = 'my_export_file_name';

// this raises PHP's execution time limit to one hour so large exports don't time out
set_time_limit(60*60);

// ACTION: enter the namespace URIs declared as xmlns:content and xmlns:wp at the top of the XML file
$content_prefix = 'http://purl.org/rss/1.0/modules/content/';
$wp_prefix = 'http://wordpress.org/export/1.2/';

$dom = new DOMDocument;
$dom->loadXML(file_get_contents($xml_file_name.'.xml'));

foreach ($dom->getElementsByTagName('item') as $node) {
    // this iterates through each custom field within each post
    foreach($node->getElementsByTagNameNS($wp_prefix, 'postmeta') as $meta)  {
        // ACTION: set this to the meta_key of the custom field you want to pull into the main content
        $custom_field_key = '_videoembed_manual';
        /* this checks whether the meta_key of the custom field
        in the current iteration matches your target meta_key */
        if ($meta->getElementsByTagNameNS($wp_prefix, 'meta_key')->item(0)->nodeValue == $custom_field_key) {
            // this saves the main content in the post
            $content_orig = $node->getElementsByTagNameNS($content_prefix, 'encoded')->item(0)->nodeValue;
            // this saves the custom field content
            $custom_field_content = $meta->getElementsByTagNameNS($wp_prefix, 'meta_value')->item(0)->nodeValue;
            // this combines both into a new variable, with the custom field content first
            $content_new = $custom_field_content . $content_orig;
            // appendChild() below needs a DOMNode, not a string, so wrap the combined
            // content in a CDATA section (WordPress exports post content as CDATA)
            $content_new = $dom->createCDATASection($content_new);
            // this strips out the content from the main content tag
            $node->getElementsByTagNameNS($content_prefix, 'encoded')->item(0)->nodeValue = '';
            // this adds in the new, combined content to the main content tag
            $node->getElementsByTagNameNS($content_prefix, 'encoded')->item(0)->appendChild($content_new);
            break;
        }
    }
}

// this saves the new XML into a variable
$string = $dom->saveXML();
// and creates a file from that variable with "-new" appended to the filename
file_put_contents($xml_file_name.'-new.xml', $string);
?>

This is the first time I’ve done anything like this, so it might not be the most efficient approach, but it worked for me. If you know a more efficient way to do this, please let me know!
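
Before uploading the new file, it may be worth a quick sanity check that the merge actually happened. The sketch below is a suggestion rather than part of the script above: find_unmerged_posts is a made-up name, and it assumes the same namespace URIs, the same target meta key, and the "-new" filename that the script produces. It returns the titles of posts whose main content does not start with the custom field value.

```php
<?php
// Hypothetical check: lists posts where the target custom field exists
// but its value is not at the start of <content:encoded>.
function find_unmerged_posts(string $xml, string $meta_key, string $content_ns, string $wp_ns): array
{
    $dom = new DOMDocument;
    $dom->loadXML($xml);

    $unmerged = [];
    foreach ($dom->getElementsByTagName('item') as $item) {
        $encoded = $item->getElementsByTagNameNS($content_ns, 'encoded')->item(0);
        if ($encoded === null) {
            continue; // no main content tag to compare against
        }
        foreach ($item->getElementsByTagNameNS($wp_ns, 'postmeta') as $meta) {
            $key = $meta->getElementsByTagNameNS($wp_ns, 'meta_key')->item(0);
            $value = $meta->getElementsByTagNameNS($wp_ns, 'meta_value')->item(0);
            if ($key === null || $value === null || $key->nodeValue !== $meta_key) {
                continue; // not the custom field we care about
            }
            // an empty custom field has nothing to merge, so skip it
            if ($value->nodeValue !== '' && strpos($encoded->nodeValue, $value->nodeValue) !== 0) {
                $title = $item->getElementsByTagName('title')->item(0);
                $unmerged[] = $title ? $title->nodeValue : '(untitled)';
            }
            break; // only the first matching field is merged by the script
        }
    }
    return $unmerged;
}

// Usage, assuming the script above wrote my_export_file_name-new.xml
$file = 'my_export_file_name-new.xml';
if (is_file($file)) {
    $missed = find_unmerged_posts(
        file_get_contents($file),
        '_videoembed_manual',
        'http://purl.org/rss/1.0/modules/content/',
        'http://wordpress.org/export/1.2/'
    );
    echo count($missed) . " post(s) still need merging\n";
}
```

An empty result means every post that has the custom field now starts with its content.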