My previous article gave some examples of using Drupal’s Migrate API to modify content when restructuring a site. The last couple of examples in that article imported content into newly added fields. This article, the second part of the two-part series, will look further at importing content, with examples that create paragraphs, nodes, URL aliases, and URL redirects.
These examples are based on work I did for one of our clients as part of a "website refresh". They wanted to revamp their website’s About Us section, replacing their old pages with a redesigned, reorganized, rewritten set of pages. I needed to update the site’s structure and theme to support the new pages. Then I needed to create the pages themselves, using content provided by the client.
I didn’t only need to create the pages on the production site. First, I wanted to create the pages on my local dev site so that I could incrementally test my configuration and theme changes. Then, I wanted to deploy those changes and the new pages to a staging site so that the client could review them before they went live. Finally, I would deploy all of that to the production site. It took some up-front work to write the migrations, but it paid off once I was able to import content just by running commands.
Since I already covered the modules to install and the commands to run in the previous article, I’ll jump right into the code examples.
Creating nodes with paragraphs
The content type for the new pages was called Landing Page. It had a Components field in which a content editor could add an unlimited number of paragraphs of various types.
In a Drupal migration, you specify a destination, which is the type of thing that you’re going to be creating or modifying. For nodes, the destination plugin is `entity:node`. For paragraphs, the destination plugin is `entity_reference_revisions:paragraph`. When you migrate nodes that have a paragraphs reference field, you do it in two stages: first migrate the paragraphs, then migrate the nodes.
For our client’s "About Us" section, there were 16 paragraph types, some of them nested. I wrote a separate migration for each paragraph type. Here, I’ll greatly simplify things by pretending there’s just one paragraph type, called Formatted Text, that just has a Text (formatted, long) field called Body. Here’s a migration that creates Formatted Text paragraphs:
    id: about_formatted_text_paragraphs
    label: 'Create Formatted Text paragraphs for Landing Page nodes in the About Us section'
    source:
      plugin: embedded_data
      data_rows:
        - id: about_intro
          body: |-
            <p>
              I'm your only friend. I'm not your only friend. But I'm a little glowing friend. But really I'm not actually your friend.
            </p>
        - id: about_mission
          body: |-
            <p>
              I have a secret to tell from my electrical well. It's a simple message and I'm leaving out the whistles and bells.
            </p>
            <p>
              So the room must listen to me filibuster vigilantly. My name is blue canary. One note, spelled L-I-T-E.
            </p>
        - id: about_history
          body: |-
            <p>
              There's a picture opposite me of my primitive ancestry, which stood on rocky shores and kept the beaches shipwreck-free. Though I respect that a lot, I'd be fired if that were my job after killing Jason off and countless screaming Argonauts.
            </p>
      ids:
        id:
          type: string
    process:
      field_body/value: body
      field_body/format:
        plugin: default_value
        default_value: basic_html
    destination:
      plugin: 'entity_reference_revisions:paragraph'
      default_bundle: formatted_text
- Although I could have used Migrate Source CSV to import the data from a CSV file (as I showed in the previous article), given how the content was provided to me, it was going to be about the same effort to use `embedded_data`, so I just went with that.
- In the source data, the `body` properties use the symbol `|-`. This is a combination of two operators that tell YAML how to interpret multiline strings. `|` says to preserve newlines. `-` makes an exception for newlines at the end of the string (they should be discarded).
- The process for `field_body/value` copies the `body` value from the source data into the Body field of the Formatted Text paragraph.
- The process for `field_body/format` sets the Body field’s text format to `basic_html`.
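To make the `|-` symbol concrete, here is a small comparison of two YAML block scalar styles. The key names are hypothetical and exist only for this illustration; the values in the comments are what a YAML parser should produce.

```yaml
# Hypothetical keys, used only to compare YAML block scalar styles.

# "|" alone keeps the line breaks and a single trailing newline.
# Parsed value: "<p>\n  Hello\n</p>\n"
with_trailing_newline: |
  <p>
    Hello
  </p>

# "|-" keeps the interior line breaks but strips the trailing newline.
# Parsed value: "<p>\n  Hello\n</p>"
without_trailing_newline: |-
  <p>
    Hello
  </p>
```

For body text that ends up in a formatted text field, stripping the trailing newline mostly just keeps the stored value tidy.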
When you run `drush migrate:import about_formatted_text_paragraphs`, that creates the paragraphs, but they’re not yet attached to any nodes. To check if the paragraphs were imported correctly, you could list the paragraphs in Views or inspect the database (specifically, the `migrate_map_about_formatted_text_paragraphs`, `paragraphs_item`, and `paragraph__field_body` tables).
Now let’s move on to the next stage: creating the nodes that reference the paragraphs. Here’s an extremely simplified version of the migration I used for the Landing Page nodes:
    id: about_landing_page_nodes
    label: 'Create Landing Page nodes in the About Us section'
    source:
      plugin: embedded_data
      data_rows:
        - id: about
          title: 'About Us'
          components:
            - paragraph: about_intro
            - paragraph: about_mission
            - paragraph: about_history
      ids:
        id:
          type: string
    process:
      title: title
      tmp_components_lookup:
        plugin: sub_process
        source: components
        process:
          tmp_paragraph:
            plugin: migration_lookup
            migration:
              - about_formatted_text_paragraphs
            source: paragraph
            no_stub: true
      field_landing_page_components:
        plugin: sub_process
        source: '@tmp_components_lookup'
        process:
          target_id: tmp_paragraph/0
          target_revision_id: tmp_paragraph/1
      uid:
        plugin: default_value
        default_value: 1
    destination:
      plugin: 'entity:node'
      default_bundle: landing_page
    migration_dependencies:
      required:
        - about_formatted_text_paragraphs
- This migration just creates one Landing Page node, which references the three Formatted Text paragraphs created by the previous migration.
- In the source plugin’s `data_rows`, the `paragraph` values correspond to `id`s in the previous migration’s source data.
- The processes for `tmp_components_lookup` (a pseudofield) and `field_landing_page_components` work together to populate the Landing Page node’s Components field.
  - The process for `tmp_components_lookup` uses the `sub_process` plugin to loop through the `paragraph` items listed under `components` in the source data. For each item, `migration_lookup` looks up the paragraph whose `id` in the previous migration matches the `paragraph` in this migration.
    - The output of the `migration_lookup` plugin here is not a single ID, like it would be if you were looking up a node. Because a paragraph is an `entity_reference_revisions` instead of an `entity`, the output here is an array that has the paragraph ID at key 0 and the paragraph revision ID at key 1.
    - I store that output in a sub-property called `tmp_paragraph` so that I’ll be able to access it in the process for `field_landing_page_components`.
    - So the output of the `tmp_components_lookup` process is an array, in which each value is an array that maps the key `tmp_paragraph` to an array with 2 values (the paragraph ID and paragraph revision ID).
  - The process for `field_landing_page_components` loops through the array created by the `tmp_components_lookup` property. For each item, it extracts the paragraph ID and paragraph revision ID and stores them in the corresponding properties of `field_landing_page_components`.
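The intermediate values can be easier to follow when written out. The following is an illustration of the data shapes described above, not a migration file. The paragraph IDs (101 to 103) and revision IDs (201 to 203) are made up; the real values depend on what the paragraphs migration created.

```yaml
# Illustration only: all numeric IDs below are hypothetical.

# Value of the tmp_components_lookup pseudofield after its process runs.
# Each item maps tmp_paragraph to [paragraph ID, paragraph revision ID].
tmp_components_lookup:
  - tmp_paragraph: [101, 201]   # looked up from about_intro
  - tmp_paragraph: [102, 202]   # looked up from about_mission
  - tmp_paragraph: [103, 203]   # looked up from about_history

# Value assigned to the Components field after the second sub_process.
field_landing_page_components:
  - target_id: 101
    target_revision_id: 201
  - target_id: 102
    target_revision_id: 202
  - target_id: 103
    target_revision_id: 203
```

Splitting the work across two `sub_process` steps keeps the lookup separate from the step that pulls the two IDs out of its result by key.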
Creating URL aliases
For each of the new Landing Page nodes, I needed to assign a URL alias. Here’s an example that generates a `path_alias` entity for the node created by the previous migration:
    id: about_landing_page_aliases
    label: 'Create URL aliases for Landing Pages in the About section'
    source:
      plugin: embedded_data
      data_rows:
        - id: about
          alias: /about
      ids:
        id:
          type: string
      constants:
        path_prefix: /node/
    process:
      landing_page_nid:
        plugin: migration_lookup
        migration: about_landing_page_nodes
        source: id
        no_stub: true
      path:
        plugin: concat
        source:
          - constants/path_prefix
          - '@landing_page_nid'
      alias: alias
    destination:
      plugin: 'entity:path_alias'
    migration_dependencies:
      required:
        - about_landing_page_nodes
- In the source plugin’s `data_rows`, the `id` matches the one in the previous migration’s source data.
- The process for `landing_page_nid` (a pseudofield) locates the node that the previous migration created for the source data with that `id`. The output of the process is the node ID.
- The process for `path` constructs a path from the node ID, for example `/node/99` from ID 99, and assigns that to the URL alias’s `path` field.
- The process for `alias` sets that field in the URL alias to the `alias` specified in the source data.
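As a worked example of the steps above, here are the values the process section would compute for the single source row, assuming the lookup finds node ID 99. That ID is made up; the real one depends on the node created by the previous migration.

```yaml
# Illustration only: 99 is a hypothetical node ID.
landing_page_nid: 99   # result of migration_lookup for source id "about"
path: /node/99         # constants/path_prefix + landing_page_nid
alias: /about          # copied from the source row
```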
Creating URL redirects
Once the new Landing Page nodes were created, I needed to delete the old pages and redirect their former paths to the new nodes. Here’s a migration that creates URL redirects (using the Redirect module):
    id: about_redirects
    label: 'Create URL redirects for the About section'
    source:
      plugin: embedded_data
      data_rows:
        - from: node/22
          to: /about
        - from: constantinople
          to: /istanbul
        - from: new/york
          to: /new/amsterdam
      ids:
        from:
          type: string
      constants:
        uri_prefix: 'internal:'
        status_code: 301
    process:
      redirect_source: from
      redirect_redirect:
        plugin: concat
        source:
          - constants/uri_prefix
          - to
      status_code: constants/status_code
    destination:
      plugin: 'entity:redirect'
- This migration doesn’t depend on any previous migrations.
- The migration sets the following fields on each redirect that it creates:
  - `redirect_source`, whose field type is defined in the Redirect module, but which we can treat as Text (plain) for the purpose of this migration
  - `redirect_redirect`, a Link field
  - `status_code`, a Number field
- The process for `redirect_source` grabs the `from` value from the source data. This field wants the path without the initial slash.
- The process for `redirect_redirect` constructs a URI from the `to` value in the source data. (As an alternative to using the `concat` plugin, in the source data I could have added the `internal:` prefix to each `to` value, e.g. `to: internal:/about`.)
- The process for `status_code` sets the value to 301.
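The alternative mentioned above, putting the `internal:` prefix directly in the source data, might look like the following sketch. It is untested, the migration ID `about_redirects_prefixed` is hypothetical, and only one row is shown.

```yaml
# Sketch of the prefix-in-source variant; the ID and label are hypothetical.
id: about_redirects_prefixed
label: 'Create URL redirects with the prefix in the source data'
source:
  plugin: embedded_data
  data_rows:
    - from: node/22
      # Quoted so the colon can't be misread as a YAML mapping.
      to: 'internal:/about'
  ids:
    from:
      type: string
  constants:
    status_code: 301
process:
  redirect_source: from
  # No concat needed: the source value is already a complete URI.
  redirect_redirect: to
  status_code: constants/status_code
destination:
  plugin: 'entity:redirect'
```

The trade-off is a shorter process section in exchange for repeating the prefix on every row.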
To be clear, this migration doesn’t delete the old pages. When I need to delete or unpublish pages after adding redirects, I use the node bulk form, Views Bulk Operations, or a Drush script.
Final thoughts
Are migrations the best way to deploy new content? Not necessarily. It depends on the situation. Other options include Workspaces, Feeds, and Node Export.
One thing I appreciate about Migrate API is that it’s flexible enough to handle a lot of different situations. Once you get the hang of it, you can use the same general procedure to create or update any kind of entity, including ones that reference other entities.
I also like that, when things go wrong with migrations, they’re usually fairly easy to debug. With Migrate Devel installed, you can run `drush migrate:import` with the `--migrate-debug` flag to print the source and destination property values. If a process plugin isn’t behaving as expected, you can find it in the documentation and follow the link to its source code to see what it’s actually doing.
In these examples I have not touched on the PHP-language portion of the Migrate API at all. I actually haven’t had to write any custom migration plugins so far. Using the comprehensive set of plugins already provided by core and contrib modules, I’ve been able to do everything I need just in YAML.
I had to stretch my brain a bit to get used to thinking in terms of the YAML API instead of PHP. It’s a little weird, but kind of fun, to figure out how to write what are essentially small programs in a format that was designed for data serialization. Nevertheless, it’s an expressive API in which you can accomplish a lot with just a little code. As much as I enjoy writing code, the less custom code there is on the site, the fewer the bugs and the lower the long-term maintenance burden.
Jaymie Strecker has been a software developer at Kosada for 13 years. There’s only one thing that they know how to do well, and they’ve often been told that you only can do what you know how to do well, and that’s be them.