Using Drupal migrations to deploy new content

My previous article gave some examples of using Drupal’s Migrate API to modify content when restructuring a site. The last couple of examples in that article imported content into newly added fields. This article, the second part of the two-part series, will look further at importing content, with examples that create paragraphs, nodes, URL aliases, and URL redirects.

These examples are based on work I did for one of our clients as part of a "website refresh". They wanted to revamp their website’s About Us section, replacing their old pages with a redesigned, reorganized, rewritten set of pages. I needed to update the site’s structure and theme to support the new pages. Then I needed to create the pages themselves, using content provided by the client.

I didn’t only need to create the pages on the production site. First, I wanted to create the pages on my local dev site so that I could incrementally test my configuration and theme changes. Then, I wanted to deploy those changes and the new pages to a staging site so that the client could review them before they went live. Finally, I would deploy all of that to the production site. It took some up-front work to write the migrations, but it paid off once I was able to import content just by running commands.

Since I already covered the modules to install and the commands to run in the previous article, I’ll jump right into the code examples.

Creating nodes with paragraphs

The content type for the new pages was called Landing Page. It had a Components field in which a content editor could add an unlimited number of paragraphs of various types.

In a Drupal migration, you specify a destination, which is the type of thing that you’re going to be creating or modifying. For nodes, the destination plugin is entity:node. For paragraphs, the destination plugin is entity_reference_revisions:paragraph. When you migrate nodes that have a paragraphs reference field, you do it in two stages: first migrate the paragraphs, then migrate the nodes.

For our client’s "About Us" section, there were 16 paragraph types, some of them nested. I wrote a separate migration for each paragraph type. Here, I’ll greatly simplify things by pretending there’s just one paragraph type, called Formatted Text, that just has a Text (formatted, long) field called Body. Here’s a migration that creates Formatted Text paragraphs:

id: about_formatted_text_paragraphs
label: 'Create Formatted Text paragraphs for Landing Page nodes in the About Us section'
source:
  plugin: embedded_data
  data_rows:
    -
      id: about_intro
      body: |-
        <p>
        I'm your only friend. I'm not your only friend. But I'm a little glowing friend. But really I'm not actually your friend.
        </p>
    -
      id: about_mission
      body: |-
        <p>
        I have a secret to tell from my electrical well. It's a simple message and I'm leaving out the whistles and bells.
        </p>
        <p>
        So the room must listen to me filibuster vigilantly. My name is blue canary. One note, spelled L-I-T-E.
        </p>
    -
      id: about_history
      body: |-
        <p>
        There's a picture opposite me of my primitive ancestry, which stood on rocky shores and kept the beaches shipwreck-free. Though I respect that a lot, I'd be fired if that were my job after killing Jason off and countless screaming Argonauts.
        </p>
  ids:
    id:
      type: string
process:
  field_body/value: body
  field_body/format:
    plugin: default_value
    default_value: basic_html
destination:
  plugin: 'entity_reference_revisions:paragraph'
  default_bundle: formatted_text

  • Although I could have used Migrate Source CSV to import the data from a CSV file (as I showed in the previous article), given how the content was provided to me, it was going to be about the same effort to use embedded_data, so I just went with that.
  • In the source data, the body properties use the symbol |-. This is a combination of two indicators that tell YAML how to interpret multiline strings. | (literal block style) says to preserve newlines. - (the strip chomping indicator) makes an exception for newlines at the end of the string, which are discarded.
  • The process for field_body/value copies the body value from the source data into the Body field of the Formatted Text paragraph.
  • The process for field_body/format sets the Body field’s text format to basic_html.
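
To make the difference between | and |- concrete, here is a small YAML sketch comparing the two. The keys with_trailing_newline and without_trailing_newline are made up for illustration and are not part of the migration.

```yaml
# Hypothetical keys, for illustration only; not part of the migration.

# "|" alone: newlines inside the text are preserved, and a single
# trailing newline is kept.
with_trailing_newline: |
  <p>
  Hello
  </p>
# Parsed value: "<p>\nHello\n</p>\n"

# "|-": newlines inside the text are preserved, but newlines at the
# end of the string are discarded.
without_trailing_newline: |-
  <p>
  Hello
  </p>
# Parsed value: "<p>\nHello\n</p>"
```

For a formatted text field, the stripped form avoids storing a stray newline at the end of each Body value.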

When you run drush migrate:import about_formatted_text_paragraphs, that creates the paragraphs, but they’re not yet attached to any nodes. To check if the paragraphs were imported correctly, you could list the paragraphs in Views or inspect the database (specifically, the migrate_map_about_formatted_text_paragraphs, paragraphs_item, and paragraph__field_body tables).

Now let’s move on to the next stage: creating the nodes that reference the paragraphs. Here’s an extremely simplified version of the migration I used for the Landing Page nodes:

id: about_landing_page_nodes
label: 'Create Landing Page nodes in the About Us section'
source:
  plugin: embedded_data
  data_rows:
    -
      id: about
      title: 'About Us'
      components:
        -
          paragraph: about_intro
        -
          paragraph: about_mission
        -
          paragraph: about_history
  ids:
    id:
      type: string
process:
  title: title
  tmp_components_lookup:
    plugin: sub_process
    source: components
    process:
      tmp_paragraph:
        plugin: migration_lookup
        migration:
          - about_formatted_text_paragraphs
        source: paragraph
        no_stub: true
  field_landing_page_components:
    plugin: sub_process
    source: '@tmp_components_lookup'
    process:
      target_id: tmp_paragraph/0
      target_revision_id: tmp_paragraph/1
  uid:
    plugin: default_value
    default_value: 1
destination:
  plugin: 'entity:node'
  default_bundle: landing_page
migration_dependencies:
  required:
    - about_formatted_text_paragraphs

  • This migration just creates one Landing Page node, which references the three Formatted Text paragraphs created by the previous migration.
  • In the source plugin’s data_rows, the paragraph values correspond to ids in the previous migration’s source data.
  • The processes for tmp_components_lookup (a pseudofield) and field_landing_page_components work together to populate the Landing Page node’s Components field.
    • The process for tmp_components_lookup uses the sub_process plugin to loop through the paragraph items listed under components in the source data. For each item, migration_lookup looks up the paragraph whose id in the previous migration matches the item's paragraph value in this migration.
      • The output of the migration_lookup plugin here is not a single ID, like it would be if you were looking up a node. Because paragraphs are created with the entity_reference_revisions destination plugin rather than the entity one, the output here is an array that has the paragraph ID at key 0 and the paragraph revision ID at key 1.
      • I store that output in a sub-property called tmp_paragraph so that I’ll be able to access it in the process for field_landing_page_components.
      • So the output of the tmp_components_lookup process is an array, in which each value is an array that maps the key tmp_paragraph to an array with 2 values (the paragraph ID and paragraph revision ID).
    • The process for field_landing_page_components loops through the array created by the tmp_components_lookup property. For each item, it extracts the paragraph ID and paragraph revision ID and stores them in the corresponding properties of field_landing_page_components.
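
To make the data flow concrete, here is a sketch of the intermediate and final values for the single About Us row, written out as YAML. The numeric IDs are invented for illustration; the real values depend on what the paragraph migration created on your site.

```yaml
# Illustration only: the numeric IDs below are made up.

# Output of the tmp_components_lookup process: one item per entry in
# the source row's components list. Each item holds the
# migration_lookup result, with the paragraph ID at key 0 and the
# paragraph revision ID at key 1.
tmp_components_lookup:
  - tmp_paragraph: [101, 201]   # looked up from about_intro
  - tmp_paragraph: [102, 202]   # looked up from about_mission
  - tmp_paragraph: [103, 203]   # looked up from about_history

# Output of the field_landing_page_components process: the same three
# items, reshaped into the two properties that the Components field
# stores for each referenced paragraph.
field_landing_page_components:
  - target_id: 101
    target_revision_id: 201
  - target_id: 102
    target_revision_id: 202
  - target_id: 103
    target_revision_id: 203
```

The order of the items follows the order of the components list in the source data, so that list is where you control the order of paragraphs on the page.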

Creating URL aliases

For each of the new Landing Page nodes, I needed to assign a URL alias. Here’s an example that generates a path_alias entity for the node created by the previous migration:

id: about_landing_page_aliases
label: 'Create URL aliases for Landing Pages in the About section'
source:
  plugin: embedded_data
  data_rows:
    -
      id: about
      alias: /about
  ids:
    id:
      type: string
  constants:
    path_prefix: /node/
process:
  landing_page_nid:
    plugin: migration_lookup
    migration: about_landing_page_nodes
    source: id
    no_stub: true
  path:
    plugin: concat
    source:
      - constants/path_prefix
      - '@landing_page_nid'
  alias: alias
destination:
  plugin: 'entity:path_alias'
migration_dependencies:
  required:
    - about_landing_page_nodes

  • In the source plugin’s data_rows, the id matches the one in the previous migration’s source data.
  • The process for landing_page_nid (a pseudofield) locates the node that the previous migration created for the source data with that id. The output of the process is the node ID.
  • The process for path constructs a path from the node ID, for example /node/99 from ID 99, and assigns that to the URL alias’s path field.
  • The process for alias sets that field in the URL alias to the alias specified in the source data.
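
As a quick trace of those three processes, here is what the values for the about row might look like, written out as YAML. It assumes the lookup returns node ID 99, as in the example above; the actual ID will differ from site to site.

```yaml
# Illustration only: assumes migration_lookup returned node ID 99.
landing_page_nid: 99   # pseudofield, used only to build the path
path: /node/99         # constants/path_prefix followed by landing_page_nid
alias: /about          # copied from the source row
```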

Creating URL redirects

Once the new Landing Page nodes were created, I needed to delete the old pages and redirect their former paths to the new nodes. Here’s a migration that creates URL redirects (using the Redirect module):

id: about_redirects
label: 'Create URL redirects for the About section'
source:
  plugin: embedded_data
  data_rows:
    -
      from: node/22
      to: /about
    -
      from: constantinople
      to: /istanbul
    -
      from: new/york
      to: /new/amsterdam
  ids:
    from:
      type: string
  constants:
    uri_prefix: 'internal:'
    status_code: 301
process:
  redirect_source: from
  redirect_redirect:
    plugin: concat
    source:
      - constants/uri_prefix
      - to
  status_code: constants/status_code
destination:
  plugin: 'entity:redirect'

  • This migration doesn’t depend on any previous migrations.
  • The migration sets the following fields on each redirect that it creates:
    • redirect_source, whose field type is defined by the Redirect module but which, for the purposes of this migration, we can treat as Text (plain)
    • redirect_redirect, a Link field
    • status_code, a Number field
  • The process for redirect_source grabs the from value from the source data. This field wants the path without the initial slash.
  • The process for redirect_redirect constructs a URI from the to value in the source data. (As an alternative to using the concat plugin, in the source data I could have added the internal: prefix to each to value, e.g. to: internal:/about.)
  • The process for status_code sets the value to 301.
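
For comparison, here is a sketch of the alternative mentioned above, in which the internal: prefix is written directly into the source data. It shows only the source and process sections, with two of the rows; the id, label, and destination would stay as in the migration above. The quotes around the to values are optional and are there only to make the embedded colon easier to read.

```yaml
# Sketch of the alternative: each "to" value already carries the
# internal: prefix, so the concat step and the uri_prefix constant
# are no longer needed.
source:
  plugin: embedded_data
  data_rows:
    -
      from: node/22
      to: 'internal:/about'
    -
      from: constantinople
      to: 'internal:/istanbul'
  ids:
    from:
      type: string
  constants:
    status_code: 301
process:
  redirect_source: from
  # The source value is copied as-is into the Link field.
  redirect_redirect: to
  status_code: constants/status_code
```

The trade-off is a simpler process section in exchange for repeating the prefix on every row.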

To be clear, this migration doesn’t delete the old pages. When I need to delete or unpublish pages after adding redirects, I use the node bulk form, Views Bulk Operations, or a Drush script.

Final thoughts

Are migrations the best way to deploy new content? Not necessarily. It depends on the situation. Other options include Workspaces, Feeds, and Node Export.

One thing I appreciate about Migrate API is that it’s flexible enough to handle a lot of different situations. Once you get the hang of it, you can use the same general procedure to create or update any kind of entity, including ones that reference other entities.

I also like that, when things go wrong with migrations, they’re usually fairly easy to debug. With Migrate Devel installed, you can run drush migrate:import with the --migrate-debug flag to print the source and destination property values. If a process plugin isn’t behaving as expected, you can find it in the documentation and follow the link to its source code to see what it’s actually doing.

In these examples I have not touched on the PHP-language portion of the Migrate API at all. I actually haven’t had to write any custom migration plugins so far. Using the comprehensive set of plugins already provided by core and contrib modules, I’ve been able to do everything I need just in YAML.

I had to stretch my brain a bit to get used to thinking in terms of the YAML API instead of PHP. It’s a little weird, but kind of fun, to figure out how to write what are essentially small programs in a format that was designed for data serialization. Nevertheless, it’s an expressive API in which you can accomplish a lot with just a little code. As much as I enjoy writing code, the less custom code there is on the site, the fewer the bugs and the lower the long-term maintenance burden.

Jaymie Strecker has been a software developer at Kosada for 13 years. There’s only one thing that they know how to do well, and they’ve often been told that you only can do what you know how to do well, and that’s be them.