BTMash

Blob of contradictions

Migrate more with less! Or how I learned to love and create dynamic migrations.

Thu, 02/16/2012 - 12:15 -- btmash

As many folk know (and as folk can see by my posts on the topic), I am a very big fan of Migrate. It takes a while to figure out what you want to do and how to do it, but the power is absolutely immense. Having complete control over everything from the source object down to the destination node/user/entity, while still working within a framework, has made this my favourite module. Over the next few weeks (months?), I will once again be writing about some of the things you can do with Migrate that I did not know about until recently (when I was asked to work on more migrations).

I've been working with a new client on migrating their old Drupal 6 website, which used Ubercart, over to Drupal 7 and Commerce. But as many folk are coming to realize, Commerce is very different from Ubercart. For one, products themselves are no longer types of nodes - they are now their own entities. And product displays have been separated from the actual products themselves (so you can have the same product appear in different ways on your site). It is a very powerful system but also more challenging. Thankfully, the Commerce Migrate module takes care of a large part of the migration. I would strongly recommend using it as your base migration module if it works (and in most cases, it should).

But there was more data I needed to bring into the new site (all the taxonomy vocabularies and the terms from those vocabularies). Normally, when you create a migration for Migrate, you have to create a migration class for every entity/bundle you want to migrate into your site. You can do the manipulations in other ways and avoid several classes (you can manipulate the entity before it gets saved, so you could push a dynamic vocabulary on it), but generally, the right methodology is to have multiple classes so you have a better way to track the migrations. Luckily, there was something very interesting I learned from the Commerce Migrate module that I didn't know about before: dynamic migrations. In essence, dynamic migrations from the Migrate module allow you to register multiple migrations that are all handled by a single DynamicMigration class. So you get the advantage of having one class of code while all the migrations still get tracked separately! I'll paste the code as I am using it on the client site.

First, the info file (named migration_terms.info):

    name = Super Term Migration
    description = Module to migrate all vocabularies and terms from Drupal 6 to Drupal 7
    package = Migrate
    core = 7.x
    dependencies[] = commerce_migrate
    dependencies[] = taxonomy

    files[] = migration_terms_taxonomy.inc

Since the site primarily involves working with migrations from commerce_migrate (and commerce_migrate actually does a lot of setup work that I don't want the client to have to redo), I made commerce_migrate a dependency of the module. You don't have to in your case, but you will otherwise need to hardcode where the data is going to be coming from for your website (or look at how I did it in my migration). xdeb also did a migration with a similar UI, and you can look at his code here; I might talk with him about patching in a modification of what I wrote. The files[] entry lets Drupal know where to find any classes this module implements (you could always put the class in your module file, but this is cleaner and uses less memory, since the class is only loaded when something requires it).

Now, the contents of my install file (named migration_terms.install)

    <?php
    /**
     * Implementation of hook_requirements().
     */
    function migration_terms_requirements($phase) {
      $requirements = array();
      $connection = commerce_migrate_ubercart_get_source_connection();
      $requirements['commerce_migrate_terms'] = array(
        'title' => t('Commerce Migrate'),
        'description' => t('Vocabulary table exists.'),
        'severity' => REQUIREMENT_INFO,
      );
      // Flag an error when the source database has no {vocabulary} table.
      if (!$connection->schema()->tableExists('vocabulary')) {
        $requirements['commerce_migrate_terms']['description'] = t('Vocabulary table does not exist.');
        $requirements['commerce_migrate_terms']['severity'] = REQUIREMENT_ERROR;
      }
      return $requirements;
    }

    /**
     * Implementation of hook_install().
     */
    function migration_terms_install() {
      $query = "SELECT vid, name, description FROM {vocabulary}";
      $connection = commerce_migrate_ubercart_get_source_connection();
      $result = $connection->query($query);
      foreach ($result as $record) {
        $vocabulary = new stdClass();
        $vocabulary->name = $record->name;
        // Build a machine name: non-alphanumerics become underscores, runs of
        // underscores are collapsed, and the result is lowercased.
        $vocabulary->machine_name = drupal_strtolower(preg_replace('/__+/', '_', preg_replace("/[^A-Za-z0-9_]/", "_", $record->name)));
        $vocabulary->description = $record->description;
        $vocabulary->hierarchy = 0;
        // Register one tracked migration per vocabulary, all handled by the
        // same DynamicMigration class.
        $vocabulary_migration = 'MigrationTaxonomyTermMigration_'. $vocabulary->machine_name;
        Migration::registerMigration('MigrationTaxonomyTermMigration', $vocabulary_migration, array('dst_vocabulary' => $vocabulary->machine_name, 'src_vid' => $record->vid));
        taxonomy_vocabulary_save($vocabulary);
      }
    }

    /**
     * Implementation of hook_uninstall().
     */
    function migration_terms_uninstall() {
      $vocabulary_migrations = MigrationTaxonomyTermMigration::getTaxonomyTermMigrations();
      foreach ($vocabulary_migrations as $migration) {
        Migration::deregisterMigration($migration);
      }
    }

Now, the code that I have to save my vocabulary names is not quite so important (it's a query to get the list of vocabularies from my old database, some sanitizing of the old vocabulary name, and then saving it using taxonomy_vocabulary_save()). We could have created a migration to actually do a proper import of the vocabularies, install is the devil, etc. The important part is Migration::registerMigration, which basically tells Migrate the following (and let's say this is for a vocabulary named 'foobar' which had a vocabulary id of 2 in the original database):

Track a new migration class named 'MigrationTaxonomyTermMigration_foobar' which is actually going to be handled by the class 'MigrationTaxonomyTermMigration'. The new vocabulary name is 'foobar' and the vocabulary id on the old database is '2'.

Similarly, when you uninstall this module, it needs to de-register any migrations that had been dynamically created by this module using Migration::deregisterMigration. In my scenario, I also added a requirements check to ensure the original database had a vocabulary table I could grab data from.
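
To make the name handling a little more concrete, here is a minimal sketch of what the sanitizing in the install file does to a vocabulary name, and the migration name that would come out of it. The two function names are made up for illustration, and strtolower() stands in for drupal_strtolower() so the snippet runs outside of Drupal:

```php
<?php
// Hypothetical helper (not part of the module): mirrors the clean-up done in
// hook_install(). strtolower() replaces drupal_strtolower() as an assumption
// so the sketch runs without Drupal.
function sketch_vocabulary_machine_name($name) {
  // Anything that is not a letter, digit or underscore becomes an underscore.
  $machine_name = preg_replace('/[^A-Za-z0-9_]/', '_', $name);
  // Runs of underscores are collapsed into a single one.
  $machine_name = preg_replace('/__+/', '_', $machine_name);
  return strtolower($machine_name);
}

// Hypothetical helper: the name the migration would be registered under.
function sketch_migration_name($vocabulary_name) {
  return 'MigrationTaxonomyTermMigration_' . sketch_vocabulary_machine_name($vocabulary_name);
}

echo sketch_vocabulary_machine_name('Product Categories'), "\n";
echo sketch_migration_name('Tags & Topics'), "\n";
```

One thing worth noticing is that a name ending in punctuation keeps a trailing underscore, since nothing here trims it.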

As for our module file, this is going to be really simple (migration_terms.module):

    <?php
    /**
     * Implementation of hook_migrate_api().
     */
    function migration_terms_migrate_api() {
      $api = array(
        'api' => 2,
      );
      return $api;
    }

Simple! Our module is now ready to define these classes we've talked about. Or wait... it's only going to be 1 class (migration_terms_taxonomy.inc).

    <?php
    /**
     * @file
     * Dynamic Taxonomy Term Migration
     * This is a dynamic migration, reused for every vocabulary.
     */

    class MigrationTaxonomyTermMigration extends DynamicMigration {
      ...
    }

As mentioned above, we're going to use the DynamicMigration class to handle any/all term migrations into our new site. But before we can define everything in our constructor, we need to provide 2 functions (this comes as a result of what we have in our install file). These functions are generateMachineName() and our custom function to get a list of the term migrations for the vocabularies in our Drupal 7 website, getTaxonomyTermMigrations(). generateMachineName() is the important one, as it confirms with Migrate that "yes, I do handle the migration for this class!" getTaxonomyTermMigrations() is used to de-register the migrations when the module gets uninstalled.

    /**
     * Construct the machine name (identifying the migration in "drush ms" and other places).
     */
    protected function generateMachineName($class_name = NULL) {
      // Matches the name passed to Migration::registerMigration() in hook_install().
      return 'MigrationTaxonomyTermMigration_' . $this->arguments['dst_vocabulary'];
    }

    /**
     * Return a list of all taxonomy term migrations.
     */
    public static function getTaxonomyTermMigrations() {
      $migrations = array();
      $results = db_query("SELECT machine_name FROM {taxonomy_vocabulary}");
      foreach ($results as $record) {
        // Same naming scheme as generateMachineName(), one per vocabulary.
        $migrations[] = 'MigrationTaxonomyTermMigration_' . $record->machine_name;
      }
      return $migrations;
    }
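
If it helps to picture why the machine name matters, here is a rough sketch of the idea in plain PHP. It is not Migrate's real API: SketchRegistry and SketchTermMigration are made-up stand-ins for the registration table and the migration class, assuming the name a migration generates for itself should equal the name it was registered under in the install file:

```php
<?php
// Made-up stand-in for the table Migrate keeps its registrations in.
class SketchRegistry {
  private $rows = array();

  // Remember which class and arguments handle a given machine name.
  public function register($class_name, $machine_name, array $arguments) {
    $this->rows[$machine_name] = array('class' => $class_name, 'arguments' => $arguments);
  }

  public function deregister($machine_name) {
    unset($this->rows[$machine_name]);
  }

  public function names() {
    return array_keys($this->rows);
  }

  // Instantiate every registration and report the ones whose generated name
  // differs from the name they were registered under.
  public function mismatches() {
    $bad = array();
    foreach ($this->rows as $machine_name => $row) {
      $class = $row['class'];
      $migration = new $class($row['arguments']);
      if ($migration->machineName() !== $machine_name) {
        $bad[] = $machine_name;
      }
    }
    return $bad;
  }
}

// Made-up stand-in for the single class that handles every vocabulary.
class SketchTermMigration {
  private $arguments;

  public function __construct(array $arguments) {
    $this->arguments = $arguments;
  }

  public function machineName() {
    return 'MigrationTaxonomyTermMigration_' . $this->arguments['dst_vocabulary'];
  }
}

$registry = new SketchRegistry();
foreach (array('foobar' => 2, 'tags' => 3) as $vocabulary => $vid) {
  $registry->register('SketchTermMigration', 'MigrationTaxonomyTermMigration_' . $vocabulary,
    array('dst_vocabulary' => $vocabulary, 'src_vid' => $vid));
}
print_r($registry->names());
print_r($registry->mismatches());
```

One class, two tracked entries, and removing an entry is just a lookup by name, which is all the uninstall hook needs.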

So now that those are out of the way, we can get into the meat of the migration, our constructor:

  1. public function __construct(array $arguments) {
  2.   $this->arguments = $arguments;
  3.   parent::__construct();
  4.   $this->description = t('Migrate Tag Terms for vocabulary %vocabulary from original vid %original_vid', array('%vocabulary' => $arguments['dst_vocabulary'], '%original_vid' => $arguments['src_vid']));
  5.   // Things will be set up against the vid.
  6.   // Create a map object for tracking the relationships between source rows
  7.   // and their resulting Drupal objects. Usually, you'll use the MigrateSQLMap
  8.   // class, which uses database tables for tracking. Moreover, we need to
  9.   // pass schema definitions for the primary keys of the source and
  10.   // destination - we need to be explicit for our source, but the destination
  11.   // class knows its schema already.
  12.   $this->map = new MigrateSQLMap($this->machineName,
  13.     array(
  14.       'tid' => array(
  15.         'type' => 'int',
  16.         'unsigned' => TRUE,
  17.         'not null' => TRUE,
  18.         'description' => 'D6 Unique Term ID',
  19.         'alias' => 'td',
  20.       )
  21.     ),
  22.     MigrateDestinationTerm::getKeySchema()
  23.   );
  24.
  25.   $connection = commerce_migrate_ubercart_get_source_connection();
  26.   $query = $connection->select('term_data', 'td')
  27.     ->fields('td', array('tid', 'vid', 'name', 'description', 'weight'))
  28.     ->condition('td.vid', $arguments['src_vid'], '=');
  29.   $query->join('term_hierarchy', 'th', 'td.tid = th.tid');
  30.   $query->addField('th', 'parent');
  31.   $query->orderBy('th.parent', 'ASC');
  32.
  33.   // Create a MigrateSource object, which manages retrieving the input data.
  34.   $this->source = new MigrateSourceSQL($query, array(), NULL, array('map_joinable' => FALSE));
  35.
  36.   // Set up our destination - term in this case.
  37.   $this->destination = new MigrateDestinationTerm($arguments['dst_vocabulary']);
  38.
  39.   // Assign mappings TO destination fields FROM source fields.
  40.   $this->addFieldMapping('name', 'name');
  41.   $this->addFieldMapping('description', 'description');
  42.   $this->addFieldMapping('format')->defaultValue('plain_text');
  43.   $this->addFieldMapping('weight', 'weight');
  44.   $this->addFieldMapping('parent', 'parent')->sourceMigration($this->getMachineName());
  45.
  46.   // Unmapped source fields
  47.   $this->addUnmigratedSources(array('vid'));
  48.
  49.   // Unmapped destination fields
  50.   $this->addUnmigratedDestinations(array('path', 'parent_name'));
  51. }

Wow, there is quite a bit of code in there. Again, the key lines in this piece of code are actually at the very top with the constructor and the arguments.

    public function __construct(array $arguments) {
      ...
    }

As you might have seen in the install file, we actually pass a few different things into the registration:

  1. Dynamic Migration class name
  2. Migration Tracking class name
  3. An array of arguments consisting of the vocabulary machine name on our new site ('dst_vocabulary') and the vocabulary id from our old site ('src_vid').

The rest of the code that I have in the class is pretty much from all the work xdeb did on his D6 -> D7 migration site. My modifications were to where the query ran (the source connection), passing in arguments to the query (the condition on td.vid), ensuring the mapping is not joinable (the map_joinable option; this is done because using commerce_migrate means the source db might not be on the same machine as the db for our new site), and setting up the destination term vocabulary. In total, 4 lines of changed code (these are lines 25, 28, 34, and 37, respectively).
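
For anyone wondering why the query orders by parent and why the parent mapping points back at the same migration, here is a simplified model in plain PHP. It is only a sketch under my own assumptions: sketch_import_terms() is a made-up function, the $map array plays the role of the map table (old tid => new tid), and new term ids are simply handed out starting at 100:

```php
<?php
// Made-up model of a term import; not how Migrate is implemented internally.
function sketch_import_terms(array $rows, $next_tid = 100) {
  // Same idea as orderBy('th.parent', 'ASC'): rows with low parent ids first.
  usort($rows, function ($a, $b) {
    return $a['parent'] - $b['parent'];
  });
  $map = array();
  $terms = array();
  foreach ($rows as $row) {
    $new_tid = $next_tid++;
    // Top-level terms (parent 0) get no parent; everything else looks up the
    // new id of its parent in the map built so far.
    $parent = NULL;
    if ($row['parent'] != 0 && isset($map[$row['parent']])) {
      $parent = $map[$row['parent']];
    }
    $terms[$new_tid] = array('name' => $row['name'], 'parent' => $parent);
    $map[$row['tid']] = $new_tid;
  }
  return $terms;
}

$terms = sketch_import_terms(array(
  array('tid' => 9, 'name' => 'Shirts', 'parent' => 5),
  array('tid' => 5, 'name' => 'Clothing', 'parent' => 0),
  array('tid' => 12, 'name' => 'Tees', 'parent' => 9),
));
print_r($terms);
```

Keep in mind that sorting by parent id is only a rough way of getting parents in before their children; a parent with a higher tid than its grandchild's parent would still slip through in this model.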

Finally, we have a prepareRow() function to account for Drupal's nitpickiness about the parent term id:

    public function prepareRow($current_row) {
      // If there is no parent term id for the term, unset the parent value.
      // Taxonomy does not like for a value to be provided if it does not exist.
      if ($current_row->parent == 0) {
        unset($current_row->parent);
      }

      // We could also have used this function to decide to skip a row, in cases
      // where that couldn't easily be done through the original query. Simply
      // return FALSE in such cases.
      return TRUE;
    }
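
Here is the same check as a stand-alone sketch you can run without Drupal or Migrate. sketch_prepare_row() and the two sample rows are made up for illustration; the rows are plain objects shaped like what the query above would return:

```php
<?php
// Made-up, stand-alone copy of the prepareRow() check.
function sketch_prepare_row($row) {
  // A parent of 0 means "top-level term", so drop the property entirely.
  if ($row->parent == 0) {
    unset($row->parent);
  }
  // Returning TRUE keeps the row; FALSE would skip it.
  return TRUE;
}

$top = (object) array('tid' => 5, 'name' => 'Clothing', 'parent' => 0);
$child = (object) array('tid' => 9, 'name' => 'Shirts', 'parent' => 5);
sketch_prepare_row($top);
sketch_prepare_row($child);

var_dump(isset($top->parent));
var_dump($child->parent);
```

Because objects are passed by handle in PHP, unsetting the property inside the function changes the row the caller sees, which is why prepareRow() does not need to return the row itself.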

We've gone through a lot in this code. Dynamic migrations can be a little daunting, but used under the right circumstances, they can save a lot of reimplementation of the same code. In this scenario, we have ~160 lines of code (100 of which are for the actual migration code). Nearly 50% of the code is to account for doing the dynamic migrations. But these 80 lines of code mean we don't have to recreate these mappings again, or carry changes made for one term migration into all the other classes. The site I did this for has 11 different vocabularies, each with their own terms. So, in the best case, I would have to rewrite 60 of the migration code lines 11 times (or create a parent class and have them all reference that... kinda sounds like this dynamic migration!). It makes things easier to track down, fix, and enforce for your migrations. Depending on how much different data you are importing, this could save you a lot of time.

In my next blog post, I will talk about what you can do when working with a dynamic migration (or actually, any migration) and needing to bring in additional data to fill in any missing pieces of your content (in my case, the product displays needed to get the taxonomy terms mapped into their respective fields). I hope all of this has been helpful. If something doesn't make a lot of sense, leave a comment! I want to keep improving this documentation and hopefully, all of this will be even more useful to other folk looking to migrate content in the Drupal community.