BTMash

Blob of contradictions

Entity Caching and Drush - a sweet match.

Tue, 07/12/2011 - 10:30 -- btmash

I am a big fan of the Display Suite module. It's quite flexible and gets you up and running with a look and feel fairly quickly. One of my favourite features of Display Suite is that you can create various build modes, which can then power your views or results or be used in various areas of your site. And since Display Suite plugs into any type of entity, this makes things even more fantastic. As an example, I gave one of my taxonomy terms two build modes: the default 'page view' mode, which showed upcoming events, and an 'archive' mode, which showed events that had already passed on our site. Both were then tied into views.

However, all of this comes at a cost. When a view uses fields, whatever fields you choose to display become part of the query and that's that: views then renders the result set (there is other alteration work that goes on as well, but for the most part you have one query and you're done). With an entity view (or a content view with a build mode), you first run the query to get the entity IDs. Then each entity has to be loaded in full before rendering can begin, which could mean many more queries (one site that I have been working on has well over 50 fields per node for one of its content types). So the one page that you have becomes much heavier than it may have already been. Thankfully, catch wrote a really cool module called Entity Cache, which stores a loaded entity (a node, for example) in a cache table so all those queries from before do not need to run again. He even has it set up so that the cached copy remains until the cache expires and/or the entity content is updated.
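To make the idea concrete, here is a toy model of that kind of caching in plain PHP. It is only a sketch of the general read-through pattern, not the Entity Cache module's actual code: the class name `ExampleEntityCache`, its methods, and the stand-in loader are all hypothetical, and the cache is a simple in-memory array rather than a database table.

```php
<?php

/**
 * Toy read-through cache. Hypothetical illustration only; this is not how
 * the Entity Cache module is implemented.
 */
class ExampleEntityCache {
  protected $cache = array();
  protected $loader;
  // Counts how often the expensive loader actually ran.
  public $loaderCalls = 0;

  public function __construct($loader) {
    // $loader stands in for the expensive "run all the field queries" step.
    $this->loader = $loader;
  }

  public function load($id) {
    if (!isset($this->cache[$id])) {
      // Cache miss: do the expensive load once and remember the result.
      $this->loaderCalls++;
      $this->cache[$id] = call_user_func($this->loader, $id);
    }
    return $this->cache[$id];
  }

  public function save($id) {
    // The entity changed, so drop the stale cached copy.
    unset($this->cache[$id]);
  }
}

$cache = new ExampleEntityCache(function ($id) {
  return array('id' => $id, 'title' => "Entity $id");
});
$cache->load(1);
$cache->load(1);  // Served from the cache; the loader does not run again.
$cache->save(1);  // Simulate an update to entity 1.
$cache->load(1);  // Loaded again because the cached copy was dropped.
```

Three calls to load() end up running the loader only twice. The second call is free, which is exactly the saving you want on a node with 50 fields.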

The only caveat is that when you are developing a site and the cache constantly needs to be cleared for the CSS/JS, the entity caches get flushed as well. When this happens on a live site, some pages could load much more slowly than others, since the entity cache has to be rebuilt after it is cleared (after a cron run, for example). This is where you can use Drush to help out!

Below is a simple drush script that does the following:

  1. Find all entity types that use the entitycache mechanism. Optionally, provide the machine name of a single entity type (e.g. node, comment, user, taxonomy_term)
  2. Gather all the entity_ids for the particular entity type
  3. Perform entity_load on each

There is a lot that can be done to improve the drush script (check it out in the issue queue; I have a fair number of TODOs ^_^), but being able to clear out the cache and then load the entities back in has allowed me to keep a site fairly snappy. I also learned even more about Drupal: I had no idea that calling entity_load once with multiple entity IDs was a fair bit faster than calling it separately for each individual entity. I don't know if there is a magic number, but processing 50 nodes at a time cut the script's processing time by nearly 60%. My plan is to use batch processing so that this could be run by a content editor after flushing the site caches to fill them back up again. Or maybe a scheduler to cache entities that are not in the cache yet. So many ideas...
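The batching idea on its own can be sketched in plain PHP, outside of Drupal. This is a minimal illustration under a few assumptions: the function name `example_load_in_batches` and the `$load_multiple` callback are hypothetical, with the callback standing in for a multi-ID call such as entity_load($type, $ids). The batch size of 50 is just the value that worked for me.

```php
<?php

/**
 * Split IDs into batches and hand each batch to a loader callback.
 *
 * $load_multiple is a hypothetical stand-in for a multi-ID load such as
 * entity_load($type, $ids). Returns the number of loader calls made.
 */
function example_load_in_batches(array $ids, $batch_size, $load_multiple) {
  $batches = 0;
  foreach (array_chunk($ids, $batch_size) as $batch) {
    // One loader call per batch, including the final, smaller batch.
    call_user_func($load_multiple, $batch);
    $batches++;
  }
  return $batches;
}

// Usage: 120 IDs in batches of 50 means three loader calls.
$sizes = array();
$calls = example_load_in_batches(range(1, 120), 50, function (array $batch) use (&$sizes) {
  // Record the batch size instead of really loading anything.
  $sizes[] = count($batch);
});
```

Using array_chunk() keeps the leftover IDs (the last 20 here) from being forgotten, which is easy to do with a hand-rolled counter.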

I have pasted it at pastebin (LINK) and below if you want to take a look. Leave some feedback here or in the issue if you have ideas on other ways this could be improved :)

<?php

/**
* @file entitycache.drush.inc
*   drush integration for entitycache.
*   Simple idea - load each of the entities.
*   @TODO: Figure out how to do it in batches.
*   @TODO: Figure out how to only load entities not yet in cache.
*   @TODO: Figure out a way to have it done via a scheduler.
*/

/**
* Implementation of hook_drush_help().
*/
function entitycache_drush_help($section) {
  switch ($section) {
    case 'drush:entitycache-load-cache':
      return dt('Used without parameters, this command loads every entity of each supported type into the cache using entity_load().');
  }
}

/**
* Implementation of hook_drush_command().
*/
function entitycache_drush_command() {
  $items = array();
  $items['entitycache-load-cache'] = array(
    'callback' => 'drush_entitycache_load_cache',
    'description' => dt('Load the cache with the various entities configured to use the cache'),
    'arguments' => array(
      'type' => dt('Optional. Only load the particular entity type objects into cache'),
    ),
    'bootstrap' => DRUSH_BOOTSTRAP_DRUPAL_FULL,
    'aliases' => array('elc'),
  );
 
  return $items;
}

/**
* Load the cache bin with content from the various entities.
* @param $type Optional. The specific type of entity that should
*   get its content cached.
*/
function drush_entitycache_load_cache($type = '') {
  $types = entitycache_supported_core_entities(TRUE);
  $start = time();
  if (!empty($type)) {
    if (!isset($types[$type])) {
      drush_die("Type $type is not supported");
    }
    else {
      _drush_entitycache_load_cache($type);
    }
  }
  else {
    foreach ($types as $entity_type => $entity_controller) {
      _drush_entitycache_load_cache($entity_type);
    }
  }
  drush_print("Caching complete!");
  drush_print("Total processing time: ". (time() - $start) ." seconds.");
}

/**
* Load the cache bin with content from a specific type of entity.
*/
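function _drush_entitycache_load_cache($type) {
  // Number of entities to pass to each entity_load() call.
  $max_limit = 50;
  drush_print("Begin caching $type");
  $query = new EntityFieldQuery();
  $result = $query
    ->entityCondition('entity_type', $type)
    ->execute();

  // No entities of this type exist, so there is nothing to cache.
  if (empty($result[$type])) {
    return;
  }

  // Load the entities in batches. array_chunk() also yields the final,
  // partial batch, so entities past the last full batch are loaded too.
  foreach (array_chunk(array_keys($result[$type]), $max_limit) as $keys) {
    // Reset the static cache on each call so memory use stays bounded.
    entity_load($type, $keys, array(), TRUE);
  }
}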