Yesterday evening, I was working with a client whose site does some interesting things with one of its custom search pages. The page sends AJAX requests to the backend to get two types of values for the user:
- A count of the total number of nodes of type X that match the criteria
- A count of the total number of nodes of another type Y that are referenced by those X nodes (Y can be referenced multiple times by various X, but for this we just want to get back that count).
Instead of going with straight database queries to get the data, they were using EntityFieldQuery to get the initial list of X, since they were using fields. It's not quite as fast, but it's a much more flexible approach (and if they opt to change their field storage in the future to something like MongoDB, they can have something really fast without having to change a single line of code!). The one problem with EntityFieldQuery, however, is that it only returns a listing of entity IDs. That means that if we want to get other pieces of data, we have to load the entity. In their scenario, the only other piece of data they wanted to retrieve was the reference field data. Performing an entire entity_load (or node_load, to be specific) would mean also loading the 50 other fields they are storing. So on uncached content, the retrieval of this data alone took 3 to 4 seconds.
I was completely stumped on this because I only wanted this other field - there had to be an easier way to get at this data than performing a straight query for it (again, you lose a chunk of flexibility). And thankfully, you don't need to. After reading through all the steps an entity_load goes through, I finally found what I was looking for: field_attach_load(). Now, I know what you're thinking (and this is because it was mentioned on IRC):
"kinda sounds dirty to me but I don't know why"
I assure you that it is quite safe. This is the function that an entity load goes through when it wants to load up all the fields. However, if you look at the $options variable, you will see that it accepts a 'field_id' key. So if you provide it with one field ID, it will try to retrieve just that field (note that if the object has already been cached, it will give you the whole object anyway - a neat side effect of a whole entity load). So my initial code for the client looked something like the following (my example retrieves an associated image field after performing the entity query):
    // Find published articles that have an image attached.
    $query = new EntityFieldQuery();
    $query->entityCondition('entity_type', 'node')
      ->entityCondition('bundle', 'article')
      ->propertyCondition('status', 1)
      ->fieldCondition('field_image', 'fid', 'NULL', '!=');
    $results = $query->execute();
    // execute() omits the 'node' key when nothing matches.
    $articles = isset($results['node']) ? $results['node'] : array();
    if (!empty($articles)) {
      // Look up the field ID, then attach only that field to the stub nodes.
      $fields = field_info_instances('node', 'article');
      $field_id = $fields['field_image']['field_id'];
      field_attach_load('node', $articles, FIELD_LOAD_CURRENT, array('field_id' => $field_id));
    }
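Once the field has been attached, the stub entities carry its values in the usual Drupal 7 layout, `$entity->field_name[langcode][delta][column]`. As a rough sketch of how the second count from the original problem (distinct Y referenced from the X nodes) might then be derived, here is a small plain-PHP helper. The function name `count_distinct_references()`, the `'und'` default (the value of LANGUAGE_NONE), and the sample data are assumptions made for illustration; they are not the client's code.

```php
<?php
/**
 * Counts the distinct values of one field column across loaded entities.
 * Hypothetical helper: the name and defaults are illustrative assumptions.
 */
function count_distinct_references(array $entities, $field_name, $column, $langcode = 'und') {
  $seen = array();
  foreach ($entities as $entity) {
    // Skip entities where the field is empty or was never attached.
    if (empty($entity->{$field_name}[$langcode])) {
      continue;
    }
    foreach ($entity->{$field_name}[$langcode] as $item) {
      if (isset($item[$column])) {
        // Keyed by value, so repeated references collapse into one entry.
        $seen[$item[$column]] = TRUE;
      }
    }
  }
  return count($seen);
}

// Sample data shaped like stub nodes after field_attach_load().
$a = new stdClass();
$a->nid = 1;
$a->field_image = array('und' => array(array('fid' => 10), array('fid' => 11)));
$b = new stdClass();
$b->nid = 2;
$b->field_image = array('und' => array(array('fid' => 10)));
$c = new stdClass();
$c->nid = 3;
$c->field_image = array();

$sample = array(1 => $a, 2 => $b, 3 => $c);
echo count_distinct_references($sample, 'field_image', 'fid'), "\n"; // prints 2
```

On a real site, `$sample` would be the `$articles` array from the query, and the column would be whatever the reference field stores (for example `fid` for an image field).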
Now, there are ways to speed this up as well (if you already know the field ID, you don't have to go through the field_info_instances() function). But the main gist is that this approach still goes through all the field hooks that a field would normally go through, so a module is free to modify the data if your site is configured that way. Is it slower than a normal field query? Sure. You could call other functions inside field_attach_load() if you want to drill down even further. But it's still quite performant and still very flexible. And it honours your field storage.
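One way to skip the instance lookup could be to ask `field_info_field()` for the field definition, which carries the field's `id` directly. The wrapper below is only a sketch: the name `attach_single_field()` is hypothetical, and it assumes it is called inside a bootstrapped Drupal 7 site where `field_info_field()`, `field_attach_load()` and `FIELD_LOAD_CURRENT` are available.

```php
<?php
/**
 * Attaches a single field to a set of stub entities.
 * Hypothetical wrapper; intended to be called inside a bootstrapped Drupal 7 site.
 */
function attach_single_field($entity_type, array $entities, $field_name) {
  // Nothing to load, so return before touching the Field API.
  if (empty($entities)) {
    return $entities;
  }
  // field_info_field() returns the field definition, or NULL if unknown.
  $field = field_info_field($field_name);
  if (empty($field)) {
    return $entities;
  }
  // Restrict the load to this one field via the 'field_id' option.
  field_attach_load($entity_type, $entities, FIELD_LOAD_CURRENT, array('field_id' => $field['id']));
  return $entities;
}
```

With the query from earlier, usage might be as simple as `$articles = attach_single_field('node', $articles, 'field_image');`. If the field ID is known and stable on your site, it could also be passed in directly and the lookup dropped altogether.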
I tried the above on a stock Drupal install (with about 500 nodes - 200 of which were 'articles'). The page load using node_load took at least 800 milliseconds every time (if everything was cached - if not, then my results averaged 2500 milliseconds). Using the functionality above took 280 milliseconds on average. A wonderful difference.
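For anyone wanting to take rough measurements of their own, one simple approach is to average the wall-clock time of a callback over a few runs. This is a minimal sketch, not a rigorous benchmark and not necessarily how the numbers above were gathered; the helper name `average_ms()` and the stand-in workload are assumptions for illustration.

```php
<?php
/**
 * Returns the average wall-clock time of a callable in milliseconds.
 * Hypothetical helper for rough comparisons only.
 */
function average_ms($callback, $runs = 5) {
  $total = 0.0;
  for ($i = 0; $i < $runs; $i++) {
    $start = microtime(TRUE);
    call_user_func($callback);
    $total += microtime(TRUE) - $start;
  }
  return ($total / $runs) * 1000;
}

// Stand-in workload; on a Drupal site the callback would wrap the
// node_load-based code or the field_attach_load() code instead.
$ms = average_ms(function () {
  usleep(2000);
}, 3);
printf("%.1f ms\n", $ms);
```

Keep in mind that static and persistent caches will skew repeated runs, so cached and uncached cases are best measured separately.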
So the biggest takeaway I got from all this? Spend more time looking at the insides of whatever open source project you are working with! You might just find some amazing functionality that does exactly what you need and/or learn some cool ways to approach other problems (that may not even be related to the project you are working on) in the future.