BTMash

Blob of contradictions

Apache Solr and Ubuntu: Multiple Instances

At CalArts, I have wanted to move the search functionality on our Drupal-powered websites to something better for a long time. We have been using the Lucene API module (Lucene search ported to PHP) on most of them since September of last year, but even though I am a big fan of the module, we wanted a way to offload search onto another VPS (or server; basically, something more flexible). Over the past year, I have had the Apache Solr module running on our photo archive and the results have been nothing short of phenomenal: very fast (over 20x the content of the main CalArts website, yet over 20x the performance, on hardware that is nearly 4 years old), fantastic results, and faceted search that helps you find content you otherwise wouldn't. However, Drupal now has a second module for Solr: Search API. As I want to evaluate the two modules and also start moving all of our search capabilities over to Solr, I am left with one question: how do I get new instances of Solr up and running?

I could use something like Chef (ultimately, I most likely will) to replicate a Solr server, but that would mean each VPS needs a certain amount of resources allocated to it to function nicely (which was not a guarantee). What I really wanted was a Solr server that could host multiple search instances (think of virtual hosts in Apache, but for Solr search). We could give it a generous amount of resources and *then* replicate the Solr server configuration should the need arise. Apache Solr supports this through what it calls multiple cores. I had a lot of trouble finding online resources on creating such a setup until I came across a wonderful article by Dustin on setting up multiple cores. So I took what Dustin did and expanded on it with my own script for creating, reloading, unloading, and removing cores.

Installing Apache Solr

In the past, installing Solr has been a pain (installing it on FreeBSD was particularly painful), but on Ubuntu a single command installs Solr and gets it running:

  apt-get install solr-tomcat

That's it! It'll probably take some time to install (Java, Tomcat, Solr and a whole bunch of other dependencies need to be downloaded and installed), but at this stage, if you wanted a single Solr server, you are ready to start setting it up with either the apachesolr module or the Search API module. You should be able to see Solr up and running at http://localhost:8080/solr (replace localhost with your domain or IP address if Solr is not on your local machine).
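
If you'd rather confirm the install from a script than from a browser, something like the following could work. It is only a sketch: the function names solr_url and solr_is_up are made up for illustration, and it assumes curl is installed and that Solr answers at the /solr path mentioned above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Solr or the solr-tomcat package).

# Build the base Solr URL; host and port default to the values used in this post.
# arg 1: host (optional), arg 2: port (optional)
solr_url() {
  echo "http://${1:-localhost}:${2:-8080}/solr"
}

# Succeeds if Solr answers at the base URL.
# -s hides progress output, -f fails on HTTP errors such as 404,
# -L follows redirects.
solr_is_up() {
  curl -sfL -o /dev/null "$(solr_url "$@")/"
}
```

For example, `solr_is_up && echo "Solr is up"` checks the local machine, and passing a host and port as arguments checks a remote one.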

Enable Multicore capability

Using your favorite text editor, create a file called solr.xml in /usr/share/solr with the following contents:

  <?xml version='1.0' encoding='UTF-8'?>
  <solr sharedLib="lib" persistent="true">
    <cores adminPath="/admin/cores">
    </cores>
  </solr>
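
If you are scripting the setup instead of using an editor, the same file can be written with a heredoc. This is a sketch under assumptions: write_solr_xml is a made-up function name, it uses the same /usr/share/solr location as above by default, and it overwrites any solr.xml already there, so only use it on a fresh install.

```shell
#!/bin/sh
# Hypothetical helper: write the minimal multicore solr.xml shown above.
# arg 1: Solr home directory (defaults to /usr/share/solr, as in this post)
write_solr_xml() {
  solr_home=${1:-/usr/share/solr}
  # Quoting 'EOF' keeps the shell from expanding anything inside the heredoc
  cat > "$solr_home/solr.xml" <<'EOF'
<?xml version='1.0' encoding='UTF-8'?>
<solr sharedLib="lib" persistent="true">
  <cores adminPath="/admin/cores">
  </cores>
</solr>
EOF
}
```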

Next, you need to ensure that Tomcat is able to write out new versions of the solr.xml file, since this file is updated as cores are added or removed. The following commands give Tomcat ownership of the file and its directory so it can write to both:

  chown tomcat6:tomcat6 /usr/share/solr/solr.xml
  chown tomcat6:tomcat6 /usr/share/solr

Now we can restart Tomcat (/etc/init.d/tomcat6 restart), and we are ready to start setting up multiple cores.
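
Once Tomcat is back up, one way to check that multicore support is active is to ask the core admin handler for its status. The sketch below assumes the same host and port as earlier and the adminPath from solr.xml; the function names are hypothetical, while STATUS itself is a standard CoreAdmin action. With no cores created yet, I would expect an XML response with an empty status list rather than a 404.

```shell
#!/bin/sh
# Hypothetical helpers for querying the core admin handler set up in solr.xml.

# Print the STATUS URL; with a core name, limit the report to that core.
# arg 1: core name (optional)
core_status_url() {
  url="http://localhost:8080/solr/admin/cores?action=STATUS"
  if [ -n "${1:-}" ]; then
    url="$url&core=$1"
  fi
  echo "$url"
}

# Fetch the status report (XML) from Solr.
# arg 1: core name (optional)
core_status() {
  curl -s "$(core_status_url "${1:-}")"
}
```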

Managing Cores

Before we can start creating multiple cores, we need to create config files, directories, set permissions, etc. To make the process easier, we'll first create a template config directory:

  cp -av /etc/solr/conf /etc/solr/conftemplate

Next, we edit the solrconfig.xml file in the template directory (/etc/solr/conftemplate/solrconfig.xml), changing the dataDir option from:

  <dataDir>/var/lib/solr/data</dataDir>

to

  <dataDir>/var/lib/solr/data/CORENAME</dataDir>
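
The same edit can be made without opening an editor. Here is a small sketch using sed; add_corename_placeholder is a made-up name, and it assumes the dataDir line in your solrconfig.xml looks exactly like the "from" line above (if yours differs, the file is left unchanged).

```shell
#!/bin/sh
# Hypothetical helper: append /CORENAME to the dataDir value in a solrconfig.xml.
# arg 1: path to solrconfig.xml (defaults to the template copy from this post)
add_corename_placeholder() {
  conf=${1:-/etc/solr/conftemplate/solrconfig.xml}
  # '|' is the sed delimiter so the slashes in the path need no escaping.
  # Write to a temporary file first, then replace the original on success.
  sed 's|<dataDir>/var/lib/solr/data</dataDir>|<dataDir>/var/lib/solr/data/CORENAME</dataDir>|' \
    "$conf" > "$conf.new" && mv "$conf.new" "$conf"
}
```

Running it a second time does nothing, because the original pattern no longer matches once CORENAME has been appended.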

NOTE

Since we are using Drupal, I would also recommend copying the solrconfig.xml file that the Drupal Solr modules provide into the template directory (you could name it drupal.solrconfig.xml) and making the same change to it. In the future, you would simply need to overwrite the default solrconfig.xml file with yours and you'd be good to go :)

Now that we have our template directory ready, we need a way to do a few things as a start:

  1. Create a new core. This involves letting Solr know there is a new core and creating a copy of the configuration for it (these per-core configs will be in /etc/solr/conf/).
  2. Reload a core. This reloads the settings for a particular core so that Tomcat (and your other search cores) do not need to restart.
  3. Unload a core. This is to stop using a particular core. Note that this is *not* to remove the index, settings, etc for a particular core from the server.
  4. Remove a core. This would remove the core (index, settings, everything) from the server.

For this piece, I wrote a script, which is available at http://pastebin.com/MWaqe7xi (I'm also pasting the current code below). The script does all of the above. Download it to a file on your server (I call mine 'solr-admin.sh'; you can place it in your /usr/sbin folder or /home//bin) and make it executable. With it available as solr-admin, I issue the following commands:

  1. Create new core: solr-admin create <CORENAME>
  2. Reload existing core settings: solr-admin reload <CORENAME>
  3. Unload existing core: solr-admin unload <CORENAME>
  4. Remove existing core: solr-admin remove <CORENAME>

There are a number of other functions that can also be created (such as merging the indexes of multiple cores into one, renaming a core, etc) which are not yet handled by the script. Everyone is welcome to contribute and help flesh this out :D
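
As one example of such an addition, renaming could probably be built on the CoreAdmin RENAME action, which takes the current name in the core parameter and the new name in the other parameter. The function names below are hypothetical and not part of the script. Also note that, as far as I know, RENAME only changes the name Solr uses: the directories under /var/lib/solr/data and /etc/solr/conf keep the old name, so the script's directory-based existence checks would not know about the new one.

```shell
#!/bin/sh
# Hypothetical addition, not part of solr-admin.sh.

# Print the CoreAdmin URL that renames core $1 to $2.
rename_core_url() {
  echo "http://localhost:8080/solr/admin/cores?action=RENAME&core=$1&other=$2"
}

# Rename a core; both names are required.
# arg 1: current core name
# arg 2: new core name
rename_solr_core() {
  if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
    echo "Usage: rename_solr_core OLDNAME NEWNAME" >&2
    return 1
  fi
  curl "$(rename_core_url "$1" "$2")"
}
```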

Contents of solr-admin.sh

  #!/bin/sh

  # Manage Solr cores: create, reload, unload, and remove.
  # Usage: solr-admin {create|reload|unload|remove} [CORENAME]

  DATA_DIR=/var/lib/solr/data
  CONF_DIR=/etc/solr/conf
  TEMPLATE_DIR=/etc/solr/conftemplate
  CORES_URL=http://localhost:8080/solr/admin/cores

  # Set $name from the given core name, prompting for it if missing.
  # arg 1: action named in the prompt
  # arg 2: core name (optional)
  get_core_name() {
    name=$2
    if [ -z "$name" ]; then
      printf 'Name of core to %s: ' "$1"
      read -r name
    fi
    # The name ends up in paths, sed, and rm -rf, so reject empty or unsafe values
    case "$name" in
      ''|.*|*[!A-Za-z0-9._-]*)
        echo "Invalid core name: '$name'" >&2
        exit 1
        ;;
    esac
  }

  # Create a new core
  # arg 1: core name
  create_solr_core() {
    get_core_name create "$1"

    if [ -d "$DATA_DIR/$name" ]; then
      echo "Cannot create $name: core with same name already exists" >&2
      exit 1
    fi

    mkdir "$DATA_DIR/$name"
    chown tomcat6:tomcat6 "$DATA_DIR/$name"

    # Copy the template config and point its dataDir at this core
    mkdir -p "$CONF_DIR/$name/conf"
    cp -a "$TEMPLATE_DIR"/* "$CONF_DIR/$name/conf/"
    sed -i "s/CORENAME/$name/" "$CONF_DIR/$name/conf/solrconfig.xml"

    curl "$CORES_URL?action=CREATE&name=$name&instanceDir=$CONF_DIR/$name"
    echo "Please read status from solr - by all accounts (pending a proper name), core $name was created"
  }

  # Reload existing core
  # arg 1: core name
  reload_solr_core() {
    get_core_name reload "$1"

    if [ ! -d "$DATA_DIR/$name" ]; then
      echo "Core $name does not exist" >&2
      exit 1
    fi

    curl "$CORES_URL?action=RELOAD&core=$name"
    echo "Core $name has been reloaded"
  }

  # Unload existing core (index and settings stay on disk)
  # arg 1: core name
  unload_solr_core() {
    get_core_name unload "$1"

    if [ ! -d "$DATA_DIR/$name" ]; then
      echo "Core $name does not exist" >&2
      exit 1
    fi

    curl "$CORES_URL?action=UNLOAD&core=$name"
    echo "Core $name has been unloaded"
  }

  # Remove existing core (index, settings, everything)
  # arg 1: core name
  remove_solr_core() {
    get_core_name remove "$1"

    # Unload first so Solr stops using the files; this exits if the core is unknown
    unload_solr_core "$name"
    rm -rf "$DATA_DIR/$name"
    rm -rf "$CONF_DIR/$name"
    echo "Deleted index and configuration settings for $name"
  }

  case "$1" in
    create) create_solr_core "$2" ;;
    reload) reload_solr_core "$2" ;;
    unload) unload_solr_core "$2" ;;
    remove) remove_solr_core "$2" ;;
    *)
      echo "Usage: $0 {create|reload|unload|remove} [CORENAME]" >&2
      exit 1
      ;;
  esac

  exit 0
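
Because solr.xml is declared with persistent="true", Solr writes the cores it knows about back to that file, so listing the registered cores can be sketched by reading it. The function name list_solr_cores is hypothetical, and the sed pattern assumes each core sits on its own line as a <core name="..." .../> element, which is worth checking against your own solr.xml before relying on it.

```shell
#!/bin/sh
# Hypothetical helper, not part of solr-admin.sh.
# Print the name of every <core> element found in solr.xml, one per line.
# arg 1: path to solr.xml (defaults to the location used in this post)
list_solr_cores() {
  # "<core " with a trailing space skips the enclosing <cores ...> element
  sed -n 's/.*<core [^>]*name="\([^"]*\)".*/\1/p' "${1:-/usr/share/solr/solr.xml}"
}
```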