Configuring Solr

Martini can connect to your own Solr server and link to the Solr cores you host on it, giving you efficient indexing and search for your data.

The following sections describe how to configure Martini to use a remote instance of Solr.

Using Martini with a Remote Solr Instance

Procedure

  1. Update your instance properties file to inform Martini Server Runtime that you will be using a remote Solr server.

    It's recommended to create an override.properties file if one doesn't exist yet, and then add or modify these key-value pairs in the file:

    solr.mode=remote
    solr.url=http://<solr-server-url>
    

    If you have multiple Martini servers and want them to connect to a single remote Solr instance, you can also assign a unique Solr core prefix for each instance by setting each Martini server's solr.core-prefix instance property.

    solr.core-prefix=<prefix>
    
  2. Create a Solr core to link with Martini by following this guide.

  3. Start or restart your remote Solr instance by executing ./solr start in the <solr-home>/bin directory.

  4. Restart your Martini Server Runtime instance. During startup, Martini Server Runtime should log messages like the following if it successfully communicates with the remote Solr instance:

    25/04/18 16:39:19.749 INFO  [RemoteSolrClient] Starting core: '<core-name>'
    25/04/18 16:39:19.755 INFO  [RemoteSolrClient] Starting core: '<core-name>' completed
    

Tuning Solr

At times, you may need to tune Solr to handle heavier indexing and searching loads, especially in production environments. This section discusses various configuration options to optimize Solr for production use.

Monitoring performance

Before making changes, monitor your Solr instance's performance statistics and log files accessible via the Solr Admin UI. Follow these steps to access the Solr Admin UI:

  1. Navigate to http://<host>:<port>/solr.
  2. Select a Solr core.
  3. Click Plugins/Stats to view performance metrics.
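The same statistics can also be fetched without a browser through Solr's Metrics API. The sketch below only builds the request URL. It assumes Solr is served under `/solr` as in the steps above, and the host, port, and metric prefix in the usage comment are placeholders.

```bash
# Build a Metrics API URL.
# $1: host; $2: port; $3: metric group (for example "core" or "jvm");
# $4: metric name prefix used to narrow the result.
metrics_url() {
  printf 'http://%s:%s/solr/admin/metrics?group=%s&prefix=%s&wt=json\n' "$1" "$2" "$3" "$4"
}

# Example (placeholder host and prefix):
#   curl -fsS "$(metrics_url localhost 8983 core QUERY./select)"
```

Narrowing the request with a prefix keeps the response small enough to compare before and after a tuning change.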

JVM

Proper JVM configuration is crucial for Solr performance. Allocate sufficient memory to Solr while leaving enough for the operating system. Configure memory allocation using -Xms and -Xmx arguments in the solr.in.sh or solr.in.cmd file.
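As an illustration, heap settings in solr.in.sh can be passed through the `SOLR_JAVA_MEM` variable. The 2g figure below is only a placeholder, not a recommendation; size the heap from what you observe on your own system.

```bash
# Fragment for solr.in.sh. Setting -Xms and -Xmx to the same value
# avoids heap resizing at runtime. The 2g value is a placeholder.
SOLR_JAVA_MEM="-Xms2g -Xmx2g"

# In solr.in.cmd the equivalent line would be:
#   set SOLR_JAVA_MEM=-Xms2g -Xmx2g
```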

Update handlers

Configure Solr's update handlers to optimize commit behavior, balancing between normal and soft commits to enhance search performance.
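One way to adjust commit intervals without editing solrconfig.xml by hand is Solr's Config API, which accepts a `set-property` command for the auto-commit settings. The sketch below only builds the JSON payload; the function name is hypothetical and the interval values in the usage comment are illustrative, not tuned recommendations.

```bash
# Print a Config API payload that sets hard and soft commit intervals.
# $1: hard commit interval in milliseconds.
# $2: soft commit interval in milliseconds.
commit_payload() {
  printf '{"set-property":{"updateHandler.autoCommit.maxTime":%s,"updateHandler.autoSoftCommit.maxTime":%s}}\n' "$1" "$2"
}

# Example (placeholder host, port, and core name):
#   curl -fsS -H 'Content-type:application/json' \
#     -d "$(commit_payload 15000 1000)" \
#     "http://<host>:<port>/solr/<core-name>/config"
```

A longer hard commit interval with a shorter soft commit interval makes new documents searchable quickly while flushing to disk less often.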

File descriptors

Adjust the system's file descriptor limit (ulimit -n) to prevent issues with incremental indexing and ensure Solr stability.
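A quick check of the current limit might look like the sketch below. The 65000 threshold is the value Solr's own start script warns about; the function name is hypothetical.

```bash
# Return 0 when the limit in $1 meets the required minimum in $2.
fd_limit_ok() {
  # "unlimited" always satisfies the requirement.
  [ "$1" = "unlimited" ] && return 0
  [ "$1" -ge "$2" ]
}

# Compare the limit of the current shell against the threshold.
current=$(ulimit -n)
if fd_limit_ok "$current" 65000; then
  echo "open file limit OK: $current"
else
  echo "open file limit too low: $current (want at least 65000)"
fi
```

Run it as the user that starts Solr, since limits are set per user and per session.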

Distributed indexing and searching

Consider SolrCloud for horizontal scaling and fault tolerance. SolrCloud simplifies distributed search and indexing, supporting automatic load balancing and fault tolerance.

Further tuning

Customize Solr settings via the solrconfig.xml file to fine-tune performance based on your specific use case. Refer to the Apache Solr Reference Guide for comprehensive configuration options.

Using Martini with SolrCloud

SolrCloud is a distributed indexing and search solution that provides fault tolerance and high availability by distributing Solr cores across multiple servers. It offers features like replication and sharding, making it ideal for robust and scalable systems.

Before proceeding, ensure you are familiar with SolrCloud by reading Solr's guide on SolrCloud.

Configuring SolrCloud for Martini

To integrate SolrCloud with Martini, follow these steps:

  1. Configure an external ZooKeeper ensemble
  2. Configure SolrCloud
  3. Connect Martini to SolrCloud

The steps below are illustrated with examples that set up three ZooKeeper instances and three SolrCloud instances, similar to how a production environment would be configured.

Configuring an external ZooKeeper ensemble

ZooKeeper is a centralized service for maintaining configuration information, fail-over mechanisms, and state management. To configure ZooKeeper for your SolrCloud cluster:

  • Download ZooKeeper and set up a ZooKeeper ensemble with multiple instances. To find the ZooKeeper version compatible with your SolrCloud cluster, check your Solr's <solr-home>/server/solr-webapp/webapp/WEB-INF/lib/ directory: the version number in the bundled ZooKeeper library's file name is the one to use.
  • Assign unique IDs to each ZooKeeper instance and configure the ZooKeeper ensemble.
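The version lookup described above can be scripted. This is a minimal sketch, assuming the bundled library is named `zookeeper-<version>.jar`; the function name is hypothetical.

```bash
# Print the version embedded in the bundled ZooKeeper jar's file name.
# $1: the WEB-INF/lib directory of a Solr installation.
bundled_zk_version() {
  # The [0-9] in the pattern skips companion jars such as zookeeper-jute.
  for jar in "$1"/zookeeper-[0-9]*.jar; do
    [ -e "$jar" ] || continue
    name=${jar##*/}           # strip the directory part
    name=${name#zookeeper-}   # strip the "zookeeper-" prefix
    echo "${name%.jar}"       # strip the ".jar" suffix
    return 0
  done
  return 1
}
```

Call it as `bundled_zk_version <solr-home>/server/solr-webapp/webapp/WEB-INF/lib`; it returns a non-zero status if no matching jar is found.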

Procedure

  1. Download a copy of ZooKeeper's portable installer (zookeeper-3.6.2.tar.gz in this case), extract the file, and copy the extracted directory to each of your ZooKeeper instances' home directory.

    cd /datastore/apps/zookeeper/
    
    # Download, extract, and rename the installer
    wget https://archive.apache.org/dist/zookeeper/zookeeper-3.6.2/apache-zookeeper-3.6.2-bin.tar.gz
    tar -xvf apache-zookeeper-3.6.2-bin.tar.gz
    mv apache-zookeeper-3.6.2-bin zookeeper
    
    # Copy the directory to each instance's home folder
    cp -r zookeeper/ /datastore/apps/zookeeper/instances/zk1/
    cp -r zookeeper/ /datastore/apps/zookeeper/instances/zk2/
    cp -r zookeeper/ /datastore/apps/zookeeper/instances/zk3/
    
  2. Create /data/<zookeeper-id>/myid files in each instance's home directory where:

    • <zookeeper-id> should be replaced with the ZooKeeper instance's respective ID number and;
    • Every myid file's content is also the ZooKeeper instance's respective ID number.
    # Create the directories first
    mkdir -p instances/zk1/data/1
    mkdir -p instances/zk2/data/2
    mkdir -p instances/zk3/data/3
    
    # Create the files
    echo "1" > /datastore/apps/zookeeper/instances/zk1/data/1/myid
    echo "2" > /datastore/apps/zookeeper/instances/zk2/data/2/myid
    echo "3" > /datastore/apps/zookeeper/instances/zk3/data/3/myid
    
  3. After creating the data directories, create each ZooKeeper instance's configuration file (zoo.cfg).

    # Open the configuration file
    nano <zookeeper-home>/zookeeper/conf/zoo.cfg
    
    # Define the desired configuration
    tickTime=2000
    initTime=10
    initLimit=5
    syncLimit=5
    clientPort=2181
    dataDir=<zookeeper-home>/data/<zookeeper-id>
    server.1=192.168.21.71:2888:3888
    server.2=192.168.21.72:2888:3888
    server.3=192.168.21.73:2888:3888
    

    Replace the placeholders as follows:

    • <zookeeper-home> with the ZooKeeper instance's home directory.
    • <zookeeper-id> with the ZooKeeper instance's ID.

    Perform these commands for every ZooKeeper instance.

  4. Finally, start ZooKeeper by calling zkServer.sh start on each server.

    /datastore/apps/zookeeper/instances/zk1/zookeeper/bin/zkServer.sh start
    /datastore/apps/zookeeper/instances/zk2/zookeeper/bin/zkServer.sh start
    /datastore/apps/zookeeper/instances/zk3/zookeeper/bin/zkServer.sh start
    

That's it! Your ZooKeeper ensemble is now ready, with three instances up and running to serve Solr.
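To confirm that the instances have formed a quorum, each one can be asked for its role with `zkServer.sh status`, which reports a `Mode:` line (leader or follower in an ensemble). The sketch below reuses the example paths from the procedure above and must run on the host where those directories exist; the `zk_mode` helper is hypothetical.

```bash
# Extract the role from `zkServer.sh status` output read on stdin.
zk_mode() {
  sed -n 's/^Mode: *//p'
}

# Base directory from the example layout used in this guide.
ZK_BASE=/datastore/apps/zookeeper/instances
for id in zk1 zk2 zk3; do
  script="$ZK_BASE/$id/zookeeper/bin/zkServer.sh"
  if [ -x "$script" ]; then
    echo "$id: $("$script" status 2>/dev/null | zk_mode)"
  else
    echo "$id: zkServer.sh not found at $script"
  fi
done
```

A healthy three-instance ensemble should report one leader and two followers.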

Configuring SolrCloud

Once ZooKeeper is set up, proceed to configure SolrCloud.

Procedure

  1. Download and install Solr. This guide uses Solr 8.11.3.
    cd /datastore/apps/solr
    
    # Download, extract, and rename the installer
    wget https://dlcdn.apache.org/lucene/solr/8.11.3/solr-8.11.3.tgz
    tar -xvf solr-8.11.3.tgz
    mv solr-8.11.3 solr
    
    # Copy the directory to each instance's home folder
    cp -r solr/ instances/solr1/
    cp -r solr/ instances/solr2/
    cp -r solr/ instances/solr3/
    
  2. Create a copy of Martini's Solr core configurations in Solr's global configuration folder.

    cp -r <martini-server-runtime-home>/conf/solr/cores/ /datastore/apps/solr/solr/server/configs/
    
  3. Upload Martini's Solr configuration files (schema.xml and solrconfig.xml) to ZooKeeper. You can upload these files from any Solr server, and ZooKeeper will replicate them across the ensemble automatically. Repeat the upload for any other Solr cores defined in other packages.

    1. Navigate to any Solr server's server/scripts/cloud-scripts directory which contains the zkcli.sh script.

      cd /datastore/apps/solr/instances/solr1/solr/server/scripts/cloud-scripts
      
    2. Use the script below to upload configurations for the Solr cores.

      ./zkcli.sh -zkhost 192.168.21.71:2181 -cmd upconfig -confname <core-name> -confdir /datastore/apps/solr/configs/cores/<core-name>/conf
      
      • The -confname argument sets the name under which the configuration is stored in ZooKeeper; here it matches the core name. You will use this configuration name later when creating the Solr collections for Martini.
      • Replace the placeholders with your own values.
  4. Start the Solr servers in SolrCloud mode.

    To do that, execute the following command for every Solr server:

    <solr-server-home>/bin/solr start -cloud -z <zookeeper-host-addresses>
    

    In this case, the commands would be:

    /datastore/apps/solr/instances/solr1/solr/bin/solr start -cloud -z 192.168.21.71:2181,192.168.21.72:2181,192.168.21.73:2181
    /datastore/apps/solr/instances/solr2/solr/bin/solr start -cloud -z 192.168.21.71:2181,192.168.21.72:2181,192.168.21.73:2181
    /datastore/apps/solr/instances/solr3/solr/bin/solr start -cloud -z 192.168.21.71:2181,192.168.21.72:2181,192.168.21.73:2181
    

    Solr should now start and connect to your ZooKeeper ensemble.

  5. Finally, create the collections needed by Martini using the Solr Collections API's create endpoint. You may use the "create endpoint" of any configured Solr server.

    Here's an example of how to do this:

    **Example Request**
    
    ```bash
    curl -X GET \
      'http://192.168.21.74:8983/solr/admin/collections?action=CREATE&name=jte_{core-name}&numShards=3&replicationFactor=3&maxShardsPerNode=3&collection.configName={core-name}&wt=json'
    ```
    
    • Replace the placeholders with your own values.

    Once done, open the Solr Admin UI of any of the Solr servers in your browser. Click the Cloud tab and then click Graph. The Graph page shows a graphical representation of how your Solr collections are distributed across your network.

    Accessing the same page in other configured Solr servers should still yield the same map.

    At this point, your Solr cluster is ready.
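When several collections are needed, the create request above can be generated per core. The following is a sketch, assuming one collection per uploaded configuration and the same shard and replica counts as the example request. The core names in `CORES` are hypothetical, and the loop only prints the curl commands so they can be reviewed before running.

```bash
# Build a Collections API CREATE URL following the example request above.
# $1: Solr host:port; $2: core prefix; $3: core name (also the config name).
create_collection_url() {
  printf 'http://%s/solr/admin/collections?action=CREATE&name=%s_%s&numShards=3&replicationFactor=3&maxShardsPerNode=3&collection.configName=%s&wt=json\n' "$1" "$2" "$3" "$3"
}

# Hypothetical core names; replace them with the cores you uploaded.
CORES="core_a core_b"
for core in $CORES; do
  # Print the command instead of executing it (dry run).
  echo "curl -fsS '$(create_collection_url 192.168.21.74:8983 jte "$core")'"
done
```

The prefix passed as the second argument should match the solr.core-prefix value configured in Martini.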

Connecting Martini to SolrCloud

After configuring SolrCloud, make the necessary changes in Martini to connect to SolrCloud:

  1. Update Martini's instance properties file to specify SolrCloud mode and provide the addresses of the ZooKeeper instances.

    solr.mode=cloud
    solr.url=192.168.21.71:2181,192.168.21.72:2181,192.168.21.73:2181
    solr.core-prefix=jte
    
  2. Restart Martini to apply the changes.
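The solr.url value in cloud mode is a comma-separated list of ZooKeeper `host:port` pairs. A small helper can assemble it from a host list, which helps avoid typos as the ensemble grows. This is a sketch with a hypothetical function name, using the example addresses from this guide.

```bash
# Join host names into a ZooKeeper connection string.
# $1: client port; remaining arguments: host names or IP addresses.
zk_connect_string() {
  port=$1
  shift
  out=""
  for host in "$@"; do
    # Add a comma only when something has already been appended.
    out="${out:+$out,}$host:$port"
  done
  echo "$out"
}

# Print the three properties from step 1 for the example ensemble.
printf 'solr.mode=cloud\nsolr.url=%s\nsolr.core-prefix=jte\n' \
  "$(zk_connect_string 2181 192.168.21.71 192.168.21.72 192.168.21.73)"
```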

By following these steps, you can seamlessly integrate Martini with SolrCloud for reliable and scalable search capabilities.