Configuring SolrCloud
Once the ZooKeeper ensemble is set up, you can install and configure SolrCloud. This page shows how to configure three instances of SolrCloud and connect them to the ZooKeeper ensemble. Solr ships with an embedded ZooKeeper; however, using it in production systems is not recommended, as it defeats the purpose of redundancy.
Assumptions

- This guide uses Solr 8.5.2.
- Universal configurations are stored in a shared directory mounted at `/datastore/apps/solr/configs`.
- The IP addresses of the hosts in the ZooKeeper ensemble are:
  - 192.168.21.71
  - 192.168.21.72
  - 192.168.21.73
- Three Solr servers will be configured for the SolrCloud cluster, namely `solr1`, `solr2`, and `solr3`:
  - `solr1`
    - Home directory: `/datastore/apps/solr/instances/solr1`
    - IP address: 192.168.21.74
  - `solr2`
    - Home directory: `/datastore/apps/solr/instances/solr2`
    - IP address: 192.168.21.75
  - `solr3`
    - Home directory: `/datastore/apps/solr/instances/solr3`
    - IP address: 192.168.21.76
Procedure
1. Download Solr's portable installer (`solr-8.5.2.tgz` in this case), extract it, and copy the extracted directory to each Solr instance's home folder.

   ```
   cd /datastore/apps/solr

   # Download, extract, and rename the installer
   wget https://downloads.apache.org/lucene/solr/8.5.2/solr-8.5.2.tgz
   tar xzf solr-8.5.2.tgz
   mv solr-8.5.2 solr

   # Copy the directory to each instance's home folder
   cp -r solr instances/solr1/
   cp -r solr instances/solr2/
   cp -r solr instances/solr3/
   ```
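As an optional sanity check, you can confirm from the shell that each instance home received a working copy. This is a sketch: the `check_instances` helper is hypothetical, and the paths are the assumed instance homes from the Assumptions section.

```shell
# Hypothetical helper: verify that every instance home contains an
# executable Solr launcher script after the copy step above.
check_instances() {
  base=${1:-/datastore/apps/solr/instances}   # assumed instance root
  for i in 1 2 3; do
    if [ -x "$base/solr$i/solr/bin/solr" ]; then
      echo "solr$i: ok"
    else
      echo "solr$i: missing bin/solr"
    fi
  done
}

check_instances
```

If any instance reports a missing launcher, re-run the `cp -r` commands for that instance before continuing.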
2. Copy Martini's Solr core configurations to Solr's global configuration folder.

   ```
   cp -r <martini-home>/solr/cores /datastore/apps/solr/configs
   ```

   The directory structure under `/datastore/apps/solr` should now look similar to the following (for the sake of brevity, many directories have been left out):

   ```
   /datastore/apps/solr
   ├── configs
   │   └── cores
   │       ├── invoke-monitor
   │       │   └── conf
   │       │       ├── schema.xml
   │       │       └── solrconfig.xml
   │       └── tracker
   │           └── conf
   │               ├── schema.xml
   │               └── solrconfig.xml
   └── instances
       ├── solr1
       │   └── solr
       ├── solr2
       │   └── solr
       └── solr3
           └── solr
   ```
3. Upload Martini's Solr core configuration files (`schema.xml` and `solrconfig.xml`) to ZooKeeper. You only need to upload them through one ZooKeeper host; ZooKeeper automatically replicates the configurations to the other servers in the ensemble. Don't forget to perform the same operations for the Solr cores of any other packages.

   1. Navigate to any Solr server's `server/scripts/cloud-scripts` directory, which contains the `zkcli.sh` script.

      ```
      cd /datastore/apps/solr/instances/solr1/solr/server/scripts/cloud-scripts
      ```

   2. Use the commands below to upload the configurations for the `tracker` and `invoke_monitor` Solr cores.

      ```
      ./zkcli.sh -zkhost 192.168.21.71:2181 -cmd upconfig -confname tracker -confdir /datastore/apps/solr/configs/cores/tracker/conf
      ./zkcli.sh -zkhost 192.168.21.71:2181 -cmd upconfig -confname invoke_monitor -confdir /datastore/apps/solr/configs/cores/invoke-monitor/conf
      ```

      The `-confname` arguments are the names under which the configurations are stored in ZooKeeper. You will use these configuration names later when creating the Solr collections Martini will be using.

      > **The `zkcli.sh` script**
      > Learn more about Solr's `zkcli.sh` script by reading Solr's page on Command Line Utilities.
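If other packages add more cores, the per-core `upconfig` commands follow a regular pattern that a small loop can generate. This is a sketch: the `print_upconfig_cmds` helper is hypothetical, and the ZooKeeper address and configuration paths are the assumed values from this guide. It only prints the commands so you can review them before running them from `server/scripts/cloud-scripts`.

```shell
# Hypothetical helper: print one upconfig command per core directory.
# The dash-to-underscore rename mirrors the invoke-monitor directory /
# invoke_monitor config-name pairing used above.
print_upconfig_cmds() {
  zkhost=${1:-192.168.21.71:2181}   # assumed ZooKeeper host
  for core in tracker invoke-monitor; do
    confname=$(echo "$core" | tr - _)   # invoke-monitor -> invoke_monitor
    echo "./zkcli.sh -zkhost $zkhost -cmd upconfig" \
         "-confname $confname" \
         "-confdir /datastore/apps/solr/configs/cores/$core/conf"
  done
}

print_upconfig_cmds
```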
4. Start the Solr servers in SolrCloud mode.

   To do this, set the value of `ZK_HOST` in each server's `solr.in.sh` file to the comma-separated list of all ZooKeeper host addresses in the cluster:

   ```
   ZK_HOST="host1:2181,host2:2181,host3:2181"
   ```

   By default, ZooKeeper listens for client connections on port 2181.
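With the ensemble addresses assumed in this guide, the entry would look like the following (substitute your own ZooKeeper addresses):

```shell
# In each server's bin/solr.in.sh — the three ZooKeeper hosts assumed
# in this guide, each on ZooKeeper's default client port 2181
ZK_HOST="192.168.21.71:2181,192.168.21.72:2181,192.168.21.73:2181"
```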
   For a single Solr instance per host, execute the following command on every Solr server, where `<host-address>` is the IP address of the server running the instance:

   ```
   <solr-server-home>/bin/solr start -Dhost=<host-address>
   ```

   For example, the command for `solr1` would be:

   ```
   /datastore/apps/solr/instances/solr1/solr/bin/solr start -Dhost=192.168.21.74
   ```

   For multiple Solr instances per host, execute one command per instance, giving each instance its own port. For example, to run all three instances on a single host:

   ```
   /datastore/apps/solr/instances/solr1/solr/bin/solr start -p 8983 -Dhost=192.168.21.74
   /datastore/apps/solr/instances/solr2/solr/bin/solr start -p 8984 -Dhost=192.168.21.74
   /datastore/apps/solr/instances/solr3/solr/bin/solr start -p 8985 -Dhost=192.168.21.74
   ```

   Solr should now start and connect to your ZooKeeper ensemble. To check whether Solr has started in SolrCloud mode, open the Solr Admin UI in your browser (at `http://<solr-ip-address>:8983`) and see if the Cloud tab appears, as seen in the screenshot below:
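For the multi-instance case, the port assignment follows a simple pattern that can be generated rather than typed. A sketch that prints the start command for each instance (the `print_start_cmds` helper is hypothetical; the host IP shown is solr1's assumed address, and the instance paths are the assumed homes from this guide):

```shell
# Hypothetical helper: print a start command per instance, assigning
# consecutive ports from Solr's default, 8983.
print_start_cmds() {
  host=${1:-192.168.21.74}   # assumed host IP; replace with your own
  port=8983
  for i in 1 2 3; do
    echo "/datastore/apps/solr/instances/solr$i/solr/bin/solr start -p $port -Dhost=$host"
    port=$((port + 1))
  done
}

print_start_cmds
```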
5. Finally, create the collections needed by Martini (`invoke_monitor` and `tracker`) using the Solr Collections API's `CREATE` action. You may send the request to any configured Solr server. Here's an example of how to do this:

   Example Request

   ```
   curl -X GET \
     'http://192.168.21.74:8983/solr/admin/collections?action=CREATE&name=jte_invoke_monitor&numShards=3&replicationFactor=3&maxShardsPerNode=3&collection.configName=invoke_monitor&wt=json'
   ```

   Example Response

   ```
   {
     "responseHeader": {
       "status": 0,
       "QTime": 12825
     },
     "success": {
       "192.168.21.76:8983_solr": {
         "responseHeader": {
           "status": 0,
           "QTime": 3286
         },
         "core": "jte_invoke_monitor_shard3_replica3"
       },
       "192.168.21.75:8983_solr": {
         "responseHeader": {
           "status": 0,
           "QTime": 3014
         },
         "core": "jte_invoke_monitor_shard2_replica2"
       },
       "192.168.21.74:8983_solr": {
         "responseHeader": {
           "status": 0,
           "QTime": 3431
         },
         "core": "jte_invoke_monitor_shard3_replica1"
       }
     }
   }
   ```

   Example Request

   ```
   curl -X GET \
     'http://192.168.21.74:8983/solr/admin/collections?action=CREATE&name=jte_tracker&numShards=3&replicationFactor=3&maxShardsPerNode=3&collection.configName=tracker&wt=json'
   ```

   Example Response

   ```
   {
     "responseHeader": {
       "status": 0,
       "QTime": 14724
     },
     "success": {
       "192.168.21.76:8983_solr": {
         "responseHeader": {
           "status": 0,
           "QTime": 5468
         },
         "core": "jte_tracker_shard3_replica1"
       },
       "192.168.21.75:8983_solr": {
         "responseHeader": {
           "status": 0,
           "QTime": 4846
         },
         "core": "jte_tracker_shard1_replica3"
       },
       "192.168.21.74:8983_solr": {
         "responseHeader": {
           "status": 0,
           "QTime": 4989
         },
         "core": "jte_tracker_shard2_replica2"
       }
     }
   }
   ```
   The requests above pass in several query parameters. It is important to set the `name` and `collection.configName` parameters to their pre-defined values; all other parameters can vary depending on your needs.

   - `name`

     The name of the collection. For prefixed collection names, the format is `<prefix>_<core_name>`.

     **Create unique collection names by using prefixes**

     It is possible to connect multiple Martini instances to a single SolrCloud cluster. In that scenario, identical collection names result in shared Solr collections, meaning the Tracker and Monitor data of different Martini instances will all reside in the same Solr indexes. To prevent this, it is recommended to use prefixes for your collections. The example above does this: the collection names are prefixed with `jte` (`jte_tracker` and `jte_invoke_monitor`).

   - `replicationFactor`

     The number of replicas SolrCloud will create for each shard.

   - `maxShardsPerNode`

     The maximum number of shard replicas SolrCloud may place on a single node.

   - `collection.configName`

     The name of the ZooKeeper configuration to use for this collection. This should be the value of the `-confname` parameter defined earlier in step #3.

   - `wt`

     The format of the response you want to receive (`json` in the examples above).
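You can also verify the collections from the shell rather than the Admin UI, using the Collections API's `LIST` action. This is a sketch: the `extract_collections` helper is hypothetical, the node address is solr1's assumed IP, and `jte_` is the example prefix used above.

```shell
# Hypothetical helper: pull the prefixed collection names out of the
# JSON returned by the LIST action.
extract_collections() {
  grep -o '"jte_[a-z_]*"' | tr -d '"'
}

# Query any node; both jte_tracker and jte_invoke_monitor should appear.
curl -s --connect-timeout 2 \
  'http://192.168.21.74:8983/solr/admin/collections?action=LIST&wt=json' \
  | extract_collections || true
```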
Once done, open the Solr Admin UI of any of the Solr servers in your browser. Click the Cloud tab, then click Graph. The Graph page shows a graphical representation of how your Solr collections are mapped and distributed across your network. It should look similar to the screenshot below:

Accessing the same page on the other configured Solr servers should yield the same graph.

At this point, your Solr cluster is ready. You can now proceed to set up Martini so that it can work with SolrCloud.