diff --git a/solr/content.md b/solr/content.md index 6a2220d5e..5eb2d5b1f 100644 --- a/solr/content.md +++ b/solr/content.md @@ -10,6 +10,8 @@ Learn more on [Apache Solr homepage](http://lucene.apache.org/solr/) and in the # How to use this Docker image +## Run Solr and index example data + To run a single Solr server: ```console @@ -26,46 +28,159 @@ $ docker exec -it --user=solr my_solr bin/solr create_core -c gettingstarted In the web UI if you click on "Core Admin" you should now see the "gettingstarted" core. -If you want to load some example data: +If you want to load some of the example data that is included in the container: ```console $ docker exec -it --user=solr my_solr bin/post -c gettingstarted example/exampledocs/manufacturers.xml ``` -In the UI, find the "Core selector" popup menu and select the "gettingstarted" core, then select the "Query" menu item. This gives you a default search for "*:*" which returns all docs. Hit the "Execute Query" button, and you should see a few docs with data. Congratulations! +In the UI, find the "Core selector" popup menu and select the "gettingstarted" core, then select the "Query" menu item. This gives you a default search for `*:*` which returns all docs. Hit the "Execute Query" button, and you should see a few docs with data. Congratulations! 
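+
+You can also run the same `*:*` query from outside the container. As a sketch, assuming port 8983 is published to localhost as in the `docker run` command above:
+
+```console
+$ curl 'http://localhost:8983/solr/gettingstarted/select?q=*:*&wt=json&indent=true'
+```
+
+which returns the matching documents as JSON.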
+
+## Single-command demo
+
+For convenience, there is a single command that starts Solr, creates a collection called "demo", and loads sample data into it:
+
+```console
+$ docker run --name solr_demo -d -P solr solr-demo
+```
+
+## Loading your own data
+
+If you want to load your own data, you'll have to make it available to the container, for example by copying it into the container:
+
+```console
+$ docker cp $HOME/mydata/mydata.xml my_solr:/opt/solr/mydata.xml
+$ docker exec -it --user=solr my_solr bin/post -c gettingstarted mydata.xml
+```
+
+or by using Docker host volumes:
+
+```console
+$ docker run --name my_solr -d -p 8983:8983 -t -v $HOME/mydata:/opt/solr/mydata solr
+$ docker exec -it --user=solr my_solr bin/solr create_core -c gettingstarted
+$ docker exec -it --user=solr my_solr bin/post -c gettingstarted mydata/mydata.xml
+```
 
 To learn more about Solr, see the [Apache Solr Reference Guide](https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide).
 
+## Creating Cores
+
+In addition to the `docker exec` method explained above, you can create a core automatically at start time, in several ways.
+
+If you run:
+
+```console
+$ docker run -d -P solr solr-create -c mycore
+```
+
+the container will:
+
+- run Solr in the background, on the loopback interface
+- wait for it to start
+- run the "solr create" command with the arguments you passed
+- stop the background Solr
+- start Solr in the foreground
+
+You can combine this with mounted volumes to pass in core configuration from your host:
+
+```console
+$ docker run -d -P -v $PWD/myconfig:/myconfig solr solr-create -c mycore -d /myconfig
+```
+
+When using the `solr-create` command, Solr logs to the standard docker log (inspect it with `docker logs`),
+and the core creation happens in the background, logging to `/opt/docker-solr/init.log`.
+
+This first way closely mirrors the manual core creation steps and uses Solr's own tools to create the core,
+so it should be reliable.
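+
+The flow above can be sketched roughly as the following shell logic (an illustrative sketch only, not the shipped script; the actual `docker-entrypoint.sh` in the image is more involved):
+
+```bash
+# illustrative sketch of the solr-create flow, not the shipped script
+bin/solr start              # start Solr in the background
+# ...wait here until Solr responds...
+bin/solr create -c mycore   # run "solr create" with the arguments you passed
+bin/solr stop               # stop the background Solr
+exec bin/solr start -f      # finally, run Solr in the foreground
+```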
+
+The second way of creating a core at start time is to use the `solr-precreate` command. This creates the core
+in the filesystem before running Solr. You pass it the core name and, optionally, the directory to copy the
+config from (this defaults to Solr's built-in "basic_configs"). For example:
+
+```console
+$ docker run -d -P solr solr-precreate mycore
+$ docker run -d -P -v $PWD/myconfig:/myconfig solr solr-precreate mycore /myconfig
+```
+
+This method stores the core in an intermediate subdirectory called "mycores". This allows you to use mounted
+volumes:
+
+```console
+$ mkdir mycores
+$ sudo chown 8983:8983 mycores
+$ docker run -d -P -v $PWD/mycores:/opt/solr/server/solr/mycores solr solr-precreate mycore
+```
+
+This second way is quicker, and easier to monitor because it logs to the docker log, and it fails immediately if something is wrong.
+But because it makes assumptions about Solr's "basic_configs", future upstream changes could break it.
+
+The third way of creating a core at startup is to use the image extension mechanism explained in the next section.
+
+## Using Docker Compose
+
+With Docker Compose you can create a Solr container with the index stored in a named data volume.
+Create a `docker-compose.yml` like:
+
+```yaml
+version: '2'
+services:
+  solr:
+    image: solr
+    ports:
+      - "8983:8983"
+    volumes:
+      - data:/opt/solr/server/solr/mycores
+    entrypoint:
+      - docker-entrypoint.sh
+      - solr-precreate
+      - mycore
+volumes:
+  data:
+```
+
+and just run `docker-compose up`.
+
+## Extending the image
+
+The docker-solr image has an extension mechanism. At run time, before starting Solr, the container executes the scripts
+in the `/docker-entrypoint-initdb.d/` directory. You can add your own scripts there, either by using mounted volumes
+or by using a custom Dockerfile. These scripts can, for example, copy a core directory with pre-loaded data for continuous
+integration testing, or modify the Solr configuration.
+
+Here is a simple example.
With a `set-heap.sh` script like:
+
+```bash
+#!/bin/bash
+set -e
+# keep a backup of the original, then raise the heap setting in place
+cp /opt/solr/bin/solr.in.sh /opt/solr/bin/solr.in.sh.orig
+sed -i -e 's/SOLR_HEAP=".*"/SOLR_HEAP="1024m"/' /opt/solr/bin/solr.in.sh
+grep '^SOLR_HEAP=' /opt/solr/bin/solr.in.sh
+```
+
+you can run:
+
+```console
+$ docker run --name solr_heap1 -d -P -v $PWD/docs/set-heap.sh:/docker-entrypoint-initdb.d/set-heap.sh solr
+$ sleep 5
+$ docker logs solr_heap1 | head
+/opt/docker-solr/scripts/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/set-heap.sh
+SOLR_HEAP="1024m"
+
+
+Starting Solr on port 8983 from /opt/solr/server
+```
+
+When using this extension mechanism, it can be useful to see the shell commands being executed by the `docker-entrypoint.sh`
+script in the docker log. To do that, set an environment variable using Docker's `-e VERBOSE=yes`.
+
 ## Distributed Solr
 
-You can also run a distributed Solr configuration, with Solr nodes in separate containers, sharing a single ZooKeeper server:
+You can also run a distributed Solr configuration.
 
-Run ZooKeeper, and define a name so we can link to it:
+The recommended and most flexible way to do that is to use Docker networking.
+See the [Can I run ZooKeeper and Solr clusters under Docker](https://github.com/docker-solr/docker-solr/blob/master/Docker-FAQ.md#can-i-run-zookeeper-and-solr-clusters-under-docker) FAQ,
+and [this example](docs/docker-networking.md).
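+
+As a minimal sketch of the Docker networking approach (assuming the official `zookeeper` image; see the linked example for a complete walkthrough):
+
+```console
+$ docker network create solr-net
+$ docker run --name zk --network solr-net -d zookeeper
+$ docker run --name solr1 --network solr-net -d -p 8983:8983 solr bash -c '/opt/solr/bin/solr start -f -z zk:2181'
+$ docker run --name solr2 --network solr-net -d -p 8984:8983 solr bash -c '/opt/solr/bin/solr start -f -z zk:2181'
+```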
-```console
-$ docker run --name zookeeper -d -p 2181:2181 -p 2888:2888 -p 3888:3888 jplock/zookeeper
-```
-
-Run two Solr nodes, linked to the zookeeper container:
-
-```console
-$ docker run --name solr1 --link zookeeper:ZK -d -p 8983:8983 \
-    solr \
-    bash -c '/opt/solr/bin/solr start -f -z $ZK_PORT_2181_TCP_ADDR:$ZK_PORT_2181_TCP_PORT'
-
-$ docker run --name solr2 --link zookeeper:ZK -d -p 8984:8983 \
-    solr \
-    bash -c '/opt/solr/bin/solr start -f -z $ZK_PORT_2181_TCP_ADDR:$ZK_PORT_2181_TCP_PORT'
-```
-
-Create a collection:
-
-```console
-$ docker exec -i -t solr1 /opt/solr/bin/solr create_collection \
-    -c collection1 -shards 2 -p 8983
-```
-
-Then go to `http://localhost:8983/solr/#/~cloud` (adjust the hostname for your docker host) to see the two shards and Solr nodes.
+You can also use legacy links, see the [Can I run ZooKeeper and Solr with Docker Links](https://github.com/docker-solr/docker-solr/blob/master/Docker-FAQ.md#can-i-run-zookeeper-and-solr-with-docker-links) FAQ.
 
 # About this repository