Solr Integration

Anserini's IndexCollection generates Lucene index files that can be loaded into Solr. Solr is a search engine built on Lucene that provides useful tools, such as an interface for running queries against Lucene (Anserini) indices.

Docker

To integrate Anserini and Solr, we'll be using Docker. Make sure it is set up on your machine before continuing.

Additionally, ensure that the Docker SDK for Python is installed: pip install docker
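
A quick way to confirm both prerequisites is to ask the Docker SDK to ping the local daemon. The snippet below is a minimal sketch, not part of this repo; the function name docker_sdk_status is made up for illustration.

```python
def docker_sdk_status():
    """Return 'missing', 'unreachable', or 'ok' for the local Docker setup."""
    try:
        import docker  # provided by `pip install docker`
    except ImportError:
        return "missing"
    try:
        client = docker.from_env()  # picks up DOCKER_HOST and friends from the environment
        client.ping()               # round-trip to the Docker daemon
    except Exception:
        # The SDK imported, but the daemon could not be reached.
        return "unreachable"
    return "ok"


if __name__ == "__main__":
    print(docker_sdk_status())
```

If this prints anything other than ok, fix the Docker installation before moving on.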

Overview

Loading a Lucene index into Solr is fairly straightforward as Solr is built on top of Lucene. In a nutshell, the following needs to happen:

Create the Solr core (index) that will hold our data.
Copy the Lucene index files into the <my_core>/data/index/ directory of the Solr server.
Update the schema (<my_core>/conf/managed-schema) file to match the fields in our index.
Reload the core.

A number of scripts automate this process and load the core17 and mb11 collection indices into Solr.
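
For orientation, the manual steps listed above could look roughly like the following. This is a hedged sketch rather than the code in run.py: the helper names (core_admin_url, copy_index) are invented here, the Solr address assumes the default port 8983, and the schema update is left out because it depends on the fields in each index. The URLs target Solr's CoreAdmin API (the CREATE and RELOAD actions).

```python
import shutil
from pathlib import Path
from urllib.parse import urlencode

SOLR_URL = "http://localhost:8983/solr"  # assumption: Solr on its default port


def core_admin_url(action, **params):
    """Build a Solr CoreAdmin API URL, e.g. for CREATE or RELOAD."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_URL}/admin/cores?{query}"


def copy_index(lucene_index_dir, core_dir):
    """Copy Lucene index files into <core_dir>/data/index/; return the copied names."""
    target = Path(core_dir) / "data" / "index"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in sorted(Path(lucene_index_dir).iterdir()):
        if f.is_file():
            shutil.copy2(f, target / f.name)
            copied.append(f.name)
    return copied


if __name__ == "__main__":
    # Step 1: create the core (the configSet name is a placeholder).
    print(core_admin_url("CREATE", name="core17", configSet="core17"))
    # Step 2: copy_index("<lucene_index>", "<my_core>") with real paths.
    # Step 3: edit <my_core>/conf/managed-schema by hand (not shown).
    # Step 4: reload the core so Solr picks up the new files.
    print(core_admin_url("RELOAD", core="core17"))
```

The URLs are only printed here; requesting them (for example with curl or urllib.request) is what actually creates or reloads the core.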

Instructions

Build Anserini and copy the fatjar artifact (important) into the root directory of the SolrAnserini repo, renaming it to anserini.jar.

Edit the config.json file to point to the index and config locations on the host machine.
Run the Python script to build the Docker image with index and config volumes mounted.
- python run.py (optionally specifying --config <config_location>)
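
To illustrate what "index and config volumes mounted" means in Docker SDK terms, here is a small sketch. It is not the actual run.py: the function names, the container-side paths (/indexes, /configsets), and the read-only mode are assumptions chosen for the example. Only containers.run and its volumes/ports arguments come from the Docker SDK for Python.

```python
from pathlib import Path


def build_volumes(index_dir, config_dir,
                  container_index="/indexes", container_config="/configsets"):
    """Map host directories to container paths in the Docker SDK's volumes format."""
    return {
        str(Path(index_dir).resolve()): {"bind": container_index, "mode": "ro"},
        str(Path(config_dir).resolve()): {"bind": container_config, "mode": "ro"},
    }


def start_container(image, volumes):
    """Start a detached container with the given mounts, exposing Solr's port."""
    import docker  # imported lazily so build_volumes works without the SDK

    client = docker.from_env()
    return client.containers.run(
        image,
        detach=True,
        ports={"8983/tcp": 8983},  # container port -> host port
        volumes=volumes,
    )


if __name__ == "__main__":
    # Placeholder host paths; in practice these would come from config.json.
    print(build_volumes("indexes", "configsets"))
```

Keeping the path mapping in its own function makes it easy to inspect the mounts before any container is started.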
