Solr Basics
Solr is an open-source search tool written in Java and built on the Apache Lucene library. It is meant to run as a standalone web application, and it exposes REST-like endpoints that can be extended (you can set up your own endpoints).
Solr can also be embedded in software.
Solr is not a database; it is an index of data, i.e. a list of information (data, metadata, fields, etc.) with references to where the data came from.
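To make the idea of an index concrete, here is a minimal Python sketch of an inverted index: a mapping from each term to the documents it came from. It only illustrates the concept and is not how Lucene stores data internally; the sample documents and the build_index helper are made up for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


# Hypothetical documents, keyed by an id that points back to the source.
docs = {
    "doc1": "Mona Lisa by Leonardo",
    "doc2": "The Last Supper by Leonardo",
}

index = build_index(docs)
print(sorted(index["leonardo"]))
print(sorted(index["mona"]))
```

A search then becomes a lookup of the query terms in the index, followed by fetching the referenced documents.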
For a quick start, check out this good five-minute intro tutorial.
Installing and running Solr locally
For development I start by downloading Solr and running a local instance. You can download Solr from the Solr website. Make sure you have Java installed too.
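Once you have installed it, navigate to the directory that contains start.jar (the example directory in Solr 4.x) and run the following, pointing solr.solr.home at your core: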
java -Dsolr.solr.home=path/to/your/core -jar start.jar
Then in a browser go to:
http://localhost:8983/solr/#/
Running Queries
For pagination we set the start and rows parameters. start is the zero-based offset of the first record to return, and rows is the number of records per page, so page n (counting from 0) begins at start = n × rows:
curl -X GET "http://localhost:8983/solr/paintings/select?q=*:*&start=0&rows=10&wt=json&indent=true"
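A minimal Python sketch of the paging arithmetic: Solr's start parameter is a zero-based offset into the result set, so a page number has to be multiplied by rows. The page_url helper and the default of 10 rows per page are assumptions for illustration; the base URL is the local paintings core used in the curl example.

```python
from urllib.parse import urlencode

# Assumed base URL: the local "paintings" core from the curl example.
BASE_URL = "http://localhost:8983/solr/paintings/select"


def page_url(page, rows=10, query="*:*"):
    """Build a select URL for a zero-based page number."""
    params = {
        "q": query,
        "start": page * rows,  # offset of the first record on this page
        "rows": rows,          # number of records per page
        "wt": "json",
        "indent": "true",
    }
    return BASE_URL + "?" + urlencode(params)


print(page_url(0))  # first page, start=0
print(page_url(2))  # third page, start=20
```

Note that urlencode percent-encodes the query, so *:* appears as %2A%3A%2A in the printed URL; Solr decodes it back.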
Add an example entity:
curl 'http://localhost:8983/solr/paintings/update?commit=true&wt=json' -H 'Content-type:application/json' -d ' [ { "uri" : "http://en.wikipedia.org/wiki/Mona_Lisa", "title" : "Mona Lisa", "museum" : "unknown" } ]'
Add the same entity again with more fields:
curl 'http://localhost:8983/solr/paintings/update?commit=true&wt=json' -H 'Content-type:application/json' -d '[ { "uri" : "http://en.wikipedia.org/wiki/Mona_Lisa", "title" : "Mona Lisa", "artist" : "Leonardo Da Vinci", "museum" : "Louvre" } ]'
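The same update can be prepared from Python. This is only a sketch using the standard library: it builds the request without sending it, so it runs without Solr. The build_add_request helper is hypothetical, and the URL is the local paintings core from the curl commands. If uri is the unique key of that core (an assumption about the schema), posting the second document replaces the first rather than creating a duplicate.

```python
import json
from urllib.request import Request

# Assumed URL: the local "paintings" core from the curl examples.
UPDATE_URL = "http://localhost:8983/solr/paintings/update?commit=true&wt=json"


def build_add_request(docs):
    """Prepare (but do not send) a JSON POST that adds documents."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        UPDATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_add_request([
    {
        "uri": "http://en.wikipedia.org/wiki/Mona_Lisa",
        "title": "Mona Lisa",
        "artist": "Leonardo Da Vinci",
        "museum": "Louvre",
    }
])
print(request.get_method(), request.full_url)

# To actually send it against a running Solr instance:
#   from urllib.request import urlopen
#   print(urlopen(request).read().decode("utf-8"))
```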
Find out what is in the index:
curl 'http://localhost:8983/solr/paintings/select?q=*:*&wt=json'
List all the fields using the CSV output:
curl -X GET 'http://localhost:8983/solr/paintings/select?q=*:*&rows=0&wt=csv'
To post PDF documents:
curl -X POST 'http://localhost:8983/solr/pdfs_1/update/extract?extractFormat=text&literal.annotation=The+Wikipedia+Page+About+Apache+Lucene&commit=true' -F 'Lucene.pdf=@Lucene.pdf'
Spellcheck and autosuggest
Once data is indexed, build your spellcheck dictionary by requesting this URL (substitute your own email and API key):
curl -X GET 'http://localhost:8983/suggest?spellcheck.build=true&wt=json&email=<your-email>&api_key=<your-api-key>&core_name=DKFO'
Running a spellcheck:
http://localhost:8983/suggest?spellcheck.q=new+y&wt=json&spellcheck.build=true
A good article on Google-like spell checks is on opensolr.com.
To enable auto suggest and spellcheck, follow these steps on opensolr.
Queries with weighting
http://localhost:8983/select?q=text:moon
&fq=language:en-gb
&wt=json
&start=0
&rows=1
&mlt=on
&mlt.qf=language:en-gb
&mlt.fl=language,title,subject
&mlt.mindf=1
&mlt.mintf=1
&mlt.minwl=3
&mlt.count
&fl=language,url,title,introSummary,themeImage,score
&sort=score+desc
&omitHeader=true
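Long query strings like this are easier to maintain when built from a parameter list. The Python sketch below assembles the same query with the standard library. Two values are assumptions rather than the author's: mlt.count is left blank in the original, so 5 (Solr's documented default) is used here, and mlt.qf is shown with DisMax-style field boosts (title^2 subject), which is my understanding of how that parameter expresses weighting; the fields it names also appear in mlt.fl.

```python
from urllib.parse import urlencode

# Assumed endpoint, copied from the query above.
SELECT_URL = "http://localhost:8983/select"

params = [
    ("q", "text:moon"),
    ("fq", "language:en-gb"),
    ("start", 0),
    ("rows", 1),
    ("mlt", "on"),
    ("mlt.fl", "language,title,subject"),
    # Assumption: weight matches in title twice as heavily as in subject.
    ("mlt.qf", "title^2 subject"),
    ("mlt.mindf", 1),
    ("mlt.mintf", 1),
    ("mlt.minwl", 3),
    # Assumption: the original leaves this blank; 5 is Solr's default.
    ("mlt.count", 5),
    ("fl", "language,url,title,introSummary,themeImage,score"),
    ("sort", "score desc"),
    ("wt", "json"),
    ("omitHeader", "true"),
]

# urlencode handles the escaping (spaces, ^, commas, colons).
url = SELECT_URL + "?" + urlencode(params)
print(url)
```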
Atomic updates
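You must include the attributes that identify the document, in this case "id", "title" and "url"; each field you want to change is wrapped in an object with a modifier such as "set". For example, this payload sets introSummary:
[{
  "id": "7645",
  "title": "Marks test",
  "introSummary": {
    "set": "test test test"
  }
}]
As a curl command: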
curl 'http://localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{ "id": "1422034025023", "title": { "set" : "Marks test change" } }]'
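Hand-writing the nested JSON is error-prone (a missing bracket is enough to make the request fail), so it can help to generate the payload. A minimal Python sketch, assuming "id" is the unique key; the atomic_set helper is hypothetical and only covers the "set" modifier used above.

```python
import json


def atomic_set(doc_id, **fields):
    """Build an atomic-update document that overwrites the given fields."""
    doc = {"id": doc_id}
    for name, value in fields.items():
        doc[name] = {"set": value}  # "set" replaces the stored value
    return doc


# Same change as the curl command: set a new title on one document.
payload = json.dumps([atomic_set("1422034025023", title="Marks test change")])
print(payload)
```

The printed string can be passed to curl with -d, or sent as the body of a POST to the update URL.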
To delete
Using curl to POST XML (the query *:* matches every document, so this empties the index):
curl -X POST 'http://localhost:8983/solr/update?commit=true' -H 'Content-Type: text/xml' --data-binary '<delete><query>*:*</query></delete>'
Using curl with JSON:
curl 'http://localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '{ "delete" : { "query" : "*:*" } }'
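Since *:* removes everything, it is often safer to delete a narrower set. The Python sketch below builds the JSON bodies for a delete by query and a delete by id, both of which the JSON update handler accepts. The helper names are hypothetical, and the id and the museum:unknown query are just sample values taken from earlier examples.

```python
import json


def delete_by_query(query):
    """JSON body that deletes every document matching the query."""
    return json.dumps({"delete": {"query": query}})


def delete_by_id(doc_id):
    """JSON body that deletes the single document with this unique key."""
    return json.dumps({"delete": {"id": doc_id}})


print(delete_by_query("*:*"))             # everything, as in the curl example
print(delete_by_query("museum:unknown"))  # only documents with this value
print(delete_by_id("1422034025023"))      # one document
```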
Committing a change
curl -X POST 'http://localhost:8983/solr/update' --data '<commit/>' -H 'Content-type:text/xml; charset=utf-8'