Retrieve and Rank (R&R), in case you haven’t already heard about it, is IBM Watson’s new web service component for information retrieval and question answering. My colleague Chris Madison has summarised how it works at a high level [here][1].
R&R is based on the Apache SOLR search engine, with a machine-learning result-ranking plugin that learns which answers are most relevant to an input query and presents them in the learnt “relevance” order.
Some of my partners have found that getting documents in and out of Retrieve and Rank is a little cumbersome using cURL and JSON files from the command line. Here I want to demonstrate a much easier way of managing your SOLR documents with [solrpy][2], a Python wrapper around Apache SOLR. Since R&R and SOLR are API compatible (until you start using and training the custom ranker), it is perfectly fine to use solrpy with R&R, given a few special tweaks.
## Getting Started
**You will need**

An R&R instance with a cluster and collection already configured. I’m using a schema which has three fields: id, title and text.
Firstly you’ll want to install the library. Normally you could do this with pip. Unfortunately, I had to make a small change to get the library to work with Retrieve and Rank, so you’ll need to install it from my GitHub repo:
In Python, try running the following (I am using the interactive Python shell [IDLE][3] for this example):
<pre>>>> import solr
>>> s = solr.Solr("https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/<CLUSTER_ID>/solr/<COLLECTION_NAME>", http_user="<USERNAME>", http_pass="<PASSWORD>")
>>> s.search("hello world")
<em><strong><solr.core.Response object at 0x7ff77f91d7d0></strong></em></pre>
If this worked then you will see something like _**<solr.core.Response object at 0x7ff77f91d7d0>**_ as output here. If you get an error response, try checking that you have substituted in valid values for <CLUSTER\_ID>, <COLLECTION\_NAME>, <USERNAME> and <PASSWORD>.
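Since a leftover placeholder is the most likely cause of an error at this step, it can help to assemble the URL in code and refuse values that still look like placeholders. This is only a sketch: `build_collection_url` and `BASE_URL` are hypothetical names of mine, not part of solrpy, and the base URL is simply copied from the example above.

```python
# Hypothetical helper, not part of solrpy. The base URL is taken from the
# connection example above.
BASE_URL = "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1"


def build_collection_url(cluster_id, collection_name):
    """Return the SOLR collection URL for an R&R cluster and collection."""
    for name, value in (("cluster_id", cluster_id),
                        ("collection_name", collection_name)):
        # Reject empty values and placeholders such as "<CLUSTER_ID>"
        # that were never substituted.
        if not value or "<" in value or ">" in value:
            raise ValueError(
                "%s looks like an unfilled placeholder: %r" % (name, value))
    return "%s/solr_clusters/%s/solr/%s" % (BASE_URL, cluster_id, collection_name)
```

The returned string can then be passed as the first argument to `solr.Solr(...)`, alongside `http_user` and `http_pass` as before.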
From this point onwards things get very easy. solrpy has simple functions for creating, removing and searching items in the SOLR index.
To add a document you can use the code below:
<pre>>>> s.add({"title" : "test", "text" : "this is a test", "id" : 1})</pre>
You can use SOLR queries too (but, importantly, note that this does not use the Retrieve and Rank rankers; it only gives you access to the SOLR rankers):
<pre>>>> r = s.select("test")
>>> r.numFound
<strong>1L</strong>
>>> r.results
<strong>[{u'_version_': 1518020997236654080L, u'text': [u'this is a test'], u'score': 0.0, u'id': u'1', u'title': [u'test']}]</strong>
</pre>
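If you have more than one document, the add and select calls above can be wrapped in a couple of small functions. The sketch below is illustrative only: `index_documents` and `flatten_results` are hypothetical helpers of mine, not solrpy functions. They assume the three-field schema (id, title, text) and the list-valued `title` and `text` fields seen in the output above, and `conn` can be any object with an `add(doc)` method like the `s` connection used earlier.

```python
def index_documents(conn, rows, start_id=1):
    """Add (title, text) pairs to the index, numbering ids from start_id.

    `conn` only needs an `add(doc)` method, as used with `s.add(...)` above.
    Returns the list of document dicts that were sent.
    """
    docs = []
    for offset, (title, text) in enumerate(rows):
        doc = {"id": start_id + offset, "title": title, "text": text}
        conn.add(doc)
        docs.append(doc)
    return docs


def flatten_results(results):
    """Turn result dicts into simple (id, title, text) tuples.

    Assumes `title` and `text` come back as lists of strings, as in the
    output shown above; missing fields become empty strings.
    """
    flat = []
    for hit in results:
        title = " ".join(hit.get("title", []))
        text = " ".join(hit.get("text", []))
        flat.append((hit["id"], title, text))
    return flat
```

With the connection from earlier, you could call `index_documents(s, [("test", "this is a test")])` and then pass `s.select("test").results` to `flatten_results` to get something easier to print or loop over.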
## Querying Rankers
Provided you have [successfully trained a ranker][4] and have the ranker ID handy, you can also query your ranker directly from Python using solrpy.
<pre>>>> import solr
>>> s = solr.Solr("https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/<CLUSTER_ID>/solr/<COLLECTION_NAME>", http_user="<USERNAME>", http_pass="<PASSWORD>")