20 Jun 2012
A while ago I blogged on testing Solr with Maven on the synyx blog. In this post I will show you how to set up a similar project with Gradle that can start the Solr webapp and execute tests against your configuration.
Read more
16 Jun 2012
Sometimes when using Lucene you might want to retrieve all term values for a given field. Think of categories that you want to display as search links or in a filtering dropdown box. Indexing might look something like this:
Read more
07 Jun 2012
Berlin Buzzwords is an annual conference on search, store and scale technology. I had heard good things about it before and finally got convinced to go this year. The conference itself lasts two days, but there are additional events before and afterwards, so if you like you can spend a whole week there.
Read more
12 May 2012
Sometimes you need access to the content of documents, whether you want to analyze it, store it in a database or index it for searching. Different formats like Word documents, PDFs and HTML documents need different treatment. Apache Tika is a project that combines several open source projects for reading content from a multitude of file formats and makes the textual content as well as some metadata available through a uniform API. I will show two ways to leverage the power of Tika in your projects.
Read more
08 May 2012
I am working on a search solution that makes some of the content I am producing available through one search interface. One of the content stores is the blog you are reading right now, which, among other options, makes its content available using Atom.
Read more