Solr Stemming

There are four types of stemming strategies:

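  • Porter (reduction stemming): A transforming algorithm that reduces the inflected forms of a word, such as "runs" and "running", to a common root, e.g. "run". Because it works by stripping suffixes, irregular forms such as "ran" are not reduced to "run". Porter stemming must be performed both at insertion time and at query time, so that indexed terms and query terms are reduced to the same root.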
  • Lucene-Hunspell: Aims to provide features such as stemming, decompounding, spellchecking, normalization and term expansion by taking advantage of the lexical resources already created for, and widely used in, projects like OpenOffice. It is still an alpha version, but it supports an impressive list of languages.
  • Expansion stemming: Takes a root word and expands it to all of its various forms. It can be used either at insertion time or at query time. One way to approach this is by using the SynonymFilterFactory.
  • KStem: An alternative to Porter for developers looking for a less aggressive stemmer.
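
As a rough illustration of the reduction and expansion strategies above, the following schema.xml fragment sketches two field types. The field type names (text_porter, text_expanded), the choice of tokenizer and the synonyms file name are assumptions made for this example, not part of any particular Solr setup. The synonyms file is assumed to hold one line per word family, e.g. "run, runs, running, ran".

```xml
<!-- Reduction stemming: the same Porter filter runs at insertion and query time. -->
<fieldType name="text_porter" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Swap in solr.KStemFilterFactory here for a less aggressive stemmer. -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Expansion stemming: word forms are expanded at insertion time only. -->
<fieldType name="text_expanded" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- synonyms.txt is a hypothetical file name; expand="true" indexes every listed form. -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

In the first field type a single analyzer element applies to both insertion and querying, which is what reduction stemming needs. In the second, expansion happens on one side only, so a query for any listed form can match a document that contains another form from the same line of the synonyms file.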
