Wednesday, March 2, 2016

Facets with Sitecore Lucene search

With search engines, faceting lets users filter a result set so that it is more relevant to what they are searching for. A common example is an online clothing store: when searching for clothing, users would have facets on type (men's, women's or children's) and sizing (small, medium, large, etc.). These facets are great from a user's perspective because they filter out results that are not relevant (to use the clothing store example again, I would only be interested in men's clothing in my size).

With Sitecore search using Lucene, facets are simple to implement and make for a much better search experience for users.

In this example, both web page and PDF content are indexed by Lucene, so the facet is based on content type: either web page or document. Please note that if the facet value stored in the index contains spaces, you will need to read this post on facet phrases or sentences with spaces.

Computed field definition

A facet requires an indexed field, defined in the Lucene search index configuration XML, to facet on. In this case we will use a computed index field to store whether a given item in the index is a PDF document or a web page.
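
using System;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.ComputedFields;
using Sitecore.Data.Items;

namespace MyProject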
{
    public class IndexType : IComputedIndexField
    {
        /// <inheritdoc />
        public string FieldName { get; set; }
        /// <inheritdoc />
        public string ReturnType { get; set; }

        /// <inheritdoc />
        public object ComputeFieldValue(IIndexable indexable)
        {
            Item item = indexable as SitecoreIndexableItem;

            if (item != null) 
            {
                try
                {
                    if (item.Paths.IsMediaItem)
                    {
                        return "Doc";
                    }
                    else
                    {
                        return "Page";
                    }
                }
                catch (Exception)
                {
                    return null;
                }
            }
           
            return null; // Return null if nothing to index
        }
    }
}

<!-- Type for faceting -->
  <field fieldName="typefacet" storageType="yes" indexType="untokenized"
  patch:after="field[last()]">MyProject.IndexType, MyProject</field>

This is a simple computed field that checks whether the item is a media item. Notice that the index type is untokenized because we want the result stored as a single value.

Search code

Implementing the facet in your search code is relatively simple, especially if you are building your search query using predicate logic:

public static Expression<Func<SearchModel, bool>> GetSearchPredicate(string searchTerm, string type)
{
    var predicate = PredicateBuilder.False<SearchModel>(); // Start from false: OR-ing onto True<T>() would match every item
            
    // Search the whole phrase - LIKE
    predicate = predicate.Or(x => x.DispalyName.Like(searchTerm)).Boost(1.2f);
    predicate = predicate.Or(x => x.PageDescription.Like(searchTerm)).Boost(1.2f);

    // Search the whole phrase - CONTAINS
    predicate = predicate.Or(x => x.DispalyName.Contains(searchTerm)).Boost(2.0f);
    predicate = predicate.Or(x => x.PageDescription.Contains(searchTerm)).Boost(2.0f);

    // Search the individual words
    foreach (var t in searchTerm.Split(' '))
    {
        var tempTerm = t; // Local copy so each lambda captures its own term

        predicate = predicate.Or(x => x.DispalyName.Contains(tempTerm)).Boost(20);
        predicate = predicate.Or(x => x.PageDescription.Contains(tempTerm)).Boost(20);
    }

    // Only show items which are not excluded from search
    predicate = predicate.And(x => x.ExcludeFromSearch == false);

    // Type filtering
    predicate = predicate.And(GetFacetPredicate(type));

    return predicate;
}

public static Expression<Func<SearchModel, bool>> GetFacetPredicate(string type)
{
    var predicate = PredicateBuilder.True<SearchModel>(); // No type selected: match every item

    if (!string.IsNullOrEmpty(type))
    {
        // And, not Or: OR-ing onto True<T>() would leave the predicate always true
        predicate = predicate.And(x => x.Type == type);
    }

    return predicate;
}

/// <summary>
/// Search item mapped to Lucene index
/// </summary>
public class SearchModel
{
    [IndexField("_name")]
    public string ItemName { get; set; }

    [IndexField("_displayname")]
    public string DispalyName { get; set; }

    [IndexField("_templatename")]
    public string TemplateName { get; set; }

    [IndexField("urllink")]
    public string Url { get; set; }

    [IndexField("page_description")]
    public string PageDescription { get; set; }

    [IndexField("exclude_from_search")]
    public bool ExcludeFromSearch { get; set; }

    [IndexField("typefacet")]
    public string Type { get; set; }
}

var searchIndex = ContentSearchManager.GetIndex("MySearchIndex"); // Get the search index
var searchPredicate = GetSearchPredicate(searchTerm, "Page"); // Build the search predicate

using (var searchContext = searchIndex.CreateSearchContext()) // Get a context of the search index
{
    var searchResults = searchContext.GetQueryable<SearchModel>().Filter(searchPredicate); // Search the index for items which match the predicate
    var searchFacets = searchContext.GetQueryable<SearchModel>().Filter(searchPredicate).FacetOn(x => x.Type).GetFacets(); // Gets facets - useful when you get paged results
    
    // This will get all of the results, which is not recommended
    var fullResults = searchResults.GetResults();
                
    // This is better and will get paged results - page 1 with 10 results per page
    //var pagedResults = searchResults.Page(1, 10).GetResults();
}

As you can see, the predicate is built from the standard OR statements that search the actual content. We then combine this with a second predicate that selects items of a specific type. If no type is selected, all of the results that match the search phrase appear as normal.

If you had multiple facet selections (for example web page, PDF and Word document all selected), the type predicate builder would loop through each selected type and build an OR predicate. The results would then contain items that match the search term and any of the selected types.
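
The multi-select case described above could be sketched roughly as follows. This is only a sketch, not code from the original search page: the `FacetPredicates` class and `AnyOfTypes` method are hypothetical names, `SearchModel` is trimmed to the one property needed, and the OR chain is built with plain `System.Linq.Expressions` so the example runs without Sitecore. With the Sitecore `PredicateBuilder`, the same shape would be a `False<SearchModel>()` seed followed by one `Or` per selected type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Trimmed stand-in for the SearchModel in this post (only the facet field)
public class SearchModel
{
    public string Type { get; set; }
}

// Hypothetical helper, not part of Sitecore
public static class FacetPredicates
{
    public static Expression<Func<SearchModel, bool>> AnyOfTypes(IEnumerable<string> selectedTypes)
    {
        var item = Expression.Parameter(typeof(SearchModel), "x");
        var typeProperty = Expression.Property(item, nameof(SearchModel.Type));
        Expression body = null;

        foreach (var type in selectedTypes ?? Enumerable.Empty<string>())
        {
            // Builds: x.Type == "<selected type>"
            var matchesType = Expression.Equal(typeProperty, Expression.Constant(type, typeof(string)));

            // Chain each selected type with OR
            body = body == null ? matchesType : Expression.OrElse(body, matchesType);
        }

        // Nothing selected: behave like an unfiltered search
        return Expression.Lambda<Func<SearchModel, bool>>(body ?? Expression.Constant(true), item);
    }
}
```

Compiling the expression makes it easy to check against in-memory data: `FacetPredicates.AnyOfTypes(new[] { "Page", "Doc" }).Compile()` accepts items of either type, while an empty selection accepts everything.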

Getting facets with count for a given search

In the example above, there were only two possible outcomes for the facet (web page or PDF), so the front-end logic used a switch to display the facets. However, in cases where there could be any number of facets, it is useful to display them on the page with a count, much like the date facets on this blog (each year and month shows a count of articles).

// Assumes SearchModel also has a Category property mapped to the "categoryfacet" index field
var searchFacets = searchContext.GetQueryable<SearchModel>().Filter(searchPredicate).FacetOn(x => x.Category).GetFacets();
var categoryFacets = searchFacets.Categories.FirstOrDefault(x => x.Name == "categoryfacet");
var facets = new List<SearchFacet>();

if(categoryFacets != null)
{
    foreach (var facet in categoryFacets.Values)
    {
        facets.Add(new SearchFacet
        {
            Count = facet.AggregateCount,
            Value = facet.Name
        });
    }
}
public class SearchFacet
{
    public string Value { get; set; }
    public int Count { get; set; }
}
Now we have a list of facets and their counts, and we can display them on the front-end with checkboxes or a simple list.
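
As a rough illustration of the display step, the list could be turned into label text such as "Page (12)" before it is handed to the view. The `FacetLabels` helper below is a hypothetical name, and hiding empty facets and sorting by count are illustrative choices rather than anything the search API requires.

```csharp
using System.Collections.Generic;
using System.Linq;

// Same shape as the SearchFacet class above
public class SearchFacet
{
    public string Value { get; set; }
    public int Count { get; set; }
}

// Hypothetical helper for building display text
public static class FacetLabels
{
    public static List<string> Build(IEnumerable<SearchFacet> facets)
    {
        return facets
            .Where(f => f.Count > 0)          // Hide facets with no matching results
            .OrderByDescending(f => f.Count)  // Most populated facet first
            .ThenBy(f => f.Value)             // Alphabetical order for equal counts
            .Select(f => string.Format("{0} ({1})", f.Value, f.Count))
            .ToList();
    }
}
```

Each label can then be rendered next to a checkbox or as a plain list item.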
