Just another Monday as a SharePoint developer: this time I found myself researching out-of-the-box (OOTB) options for customizing SharePoint 2010 search results ranking and sorting. Although I found that a Power User cannot do much through configuration alone, I did learn a good deal about how SharePoint Search determines the relevance of results. Also, for admins, developers or users who feel comfortable with PowerShell and XML configuration, SharePoint 2010 has features that allow changing or creating ranking models. In this post, without getting too technical, I give an overview of how SharePoint’s Search Ranking Model works and what can be customized to improve the relevance of results. In a following post, I will cover the customization options and Ranking Model improvements in SharePoint 2013. For those more interested in the technical side of customizing SharePoint Ranking Models, I’ve listed some good resources on the subject at the end of this post.
SharePoint Ranking Model
First, what is a ranking model? Microsoft’s definition:
“A ranking model determines recall (which items are displayed in the search results) and rank (the order in which search results are displayed)”
Everything is pretty clear now, right? Yeah, I thought so. Let’s dive a little deeper. So, what components are involved in Search Ranking? When running a search query, SharePoint uses a complex algorithm to determine which items are most relevant. Relevance is calculated by evaluating all properties, and even the content, of documents against the user query. Relevance is then expressed as a numeric score. Two types of scores are used:
Dynamic Scores – Query Dependent
How well a document matches a user’s query. This is calculated in real time on the Query Servers and depends on matching terms or keywords.
You can customize the effect of the dynamic rank points (relevance) by changing the weight of different components in the rank profile through SharePoint’s Object Model.
- Property Weighting
– SharePoint uses managed properties to rank (weight) results. The Title field is weighted pretty high.
– For example, if you want to promote result items of a specific type such as PDF documents, you can give search items with the file extension .pdf additional points.
- Query authority (click-through)
– Weight is given when a search word matches search items associated with previously performed searches.
– SharePoint ranks a document higher (as more valuable) if it was visited (click-through) from a search results page.
- Freshness
– Points are given based on the age (last modified) of the item compared to the time of the search. A high value for freshness weight will favor fresh documents in the ranking.
– The time resolution used (second/minute/hour/day/year) is important when calculating rank. The smaller the resolution (for example, minute), the greater the impact that small time differences have on ranking.
- Authority (anchor text)
– Evaluates the text that describes a link target (the text inside <a> tags). Points are given when a search word occurs in the anchor text of a link pointing to the search item.
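To make the weighting idea a bit more concrete, here is a minimal Python sketch. It is not SharePoint's actual algorithm or Object Model: the weights, the `.pdf` boost, and the function names `property_score` and `age_in_units` are all hypothetical, picked only to illustrate how property weights and the freshness time resolution could influence a score.

```python
from datetime import datetime, timedelta

# Hypothetical weights for illustration only; not SharePoint's real values.
PROPERTY_WEIGHTS = {"title": 3.0, "body": 1.0}
EXTENSION_BOOST = {".pdf": 0.5}


def property_score(query_terms, doc):
    """Add each property's weight for every query term found in that property."""
    score = 0.0
    for prop, weight in PROPERTY_WEIGHTS.items():
        words = doc.get(prop, "").lower().split()
        score += weight * sum(1 for term in query_terms if term.lower() in words)
    # Extra points for promoted file types (assumed boost table above).
    return score + EXTENSION_BOOST.get(doc.get("extension", ""), 0.0)


def age_in_units(modified, now, resolution):
    """Age of an item counted in whole units of the chosen time resolution."""
    return (now - modified) // resolution


pdf_doc = {"title": "Expense report", "body": "quarterly expense figures", "extension": ".pdf"}
word_doc = {"title": "Quarterly figures", "body": "expense report for the quarter", "extension": ".docx"}
print(property_score(["expense"], pdf_doc), property_score(["expense"], word_doc))

now = datetime(2013, 1, 1, 12, 0)
recent, older = now - timedelta(minutes=10), now - timedelta(minutes=50)
for resolution in (timedelta(minutes=1), timedelta(days=1)):
    print(age_in_units(recent, now, resolution), age_in_units(older, now, resolution))
```

With a one-minute resolution the two items have different ages, so freshness can tell them apart; with a one-day resolution both count as zero days old and freshness no longer separates them.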
Static Scores – Query Independent
How popular or valuable a document is. This ranking is calculated at index time, so it is independent of the user’s search query.
Document location, file types, authoritative pages, and content language are things you can manipulate to improve a document’s relevance.
- Click Distance
– Number of links between a content item and an Authoritative Page. The more links the search crawler travels from an authoritative page to a document, the lower the relevance score.
– Authoritative Pages can be configured in SharePoint Central Administration. You can also demote sites you want to be ranked lower than all other sites.
– For example, you may demote URLs of sites that contain outdated information but are retained for record-keeping. Or, set your HR page URL as an Authoritative Page to promote resources linked from there.
- URL Depth
– Rank points designed to boost shorter URLs. This is determined by reviewing the number of slash (‘/’) characters in the URL.
– The more slashes in the URL of a document (that is, the deeper it sits in the site structure), the less valuable it is considered.
- File Type Biasing
– By default, Web Pages and Word files are considered more relevant than any other file type (e.g. Excel, plain text).
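Click Distance and URL Depth are easy to picture with a small sketch. The Python below is only an illustration under my own assumptions: the link graph, the page names, and the functions `click_distances` and `url_depth` are hypothetical, and I count slashes only in the path part of the URL (ignoring the `//` after the scheme). How SharePoint turns these numbers into rank points is not shown here.

```python
from collections import deque
from urllib.parse import urlparse


def click_distances(links, authoritative_pages):
    """Breadth-first search: fewest link hops from any authoritative page."""
    distances = {page: 0 for page in authoritative_pages}
    queue = deque(authoritative_pages)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distances:  # first visit is the shortest path
                distances[target] = distances[page] + 1
                queue.append(target)
    return distances


def url_depth(url):
    """Count the slash characters in the path part of a URL."""
    return urlparse(url).path.count("/")


# Hypothetical intranet: the HR page is the authoritative page.
links = {"hr": ["policies", "benefits"], "policies": ["leave.docx"]}
print(click_distances(links, ["hr"]))
print(url_depth("http://intranet/home.aspx"))
print(url_depth("http://intranet/hr/policies/leave.docx"))
```

In this toy graph, `leave.docx` sits two clicks away from the authoritative HR page and three slashes deep, so both signals would rank it below a page linked directly from HR with a short URL.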
References and Resources
- Improving SharePoint Search Relevance
- Customizing SharePoint Search Ranking Modes
- Components Involved in SharePoint Search Ranking
- SharePoint Search Built-in Ranking Algorithms
- SharePoint 2010 Custom Ranking Models
- Technet: Fast Search Server 2010 Ranking
- Configuring SharePoint Authoritative Pages