Related Content Widgets
From developer.newsgator.com
Related Content widgets are a different type of search widget based on an understanding of the content of the posts instead of just keywords.
Topics
Related Content widgets are built around topics. Topics are "who, what and where" data: the kinds of things you could have a conversation or write an article about. A single post typically has multiple topics. For example, a post might cover cars, trucks, Chevy, Ford and racing. Another post might have topics like politics, Washington, election and John McCain.
When a Related Content widget is shown on a page we will find the matching post in our system, determine the topics for that post, and then find other posts on the same topics. This allows you to automatically offer related content to your readers without the need to manually choose it.
Related Content Search vs Keyword Search
The keyword search widgets just look for posts containing the words you specify, without any real understanding of what the words mean. So if you search for "Paris Hilton" you might get back posts about hotels in France. With a Related Content widget you will instead get back posts about celebrities. We will also return the list of topics so that you can use them in the widget.
Whitelists
A whitelist is a list of feeds that you select to appear in your widget. Only posts from those feeds will be displayed in the widget, even if there are posts from other feeds that have the same topics. This ensures that the posts shown in your widget come only from sources that are appropriate and provide high quality content. You can also use a whitelist to show only articles published by your own company or organization.
Note that the whitelist you select will also be used to restrict the results from the secondary search, if a secondary search is required.
Feed Requirements
Related Content widgets work by analyzing posts that we find in RSS or Atom feeds. We do not screen scrape or otherwise analyze web pages that are not in a feed, so if the feed contains very little text we will not be able to extract topics.
We determine which post to use by reading the URL of the page from the browser and comparing it to the Link/HTML URL of the posts in the NewsGator system. The URL comparison we use is an exact, case-sensitive match of the full URL including any query string arguments. If the widget is shown on a page that does not match a post in the NewsGator system then we will display the Secondary Search results.
We compare the URL displayed in the browser to the URL in the RSS feed. This does not take into account any redirects that might take place in between. So if the URLs in your RSS feed redirect to the "real" page URL, we may not be able to find the post. Similarly, some feed republishing services rewrite the links to your posts so they can track clicks on them. We are still able to work with feeds republished through FeedBurner, but other services will not work if they rewrite the post links.
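The exact-match rule described above can be illustrated with a minimal sketch. This is not NewsGator's implementation; the function name `urls_match` and the example URLs are hypothetical, and the only behavior taken from this page is that the comparison is an exact, case-sensitive match of the full URL, query string included.

```python
def urls_match(page_url: str, post_link: str) -> bool:
    """Exact, case-sensitive comparison of the full URL, query string included."""
    return page_url == post_link


# Hypothetical link as it appears in the feed.
post_link = "http://www.example.com/posts/My-Post.html?id=7"

# Identical URL: the post is found.
print(urls_match("http://www.example.com/posts/My-Post.html?id=7", post_link))  # True

# Different letter case in the path: no match.
print(urls_match("http://www.example.com/posts/my-post.html?id=7", post_link))  # False

# Extra query string argument (for example a tracking parameter): no match.
print(urls_match("http://www.example.com/posts/My-Post.html?id=7&ref=home", post_link))  # False
```

Any difference at all, including one introduced by a redirect or a link rewriting service, means the page falls through to the backup mechanisms described under Backfill.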
URL Override
If it is not practical for the page URL to exactly match the post URL from the RSS feed then you can specify a URL in the widget script block. You do this by adding the query string argument url. Because the value of this argument is itself a URL, be sure to URL-encode it. For example:
<script src="http://nmp.newsgator.com/ngbuzz/buzz.ashx?buzzId=1234&apiToken=abcd&url=http%3A%2F%2Fwww.example.com%2Fsomepage.html"></script>
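If your pages are generated server-side, the script block can be built programmatically so the encoding is never forgotten. The following is a minimal sketch in Python, assuming the endpoint and argument names shown in the example above; the helper name `override_script_tag` is hypothetical and not part of any NewsGator library.

```python
from urllib.parse import quote

# Endpoint taken from the example above.
WIDGET_BASE = "http://nmp.newsgator.com/ngbuzz/buzz.ashx"


def override_script_tag(buzz_id: str, api_token: str, post_url: str) -> str:
    """Build a widget script block that passes post_url as the URL Override."""
    # safe="" makes quote() encode ':' and '/' as well, so the whole
    # post URL travels as a single query string value.
    encoded = quote(post_url, safe="")
    src = f"{WIDGET_BASE}?buzzId={buzz_id}&apiToken={api_token}&url={encoded}"
    return f'<script src="{src}"></script>'


print(override_script_tag("1234", "abcd", "http://www.example.com/somepage.html"))
```

With the values above this prints the same script block as the hand-written example.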
Backfill
It's possible that we won't be able to find anything related to your post, or that we won't find enough related posts (e.g., the widget asked for 5, but we only found 3). Of course you'd still like to show your users related stories, so we provide two backup mechanisms to fill out the search results.
Topic Hint
You can request posts relating to a specific topic by specifying the query string argument topic. If we cannot find a post for the page the widget is on, or if we didn't find enough articles related to that post, then we will fill out the results with posts relating to this topic. The value of this argument should be a topic that will match some posts in your widget's whitelist. The topic is not case-sensitive. Ideally it should be URL encoded as well, but typically the browser will handle this correctly for you. Example:
<script src="http://nmp.newsgator.com/ngbuzz/buzz.ashx?buzzId=1234&apiToken=abcd&topic=John%20McCain"></script>
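Encoding matters most when a topic contains characters that have a meaning inside a query string. The sketch below, again in Python with a hypothetical helper name (`topic_script_tag`), encodes the topic explicitly rather than relying on the browser. The topic "AT&T" is an invented example.

```python
from urllib.parse import quote

# Endpoint taken from the example above.
WIDGET_BASE = "http://nmp.newsgator.com/ngbuzz/buzz.ashx"


def topic_script_tag(buzz_id: str, api_token: str, topic: str) -> str:
    """Build a widget script block that passes topic as the Topic Hint."""
    # Spaces become %20 and '&' becomes %26, so the topic cannot be
    # mistaken for the start of another query string argument.
    encoded = quote(topic, safe="")
    src = f"{WIDGET_BASE}?buzzId={buzz_id}&apiToken={api_token}&topic={encoded}"
    return f'<script src="{src}"></script>'


print(topic_script_tag("1234", "abcd", "John McCain"))  # ...&topic=John%20McCain
print(topic_script_tag("1234", "abcd", "AT&T"))         # ...&topic=AT%26T
```

Without encoding, the second topic would be read as topic=AT followed by a separate, empty argument named T.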
Secondary Search
If there still are not enough posts then we'll do a standard keyword search to fill out the result set. You will be required to provide the keywords for this search when you are building your Related Content widget. The search terms should be as specific as possible, but still general enough to apply to all the places where you expect the widget to be used.
Note that the Secondary Search will also use the whitelist specified in your widget. So the Secondary Search will only find posts from feeds in that whitelist.
Summing Up
To get posts for the Related Content widget we:
- Figure out the source post:
  - Find the URL to use. This is either the URL Override or the URL of the host page.
  - Look for a post with that exact URL.
- Look up the topics for the post we found in the previous step
- Find other posts with at least one of the same topics, ignoring posts that are not in the whitelist
- If there are not enough posts, look up more posts using the Topic Hint, ignoring posts that are not in the whitelist
- If there still aren't enough posts, use the Secondary Search term to perform a keyword search, ignoring posts that are not in the whitelist
- Display the posts in the widget.
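The steps above can be sketched as a small in-memory model. This is an illustration of the described order of operations, not NewsGator's actual code: the `Post` class, the `related_posts` function and all parameter names are hypothetical, and the keyword search is simplified here to "every search word appears in the post text", which is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    url: str                                  # Link/HTML URL from the feed
    feed: str                                 # feed the post came from
    text: str = ""                            # post text, used for keyword search
    topics: set = field(default_factory=set)  # topics extracted from the post


def related_posts(posts, whitelist, page_url, wanted,
                  url_override=None, topic_hint=None, secondary_terms=""):
    """Return (related posts, sorted topics of the source post)."""
    # Find the source post: URL Override if given, else the host page URL,
    # matched exactly and case-sensitively.
    lookup_url = url_override or page_url
    source = next((p for p in posts if p.url == lookup_url), None)

    # Topics of the source post (empty when no source post was found).
    topics = source.topics if source else set()

    # Every stage ignores posts that are not in the whitelist.
    candidates = [p for p in posts if p.feed in whitelist and p is not source]
    results = []

    def fill(matches):
        """Append matches until the requested number of posts is reached."""
        for post in matches:
            if len(results) >= wanted:
                return
            if post not in results:
                results.append(post)

    # Posts sharing at least one topic with the source post.
    fill(p for p in candidates if p.topics & topics)

    # Backfill with the Topic Hint (not case-sensitive).
    if len(results) < wanted and topic_hint:
        hint = topic_hint.lower()
        fill(p for p in candidates if hint in {t.lower() for t in p.topics})

    # Backfill with the Secondary Search (simplified keyword match).
    if len(results) < wanted and secondary_terms:
        words = secondary_terms.lower().split()
        fill(p for p in candidates if all(w in p.text.lower() for w in words))

    return results, sorted(topics)
```

Calling `related_posts` with a page URL that matches no post skips the topic-matching stage entirely, so the results come only from the Topic Hint and the Secondary Search and the returned topic list is empty.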
Topic Tokens
In addition to the normal post-related tokens, Related Content widgets expose the list of topics we found for the original post. These are exposed via the new top level token Topics, which is an array of strings. You can print them out by pasting this code in the HTML pane:
{for topic in Topics}
${topic} <br />
{forelse}
No topics found
{/for}
Note that the topics are not available if the widget performed a secondary search. You should be sure to code for this case. This is the purpose of the {forelse} block in the example above.
Troubleshooting
Related Content widgets can be finicky and difficult to implement successfully. The following steps should help you troubleshoot your widget.
Is the Source Feed in NewsGator?
The first step is finding topics for the source post. In order to do that the post must be in a feed that is in the NewsGator system. You can ensure NewsGator knows about a feed by adding it to a whitelist, adding it to a widget, or subscribing to it in NewsGator Online.
Does the Source Feed Provide Full Text?
We need the full text of the post to determine its topics. More text is better, but generally we need at least several paragraphs.
Do the Links in the Source Feed Match the URL In The Browser Address Bar?
We find the source post based on the URL in the address bar (unless you specify a URL Override, as described above). The links in the feed must be exactly the same as that URL. The comparison is case-sensitive. Note that some republishing services, such as FeedBurner, can rewrite the links in your feed. Rewritten links that we cannot map back to the original post URL will break the Related Content widget. You can either publish a feed with correctly matching links, or else use the URL Override function to specify a matching URL.
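One way to check this yourself is to download the feed XML and compare its item links to the page URL. The sketch below uses Python's standard library and handles plain RSS 2.0 only (Atom feeds use namespaces and `href` attributes and would need different paths). The function names `feed_links` and `diagnose` and the sample feed are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed with a single post.
SAMPLE_RSS = (
    '<rss version="2.0"><channel>'
    "<title>Example</title><link>http://www.example.com/</link>"
    "<item><title>One</title>"
    "<link>http://www.example.com/Posts/One.html</link></item>"
    "</channel></rss>"
)


def feed_links(rss_xml: str) -> list:
    """Return the <link> value of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [(el.text or "").strip() for el in root.findall("./channel/item/link")]


def diagnose(rss_xml: str, page_url: str) -> str:
    """Classify how page_url relates to the item links in the feed."""
    links = feed_links(rss_xml)
    if page_url in links:
        return "exact match"
    if page_url.lower() in (link.lower() for link in links):
        return "case mismatch"
    return "no match"


print(diagnose(SAMPLE_RSS, "http://www.example.com/Posts/One.html"))  # exact match
print(diagnose(SAMPLE_RSS, "http://www.example.com/posts/one.html"))  # case mismatch
print(diagnose(SAMPLE_RSS, "http://www.example.com/two.html"))        # no match
```

A "case mismatch" or "no match" result for a page that you know is in the feed suggests the links are being rewritten or redirected, and that a URL Override is needed.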
Does the Source Feed Use FeedBurner?
Go back and read the previous troubleshooting question, "Do the Links in the Source Feed Match the URL In The Browser Address Bar?" Note that some feeds redirect to FeedBurner, so even if the URL of the feed doesn't say "feedburner.com" in it, it may still use FeedBurner. Paste the feed XML URL into a browser and see what you get.
Check the Topics for the Source Post
Check to make sure we're finding topics for the source post. Currently the best way to do this is by writing them out using the Topics token described under Topic Tokens.
Do You Have an Appropriate Whitelist?
You should be sure that the feeds in your whitelist publish posts that are relevant to your source feed. If your source post is about politics but all the feeds in your whitelist talk about photography, you won't have much luck.
Do the Whitelist Feeds Provide Full Text?
We also need full text from the whitelist feeds to determine their topics. If you subscribe to the feeds using NewsGator Online, FeedDemon or another reader you should see full articles. If you only see a sentence or two, then the feeds are not appropriate for a Related Content widget.
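If you would rather check a feed programmatically than by eye, a rough word count per item can flag summary-only feeds. The sketch below is a heuristic for RSS 2.0 feeds, not a rule NewsGator applies: the `MIN_WORDS` threshold of 150 is an arbitrary assumption standing in for "several paragraphs", and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Full post bodies are often published in <content:encoded>.
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"

# Hypothetical threshold standing in for "several paragraphs".
MIN_WORDS = 150


def word_count(html: str) -> int:
    """Count words after crudely stripping HTML tags."""
    text = re.sub(r"<[^>]+>", " ", html or "")
    return len(text.split())


def short_items(rss_xml: str, min_words: int = MIN_WORDS) -> list:
    """Return (title, word count) for every item with fewer than min_words words."""
    root = ET.fromstring(rss_xml)
    flagged = []
    for item in root.findall("./channel/item"):
        # Prefer the full body if the feed provides one, else the description.
        body = item.findtext(CONTENT_ENCODED) or item.findtext("description") or ""
        count = word_count(body)
        if count < min_words:
            flagged.append((item.findtext("title", default="(untitled)"), count))
    return flagged
```

A feed in which most items are flagged is probably publishing summaries only, and is unlikely to yield useful topics.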