Advanced Interaction:
Intelligent Browsing
Investigators: Russell Beale
Associated grants: Microsoft, Mobile Devices - EPSRC Capital Equipment
Finding the right things on the internet is not easy: a search has to return things that are relevant to the user, whilst recognising that the user works in many different situations and that the notion of relevance varies from context to context. One approach to this problem is to employ intelligent user modelling within the search systems.
The Mitsikeru system we are developing addresses these issues in two ways. Firstly, browsing is done via a proxy server that acts as a cache for pages, and we bias any search towards pages that have been recently browsed. This means that, of pages that are an equally good keyword match, those that have been recently looked at are ranked much higher in the returned results. Secondly, we augment this approach by building up a task-sensitive user model to determine the relevance of particular material, loosely based on latent semantic indexing and Bayesian statistics. This model is used to cross-match pages and provide metrics on their similarity. This information is then clustered over time, building up areas of interest that the user has. Each cluster provides us with information about an individual task; together they form a profile of the user's interests.
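The recency bias described above can be sketched as follows. This is a minimal illustration rather than the Mitsikeru implementation: the `Candidate` record, the exponential decay, and the `half_life` and `max_boost` parameters are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    url: str
    keyword_score: float           # match quality from the underlying search
    last_viewed: Optional[float]   # time of last visit via the proxy (seconds), or None


def recency_boost(last_viewed: Optional[float], now: float,
                  half_life: float = 3600.0, max_boost: float = 1.0) -> float:
    """Boost that halves every `half_life` seconds since the page was last viewed."""
    if last_viewed is None:
        return 0.0
    age = max(0.0, now - last_viewed)
    return max_boost * 0.5 ** (age / half_life)


def rank(candidates: List[Candidate], now: float,
         half_life: float = 3600.0, max_boost: float = 1.0) -> List[Candidate]:
    """Order candidates by keyword score scaled up for recently browsed pages."""
    def score(c: Candidate) -> float:
        boost = recency_boost(c.last_viewed, now, half_life, max_boost)
        return c.keyword_score * (1.0 + boost)
    return sorted(candidates, key=score, reverse=True)


pages = [
    Candidate("a", 0.8, None),     # never browsed
    Candidate("b", 0.8, 900.0),    # browsed 100 seconds ago
    Candidate("c", 0.8, 0.0),      # browsed 1000 seconds ago
]
ranked_urls = [c.url for c in rank(pages, now=1000.0)]
```

With equal keyword scores, the most recently browsed page comes first and the page that was never browsed comes last.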
Into the looking glass
Mitsikeru is interesting in that it is also forward-looking. It looks at current pages and pre-fetches and analyses subsequent ones. The analysis of these future pages allows us to determine which are relevant to the task in hand (i.e. which fall into the current cluster), which are relevant to other tasks the user may have (i.e. those that fall into recent clusters), and so on. We then use this information to annotate the current page to provide guidance about which links are directly related to the current subject and which are relevant to the task in hand. This is achieved in a non-intrusive manner, through colour-coding the links, which provides subtle but immediate feedback.
How it works
Mitsikeru tracks the pages the user accesses through the proxy that sits between the user's browser and the internet. Pages that the user requests are stripped of HTML and a record of the words that occur is kept. This is represented as a frequency table, identifying words that are common and those that are not. Aggregated over every page ever accessed, this table represents the global frequency of occurrence of words. We also keep a record of pages accessed recently, giving a more immediate record of commonly seen words. As more pages are accessed, the individual table for each page can be compared to the current immediate record - if similar, it is integrated into the current session, and if dissimilar, it can form the start of a new session. In looking ahead, the system analyses the word table for each linked page, and compares it to both the global frequency table and the sessional one. We are looking for words that are 'interesting' - those that are not globally common, but which occur in the current session and in the linked page we are analysing. We can use Bayesian statistics to provide us with a quantifiable measure of 'interesting', which is essentially a combination of local commonality coupled with global surprise at seeing those words.
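As an illustration of the word tables and the 'interesting' measure, the sketch below strips HTML, counts words, and scores a linked page against the session and global tables. The exact formula is not given here, so the score is an assumption: a smoothed log-likelihood ratio standing in for the Bayesian measure. The names `word_table` and `interest` and the smoothing constant `alpha` are hypothetical, and the tag stripping is deliberately crude (script and style contents are not filtered out).

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text content of a page, discarding the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def word_table(html: str) -> Counter:
    """Frequency table of the lower-cased alphabetic words in an HTML page."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return Counter(re.findall(r"[a-z]+", text))


def interest(page: Counter, session: Counter, global_: Counter,
             alpha: float = 1.0) -> float:
    """Average, per word of the linked page, of log P(word|session) / P(word|global).

    Only words that also occur in the session contribute: a word scores highly
    when it is locally common (session) but globally surprising.
    """
    n_words = sum(page.values())
    if n_words == 0:
        return 0.0
    vocab = len(set(page) | set(session) | set(global_))
    s_total = sum(session.values()) + alpha * vocab
    g_total = sum(global_.values()) + alpha * vocab
    score = 0.0
    for word, count in page.items():
        if word not in session:
            continue
        p_session = (session[word] + alpha) / s_total
        p_global = (global_[word] + alpha) / g_total
        score += count * math.log(p_session / p_global)
    return score / n_words


older = word_table("<p>the the the the market stocks the cat</p>")
session = word_table("<b>The</b> jaguar lives in the rainforest; the jaguar hunts")
global_ = older + session   # the global table covers every page seen, session included

related = word_table("<a href='#'>jaguar rainforest habitat</a>")
unrelated = word_table("<p>the stocks market</p>")
related_score = interest(related, session, global_)
unrelated_score = interest(unrelated, session, global_)
```

In this toy data the page about jaguars scores above zero, while the page that shares only the globally common word "the" with the session scores below zero.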
Each link on the current page therefore has a numeric value that represents the likelihood that it is interesting to the user, based on what they have just been looking at. Going to a closely related link pushes the system to incorporate that page into the current session, whilst going to an apparently unrelated page (or typing in a new URL) tends to trigger the start of a new session representing the likely appearance of a new task for the user. This numeric value is used to adjust the link colours, and add DHTML annotations to the page. This means that, for example, we can provide a brief snapshot of the next page when the user hovers over a link. This gives the user the opportunity to quickly flick over the potential pages before moving to the one that is most interesting. It also provides a non-destructive, subtle presentation of the information, which ensures that the errors that the system makes in interpreting what is interesting are not critical and do not significantly hinder the interaction.
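The two uses of the numeric value described above, colouring a link and deciding whether a visited page joins the current session, might look like the sketch below. Everything specific in it is an assumption for illustration: the colour endpoints, the logistic squashing with `steepness`, the `threshold`, and the function names `link_colour` and `update_session`.

```python
import math
from collections import Counter


def link_colour(score: float,
                low=(0x66, 0x66, 0x66),     # neutral grey for uninteresting links
                high=(0x00, 0x66, 0xCC),    # stronger blue for interesting links
                steepness: float = 4.0) -> str:
    """Map an interest score to a CSS hex colour between `low` and `high`."""
    # Clamp before exponentiating so extreme scores cannot overflow math.exp.
    x = max(-60.0, min(60.0, steepness * score))
    t = 1.0 / (1.0 + math.exp(-x))          # squash the score into the range 0..1
    rgb = [round(a + t * (b - a)) for a, b in zip(low, high)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def update_session(session: Counter, page: Counter, score: float,
                   threshold: float = 0.0) -> Counter:
    """Merge a visited page into the session, or start a new session from it."""
    if score >= threshold:
        return session + page    # closely related: incorporate into the session
    return Counter(page)         # apparently unrelated: likely a new task


session = Counter(a=1)
page = Counter(a=1, b=2)
merged = update_session(session, page, score=0.5)
restarted = update_session(session, page, score=-0.5)
```

Because the colour varies smoothly with the score, a misjudged link is only tinted slightly differently rather than hidden or removed, which keeps errors non-critical.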
By adding annotations and subtly altering the appearance of the currently viewed web page, the Mitsikeru system has no direct interface itself - it acts in the background and does not interrupt the user's web behaviours in any direct way. We view this as an important aspect of the system, in that we are supporting the task without imposing an application interface that has to be explicitly learnt.
Where next
Mitsikeru is currently going through extensive user trialling, but it forms only the first in a sequence of AI-added tools to improve internet usage.
Follow-up projects include:
- semantic search - looking for what is meant, not simply what was typed
- collaborative browsing and shared user models
- more effective proxy techniques
- web clipping