Sunday, July 22, 2012

Information seeking: Convergence of search, recommendation and advertising

This paper is one of those that try to give a broader view of a recent research problem. In particular, it focuses on the idea of integrating search, recommendation, and advertising into a single system. The authors are from Stanford and include Hector Garcia-Molina, who is well known in the database management research community.

The motivation for the convergence of search, recommendation, and advertising is the so-called "deluge of data" (or information overload). In this scenario, a fundamental problem is identifying objects that satisfy a user's information needs. The authors highlight three important mechanisms for providing information:

  • Search: Return a set of objects from a collection based on a user query.
  • Recommendation: Return a set of objects of interest to a user based on contextual information.
  • Advertisement: Similar to recommendation, but specifically for products and services.
The authors argue that these three mechanisms are strongly related and, as a consequence, that an integrated view that takes all of them into account may bring several benefits. In fact, these mechanisms share a common goal: matching a context (e.g., a query) to a collection of information objects. The following figure illustrates this unified view.



As a means of characterizing search, recommendation, and advertisement, the paper describes them in terms of the following properties:


The pull delivery mode refers to scenarios where there is an explicit user request. In the push delivery mode, on the other hand, there is no explicit user request. The "unexpected good" property applies whenever a novel result may be beneficial. Collective knowledge refers to the use of a historical database in order to provide a better service.

A short coverage of the search, recommendation, and advertising mechanisms is given. I will summarize some of the main points presented by the authors.

Search: It is the most mature of the three technologies. It can be divided into two main steps: filtering and ranking. The filtering step is usually based on a query, and the ranking step sorts the filtered results by relevance.
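
To make the two steps concrete, here is a minimal sketch in Python. It is my own illustration, not code from the paper: the function name `search`, the toy `documents` collection, and the choice of raw term frequency as the relevance score are all assumptions.

```python
from collections import Counter

def search(query, documents):
    """Filter documents that contain every query term, then rank them.

    Hypothetical helper: relevance is approximated by raw term frequency,
    which is only one of many possible ranking functions.
    """
    terms = query.lower().split()
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())
        # Filtering step: keep only documents that contain all query terms.
        if all(counts[t] > 0 for t in terms):
            # Ranking step: score by total occurrences of the query terms.
            score = sum(counts[t] for t in terms)
            scored.append((score, doc_id))
    # Highest score first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]

# Toy collection (made up for this example).
documents = {
    "d1": "database systems and database design",
    "d2": "introduction to database theory",
    "d3": "machine learning basics",
}
print(search("database", documents))
```

Here "d3" is dropped by the filter, and "d1" is ranked ahead of "d2" because it mentions the query term twice.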

Recommendation: It can be classified by the strategy employed (content-based or collaborative filtering) and by the recipient of the recommendations (a user or a group). Content-based strategies recommend objects that are similar to those the user liked in the past. Collaborative filtering strategies identify interesting associations between users and the objects they consumed in order to make recommendations. Moreover, there is a distinction between memory-based and model-based collaborative filtering techniques. While the former apply statistical measures to match users with similar tastes, the latter build a model that captures such relationships. Regarding the recipient, recommendation strategies can be divided into those that recommend objects to a single user and those that recommend objects to a group of users (e.g., a movie for a group of friends).
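
As an illustration of the memory-based flavor of collaborative filtering, here is a small Python sketch. It is an assumption-laden example of mine, not the authors' method: I use cosine similarity as the statistical measure, and the names `cosine`, `recommend`, and the toy `ratings` data are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, k=2):
    """Score items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            # Users with no overlapping taste contribute nothing.
            continue
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    # Highest score first; ties are broken by item id.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:k]

# Toy ratings (made up for this example).
ratings = {
    "ana": {"m1": 5, "m2": 4},
    "bob": {"m1": 5, "m2": 4, "m3": 5},
    "eve": {"m4": 5},
}
print(recommend("ana", ratings))
```

In this toy data, "bob" rated the same movies as "ana" in the same way, so his extra movie "m3" is recommended to her, while "eve" shares no rated items with "ana" and is ignored. A model-based technique would instead fit a model to the ratings beforehand and query it at recommendation time.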

Ads: Sponsored search is the mechanism by which relevant ads are delivered as part of the search experience. Therefore, sponsored search must satisfy both the user's need for relevant results and the advertiser's desire for qualified traffic to their websites. The usual approach applied by search engines is to offer keywords through auctions. The price of a keyword depends on its relevance. Whenever the user submits a query, the search engine matches the query against the keywords and returns the ads that are relevant w.r.t. the query.
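
The query-to-keyword matching can be sketched as follows. This is a deliberately simplified illustration of mine: exact keyword matching and ranking by bid alone are assumptions, and the names `match_ads`, `bids`, and the ad identifiers are hypothetical. Real keyword auctions are more involved.

```python
def match_ads(query, bids, max_ads=2):
    """Return ads whose keyword appears in the query, highest bid first.

    `bids` is a list of (keyword, ad_id, bid) tuples; this structure is an
    assumption made for the example.
    """
    terms = set(query.lower().split())
    best = {}
    for keyword, ad, bid in bids:
        if keyword in terms:
            # Keep each ad's highest matching bid.
            best[ad] = max(best.get(ad, 0.0), bid)
    # Highest bid first; ties are broken by ad id.
    ranked = sorted(best, key=lambda ad: (-best[ad], ad))
    return ranked[:max_ads]

# Toy bids (made up for this example).
bids = [
    ("flights", "ad_airline", 1.50),
    ("hotel", "ad_hotel", 0.80),
    ("flights", "ad_agency", 1.20),
]
print(match_ads("cheap flights to lisbon", bids))
```

Only the two ads that bid on "flights" match this query, and the higher bidder is listed first.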

Currently, the aforementioned mechanisms are kept separate because they cater to different needs. Moreover, each of them usually raises significant performance issues that may prevent their integration in real applications. However, the authors present evidence that several technologies are evolving towards this integration, in the form of what they call an Information Consolidation Engine (ICE). An ICE provides a single interface for all information needs, sharing technologies across search, recommendation, and advertising, and covering a broad set of objects. In the backend, an ICE should merge the functionalities of these three mechanisms, supporting better modularization and reuse.

The main challenges in the development of ICEs are:
  • Complex targets: ICEs may return a set (or package) of objects in order to satisfy the users' needs.
  • Constraints: The user may provide a set of filtering, package, sequencing and prohibitive constraints over the set of objects of interest.
  • Personalization: The more information the system has about the user, the better the recommendations it is able to provide. However, personalization is known to be suitable only for active users and special types of queries. Moreover, personalization narrows the view of users and brings important privacy issues.
  • Metrics: Evaluating ICEs is a challenge because it involves highly subjective answers. Existing evaluation metrics are not powerful enough to measure the quality of ICEs. In particular, different metrics may be required for different domains.
  • Other challenges: Other challenges include trust (from both the system and user side), explainability, and user feedback gathering. 
The paper finishes with a short case study describing CourseRank, a social site where Stanford students can review courses and plan their academic program. The site provides course recommendations that take university requirements into account, among other services. The authors' current research has been focused on formalizing a language for personalization of recommendations, providing consumption plans based on sequencing information, and using tags to create a unified interface to browse and receive search results and recommendations.

I did not like this paper very much. In fact, I was expecting better examples and concepts related to the convergence of the aspects covered in the paper. Also, I believe that social media plays a key role in the integration of search, recommendation, and advertising. However, the authors do not emphasize the role of social media in the paper.

Link: http://ilpubs.stanford.edu:8090/963/
