Advertising papers [Listed by problems] (incomplete)

More than 100 related papers are listed on this page. Most of them are from WWW, SIGIR, KDD, WSDM, and CIKM.

Note that this paper list might be incomplete. If your paper is not listed, please let us know at paullzn@gmail.com.

Query Analysis

  • Clustering Query Refinements by User Intent
    Author: Eldar Sadikov, eldar@cs.stanford.edu
    Abstract:
    The paper introduces a new method for the query suggestion problem. It combines document click and session co-occurrence information by performing multiple random walks on a Markov graph that approximates user search behavior. The motivation is to represent the distinct information needs a user may have and to improve related-query suggestions across user sessions.
  • Exploiting Query Reformulations for Web Search Result Diversification
    Author:  Rodrygo L. T. Santos, rodrygo@dcs.gla.ac.uk
    Abstract:
    When a Web user’s underlying information need is not clearly specified from the initial query, an effective approach is to diversify the results retrieved for this query. In this paper, we introduce a novel probabilistic framework for Web search result diversification, which explicitly accounts for the various aspects associated to an underspecified query. In particular, we diversify a document ranking by estimating how well a given document satisfies each uncovered aspect and the extent to which different aspects are satisfied by the ranking as a whole.
  • An Optimization Framework for Query Recommendation
    Author: Aris Anagnostopoulos,  aris@cs.brown.edu
    Abstract:
    In this paper, we present a formal treatment of the problem of query recommendation. In our framework we model the querying behavior of users by a probabilistic reformulation graph, or query-flow graph [Boldi et al. CIKM 2008]. A sequence of queries submitted by a user can be seen as a path on this graph. Assigning score values to queries allows us to define suitable utility functions and to consider the expected utility achieved by a reformulation path on the query-flow graph. Providing recommendations can be seen as adding shortcuts in the query-flow graph that “nudge” the reformulation paths of users, in such a way that users are more likely to follow paths with larger expected utility.
  • Large Scale Query Log Analysis of Re-Finding
    Author: Sarah K. Tyler, skt@soe.ucsc.edu
    Abstract:
    Although Web search engines are targeted towards helping people find new information, people regularly use them to re-find Web pages they have seen before. Researchers have noted the existence of this phenomenon, but relatively little is understood about how re-finding behavior differs from the finding of new information. This paper dives deeply into the differences via analysis of three large-scale data sources: 1) query logs (queries, clicks, result impressions), 2) Web browsing logs (URL visits), and 3) a daily Web crawl (page content).
  • Query Reformulation Using Anchor Text
    Author: Van Dang, vdang@cs.umass.edu
    Abstract:
    Query reformulation techniques based on query logs have been studied as a method of capturing user intent and improving retrieval effectiveness. The evaluation of these techniques has primarily, however, focused on proprietary query logs and selected samples of queries. In this paper, we suggest that anchor text, which is readily available, can be an effective substitute for a query log and study the effectiveness of a range of query reformulation techniques (including log-based stemming, substitution, and expansion) using standard TREC collections. Our results show that log-based query reformulation techniques are indeed effective with standard collections, but expansion is a much safer form of query modification than word substitution. We also show that using anchor text as a simulated query log is at least as effective as a real log for these techniques.
  • Query Clustering using Click-Through Graph
    Author: Jeonghee Yi, Jeonghee@yahoo-inc.com
    Abstract:
    In this paper we describe a problem of discovering query clusters from a click-through graph of web search logs. The graph consists of a set of web search queries, a set of pages selected for the queries, and a set of directed edges that connect a query node and a page node clicked by a user for the query. The proposed method extracts all maximal bipartite cliques (bicliques) from a click-through graph and computes an equivalence set of queries (i.e., a query cluster) from the maximal bicliques. A cluster of queries is formed from the queries in a biclique. We present a scalable algorithm that enumerates all maximal bicliques from the click-through graph. We have conducted experiments on Yahoo web search queries and the results are promising.
  • Context-Aware Query Classification
    Author: Huanhuan Cao, caohuan@ustc.edu.cn
    Abstract:
    Understanding users’ search intent expressed through their search queries is crucial to Web search and online advertisement. Web query classification (QC) has been widely studied for this purpose. In this paper, we incorporate context information into the problem of query classification by using conditional random field (CRF) models. In our approach, we use neighboring queries and their corresponding clicked URLs (Web pages) in search sessions as the context information. We perform extensive experiments on real world search logs and validate the effectiveness and efficiency of our approach. We show that we can improve the F1 score by 52% as compared to other state-of-the-art baselines.
  • Click-Through Prediction for News Queries
    Author: Arnd Christian Konig, chrisko@microsoft.com
    Abstract:
    In this paper, we consider the problem of estimating the click-through rate for dedicated news search results. For queries for which news results have been displayed repeatedly before, the click-through rate can be tracked online; however, the key challenge for which previously unseen queries to display news results remains. In this paper we propose a supervised model that offers accurate prediction of news click-through rates and satisfies the requirement of adapting quickly to emerging news events.
  • Efficient Query Expansion for Advertisement Search
    Author: Haofen Wang, whfcarter@sjtu.edu.cn
    Abstract:
    In this paper, we propose an efficient ad search solution relying on a block-based index able to tackle the issues associated with query expansion. Our index structure places clusters of similar bid phrases in corresponding blocks with their associated ads. It reduces the number of merge operations significantly during query expansion and allows sequential scans rather than random accesses, saving I/O costs. We adopt flexible block sizes according to the clustering results of bid phrases to further optimize the index structure for efficient ad search. The pre-computation of such clusters is achieved through an agglomerative iterative clustering algorithm. Finally, we adapt the spreading activation mechanism to return the top-k relevant ads, improving search precision.
  • Multiple Approaches to Analysing Query Diversity
    Author: Paul Clough, p.d.clough@sheffield.ac.uk
    Abstract:
    In this paper we examine user queries with respect to diversity: providing a mix of results across different interpretations. Using two query log analysis techniques (click entropy and reformulated queries), 14.9 million queries from the Microsoft Live Search log were analysed. We found that a broad range of query types may benefit from diversification. Additionally, although there is a correlation between word ambiguity and the need for diversity, the range of results users may wish to see for an ambiguous query stretches well beyond traditional notions of word sense.
  • Analysis of long queries in a large scale search log
    Author: Michael Bendersky, bemike@cs.umass.edu
    Abstract:
    We propose to use the search log to study long queries, in order to understand the types of information needs that are behind them, and to design techniques to improve search effectiveness when they are used. In this paper we analyze the long queries in the search log with the aim of identifying the characteristics of the most commonly occurring types of queries, and the issues involved with using them effectively in a search engine. In addition, we propose a simple yet effective method for evaluating the performance of the queries in the search log using a combination of the click data in the search log with the existing TREC corpora.
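
Click entropy, named in the query diversity entry above as one of its log analysis techniques, can be illustrated with a short sketch. The function below uses the standard Shannon entropy of the click distribution over URLs for one query; the function name and the sample click lists are hypothetical and are not taken from any of the listed papers.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the clicks on URLs for a single query."""
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1 / p), summed over every clicked URL
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical click logs: one list of clicked URLs per query.
navigational = ["a.com"] * 8                        # every click on one URL
ambiguous = ["a.com", "b.com", "c.com", "d.com"] * 2  # clicks spread evenly

print(click_entropy(navigational))  # 0.0
print(click_entropy(ambiguous))     # 2.0
```

A query whose clicks concentrate on one URL scores near zero, while a query whose clicks spread over many URLs scores higher, which is one plausible signal that it could benefit from diversified results.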

Revenue Optimization

  • Online Learning of Assignments
    Author: Matthew Streeter, mstreeter@google.com
    Abstract:
    Which ads should we display in sponsored search in order to maximize our revenue? How should we dynamically rank information sources to maximize the value of the ranking? These applications exhibit strong diminishing returns: Redundancy decreases the marginal utility of each ad or information source. We show that these and other problems can be formalized as repeatedly selecting an assignment of items to positions to maximize a sequence of monotone submodular functions that arrive one by one. We present an efficient algorithm for this general problem and analyze it in the no-regret model. We empirically evaluate our algorithm on two real-world online optimization problems on the web: ad allocation with submodular utilities, and dynamically ranking blogs to detect information cascades.
  • Optimizing Search Engine Revenue in Sponsored Search
    Author: Yunzhang Zhu, v-yuazhu@microsoft.com
    Abstract:
    In this paper, we address a new revenue optimization problem and aim to answer the question: how can we construct a ranking model that delivers high-quality ads to the user while also maximizing search engine revenue? We introduce two novel methods from different machine learning perspectives, both of which take the revenue component into careful consideration. The algorithms are built upon click-through log data with real ad clicks and impressions. Extensive experimental results verify that the proposed algorithms can produce more revenue than other methods while avoiding a loss of relevance accuracy. To provide deep insight into the importance of each feature to search engine revenue, we extract twelve basic features from four categories. The experimental study provides a feature ranking list according to the revenue benefit of each feature.
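
The diminishing-returns idea in the Online Learning of Assignments entry (redundant ads add little marginal utility) can be made concrete with a small sketch. This is the classic offline greedy rule for a monotone submodular coverage function, shown only as an illustration; it is not the online no-regret algorithm from that paper, and the ad names and interest sets are hypothetical.

```python
def coverage(selected, interests):
    """Number of distinct user interests covered by the selected ads."""
    covered = set()
    for ad in selected:
        covered |= interests[ad]
    return len(covered)


def greedy_select(interests, k):
    """Pick up to k ads, each time adding the ad with the largest marginal gain."""
    chosen = []
    for _ in range(k):
        base = coverage(chosen, interests)
        best_ad, best_gain = None, 0
        for ad in sorted(interests):  # sorted for deterministic tie-breaking
            if ad in chosen:
                continue
            gain = coverage(chosen + [ad], interests) - base
            if gain > best_gain:
                best_ad, best_gain = ad, gain
        if best_ad is None:  # no remaining ad adds any value
            break
        chosen.append(best_ad)
    return chosen


# Hypothetical ads mapped to the user interests each one satisfies.
ads = {
    "shoes_a": {"running", "trail"},
    "shoes_b": {"running", "trail"},  # redundant with shoes_a
    "watch": {"fitness"},
}

print(greedy_select(ads, 2))  # ['shoes_a', 'watch']
```

The second pick skips `shoes_b` even though it is as good as `shoes_a` in isolation, because its marginal gain after `shoes_a` is zero.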

Click Behavior Analysis

  • Inferring Search Behaviors Using Partially Observable Markov Model with Duration (POMD)
    Author: Yin He, Samuel.HY@gmail.com
    Abstract:
    This paper presents the Partially Observable Markov model with Duration (POMD), a statistical method that addresses the challenge of understanding sophisticated user behaviors from a search log in which some user actions, such as reading and skipping search results, cannot be observed and recorded. POMD utilizes not only the positional but also the temporal information of the clicks in the log. The authors treat user engagement with a search engine as a Markov process and model the unobservable engagements as hidden states.
  • A Novel Click Model and Its Applications to Online Advertising
    Author: Zeyuan Allen Zhu, zhuzeyuan@hotmail.com
    Abstract:
    Recent advances in click models have positioned them as an attractive method for representing user preferences in web search and online advertising. Yet most existing work focuses on training the click model for individual queries and cannot accurately model tail queries due to the lack of training data. In addition, most existing work considers only the query, URL, and position, neglecting other important attributes in click log data, such as the local time; the click-through rate clearly differs between daytime and midnight. In this paper, we propose a novel click model based on a Bayesian network, which is capable of modeling tail queries because it builds the click model on attribute values, with those values being shared across queries. We call our model the General Click Model (GCM), as we found that most existing models can be treated as special cases of GCM by assigning different parameters. Experimental results on a large-scale commercial advertisement dataset show that GCM significantly and consistently leads to better results than state-of-the-art methods.
  • Beyond DCG: User Behavior as a Predictor of a Successful Search
    Author: Ahmed Hassan, hassanam@umich.edu
    Abstract:
    Web search engines are traditionally evaluated in terms of the relevance of web pages to individual queries. However, relevance of web pages does not tell the complete picture, since an individual query may represent only a piece of the user’s information need and users may have different information needs underlying the same queries. We address the problem of predicting user search goal success by modeling user behavior. We show empirically that user behavior alone can give an accurate picture of the success of the user’s web search goals, without considering the relevance of the documents displayed.
  • A Dynamic Bayesian Network Click Model for Web Search Ranking
    Author: Olivier Chapelle, chap@yahoo-inc.com
    Abstract:
    As with any application of machine learning, web search ranking requires labeled data. The labels usually come in the form of relevance assessments made by editors. Click logs can also provide an important source of implicit feedback and can be used as a cheap proxy for editorial labels. The main difficulty, however, comes from the so-called position bias: URLs appearing in lower positions are less likely to be clicked even if they are relevant. In this paper, we propose a Dynamic Bayesian Network which aims at providing us with unbiased estimation of the relevance from the click logs. Experiments show that the proposed click model outperforms other existing click models in predicting both click-through rate and relevance.
  • Click Chain Model in Web Search
    Author: Fan Guo, fanguo@cs.cmu.edu
    Abstract:
    It is commonly believed that web search click logs are a gold mine for search business, because they reflect users’ preference over web documents presented by the search engine. Click models provide a principled approach to inferring user-perceived relevance of web documents, which can be leveraged in numerous applications in search businesses. We present the click chain model (CCM), which is based on a solid, Bayesian framework. It is both scalable and incremental, perfectly meeting the computational challenges imposed by the voluminous click logs that constantly grow.
  • How Much Can Behavioral Targeting Help Online Advertising
    Author: Jun Yan, junyan@microsoft.com
    Abstract:
    Behavioral Targeting (BT) is a technique used by online advertisers to increase the effectiveness of their campaigns, and is playing an increasingly important role in the online advertising market. In this paper, we provide an empirical study on the click-through log of advertisements collected from a commercial search engine. From the experiment results over a period of seven days, we draw three important conclusions: (1) Users who clicked the same ad truly have similar behaviors on the Web; (2) The Click-Through Rate (CTR) of an ad can be improved by as much as 670% on average by properly segmenting users for behaviorally targeted advertising in sponsored search; (3) Using short-term user behaviors to represent users is more effective than using long-term user behaviors for BT.
  • Spatio-Temporal Models for Estimating Click-through Rate
    Author: Deepak Agarwal, dagarwal@yahoo-inc.com
    Abstract:
    We propose novel spatio-temporal models to estimate click-through rates in the context of content recommendation. We track article CTR at a fixed location over time through a dynamic Gamma-Poisson model and combine information from correlated locations through dynamic linear regressions, significantly improving on a per-location model. Our models adjust for user fatigue through an exponential tilt to the first-view CTR (probability of click on first article exposure) that is based only on user-specific repeat-exposure features. We illustrate our approach on data obtained from a module (Today Module) published regularly on Yahoo! Front Page and demonstrate significant improvement over commonly used baseline methods.
  • Adaptation of Offline Vertical Selection Predictions in the Presence of User Feedback
    Author: Fernando Diaz, diazf@yahoo-inc.com
    Abstract:
    Web search results often integrate content from specialized corpora known as verticals. Given a query, one important aspect of aggregated search is the selection of relevant verticals from a set of candidate verticals. One drawback to previous approaches to vertical selection is that methods have not explicitly modeled user feedback. However, production search systems often record a variety of feedback information. In this paper, we present algorithms for vertical selection which adapt to user feedback. We evaluate algorithms using a novel simulator which models performance of a vertical selector situated in realistic query traffic.
  • Good Abandonment in Mobile and PC Internet Search
    Author: Jane Li, janeli@google.com
    Abstract:
    Query abandonment by search engine users is generally considered to be a negative signal. In this paper, we explore the concept of good abandonment. We define a good abandonment as an abandoned query for which the user’s information need was successfully addressed by the search results page, with no need to click on a result or refine the query. We present an analysis of abandoned internet search queries across two modalities (PC and mobile) in three locales. The goal is to approximate the prevalence of good abandonment, and to identify types of information needs that may lead to good abandonment, across different locales and modalities. Our findings imply that it is a mistake to uniformly consider query abandonment as a negative signal. Further, there is a potential opportunity for search engines to drive additional good abandonment, especially for mobile search users, by improving search features and result snippets.
  • When More Is Less: The Paradox of Choice in Search Engine Use
    Author: Antti Oulasvirta, Helsinki University of Technology TKK
    Abstract:
    In numerous everyday domains, it has been demonstrated that increasing the number of options beyond a handful can lead to paralysis and poor choice and decrease satisfaction with the choice. Were this so-called paradox of choice to hold in search engine use, it would mean that increasing recall can actually work counter to user satisfaction if it implies choice from a more extensive set of result items. The existence of this effect was demonstrated in an experiment where users (N=24) were shown a search scenario and a query and were required to choose the best result item within 30 seconds. Having to choose from six results yielded both higher subjective satisfaction with the choice and greater confidence in its correctness than when there were 24 items on the results page. We discuss this finding in the wider context of “choice architecture”, that is, how result presentation affects choice and satisfaction.
  • Large-Scale Behavioral Targeting
    Author: Ye Chen, yechen1@ebay.com
    Abstract:
    Behavioral targeting (BT) leverages historical user behavior to select the ads most relevant to users to display. The state of the art in BT derives a linear Poisson regression model from fine-grained user behavioral data and predicts click-through rate (CTR) from user history. We designed and implemented a highly scalable and efficient solution to BT using the Hadoop MapReduce framework. With our parallel algorithm and the resulting system, we can build more than 450 BT-category models from Yahoo’s entire user base within one day, a scale that one cannot even imagine with prior systems. Moreover, our approach has yielded a 20% CTR lift over the existing production system by leveraging the well-grounded probabilistic model fitted from a much larger training dataset.
  • Predicting Bounce Rates in Sponsored Search Advertisements
    Author: D. Sculley, dsculley@google.com
    Abstract:
    This paper explores an important and relatively unstudied quality measure of a sponsored search advertisement: bounce rate. The bounce rate of an ad can be informally defined as the fraction of users who click on the ad but almost immediately move on to other tasks. A high bounce rate can lead to poor advertiser return on investment, and suggests search engine users may be having a poor experience following the click. In this paper, we first provide quantitative analysis showing that bounce rate is an effective measure of user satisfaction. We then address the question, can we predict bounce rate by analyzing the features of the advertisement? An affirmative answer would allow advertisers and search engines to predict the effectiveness and quality of advertisements before they are shown. We propose solutions to this problem involving large-scale learning methods that leverage features drawn from ad creatives in addition to their keywords and landing pages.
  • pSkip: Estimating Relevance Ranking Quality from Web Search Clickthrough Data
    Author: Kuansan Wang, Microsoft Corporation
    Abstract:
    In this article, we report our efforts in mining the information encoded as clickthrough data in the server logs to evaluate and monitor the relevance ranking quality of a commercial web search engine. We describe a metric called pSkip that aims to quantify the ranking quality by estimating the probability of users encountering non relevant results that cost them the efforts to read and skip. A search engine with a lower pSkip is regarded as having a better ranking quality. A key design goal of pSkip is to integrate the findings from two sets of user studies that utilize eye-tracking devices to track users’ browsing patterns on the search result pages, and that use specially instrumented browsers to actively solicit users’ explicit judgments on their search activities.
  • Efficient Multiple-Click Models in Web Search
    Author: Fan Guo, fanguo@cs.cmu.edu
    Abstract:
    Many tasks that leverage web search users’ implicit feedback rely on a proper and unbiased interpretation of user clicks. Previous eye-tracking experiments and studies on explaining position-bias of user clicks provide a spectrum of hypotheses and models on how an average user examines and possibly clicks web documents returned by a search engine with respect to the submitted query. In this paper, we attempt to close the gap between previous work, which studied how to model a single click, and the reality that multiple clicks on web documents in a single result page are not uncommon. Specifically, we present two multiple-click models: the independent click model (ICM) which is reformulated from previous work, and the dependent click model (DCM) which takes into consideration dependencies between multiple clicks.
  • Tailoring Click Models to User Goals
    Author: Fan Guo, fanguo@cs.cmu.edu
    Abstract:
    Click models provide a principled way of understanding user interaction with web search results in a query session and a statistical tool for leveraging search engine click logs to analyze and improve user experience. In this paper, we present how to tailor click models to user goals in web search through query term classification. We demonstrate that better predictive power could be achieved by fitting two click models for navigational queries and informational queries respectively, as evidenced by the likelihood and perplexity evaluation results on a subset of the MSN 2006 RFP data which consists of 121,179 distinct query terms and over 2.8 million query sessions.
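
Several entries above deal with position bias, where results in lower positions are clicked less often even when they are relevant. The sketch below illustrates the effect under the basic cascade assumption: the user scans from the top, clicks the first result found relevant, and then stops. It is a deliberately simple single-click model for illustration only, and is not the DBN, CCM, ICM, or DCM model from the listed papers; the relevance values are hypothetical.

```python
def cascade_click_probs(relevance):
    """Click probability at each rank under a basic cascade assumption.

    relevance[i] is the probability that the result at rank i satisfies
    the user once it has been examined.
    """
    probs = []
    p_reach = 1.0  # probability that the user examines the current rank
    for r in relevance:
        probs.append(p_reach * r)
        p_reach *= 1.0 - r  # the user continues only if this result was skipped
    return probs


# Three equally relevant results (hypothetical values).
print(cascade_click_probs([0.5, 0.5, 0.5]))  # [0.5, 0.25, 0.125]
```

Although all three results are equally relevant, the expected click probability halves at each rank, which is why raw click counts cannot be read directly as relevance.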

Relevance Improvement

  • Improving Ad Relevance in Sponsored Search
    Author: Dustin Hillard, dhillard@yahoo-inc.com
    Abstract:
    We describe a machine learning approach for predicting sponsored search ad relevance. Our baseline model incorporates basic features of text overlap and we then extend the model to learn from past user clicks on advertisements. We present a novel approach using translation models to learn user click propensity from sparse click logs. Our relevance predictions are then applied to multiple sponsored search applications in both offline editorial evaluations and live online user tests. The predicted relevance score is used to improve the quality of the search page in three areas: filtering low quality ads, more accurate ranking for ads, and optimized page placement of ads to reduce prominent placement of low relevance ads. We show significant gains across all three tasks.
  • A Search-based Method for Forecasting Ad Impression in Contextual Advertising
    Author: Xuerui Wang, marcusf@yahoo-inc.com
    Abstract:
    In this paper, we address the problem of forecasting the number of impressions for new or changed ads in the system. Producing such forecasts, even within large margins of error, is quite challenging: 1) ad selection in contextual advertising is a complicated process based on tens or even hundreds of page and ad features; 2) the publishers’ content and traffic vary over time; and 3) the scale of the problem is daunting: over a course of a week it involves billions of impressions, hundreds of millions of distinct pages, hundreds of millions of ads, and varying bids of other competing advertisers. We tackle these complexities by simulating the presence of a given ad with its associated bid over weeks of historical data.
  • Smoothing Clickthrough Data for Web Search Ranking
    Author: Jianfeng Gao, jfgao@microsoft.com
    Abstract:
    Incorporating features extracted from clickthrough data (called clickthrough features) has been demonstrated to significantly improve the performance of ranking models for Web search applications. Such benefits, however, are severely limited by the data sparseness problem, i.e., many queries and documents have no or very few clicks. The ranker thus cannot rely strongly on clickthrough features for document ranking. This paper presents two smoothing methods to expand clickthrough data: query clustering via Random Walk on click graphs and a discounting method inspired by the Good-Turing estimator. Experimental results show that the ranking models trained on smoothed clickthrough features consistently outperform those trained on unsmoothed features. This study demonstrates both the importance and the benefits of dealing with the sparseness problem in clickthrough data.
  • Translating Relevance Scores to Probabilities for Contextual Advertising
    Author: Deepak Agarwal, dagarwal@yahoo-inc.com
    Abstract:
    Information retrieval systems conventionally assess document relevance using the bag of words model. Consequently, relevance scores of documents retrieved for different queries are often difficult to compare, as they are computed on different (or even disjoint) sets of textual features. Many tasks, such as federation of search results or global thresholding of relevance scores, require that scores be globally comparable. To achieve this aim, we propose methods for non-monotonic transformation of relevance scores into probabilities for a contextual advertising selection engine that uses a vector space model. The calibration of the raw scores is based on historical click data.
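
The last entry above calibrates raw relevance scores into probabilities using historical click data. One simple way to picture such a calibration is histogram binning: group past impressions by score range and use the observed click rate in each range as the probability. The sketch below assumes scores in the range 0 to 1 and equal-width bins; it is a generic illustration, not the method proposed in that paper, and all names and data are hypothetical.

```python
def fit_binned_calibration(scores, clicks, n_bins=4):
    """Empirical click rate per equal-width score bin (scores assumed in [0, 1])."""
    shown = [0] * n_bins
    clicked = [0] * n_bins
    for s, c in zip(scores, clicks):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 falls in the last bin
        shown[b] += 1
        clicked[b] += c
    # Bins without data get None rather than a guessed probability.
    return [clicked[b] / shown[b] if shown[b] else None for b in range(n_bins)]


def calibrate(score, table):
    """Look up the calibrated click probability for a raw score."""
    n_bins = len(table)
    return table[min(int(score * n_bins), n_bins - 1)]


# Hypothetical historical impressions: raw score and whether the ad was clicked.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.9, 0.95]
clicks = [0, 0, 0, 1, 1, 0, 1, 1]

table = fit_binned_calibration(scores, clicks)
print(table)                   # [0.0, 0.5, 0.5, 1.0]
print(calibrate(0.65, table))  # 0.5
```

Nothing forces the binned table to increase with the raw score, so a mapping of this kind can be non-monotonic when the click data calls for it.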

Bid Phrases and Keyword Auctions

  • Adaptive Weighing Designs for Keyword Value Computation
    Author: John W. Byers, byers@cs.bu.edu
    Abstract:
    We introduce the channelization problem: how do we adaptively assign keywords to channels over the course of multiple days to quickly obtain accurate value-per-click (VPC) estimates of all keywords? We relate this problem to classical results in weighing design, devise new adaptive algorithms for this problem, and quantify the performance of these algorithms experimentally. Our results demonstrate that adaptive weighing designs that exploit statistics of term frequency, variability in VPCs across keywords, and flexible channel assignments over time provide the best estimators of keyword VPCs.
  • Automatic Generation of Bid Phrases for Online Advertising
    Author: Sujith Ravi, sravi@isi.edu
    Abstract:
    Our study aims towards the automatic construction of online ad campaigns: given a landing page, we propose several algorithmic methods to generate bid phrases suitable for the given input. Such phrases must be both relevant (that is, reflect the content of the page) and well-formed (that is, likely to be used as queries to a Web search engine). To this end, we use a two phase approach. First, candidate bid phrases are generated by a number of methods, including a (monolingual) translation model capable of generating phrases not contained within the text of the input as well as previously “unseen” phrases. Second, the candidates are ranked in a probabilistic framework using both the translation model, which favors relevant phrases, as well as a bid phrase language model, which favors well-formed phrases.
  • Advertising Keyword Generation Using Active Learning
    Author: Hao Wu, haowu@zju.edu.cn
    Abstract:
    This paper proposes an efficient relevance feedback based interactive model for keyword generation in sponsored search advertising. We formulate the ranking of relevant terms as a supervised learning problem and suggest new terms for the seed by leveraging user relevance feedback information. Active learning is employed to select the most informative samples from a set of candidate terms for user labeling. Experiments show our approach improves the relevance of generated terms significantly with little user effort required.
  • Competitive Analysis from Click-Through Log
    Author: Gang Wang, gawa@microsoft.com
    Abstract:
    Existing keyword suggestion tools from various search engine companies can automatically suggest keywords related to advertisers’ products or services, taking into account simple keyword statistics such as search volume and cost per click (CPC). However, the nature of the Generalized Second Price auction suggests that a better understanding of competitors’ keyword selection and bidding strategies helps more in winning the auction than relying only on general search statistics. In this paper, we propose a novel keyword suggestion strategy, called Competitive Analysis, to explore the keyword-based competition relationships among advertisers and eventually help advertisers to build campaigns with better performance.
  • General Auction Mechanism for Search Advertising
    Author: Gagan Aggarwal, gagana@google.com
    Abstract:
    In sponsored search, a number of advertising slots are available on a search results page and have to be allocated among a set of advertisers competing to display an ad on the page. In this paper, we model advertising auctions in terms of an assignment model with linear utilities, extended with bidder and item specific maximum and minimum prices. Auction mechanisms like the commonly used GSP or the well-known Vickrey-Clarke-Groves (VCG) can be interpreted as simply computing a bidder-optimal stable matching in this model, for a suitably defined set of bidder preferences, but our model includes much richer bidders and preferences. Our main technical contributions are the existence of bidder-optimal matchings and the strategyproofness of the resulting mechanism, both proved by induction on the progress of the matching algorithm.
  • Hybrid Keyword Search Auctions
    Author: Ashish Goel, ashishg@stanford.edu
    Abstract:
    Search auctions have become a dominant source of revenue generation on the Internet. Such auctions have typically used per-click bidding and pricing. We propose the use of hybrid auctions where an advertiser can make a per-impression as well as a per-click bid, and the auctioneer then chooses one of the two as the pricing mechanism. We assume that the advertiser and the auctioneer both have separate beliefs (called priors) on the click-probability of an advertisement. We first prove that the hybrid auction is truthful, assuming that the advertisers are risk-neutral. We then show that this auction is superior to the existing per-click auction in multiple ways. As Internet commerce matures, we need more sophisticated pricing models to exploit all the information held by each of the participants. We believe that hybrid auctions could be an important step in this direction.
  • Towards Intent-Driven Bidterm Suggestion
    Author: William Chang, wchang@isi.edu
    Abstract:
    In online advertising, pervasive in commercial search engines, advertisers typically bid on few terms, and the scarcity of data makes ad matching difficult. Suggesting additional bidterms can significantly improve ad clickability and conversion rates. In this paper, we present a large-scale bidterm suggestion system that models an advertiser’s intent and finds new bidterms consistent with that intent. Preliminary experiments show that our system significantly increases the coverage of a state-of-the-art production system used at Yahoo while maintaining comparable precision.
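
The hybrid auction described in the "Hybrid Keyword Search Auctions" abstract above can be illustrated with a small sketch. The abstract does not spell out how the auctioneer chooses between the two bids, so the rule used here (take the larger of the per-impression bid and the per-click bid multiplied by the auctioneer's own click-probability estimate) is an assumption made for illustration. The names `HybridBid`, `choose_pricing`, and `pick_winner` are hypothetical, and payment computation is left out.

```python
from dataclasses import dataclass


@dataclass
class HybridBid:
    advertiser: str
    per_impression: float  # amount offered each time the ad is shown
    per_click: float       # amount offered each time the ad is clicked


def choose_pricing(bid, auctioneer_ctr):
    """Return (expected value per impression, pricing mode) for one bid.

    Assumed rule: compare the per-impression bid with the per-click bid
    scaled by the auctioneer's click-probability estimate, keep the larger.
    """
    expected_click_value = bid.per_click * auctioneer_ctr
    if bid.per_impression >= expected_click_value:
        return bid.per_impression, "per-impression"
    return expected_click_value, "per-click"


def pick_winner(bids, ctr_estimates):
    """Pick the bid with the highest expected value for a single slot."""
    best = None
    for bid in bids:
        value, mode = choose_pricing(bid, ctr_estimates[bid.advertiser])
        if best is None or value > best[1]:
            best = (bid.advertiser, value, mode)
    return best


bids = [
    HybridBid("A", per_impression=0.05, per_click=1.0),
    HybridBid("B", per_impression=0.00, per_click=2.0),
]
ctr_estimates = {"A": 0.02, "B": 0.04}  # auctioneer's priors (made-up values)
winner = pick_winner(bids, ctr_estimates)
```

With these made-up numbers, advertiser A is valued through its per-impression bid and advertiser B through its per-click bid, which shows how one auction can mix both pricing modes.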

Presentation Bias Analysis

  • Beyond Position Bias: Examining Result Attractiveness as a Source of Presentation Bias in Clickthrough Data
    Author: Yisong Yue, yyue@cs.cornell.edu
    Abstract:
    In this paper, we examine result summary attractiveness as a potential source of presentation bias. This study distinguishes itself from prior work by aiming to detect systematic biases in click behavior due to attractive summaries inflating perceived relevance. Our experiments conducted on the Google web search engine show substantial evidence of presentation bias in clicks towards results with more attractive titles.

Personalized Click Prediction

  • Matchbox: Large Scale Online Bayesian Recommendations
    Author: David Stern, dstern@microsoft.com
    Abstract:
    We present a probabilistic model for generating personalised recommendations of items to users of a web service. The Matchbox system makes use of content information in the form of user and item meta data in combination with collaborative filtering information from previous user behavior in order to predict the value of an item for a user. Users and items are represented by feature vectors which are mapped into a low-dimensional ‘trait space’ in which similarity is measured in terms of inner products. The model can be trained from different types of feedback in order to learn user-item preferences.
  • Discovering and Using Groups to Improve Personalized Search
    Author: Jaime Teevan, teevan@microsoft.com
    Abstract:
    Personalized Web search takes advantage of information about an individual to identify the most relevant results for that person. To better understand whether groups of people can be used to benefit personalized search, we explore the similarity of query selection, desktop information, and explicit relevance judgments across people grouped in different ways. The groupings we explore fall along two dimensions: the longevity of the group members’ relationship, and how explicitly the group is formed. We find that some groupings provide valuable insight into what members consider relevant to queries related to the group focus, but that it can be difficult to identify valuable groups implicitly. Building on these findings, we explore an algorithm to "groupize" (versus "personalize") Web search results that leads to a significant improvement in result ranking on group-relevant queries.
  • Personalized Click Prediction in Sponsored Search
    Author: Haibin Cheng, hcheng@yahoo-inc.com
    Abstract:
    The objective of this paper is to present a framework for the personalization of click models in sponsored search. We develop user-specific and demographic-based features that reflect the click behavior of individuals and groups. The features are based on observations of search and click behaviors of a large number of users of a commercial search engine. We add these features to a baseline non-personalized click model and perform experiments on offline test sets derived from user logs as well as on live traffic. Our results demonstrate that the personalized models significantly improve the accuracy of click prediction.
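
The "trait space" idea in the Matchbox abstract above can be sketched as follows. This is a minimal illustration, not the Matchbox model itself: it assumes the mapping from feature vectors to the trait space is a fixed linear map, whereas the paper describes a probabilistic model whose parameters are learned from feedback. The matrices `user_map` and `item_map`, the feature values, and the function names are all hypothetical.

```python
def to_trait_space(mapping, features):
    """Project a feature vector into trait space with a linear map.

    `mapping` is a list of rows, one row per trait dimension.
    """
    return [sum(w * f for w, f in zip(row, features)) for row in mapping]


def affinity(user_map, user_features, item_map, item_features):
    """Score a user-item pair as the inner product of their trait vectors."""
    user_traits = to_trait_space(user_map, user_features)
    item_traits = to_trait_space(item_map, item_features)
    return sum(u * i for u, i in zip(user_traits, item_traits))


# Two trait dimensions; two user features and three item features (made up).
user_map = [[0.5, 0.1],
            [0.2, 0.3]]
item_map = [[1.0, 0.0, 2.0],
            [0.0, 1.0, 1.0]]

user_features = [1.0, 0.0]       # e.g. one-hot metadata for a user
item_features = [0.0, 1.0, 1.0]  # e.g. metadata flags for an item

score = affinity(user_map, user_features, item_map, item_features)
```

A higher score means the user and item point in similar directions in the low-dimensional space; in a learned model the two maps would be fitted to observed feedback instead of being written by hand.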