Learning User Behaviors for Advertisements Click Prediction

Chieh-Jen Wang & Hsin-Hsi Chen

National Taiwan University

Taipei, Taiwan

Introduction
  • The commercial value of advertisements on the web depends on whether users click on the advertisements
  • Predicting potential advertisement clicks of users before target advertisements are displayed is important
    • advertisement recommendation
    • advertisement placement
    • presentation pricing
  • Problem specification
    • Given a current search session (q1, q2, ..., q(i-1)), we will predict if there is an ad click event when query qi is submitted.
Related Work
  • Advertisement click prediction model
    • Feature representation
      • text features (Richardson et al., 2007)
      • demographics features (Cheng & Cantú-Paz, 2010)
      • mouse trajectory features (Guo & Agichtein, 2010)
    • Machine learning algorithm
      • logistic regression (Richardson, Dominowska, & Ragno, 2007)
      • maximum entropy (Cheng & Cantú-Paz, 2010)
      • support vector machines (Broder et al., 2008)
      • conditional random field (Guo & Agichtein, 2010)
Related Work
  • User search intent
    • navigational, informational and transactional (Broder, 2002)
    • noncommercial/commercial & navigational/informational (Ashkan et al., 2009)
    • research & purchase (Guo & Agichtein, 2010)
    • receptive & not receptive (Guo & Agichtein, 2010)
      • “receptive” (i.e., an advertisement click is expected in a future search within the current session)
      • “not receptive” (i.e., no future advertisement clicks are expected within the current session)
Microsoft AdCenter Logs
  • Time: 2007-08-10 to 2007-11-01 (84 days)
  • The Microsoft AdCenter logs include:
    • 101 million impressions
    • 7.82 million clicks
    • 40.6 million sessions (5.06 million sessions contain at least one click)
  • An impression is defined as a single search results page described by a set of attributes
  • A session is defined as repeated search engine usage at intervals of 10 minutes or less, with a total session length of no more than 8 hours
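
The session definition above can be sketched as follows. This is only an illustration, not the authors' implementation: the function name `split_sessions`, the input format (one user's impression timestamps in seconds, already sorted), and the treatment of a gap of exactly 10 minutes as still inside the session are all assumptions.

```python
from typing import List

MAX_GAP_SECONDS = 10 * 60          # a gap longer than 10 minutes starts a new session
MAX_SESSION_SECONDS = 8 * 60 * 60  # a session may not span more than 8 hours


def split_sessions(timestamps: List[float]) -> List[List[float]]:
    """Group one user's sorted impression timestamps (seconds) into sessions."""
    sessions: List[List[float]] = []
    for t in timestamps:
        if sessions:
            current = sessions[-1]
            within_gap = t - current[-1] <= MAX_GAP_SECONDS
            within_cap = t - current[0] <= MAX_SESSION_SECONDS
            if within_gap and within_cap:
                current.append(t)
                continue
        # Either this is the first impression or one of the limits was exceeded.
        sessions.append([t])
    return sessions
```

With this sketch, four impressions at 0, 300, 900 and 2000 seconds form two sessions, because the last gap (1100 seconds) exceeds 10 minutes.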
Data Purification
  • For promotional purposes, some queries are issued and some advertisements are clicked by software robots
  • Filter criteria
    • issuing queries more than 7 times in any 10-second interval
    • issuing queries from two distinct places at the same time
    • clicking an advertisement more than once in any 5-second interval
    • duplicated impression IDs
  • Data partition
    • Training: sessions which contain at least one advertisement click in the first 56 days
    • Testing: sessions in the last 28 days
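
The two rate-based filter criteria and the data partition can be sketched as below. This is a hedged illustration rather than the authors' code: the helper names (`exceeds_rate`, `looks_like_robot`, `partition`), the inclusive handling of interval boundaries, and the assignment of a session to a partition by its start date are assumptions. The location and duplicate-ID criteria are not covered.

```python
from datetime import date
from typing import Optional, Sequence

LOG_START = date(2007, 8, 10)  # first day of the log period stated on the slides
TRAINING_DAYS = 56             # first 56 days are training, last 28 are testing


def exceeds_rate(timestamps: Sequence[float], max_events: int, window: float) -> bool:
    """True if more than `max_events` sorted timestamps fall within `window` seconds."""
    start = 0
    for end, t in enumerate(timestamps):
        # Advance the left edge until the window spans at most `window` seconds.
        while t - timestamps[start] > window:
            start += 1
        if end - start + 1 > max_events:
            return True
    return False


def looks_like_robot(query_times: Sequence[float], click_times: Sequence[float]) -> bool:
    """Apply the query-rate and click-rate criteria (times in seconds, sorted)."""
    too_many_queries = exceeds_rate(query_times, 7, 10.0)
    too_many_clicks = exceeds_rate(click_times, 1, 5.0)
    return too_many_queries or too_many_clicks


def partition(session_date: date, has_click: bool) -> Optional[str]:
    """Return "train", "test", or None for sessions that are not used."""
    day_index = (session_date - LOG_START).days
    if day_index < TRAINING_DAYS:
        # Only sessions with at least one ad click are kept for training.
        return "train" if has_click else None
    return "test"
```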
Feature Extraction
  • Feature representation
    • Every impression qi (1 ≤ i ≤ n) in session s = (q1, q2, ..., q(i-1), qi, q(i+1), ..., qn) is represented as a feature vector
    • qi itself (Current Impression Level)
    • the first impression q1 (First Impression Level)
    • the n-th previous impression q(i-n) (Previous n Impression Level)
    • all the contextual impressions q1, q2, ..., q(i-1) in s (Contextual Impression Level)
  • Labeling
    • click if impression qi contains at least one advertisement click, otherwise non-click.
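
A minimal sketch of how the four impression levels and the label could be read off a session. The record layout (a dict with hypothetical `query` and `ad_clicks` keys), the 0-based index, and the function names are assumptions made for illustration; the slides do not specify a data structure.

```python
from typing import Any, Dict, List

Impression = Dict[str, Any]  # hypothetical record, e.g. {"query": "...", "ad_clicks": 0}


def impression_levels(session: List[Impression], i: int, n: int = 1) -> Dict[str, Any]:
    """Collect the impressions each feature level draws on for position i (0-based)."""
    return {
        "current": session[i],                                # qi itself
        "first": session[0],                                  # q1
        "previous_n": session[i - n] if i - n >= 0 else None,  # q(i-n), if it exists
        "contextual": session[:i],                            # q1 ... q(i-1)
    }


def label(impression: Impression) -> str:
    """An impression is a click instance if it has at least one ad click."""
    return "click" if impression["ad_clicks"] >= 1 else "non-click"
```

Features would then be computed from each of the four views and concatenated into one vector per impression.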
Feature Extraction from Current Impression Level
  • These features aim to capture query information, users’ intent, and the similarity between the current query and the previous one
  • QC (query category)
    • 14 categories (excluding “Regional” and “World”) on the 2nd level of the Open Directory Project (ODP) ontology are used to represent query categories
  • QIntent (query intent)
    • 4,020 intent clusters are learned from an MSN Search Query Log excerpt (Wang et al., 2010)
    • QIntent is specified by the distribution of the top 100 similar intent clusters
Feature Extraction from First Impression Level
  • These features aim to capture an initial search goal of a session.
Feature Extraction from Previous n Impression Level
  • These features aim to capture the advertisement click information of the n-th previous impression.
  • In our experiments, n is set to 1 and 2
Feature Extraction from Contextual Impression Level
  • These features represent a sequence of users’ behaviors
  • Weight of intent types of submitted queries (CTQIntent) and clicked advertisements (CTAdIntent) in the access history is defined as:
    • Pm is the probability of intent type m
    • wj denotes a query or a clicked advertisement in qj
  • Weight of ODP categories (CTQC & CTAdC)

Jelinek-Mercer smoothing
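
Jelinek-Mercer smoothing in its general form interpolates a maximum-likelihood estimate with a background distribution. The sketch below shows only that general form; it is not the weighting equation from the slide. The function name, the background distribution, and the interpolation weight `lam = 0.5` are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable


def jelinek_mercer(
    observed: Iterable[str],
    background: Dict[str, float],
    lam: float = 0.5,
) -> Dict[str, float]:
    """Smooth the type distribution seen in a session history.

    P(m) = (1 - lam) * P_ml(m | history) + lam * P(m | background)

    Assumes `background` covers every type that can occur in `observed`.
    """
    counts = Counter(observed)
    total = sum(counts.values())
    smoothed: Dict[str, float] = {}
    for m, p_bg in background.items():
        # Maximum-likelihood estimate from the history (0 if the history is empty).
        p_ml = counts[m] / total if total else 0.0
        smoothed[m] = (1.0 - lam) * p_ml + lam * p_bg
    return smoothed
```

The interpolation keeps every type at a non-zero weight even when it has not yet appeared in the session, which matters early in a session when the history is short.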

Click Prediction Model
  • Four learning algorithms
    • Conditional Random Fields (CRF)
    • Support Vector Machine (SVM)
      • kernel function (RBF, linear kernel)
      • parameter optimization (grid search over c and g)
    • Decision Tree
      • C4.5 Tree
    • Back-Propagation Neural Networks
      • Hidden Layer = 2
      • Learning rate = 0.8
      • Momentum = 0.2
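
The SVM parameter optimization over c and g can be sketched as an exhaustive search on an exponential grid. The exponent ranges and the `score` callback (for example, cross-validated F1 of an SVM trained with the given c and g) are assumptions; the slides do not state the grid or the selection criterion.

```python
from itertools import product
from typing import Callable, Iterable, Tuple


def grid_search(
    score: Callable[[float, float], float],
    log2c: Iterable[int] = range(-5, 16, 2),
    log2g: Iterable[int] = range(-15, 4, 2),
) -> Tuple[float, float]:
    """Return the (c, g) pair on the grid that maximizes `score(c, g)`."""
    best = max(
        product(log2c, log2g),
        key=lambda pair: score(2.0 ** pair[0], 2.0 ** pair[1]),
    )
    return 2.0 ** best[0], 2.0 ** best[1]
```

In practice `score` would train and validate a model for each pair, so the cost grows with the number of grid points.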
Feature Selection Algorithm
  • Random Subspace Method (RS)
    • an ensemble classifier that consists of several classifiers
    • prediction is through a majority vote from the classifiers
  • F-Score (FS) & Information Gain (IG)
    • greedy inclusion algorithm
    • retain a number of the best terms or features for use by the classifier
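
The random subspace idea (several classifiers, each trained on a random subset of the features, combined by majority vote) can be sketched as follows. The base learner here is a toy nearest-centroid classifier chosen only to keep the example self-contained; it is an assumption, as are the function names and the default values of `n_models` and `subspace_size`.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Classifier = Callable[[Vector], str]


def train_centroid(X: List[Vector], y: List[str], features: List[int]) -> Classifier:
    """Toy base learner: nearest class centroid, using only `features`."""
    counts = Counter(y)
    sums: Dict[str, List[float]] = {}
    for x, cls in zip(X, y):
        acc = sums.setdefault(cls, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += x[f]
    centroids = {cls: [s / counts[cls] for s in acc] for cls, acc in sums.items()}

    def predict(x: Vector) -> str:
        def sq_dist(cls: str) -> float:
            return sum((x[f] - centroids[cls][k]) ** 2 for k, f in enumerate(features))
        return min(centroids, key=sq_dist)

    return predict


def random_subspace(
    X: List[Vector], y: List[str],
    n_models: int = 5, subspace_size: int = 2, seed: int = 0,
) -> Classifier:
    """Train `n_models` base learners on random feature subsets; vote at prediction."""
    rng = random.Random(seed)
    n_features = len(X[0])
    models = [
        train_centroid(X, y, rng.sample(range(n_features), subspace_size))
        for _ in range(n_models)
    ]

    def predict(x: Vector) -> str:
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]  # majority vote

    return predict
```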
Performance of Advertisements Click Prediction
  • Metrics
    • accuracy (Acc), precision (Prec), recall (Rec), and F-measure (F1)
  • Baseline
    • Guessing the majority class (non-click) is one baseline.
    • A Markov Model (MM), formulated by query transitions, is the other.
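
The four metrics and the majority-class baseline can be computed as in this sketch. The function names and the choice to report 0.0 when a denominator is zero are assumptions.

```python
from typing import Dict, List


def evaluate(y_true: List[str], y_pred: List[str], positive: str = "click") -> Dict[str, float]:
    """Accuracy, precision, recall and F1, with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    acc = correct / len(pairs) if pairs else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"Acc": acc, "Prec": prec, "Rec": rec, "F1": f1}


def majority_baseline(y_true: List[str], majority: str = "non-click") -> List[str]:
    """Predict the majority class for every impression."""
    return [majority] * len(y_true)
```

Because the majority baseline never predicts a click, its precision, recall and F1 on the click class are all zero under this convention, whatever its accuracy.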
Conclusion and Future Work
  • We explore the effects of various intent-related features on advertisements click prediction
  • The CRF model performs significantly better than the two baselines and SVM
  • When the random subspace method is introduced for feature selection, the precision of click prediction increases from 0.1663 to 0.1721
  • In the future, we plan to expand our model to consider fine-grained user intent and user interactions
  • In addition, we will extend this approach to predict which advertisements will be clicked

Thank You

Q & A