Challenges in Computational Advertising



Challenges in Computational Advertising. Deepayan Chakrabarti (deepay@yahoo-inc.com). Online Advertising Overview. Pick ads. Ads. Advertisers. Ad Network. Content. User. Examples: Yahoo, Google, MSN, RightMedia , …. Content Provider. Advertising Setting. Sponsored Search. Display.

    Presentation Transcript
    1. Challenges in Computational Advertising. Deepayan Chakrabarti (deepay@yahoo-inc.com)

    2. Online Advertising Overview Pick ads Ads Advertisers Ad Network Content User Examples: Yahoo, Google, MSN, RightMedia, … Content Provider

    3. Advertising Setting Sponsored Search Display Content Match

    4. Advertising Setting Sponsored Search Display Content Match Pick ads

    5. Advertising Setting • Graphical display ads • Mostly for brand awareness • Revenue based on number of impressions (not clicks) Sponsored Search Display Content Match

    6. Advertising Setting Sponsored Search Display Content Match Content match ad

    7. Advertising Setting Sponsored Search Display Content Match Text ads Pick ads Match ads to the content

    8. Advertising Setting • The user intent is unclear • Revenue depends on number of clicks • Query (webpage) is long and noisy Sponsored Search Display Content Match

    9. Advertising Setting Sponsored Search Display Content Match Search Query Sponsored Search Ads

    10. This presentation • Content Match [KDD 2007]: • How can we estimate the click-through rate (CTR) of an ad on a page? • CTR for ad j on page i; ~10^9 pages, ~10^6 ads

    11. This presentation • Estimating CTR for Content Match [KDD ‘07] • Traffic Shaping for Display Advertising [EC ‘12] Display ads Article summary click Alternates

    12. This presentation • Estimating CTR for Content Match [KDD ‘07] • Traffic Shaping for Display Advertising [EC ‘12] • Recommend articles (not ads) • Need high CTR on article summaries • + Prefer articles on which under-delivering ads can be shown

    13. This presentation • Estimating CTR for Content Match [KDD ‘07] • Traffic Shaping for Display Advertising [EC ‘12] • Theoretical underpinnings [COLT ‘10 best student paper] • Represent relationships as a graph • Recommendation = Link Prediction • Many useful heuristics exist • Why do these heuristics work? • Goal: Suggest friends

    14. Estimating CTR for Content Match • Contextual Advertising • Show an ad on a webpage (“impression”) • Revenue is generated if a user clicks • Problem: Estimate the click-through rate (CTR) of an ad on a page • CTR for ad j on page i; ~10^9 pages, ~10^6 ads

    15. Estimating CTR for Content Match • Why not use the MLE? • Few (page, ad) pairs have N>0 • Very few have c>0 as well • MLE does not differentiate between 0/10 and 0/100 • We have additional information: hierarchies
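To make the point about 0/10 versus 0/100 concrete, here is a minimal sketch of the maximum-likelihood estimate (clicks divided by impressions). The function name `mle_ctr` is an illustrative assumption, not something from the talk.

```python
def mle_ctr(clicks, impressions):
    """Maximum-likelihood CTR estimate: clicks / impressions."""
    if impressions == 0:
        return None  # the estimate is undefined without any impressions
    return clicks / impressions


# Two (page, ad) pairs with very different amounts of evidence
# receive exactly the same estimate.
few = mle_ctr(0, 10)
many = mle_ctr(0, 100)
print(few, many)
```

Both calls return 0.0, so the MLE cannot express that 100 impressions without a click is stronger evidence of a low CTR than 10 impressions without a click.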

    16. Estimating CTR for Content Match • Use an existing, well-understood hierarchy • Categorize ads and webpages to leaves of the hierarchy • CTR estimates of siblings are correlated • The hierarchy allows us to aggregate data • Coarser resolutions • provide reliable estimates for rare events • which then influences estimation at finer resolutions
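The aggregation idea can be sketched as follows: counts observed at the leaves are summed at every ancestor, so coarser nodes accumulate enough impressions to give stable rates. This is only an illustration on a single toy hierarchy; the category names, the `aggregate` helper, and the counts are made up, and the talk itself works with regions formed from both a page hierarchy and an ad hierarchy.

```python
from collections import defaultdict


def aggregate(leaf_counts):
    """Sum (clicks, impressions) from each leaf into all of its ancestors.

    leaf_counts maps a leaf path (a tuple of category names, root first)
    to a (clicks, impressions) pair. The empty tuple denotes the root.
    """
    totals = defaultdict(lambda: [0, 0])
    for path, (clicks, impressions) in leaf_counts.items():
        # Every prefix of the path is an ancestor (or the leaf itself).
        for depth in range(len(path) + 1):
            node = path[:depth]
            totals[node][0] += clicks
            totals[node][1] += impressions
    return {node: tuple(counts) for node, counts in totals.items()}


# Hypothetical leaf-level data: most leaves have very few clicks.
leaf_counts = {
    ("sports", "golf"): (0, 40),
    ("sports", "tennis"): (3, 900),
    ("travel", "hotels"): (1, 60),
}
totals = aggregate(leaf_counts)
for node, (clicks, impressions) in sorted(totals.items()):
    print(node, clicks, impressions)
```

The golf leaf has no clicks on its own, but its parent node pools 940 impressions, which is the kind of coarser-resolution evidence the slide refers to.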

    17. Estimating CTR for Content Match Level 0 • Region= (page node, ad node) • Region Hierarchy • A cross-product of the page hierarchy and the ad hierarchy Level i Region Ad classes Page classes Page hierarchy Ad hierarchy

    18. Estimating CTR for Content Match • Our Approach • Data Transformation • Model • Model Fitting

    19. Data Transformation • Problem: • Solution: Freeman-Tukey transform • Differentiates regions with 0 clicks • Variance stabilization:
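The formula on this slide did not survive in the transcript. As a hedged illustration, the sketch below uses one common form of the Freeman-Tukey square-root transform applied to a rate; the exact scaling used in the talk is an assumption here, and the function name is hypothetical.

```python
import math


def freeman_tukey(clicks, impressions):
    """One common form of the Freeman-Tukey transform for a rate.

    Assumed form: 0.5 * (sqrt(c / N) + sqrt((c + 1) / N)).
    Requires impressions > 0.
    """
    return 0.5 * (math.sqrt(clicks / impressions)
                  + math.sqrt((clicks + 1) / impressions))


# Regions with zero clicks are no longer indistinguishable:
# fewer impressions leave more room for a higher underlying rate.
print(freeman_tukey(0, 10))
print(freeman_tukey(0, 100))
```

Under this form, a region with 0 clicks in 10 impressions maps to a larger value than one with 0 clicks in 100 impressions, which is the differentiation the slide mentions.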

    20. Model • Goal: Smoothing across siblings in hierarchy [Huang+Cressie/2000] • Each region has a latent state Sr • yr is independent of the hierarchy given Sr • Sr is drawn from its parent Spa(r) [Figure: tree with latent states S1 to S4 at level i+1 below Sparent at level i, and observables y1, y2, y4]

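    21. Model [Figure: graphical model. Spa(r) leads to Sr (variance, wr); Sr leads to yr (variance Vr); features ur enter yr with coefficient βr. The parent has the same structure with wpa(r), Vpa(r), βpa(r), upa(r), ypa(r).]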
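The diagram's labels suggest a linear Gaussian state-space model on the tree. One plausible way to write it, stated here as an assumption rather than as the authors' exact equations, is:

```latex
% Assumed reading of the slide's diagram, not a verbatim reconstruction.
\begin{align}
  y_r &= S_r + u_r^{\top}\beta_r + \varepsilon_r,
      & \varepsilon_r &\sim \mathcal{N}(0, V_r), \\
  S_r &= S_{\mathrm{pa}(r)} + w_r,
      & w_r &\sim \mathcal{N}(0, W_r).
\end{align}
```

Here y_r is the transformed observation for region r, S_r its latent state, u_r its features, and pa(r) its parent region; W_r denotes the variance of the deviation w_r.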

    22. Model • However, learning Wr, Vr and βr for each region is clearly infeasible • Assumptions: • All regions at the same level ℓ share the same W(ℓ) and β(ℓ) • Vr = V/Nr for some constant V, since

    23. Model • Implications: • determines degree of smoothing • : • Sr varies greatly from Spa(r) • Each region learns its own Sr • No smoothing • : • All Sr are identical • A regression model on features ur is learnt • Maximum smoothing

    24. Model • Implications: • determines degree of smoothing • Var(Sr) increases from root to leaf • Better estimates at coarser resolutions

    25. Model • Implications: • determines degree of smoothing • Var(Sr) increases from root to leaf • Correlations among siblings at level ℓ: • Depends only on level of least common ancestor [Figure: two pairs of nodes in the tree, with Corr( , ) > Corr( , )]

    26. Estimating CTR for Content Match • Our Approach • Data Transformation (Freeman-Tukey) • Model (Tree-structured Markov Chain) • Model Fitting

    27. Model Fitting • Fitting using a Kalman filtering algorithm • Filtering: Recursively aggregate data from leaves to root • Smoothing: Propagate information from root to leaves • Complexity: linear in the number of regions, for both time and space [Figure: filtering and smoothing passes]

    28. Model Fitting • Fitting using a Kalman filtering algorithm • Filtering: Recursively aggregate data from leaves to root • Smoothing: Propagate information from root to leaves • Kalman filter requires knowledge of β, V, and W • EM wrapped around the Kalman filter [Figure: filtering and smoothing passes]
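The two passes can be illustrated with a simplified sketch. It assumes scalar states, no covariates, a single known variance `W` for every parent-to-child step, and known observation variances; it omits the EM step entirely. It is exact Gaussian message passing on a tree, written in the spirit of the filtering and smoothing passes described above, not the authors' implementation. All names are hypothetical.

```python
from collections import defaultdict


def tree_smooth(parent, obs, W, m0=0.0, P0=1e6):
    """Posterior mean and variance of each node's latent state.

    parent: dict node -> parent node (the root maps to None).
    obs:    dict node -> (y, variance) for nodes that have data.
    W:      variance of (child state - parent state).
    m0, P0: prior mean and variance of the root state.
    """
    children = defaultdict(list)
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)

    # Order the nodes so that every parent precedes its children.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children[node])

    # Upward pass (leaves to root), in information form:
    # J is a precision, h is precision times mean.
    J, h, msg = {}, {}, {}
    for node in reversed(order):
        J[node], h[node] = 0.0, 0.0
        if node in obs:
            y, var = obs[node]
            J[node] += 1.0 / var
            h[node] += y / var
        for child in children[node]:
            J[node] += msg[child][0]
            h[node] += msg[child][1]
        # Evidence about this node, seen from its parent through noise W.
        scale = 1.0 + J[node] * W
        msg[node] = (J[node] / scale, h[node] / scale)

    # Downward pass (root to leaves): combine evidence from above and below.
    info = {root: (1.0 / P0 + J[root], m0 / P0 + h[root])}
    post = {}
    for node in order:
        Jn, hn = info[node]
        post[node] = (hn / Jn, 1.0 / Jn)
        for child in children[node]:
            # Parent's belief with this child's own message removed.
            Jc, hc = Jn - msg[child][0], hn - msg[child][1]
            prior_var = 1.0 / Jc + W
            prior_mean = hc / Jc
            info[child] = (1.0 / prior_var + J[child],
                           prior_mean / prior_var + h[child])
    return post


# Toy example: "a" has plenty of data, "b" has very little, "c" has none.
parent = {"root": None, "a": "root", "b": "root", "c": "root"}
obs = {"a": (0.2, 0.01), "b": (0.0, 1.0)}
post = tree_smooth(parent, obs, W=0.05)
for node in ("root", "a", "b", "c"):
    print(node, post[node])
```

In this toy run the estimate for "b" is pulled from its own observation of 0 toward the well-supported value at "a", and "c", which has no data, inherits the root's estimate with a larger variance. Each node is visited once per pass, which matches the linear complexity stated on the slide.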

    29. Experiments • 503M impressions • 7-level hierarchy, of which the top 3 levels were used • Zero clicks in • 76% of regions in level 2 • 95% of regions in level 3 • Full dataset DFULL, and a 2/3 sample DSAMPLE

    30. Experiments • Estimate CTRs for all regions R in level 3 with zero clicks in DSAMPLE • Some of these regions R>0 get clicks in DFULL • A good model should predict higher CTRs for R>0 as against the other regions in R

    31. Experiments • We compared 4 models • TS: our tree-structured model • LM (level-mean): each level smoothed independently • NS (no smoothing): CTR proportional to 1/Nr • Random: Assuming |R>0| is given, randomly predict the membership of R>0 out of R

    32. Experiments [Figure: comparison of TS, Random, LM, and NS]

    33. Experiments • MLE=0 everywhere, since 0 clicks were observed • What about estimated CTR? [Figure: estimated CTR vs. impressions, in two panels: No Smoothing (NS) and Our Model (TS). Annotations: variability from coarser resolutions; close to MLE for large N.]

    34. Estimating CTR for Content Match • We presented a method to estimate • rates of extremely rare events • at multiple resolutions • under severe sparsity constraints • Key points: • Tree-structured generative model • Extremely fast parameter fitting

    35. Traffic Shaping • Estimating CTR for Content Match [KDD ‘07] • Traffic Shaping for Display Advertising [EC ‘12] • Theoretical underpinnings [COLT ‘10 best student paper]

    36. Traffic Shaping • Which article summary should be picked? Ans: The one with highest expected CTR • Which ad should be displayed? Ans: The ad that minimizes underdelivery [Figure: article pool]

    37. Underdelivery • Advertisers are guaranteed some impressions (say, 1M) over some time (say, 2 months) • only to users matching their specs • only when they visit certain types of pages • only on certain positions on the page • An underdelivering ad is one that is likely to miss its guarantee
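As a rough illustration of what "likely to miss its guarantee" means, the sketch below compares the guarantee with impressions already delivered plus a forecast of eligible impressions still to come. The function name and the numbers are hypothetical; a real system would take the forecast from its traffic model and account for competing ads.

```python
def projected_underdelivery(guaranteed, delivered, forecast_remaining):
    """Impressions the ad is projected to fall short by (0 if on track)."""
    return max(0, guaranteed - (delivered + forecast_remaining))


# Hypothetical campaign: 1M guaranteed impressions, 400K delivered so far,
# and 450K more eligible impressions forecast before the deadline.
shortfall = projected_underdelivery(1_000_000, 400_000, 450_000)
print(shortfall)
```

With these made-up numbers the projected shortfall is 150,000 impressions, so the ad would count as underdelivering.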

    38. Underdelivery • How can underdelivery be computed? • Need user traffic forecasts • Depends on other ads in the system • An ad-serving system will try to minimize under-delivery on this graph [Figure: bipartite graph between supply sℓ (forecasted impressions ℓ: (user, article, position)) and demand dj (ad inventory j)]

    39. Traffic Shaping • Which article summary should be picked? Ans: The one with highest expected CTR • Which ad should be displayed? Ans: The ad that minimizes underdelivery • Goal: Combine the two

    40. Traffic Shaping • Goal: Bias the article summary selection to • reduce under-delivery • with only an insignificant drop in CTR • AND do this in real-time

    41. Outline • Formulation as an optimization problem • Real-time solution • Empirical results

    42. Formulation • k: (user), with supply sk • i: (user, article) • ℓ: (user, article, position), the “Fully Qualified Impression” • j: (ads), with demand dj • Traffic shaping fraction wki and CTR cki between k and i • Ad delivery fraction φℓj between ℓ and j • Goal: Infer traffic shaping fractions wki

    43. Formulation • Full traffic shaping graph: • All forecasted user traffic × all available articles • arriving at the homepage, • or directly on article page • Goal: Infer wki • But forced to infer φℓj as well [Figure: Full Traffic Shaping Graph with regions A, B, C, labeled with traffic shaping fraction wki, CTR cki, and ad delivery fraction φℓj]

    44. Formulation [Equation over the graph k, i, ℓ, j with terms sk, wki, cki. Annotations: underdelivery; demand; total user traffic flowing to j (accounting for CTR loss); (Satisfy demand constraints)]
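The slide's equation is not preserved, but its annotations suggest that the traffic reaching ad j is the supply scaled by the shaping fraction, the CTR, and the ad delivery fraction, and that underdelivery is the unmet part of the demand. The sketch below computes that quantity for fixed fractions. It is an assumed reading of the labels, with hypothetical names and toy numbers, and it only evaluates a given assignment rather than optimizing it.

```python
def underdelivery(supply, w, ctr, phi, demand):
    """Unmet demand per ad for fixed shaping and delivery fractions.

    supply: dict user k -> forecast visits s_k
    w:      dict (k, article i) -> traffic shaping fraction w_ki
    ctr:    dict (k, i) -> click-through rate c_ki on the article summary
    phi:    dict ((k, i, position), ad j) -> ad delivery fraction
    demand: dict ad j -> guaranteed impressions d_j
    """
    delivered = {j: 0.0 for j in demand}
    for ((k, i, _position), j), fraction in phi.items():
        # Traffic that is shaped to article i and actually clicks through,
        # then the share of that traffic shown ad j at this position.
        delivered[j] += supply[k] * w[(k, i)] * ctr[(k, i)] * fraction
    return {j: max(0.0, demand[j] - delivered[j]) for j in demand}


# Toy instance with one user segment, two articles, and two ads.
supply = {"k1": 1000}
w = {("k1", "a"): 0.6, ("k1", "b"): 0.4}
ctr = {("k1", "a"): 0.10, ("k1", "b"): 0.05}
phi = {(("k1", "a", "top"), "ad1"): 1.0, (("k1", "b", "top"), "ad2"): 1.0}
demand = {"ad1": 50, "ad2": 30}
print(underdelivery(supply, w, ctr, phi, demand))
```

In this toy instance "ad1" receives about 60 impressions against a demand of 50, while "ad2" receives about 20 against 30. Shifting some shaping fraction from article "a" to article "b" would reduce the shortfall for "ad2" at the cost of clicks on the higher-CTR article, which is the trade-off the formulation has to balance.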

    45. Formulation [Equations over the graph k, i, ℓ, j, with constraints annotated: (Satisfy demand constraints); (Bounds on traffic shaping fractions); (Shape only available traffic); (Ad delivery fractions)]

    46. Key Transformation • This allows a reformulation solely in terms of new variables zℓj • zℓj = fraction of supply that is shown ad j, assuming user always clicks article

    47. Formulation • Convex program, so it can be solved optimally

    48. Formulation • But we have another problem • At runtime, we must shape every incoming user without looking at the entire graph • Solution: • Periodically solve the convex problem offline • Store a cache derived from this solution • Reconstruct the optimal solution for each user at runtime, using only the cache

    49. Outline • Formulation as an optimization problem • Real-time solution • Empirical results

    50. Real-time solution • All constraints can be expressed as constraints on σℓ [Figure: the quantities to cache, and the quantities reconstructed from them]