1
Functional Faceted Web Query Classification
  • Bang V. NGUYEN and Min-Yen KAN


  • Department of Computer Science
  • School of Computing
  • National University of Singapore
2
Web queries and their analysis
  • aloha airlines


  • color blindness


  • kazaa lite


  • house document
    no. 587


  • What can we see?
  • Different types of queries!
3
Related Work
  • Theory: earlier classifications
    (Broder, 2002; Rose and Levinson, 2004)
    • Originally to uncover latent user intent
    • Largely mutually exclusive


  • Practice: Automated classification
    • Kang and Kim (2003)
    • Lee et al. (2004)
    • and others …

  • A clear mismatch between theory and practice
4
Functional Faceted Classification
  • Functional: an automated engine can use the classification results to act appropriately


  • Faceted: queries can manifest different, fine-grained behaviors in combination
    • Mostly independent facets
5
Outline
  • How to formulate such a classification scheme?


  • How does such a classification relate to previous work?
  • How are web queries distributed given this scheme?
  • How well do human subjects classify against this scheme?


  • Just briefly: how to automate such a classification?
6
A Faceted Classification Scheme
  • Goals of the classification:
  • Should have high coverage
  • Have discrete values
    • For ease of replication and analysis
  • Facets should be largely independent
    • Different combinations of facets possible
  • Classification leads to action


  • Went through several iterations, settled on four facets
    • Others considered but dropped
7
Query: Stanford University

  • Ambiguity = Specific
  • Authority sensitive? = Yes
  • Temporally sensitive? = No
  • Spatially sensitive? = No
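  • Illustrative only: a facet signature rendered as a small record (the deck shows the values, not an encoding; all names below are our own):

    # A minimal sketch of a facet signature as a data structure.
    # Field names and the string values are hypothetical.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class FacetSignature:
        ambiguity: str              # "polysemous" | "general" | "specific"
        authority_sensitive: bool
        temporally_sensitive: bool
        spatially_sensitive: bool

    stanford = FacetSignature(ambiguity="specific",
                              authority_sensitive=True,
                              temporally_sensitive=False,
                              spatially_sensitive=False)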


  • Most facets have been proposed before by others
  • We integrate them into one classification scheme
  • Let’s look at each facet in more detail


8
Ambiguity
  • Modeled after a library agent’s interpretation of ambiguity (Stojanovic, 2005)


  • Actionable strategies:
    • Disambiguation / Categorization
    • Suggest modifiers for general queries



  • Examples:
  • Polysemous: contain many senses
    • “mustang”
  • General: cover multiple sub-categories
    • “health”, “travel”
  • Specific: address a coherent set of relevant information
    • “apple iphone”
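  • One possible automated signal for this facet, sketched below; WordNet is our assumption (slide 18 only says “lexical databases”):

    # Count WordNet senses as a rough polysemy signal.
    # Requires a one-time nltk.download('wordnet').
    from nltk.corpus import wordnet as wn

    def sense_count(term):
        """Number of WordNet synsets listed for a term."""
        return len(wn.synsets(term))

    # "bank" has many senses -> candidate polysemous query.
    # Caveat: WordNet omits proper-noun senses (e.g. the Ford Mustang),
    # so a real classifier would need extra evidence.
    print(sense_count("bank"))    # well over a dozen senses
    print(sense_count("iphone"))  # few or none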
9
Authority Sensitivity
  • Does the query refer to a well-known concept or an authoritative answer?


  • Actionable strategies:
    • Jump directly to or extract
      authoritative results
    • Favor prestige factors (e.g. PageRank) in ranking results
  • Examples
  • Yes
    • “hsbc internet banking”
    • “U.S. independence day”
  • No
    • “laptop harddrives”
    • “science supplies”
10
Temporal and Spatial Sensitivity
  • Should the query results change with respect to the user’s temporal or spatial context?


  • Lots of related work; just a sample:
    • Categorizing web queries according to geographical locality (Gravano et al., 2003)
    • Information diffusion through blogspace (Gruhl et al., WWW ‘04)
    • Temporal relation between queries (Chien et al., WWW ‘05)


  • Actionable strategies:
    • Suggest temporal or spatial modifiers
    • Make users aware of spatially / temporally changing results
    • Take locality of users into account
  • Temporal Examples
  • Yes
    • “university ranking”,
      “u.s. president”
  • No
    • “Toyota Canada”,
      “gravity forces”


  • Spatial Examples
  • Yes
    • “transport services”,
      “pizza order”
  • No
    • “email signature”,
      “how to run fast”
11
Outline: Research questions
  • How to formulate such a classification scheme?


  • 1. How does such a classification relate to previous work?
  • 2. How are web queries distributed given this scheme?
  • 3. How well do human subjects classify against this scheme?


  • Just briefly: how to automate such a classification?
12
1. Relate to R&L’s taxonomy
  • Observations:


  • Previous work only dealt with specific queries
  • Previous work often not actionably differentiated: distinct categories share the same facet signature
13
2. Actual Query Distribution
  • Manual classification
    • 100 queries (a small, indicative sample)
    • Randomly chosen
    • Reject non-English, offensive queries
    • From AllTheWeb™, 2002


  • Judged by authors
    • Use only the query string and search results as evidence
    • Other data (e.g., clickthrough data) intentionally left out
      • Broader impact
14
Observations
  • Specific queries make up a majority.
  • Almost half of the queries require authoritativeness.
  • Temporally and spatially sensitive queries amount to about one fifth.
    • Percentage growing with mobile devices?
15
3. Replicability by human subjects
  • Subsampled 75 of the 100 queries
  • Asked 25 human volunteers to judge queries
  • Each query is judged by 5 evaluators.
    • Capture the average rating of all users on each facet on a 5-point Likert scale
    • (1 sensitive, 5 insensitive)
    • (1 specific, 5 ambiguous)
  • Goals:
  • A. Is our classification replicable / understandable?
  • B. Is the coarse granularity of facet values acceptable?
16
Results: ANOVA and polarity
  • ANOVA: do people agree?
  • P-value > .01: people do not differ significantly (see the sketch below)


  • Conclude:
  • Ambiguity, Temporal and Spatial Sensitivity OK
  • Authority needs work; judgments were not consistent
  • Bipolar (1 or 5) vs.
    Non-bipolar (2, 3 or 4)



  • Conclude:
  • Subjects satisfied with coarse-grained values
  • Validates our value choices
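  • A sketch of both checks, assuming a ratings matrix of shape (queries × raters) on the 1-5 scale; grouping by rater is our reading of the slide:

    import numpy as np
    from scipy.stats import f_oneway

    ratings = np.random.randint(1, 6, size=(75, 5))  # placeholder data

    # One-way ANOVA: do raters differ systematically on a facet?
    f_stat, p_value = f_oneway(*[ratings[:, r] for r in range(ratings.shape[1])])
    print("agree" if p_value > 0.01 else "differ", round(p_value, 3))

    # Bipolarity: how often do subjects pick an extreme value (1 or 5)?
    print(f"bipolar ratings: {np.isin(ratings, (1, 5)).mean():.0%}")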
17
Outline: Research questions
  • How to formulate such a classification scheme?


  • 1. How does such a classification relate to previous work?
  • 2. How are web queries distributed given this scheme?
  • 3. How well do human subjects classify against this scheme?


  • Just briefly: how to automate such a classification?
18
Automatic classification
  • Most facets already have prior work for automated classification


  • Ambiguity:
    • Analyze the topical distribution of result content; consult lexical databases
  • Temporal and spatial sensitivity: mine query logs to find temporal modifiers
    • “U.S. president” and “U.S. president 1966”
    • Measure changing results via content publishing (e.g., RSS)
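  • A sketch of the log-mining idea, with a simplified matching rule of our own (a query is a temporal candidate if the log also holds a year-modified variant):

    import re

    YEAR = re.compile(r"\b(1[89]\d\d|20\d\d)\b")

    def temporal_candidates(query_log):
        """Base queries that also appear with a year modifier."""
        queries = set(query_log)
        candidates = set()
        for q in queries:
            if YEAR.search(q):
                base = YEAR.sub("", q).strip()
                if base in queries:
                    candidates.add(base)
        return candidates

    log = ["u.s. president", "u.s. president 1966", "gravity forces"]
    print(temporal_candidates(log))  # {'u.s. president'}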


  • However, authority sensitivity has not yet been explored
19
Auto Authority Sensitivity Classification
  • Hypothesis: authoritative queries often have answers repeated in search results


  • Idea: Look for repeating information
    (cf. trusted question answering)
    • Examine summary snippets for query
    • Extract keywords and measure their distribution
    • Capture the keyword histogram’s skewness
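  • The formula from the original slide is not in this outline; the sketch below assumes the standard sample skewness E[(X − μ)³] / σ³ over keyword frequencies, with deliberately simple tokenization and stopword handling:

    import re
    from collections import Counter
    from scipy.stats import skew

    STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "on", "for"}

    def keyword_skewness(snippets):
        """Skewness of the keyword-frequency histogram of result snippets."""
        tokens = [t for s in snippets
                    for t in re.findall(r"[a-z]+", s.lower())
                    if t not in STOPWORDS]
        # A strongly skewed histogram (a few keywords dominate) suggests
        # an authoritative answer repeated across the results.
        return skew(list(Counter(tokens).values()))

    snippets = ["July 4 is U.S. Independence Day",
                "Independence Day falls on July 4",
                "the fourth of July: Independence Day"]
    print(keyword_skewness(snippets))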
20
Preliminary Results
  • Use just this one feature
  • Learn a classification boundary
    • SVMlight
  • 78%, 60%, 68% for Precision, Recall, F1
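  • The deck names SVMlight; the sketch below substitutes scikit-learn’s SVC as a stand-in to show the same idea, trained on synthetic placeholder data:

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.metrics import precision_recall_fscore_support

    X = np.array([[2.1], [1.8], [1.5], [0.4], [0.3], [0.2]])  # skewness feature
    y = np.array([1, 1, 1, 0, 0, 0])                          # 1 = authority sensitive

    clf = SVC(kernel="linear").fit(X, y)
    pred = clf.predict(X)
    p, r, f1, _ = precision_recall_fscore_support(y, pred, average="binary")
    print(f"P={p:.0%}  R={r:.0%}  F1={f1:.0%}")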


  • Conclude (Indicative)
  • Automatic classification is possible for authority sensitivity
21
Conclusions
  • Proposed an actionable, faceted classification scheme with four facets:
    • Ambiguity, Authority, Temporal and Spatial Sensitivity
  • Related our classification to previous work
  • Showed reliability: authority facet perhaps not well defined
  • Gave indicative distribution based on small query set


  • Future work:
    • Implement a fully automated classification system
    • Larger scale evaluation and distributional analysis
    • Investigate possible expansion of the taxonomy
      • Requirements for special collections?
22
Backup Slides
  • Tuesday, May 8
  • Location: Beatty?
  • 25 minute slot?
  • 20 minute talk
    • 2 min: Motivation
    • 8 min: Facets
    • 7 min: Research questions
    • 2 min: Automated classification
    • 1 min: Conclusion