1
Digital Libraries
  • Patterns of Use


  • Week 10 Min-Yen Kan
2
Two parts:
  • Integrating information seeking and HCI
    in the context of:


  • Digital Libraries


  • The Web
3
Digital Libraries
  • What do we commonly use the library for?
    Accounting for


    • Different age groups?


    • Different professions?


    • Public or private access points?
4
One dimension: expertise
  • “What better contribution could a scholar make than an article which could … provide a clear, but vivid argument to the [secondary school student] but which, if unraveled, could provide the rigor demanded by the most crusty specialist?” Crane (of the Perseus DL)


  • Question: How do DL designers support this in terms of HCI?
  • Answer: Creating different document layers.
    Allow users to “fold” the document to see only the relevant portions.
5
"Overview + Details shown as..."
  • Overview + Details shown as best
    • (Hornbaek & Frokjaer 01)
    • Fisheye distortion unsatisfactory
    • Shown better for QA but not for whole document understanding
6
The scientific article
  • How do we use articles?
  • Answer these in groups:
  • Do we use scientific articles as a whole? Or specific components?


  • How do you (personally) determine the relevance of an article?
  • When do you decide to read an article?
  • (Harder) What parts of an article do you use, and for what purpose / task?
  • How do you categorize or label the articles that you read?
  • Typical critical reading patterns:


  • Read the title and the abstract
    • If you still don’t know what this paper is about, then this is a poorly-written paper.
  • Read the conclusion
    • Are you now sure you know what this paper is about? If not, throw it away.


  • Read the introduction
  • Read the section headings
  • Read tables and graphs and captions


7
Usage lifecycle of an article
  • Being found as relevant
  • Assessing relevance
  • Document surrogate
  • “Information finding”
    • Browsing for exploration
    • Searching for specific bits
  • Conveying knowledge not easily rendered in words
8
Being found as relevant
  • Advanced features of search not often used
    • “Just to be safe”, use full text
    • Common and well-understood UI (legacy effect)
    • When features fail, users often don’t try them again


    • Features thus need:
      • To be properly introduced / understood (scaffolding)
      • To have well-understood error messages
9
Searching for specific bits
  • One-shot queries rare:
    • Tip of the larger iceberg of an information seeking pattern

  • I look for specific surface tensions, experimental measurements


  • Looking for best efficiency of electric motors.
    • Ended up reading tons of documents on electric motors


  • I sometimes want to look specifically at others’ methods and theories


  • I often need multiple copies of a specific piece, like a table, for class


  • I need to keep up to date on my research area
10
Browsing
  • Why do people browse?
    • Semi-directed / Undirected learning
    • Initial Exploration


  • Collection Evaluation
    • What’s in this collection?  Is it relevant to my objectives?
  • Subject Exploration
    • How well does this collection cover my area of interest?
  • Query Exploration
    • What kind of queries will succeed in this area?  How can I access this collection?
11
Using the article
  • Reading has different purposes too:
    • General Learning
    • Identification
    • Skimming
    • Answer questions
    • Defend position
    • Cross-Reference
    • Editing or critical review
12
Using the article (2)
  • Biased to particular user and task
    • Current researcher’s work as “lens” to view the work
    • Different workflow for different users
      • Beginning researchers
      • Seasoned veterans
      • E.g., when to do annotation? Read references?
  • Writing goes hand in hand with reading:
    • Three levels: Creating, note-taking and annotation
    • Annotation serves not so much to add to an article:
      • But to extract / filter important nuggets from an article (e.g., highlighting)
      • Adding a “document layer” to be used to view the document
      • Also inter-document annotation (e.g., labeling)
13
Using Multiple DLs

  • Question: What’s the most common failure when using multiple DLs?
    • A: different layout of UI
    • B: different query operators
    • C: authorization problems
    • D: different materials in collection

  • Same problem in heterogeneous data integration
  • What’s a possible solution?
14
Public or Private?
  • Question: Is it easier to do information seeking in a public or private place?
    • Need good support of note taking, annotation
    • Access to customization
    • Hardware support
    • Support from information professionals
15
Teh tarik break time
  • Yay! See you later…
16
Patterns of use on the web
  • How do people query the web?


  • How do they use the web browser?


  • How can we build a better web browser?


17
Web query types (revisited)
  • What features work best for web searches?
  • Discriminate using Mutual Information for 2+ word queries
    • P(x,y) / (P(x) P(y)) – collocation corrected for chance
    • High MI corresponds to navigational task


  • Navigational (Known item, Home page finding)
    • Relevant pages are mostly entry (root) pages
    • Anchor text and URL information

  • Informational (Topic relevance)
    • Relevant pages are mostly nested pages
    • Content information (e.g., TF × IDF)
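
The mutual-information test above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `pmi` and the toy query log are hypothetical, probabilities are estimated as the fraction of queries containing a word, and the ratio is taken in log form (pointwise mutual information), which is one common way of writing it.

```python
import math
from collections import Counter

def pmi(queries, x, y):
    """Pointwise mutual information of words x and y over a list of queries.

    P(x) is the fraction of queries containing x; P(x, y) is the fraction
    containing both. Returns log2(P(x,y) / (P(x) P(y))), or None when the
    two words never occur together (the ratio would be zero or undefined).
    """
    n = len(queries)
    word_counts = Counter()
    pair_count = 0
    for q in queries:
        words = set(q.lower().split())
        word_counts.update(words)
        if x in words and y in words:
            pair_count += 1
    if n == 0 or pair_count == 0:
        return None
    p_x = word_counts[x] / n
    p_y = word_counts[y] / n
    p_xy = pair_count / n
    return math.log2(p_xy / (p_x * p_y))

# Hypothetical toy log, for illustration only.
toy_log = [
    "new york times",
    "new york times",
    "york times",
    "new car prices",
    "times tables practice",
    "car rental",
]

print(pmi(toy_log, "york", "times"))  # positive: the words co-occur more than chance
print(pmi(toy_log, "new", "car"))     # near zero: co-occurrence is about chance level
```

Under the slide's hypothesis, a query whose words score high (such as "york times" in the toy log) would be treated as a navigational candidate; any cut-off value would have to be tuned on real data.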
18
User behavior
  • Users tend not to use monitoring steps
    • Sign up for email alerts, create a channel


  • Even in a formal search mode
    • Users use simple keyword search, not advanced
    • Don’t revise their queries often (75% of all searches)
    • Don’t access help

  • Users don’t seem to have strongly repetitive patterns within a cluster of pages
    • No consistent paths
    • Longest repeated sequence analysis fails


  • Larger volume of queries
    • Higher percentage of repetition
    • Caching is a good strategy
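
The caching point can be illustrated with a small least-recently-used (LRU) result cache. This is a sketch under assumptions, not a description of any real search engine: the class name `QueryCache`, the `backend` callable, and the whitespace/case normalisation are all hypothetical choices.

```python
from collections import OrderedDict

class QueryCache:
    """LRU cache in front of a (hypothetical) search backend."""

    def __init__(self, backend, capacity=1000):
        self.backend = backend        # callable: normalised query -> results
        self.capacity = capacity
        self.store = OrderedDict()    # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def search(self, query):
        # Normalise case and whitespace so trivially different repeats match.
        key = " ".join(query.lower().split())
        if key in self.store:
            self.store.move_to_end(key)   # mark as most recently used
            self.hits += 1
            return self.store[key]
        self.misses += 1
        results = self.backend(key)
        self.store[key] = results
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used
        return results
```

The higher the share of repeated queries in the log, the larger `hits / (hits + misses)` becomes, which is why repetition at volume makes caching attractive.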
19
Page navigation types
  • ~40% by following hyperlinks
  • ~20-50% by back button navigation
  • 11% new window
  • 10% other (pop-ups count here)
    • Should be counted in hyperlink following
  • 2.5% by bookmarks
  • 0.8% by history
20
URL Vocabulary
  • Observed linear growth, not power law
    • Why?
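
To make the growth claim concrete, the URL vocabulary can be read as the number of distinct URLs seen so far, plotted against the number of page visits. The helper below is a minimal sketch under that assumption; the name `vocabulary_growth` and the sample trail are hypothetical.

```python
def vocabulary_growth(visits):
    """Return the count of distinct URLs seen after each visit."""
    seen = set()
    growth = []
    for url in visits:
        seen.add(url)
        growth.append(len(seen))
    return growth

# Hypothetical trail: revisits leave the vocabulary unchanged.
print(vocabulary_growth(["a", "b", "a", "c", "b"]))  # [1, 2, 2, 3, 3]
```

Plotting this curve for a real browsing log is one way to check whether growth looks linear or flattens out.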
21
Modes of web browsing
  • Tauscher and Greenberg (1997):


  • First time visit: new URLs observed
  • Revisits: reading in depth (e.g., course notes), flicking to previous page(s)
  • Authoring of pages: reload heavily used
  • Using web-based applications: form submissions
  • Hub-and-spoke: central page → specific page and back
  • Guided Tour: Viewing a many-page article
22
Scenario
  • You went to a website this afternoon to do some fact-finding for a project that you’re working on.


  • After going through many sites, some reading you’re currently doing reminds you of a link that would be useful to visit on a page that you visited sometime in the last hour or two.


  • How would you go about finding it?


  • Your answers:
23
The Back Button
  • Takes you to the previous page
    • Pages are kept in reverse chronological order, i.e., a stack
    • Extremely simple and easy to use
  • How would you improve upon this?
  • A UI feature of web browsers that has made it into operating systems
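
The stack behaviour of the Back button can be sketched as below. This is a deliberately simplified model (no Forward button, no sessions); the class name `BackStack` is hypothetical.

```python
class BackStack:
    """Simplified Back-button model: a stack of previously viewed pages."""

    def __init__(self):
        self._stack = []      # pages behind the current one, most recent last
        self.current = None

    def visit(self, url):
        """Follow a link: the page being left is pushed onto the stack."""
        if self.current is not None:
            self._stack.append(self.current)
        self.current = url

    def back(self):
        """Pop the most recent page; stay put if the stack is empty."""
        if self._stack:
            self.current = self._stack.pop()
        return self.current

history = BackStack()
history.visit("A")
history.visit("B")
history.back()        # current page is A again
history.visit("C")
print(history.back())  # A
print(history.back())  # A: page B can no longer be reached with Back
```

The last two lines show the weakness worth improving on: after going back and branching off, the abandoned page drops out of the stack.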
24
Temporal model of revisiting
  • Promote a previously visited page to the top of the stack if:
    • I go back to visit it and
    • I take a different hyperlink from there
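
One possible reading of the promotion rule is sketched below. It is an interpretation of the two conditions above, not the published algorithm: the class name `TemporalHistory` is hypothetical, Back moves a cursor instead of popping, and duplicate entries are left alone for simplicity.

```python
class TemporalHistory:
    """History list in which going back and branching promotes the page."""

    def __init__(self):
        self.pages = []     # visited pages, most recent last
        self._cursor = -1   # index of the page currently shown

    @property
    def current(self):
        return self.pages[self._cursor] if self.pages else None

    def follow_link(self, url):
        # If we stepped back to a page and now take a link from it,
        # promote that page to the top instead of discarding the pages
        # that sat above it.
        if self.pages and self._cursor != len(self.pages) - 1:
            promoted = self.pages.pop(self._cursor)
            self.pages.append(promoted)
        self.pages.append(url)
        self._cursor = len(self.pages) - 1

    def back(self):
        # Step down the list without removing anything.
        if self._cursor > 0:
            self._cursor -= 1
        return self.current

history = TemporalHistory()
history.follow_link("A")
history.follow_link("B")
history.back()             # back on A
history.follow_link("C")   # A is promoted; B is kept
print(history.pages)       # ['B', 'A', 'C']
```

In this sketch, pressing Back from C reaches A and then B, so the branch that a plain stack would have lost stays reachable.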
25
The navigation hub
  • Hub: a page that was promoted in the previous algorithm
  • Study shows hubs revisited 1.8 times


  • Ideally, predict which pages would be revisited
26
Algorithm for finding hubs
  • Safari Browser: Search Engine and typed URLs as hubs
  • Previous revisit of a page indicates hub
    • Even across sessions (“new window” commands)
    • Points to per-user customization

  • SmartBack
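
A rough heuristic in the spirit of these bullets can be sketched as follows. It is not the SmartBack or Safari algorithm: the function name `find_hubs`, the "two or more distinct next pages" threshold, and the sample trail are assumptions made for illustration.

```python
from collections import defaultdict

def find_hubs(trail, typed=()):
    """Guess hub pages from a navigation trail (a list of URLs in visit order).

    A page is treated as a hub if the user left it for at least two
    different pages (it was revisited and a different link was taken),
    or if its URL was typed in directly.
    """
    successors = defaultdict(set)
    for page, next_page in zip(trail, trail[1:]):
        if page != next_page:          # ignore reloads
            successors[page].add(next_page)
    branching = {p for p, nxt in successors.items() if len(nxt) >= 2}
    return branching | set(typed)

# Hypothetical trail: a search results page used as a hub-and-spoke centre.
trail = ["search", "r1", "search", "r2", "r2a", "search", "r3"]
print(find_hubs(trail))  # {'search'}
```

Because the trail is per user, the same function applied to different users' logs gives different hubs, which matches the point about per-user customization.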
27
To think about
  • Traditional use studies are very comprehensive
    • But with new IT, new conclusions yet to be drawn

  • What DL use patterns have correlations in the Web?  What patterns are unique to the web?  To the DL?
  • How do you think web browsers and DL interfaces can be improved in the near future?