1
Optimizing predictive text entry for
mobile phone short messages (SMS)
  • Yijue How and Min-Yen Kan

    kanmy@comp.nus.edu.sg
    School of Computing, National University of Singapore


2
Short Message Service
  • Over 24 billion messages sent in 2002
  • 100 million sent on 2005 Lunar New Year Eve in China alone


  • Problem: input is difficult
  • How to make input easier?
    • Make keystrokes more efficient
    • Ease cognitive load
3
Problem Statement
  • Write English messages using only 12 keys
    • 1-to-1 mapping of letters to keys is not possible (26 letters, only 12 keys)
      • Need more than one keystroke to type a letter

  • We review current approaches and propose improvements using corpus-based methods
    •  Key remapping
    •  Word prediction

  • Key point: how to measure performance?
    • Keystroke Level Model
    • (Better) Operation Level Model
    • On actual SMS text
4
Current approaches
  • Many approaches; among the most popular:


  • Multi-tap
    • Press a key multiple times to reach the desired letter
    • 3 × [2] “c” + wait + 1 × [2] “a” + 1 × [8] “t” = “cat”

  • Tegic T9
    • Use frequency of English words to place the most likely alternatives first
    • Use a “next” key to indicate the next alternative
    • 2 × [2] “ba” + 1 × [8] “act” + next = “cat”


  • Common feature: use one key for space (e.g., 0) and another for symbols (e.g., 1), leaving fewer than 12 keys for letters
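
  • To make the two schemes concrete, a minimal sketch of multi-tap counting and T9-style lookup on the standard phone keypad (the toy frequency dictionary is illustrative, not the reverse-engineered one used later in this work):

    # Standard letter layout; keys 0 and 1 are reserved for space and symbols.
    KEYPAD = {'2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
              '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'}
    KEY_OF = {c: k for k, letters in KEYPAD.items() for c in letters}

    def multitap_keystrokes(word):
        """Multi-tap cost: each letter costs its position on its key."""
        return sum(KEYPAD[KEY_OF[c]].index(c) + 1 for c in word)

    def t9_candidates(digits, freq):
        """T9-style lookup: dictionary words matching the key sequence,
        most frequent first (so "next" cycles through the rest)."""
        matches = [w for w in freq if ''.join(KEY_OF[c] for c in w) == digits]
        return sorted(matches, key=freq.get, reverse=True)

    FREQ = {'act': 90, 'cat': 40, 'bat': 30}      # illustrative counts
    assert multitap_keystrokes('cat') == 5        # 3 x "2", 1 x "2", 1 x "8"
    assert t9_candidates('228', FREQ) == ['act', 'cat', 'bat']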
5
Outline
  • Corpus Collection
  • Evaluation: KLM vs. OLM
  • Benchmark entry methods
    •  Key Remapping
    •  Word Prediction
6
SMS Corpus
  • Formal English is not SMS text
    • Closer to chatroom language
  • Most published research uses formal English text
    • Lack of publicly available SMS corpora
  • NUS SMS corpus
    • Medium scale: 10K messages
    • Collected from college students
    • Demonstrates breadth and depth


7
Evaluation Models
  • Keystroke Level Model (Card et al. 83)
    • Used previously in SMS (Dunlop and Crossan 00, Kieras 01)
    • Problem: keystrokes are weighted equally

  • We developed an Operation Level Model
    • Similar to (Pavlovych and Stuerzlinger 04)
    • Tie keystrokes to one of 13 operation types, e.g.:
      • enter a symbol = MPSymK
      • directional keypad move = MPDirK
      • press a different key to enter a letter = MPAlphaK
      • press the same key to enter a letter = RPAlphaK
8
Using OLM to derive times
  • Reach home @ ard 930


  • Reach_  5 MPAlphaK, 1 RPAlphaK
  • home_   4 MPAlphaK, 1 RPAlphaK, 1 MPNextK
  • @_      1 MPAlphaK, 1 MPSymK, 1 MPDirK, 1 MPSelectK
  • ard_    1 InsertWord, 4 MPAlphaK, 2 RPAlphaK
  • 930     3 MPHAlphaK

  • Derive timings for each operation by videotaping novice and expert users
  • Chose messages with a wide variety of operations
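
  • Once per-operation times are measured, OLM timing is a weighted sum over operation counts; a sketch (the numeric values below are placeholders, not the measured timings):

    # Hypothetical per-operation times in seconds; the real values come
    # from the videotaped novice and expert sessions.
    OP_TIME = {
        'expert': {'MPAlphaK': 0.4, 'RPAlphaK': 0.25, 'MPNextK': 0.5,
                   'MPSymK': 0.8, 'MPDirK': 0.5, 'MPSelectK': 0.4,
                   'InsertWord': 1.0, 'MPHAlphaK': 0.6},
        'novice': {'MPAlphaK': 1.0, 'RPAlphaK': 0.6, 'MPNextK': 1.2,
                   'MPSymK': 2.0, 'MPDirK': 1.1, 'MPSelectK': 0.9,
                   'InsertWord': 2.5, 'MPHAlphaK': 1.4},
    }

    def olm_time(op_counts, user='expert'):
        """OLM time for one message: sum of count x time per operation.
        KLM would instead weight every keystroke equally."""
        return sum(n * OP_TIME[user][op] for op, n in op_counts.items())

    # "Reach " from the example above: 5 MPAlphaK + 1 RPAlphaK.
    print(olm_time({'MPAlphaK': 5, 'RPAlphaK': 1}, user='novice'))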
9
Outline
  • Corpus Collection
  • Evaluation: KLM vs. OLM
  • Benchmark:
    • Baseline: Tegic T9
    • Improvement: Key Remapping
    • Improvement: Word Prediction
10
Methodology and Baseline
  • For each of the 10K messages:
    • Calculate KLM and OLM timing for message entry
  • Average the totals for both novices and experts



  • Baseline: Tegic T9 (based on a 2004 Nokia phone)
  • Need to know the order of alternative words
    • E.g., 4663 = “good”, next → “home”
    • Reverse-engineered the dictionary
  • Results:
    • 74 keystrokes (average KLM)
    • 74 seconds (average OLM)
      • 59.7 and 149.56 seconds for expert / novice OLM
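
  • The benchmark loop then reduces to counting operations per message and averaging; a sketch reusing olm_time from the earlier sketch (count_ops, a T9 entry simulator, is assumed here):

    def benchmark(messages, count_ops, user='expert'):
        """Average KLM (keystrokes) and OLM (seconds) over the corpus.
        count_ops(msg) simulates entry and returns {operation: count}."""
        klm_total = olm_total = 0.0
        for msg in messages:
            ops = count_ops(msg)
            klm_total += sum(ops.values())       # KLM: all keystrokes equal
            olm_total += olm_time(ops, user)     # OLM: weighted by op times
        n = len(messages)
        return klm_total / n, olm_total / n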
11
Key Remapping
  • Shuffle the keyboard (similar to Tulsidas 02)
  • Too many combinations: ~1.5 × 10^19
  • Use Genetic Algorithms to search the space; see the sketch below
    • Swap letter-to-key assignments each generation
    • Keep the “best” keyboards (i.e., those with the lowest average input time by OLM)
  • Result:
    • Average 15.7% reduction in time needed
    • Due to the reduction in next key presses
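
  • A compact sketch of the GA search (population size, survivor count, and mutation scheme here are illustrative choices, not necessarily those used in this work):

    import random

    LETTERS = 'abcdefghijklmnopqrstuvwxyz'
    KEYS = '23456789'

    def random_keyboard():
        """Assign each letter to one of the eight letter keys."""
        return {c: random.choice(KEYS) for c in LETTERS}

    def mutate(keyboard):
        """Swap the key assignments of two random letters."""
        kb = dict(keyboard)
        a, b = random.sample(LETTERS, 2)
        kb[a], kb[b] = kb[b], kb[a]
        return kb

    def evolve(avg_olm_time, pop_size=50, generations=200, survivors=10):
        """Keep the keyboards with the lowest average OLM input time on
        the corpus, then refill the population with mutated survivors."""
        pop = [random_keyboard() for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=avg_olm_time)            # lower time = fitter
            pop = pop[:survivors]
            pop += [mutate(random.choice(pop))
                    for _ in range(pop_size - survivors)]
        return min(pop, key=avg_olm_time)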

12
Predictive Word Completion
  • Allows completion of a partially-spelled word
  • Similar to ZiCorp’s eZiText


  • Our model:  w* = argmax_w P(w | k, w_prev)
  • Select the w with the highest conditional probability given evidence from:
    • the current word’s key sequence k
    • the previous word w_prev
  • Display a single prediction only when confident
    • Cycle through completions based on confidence
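
  • A sketch of the confident-prediction rule (the keymap, bigram probabilities, and threshold value are illustrative):

    def predict(prev_word, digits, bigram_prob, keymap, threshold=0.6):
        """Return the completion to display, or None when no candidate
        is confident enough. Candidates are words whose key sequence
        starts with the digits typed so far; each is scored by its
        conditional probability given the previous word."""
        candidates = [w for w, keys in keymap.items()
                      if keys.startswith(digits)]
        scored = sorted(((bigram_prob.get((prev_word, w), 0.0), w)
                         for w in candidates), reverse=True)
        if scored and scored[0][0] > threshold:
            return scored[0][1]
        return None

    # Toy data mirroring the next slide: after "at", keys 46 typed.
    keymap = {'in': '46', 'home': '4663', 'good': '4663'}
    bigram = {('at', 'home'): 0.7, ('at', 'in'): 0.2}
    print(predict('at', '46', bigram, keymap))    # -> 'home'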
13
Example and Result
  • Writing: “Meet at home later”
  • So far: “Meet at in” (keys 46 entered for the current word)



  • 46* = in, go, got, how, god, good, home, ink, hold, holiday …
  • P(home | at, 46) > threshold
  • P(in | at, 46) < threshold
  • …


  • Display: “Meet at in”, with “home” shown as the suggested completion


  • Result: 14.1% savings in time (OLM)
  • Compare with 60% in early work on PDAs (Masui 98)


14
Combining methods
  • Both methods complement each other
  • Allows up to 21.8% average time savings
  • Remapping improves slightly more than word completion
    • May be caused by conservative word completion strategy
15
Future Work
  • Doesn’t yet account for cognitive load
    • Remapping is hard to learn


  • Codec in development
    • Regular text to SMS / chat text


  • Speeding up Named Entity entry
    • i.e., people, places, times, and dates
16
Conclusions
  • Can save 20+% time in entering SMSes
  • Use corpus to drive and benchmark optimization
  • Evaluation using OLM (finer-grained than KLM)
  • Public SMS corpus available (ongoing work)


  • See Yijue How’s thesis for more details and additional experiments
  • Google: “SMS Corpus”
17
Backup Slides
18
Guidelines for talk
  • 15 minutes
  • 2 to 3 minutes for questions