To get started with this blank TiddlyWiki, you'll need to modify the following tiddlers:
* SiteTitle & SiteSubtitle: The title and subtitle of the site, as shown above (after saving, they will also appear in the browser title bar)
* MainMenu: The menu (usually on the left)
* DefaultTiddlers: Contains the names of the tiddlers that you want to appear when the TiddlyWiki is opened
You'll also need to enter your username for signing your edits: <<option txtUserName>>
These InterfaceOptions for customising TiddlyWiki are saved in your browser

Your username for signing your edits. Write it as a WikiWord (eg JoeBloggs)

<<option txtUserName>>
<<option chkSaveBackups>> SaveBackups
<<option chkAutoSave>> AutoSave
<<option chkRegExpSearch>> RegExpSearch
<<option chkCaseSensitiveSearch>> CaseSensitiveSearch
<<option chkAnimate>> EnableAnimations

Also see AdvancedOptions

This tiddler was automatically created to record the details of this server
meetings: tzeyun, hari
5246: qs, letor pptx to ppt, slides w0 w4
gh repair shower
ws proposal: follow up emails
antho: C08, W08, full volumes, query lrec and ranlp again, trial L08, W08 and C08
emails: alicia forward parents trip
andrew keyboard: ip
acl09: pub emails, blog setup
reviews: thang rw, irj
tenure prep
dossier: x2 pub references, appendix e, labels, teaching portfolio, deliver mo
renew aam
parscit distro to peter
hwz: meeting
hyp/urop: alloc 4+, urop proposals finalized, hyp selected updated
fc: redo.rb, moved zoning in, ingested headers, corrected bug in headExtract generating tmpfiles, meurlin
isteaching, cerg review
lrec08: robert's edits in, bgibson tasked
taiwan: started copy
ijdl: proofread x2
proofread: trinhhoa x1
serviced mxtag, not yet daemonized
sigir readings
review: cerg
emails to bonnie, dina, jimmy lin
sree: interview, task
shower replacements
updated chimetext
5246: 5qs, demos x11, instancePRF, redo, answer x 20, last lecture
forecite: service fix
icadl: leave
review: andy, eacl09
travel: visas,
hdd: helpdesk
thang: employment - ip
bring 2nd battery back
check in
emails: irf, psu chaser
hendra defense
tenure documents: visit cdtl for tenure app
off vacation
teaching, service portfolio
main dossier: future work
print flyers for acl09
annual report, research portfolio
mysavings transf
cas proposal: draft x3, done
thesis: wanggang
review hendrase, irj 260, calit2
parshed, pdfbox73 attachment to fc services as 10556, 10557
dAnth: redo muc98, email for last
fixed input from url and text
msra abstract, slides
revisit basic git, rails
given to su - Term Extraction Through Unithood and Termhood Unification
cny lunch
bring speakers to office for su
lrec flights - in progress
fix d dbs/ocbc
tuesday thursday leave, us leave
jcdl review prep
snkim exp payments
wongls meetings
wing: ppl page update
antho: credit fixes, report to nght, ali email
mailman fixes
sigir reg payment for hung, su
eddie meeting, simkc & lunch
final and reviews meetings: liewgm, qiul, hendra, jesse, ziheng, yeefan
csit: genre processing, threading, cf/rs, lecture 1
5246: next lecture, rr wk 4 first half, rr 7
reviews: tuan and luan 3108, tony pres, fntir
group meeting
grp: ziheng, jesse grp

download ruby-mode for mac, set up emacs mode
emma, wang shuo jie yang invite/rec - in progress
ijdl: kristine email, cousins email, heiko email
acl antho: bernie rous - C94-fix, pinged authors, enlg 07 pushed out, index.html updated
anthology email x2
wingnews out
acl reviewers: user accounts, bidding email, conflicts in progress
disk and server reimbursement reconcile
bought textmate
paypal utilities
submit 10 to cny lunch teresa
fc: ruby load balancer, parscit, optparse for client,server,broker
ia: added ihl seeds@mnthly, surt rules for nus
fac reviews x2
mir: annual report
tinupc interview
review: hendra thesis, qiul thesis, qiul slides
sigir: transition check, wks nght email.
yahoo pub, email, sched
model for matteo
redo ppt for chuats
hwee tou sigir past organizing dinner
tomorrow morning
email to students
Callan email - paper and class info, kevin duh, evgeniy
ACL-IJCNLP 2009 posters
bouquet for siew foong
sched baidu, calit2, yahoo, yahoo taxi, yahoo catering, pn ref and phone interview
qiul defense arrangement
chimetext announcements
do slides calit2

msra: thankus, pres
coloring: resume coding, rpnlpir listing
interns: markus interview, himanshu 800 
ask about wireless access, and tablet pc
ask github for edu account
insure: payment - ip
1102: pub website, redirect, upload files.
lvc: hung
reviews: yeefan - ip
tenure: bonnie letter, reminders
csit outline
reviews: tois
antho: acm dois for W02, X* (partial)
parscit: retrain, bug fix, email matteo
acl:pub budget
icadl reg
emails: wingnews
pick up forms from cindy
ACL JCDL attendance
email fallout

hwz: proposal sent, edited
simone cup holders sent
teaching review: finished
fc: meurlin, fixing 2nd demo of parsCit
taiwan: copying
arc: omnipage, interlink
antho: eacl03 workshops - 3/6
reviewing: irj, hendra, trinhhoa
internship: abs, offer
acl: reviewing assignments, emailed out x2
parsCit: flux-cim eval, lrec prep, cora re-eval
jcdl: sp05 cam done
rec: qiul
presentation:unimelb, q: annotation, online scraping, hosting reviews, privacy
somuk talk: xml
dongwon paper
thankus: melb
pics, expenses
check discover bill payment
icadl: flights
student payments: hung, yeefan
5246: wk2 notes, survey, lecture, room change, hw1
proposals: ping, pietroban biosketch
leave: uncle, alan
emails: alan
acl review reminder
jcdl reviews: 4/5
dAnth: muc 98 entry
AI service agreement
webcredible - hwz / prop sent
msra workshop / travel prep / followups
reviews: xuan
personal message

presentation: lrec08b, lrec08
reviews: mmies, inrt
employment: tanyeefa, hung
buy arch file
op: c series
fc: meeting
tac: meeting
jcdl: reg
jesse: exp data
su notified of wangkai's grp
lori nist aquaint2
acl: flight extend 1 day, rri stay 2 nights, hyatt 1 night
sigir: reg report, keeyn check early/reg interface, draft email
amex pay
git commits for ingest
rebuild forecite dbs
lrec: aclijcnlp pub email
sunjun softcopy to thang
coordinate meeting off danth

antho: e series
antho: bernie, thompson follow up, rajeev, elda/lrec
review: intro hendra, thang mm
group meeting: ziheng to org, room booking, sub tony/jesse
sigir: post, refund, free regs email for Yahoo, Google
emails: lrec
sigir2008: yahoo/ms invitation
drago: follow up with LDC
book as6 media lab 2a
chimetext: emails, tim slides, acl slides
kf: submit form, ask gold, send out
tim visit
ieee: check on receipt
danth: email final coord
parents visit
review: tac paper x2, dang eval, eacl09, 3108b
emails: code4lib
premia: renew
ziming letter
nght: flyers emnlp
5246: reply students,  tut and post closing notes,
postdoc app: xiaoyan
chuats: sigir, quarterly
antho: emnlp release, acm dois, 

holiday: photo uploading, personal bills, email clearing
email: scopus, hyp feedback, ellwyn, acl pub, violin
blog post: 3
lrec: flight again, invoice to finance
review: guo min abstract, xuan ch 5, malindo
slides for dina
sigir08: html pages, session preferences, online up.
review: hendra chap x3
hyp: comments, email to new batch, grading, pictures
tim trip planning - su, letter
pdf:liu manhua postpone to 1 jun
3206s grading
ms comp apps
give website feedback
antho: ingest CL
ieee invoice: ip
pick up monitor
anth: dlp upload, acl sigs email, J86, credits
aclie: conflict res, initial assignments, sent out review forms
jesse, qiul, jin, yf feedback
ijdl feedback: cousins [sent], heiko, chaser
lib: get mm books, pay fine
tosa talk, tyl meeting, emma meeting
phd reviews: 1st stack, start 2nd
dbs trung, and jesse
SIGIR 2008
email follow ups: chandra, sigirlist
qiul: slides
postdocs: nakov interview sched, checked emails
ras: emma, done (tinupc), done(sushmera), t sree ganesh?
antho: inlg08
reviews: ijdl

insure: payment
icadl claims x3, medical claims
chimetext updates
reviews: inrt
outline fntir - ip
reformat pc
meetings: sheng
citeseer: copy db/code files
postdoc: other apps: amir, bojin
phd student responsibilities - ip x2
eforms renewal and forecite virt host request

antho: DOIs ANLP, W00, W01, P04-?, emnlp 2007 bibtex johnathan may, w08-22
ziming rec letter x2
reviews: huangwh, ziheng grp, jesse grp,
ms apps
worldsci: lecture prep, talk, pres archiving
chimetext: announcement, room books for oct.
5246: forum, misspellings in needs, wn 3.0 install

weihua thesis for exam
firstcycle.org renew
check fidelity card payment
citibank annual fee waive
submit letters: ziheng, yeefan reimb, kf alicia
passport photographs
send disks to isaac
william penn call
acl/jcdl reimbursements
review: dang icadl short, 
ijdl: preface, proofreading
ijclclp: preface edits x2
andy email
antho: ingested emnlp01,04, eacl03workshops
review: liewgm, qiul x1, hendrase x1, yeefan webdb
parscit: peter troubleshoot sect, toc, lrec out!
pei-yun lunch, seminar
lrec cameras x2

5246: hw2 grading, emails, regrading, immsnet, final, final grading, uploading
icadl hotel: ip
nlpmt arrangements
antho: fix importing page
pay csit tax, check payments to dbs for csit
nuss membership: two pics, submit
arora recs - ip
review: eacl 09

meetings: irc, worldsci
5246: resched makeup
uncle visit
check tony
antho: c08, w08

wing portal hr fixing
premia newsletter draft - started
jcdl 2008 all reviews fin - yay
muc98 check definitive
acl08 thanks and arbitration
parscit picture and draft
fc: meeting
tyltheng: sigs
parsCit web page: started
hyp proposals
chimetext: org, web page, seminar announcement
wk's grp.
alf eaton email, client.rb
reading tanyeefa's rw
edit normal reg deadline to 13 jul - ip
book matlab lab for star challenge - ip
lrec: final slides submit, sq checkin, presentations
antho: finishing c04
sigir: send authors registration reminder, acm member numbers - ip
CHIMETEXT seminar Preslav
coloring.rb coding, release
meetings: exxonmobil
paula abstract as pdf, save to papers
5246: hw1, hmm, lecture, wk11 closing
sig.ir: email juniors
reviews, snkim all, tony, lmthang
emails: yong meng, prof ling 1102, nght csidm
postdoc app: sree

sigir08: more reg, ror fixes, first_sigir
review: qiul x4
lrec: check on hotel hotmail
pc visit: lunch, email, seminar
antho: mt&cl, doi - ip
submitted leave forms for lrec, acl, jcdl
tytheng prop read
gon meeting
grp reading x 2
parcel to hk
glue lamp in bedroom
snkim kp: aye perl, survey prep
redo amex payment
sigir08: schedule
file all ura payments
tax return
jcdl trip planning - air tix
fc: mxpost daemon
paypal fix - scan utility page
hwz, ntu team meeting
yw's seminar
do final report for nlp web q - RG5.doc
jcdl edits: jesse x2 done, gm x3 done, jin x2, tanyeefa x2 done
citeseer disk mount: checking with dpuk, reply to ho
sigir08 reg: asked
xuan, tony feedback
paint car
send out acl review reminders
do ellwyn recommendation
premia: judging
pay amex kf
print widm
pick up passport
take battery / cdrom drive for service
antho: inlg08, ijcnlp08, ijcnlp08 corrections
set up twosocks
derry's thesis entry
interview: preslav, emma
review: emnlp, widm x3, grad apps
emails: followups, sigir problems, william chang, chuats draft sent
hendra: rec
sched: wing, admin meetings, john tait
5246: fix entry link from ~cs5246, get password
reimbursements: flowers (ip), fuel
think about: Jason Chang argument for lattice based phrase tables, try integrating 1TB ngram corpus for MT use, CAS ICT mobile services translation. raghavendra's corpus bounds for AMT/GWAP ideas, Robert: Logical Structure Recovery - Difficult, KyotoEMBT (example alignment and combination), U compare flow (Kano, Tsujii, PSB 2008)
outline fntir
reviews: yeefan, tois

return sr3 cab key
borneo motors
apple service: www.nus.edu.sg/comcen/notebook/centres.htm
bring vga cable
coloring: re-creation, move annotation pane
postdoc: nancy. cai xiaoyan
1102: classroom response pad (https://ivle.nus.edu.sg/LMS/Faculty/Default.aspx)
review: naacl
driver's license
find out math minor infos and read up on sep/noc/atap for jeremy, xian kun and xuanti.
proposals: pietrobon
parscit: pdf to bibtex, endnote, wsdl, raw text processor, rest api?
obi ontology
antho (st) E03.xml problematic authors, remi zajac w01-0711, redirection
antho: author pages, acm xmls
antho: pierre wvlc 94, bonnie + aravind's c volumes, priscilla, semeval reassign, acm dois, cl from acm
citeseer endnote, bibtex, wsdl
fc: wsdl, su's kp, parshed, why empty url, gravatars for keywords, cover flow style
arc: aclarc brett - ip
blogpulse and xanga

lt chuats: ne demo
az, cfc: tf*idf other features approximated
NetDraw www.analytictech.com/download.htm
submit to nature/science?
chimetext: chanys, and others in html page, announce emnlp practices x2
emails: freecite, nlpir4dl news, nlp4dl invites, acl cfp: corpora, irlist, previous chair cfp, sujian proceedings.
antho: emnlp email & preview, w03-28 done.
mentees meeting: tohmk email, emails
reviews: thang cs3108, 
5246: hw2, ivle, closing notes
flyers for emnlp -ip
500 for tony
csit: prep 1, read nlp

premia newsletter x2: sent out.
antho sig drafts: out, email replies, group post, robots txt
citeseer disk: collected
sigir costs registration packages
fc: ingest done, favicon
webservice reg, ping, parshed, pdfbox wrapping
5246: regrades, hw2 signup, zip file, exam and answers, lect notes 12 and 13 
coloring: hash to array labeling, a href, keyboard, undo, efficiency, image labeling, yaml saving
dhl: account for pradeep
interview: xiaoyan x1
review: thang x2
csit fax
derry scan 2
hep b jab
gh shower

su resign letter
sigir volunteers meeting
review: emnlp hendra, dang icadl short, qiul 678
bang ref report x2
disk for igc
chimetext pub, finalize teohh
antho: elmer I05, ley start
aclarc: mirror link, DVD copies
tenure letters: dradev, bdorr, clgiles, march
load other cameron pictures
students pay
knmnyn renewal
healthcare payment.
chime: mstislav
letters jie yang, chris yang
jiang jing chimetext seminar
do ijdl reviews assigned to self, done!
claims in progress
write philip about registration
print van de sompel paper
correlations to minh
hendra's paper revision
lta/samsung runaround
jie yang invite
gordon mohr wapi, kristine ijdl, brent ho's shipping request emails
set up fin08
lta license renewal
jin annotation: started
sigir reg meetings

lrec hotel
review: hendrase proposal, liewguo ch3, xuan ch implement, ch 3 and 5, hoangminh 1 2 6, qiul x3,
aclarc: danth post, first part creation
antho: new html out, correct eacl03 error, icos ingestion
disk sent back
jcdl travel planning started.
claim forms submitted - yay!
code4lib email: parscit release
toby sushant letters
lrec planning
sigir08 system: ror, poll

reviews: widm x2, emnlp
emails: cas x2 , tat-seng reg para, rgrishman, yahoo api, john tait
emma rec for ra
sched forecite other times.
ijdl - review nelson
record 960 in from ziheng
acl anthology ali broken links, acl08:hlt message
buy black cartridge 5bk
leave appn
trung receipt
check on citeseer 
jin x1, minh x1
nuh, premia meeting
ct seminar bookings
d Anth: hlt 2002 email, muc98, sigdial 03
premia: domain name registration

5246: tut2 upload, w8 w9 w10, hw setup, demos, lect, tut 2
nlpir4dl: emails, website
reviews: tony chap, 
antho: emnlp again.
arrange meetings: ugrad, mohanan
csidm: ppt, word doc
moves jien-chen wu et al. computational analysis of move structures in academic abstracts
cao review
dell monitor order
long practices
qiul related work chapter
prep/pack ijcnlp
Ask Gordon about web services architecture.
do final exam checks 1101 3243 - urgent
ask ziheng, xuan to read sentiment papers. esp taras' paper
pick up laptop from caslyn
write isaac councill
install svn
install rails 2.0.2 and dependencies

antho: dois for p, regen bibs for p, finalize fs, ley
kf calling for reinstate
sps: enter 2008 bibs
rp: text mining intro, review text mining
check kf status
p/t research contact for jatn with limsoon
send slides to info@jcdl2008.org
get hub id from alicia
postdoc corpora list
cs5246: init html, textbook, syl dates
review: hendra ch1-7

nlpmt: slides and talk
bug: attaching collab rank (colon in filename?)
parscit: integration joran nick
chimetext: update chiateek
meetings: sallyjo
icadl: email followthroughs
antho: p05-w software - ip, ijcnlp05
csidm: signed agreement
moderation: 3230 - ip
fcn: slides on pubs
5246: resend grades
acl: mod cfp, send out indiv mailing lists
1102: setup ivle
finish: arora mail rec
mail cd to alan
wing dinner

chime: slides cyl, whns
sigir08: panic queries
review: qiul1, tony icadl
antho: c08, w08-12 to 21, fix P08 entities, regen all indices for full volumes, push to aclweb, johan's sigsem edits, colingbibs
5246: crunch, tried assert but failed, discussion qs, grading criteria
tenure: dossier submitted, virtual dossier, dvd prepared
emails: wendy hall, nigel shadbolt, worldsci abstract
letters: ankit, pragyesh, neeraj
pics for sis: genting
admin: emails to workshop
student claims, equip and op claims
csit outline
antho: e series, ingest p08, w08
karita lunch/dinner
parscit: web service client, clemens email
lrec slides1
omni hotel rebook
chimetext sem, chanys
wingnews: write draft, invite ulrich, siti, paula, tim, michael piotrowski, rohini
pubs: qiul ijcnlp paper
lrec: reimbursements - wait on hsbc bill statement
antho: correct p08 errors, trnsf files to antho, push live
bang's ref report
group meeting
acl08: dAnth meeting
read 17 for srw
jcdl: slides
Hae-Chang Rim

* Model for Evaluating the Quality of user-created documents - our aim: assessing quality of user contributed models
sun jun

path generated elements
sub structure factored into inside/outside and then leaves, trunk and root. (3x2 = 6 factors)
agreement based learning BUG Liang et al. 2006

why PGE structure works well as compared to CFG based rules?
putting prior distribution word-based similarity to bootstrap.
acl-ijcnlp-2009.org Minghui
KY Suh as gen chair
28 Jan - 2 Feb - Priscilla visit
ws-mm conf org
2-7 Aug conf date

* 28 - STB?
* 29 - 
* 30 -  discussion with local arrangement committee
* 1 - afternoons free
* 2 - afternoons free

* handbook / newsletter (hz upali min)
* onsite IT (vlad) 
* academic liaisons (proceedings, tech prog) (ht)
* exhibition (ht,kc)
* demo & posters / workshops (ht)
* webmaster (mh)
* finance (wx)
* logistics / secretary general (sl)
* internal PR
* external liaison
* HR 
* graphics
* badge / t-shirt / bags
Yang Liu

Led by Qun Liu
lots of ip
75-85% translation quality (subjective)

machine translation for mobile service: travel expression in real time with limited cpu, memory and storage.

Naoko Tosa (Kyoto U.)

* Computing of Emotion - Romeo and Juliet, cast shakespeare into interactive theatre with gesture, speech and emotion rec 
emotion rec follows basic speech modeling training
script/scenario based dialog / interaction / interruptible / mixed initiative

i.plot - wn + other thes to find freeform word assoc - not really narratives yet, just sentence generation via entropy.

* Cultural Computing - essence of culture (via MIT)
Sesshu - sansui as virt world
kodaiji temple

icon based painting 2 1/2 D vis
painting to classifier to story 

Hitch Haiku - a bit like lin's chinese couplet generator but with more cultural aspects pushed in, less nlp centric

future research: culture synthesis?

Neil Rowe and Brian Frew

zone image using averaging and thresholding, iterative to find optimal number of regions
then use 26 features: color gradient and s.d., size, abs loc to map to feature vec
then use lazy case based reasoning to assign to either 5 or 25 broad classes.

this is interesting: then use neural net to learn "focus" in focus region that corresponds to the caption info
used wordnet to examine nouns, got about 38% right (Baseline random gets 10%)
then incorporated nn foci prediction results into region classification to enhance performance (goes up from 33% to 48%) - basically incorporating simple caption info into it.
errors: human artifacts are easy, but people hard (color varies too much?)
Dipanjan Das, Mohit Kumar and Alexander I. Rudnicky

preprocess with parse tree simplification after srl
two approaches: 
1) also assert-based SRL, ignore freq=1 templates 
2) ROUGE based - use LWCS rules to do pairwise comparison? best result (deterministic clustering): cut score into prec/recall, cluster separately.

q/a: need to create domain specific tagger for this specific dataset - if you need fine-grained tags

using reiter's weather dataset (which is very coherent and generated from machine data)
1. general
2. input
3. output
4. concrete examples links to lit
5. ordering / recipe
Kingman's coalescent as prior p(T)
Bo Wang and Houfeng Wang

associate product features/aspects with opinion words using variant of MI to deal with low-frequency problems.  use iterative process like snowball
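
Sketch of the association step, to make the low-frequency issue concrete. The talk's exact MI variant is not recorded in these notes, so this is an assumption: plain PMI multiplied by a frequency discount that shrinks scores for rare pairs. The function name `discounted_pmi` and the input format are hypothetical.

```python
import math
from collections import Counter

def discounted_pmi(pairs):
    """Score (feature, opinion) pairs by PMI times a frequency discount.

    pairs: list of (feature, opinion) tuples, one per observed co-occurrence.
    The discount is an assumed stand-in for the talk's MI variant.
    """
    pair_counts = Counter(pairs)
    feat_counts = Counter(f for f, _ in pairs)
    op_counts = Counter(o for _, o in pairs)
    total = len(pairs)
    scores = {}
    for (f, o), c in pair_counts.items():
        # Pointwise mutual information from relative frequencies.
        pmi = math.log2(c * total / (feat_counts[f] * op_counts[o]))
        # Discount approaches 1 for frequent pairs, shrinks rare ones.
        m = min(feat_counts[f], op_counts[o])
        discount = (c / (c + 1)) * (m / (m + 1))
        scores[(f, o)] = pmi * discount
    return scores
```

A snowball-style loop would then keep the top-scoring pairs, use their opinion words to find new features, and repeat; that loop is left out here.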

q (taras): phone is 3G.  Here 3G is feature as well as opinion.  how to handle?
q: camera looks good, but it's not what we were looking for.  how to handle?
Michiyo Sato

MT as backup system to professional non security based articles, 1st draft as it's quick

when launched got atrocious feedback. but better after mt quality improved.  even good metrics result in bad translation
human computation -> global English.

-schedule of funding for csidm
-30k conference for combined parties (10k per year)
-10k + plus also desktop machines for singaporean
BUT only for CSIDM researcher

20 CASIA researchers / 20 NUS researchers

Direct contribution CSIDM / NUS / Joint
7 space tevan gardens lodge for CSIDM researchers

Research Projects (MDA concern)
* make 100 million chinese communicate with non-chinese
* not just language but communication / mediate

* Integrated projects 1-2yrs later?
* testbed / showcase stuff @ fusionopolis -> think about ways to utilize space, may be replaced, with equipment provided

Co-training proposal
* co-training, should take info from expert not learners.  test on learners, learn from experts only?

3 year target
* 1 spin off
* sing comp using csidm tech - 4
* 300K integration for first 3 years, most in third
* killer app?

5 year target
* 4 spin off in total
* sing comp using csidm tech - 10
Kam Fai Wong

Temporal Info Processing - handling normal verbs via Allen's time framework, but also looking at modal verb events ("I would do X").  Second part for latter for business intelligence.

Opinion Analysis for news, products, stocks - overlap with 

NIL is not Nothing - chinese chat normalization - mining vocabulary and measuring how faddish the terminology is.  

Song classification - sentiment units (polarity + object + sentiment)
Irwin King

chinese readability as playground for web readability
component based on radicals for readability for chinese characters
also readability in terms of idioms

Cospace ? real + virt (aug real|virt games | lifelogging)
10000 students -> 1000 startups -> 100 companies -> 10 RIs.
link spaces?

search in real + virt environment

advanced personalized learning as a co-goal
sensor net / mach translation
YOG: http://www.singapore2010.sg/day/index.htm YO village at NUS Warren
NLP for translation!!

CFP: three types: Innov Apps, Infra (capture, delivery, db), Tools/Services (search, UI, Payment)
deadline for final project prop : 2 Jun 08

fusionopolis: for integration?
- vague on funding limit, no presets, support from industry for $$$, not likely for 100%
- spend limit in SG: 100% w/o compelling rationale
- SXSW or other sw eng conf to come?
First session

NLB - Liau Yi Chin : integrated library experience (virt + aug exp), integrate community, info discovery (YOG related, or live event, ''use rfid for more interesting apps'', footprints, info scent, collab filter)

NexGen - Alvin Yap : mobile game dev, standard 3g / gps / multiplayer. mobile social networking. 

Mediacorp - Goh Kim Soon : mobtv, I-net broadband. pc/tv/mobile display convergence - can already do delivery/stream but looking for large vol tech, like compression, a/v index/search

DSTA - Victor Tay : convergence of field and simulation. decision support system, eval & sim diff courses of action.

ThinkingTub - Hairi Soewarso : platform created/ streaming world-first interactive TV, looking for content?

ETI ServTouch - Kelvin Tan : multimedia on mobile, stream, live bcast p2p of imp events. looking for content and ui support

URA - Colin Lauw : sim-city, too many agencies, caught in the middle?  looking for gallery sim and enhance sim game (10 mins multi play) for showcase

CrystalGraphics - Gabriel Liong : virt real, viz, 3d, ncity plat to serv 3d content to diff output , looking for virt world developers

Stratech - Tay Kok Chin : can diff images from cam sys.  shopping mall: integrate with GElement. 

GElement - Yeow Shin We : 3d world builders, provide 1) 3d spore websys: coremo.  looking for apps to use 3d models 2) in building model tools

Qala/Qmax - Alex Tan : providing location tracking, open api for this

iCell - Ken Chua : wifi, ad serving, edge servers in malls and hotzones

2nd session 

NHB - Philip Chua : Virtual Temasek, National Archives: visual artefacts
14th cent to 2000 sim: looking for timeline viz, app to connect and serv collection

AmazingWorlds - Terence Mak : 3d mirror world, interactive? support many diff lang, looking for tech, marketing

VisualFactory - Shamim Akhtar : auto 3d semi-auto generate from laser and vid data.

FirstMeta - ? : payment serv, int cc + 2nd life LDs. work with 2c2p. looking for merchants, virt currency issues.

SingHealth - Noah Tay : obj: hospital of future. digital ward , self-help ask-and-find, AI

NatlGridOffice - Ken Chuang : daily billing of space and bandwidth.

Playware - market intel for gaming, payment opt for cust, 
Background: #fcf
Foreground: #000
PrimaryPale: #c8f
PrimaryLight: #81f
PrimaryMid: #40b
PrimaryDark: #104
SecondaryPale: #fcf
SecondaryLight: #e8f
SecondaryMid: #b4d
SecondaryDark: #418
TertiaryPale: #88e
TertiaryLight: #66c
TertiaryMid: #449
TertiaryDark: #336
GuoDong Zhou, Fang Kong and QiaoMing Zhu

new -> context sensitive.
augmentation of yang et al. to dynamically include competitor and predicate information into training data.
ablation study shows competitor information most important, then predicate info.

q: how about ellipsis
q: how about other langs
q: use semantic info
Yuji Matsumoto

Corpus Annotation - pos tag/morpho/dep parsing/ne/coref in jp

subjects (opinion units), <subject, attribute, evaluation> tuples

demo for restaurants, with opinions and attributes co-located.
summarize using radar chart rep - cool

depend analysis of base nps
id evaluation expression using dict
id evaluated attribute using relation extract, and zero-anaphora (if first fails)
opinionhood/polarity determination (is it really an opinion?)
anaphora resolver works exactly like xiaofeng's model, except anaphoricity classifier put after candidate identified.

Both intra and inter sentential work to identify the attribute and object referent.

q (tsujii): evaluation being different positive or negative given different contexts
q: prior of attributes
Michel Haller

pie menu doesn't work on table surface, come up with handedness specific menus. 
track pen and hand and use projection surface
Zhao Shengdong

part1: elastic hierarchies 
rw: Node-Link Diagram / Treemap

- hybridize both by blending both forms at different representation levels in the hierarchy
- focus + context
dissonance between the two paradigms?

part2: earpod
rw: ivr, visual menu

scaffold novice to expert
q: 107 - why errors
Ed H Chi

Wikipedia (social status)
Slashdot (karma points)
Lostpedia.com - missed the first season that got lots of fans. appealed through IRC session.

lightweight/heavyweight social processing -> counting votes / information faddish

Augment Social Cognition: Supported by systems, the enhancement of the ability of a group to remember, think, and reason; the system-supported construction of knowledge structures by a group.

Wikipedia controversy -> will forecite end up this way?

mr taggy - tag relevance feedback
wikidashboard: social edit and pivot view

Web 2.0:
Crowdsourcing: collaborative- co-creation
Collective Intelligence: folksonomy
Collective Averaging: social attention
Participation Architecture: interaction
Expertise finding: social networking

Jong Hyeok Lee

morphology: fixing erroneous words
verb phrase in korean accounts for 40.3% of all functional morphemes

Get his slides and error analysis

Also doing Jp->En patent retrieval/translation in NTCIR

Robert Dale

Logical Structure Recovery - Difficult

VSM with citing sentences

Correcting OCR errors that use language models

Horacio Saggion

HAC clustering using tf*idf and cosine sim
terms: semantic class(mentions) or bow 
extract: from 2 types of summaries or from full documents
type 1) summaries for single document
type 2) biographical summaries given soft patterns
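
Sketch of the clustering setup above: bag-of-words tf*idf vectors, cosine similarity, and hierarchical agglomerative clustering. Assumptions not in the notes: average linkage, whitespace tokenization, and a stop threshold of 0.1; the names `tfidf_vectors`, `cosine` and `hac` are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Bag-of-words tf*idf vectors as {term: weight} dicts."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hac(docs, threshold=0.1):
    """Merge the most similar clusters until similarity drops below threshold."""
    vecs = tfidf_vectors(docs)
    clusters = [[i] for i in range(len(docs))]

    def avg_sim(c1, c2):
        # Average linkage: mean pairwise cosine between the two clusters.
        total = sum(cosine(vecs[i], vecs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, (i, j) = max(
            (avg_sim(clusters[i], clusters[j]), (i, j))
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Terms that occur in every document (e.g. the shared person name) get idf 0 and drop out, which is one way the "idf too sparse" worry can bite on small collections.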

idf too sparse? given only sparse data
result slide 26

ps + dp get much better precision but recall too low
Top level org: 
People (Univ>Lab>Individual) / Papers (Conf>Conf/Journal Issue>Individual) / Areas (Hierarchical area tags)

* citation/citee

Detailed analysis:
* Slideseer
* related work/citation completion
* fine grained indexing
* abstract document zoning
* jump to fine grained comparison work - contextual comparison

(hierarchical, clustering based)
* build survey
* differentiate work
* timeline creation
* tool, method, evidence/method/, performance, datasets
* other

* Self profile
* tags for reading/notes
* Past Citation workbench / suggestion cf
* Areas of Interest
* photograph
* community service in journals and conferences
* grant information
* supervisee/supervisor/collaborator
* timeline 
* affiliations

* faceted / editing? simul: layers?
* offline interface

* PDF/PPT/DOC handling

* hierarchical tags as areas/types (tools/dataset/method)
* keywords from extractor

Wendy Hall, Nigel Shadbolt

integrate BBPSC, EPSRC to one ontology
web science = emergence of simple behaviors. anticipate what's next
web ecology

wikipedia, blogosphere = as a case study
what tech (which needs to micro structures), lead to macro structures (and hence blogosphere)?
as causes and factor analysis (social, techno, etc)
what counts as success?  adversarial developments? needing legislation?
wsri affiliated web science lab (as a hub) - must buy-in??
nov epsrc deadline

q: research, thought leadership, insight/educators
q: if interdisciplinary, why not other thought leaders (just polysci/lawyers?)
q: dying factors, case studies for deaths
Kam Fai Wong (CUHK) - Networked Informal language - rate charge is 
Mitsuru Ishizuka (U Tokyo) - CDL.nl - more rich than RDF, has discourse network types.
Masaaki Nagata (NTT) - psycholinguistics, first words database for parents to record, lexeed database (proprietary)
Sung-Hyon Myaeng (ICU) - paraphrase and check in the different languages to proof check. ConceptNet [MIT Media Lab and ICU]
Hae-Chang Rim (Korea) - call to understand Asian languages better, develop synergies
Jyun-Sheng Chang (NTU, Taiwan) - mixed code text, web tables, argument for lattice based phrase tables, try integrating 1TB ngram corpus for MT use?
Quan Vu (HCM-VNU) - no collaboration nor resources.
Robert Dale

language logic (by Bateman?)
OpenProof: automated grading usage
translate english sentences into FOL
proc: convert to FOL and match, also back-generate the wrong translation.
1.8 M submissions

FOL to realize many surface form by transformation or edit difference.
graph of transformation edits.

this approach to language learning? as in lingo?
federated data (tibet -> google vs baidu)

grid computing - negative links def here is a different interpretation
adrian cheok
normalization re rank docs

ui - hci visualization - show thumbs of related pages
Key-Sun Choi

5 level nested ontology?
ontoCS version 2.0 - schema generation
populate ontology with object instances from web information
IT-field paper ontology - ask about this later

Knowledge Service Engineering - new department
Yasunari Miyabe, Hiroya Takamura and Manabu Okumura

Previous work: Etoh (05) refines radev's Cross Doc Relations to 14 types
Focus on Equivalence and Transition relations.

cast as binary classification on pair of input sentences for EQ
key point: but separate into three partitions based on uni-, bigram similarity/overlap
then calculate parameters based on this clustering
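
sketch of the partitioning step above (python). assumptions, not from the talk: jaccard as the overlap measure, the two thresholds, and the partition names. a separate classifier would then be fit per partition.

```python
def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap(a, b, n):
    """Jaccard overlap between the n-gram sets of two token lists."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


def partition(sent_a, sent_b, hi=0.5, lo=0.2):
    """Assign a sentence pair to one of three partitions.

    hi and lo are illustrative thresholds (assumed, not from the paper).
    """
    a, b = sent_a.lower().split(), sent_b.lower().split()
    uni, bi = overlap(a, b, 1), overlap(a, b, 2)
    if bi >= hi:
        return "high"   # strong bigram overlap
    if uni >= lo:
        return "mid"    # shared words, different order or phrasing
    return "low"        # little lexical overlap
```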

focus on Variable Noun Phrases (VNPs)
look for change in value verbs and importantly, omit identified EQ sentences
Kenji Hirohata, Naoaki Okazaki, Sophia Ananiadou and Mitsuru Ishizuka
didn't attend

tokens not words: as punctuation found as important.

features: 1) (n-grams) uni, bi-, uni+bi-, 2) chi^2 association and 3) position by
auto acquired stuff from medline using regex

crf +5% over svm
BI annotation system +5% over non-BI
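
sketch of what a BI labelling could look like (python). assumption: "BI" means tagging the first sentence of each section as B- and the following ones as I-; the section names are made up for illustration.

```python
def to_bi(section_labels):
    """Turn per-sentence section labels into B-/I- tags.

    A sentence gets B- when its label differs from the previous
    sentence's label, otherwise I-.
    """
    tags, prev = [], None
    for label in section_labels:
        tags.append(("I-" if label == prev else "B-") + label)
        prev = label
    return tags


# hypothetical abstract with five sentences
print(to_bi(["OBJ", "OBJ", "METHOD", "RESULT", "RESULT"]))
```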

Zeng Gang

1) Breaking down modeling into patchworks for scalability

post stitching to bring them together
patch propagation for holes where points not detected.

2) Facade reconstruct with rect region repetition - quite interesting
q: est depth from multiple observations?

Minoru Etoh


convergence, push to divergence? point to two different user bodies?
programmable cell phones didn't work?

3G and higher standards need to diverge in terms of upload/download because of upload transmission energy.

OFDM has PAPR Peak to average power ratio problems
forward progress needs to take into account max peak wattage 1.5W.
Sadao Kurohashi, Toshiaki Nakazawa

Distance based alignment based on consistency *over both pairs*
Much better alignment than Moses b/c use syntax tree at beginning

q: syntax as confusion network?

alignment and translation as combination of examples.
What Video Games Have to Teach Us about Learning and Literacy (Gee, 2003)

game levels designed to learn a specific skill

Other notes: Mad About English (2008)

Seely Brown - Sensemaking
StreamSage and BlinkX

From CSIDM meeting:
Seah Hock Soon
Wolfgang Muller Wittig
Steven Miller, SMU
Chee Yeow Meng
Alex Nayanek - Psych Game Lab

Daisuke Ikeda, Hiroya Takamura, Lev-Arie Ratinov and Manabu Okumura
Dec - icadl trip, yf papers (ijcai09 jan 7), antho, fc
Jan - jin-jcdl, jesse-jcdl-opac, ziheng-acl, lmthang-acl, su-acl, 1102, fc
Feb - 1102, fc
Mar - 1102, fc
Apr - jesse-mm, 1102, fc
May - yf thesis, ACL workshop x 2, fc
Jun -
Jul -
Aug - ACL
Jian Wang

[[Microsoft Research Asia - The first 10 years]]

20 research groups
ext review themes
* internet services
* mobile computing in education
* gaming and graphics
* trustworthy computing
* windows core tech & windows embedded sys

* biggest computing platform: connects resources
* biggest database 
* biggest social network

Web 2.0
* data centric computing
* internet/web services  (what about REST?)

leads to Research 2.0?
- Web as research platform
* leveraging community effects - use the biggest social net, use "grassroots" cf Luis von Ahn's work, what's the new publishing paradigm (e.g., pub direct to web?), publish data.
* data centric computing
- Deployment driven research
* infrastructure is critical
* seamless experience
''tsujii'' - need deeper linking - not just url, indexing way: but semantic linkages.
user interaction to make better linkage - made explicit

''yujie zhang'' - chinese ime reconsidered

''key sun choi'' - wikipedia automatically - goal

''sung-hyon myaeng'' - search engines don't care about "collections of pages" - what's WYSIWYG in (search) engines.  Reduce burden of ''using'' data retrieval: task oriented and analyzing results

''gary lee'' - test the qa system, not to ask a real question. how to really make people ask questions.

''wen chin peng'' - ntu - brand new world

''irwin king'' - glocalization, social network, adversarial info management, services

''james lee'' - large scale 

''robert dale'' - 5 yr+ 1) people search, 2) user-contributed codified knowledge (get people to get knowledge), 3) intellectual micro credit

''tim baldwin'' - 10 yr+ firing nlp helps ir? - needle searcher as nlp parts
Lolan Song

* Research Collaboration - 1) regional theme projects, 2) 
* Curriculum Development - open windows kernel for teaching purposes
* Talent Curriculum - 2500+ interns since 1998, fellowship for junior students based on academic credentials
* Academic Exchange - Faculty Summit, 21stC Computing Conf. (future comp trends, audience for students), TechFest in Seattle (lab showcase for internals), Theme Workshops

Diversified UR Programs - Great Wall Plan, Internship prog with Korea, 
Data acquisition and sharing

Since most of the current popular approaches on MT are data driven, parallel data, such as parallel sentence pairs and large scale translation dictionaries for different domains, are critical. Do you have any suggestions on how to acquire these kinds of data technically? And please discuss further on how to collaborate to share the data for MT research in the Asian community.  
Hisami Suzuki, Mei-Yuh Hwang


Xiaodong He, et al. EMNLP 2008 HMM-based System Combo: align translations by word (with epsilon deletes) and find best path (like viterbi)
[[pre 2008|http://www.comp.nus.edu.sg/~kanmy/wiki/knmnynWiki.html]]
Hsiao-Wuen Hon

* theme workshop - external research themes = internet services / mobile computing in education
* faculty summit 
* 21st century computing conference - perhaps in sg this year?

Computing Trend
* tim o'reilly web2.0 article

Multimedia - next trend forward?
* lazy snapping -  interactive image cutout - get hint from user 
* deshadow - project background 
* deployment driven?

guanxi msra?

recaptcha - two words - one to do captcha; one to do data collection

Raghavendra Upendra

Mix generation with parallel data mined from corpora.

two stage alignment, low and high yield runs
Ming Zhou

research goals: nlp in office and search, esp for mobile & multilingual search

nist smt 08: e-c .4099 bleu (best), c-e .3089 (2nd best)

have HIT key lab search - 200+ students MSRA intern feed
''did summer schools in china 2004-''

what about nlp 2.0? web as research platform:
- use wiki,faq to do evaluation and as KB for training

foci: asian NLP, SMT, IR, QA
- interesting: parallel data mining: q "english version" "chinese version", url patterns
- use web and query data in QA for answer and q paraphrase mining
- couplet generation as character / phrasal smt
- lingo: vertical search engine for english writing, use web-mined parallel data
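
sketch of the url-pattern idea for parallel page mining (python). assumptions: the language markers in the regex, the example.org urls, and the function name are all illustrative; the talk did not give the actual patterns.

```python
import re
from collections import defaultdict

# Language marker delimited by / _ - or . on both sides (assumed marker list).
LANG_MARK = re.compile(r"(?<=[/_\-.])(en|zh|eng|chi|cn)(?=[/_\-.])")


def pair_by_url_pattern(urls):
    """Pair URLs that are identical except for a language marker."""
    groups = defaultdict(dict)
    for url in urls:
        m = LANG_MARK.search(url)
        if not m:
            continue  # no language marker, cannot pair
        key = url[:m.start()] + "{lang}" + url[m.end():]
        side = "en" if m.group(1) in ("en", "eng") else "zh"
        groups[key][side] = url
    return [(g["en"], g["zh"]) for g in groups.values()
            if "en" in g and "zh" in g]


urls = [
    "http://example.org/en/about.html",
    "http://example.org/zh/about.html",
    "http://example.org/news_en.html",
    "http://example.org/news_cn.html",
    "http://example.org/contact.html",
]
print(pair_by_url_pattern(urls))
```

candidate pairs would still need a content check (length ratio, dictionary overlap) before being used as parallel data.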

issue: get data from the web or deploy app to get data from the web?
show rankings to get click feedback as training data

Good. I will now be waiting for an outline from you on which I can give you feedback. The ideal outline would consist of a structure into chapters, sections, and possibly subsections - each possibly containing a few lines that describe what is going to be talked about and each possibly including bibliographic references that are going to be covered; but anything along this line will do.
170 phd
60+ faculty
8 full 24 assoc 16 assistant = 48?

100 journal per year, 250 conference pub per year = 2 per year / 5 per year
President, KAIST
new college of ist
EEWS: energy environment water sustainability

Yu Ge

3500 ugrad, 500+ phd 2000 ms
* what's the angle?
** time savings?
** collaboration?


* format annotation
* rules to auto fix

* help people write and organize better?

MagicCards: error report
InkSeine: microsoft japan (downloadable)
* breadcrumb to support take and act on later
* support range of user needs
* inkseine videos and goodies - for tablets

uist 2006 web-based: browsing + scrolling
* enabling web browsers to augment web sites' filtering and sorting functionalities
* translating keyword commands into executable code

introduce shen to david - abb
send wingnews to shen
user involvement
* not as backend
* but peer-to-peer (human computing platform)

lightweight machine learning
* small footprint
* efficiency not accuracy, better to estimate acc. well
* hierarchical

ui as attention area

data as intel inside? or service as intel inside?

browser based deployment of NLP
* as plugin architecture
* as server connections

death of web as data?
ajax css killing this avenue
user involvement not captured in web pages, but databases.

2.5M total cap for the whole thing 
- this week
- revised 

apprehension - first prototype
- evaluation of website
- math and paper but with consultation

genres of websites
- premier cpm microsoft sponsorship
- search engine marketing - cpc
- content context network - cpc
- blog / social network - cpc
- forum marketing - cpc/cpm
- other?

- inter (spread) / intra ()

two types
* branding campaign (roughly .2%)
* tactical campaign
** lead generation
** acquisition (given x kpi)

* time of day of advertisement
* impressions
* avg click through
* position of ad
* campaign time (given)
* budget (given)
Hsin-Hsi Chen

NTU Sentiment Dictionary
CopeOpi: Chinese opinion extraction system for Opinionated Information
Chen Bin

bridging anaphora study
related work: comparable results?
needs relaxed agreement and binding
use features to indicate name of test in hierarchical fashion: STAT5 a part of STATs.
gene and protein names sometimes co-referent,
transposition factors.
confidence values when web results are low? shouldn't it be straightforward.
doesn't distinguish punctuation?
accuracy so low? .6 present.
JCDL - YF (full), Jesse (full), Jin (full or 2 short), Guo Min (short)
SIGIR Poster (Minh)
Aravind Joshi
release mid Feb

PDTB - tutorial at http://www.seas.upenn.edu/~pdtb

Explicit: subordinating and coordinating conjunctions
discourse adverbials "YY. As a result, XX" Arg1:YY Arg2:XX.

Implicit: only infer when adjacency given.  annotators asked to annotate "best" guess for what should be the explicit connector.
* Some cases manifest as alternate lexicon

discourse + attribution markup
hypothesize that all discourse relations (in any language) are binary
Industry vs Academia career paths
SG as valuable non-innovation path?
NLP/IR integration
coref in the real world
scholarly DL support
forward and backward chaining for evidence support
what is business intelligence
Khe Chai Sim

auto lexicon techniques are not very prevalent -- experience shows doing it with linguistic expert knowledge is pretty good

use minimum phone error instead of max likelihood, put metric into the objective function.
adds phone accuracy to ml estimate in E step

q: why phone level for minimum phone error not word error or state error?

speaker independent models: decide to adapt instead
adaptation: max likelihood linear regression MLLR
- regression tree: group features vectors together (cluster) then apply for each cluster a linear transform

precision matrix = inverse of covariance matrix
decoder very expensive: need efficient strategies for decoding.
cannot afford to model entire covariance matrix: if you do, don't have enough data to learn all parameters
thus: learn decorrelation linear transform to make precision matrix that is diagonal (ie, no dependencies, reduce parameters needed to learn)
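
quick count of why the full matrix is unaffordable (python). a symmetric d x d covariance (or precision) matrix has d(d+1)/2 free parameters per gaussian, a diagonal one has d. the 39-dim feature size is my assumption (a common front-end size), not stated in the talk.

```python
def full_cov_params(d):
    """Free parameters in a symmetric d x d covariance/precision matrix."""
    return d * (d + 1) // 2


def diag_cov_params(d):
    """Free parameters when the matrix is constrained to be diagonal."""
    return d


d = 39  # assumed feature dimension
print(full_cov_params(d), diag_cov_params(d))
```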

HLDA-PMM: model global and local variations
1 Feb 2008

Meeting Matters - is event organizer
* negotiation, f/b, onsite logistics
Suntec selected among raffles and grand copthorne
starhub or suntec
regional participation - travel participation - done through afnlp
if emnlp, conf name still stays same
Q: transport to and fro, no - stb to give transport coupons
Q: side trips, get travel agency

Marina Mandarin - high level 
Mid level hotel - tba; 
Student Accomm - a couple for selection (Mount Emily)
4 hotels?

Welcome reception: Pan Pacific / suntec / marina
Banquet: Grand Copthorne / Rasa Sentosa

Splice the videos together or just STB
get sound from suntec

Press Conference 

STB - (for spouse programs, side trips) - swag for other conference display
swag for AMTA/EMNLP - HLT/ACL 2008 - EACL 2009 - SIGIR 2009 - 
ASTAR, ALR6 - local support
Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Dragutin Petkovic, David Steele and Peter Yanker

operates on still and video
video cut to shots and used to generate motion objects.
uses sketches or original images
1. Assessing quality / usefulness

Objective measures of whether the data pulled out is correct: precision/recall/F1 - maybe use terms that they are interested in, such as sensitivity and specificity
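
For reference, a minimal sketch (Python) of how these measures relate, computed from confusion counts. The function name and the example counts are made up for illustration; sensitivity is the same quantity as recall.

```python
def metrics(tp, fp, fn, tn):
    """Compute extraction quality measures from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # same as sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": specificity,
        "f1": f1,
    }


# hypothetical counts: 8 correct extractions, 2 spurious, 2 missed, 88 correct rejections
print(metrics(tp=8, fp=2, fn=2, tn=88))
```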

Subjective measures really depend on whether in fact the right articles and databases are mined out or not.  This depends on whether we know which databases can be pulled from and whether those resources are indeed open-source or not.

2. Comparator

Not sure what this means.  What do others use as a yardstick to measure how things stack up?

3. Requires user studies.  That is like pulling teeth.  Better to have objective measures and pilots on some grants that would be useful (like the stated incipient projects)


2. Specific aims

A successful tool would have to incorporate heavy doses of UI and at least low-hassle correction support.  Manual tools are always best, and automation is only part of the answer.  Setting up the proper training data and annotation would need to be a key part of the project, but not for the pilot grant.

Not sure that automated methods are going to be any more accurate.  Maybe more consistent and exhaustive at best

Start by asking policy makers what they need and what level of support/accuracy they need.  That's needed for the bigger picture in a full proposal.


pilot scope - extract without ui/corrections, canonicalization of HTML, just text (hopefully that's suitable enough to do the study without significant engineering work).  Do pilot on some core areas where searching or pulling is done in an automated manner (saved search).  Don't trace anaphora or data in tables.

expanded scope in pilot: use full text (not abstract, as data is not complete enough).

3.2 section
heuristic localization

OBI ontology/dictionary based information extraction
UMLS database dictionary
provenance when doing extraction
bootstrapping work (in later version of the project?)

5.2 section
5.2.4 prediction measures (sensitivity/specificity)

bmrc(a*star) vs. irg 
300k vs. 1m

Hwee Tou Ng

need data!! supervised preferably: data acquisition
sense priors different between train and test, need domain adaptation
smt integration

q(cyl): smt models also consider context dep word selection

Quan Vu

misspelling correction, accent recovery
Zhou Aoying (software engineering institute ? SEI)

web research infrastructure
storage: bigTable, UTab
use Google File System, mapreduce.
going to do research in recommendation, CF
online ads, e-commerce
mining patterns, intrusion
''also doing academic DL stuff''

useful to them: school architecture and committees, transparency
John Liu, STC Director of Development

130+ ppl, r&D (big D), bridging between R&D.
live search as primary product, escalation chain to unblock issues
live search in asia still minor product in product share, starting regional help.
launch was bad; slow and unusable.

web results in asia is not important, user involvement, posts much better placed.

GuoDong Zhou, JunHui Li, LongHua Qian and QiaoMing Zhu

combine LP advantage (to use unlabeled data)
bootstrapped support vectors (to ease computation training)
use svm to select/weight which already labeled data is critical to do LP

Idea: LP for other problems? comparison with semi supervised?
Tony Hoare

trace data flow (irrespective of storage or data channel)
four types of seq: identity *, sequence ;, parallel ||, choice('box') []
separation allows us to cut program trace into two parts
Jun'ichi Tsujii

U compare flow (Kano, PSB 2008): UIMA based.
Tang Chengjie

CS, Soft Eng schools

research directions: computer image & graphics (You Zhi-Seng), Networking (Li Tao), Data mining (Tang Chengjie), Information fusion (You Zhi-Seng)

Monojit Choudhury, Animesh Mukherjee, Niloy Ganguly
Han Guoqiang

104 phd students, 700 masters, 1200 ugrads
web intelligence
Su Nam Kim

MWE: types: 1) Noun Compounds, 2) VPC, 3) LVC, 4) Idiom, 5) determinerless-PP (e.g., by and large)

- semantic interpretation - no standard set of semantic relations for this, same features used
35% of verbs are VPC, 

* mwe taxonomy (Sag 02)
** lexicalized phrase 
*** ?
*** ?
*** ?
** institutionalized phrase (collocations)

stat features: 1) co-occurrence, 2) substitutability, 3) distributional similarity, 4) semantic similarity, 5) ellipsed predicates (levi 79), 6) linguistic properties (frames)

cotton bag - made of or for cotton?

semeval 07 - ijcnlp08 - head/mod distribution is different.

VPC detection mostly uses parser information but this isn't reliable.

q) noun compounds - easier to identify using base np chunkers
q) domain dependent mwes? 
q) prior information (VPC, semantic relations)
Xiaoyan Zhang, Ting Wang and Huowang Chen

for short documents?
two stories in a pair can be viewed as extensions if temporally related (cf what about different sources)?

only consider pairs, not long chains.
iterative clustering building method -- updating centroid; 2 methods:
1) incremental - add only new unseen terms into centroid vector
2) average - just update centroid 
both seem to do about the same
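
sketch of the two centroid updates as i understood them (python). assumptions: sparse term-weight dicts, "incremental" leaves existing weights alone and only adds unseen terms, "average" is a running mean; the paper may define them differently.

```python
def update_incremental(centroid, doc):
    """Add only terms the centroid has not seen; keep existing weights."""
    updated = dict(centroid)
    for term, weight in doc.items():
        if term not in updated:
            updated[term] = weight
    return updated


def update_average(centroid, n_docs, doc):
    """Running mean of term weights over n_docs old documents plus doc."""
    terms = set(centroid) | set(doc)
    return {
        t: (centroid.get(t, 0.0) * n_docs + doc.get(t, 0.0)) / (n_docs + 1)
        for t in terms
    }


# hypothetical term vectors
centroid = {"quake": 1.0, "city": 0.5}
story = {"quake": 0.5, "rescue": 1.0}
print(update_incremental(centroid, story))
print(update_average(centroid, 1, story))
```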

miss due to articles on same topic but different perspective
false positive due to shared vocabulary

Zhou Aoying (Fudan / East China Normal)

00 MoU / 01 2 phd per year to SoC 
direct admission from ugrad no entrance exam
SoC must waive TOEFL/GRE
Docs on Teaching

- Philosophy: stmt including goals and inno teach methods
- History: modules taught, research students supervised, part in theses and oral exam committees
- Performance Indicators 1) leadership in devel modules and curriculum 2) contributions to textbooks, teaching materials, software, articles, 3) teaching awards
- Future Plans

Docs on Research

- Programme: stmt major areas of research and accomplishments
- Contributions: 1) full list from SPS, 2) stmt citing up to five sig publications and explaining significance, 3) stmt on contributions to co-authored pubs
- Performance Indicators: 1) citations impact analysis, highlight highest and creative works, 2) research grants, 3) research awards, prizes, 4) members of institutional boards, national/intl advisory boards, 5) editorial boards, 6) conference committees 7) reviewer service, 8) appoint as external assessor, 9) invited presentations and workshops
- Future Plans

Docs on Services

- List and elaborate on impact of services to 1) dept/fac/univ, 2) intl acad community, 3) professional/industry/ 4) national and int'l agencies

Supp Dossier
- Teaching Portfolio
- Research Portfolio
Achieve this by using a combination of heuristic rules as well as well-founded text mining.
Rules are used to identify the scope of the applicable sections.
A separate machine learner to learn the pattern instances from marked up data.

Heuristic rules to be compiled by identifying header sections.  Scope of these rules to be dependent on recall/precision trade-off.  In pilot, only treat some versions of standard xml that are worth processing.  Do we need to do some document typing to decide whether or not the document should be classified -- a preprocessing step?

Identified section will then have sentences demarcated.  Use each sentence as decision point, taking into account global features (position, section headers, already identified information?) Soft patterns / ngram tile matching to do sentence classification, then CRF to do segmentation?

Not clear exactly what data gets pulled out.  Tests, dataset descriptions over varying levels of detail?  
Canonicalization/mention of same papers?  Dataset identifiers have specific orthographic patterns (capitals, numbers, versions), acronym expansions, table and figure pointers, citations, segmentation.  Use POS(or tool *is* a customized POS tagger) to throw away certain words for now.  factors as comma lists of items.
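
A minimal sketch (Python) of the orthographic-pattern idea: flag tokens with a run of capitals and an optional number or version suffix as candidate dataset identifiers. The regex and the example sentence are illustrative assumptions only; mixed-case names such as WordNet are deliberately missed by this pattern and would need further rules or the learned classifier.

```python
import re

# Two or more leading capitals, optional letters, optional number/version suffix.
DATASET_LIKE = re.compile(r"\b[A-Z]{2,}[A-Za-z]*(?:-?\d+(?:\.\d+)*)?\b")


def find_dataset_like(sentence):
    """Return tokens whose spelling looks like a dataset identifier."""
    return DATASET_LIKE.findall(sentence)


example = "We evaluate on TREC-9 and MUC7, plus WordNet 2.0 and the Reuters corpus."
print(find_dataset_like(example))
```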

What is the coverage of these methods over actual documents?  Need to show that it would be at least somewhat useful for a sizeable amount of papers.

Point to acronym expansion (our old paper)

Each sentence to go through a classifier.  May end up using CRF++/SVM based training.
Sun Maosong
also see [[Tsinghua Univ.]]
semantic word-formation underlying Lexicon
Timothy Baldwin

Deep Linguistic Resources:

Parsing errors: mostly constructional and lexical gaps (40, 40%)

DLP app #1: finding specific solutions (rather than problem specs) for QA troubleshooting

Li Sheng

User-centric IR, personalization, understanding their requirement or expectation.

User interest-centered IR model - based on PRF to build prob model distribution over query, to create a user profile and interest

Social network analysis of a given user can help.  Is part of the DM of a user to build profile.

Ming zhou, Mu Li, Xiaohua Liu

Woodpecker (COLING 08) - check point mt
- send to Thang

Drag tabs together for comparison
- send to Zaw Lin / Jesse
Ricardo Baeza-Yates (Yahoo! VP Research)

Revised from ECSW 2007

Search as object search with attributes that can be missing, noisy, incorrect.
also infer intention of the object
difficult task: inferring trust/reliability?

nav queries have 95% precision in 1-click nav.  suggest pushing user to that page?

2007 - already clickthrough data is very big (18GB/day) but much smaller than total typed content (170TB), but growing

searchmonkey - query specific search results?

query network structure gives power law distribution. click structure can yield folksonomy / synonyms through query re-writing.
Sun Maosong

120 faculty, top 1 in china, 6 divisions (like 
have joint research labs with sohu, other mnc
one top level org: school of software engineering + 4 departments (CS is one)

web data mining - structured / semi structured / unstructured

hci / media integration - image animation and synthesis for facial generation

state key lab of intelligent tech and systems: 
nlp / trec(vid) / sohu joint lab with them
biomed text extraction
Defensive Design for the Web - Matthew Linderman, Jason Fried
The Practical Guide to Information Design - Ronnie Lipton
Gary Geunbae Lee
continuous classification rather than binary
over unsupervised data using single positive vocabulary instance.
used it168.com corpus again

a) classifier for zones (orthographic as extraction points)
b) classifier for vocabulary

objective vs. subjective
objective classifier didn't work well, replace with sentiment dictionary worked well.

idea:seq of zone analysis: they tried longest seq
Saima Aman and Stan Szpakowicz

use Ekman's 6 classification axes
sentence based classification.
on blogs retrieved from web using seeds!
adds corpus based feature
Chin Yew Lin

how to share data as a corp entity?
how do we turn DATA to VALUE

case in point: qa on web, powerset for nlp search
data is there? www.searchlab.com.cn 

knowledge distillation & dissemination: unstructured, (semi-)structured
use humans to get answers to hard questions
summarization as applied to answers to queries
check WWW 08 on question recommendation
P.S. I'd also like to add you to my group's news mailing list. This is a very low (4 emails per year) mailing list (with the subject [WINGnews]) documenting recent work from my research group, WING. If you're interested, just reply to this message and let me know, otherwise, no worries — I'll only add you to the list if you explicitly agree to it.
21 - connect to IA web api soap, crawler work, importing data acl / citeseer, semi structure work, jcdl
28 -

May 5 -
12 -
19 - late LREC
26 - 
Jun 2 - 
9 - 
16 - mid JCDL/ACL USA
23 - 
30 - 
Jul 7 - end Sigir
14 - 
21 -
28 -
Aug - term starts
Sep - tenure app / GH move?

parsehed parscit
Slideseer integration
webpage parse integration
keyword integration
editing integration
az integration
singapore poly
tech semiconductor
in house 
univ. of newcastle
inkiti.com - php mysql xml
asp perl/CGI

3 years
2 modules


source code is available: svn co http://svn.citeulike.org/svn/ citeulike
some data available.  Need to see whether the data can be imported into FC.
fully backed up nightly.  A real production site.  
Imports metadata from publishers using plugins or policy
Helpful for groups working together in collab framework, but seems low traffic; may need better collab features


Lots of data, possible to scrape by webservice?
List of publications by date
can really use a UI refresh, links hard to see
shows too much data.
quite fast, running tomcat serving jsp.  (saw this by generating an error)
not very current; we need to proactively mine conferences on the web.
also links don't always work, need to spider our own site to ensure accessibility


offers tagging but puts suggested tags up front (to read, read, classic, tutorial)
has topics compiled by topic model work of PI
rexa raw link shows metadata captured by their system
creates bibtex on fly

preprocess: pp2post, rawText2pp, rawText2mappedSection
az: pp2az, pp2cfc
ingest: spiderSingleWebPage, spiderPDF, PDF2rawText
slideseer: ppt2rawText, alignS2D

ruby load balancer connected to gmond
xmltape arcfile for metadata store?  Or at least interface for it?
read oai pmh again
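
reminder sketch of an oai-pmh harvest request (python). verb, metadataPrefix, set and from are standard oai-pmh request arguments; the base url is a placeholder and the helper name is mine.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date  # UTC date such as 2008-01-01
    return base_url + "?" + urlencode(params)


# placeholder repository endpoint
print(list_records_url("http://example.org/oai", from_date="2008-01-01"))
```

the response is xml; large result sets come back in chunks with a resumptionToken that is sent back in the next request.
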
factoid question answering:

- in english

text classification problem

- useful
- in english
- not just topical
- uses links?

- target age?
- genre specific

corporate websites?

1/2 - prelim study

1 - baseline creation - html study, look and feel, link analysis (partial)

1 1/2 - baseline finish, product / study advertiser's speciality considerations

2 - implementation 

2 1/2 - evaluation of first stage / implementation 2

3 - report write-up
goal: relatively dense (ask for 100 papers)
- secondary data analysis
- open access
- geography: asia - australasia
- perhaps? domain - surgery
Thanks for your interest in my research group.  Unfortunately at this
time I don't have any funding for open positions.  However, if you are
interested in coming to NUS, you might apply directly for an internship
position  through the human resources department.  The pertinent
details can be found on our school's website.

total paid from grant
j - 3854.34
z - 3731.98

paid back
j - 960
z - 960
buy textmate

otherwise do final report
hp pouch
swatch watch strap
cute stickers
wild yam progesterone cream
clipper dandelion tea
gatorade powder
Glenny's Lowfat Soy Crisps, in Creamy Ranch flavor 
aveda shampoo
yankee candle - mcintosh apple, clean cotton
brazil nuts unopened
v secrets body wash (2 for S38)
friday before - bag stuffing
name tags must be ready
lanyard - chua sample 
plastic + print double side so flip ok
banquet tickets
cocktail everyone?
3 colors for tutorial + workshop + main
li haizhou - local - get stbs

hosf: payment / visa letter

registration: website
target: research papers

next meeting: march - 24 monday

new profile:
* pressure ulcer, pressure sore, decubitus ulcer, pressure reducing mattress, turning, abrasion score, contributing factors, nursing, adult, moisture, friction, bedridden

*freemedicaljournals.com - try / age and aging
*jpi - blackwell
*national libraries - proquest
*ebsco host
*web archives - internet archive

examples : ebp
- Knowledge and use of evidence-based practice  by allied health and health science professionals in the United Kingdom
Upton and Upton
- Conceptions of evidence, evidence-based medicine, evidence-based practice and their use in nursing: independent nurse prescribers' views
- Attitudes and knowledge of primary care professionals towards evidence-based practice: a postal survey
- Evidence-based medicine in general practice: beliefs and barriers among australian GPs
Young, Ward
- Physicians' attitudes towards evidence based obstetric practice: A questionnaire
Olufemi A Olatunbosun, Edouard, Pierson

* combining words
* project organized
* query expansion
* OAI-PMH for metadata from NLM

<<sparkline 230 320 201>>
ms thesis linguistics
wsd/pos tagger telugu
indic -> sov

timesheets for aug
op 16
admin for computers in nwop
sle as aye?
user id sync on aye? antho as spamfilter?
403,404 errors?

todo: ppt. shelf, back door of rack?
ws = worldsci 
co-found imperial college press 
meeting matters, global publishing, stallion press, ws printers
innovation = joint w nus

wsnet.com = ejournal
wsnetarchives.com = abstracts full text with doi / crossref
- part in crossref
- e proceedings

dr thio - 
sam ge (ece) robotics - multimedia integrative format
chiew ying oi - ws research information manager

rick lee chi wai - mis / content part manager 

use bookmaster running on as100
4 ppl on infrastructure
approx 20K/mnt 3/4th salary

- STM - science,tech,med: mm constrained; only deal with generic pdf