Yee Fan Tan, Min-Yen Kan and Dongwon Lee: Search Engine Driven Author Disambiguation
ACM/IEEE Joint Conference on Digital Libraries 2006
Weighting: Inverse Host Frequency (IHF)
• Using hostnames alone may be problematic
– Especially when a host has multiple hostnames, or is also reachable by IP address, and these names have dissimilar distributions
– e.g. www.informatik.uni-trier.de, ftp.informatik.uni-trier.de and 136.199.54.185 are the same host
• Therefore, we also experimented with
– Domain (e.g. uni-trier.de)
– Resolving hostnames to IP addresses
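
The three host representations above (hostname, domain, resolved IP address) can be sketched as a single key function. This is a minimal illustration, not the authors' implementation: the names `host_key`, `naive_domain` and the `resolve` parameter are hypothetical, and taking the last two labels as the domain is a simplifying assumption that fails for suffixes such as co.uk.

```python
import ipaddress
from urllib.parse import urlsplit


def is_ip(host):
    """Return True if host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def naive_domain(host):
    """Keep the last two labels of a hostname (assumption: no multi-part suffixes)."""
    if is_ip(host):
        return host  # an IP address has no domain; leave it unchanged
    return ".".join(host.split(".")[-2:])


def host_key(url, mode="hostname", resolve=None):
    """Map a URL to the key under which its host is counted.

    mode: "hostname", "domain", or "ip".
    resolve: hypothetical callable mapping a hostname to an IP string,
             required only for mode="ip".
    """
    host = (urlsplit(url).hostname or "").lower()
    if mode == "hostname":
        return host
    if mode == "domain":
        return naive_domain(host)
    if mode == "ip":
        if is_ip(host):
            return host
        if resolve is None:
            raise ValueError("mode='ip' needs a resolve callable")
        return resolve(host)
    raise ValueError(f"unknown mode: {mode}")
```

With `mode="ip"`, a real resolver such as `socket.gethostbyname` could be passed as `resolve`; a lookup table works equally well for offline experiments. Under the domain key, the two hostnames from the example collapse to uni-trier.de, while the bare IP address stays separate unless IP resolution is used instead.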