|
|
|
|
|
Title: Entity Resolution: Overview and Challenges
|
|
Abstract:
Entity resolution is a problem that arises in many information
integration scenarios: We have two or more sources containing records
on the same set of real-world entities (e.g., customers). However,
there are no unique identifiers that tell us what records from one
source correspond to those in the other sources. Furthermore, the
records representing the same entity may have differing information,
e.g., one record may have the address misspelled, another record may
be missing some fields. An entity resolution algorithm attempts to
identify the matching records from multiple sources (i.e., those
corresponding to the same real-world entity), and merges the matching
records as best it can. In this talk I will give an overview of the
Stanford SERF Project, which is building a framework to describe and
evaluate entity resolution schemes. I will discuss some open
problems, including some related to information privacy. (This is
joint work with Qi Su, Tyson Condie, Nicolas Pombourcq, and Jennifer
Widom.)
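
The match-and-merge idea described in the abstract can be illustrated with a minimal sketch. This is not the SERF framework itself. The record fields (name, address, phone), the 0.85 similarity threshold, and the functions `similar`, `match`, `merge`, and `resolve` are all assumptions made up for this example.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.85):
    """Approximate string equality that tolerates small misspellings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match(r1, r2):
    """Hypothetical match rule: similar names, plus equal phones or similar addresses."""
    if not similar(r1["name"], r2["name"]):
        return False
    if r1["phone"] and r2["phone"] and r1["phone"] == r2["phone"]:
        return True
    return bool(r1["address"] and r2["address"]
                and similar(r1["address"], r2["address"]))


def merge(r1, r2):
    """Hypothetical merge rule: fill in missing fields; on conflict keep the longer value."""
    merged = {}
    for field in ("name", "address", "phone"):
        v1, v2 = r1[field], r2[field]
        if v1 and v2:
            merged[field] = v1 if len(v1) >= len(v2) else v2
        else:
            merged[field] = v1 or v2
    return merged


def resolve(records):
    """Repeatedly match and merge until no two remaining records match."""
    resolved = []
    pending = list(records)
    while pending:
        rec = pending.pop()
        for i, other in enumerate(resolved):
            if match(rec, other):
                # A merged record may now match further records, so re-queue it.
                pending.append(merge(rec, resolved.pop(i)))
                break
        else:
            resolved.append(rec)
    return resolved
```

Calling `resolve` on the concatenation of two sources returns one record per inferred entity. Real schemes differ mainly in how the match and merge rules are defined and in how the number of pairwise comparisons is kept down.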
|
|
|
Introduction:
Hector Garcia-Molina is the Leonard Bosack and Sandra Lerner
Professor in the Departments of Computer Science and Electrical
Engineering at Stanford University, Stanford, California. He is the
has been chairman of the Computer Science Department since January 1, 2001.
From 1997 to 2001 he was a member of the President's Information
Technology Advisory Committee (PITAC). From August 1994 to December
1997 he was the Director of the Computer Systems Laboratory at
Stanford. From 1979 to 1991 he was on the faculty of the Computer
Science Department at Princeton University, Princeton, New Jersey.
His research interests include distributed computing systems, digital
libraries and database systems. He received a BS in electrical
engineering from the Instituto Tecnologico de Monterrey, Mexico, in
1974. From Stanford University, Stanford, California, he received an
MS in electrical engineering in 1975 and a PhD in computer science in
1979. Garcia-Molina is a Fellow of the Association for Computing
Machinery and of the American Academy of Arts and Sciences; is a
member of the National Academy of Engineering; received the 1999 ACM
SIGMOD Innovations Award; is a member of the Computer Science and
Telecommunications Board (National Research Council); is on the
Technical Advisory Board of eGuanxi, Kintera, Metreo Markets,
Morhsoft, TimesTen, Verity, Yahoo Search & Marketplace;
and is a member of the Board of Directors of Oracle and Kintera.
|
|
|
|
Title: Towards a Statistically Semantic Web
|
|
Abstract:
The envisioned Semantic Web aims to provide richly annotated
and explicitly structured Web pages in XML, RDF, or
description logics, based upon underlying ontologies and thesauri.
Ideally, this should enable a wealth of query processing
and semantic reasoning capabilities using XQuery and logical inference engines.
However, I believe that the diversity and uncertainty of
terminologies and schema-like annotations will make precise
querying on a Web scale extremely elusive if not hopeless, and the
same argument holds for large-scale dynamic federations of
Deep Web sources.
Therefore, ontology-based reasoning and querying need to be
enhanced by statistical means, leading to relevance-ranked lists as
query results.
This talk presents steps towards such a "statistically semantic" Web
and outlines technical challenges. I discuss how statistically
quantified ontological relations can be exploited in XML retrieval,
how statistics can help in making Web-scale search efficient, and
how statistical information extracted from users' query logs and click streams
can be leveraged for better search result ranking.
I believe these are decisive issues for improving the quality of next-generation
search engines for intranets, digital libraries, and the Web,
and they are crucial also for peer-to-peer collaborative Web search.
|
|
|
Introduction:
Gerhard Weikum is a Research Director at the Max-Planck Institute of Computer Science in Saarbruecken, Germany. Earlier affiliations include the University of the Saarland in Germany, ETH Zurich in Switzerland, MCC in Austin, Texas, and, during a sabbatical, Microsoft Research in Redmond, Washington. Gerhard is co-author of more than 100 refereed publications, and he has written a textbook on Transactional Information Systems, published by Morgan Kaufmann. He received the 2002 VLDB ten-year award for his work on automatic tuning. His current research interests include intelligent search on semistructured data, combining database technology with information retrieval techniques, and "autonomic" peer-to-peer information management. Gerhard serves on the editorial boards of ACM TODS and IEEE CS TKDE, and he is the program committee chair for the 2004 SIGMOD conference and the current president of the VLDB Board of Trustees.
|
|
Invited Talk: Xiao Ji
|
|
Title: The Application and Prospect of Business Intelligence in Metallurgical Manufacturing Enterprises in China
|
|
Abstract:
This paper introduces the application of Business Intelligence (BI) technologies
in metallurgical manufacturing enterprises in China. It sets forth the development
process and successful cases of BI at Shanghai Baoshan Iron & Steel Co., Ltd.
(Shanghai Baosteel for short), and puts forward a methodology suited to the
construction of BI systems in metallurgical manufacturing enterprises in China.
Finally, it looks ahead to the next generation of BI technologies at Shanghai Baosteel.
It should also be mentioned that the Data Strategies Dept. of Shanghai
Baosight Software Co., Ltd. (Shanghai Baosight for short) and the Technology Center
of Shanghai Baoshan Iron & Steel Co., Ltd. support and conduct research
on BI solutions at Shanghai Baosteel.
|
|
|
Introduction:
Dr. Xiao Ji received his PhD degree in Computer Science from Northeast University, Shenyang,
Liaoning, and is presently a Senior Engineer and the General Manager of the Data Strategies
Dept., Shanghai Baosight Software Co., Ltd. He has been engaged in theoretical study and
practical development of data warehousing, data mining and Business Intelligence for
about ten years. He is also an academic member of the Database Society of China Computer
Federation (CCF) and the Associate General Secretary of Shanghai Information Association.
He has acted as principal investigator for nearly thirty engineering and research
projects. He has won several awards from the Chinese Metallurgy Bureau and Shanghai
Baosteel for his excellence in science and technology. He has published a book
titled "Data Warehouse Engineering Methodology". He has been invited to serve as a postgraduate
supervisor from IT enterprises by Shanghai Jiaotong University, Northeast University,
Shenyang, and East China Normal University, Shanghai, and has supervised more than
ten postgraduate students. He has also published more than 20 papers in
various academic journals and at international conferences on database and
information systems.
|
|