Tutorials
Tutorial | Time | Title | Speaker |
Tutorial A | 12 April (AM) | Database Watermarking | Radu Sion |
Tutorial B | 12 April (PM) | Multilingual Database Systems | Jayant R. Haritsa |
Tutorial C | 14 April (PM) | Video Sequence Indexing and Query Processing | Xiaofang Zhou |
Title:
Database Watermarking
Abstract:
Information, as an expression of knowledge,
is probably the most valuable asset of humanity today.
By enabling relatively cost-free, fast, and accurate access
channels to information in digital form, computers have
radically changed the way we think and express ideas. As
increasingly more of it is produced, packaged and delivered
in digital form in a fast, networked environment, one of
its main features threatens to become its worst enemy:
zero-cost verbatim copies. The inherent ability to produce
duplicates of digital Works at virtually no cost can now be
misused, e.g., for illicit profit. This dramatically increases
the requirement for effective rights protection mechanisms.
Different avenues are available, each with its advantages
and drawbacks. Enforcement by legal means is usually
ineffective, unless augmented by a digital counterpart such
as Information Hiding. Digital Watermarking deploys
Information Hiding as a method of Rights Protection to
conceal an indelible "rights witness" (watermark) within
the digital Work to be protected. The soundness of such
a method relies on the assumption that altering the Work in
the process of hiding the mark does not destroy the value of
the Work, and that it is difficult for a malicious adversary
("Mallory") to remove or alter the mark beyond detection
without destroying the value of the Work. The ability to
resist attacks from such an adversary (mostly aiming at
removing the embedded watermark) is one of the major
concerns in the design of a sound watermarking solution.
Rights Protection for relational data is important in areas
where sensitive, valuable content is to be outsourced. A
good example is a data mining application, where data is
sold in pieces to parties specialized in mining it (e.g. sales
patterns database, oil drilling data, financial data). Other
scenarios involve, for example, online B2B interactions
(e.g. airline reservation and scheduling portals) or sensor
streams, in which data is made available for direct, possibly
interactive use. In [5, 6, 7] Sion et al., and in [1, 2]
Kiernan, Agrawal et al., explore rights protection solutions
for numeric relational data through watermarking. In [4, 8]
Sion introduces the problem of resilient rights proofs for
categorical data. Additionally, in [3] Li et al. extend
the work by Kiernan, Agrawal et al. [1, 2] to provide
for multi-bit watermarks in a direct domain encoding. In
this tutorial we explore these and other related research
efforts. We analyze their resilience and ability to provide
court-time rights proofs. We discuss deployment scenarios
and provide implementation recommendations. We also explore
associated future directions. Time permitting, we intend
to also provide a demonstration of one of the software
packages discussed in [5].
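To make the idea of hiding a keyed mark in numeric relational data concrete, the following is a minimal sketch. It is loosely inspired by the bit-level marking explored in [1, 2] but does not reproduce any published algorithm. The function names, the parameters gamma and num_lsb, and the use of HMAC-SHA256 are assumptions made for illustration only.

```python
import hmac
import hashlib


def _keyed_hash(key: bytes, primary_key: str) -> int:
    """Keyed hash of a tuple's primary key (HMAC-SHA256 is an assumption)."""
    digest = hmac.new(key, primary_key.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big")


def _plan(key: bytes, primary_key: str, gamma: int, num_lsb: int):
    """Decide, from the secret key alone, whether and how to mark a tuple."""
    h = _keyed_hash(key, primary_key)
    if h % gamma != 0:
        return None  # roughly one tuple in gamma is selected for marking
    bit_index = (h // gamma) % num_lsb            # which low-order bit to set
    bit_value = (h // (gamma * num_lsb)) % 2      # value forced into that bit
    return bit_index, bit_value


def embed(rows: dict, key: bytes, gamma: int = 2, num_lsb: int = 2) -> dict:
    """Return a copy of {primary_key: non-negative int} with the mark embedded."""
    marked = {}
    for pk, value in rows.items():
        plan = _plan(key, pk, gamma, num_lsb)
        if plan is not None:
            bit_index, bit_value = plan
            value = (value & ~(1 << bit_index)) | (bit_value << bit_index)
        marked[pk] = value
    return marked


def detect(rows: dict, key: bytes, gamma: int = 2, num_lsb: int = 2):
    """Count how many selected tuples carry the expected bit."""
    total = matches = 0
    for pk, value in rows.items():
        plan = _plan(key, pk, gamma, num_lsb)
        if plan is None:
            continue
        bit_index, bit_value = plan
        total += 1
        matches += int(((value >> bit_index) & 1) == bit_value)
    return matches, total
```

In this sketch, data marked with the owner's key yields matches equal to total, whereas with a different key only about half of the selected tuples are expected to match. A real scheme would turn that gap into a statistical ownership test and would have to bound the distortion against the data's usability constraints.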
References:
[1] Rakesh Agrawal, Peter J. Haas, and Jerry Kiernan. Watermarking relational data:
framework, algorithms and analysis. The VLDB Journal, 12(2):157-169, 2003.
[2] J. Kiernan and R. Agrawal. Watermarking relational databases. In Proceedings
of the 28th International Conference on Very Large Data Bases (VLDB), 2002.
[3] Yingjiu Li, Vipin Swarup, and Sushil Jajodia. A robust watermarking scheme
for relational data. In Proceedings of the Workshop on Information Technology
and Systems (WITS), pages 195-200, 2003.
[4] Radu Sion. Proving ownership over categorical data. In Proceedings of the IEEE
International Conference on Data Engineering (ICDE), 2004.
[5] Radu Sion. wmdb.*: A suite for database watermarking (demo). In Proceedings
of the IEEE International Conference on Data Engineering (ICDE), 2004.
[6] Radu Sion, Mikhail Atallah, and Sunil Prabhakar. Rights protection for relational
data. In Proceedings of the ACM Special Interest Group on Management of Data
Conference (SIGMOD), 2003.
[7] Radu Sion, Mikhail Atallah, and Sunil Prabhakar. Relational data rights
protection through watermarking. IEEE Transactions on Knowledge and Data
Engineering (TKDE), 16(6), June 2004.
[8] Radu Sion, Mikhail Atallah, and Sunil Prabhakar. Ownership proofs for
categorical data. IEEE Transactions on Knowledge and Data Engineering
(TKDE), 2005.
About the Speaker: Radu Sion is an Assistant Professor of
Computer Sciences at Stony Brook University. He
received his PhD (2004) in Computer Sciences from Purdue
University. While at Purdue, Radu was affiliated with
the Center of Education and Research in Information
Assurance and with the Indiana Center of Database
Systems. For most of 2004, Radu visited the IBM
Almaden Research Center while on leave from Stony
Brook.
Radu Sion's current research interests are centered
around inter-connected entities that access data and need to
do so with assurances of security, privacy, and functionality,
preferably fast. His research lies at the intersection of
security, databases and distributed systems. Applications
include: authentication, rights protection and integrity
proofs, trusted reputation and secure storage in peer-to-peer
and ad-hoc environments, data privacy and bounds
on illicit inference over multiple data sources, security in
computation/data grids, and detection of intrusions by access
profiling for online web portals.
Title:
Multilingual Database Systems
Abstract:
Efficient storage and query processing of data spanning multiple
natural languages are of crucial importance in today's globalized
world.
A primary prerequisite to achieve this goal is that the de facto
standard data repositories -- relational database systems -- should
efficiently and seamlessly support multilingual data. In this
tutorial, we will first present a detailed assessment of how well
today's database systems (both commercial and public-domain) handle
the storage, management and processing of
multilingual data. Our results will show that there are
significant performance inefficiencies for languages based on scripts
other than Latin (such as Devanagari, Kanji, and Cyrillic).
We will also outline techniques for alleviating these problems.
With regard to functionality, a major limitation of SQL is that it
does not support querying of data across different natural
languages, that is, cross-lingual queries. To address this lacuna,
we will propose two new SQL operators that support phoneme-based
matching of names, and ontology-based matching of concepts, in the
multilingual world.
An algebra for integrating these new operators with relational
systems will be defined, along with the associated cost models,
selectivity estimators, and access methods. Our experience with a
prototype implementation of these operators on PostgreSQL will be
highlighted.
In a nutshell, this tutorial will present practical approaches
towards realizing the ultimate goal of "natural-language-neutral"
database engines.
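As a rough illustration of what phoneme-based matching of names across scripts could look like, the sketch below compares names by the edit distance between their phoneme sequences. It is not the operator proposed in the tutorial. The PHONEMES table is a hypothetical, hand-written stand-in for a real text-to-phoneme converter, and the function names and the 0.5 threshold are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, 1):
        cur = [i]
        for j, sym_b in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                     # deletion
                cur[j - 1] + 1,                  # insertion
                prev[j - 1] + (sym_a != sym_b),  # substitution or match
            ))
        prev = cur
    return prev[-1]


# Hypothetical, hand-written phoneme sequences; a real system would derive
# these with a language-specific text-to-phoneme converter.
PHONEMES = {
    "Nehru": ["n", "e", "h", "r", "u"],
    "नेहरू": ["n", "eː", "h", "r", "uː"],
    "Gandhi": ["g", "aː", "n", "dʱ", "i"],
}


def phoneme_match(name_a, name_b, threshold=0.5):
    """True if the normalized phoneme edit distance is within the threshold."""
    pa, pb = PHONEMES[name_a], PHONEMES[name_b]
    return edit_distance(pa, pb) / max(len(pa), len(pb)) <= threshold
```

With this table, phoneme_match("Nehru", "नेहरू") holds because only the two vowel lengths differ, while phoneme_match("Nehru", "Gandhi") does not. The threshold trades recall against false matches and would need tuning per language pair.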
Duration:
3 hours
About the Speaker:
Jayant R. Haritsa is on the faculty of the Supercomputer Education
& Research Centre and the Department of Computer Science &
Automation at the Indian Institute of Science, Bangalore. He
received the BTech degree in Electronics and Communications
Engineering from the Indian Institute of Technology (Madras), and
the MS and PhD degrees in Computer Science from the University of
Wisconsin (Madison). His research interests are in database
systems. He is a recipient of the Swarnajayanti Fellowship from the
Government of India, and the Sir C V Raman Young Scientist Award
from the Government of Karnataka. He is an Associate Editor of the
International Journal of Real-time Systems.
Title:
Video Sequence Indexing and Query Processing
Abstract:
Effective and efficient multimedia data retrieval has attracted extensive attention in the last decade.
Among media types, video presents the most complex data, including a sequence of frames (or feature vectors),
audio, motion, meta-data, and many others. With ever heavier use of video devices and advances in video
processing technologies, the amount of video data has grown rapidly for applications such
as advertising, news video broadcasting, video surveillance, personal video archives, and medical video.
The popularity of the WWW enables enormous amounts of video data to be published and shared, and Web search engines
provide users with convenient ways of finding videos of interest. Due to the high complexity of video data,
retrieving video content similar to a user's query from a large database requires: (a) effective
and compact video representations, (b) efficient similarity measurement, and (c) efficient indexing on the compact
representations. Given this demand, indexing video sequences for fast retrieval has recently
attracted much attention in the database community, with and without considering temporal, spatial, and alignment features.
In this tutorial, we focus on the frame sequence of a video, where each frame is a high-dimensional image feature vector.
The number of frames is typically in the hundreds or more, depending on the length of the video. We will review
recent video feature representation models and their similarity measures. Understanding these themes and the
intuition behind them helps in constructing effective sequence indexing structures and developing fast search techniques.
We then discuss state-of-the-art methods for indexing video sequences in high-dimensional space.
We will also discuss open issues, challenges, and potential research trends in video search.
With complex queries emerging in Web search engines, indexing video, the most powerful communication medium,
will be in the limelight. This tutorial covers a wide spectrum of topics in video search from the database
point of view.
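To illustrate what a similarity measure over frame sequences might look like, here is a minimal sketch. It scores two videos by the fraction of frames in each that have a close counterpart in the other, ignoring temporal order. This is one simple choice among many, not a specific model covered in the tutorial. The function names and the eps threshold are assumptions, and the brute-force comparison is exactly the cost that compact representations and indexing aim to avoid.

```python
import math


def frame_match_fraction(a, b, eps):
    """Fraction of frames in `a` that lie within distance eps of some frame in `b`."""
    if not a:
        return 0.0
    hits = sum(1 for fa in a if any(math.dist(fa, fb) <= eps for fb in b))
    return hits / len(a)


def video_similarity(a, b, eps):
    """Symmetric similarity in [0, 1] between two sequences of frame feature vectors."""
    return 0.5 * (frame_match_fraction(a, b, eps) + frame_match_fraction(b, a, eps))


# Toy 2-dimensional "feature vectors"; real frame features are high-dimensional.
video_a = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
video_b = [(0.0, 0.1), (5.0, 5.1)]
score = video_similarity(video_a, video_b, eps=0.2)
```

Here two of the three frames in video_a and both frames in video_b find a counterpart, so the score is the average of 2/3 and 1.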
Duration:
3 hours
Target audience: researchers and practitioners in the area of multimedia databases and information retrieval.
About the Speaker:
Dr Xiaofang Zhou is a Professor in the School of Information Technology and Electrical Engineering, The University
of Queensland, Australia. He received his BSc and MSc degrees in Computer Science from Nanjing University,
and his PhD degree in Computer Science from the University of Queensland in 1994. His research interests include
spatial databases, multimedia databases and high performance query processing. He has published over 90 research
papers, including those at SIGMOD, VLDB, ICDE and the VLDB Journal.