SMA CS Research
Supporting Task-Dependencies on the ALiCE Grid
Student : Asankha Chamath Perera
SMA Supervisors : Assoc Prof Teo Yong Meng (Singapore)
& Prof Alan Edelman (MIT)
Abstract:
This dissertation investigates supporting parallel
Java applications with task-dependencies in the distributed shared
memory ALiCE Grid. Programming in ALiCE is supported by a set of
programming templates, which hide the complexities of the Grid infrastructure.
The main templates available within the ALiCE system support the
definition of computationally intensive and parallelizable Tasks,
and of Task Generators, which control the application flow. The Task
Generators and Tasks of an application execute within the different
Java virtual machines of the Producer nodes that volunteer computational
resources to the ALiCE Grid. A new set of programming templates
allows programmers to develop applications with task-dependencies
using elegant and efficient code, while maintaining backward
compatibility with existing applications. The enhancements
to the templates, which are modeled on Java threads, allow
programmers who are new to Grid computing to harness these
capabilities with minimal and simple code.
Support has been implemented to allow tasks to
spawn subtasks and wait for their completion, synchronize at barriers,
and optionally collect their results and detect exceptions encountered
during remote execution. The enhanced runtime system supports the
concurrent execution of multiple task and task generator instances
at Producer nodes, allowing blocked jobs to remain active until
they are resumed, with minimal overhead to the system. The new templates
allow the same application to be submitted in batch and interactive
modes of execution without code duplication or recompilation. Because
on-line output file replication is supported for interactive-mode
job submissions, the implementation of the visualization component
has been separated from the Grid application.
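The spawn, wait, collect, and exception-detection pattern described above can be illustrated with a minimal local sketch. It is only an analogy: it uses Python's standard thread pool rather than the ALiCE templates, and the names `sub_task`, `parent_task`, and `run` are hypothetical and do not come from the dissertation.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def sub_task(n):
    """Hypothetical unit of work; fails on negative input to show error handling."""
    if n < 0:
        raise ValueError("negative input")
    return n * n


def parent_task(inputs, pool):
    """Spawn one sub task per input, wait for all of them, then collect outcomes."""
    futures = [pool.submit(sub_task, n) for n in inputs]
    wait(futures)  # acts as a barrier: returns once every sub task has finished
    results, errors = [], []
    for future in futures:
        exc = future.exception()  # exception raised inside the sub task, if any
        if exc is None:
            results.append(future.result())
        else:
            errors.append(exc)
    return results, errors


def run(inputs):
    """Run the parent task in the calling thread with a small local worker pool."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return parent_task(inputs, pool)
```

Calling `run([1, 2, -3, 4])` returns the results of the successful sub tasks together with the exception raised by the failing one, which mirrors the optional result collection and exception detection mentioned in the abstract.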
SVM Active Learning and Applications in Image Retrieval Systems
Student : Deng Kun
SMA Supervisors : Assoc Prof Lee Wee Sun (Singapore)
& Prof Leslie Pack Kaelbling (MIT)
Abstract:
An SVM active learning based image retrieval system
is implemented for research purposes. Some of the interesting properties
of the system are studied, and experiments show that the active learning
method outperforms normal SVM learning and many other relevance-feedback
algorithms if the query concept is “separable”. Algorithms
with different SVM sampling criteria are also tried out, namely
threshold-based random sampling and most-orthogonality-based
sampling. However, experiments also show that engineering effective
active learning querying components remains a subtle task, as many
seemingly reasonable sampling criteria for choosing a pool instance
can perform even worse than random sampling.
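To make the idea of a sampling criterion concrete, here is a minimal sketch of one widely used criterion: query the unlabeled pool instance closest to the current SVM decision boundary. This is an assumption chosen for illustration and is not necessarily either of the criteria evaluated in the thesis; the linear decision function and the names `decision_value` and `select_query` are hypothetical.

```python
def decision_value(weights, bias, x):
    """Signed score of x under a linear decision function w . x + b."""
    return sum(w_i * x_i for w_i, x_i in zip(weights, x)) + bias


def select_query(weights, bias, pool):
    """Return the index of the pool instance closest to the decision boundary.

    The instance with the smallest absolute decision value is the one the
    current classifier is least certain about, so its label is requested next.
    """
    return min(
        range(len(pool)),
        key=lambda i: abs(decision_value(weights, bias, pool[i])),
    )
```

For example, with `weights = [1.0, -1.0]`, `bias = 0.0`, and a pool of `[(3.0, 0.0), (1.0, 0.9), (-2.0, 2.0)]`, the second instance lies nearest the boundary and would be shown to the user for feedback.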
Reinforcement Learning in Optimal Ventilator Control
Student : Ding Yang
SMA Supervisors : Assoc Prof Leong Tze Yun (Singapore)
& Prof Leslie Pack Kaelbling (MIT)
Abstract:
This dissertation reports on the use of reinforcement learning
and other artificial intelligence techniques to obtain an optimal policy
for controlling ventilators. It provides a general method to support
decision-making during ventilation in hospitals. Ventilator control
is a complex and challenging problem that involves many attributes,
covering both patient physiological features and mechanical settings. My
work focuses on the Assistant Control ventilation mode, which is
simpler and the most widely used in clinical practice. A new system
structure for this problem is described, with a physiological
model and a policy learner as its two major parts. Instead
of developing the respiratory system functions, my work is based
on clinical data from Intensive Care Units in a local hospital.
The necessary processing of the raw data, guided by medical domain
knowledge, is suggested. I use a simple statistical model to
estimate the human respiratory model. Interestingly, this physiological
model fits the testing data well. However, another approach using
artificial neural networks gives even better results, so I adopt the
neural network model in the second part of my work. Q-learning is
used to obtain an optimal policy for controlling the ventilator
settings. The ventilator control problem is converted to a model-based
reinforcement learning problem, and value iteration is used to
determine the optimal control policy. Experiments show
the feasibility of this approach. Due to limitations
such as expensive computation, the method does not extend effectively
to large state spaces. Its advantages and disadvantages are discussed,
along with possible improvements.
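Value iteration on a known model can be sketched as follows. The three-state toy problem is purely an assumption for illustration: the state names, actions, deterministic transitions, and reward are hypothetical and are not taken from the dissertation's clinical model or data.

```python
STATES = ["low", "ok", "high"]           # hypothetical coarse patient states
ACTIONS = ["decrease", "hold", "increase"]
SHIFT = {"decrease": -1, "hold": 0, "increase": 1}


def transition(state, action):
    """Toy deterministic model: returns a list of (next_state, probability)."""
    index = STATES.index(state) + SHIFT[action]
    index = max(0, min(len(STATES) - 1, index))  # clip at the boundaries
    return [(STATES[index], 1.0)]


def reward(state, action, next_state):
    """Reward 1 for reaching or staying in the target state, otherwise 0."""
    return 1.0 if next_state == "ok" else 0.0


def q_value(state, action, values, gamma):
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(
        prob * (reward(state, action, nxt) + gamma * values[nxt])
        for nxt, prob in transition(state, action)
    )


def value_iteration(gamma=0.9, tol=1e-8):
    """Repeat Bellman optimality backups until the largest change is below tol."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            best = max(q_value(s, a, values, gamma) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(s, a, values, gamma))
        for s in STATES
    }
    return values, policy


values, policy = value_iteration()
```

Each sweep touches every state and action, so the cost grows with the size of the state space. This is consistent with the scaling limitation the abstract mentions.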
Distributed Matlab on the ALiCE Grid
Student : Lee Yih
SMA Supervisors : Assoc Prof Teo Yong Meng (Singapore)
& Prof Alan Edelman (MIT)
Abstract:
Current technology trends show an increasing interest
in parallel computing and distributed computing. In particular,
grid computing is widely believed to be one of the emerging technologies
that will change the world. The ALiCE grid technology being developed
here at NUS is based on a distributed shared memory (DSM) architecture.
While there are currently at least 27 parallel Matlab implementations,
there are so far no Matlab implementations on a grid system. This
thesis describes the design and implementation of Matlab*G, a distributed
Matlab on the ALiCE grid. The design draws on the idea of Matlab*P,
a parallel Matlab developed at MIT. This thesis also describes Matlab*J,
a distributed Matlab on DSM. Although Matlab*J is a component of
Matlab*G, it functions independently of ALiCE and can be used
on its own. Matlab*J is the first distributed Matlab
on a DSM system. Although Matlab*J is implemented on Sun
JavaSpaces, the design can be modified to use other DSM systems
such as GigaSpaces. The ALiCE system used in this thesis is likewise
based on Sun JavaSpaces, but it also supports other DSM systems.
Component Based Query Processor
Student : Poorna Subhash Pullipudi
SMA Supervisors : Assoc Prof Tan Kian Lee (Singapore)
& Prof Stuart Madnick (MIT)
Abstract:
Present-day commercial database systems have
evolved over time by accumulating many features and functions. As a
result, these database systems are very large in size and monolithic in
structure. In this project we considered a component-based approach
to developing database systems. We focused on the query
processor subsystem of the database system. We proposed a component-based
layered architecture for the query processor, with the granularity
of the components at the operator level. We also proposed guidelines
for developing the components and composing the database system from
off-the-shelf components. We evaluated the ease of the approach
by implementing a few components and composing the system. The
component-based approach to the query processor guarantees a minimal
footprint, predictability, and extensibility.
Sized Regions for Real-Time Java
Student : Sin Mong Leng
SMA Supervisors : Assoc Prof Chin Wei Ngan (Singapore)
& Assoc Prof Martin Rinard (MIT)
Abstract:
This report introduces Sized Regions for Real-Time
Java, which brings together sized type analysis, regions,
and Java. Annotated sized regions are used to check the correctness
of region usage and to detect potential dangling pointers caused by
programming errors. Sized regions use sized types to check annotated size
information, paving the way for size inference at a later
stage. In this report we identify a subset of Java for our analysis.
Annotated region syntax is checked for correctness. Size annotations
are introduced for further checking and for gathering size information.
Issues such as fixed-point analysis and structural aliasing
are discussed briefly.
Dynamic Optimization via Hyper-Threading
Student : Varun Talwar
SMA Supervisors : Assoc Prof Wong Weng Fai (Singapore)
& Prof Saman Amarasinghe (MIT)
Abstract:
Dynamic linking, translation and compilation have
enabled software to dynamically adapt to new environments. This
is becoming increasingly important with the distribution of software
via the internet. Dynamic optimization is an attempt to optimize
code as it runs. This removes the need for access to the full
source code of the application and also enables optimizations across
dynamically linked code. Since dynamic optimization is done at runtime,
the main challenge in engineering such a system is to keep the overhead
involved minimal so that it doesn’t outweigh any benefits
that may be gained from the optimizations. This thesis introduces
the concept of sideline optimization, whereby a low-priority thread
is spawned to perform code optimization while the main application
runs with minimal interference from the dynamic optimization runtime
system. A multithreaded extension to the dynamic optimization framework
DynamoRIO is proposed. The performance of this system on a traditional
symmetric multiprocessor system as well as on a simultaneous multithreading
architecture, as realized by the Intel hyperthreading processors,
is studied. The experiments show that for some applications, a speedup
of up to 36% over the original statically optimized code is achievable.
Intelligent Information Integration for Colorectal Cancer Management
Student : Wei Songjie
SMA Supervisors : Assoc Prof Leong Tze Yun (Singapore)
& Prof Stuart Madnick (MIT)
Abstract:
A significant proportion of the information required
for biomedical research is recorded in on-line citation databases
such as MEDLINE. Such information is important for clinical decision
analysis. Here we present an automatic approach to extracting semantic
relations from different data sources to facilitate decision model
construction for colorectal cancer management. We make use of the
Unified Medical Language System (UMLS) both to overcome concept
heterogeneity and to construct candidate semantic relations. Each
relation is evaluated by sending a query back to the data source and
obtaining statistical evidence for the relation. The abstracts
of citations are also parsed to find supporting evidence. Decision
elements and possible relationships among them may be derived from
the semantic relations to support construction of a clinical decision
model. The final relation diagram can also help to construct a multilevel
influence diagram, which is useful for decision-driven model construction
for clinical problems.