Seminars
Time: Every Wednesday 3 - 4pm
Venue: TR3 S16 Level 3, Room 9
This seminar is a forum for presenting research papers and for reporting recent work done by graduate students and faculty members in the broad area of embedded systems design. Therefore, anyone who is interested in this area is invited to attend this seminar and is also welcome to give a talk. Graduate students working in the area of embedded systems/computer engineering are especially encouraged to do so.
For new students, this seminar will also provide an opportunity to form an impression about the kind of research work that is going on in this area. No prior knowledge in the area of embedded systems is required to benefit from this seminar.
Each week there will be a presentation on some topic related to embedded systems. The presentation should last for about 45 minutes and can be quite informal. The primary idea is to have plenty of discussion on the topic of the presentation.
The seminars are organized by:
Kathy, Nguyen Dang (kathyngu@comp.nus.edu.sg) and Edward, Sim Joon (esim@comp.nus.edu.sg).
Schedule for 2004-2005 Semester 2
| Date | Speaker | Topic | Details |
| 2nd Feb | LI Xianfeng | Modeling Out-of-Order Processors for Software Timing Analysis | |
| 9th Feb | Chinese New Year! |
| 16th Feb | Dr. Nuggehalli | Quest for QoS with the IEEE 802.11e | |
| 23rd Feb | Ramkumar | Design and evaluation of banked register file for SMT processors and its associated renaming algorithm | |
| 2nd March | ZHU Yongxin | Performance and Energy Bounds for Multimedia Applications on Dual-processor Power-aware SoC Platforms | |
| 9th March | Edward, SIM | Fast Spatial and Temporal Copartitioning for Dynamically Reconfigurable Coprocessors | |
| 16th March | YANG Shaofa | A Formal Concurrency Model Based Architecture Description Language for Synthesis of Software Development Tools | |
| 23rd March | Ge Zhiguo | A Reconfigurable Instruction Memory Hierarchy for Embedded Systems | |
| 30th March | Ankit Goel | Interacting Process Classes: Modeling and Simulation | |
| 6th April | S.D. Fernando | Design of Networked Reconfigurable Encryption Engine | |
| 13th April | Balaji Raman | To be decided | |
|
Date: |
2nd February |
Speaker: |
LI Xianfeng |
Title: |
Modeling Out-of-Order Processors for Software Timing Analysis |
Abstract: |
Estimating the Worst Case Execution Time (WCET) of a program on a given processor is important for the schedulability analysis of real-time systems. WCET analysis techniques typically model the timing effects of microarchitectural features in modern processors (such as the pipeline, caches, branch prediction, etc.) to obtain safe but tight estimates. In this talk, I will present the modeling of out-of-order processor pipelines for WCET analysis. This analysis is, in general, difficult even for a basic block (a sequence of instructions with single-entry and single-exit point) if some of the instructions have variable latencies. This is because the worst case execution time of a basic block on out-of-order pipelines cannot be obtained by assuming maximum latencies of the individual instructions. Our timing estimation technique for a basic block is inspired by an existing performance analysis technique for tasks with data dependences and resource contentions in real-time distributed systems. We extend our analysis by modeling the interaction among consecutive basic blocks as well as the effect of instruction cache. Finally, we employ Integer Linear Programming (ILP) to compute the WCET of an entire program. The accuracy of our analysis is demonstrated via tight estimates obtained for several benchmarks. |
|
Date: |
16th February |
Speaker: |
Dr. Nuggehalli (Centre for Electronics Design and Technology, Indian Institute of Science) |
Title: |
Quest for QoS with the IEEE 802.11e |
Abstract: |
The IEEE 802.11e Medium Access Control (MAC) standard is a recent supplement to the IEEE 802.11 family of technologies to support Quality-of-Service (QoS) in wireless networks. The standard provides for differentiated service by introducing different access categories. Each access category corresponds to a different set of channel access parameters such as contention window size and inter-frame spacing. High priority traffic (such as voice or video streams) is able to contend for channel access more successfully than low priority traffic (e.g. data), allowing for better QoS and increased utility. However, the underlying expectation is that nodes will follow the rules set by the standard. We argue that this is an unreasonable assumption because in a non-cooperative environment, individual nodes can improve their performance by classifying low priority traffic as high priority traffic. In this talk, we begin by showing that such untruthful behavior can severely disrupt the QoS capability of 802.11e based networks. We will then provide a discussion of various approaches to mitigate the behavior of selfish nodes. In particular, we will present a strategy-proof, incentive-based approach motivated by the Vickrey-Clarke-Groves mechanism. Finally, we will discuss the implementation of our approach in the context of the 802.11e standard.
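As a rough illustration of what "strategy-proof" means here, the sketch below uses a single-item sealed-bid second-price (Vickrey) auction, the simplest special case of the Vickrey-Clarke-Groves family. It is not the speaker's mechanism; the function names, node names and values are all hypothetical.

```python
def second_price_auction(bids):
    """Return (winner, price) for a sealed-bid second-price auction.

    bids maps a (hypothetical) node name to its declared value.
    The highest bidder wins but pays the second-highest bid.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price


def utility(true_value, node, bids):
    """Payoff of `node`: true value minus price if it wins, else zero."""
    winner, price = second_price_auction(bids)
    return true_value - price if winner == node else 0.0


# Node "a" truly values the resource at 5 (assumed numbers).
truthful = utility(5, "a", {"a": 5, "b": 3, "c": 4})      # wins, pays 4
overbid = utility(5, "a", {"a": 10, "b": 3, "c": 4})      # wins, still pays 4
underbid = utility(5, "a", {"a": 3.5, "b": 3, "c": 4})    # loses
overbid_loss = utility(5, "a", {"a": 10, "b": 3, "c": 7})  # wins, pays 7
print(truthful, overbid, underbid, overbid_loss)
```

In this toy setting, misreporting never beats bidding the true value: overbidding either changes nothing or wins at a loss, and underbidding can only forfeit a profitable win.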
|
Biodata: |
Pavan Nuggehalli received the M.Sc (Engg.) degree in electrical sciences from the Indian Institute of Science, Bangalore, in 1998 and the Ph.D. degree in electrical and computer engineering from the University of California at San Diego, La Jolla, in 2003. His research focuses on architectures, protocols and performance analysis of wireless ad hoc and sensor networks. He is currently an Assistant Professor in the Centre for Electronics Design and Technology, Indian Institute of Science.
|
|
Date: |
23rd February |
Speaker: |
JAYASEELAN Ramkumar |
Title: |
Design and evaluation of banked register file for SMT processors and its associated renaming algorithm
Abstract: |
Simultaneous multithreading (SMT) is a technique in which instructions from
multiple threads can be issued in the same clock cycle. For this to be
feasible, the processor must maintain the context of each thread separately
in hardware. For each thread, the processor needs to maintain a
program counter, a return stack, a complete set of architectural
registers, and re-order and store buffers. SMT processors need a large
number of registers in their register file, which results in long
register access time and high power consumption. In this work, we propose
banking the register file as a solution to reduce the power consumption
and access time. We use analytical models (a modified version of CACTI) to
determine the power, area and access time of the banked register file
configuration. The banked register file configuration shows a reduction of
29% in power consumption and 43% in access time in comparison to a standard
register file in an SMT processor. However, a 15-20% degradation in IPC has
been observed due to conflicts in register access. This necessitates a
suitable register renaming scheme to support the banked register file. We
have also designed one such renaming scheme, and the combined scheme
produces negligible degradation in IPC.
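To make the notion of a register access conflict concrete, here is a minimal sketch that counts, for one cycle, how many register reads exceed the ports available on their bank. The interleaved mapping (register number modulo bank count), the single port per bank and the function name are assumptions for illustration, not details taken from the talk.

```python
from collections import Counter


def bank_conflicts(reg_reads, n_banks, ports_per_bank=1):
    """Count reads in one cycle that exceed a bank's port limit.

    Registers are mapped to banks by simple interleaving
    (reg % n_banks), which is an assumed mapping.
    """
    per_bank = Counter(r % n_banks for r in reg_reads)
    return sum(max(0, n - ports_per_bank) for n in per_bank.values())


# Three reads land in bank 0, so two of them must wait.
clustered = bank_conflicts([0, 4, 8, 1], n_banks=4)
# A renaming that spreads the same reads over all banks avoids the stalls.
spread = bank_conflicts([0, 5, 10, 3], n_banks=4)
print(clustered, spread)
```

The second call hints at why the choice of physical register during renaming matters: the same number of reads causes no conflict once they are spread across banks.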
|
|
Date: |
2nd March |
Speaker: |
ZHU Yongxin |
Title: |
Performance and Energy Bounds for Multimedia Applications on Dual-processor Power-aware SoC Platforms |
Abstract: |
The energy consumption of SoC architectures for multimedia processing is
now as important as their performance
because of the plethora of battery-operated devices running multimedia
applications. In this talk, we present an analytical
framework to evaluate the performance and the power of a dual-processor
SoC architecture that supports dynamic frequency
scaling in an integrated manner. As a result, we identify performance
and energy bounds associated with platform
configuration tradeoffs that include operating frequencies of the
processors, buffer sizes, buffer types and scheduling schemes. It
is difficult and costly for simulation-based approaches to evaluate such
tradeoffs because of the bursty nature of multimedia
traffic and the high variability in the multimedia processing
requirements. Furthermore, our model accounts for the impact of
leakage power of platforms built with nanometer range process technologies.
|
|
Date: |
9th March |
Speaker: |
Edward, SIM Joon |
Title: |
Fast Spatial and Temporal Copartitioning for Dynamically Reconfigurable Coprocessors |
Abstract: |
Processors with programmable and dynamically reconfigurable co-processors are now starting to be commercially available. Compared to their predecessors that do not support dynamic reconfiguration, these processors present a different hardware-software partitioning problem in which reconfiguration at run time has to be considered. In this paper, we present an algorithm that considers the dynamic placement of multiple loop kernels into programmable hardware for the purpose of accelerating overall execution. As far as we know, this is the first partitioning algorithm that accounts for dynamic reconfiguration, its overhead, and multiple loop kernels instantiated with different loop parameters co-located in one configuration. With the necessary profile information, the algorithm can be implemented in a compiler or as a separate component of the design flow. Depending on the reconfiguration cost overhead, our experiments with four benchmarks show a speedup of up to 2.25 times over a pure software execution. |
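The dependence of speedup on reconfiguration cost can be sketched with simple arithmetic. The model below (total accelerated time equals execution time with kernels in the coprocessor plus a fixed cost per reconfiguration) and all numbers are assumptions for illustration; they are not the paper's cost model or measurements.

```python
def speedup(sw_time, hw_time, n_reconfigs, reconfig_cost):
    """Speedup over pure software under an assumed additive cost model.

    sw_time: pure-software execution time
    hw_time: execution time with kernels in the coprocessor,
             excluding reconfiguration
    n_reconfigs, reconfig_cost: number of run-time reconfigurations
             and the (assumed constant) cost of each
    """
    return sw_time / (hw_time + n_reconfigs * reconfig_cost)


cheap = speedup(100, 40, n_reconfigs=2, reconfig_cost=5)    # 100 / 50
costly = speedup(100, 40, n_reconfigs=2, reconfig_cost=30)  # 100 / 100
print(cheap, costly)  # 2.0 1.0
```

With costly reconfiguration the gain from hardware execution is cancelled out, which is why a partitioner has to weigh reconfiguration overhead when deciding which kernels share a configuration.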
|
Date: |
16th March |
Speaker: |
YANG Shaofa |
Title: |
A Formal Concurrency Model Based Architecture Description Language for Synthesis of Software Development Tools |
Abstract: |
Rapidly increasing design and manufacturing non-recurring engineering costs are prompting a shift in electronic design from hardwired application specific integrated circuits to the use of software on programmable platforms. However, in order to minimize the power and performance overhead of such processors, we are seeing the introduction of domain or application specific processors such as network and communication processors. The design of such specialized processors requires software development tools such as simulators and compilers. In order to quickly develop these tools for multiple design points under consideration, it is highly desirable to have them synthesized from formal processor descriptions written in Architecture Description Languages. In this paper, we present the Mescal Architecture Description Language (MADL). MADL features a two-layer structure, a core layer and an annotation layer. The core layer is based on a formal and flexible microprocessor model -- the operation state machine (OSM), which enables MADL to express concurrency at the operation execution level for a wide range of architectures. We address the challenges faced in designing the core layer to combine the OSM model with techniques for achieving compact processor descriptions. The annotation layer features a generic syntax that allows creating annotation schemes to specify implementation dependent or tool specific information. To show the effectiveness of MADL, we present an MADL-based simulator synthesis framework that has been used to generate efficient cycle accurate simulators and instruction set simulators with very low development effort. We also describe our annotation schemes that enable the extraction of architecture properties for use in instruction scheduling and integer-linear-programming based register allocation. Our experimental results demonstrate the efficacy of MADL as a practical and promising language for the development of programmable platforms. |
|
Date: |
23rd March |
Speaker: |
GE Zhiguo |
Title: |
A Reconfigurable Instruction Memory Hierarchy for Embedded Systems |
Abstract: |
The performance of the instruction memory hierarchy is of crucial importance in embedded systems. In this paper, we propose a reconfigurable instruction memory hierarchy for embedded systems whose architectural parameters can be customized for specific applications. The proposed instruction memory hierarchy consists of an instruction cache and a scratchpad memory.
We propose an algorithm to manage this instruction memory hierarchy and optimize its performance. Given a fixed amount of reconfigurable on-chip storage resource and an application, our algorithm determines the sizes of the instruction cache and the scratchpad to best suit the application. It analyzes the application, partitions the available storage resources into cache and scratchpad, and assigns instructions to the cache and scratchpad. Our algorithm aims to reduce the instruction fetch miss rate, improve system performance, and reduce energy consumption.
We have implemented this reconfigurable instruction memory hierarchy on the Altera Nios II FPGA platform. Our experimental results using five benchmarks from the MediaBench and the MiBench suites show that our proposed architecture provides significant performance improvements and energy reduction. |
|
Date: |
30th March |
Speaker: |
Ankit Goel |
Title: |
Interacting Process Classes: Modeling and Simulation |
Abstract: |
We present a modeling framework targeted towards modeling
and simulation of Reactive Systems. Objects with similar behaviors are
grouped together into classes, where the behavior of a class is
described by state machine diagrams and the main structural
relationships between the classes are captured by class diagrams. In
addition, we use sequence
diagrams to specify the interactions between the classes at a behavioral
level. The key idea in our framework is that within a class, objects are
grouped together dynamically based on their past and future behaviors.
This leads to a memory efficient simulation for finite state systems, and
also allows simulation of unboundedly many objects. |
|
Date: |
6th April |
Speaker: |
Shakith Devinda Fernando |
Title: |
Design of Networked Reconfigurable Encryption Engine |
Abstract: |
The current state of the art in FPGAs has given rise to many potential
networked appliances that would be able to download new hardware services
and upgrades and execute them locally. However, this technology has not
been widely used. In this seminar, I present a user scenario of networked
reconfiguration in an encryption application using embedded software and
reconfigurable hardware. The appliance's hardware can be reconfigured at
run-time, allowing it to switch between several encryption standards
and achieve hardware acceleration for each encryption standard. The
reconfiguration bitstream is retrieved from the network, allowing
flexible future scalability. A prototype has been built to demonstrate the
functionality of the networked reconfigurable encryption engine.