Publications of Weng-Fai Wong

List of Publications

The PDF files provided here may differ from the published versions of the papers. For the definitive version, please refer to the proceedings or journal.

This is my Google Scholar profile.

2017

  1. N.M. Ho, and W.F. Wong, "Exploiting half precision arithmetic in Nvidia GPUs," Accepted by 2017 IEEE High Performance Extreme Computing Conference (HPEC 2017). Waltham, MA, U.S.A. Sep 2017. (Best Paper Finalist)

  2. J. Zhou, R. Ramanathan, W.F. Wong and P.S. Thiagarajan, "Automated Property Synthesis of ODEs Based Bio-pathways Models". Proceedings of the 15th Conference on Computational Methods for Systems Biology. Darmstadt, Germany. pp. 265-282. Sep 2017.

  3. J. Zhou, Y. Zhang, and W.F. Wong. "Fault Tolerant Stencil Computation on Cloud-based GPU Spot Instances." Accepted for publication by IEEE Transactions on Cloud Computing.

  4. Z. Xie, Q. Cai, H.V. Jagadish, B.C. Ooi, and W.F. Wong, "Parallelizing Skip Lists for In-Memory Multi-Core Database Systems". Proceedings of the 2017 IEEE International Conference on Data Engineering (ICDE), San Diego, CA, U.S.A. pp. 119-122, Apr 2017.

  5. N.M. Ho, E. Manogaran, W.F. Wong, and A. Anoosheh. "Efficient floating point precision tuning for approximate computing". Proceedings of The 2017 Asia and South Pacific Design Automation Conference (ASP-DAC). (PPT) Tokyo, Japan. pp. 63-68. Jan 2017.

2016

  1. C. Yao, D. Agrawal, G. Chen, Q. Lin, B.C. Ooi, W.F. Wong, and M. Zhang. "Exploiting Single-Threaded Model in Multi-core In-memory Systems," IEEE Transactions on Knowledge and Data Engineering. vol. 28, No. 10, pp. 2635-2650. Oct 2016.

  2. C. Wang, and W.F. Wong. "TreeFTL: An Efficient Workload-adaptive Algorithm for RAM Buffer Management of NAND Flash-based Devices". IEEE Transactions on Computers. Vol. 65, No. 8, pp. 2618-2630. Aug 2016.

2015

  1. R. Ramanathan, Y. Zhang, J. Zhou, W.F. Wong, and P.S. Thiagarajan, "Parallelized Parameter Estimation of Biological Pathway Models". Proceedings of 2015 Hybrid Systems Biology Workshop. Madrid, Spain. Lecture Notes in Computer Science. No. 9271. pp. 37-57. Springer. Sep 2015.

  2. W.T. Tang, W.J. Tan, R.S.M. Goh, S.J. Turner, and W.F. Wong, "A Family of Bit-Representation-Optimized Formats for Fast Sparse Matrix-Vector Multiplication on the GPU". IEEE Transactions on Parallel and Distributed Systems. Vol. 26, No. 9, pp. 2373-2385. Sep 2015.

  3. N.M. Ho, N. Thoai, and W.F. Wong, "Multi-agent simulation on multiple GPUs". Simulation Modelling Practice and Theory. Vol. 57, pp. 118-132. Sep 2015.

  4. W.J. Tan, W.T. Tang, R.S.M. Goh, S.J. Turner, and W.F. Wong, "A Code Generation Framework for Targeting Optimized Library Calls for Multiple Platforms". IEEE Transactions on Parallel and Distributed Systems. vol. 26, no. 7, pp. 1789-1799. Jul 2015.

  5. P. Roy, J. Wang, and W.F. Wong, "PAC : Program Analysis for Approximation-aware Compilation". Proceedings of 2015 ACM International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). pp. 69-78. Amsterdam, The Netherlands. Oct 2015.

  6. H. Zhang, G. Chen, B.C. Ooi, W.F. Wong, S. Wu, and Y. Xia, "'Anti-Caching'-based Elastic Memory Management for Big Data". Proceedings of The 31st IEEE International Conference on Data Engineering. pp. 1268-1279. Seoul, South Korea. Apr 2015. (PPT of presentation)

2014

  1. J. Wang, P. Roy, W.F. Wong, X. Bi and H. Li, "Optimizing MLC-based STT-RAM Caches by Dynamic Block Size Reconfiguration". Proceedings of The 32nd IEEE International Conference on Computer Design. pp. 133-138. Seoul, South Korea. Oct 2014.

  2. J. Bosboom, S. Rajadurai, W.F. Wong, and S. Amarasinghe, "StreamJIT: A Commensal Compiler for High-Performance Stream Programming". Proceedings of The 2014 ACM Conference on Object-Oriented Programming, Systems, Languages & Applications (OOPSLA). pp. 177-195. Portland, OR, U.S.A. Oct 2014.

  3. P. Roy, M. Manoharan, and W.F. Wong, "EnVM : Virtual Memory Design for New Memory Architectures". Proceedings of ACM International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). Article No. 12. New Delhi, India. Oct 2014.

  4. H.P. Huynh, A. Hagiescu, Z.L. Ong, W.F. Wong, and R.S.M. Goh, "Mapping Streaming Applications onto GPU Systems". IEEE Transactions on Parallel and Distributed Systems. Vol. 25, No. 9, pp. 2374-2385. Sep 2014.

  5. P. Roy, R. Ray, C. Wang and W.F. Wong, "ASAC: Automatic Sensitivity Analysis for Approximate Computing". Proceedings of the 2014 ACM SIGPLAN Conference on Languages, Compilers and Tools for Embedded Systems (LCTES). pp. 95-104. Edinburgh, U.K. Jun 2014.

  6. Z. Sun, X. Bi, H. Li, W.F. Wong, and X. Zhu, "STT-RAM Cache Hierarchy With Multi-retention MTJ Designs." IEEE Transactions on VLSI Systems. vol. 22, no. 6, pp. 1281-1293. Jun 2014.

  7. J. Wang, Y. Tim, W.F. Wong, Z.L. Ong, Z. Sun, H. Li, "A Coherent Hybrid SRAM and STT-RAM L1 Cache Architecture for Shared Memory Multicores". (PPT of presentation) Proceedings of The 2014 Asia and South Pacific Design Automation Conference (ASP-DAC). Paper 7A5. pp. 610-615. Singapore. Jan 2014.

2013

  1. A. Hagiescu, B. Liu, R. Ramanathan, S.K. Palaniappan, Z. Cui, B. Chattopadhyay, P.S. Thiagarajan, and W.F. Wong, "GPU code generation for ODE-based applications with phased shared-data access patterns," ACM Transactions on Architecture and Code Optimization. Vol. 10, Issue 4, Article 55. Dec 2013.

  2. W.T. Tang, W.J. Tan, R. Ray, Y.W. Wong, W. Chen, S-H. Kuo, R.S.M. Goh, S.J. Turner, and W.F. Wong, "Accelerating Sparse Matrix-Vector Multiplication on GPUs using Bit-Representation-Optimized Schemes." Proceedings of The 2013 International Conference on High Performance Computing, Networking, Storage and Analysis (SC 13). Article No. 26, Denver, CO, U.S.A. Nov 2013.

  3. J. Wang, Y. Tim, W.F. Wong and H. Li, "A Practical Low-Power Memristor-based Analog Neural Branch Predictor". Proceedings of 2013 International Symposium on Low Power Electronics and Design (ISLPED). pp. 175-180. Beijing, P.R. China. Sep 2013.

  4. C. Wang, and W.F. Wong, "SAW: Operating System-Assisted Wear Leveling in NAND Flash Devices". Proceedings of the 50th Design Automation Conference (DAC). Article 164, pp. 164:1-164:9. Austin, TX, U.S.A. Jun 2013.

  5. W.T. Tang, W.J. Tan, R. Krishnamoorthy, Y.W. Wong, S-H. Kuo, R.S.M. Goh, S.J. Turner, and W.F. Wong, "Optimizing and Auto-Tuning Iterative Stencil Loops for GPUs with the In-Plane Method". Proceedings of 27th IEEE International Parallel and Distributed Processing Symposium (IPDPS 13). pp. 452-462. Boston, MA, U.S.A. May 2013.

  6. Y. Chen, W.F. Wong, H. Li, C.K. Koh, Y. Zhang, and W. Wen, "On-chip Caches built on Multi-Level Spin-Transfer Torque RAM Cells and Its Optimizations". ACM Journal on Emerging Technologies in Computing Systems. vol. 9, no. 2, pp. 16:1-16:22. May 2013.

  7. C. Wang, and W.F. Wong, "TreeFTL: Efficient RAM Management for High Performance of NAND Flash-based Storage Systems". Proceedings of Design, Automation, and Test in Europe (DATE 13). pp. 374-379. Grenoble, France. Mar 2013.

2012

  1. A. Al-Dujaili, F. Deragisch, A. Hagiescu and W.F. Wong. "Guppy: A GPU-like Soft-Core Processor". Proceedings of the International Conference on Field Programmable Technology 2012. pp. 57-60. Seoul, South Korea. Dec 2012.

  2. W.T. Tang, Y. Wong, W.J. Tan, T. Dubrownik, R.S.M. Goh, S. Kuo, R. Duan, S. Turner, and W.F. Wong. "Tulipse: A Visualization Framework for User-Guided Parallelization". Proceedings of 2012 International European Conference on Parallel and Distributed Computing (EURO-PAR 2012). Lecture Notes in Computer Science, vol. 7484, pp. 4-15. Springer-Verlag. Rhodes Island, Greece. Aug 2012.

  3. B. Liu, A. Hagiescu, S. Palaniappan, B. Chattopadhyay, Z. Cui, W.F. Wong, and P.S. Thiagarajan, "Approximate Probabilistic Analysis of Biopathway Dynamics". Bioinformatics. Vol. 28, No. 11. pp. 1508-1516. 2012.

  4. C. Wang, and W.F. Wong, "Observational Wear Leveling: An Efficient Algorithm for Flash Memory Management". Proceedings of the 49th Design Automation Conference (DAC), pp. 235-242. San Francisco, CA, U.S.A. Jun 2012.

  5. C. Wang, and W.F. Wong, "ADAPT: Efficient Workload-sensitive Flash Management Based on Adaptation, Prediction and Aggregation". Proceedings of the 28th IEEE Conference on Massive Data Storage (MSST). pp. 1-12. Pacific Grove, CA, U.S.A. Apr 2012.

  6. C. Wang, and W.F. Wong, "Extending the Lifetime of NAND Flash Memory by Salvaging Bad Blocks". Proceedings of Design, Automation, and Test in Europe (DATE 12). pp. 260-263. Dresden, Germany. Mar 2012.

  7. H.P. Huynh, A. Hagiescu, W.F. Wong, and R.S.M. Goh, "Scalable Framework for Mapping Streaming Applications onto Multi-GPU Systems". Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2012). pp. 1-10. New Orleans, LA, U.S.A. Feb 2012. (Presentation slides.)


2011

  1. Z. Sun, X. Bi, H. Li, W.F. Wong, Z.L. Ong, X. Zhu, and W. Wu, "Multi-Retention Level STT-RAM Cache Designs with a Dynamic Refresh Scheme." Proceedings of The 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 44). pp. 329-338. Porto Alegre, Brazil. Dec 2011.

  2. H. Li, X. Wang, Z.L. Ong, W.F. Wong, Y. Zhang, P. Wang, and Y. Chen, "Performance, Power and Reliability Tradeoffs of STT-RAM Cell Subjective to Architecture-level Requirement". IEEE Transactions on Magnetics. vol. 47, no. 10, pp. 2356 - 2359. Oct 2011.

  3. Y. Chen, W.F. Wong, H. Li and C.K. Koh, "Processor Caches built using Multi-Level Spin-Transfer Torque RAM Cells". Proceedings of the 2011 International Symposium on Low Power Electronics and Design (ISLPED). pp. 73-78. Fukuoka, Japan. Aug 2011.

  4. C.-T. Yeh, C.-H. Wang, I.J. Huang and W.F. Wong, "Internet-based Hardware/Software Co-design Framework for Embedded 3D Graphics Applications". EURASIP Journal on Advances in Signal Processing. 2011:25. Jul 2011.

  5. A. Hagiescu, H.P. Huynh, W.F. Wong, and R.S.M. Goh, "Automated architecture-aware mapping of streaming applications onto GPUs". Proceedings of the 25th IEEE International Parallel and Distributed Processing Symposium. pp. 467-478. Anchorage, AK, U.S.A. May 2011. (Presentation slides.)

  6. Q. Zhao, D. Koh, S. Raza, S. Amarasinghe, D. Bruening, and W.F. Wong, "Dynamic Cache Contention Detection in Multi-threaded Applications". Proceedings of the 2011 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE 2011). pp. 27-37. Newport Beach, CA, U.S.A. Mar 2011.

  7. A. Hagiescu, and W.F. Wong, "Co-synthesis of FPGA-Based Application-Specific Floating Point SIMD Accelerators". Proceedings of the 19th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. pp. 247-256. Monterey, CA, U.S.A. Mar 2011. (Presentation slides.) (Recording of talk given by Andrei at the conference.)

  8. Z. Sun, C-T. Ye, and W.F. Wong, "A UML 2-based HW/SW Co-Design Framework for Body Sensor Network Applications". Proceedings of Design, Automation, and Test in Europe (DATE 11). pp. 1505-1508. Grenoble, France. Mar 2011.


2010

  1. Q. Zhao, I. Cutcutache, W.F. Wong, "PiPA: Pipelined Profiling and Analysis on Multi-core Systems". ACM Transactions on Architecture and Code Optimization. Vol. 7, Issue 3, Article 13. Dec 2010.

  2. E.J. Sim, W.F. Wong, G. Walla, T. Ziermann, and J. Teich, "Interprocedural Placement-Aware Configuration Prefetching for FPGA-based Systems", Proceedings of the 18th IEEE Symposium on Field-Programmable Custom Computing Machines. pp. 179-182. Charlotte, NC, U.S.A. May 2010.


2009

  1. J.H. Perkins, S. Kim, S. Larsen, S. Amarasinghe, J. Bachrach, M. Carbin, C. Pacheco, F. Sherwood, S. Sidiroglou, G. Sullivan, W.F. Wong, Y. Zibin, M.D. Ernst, and M. Rinard, "Automatically Patching Errors in Deployed Software". Proceedings of the 22nd ACM Symposium on Operating Systems Principles, pp. 87-102. Big Sky, MT, U.S.A. Oct 2009.

  2. C.K. Koh, W.F. Wong, Y. Chen and H. Li, "The Salvage Cache: A fault-tolerant cache architecture for next-generation memory technologies". Proceedings of the 27th IEEE International Conference on Computer Design, pp. 268-274. Lake Tahoe, CA, U.S.A. Oct 2009.

  3. Z. Ge, T. Mitra, W.F. Wong, "A DVS-based Pipelined Reconfigurable Instruction Memory". Proceedings of the 46th Design Automation Conference (DAC), pp. 897-902. San Francisco, CA, U.S.A. Jul 2009.

  4. A. Hagiescu, R.M. Rabbah, and W.F. Wong, "A Computing Origami: Folding Streams in FPGAs". Proceedings of the 46th Design Automation Conference, pp. 282-287. San Francisco, CA, U.S.A. Jul 2009.

  5. C.K. Koh, W.F. Wong, Y. Chen and H. Li, "Tolerating process variations in large, set associative caches: The buddy cache". ACM Transactions on Architecture and Code Optimization. Vol. 6, No. 2, pp. 1-34. Jun 2009.

  6. I. Cutcutache, T.T.N. Dang, W.K. Leong, S. Liu, K.D. Nguyen, L.T.X. Phan, E.J. Sim, Z. Sun, T.B. Tok, L. Xu, F.E.H. Tay, and W.F. Wong, "BSN Simulator: Optimizing Application Using System Level Simulation", Proceedings of the 6th International Workshop on Wearable and Implantable Body Sensor Networks (BSN 2009), pp. 9-14. Berkeley, CA, U.S.A., Jun 2009.

  7. E.J. Sim, W.F. Wong, and J. Teich, "Optimal Placement-aware Trace-based Scheduling of Hardware Reconfigurations for FPGA Accelerators". Proceedings of the 17th IEEE Symposium on Field-Programmable Custom Computing Machines. pp. 279-282. Napa, CA, U.S.A. Apr 2009.

  8. Z. Sun, and W.F. Wong, "A UML-Based Approach for Heterogeneous IP Integration", Proceedings of the 14th Asia and South Pacific Design Automation Conference (ASP-DAC). pp. 155-160. Yokohama, Japan. Jan 2009.


2008

  1. E.J. Sim, T. Mitra, and W.F. Wong, "Defining neighborhood relations for fast spatial-temporal partitioning of applications on reconfigurable architectures". Proceedings of the 2008 IEEE International Conference on Field Programmable Technology. pp. 121-128. Taipei, Taiwan. Dec 2008.

  2. I. Cutcutache, and W.F. Wong, "Fast, frequency-based, integrated register allocation and instruction scheduling", Software: Practice and Experience. Vol. 38, Issue 11, pp. 1105-1126, Sep 2008.

  3. K.D. Nguyen, I. Cutcutache, S. Sinnadurai, S. Liu, C. Basol, E.J. Sim, L.T.X. Phan, T.B. Tok, L. Xu, F.E.H. Tay, T. Mitra, and W.F. Wong, "Fast and Accurate Simulation of Biomonitoring Applications on a Wireless Body Area Network", Proceedings of the 5th International Workshop on Wearable and Implantable Body Sensor Networks (BSN 2008), pp. 145-148. Hong Kong, P.R.C., Jun 2008.

  4. Q. Zhao, I. Cutcutache, and W.F. Wong, "PiPA: Pipelined Profiling and Analysis on Multi-core Systems". Proceedings of The 2008 International Symposium on Code Generation and Optimization (CGO 08), pp. 185-194. Boston, MA, U.S.A. Apr 2008.

  5. Q. Zhao, R.M. Rabbah, S. Amarasinghe, L. Rudolph, and W.F. Wong, "How to do a million watchpoints: Efficient Debugging using Dynamic Instrumentation". The 17th International Conference on Compiler Construction (CC 2008). Lecture Notes in Computer Science, vol. 4959, pp. 147-162. Springer-Verlag. Budapest, Hungary. Apr 2008.


2007

  1. K.D. Nguyen, P.S. Thiagarajan, and W.F. Wong, "A UML-based Design Framework for Time-triggered Applications". Proceedings of the 28th IEEE Real-Time Systems Symposium (RTSS 07). pp. 39-48. Tucson, Arizona, U.S.A. Dec 2007.

  2. C.K. Koh, W.F. Wong, Y. Chen, and H. Li, "VOSCH: Voltage Scaled Cache Hierarchies". Proceedings of The 25th IEEE International Conference on Computer Design (ICCD 07). pp. 496-503. Lake Tahoe, U.S.A. Oct 2007.

  3. Z. Ge, H.B. Lim, and W.F. Wong, "DRIM : A Low Power Dynamically Reconfigurable Instruction Memory Hierarchy for Embedded Systems", Proceedings of The 10th Design, Automation, and Test in Europe (DATE 07). pp. 1343-1348, Nice, France. Apr 2007.

  4. Q. Zhao, R.M. Rabbah, S. Amarasinghe, L. Rudolph, and W.F. Wong, "Ubiquitous Memory Introspection". Proceedings of The 2007 International Symposium on Code Generation and Optimization (CGO). pp. 299-311. San Jose, U.S.A. Mar 2007.


2006

  1. Y.Y. Leow, C.Y. Ng and W.F. Wong, "Generating Hardware from OpenMP Programs". Proceedings of The 2006 IEEE International Conference on Field Programmable Technology. pp. 73-80. Bangkok, Thailand. Dec 2006.

  2. Q. Zhao, J.E. Sim, W.F. Wong, and L. Rudolph, "DEP: Detailed Execution Profile", Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT 2006). pp. 154-163. Seattle, U.S.A. Sep 2006.

  3. Y. Zhu, W.F. Wong, and S. Andrei, "Co-optimization of Performance and Power in Superscalar Processor Design", The 1st International Workshop on Embedded Software Optimization (ESO 2006). Lecture Notes in Computer Science, vol. 4097, pp. 868-878. Springer-Verlag. Seoul, South Korea. Aug 2006.

  4. K.D. Nguyen, G.P.S. Koh, P.S. Thiagarajan, and W.F. Wong, "UML-Based Modeling of Time-triggered Applications". Presented at the 3rd International DAC Workshop UML for SoC Design (UML-SOC).


2005

  1. K.K.K. Win, and W.F. Wong, "Cooperative Instruction Scheduling with Linear Scan Register Allocation". Proc. of The 12th Annual IEEE International Conference on High Performance Computing. Lecture Notes in Computer Science, vol. 3769, pp. 528-537. Springer-Verlag. Goa, India. Dec 2005.

  2. W.F. Wong, "Targeted Data Prefetching". Proc. of the 10th Asia-Pacific Computer Systems Architecture Conference (ACSAC 05), Lecture Notes in Computer Science, vol. 3740, pp. 775-786. Springer-Verlag. Oct 2005.

  3. E.J. Sim, T. Mitra, and W.F. Wong, "Compile-time Design Space Exploration for Dynamically Reconfigurable System-on-a-Chip". Invited presentation at the Optimizing Compiler Assisted SoC Assembly Workshop (OCASA). San Francisco, U.S.A. Sep 2005.

  4. Q. Zhao, R.M. Rabbah, and W.F. Wong, "Dynamic Memory Optimization using Pool Allocation and Prefetching", Workshop on Binary Instrumentation and Applications. St. Louis, U.S.A. Sep 2005. Published in Computer Architecture News, vol. 33, no. 5, pp. 27-33. Dec 2005.

  5. Y. Zhu, W.F. Wong, and C.K. Koh, "A Performance and Power Co-optimization Approach for Modern Processors". Proceedings of the 5th International Conference on Computer and Information Technology. pp. 822-828. Shanghai, P.R.C. Sep 2005.

  6. Z. Ge, H.B. Lim, and W.F. Wong, "A Reconfigurable Instruction Memory Hierarchy for Embedded Systems". Proceedings of the 15th International Conference on Field Programmable Logic and Applications. pp. 7-12. Tampere, Finland. Aug 2005.

  7. Y. Zhu, Z. Sun, A. Maxiaguine, and W.F. Wong, "Using UML 2.0 for System Level Design of Real Time SoC Platforms for Stream Processing". Proceedings of the 11th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications. pp. 154-159. Hong Kong. Aug 2005.

  8. K.D. Nguyen, Z. Sun, P.S. Thiagarajan, and W.F. Wong, "Model-Driven SoC Design: The UML-SystemC Bridge" in "UML for SOC Design" edited by Grant Martin and Wolfgang Müller. pp. 175-197. ISBN 0-387-25744-6. Springer. July 2005.

  9. Z. Sun, Y. Zhu, W.F. Wong, and S.K. Pilakkat, "Design of Clocked Circuits using UML". Proceedings of the Asia and South Pacific Design Automation Conference 2005 (ASP-DAC). pp. 901-904. Jan 2005.

  10. Y. Zhu, W.F. Wong, and S. Andrei, "An Integrated Performance and Power Model For Superscalar Processor Designs." (Poster) Proceedings of the Asia and South Pacific Design Automation Conference 2005 (ASP-DAC). pp. 948-951. Jan 2005.


2004

  1. K.D. Nguyen, Z. Sun, P.S. Thiagarajan, and W.F. Wong, "Model-driven SoC Design Via Executable UML to SystemC", Proceedings of the 25th IEEE International Real-Time Systems Symposium (RTSS). pp. 459-468. Dec 2004.

  2. M.R. George, and W.F. Wong, "Windows CE for a Reconfigurable System-on-a-Chip Processor", Proceedings of the International Conference on Field-Programmable Technology 2004 (FPT). pp. 201-208. Dec 2004.

  3. J.H. Pan, T. Mitra, and W.F. Wong, "Configuration Bitstream Compression for Dynamically Reconfigurable FPGAs", Proceedings of the International Conference on Computer Aided Design 2004 (ICCAD). pp. 766-773. Nov 2004.

  4. R.M. Rabbah, M. Ekpanyapong, H. Sandanagobalane, and W.F. Wong, "Compiler-Orchestrated Prefetching via Speculation and Predication", Proc. of the Eleventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). pp. 189-198. Oct 2004.

  5. A. Maxiaguine, Y. Zhu, S. Chakraborty, and W.F. Wong, "Tuning SoC Platforms for Multimedia Processing: Identifying Limits and Tradeoffs", Proceedings of the Second IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS Merged Conference). pp. 128-133. Sep 2004.

  6. W.H. Tan, P.S. Thiagarajan, W.F. Wong, Y. Zhu and S.K. Pilakkat, "Synthesizable SystemC Code from UML Models". Presented at International Workshop on UML for SoC Design (USOC 2004). Sponsored by Design Automation Conference 2004. Jun 2004.

  7. V-M Panait, A. Sasturkar, and W.F. Wong, "Static Identification of Delinquent Loads", Proc. of 2004 International Symposium on Code Generation and Optimization (CGO 2004), pp. 303-314. Mar 2004.


2003

  1. Z. Ge, J. Liao, and W.F. Wong, "Compiling to FPGAs via an EPIC Compiler's Intermediate Representation", Proc. of IEEE International Conference on Field Programmable Technology, pp. 431-434. Dec 2003.

  2. J. Liao, W.F. Wong, and T. Mitra, "A Model for Hardware Realization of Kernel Loops", Proc. of 13th International Conference on Field Programmable Logic and Application, Lecture Notes in Computer Science, vol. 2778, pp. 334-344. Springer-Verlag. Sep 2003.

  3. L. Peng, W.F. Wong, and C.K. Yuen, "SilkRoad II: mixed paradigm cluster computing with RC_dag consistency", Parallel Computing, vol. 29, no. 8, pp. 1091-1115. Aug 2003.

  4. L. Peng, W.F. Wong, and C.K. Yuen, "The Performance Model of SilkRoad - A Multithreaded DSM System for Clusters", DSM2003: Workshop on Distributed Shared Memory on Clusters, appeared in Proc. of the 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, pp. 495-501. May 2003.

  5. L. Peng and W.F. Wong, "Memory Model Support for Mixed Programming Paradigm in SilkRoad", in Annual Review of Scalable Computing, C.K. Yuen (ed), vol. 5, pp. 65-91. ISBN: 981-238-369-7. Singapore University Press. 2003.


2002

  1. V.S. Gheorghita, W.F. Wong, T. Mitra, and S. Talla, "A Co-simulation Study of Adaptive EPIC Computing," Proc. of IEEE International Conference on Field-Programmable Technology (FPT 2002), pp. 268-275. Dec 2002.

  2. S.P. Seng, K.V. Palem, R.M. Rabbah, W.F. Wong, W. Luk, and P.Y.K. Cheung, "PD-XML: Extensible Markup Language for Processor Description", Proc. of IEEE International Conference on Field-Programmable Technology (FPT 2002), pp. 437-440. Dec 2002.

  3. C.M. Tan, C.P. Tan, and W.F. Wong, "Shell over a Cluster (SHOC): Towards Achieving Single System Image via the Shell," Proc. of IEEE International Conference on Cluster Computing (CLUSTER 2002), pp. 28-36. Sep 2002.

  4. L. Peng, W.F. Wong, and C.K. Yuen, "SilkRoad II: A Multi-Paradigm Runtime System for Cluster Computing", Proc. of IEEE International Conference on Cluster Computing (CLUSTER 2002) (Poster), pp. 443-444. Sep 2002.

  5. J. Kim, W.F. Wong, and K.V. Palem, "A Framework for Data Prefetching using Off-line Training of Markovian Predictors", Proceedings of the International Conference on Computer Design (ICCD 2002), pp. 340-347. Sep 2002.

  6. K. Puttaswamy, L.N. Chakrapani, K.W. Choi, Y.S. Dhillon, U. Diril, P. Korkmaz, K.K. Lee, J.C. Park, A. Chatterjee, P. Ellervee, V. Mooney, K. Palem and W.F. Wong, "Power-Performance Trade-Offs in second level memory used by an ARM-Like RISC Architecture," in Power Aware Computing, Rami Melhem and Robert Graybill, eds. pp. 211-226. Kluwer Academic/Plenum Publishers, May 2002.

  7. Y. Zhu, and W.F. Wong, "Sensitivity Analysis of a Superscalar Processor Model". Proceedings of the Seventh Asia-Pacific Computer Systems Architectures Conference (ACSAC2002), Melbourne, Australia. Conferences in Research and Practice in Information Technology, 6. Lai, F. and Morris, J., Eds. pp. 109-118. Jan 2002.

2001

  1. Y. Chobe, B. Narahari, R. Simha, and W.F. Wong, "Tritanium: Augmenting the Trimaran Compiler Infrastructure To Support IA-64 Code Generation". Proceedings of the First Workshop on Explicitly Parallel Instruction Computing (EPIC) Architectures and Compiler Techniques, pp. 76-79. Dec 2001.

  2. L.N. Chakrapani, P. Korkmaz, V.J. Mooney III, K. Palem, and W.F. Wong, "The Emerging Power Crisis in Embedded Processors: What Can A Poor Compiler Do?" (Invited Talk), Proc. of International Conference on Compilers, Architectures, and Synthesis of Embedded Systems, pp. 176-180. Nov 2001.

  3. K. Palem, S. Talla, and W.F. Wong, "Compiler Optimizations for Adaptive EPIC Processors", First International Workshop on Embedded Software, Lecture Notes in Computer Science, vol. 2211, pp. 257-273. Springer-Verlag. Oct 2001.

2000

  1. L.F. Lau, A.L. Ananda, G. Tan, W.F. Wong, "Gucha: Internet-based Parallel Computing using Java", Proc. of 4th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), pp. 397-408. Dec 2000.

  2. L. Peng, W.F. Wong, M.D. Feng, and C.K. Yuen, "SilkRoad: A Multithreaded Runtime System with Software Distributed Shared Memory for SMP Cluster", Proc. of IEEE International Conference on Cluster Computing (CLUSTER 2000), pp. 243-249. Dec 2000.

  3. L.F. Lau, A.L. Ananda, G. Tan, W.F. Wong, "JAVM: Internet-based Parallel Computing using Java", in Annual Review of Scalable Computing, pp. 59-74. World Scientific Publisher. ISBN 981-02-4413-4. Dec 2000.

  4. M.C. Ng and W.F. Wong, "ORION: An Adaptive Home-Based Software Distributed Shared Memory System", Proc. of 2000 International Conference on Parallel and Distributed Systems (ICPADS 2000). pp. 187-194. 2000.

  5. Y. Zhu and W.F. Wong, "Modeling Architectural Improvements in Superscalar Processors" (Extended Abstract), Proc. of HPC-Asia 2000. vol. 1. pp. 28-30. 2000.

1999

  1. W.F. Wong, "Optimizing Floating Point Operations in Scheme", Computer Languages. vol. 25. pp. 89-112. 1999.

  2. K.S. Loh and W.F. Wong, "Multiple Context Multithreaded Superscalar Processor Architecture", Journal of Systems Architecture. vol. 46, no. 3, pp. 243-258. 1999.

  3. W.F. Wong, "Source Level Static Branch Prediction", Computer Journal. vol. 42, no. 2, pp. 142-149. 1999.

  4. M.C. Ng and W.F. Wong, "Adaptive Schemes for Home-based DSM Systems", Proceedings of the 1999 Workshop on Software Distributed Shared Memory. pp. 13-20. June 1999.

  5. C.P. Tan, W.F. Wong and C.K. Yuen, "tmPVM - Task Migratable PVM", Proceedings of the 2nd Merged Symposium IPPS/SPDP. pp. 196-202. April 1999.

1998

  1. K.S. Loh, M.K. Quek and W.F. Wong, "SPATS - Accurate and Flexible Simulation of Superscalar Processors", Computer Architecture '98: Selected papers of the 3rd Australasian Conference. J. Morris (ed). pp. 133-146. ISBN 981-3083-93-X. Springer-Verlag 1998.

  2. Y. Zhu and W.F. Wong, "The Effect of Instruction Dependency on Superscalar Processor Performance", Computer Architecture '98: Selected papers of the 3rd Australasian Conference. J. Morris (ed). pp. 215-226. ISBN 981-3083-93-X. Springer-Verlag 1998.

1997

  1. Y. Zhu and W.F. Wong, "Performance Analysis of Superscalar Processors using a Queueing Model", Computer Architecture '97: Selected papers of the 2nd Australasian Conference. R. Pose (ed). pp. 147-157. ISBN 981-3083-11-5. Springer-Verlag 1997.

1996

  1. M.D. Feng, W.F. Wong and C.K. Yuen, "BaLinda Lisp: Design and Implementation", Computer Languages, vol. 22, no. 4, pp. 205-214. Dec 1996.

  2. M.D. Feng, W.F. Wong and C.K. Yuen, "Highly Efficient Parallel Lisp Implementation on Distributed Systems", Parallel Computing: State-of-the-Art and Perspectives. E. D'Hollander, G.R. Joubert, F.J. Peters and D. Trystram (eds). pp. 319-326. ISBN 0-444-82490-1. Elsevier Science B.V. 1996.

1995

  1. H. Imai, W.F. Wong and K.F. Loe (eds), Advances in Computing Techniques - Algorithms, Databases and Parallel Processing. ISBN 981-02-2501-6. World-Scientific 1995.

  2. M.D. Feng, W.F. Wong and C.K. Yuen, "Compiling parallel Lisp for a shared memory multiprocessor", Proc. of 7th IASTED Conference on Parallel and Distributed Computing and Systems, pp. 487-490. Oct 1995.

  3. M.D. Feng, W.F. Wong and C.K. Yuen, "Design and implementation of abstract machine for parallel Lisp compilation", Proc. of International Conference on Parallel Processing, II-37-II-44. Aug 1995.

  4. W.F. Wong and E. Goto, "Fast Evaluation of the Elementary Functions in Single Precision". IEEE Transactions on Computers. vol. 44, no. 3, pp. 453-458. Mar 1995.

  5. W.F. Wong, Y. Oyanagi and E. Goto, "Evaluation of the Hitachi S-3800 Supercomputer using Six Benchmarks". International Journal of Supercomputer Applications and High Performance Computing. vol. 9, no. 1, pp. 58-70. Spring 1995.

1994

  1. W.F. Wong and E. Goto, "A Simulation Study on the Interactions between Multithreaded Architectures and the Cache". International Journal of High Speed Computing. vol. 6, no. 2, pp. 343-356. 1994.

  2. W.F. Wong, Y. Oyanagi and E. Goto, "Supercomputer Performance Evaluation using Six Benchmarks", Proc. of IEEE Region 10's Ninth Annual International Conference, vol. 2, pp. 1107-1111. Aug 1994.

  3. S. Ohta, E. Goto, W.F. Wong and N. Yoshida, "Improvement and New Proposal on Fast Evaluation of Elementary Functions" (in Japanese), Joho Shori Gakkai Ronbunshi, (Journal of the Information Processing Society of Japan). vol. 35, no. 5, pp. 926-933. May 1994.

  4. W.F. Wong and E. Goto, "Fast Hardware-Based Algorithms for Elementary Function Computations based on the Rectangular Multipliers", IEEE Transactions on Computers. vol. 43, no. 3, pp. 278-294. Mar 1994.

  5. W.F. Wong and E. Goto, "Fast Evaluation of the Elementary Functions in Double Precision", Proc. of 27th IEEE Hawaii International Conference on Information Science. vol. 1, pp. 349-358. Maui, Jan 1994.

1993

  1. W.F. Wong, E. Goto and N. Yoshida, "Fast Evaluation of Elementary Functions" (in Japanese), Joho Shori Gakkai Ronbunshi, (Journal of the Information Processing Society of Japan). vol. 34, no. 7, pp. 1570-1579. Jul 1993.

  2. W.F. Wong, "Survey of Parallel Lisp Dialects", contributed chapter in C.K. Yuen, Parallel Lisp Systems - A Study of Languages and Architectures, ISBN 0-412-45560-9, Chapman and Hall 1993.

1992

  1. P. Spee, W.F. Wong, M. Sato and E. Goto, "Evaluation of the Continuation Bit in the Cyclic Pipeline Computer", Parallel Computing. vol. 18, no. 12, pp. 1346-1361. Dec 1992.

  2. W.F. Wong and E. Goto, "Improving the Cache Performance of Multithreaded Architectures", Proc. of International Computer Symposium 1992, pp. 1189-1196. Taichung, Dec 1992.

  3. W.F. Wong and C.K. Yuen, "A Model of Speculative Parallelism", Parallel Processing Letters. vol. 2, no. 2&3, pp. 265-272. Sep 1992.

  4. W.F. Wong and E. Goto, "Division and Square-rooting using a Split Multiplier", Electronics Letters. vol. 28, no. 18, pp. 1758-1759. Aug 1992.

1991

  1. W.F. Wong, E. Goto, Y. Oyanagi and N. Yoshida, "Six Benchmark Problems for Number Crunchers", Supercomputer. vol. VIII, no. 6, pp. 39-45. Nov 1991.

  2. P. Spee, W.F. Wong and E. Goto, "Effects of Multiple Instruction Stream Execution on Cache Performance", International Journal of High Speed Computing. vol. 3, no. 2, pp. 135-155. 1991.

  3. W.F. Wong, E. Goto, Y. Oyanagi and N. Yoshida, "Six Benchmark Problems for Number Crunchers", Proc. of the International Symposium on Supercomputing 1991. pp. 120-125. Fukuoka, Nov 1991.

  4. W.F. Wong and E. Goto, "Fast Hardware-Based Algorithms for Elementary Function Computations", Proc. of the International Symposium on Supercomputing 1991. pp. 56-65. Fukuoka, Nov 1991.

  5. P. Spee, W.F. Wong, M. Sato and E. Goto, "Evaluation of the Continuation Bit in the Cyclic Pipeline Computer". Poster Presentation at Parallel Computing 91. London, Sep 1991.

1990

  1. W.F. Wong, J.J. Yee and C.K. Yuen, "A Data Driven, Direct Execution Architecture for A Parallel Lisp Dialect", Proc. of the 1990 U.K. Conference on Parallel Computing in Lisp. Twickenham, London, Nov 1990.

  2. W.F. Wong and K.T. Lua, "A Preliminary Evaluation of a Massively Parallel Processor : GAPP", Microprocessing and Microprogramming. vol. 29, no. 1, pp. 53-62. Jul 1990.

  3. C.K. Yuen and W.F. Wong, "BaLinda Lisp : A Parallel List-Processing Language", Proc. of the 2nd IEEE International Conference on Tools for Artificial Intelligence. pp. 618-624. Fairfax, U.S.A., 1990.

  4. C.K. Yuen and W.F. Wong, "BIDDLE : The Design of a BIdirectional Data Driven Lisp Engine", Proc. of the 13th Australian Computer Science Conference. pp. 421-429. Melbourne, Feb 1990.

1989

  1. W.F. Wong and C.K. Yuen, "BIDDLE : A BIdirectional Data Driven Lisp Engine", Proc. of the 1989 IEEE International Workshop on Tools for Artificial Intelligence. pp. 194-199. Fairfax, U.S.A., Dec 1989.

  2. W.F. Wong and C.K. Yuen, "SARC - A Stack and Register Computer", Proc. of the 1989 International Conference on Computer Architecture and Digital Signal Processing. pp. 194-199. Hong Kong, Oct 1989.

  3. W.F. Wong, "A Stack Addressing Scheme Based on Windowing", ACM SIGARCH Computer Architecture News, Vol 17, No. 1. pp. 63-69. Mar 1989.

Last updated Oct 4, 2017