Publications of Weng-Fai Wong

List of Publications of Weng-Fai WONG

The PDF files provided may not necessarily be the exact form of the paper that appeared. To obtain that, please refer to the proceedings/journal.

This is my Google Scholar profile.

2026

W. Li, B. Gao, and W.-F. Wong, "CARBS: Compiler Autotuning via Randomized Biased Search." Accepted by The 35th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 26).
H. Tan, Y. Chen, G. Alonso, W.-F. Wong, and B. He, "Approaching Shannon Bound with Lossless LLM Weight Compression." Accepted by 53rd Annual International Symposium on Computer Architecture (ISCA).
F. Yu, H. Tan, Y. Chen, W.-F. Wong, and B. He, "XtraMAC: An Efficient MAC Architecture for Mixed-Precision LLM." Accepted by 53rd Annual International Symposium on Computer Architecture (ISCA).
Y. Fu, Y. Chen, C. Chen, B. He, and W.-F. Wong, "Geometric Partition for Billion-Scale Approximate Nearest Neighbor Search." Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE).
W. Li, B. Gao, and W.-F. Wong, "Texplorer: Efficient Tensor Program Optimization for GPUs via Accurate Search Space Bounding." Accepted by IEEE Transactions on Parallel and Distributed Systems.
Q. Wang, H. Lv, Y. Zhang, W.-F. Wong, and B. He, "Incremental GNN Embedding Computation on Streaming Graphs". Accepted by The 42nd IEEE International Conference on Data Engineering (ICDE).
Z. Yan, S. Wang, K. Tang, Z. Bai, and W.-F. Wong, "HyperSNN: A new efficient and robust deep learning model for resource constrained control applications." Accepted by The 2026 American Control Conference (ACC 2026). New Orleans, U.S.A. May 2026.
Z. Yan, J. Mao, Q. Liu, F. Li, T. Luo, G. Pan, B. Zhu, and W.-F. Wong, "Otters: An Energy-Efficient Spiking Transformer via Optical Time-to-First-Spike Encoding". Accepted by The 14th International Conference on Learning Representations (ICLR 2026). Rio de Janeiro, Brazil. Apr 2026.
C. Wang, Z. Yan, Z. Zhou, X. Chen and W.-F. Wong, "Energy-Efficient and Dequantization-Free Quantization of LLMs: A Spiking Neural Network Approach to Salient Value Mitigation". Accepted by The ACM Web Conference 2026 (WWW 2026), Dubai, U.A.E. Apr 2026.
H. Tan, Y. Chen, X. Chen, Q. Zhang, C. Chen, W.-F. Wong, and B. He, "RidgeWalker: Perfectly Pipelined Graph Random Walks on FPGAs." Accepted by 32nd International Symposium on High-Performance Computer Architecture (HPCA 2026). Sydney, Australia. Feb 2026.

2025

M. Yin, B.C.M. Choong, C. Qu, R.S.M. Goh, W.-F. Wong and T. Luo, "Optimizing Neural Networks with Learnable Non-Linear Activation Functions via Lookup-Based FPGA Acceleration." Accepted by the International Conference on Computer Aided Design 2025 (ICCAD). Munich, Germany. Oct 2025.
B. Xu, Z. Wen, Y. Chen, W. Liu, W.-F. Wong, and B. He, "ScalaGBM: Memory Efficient GBDT Training for High-Dimensional Data on GPU." Accepted by the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Toronto, Canada. Aug 2025.
K. Tang, Z. Yan, and W.-F. Wong, "Sorbet: A Neuromorphic Hardware-compatible Transformer-based Spiking Language Model," Accepted as poster by Forty-Second International Conference on Machine Learning (ICML 2025). Vancouver, Canada. Jul 2025.
F. Yu, H. Tan, X. Chen, Y. Chen, B. He, and W.-F. Wong, "Clementi: Efficient Load Balancing and Communication Overlap for Multi-FPGA Graph Processing." Accepted by 2025 ACM SIGMOD/PODS International Conference on Management of Data.
Q. Wang, Y. Chen, B. He, and W.-F. Wong, "Scalable and Load-Balanced Full-Graph GNN Training on Multiple GPUs." Accepted by the IEEE Transactions on Knowledge and Data Engineering (TKDE).
Y. Chen, F. Yu, D. Wu, W.-F. Wong and B. He, "Configurable DSP-Based CAM Architecture for Data-Intensive Applications on FPGAs." Accepted by The 62nd Design Automation Conference (DAC). San Francisco, CA, U.S.A. Jun 2025.
Z. Wang, T. Luo, C. Liu, W. Liu, R.S.M. Goh, W.-F. Wong, "Enabling Energy-Efficient Deployment of Large Language Models on Memristor Crossbar: A Synergy of Large and Small." IEEE Transactions on Pattern Analysis and Machine Intelligence. vol. 47, issue 2, pp. 916-933. DOI: https://doi.org/10.1109/TPAMI.2024.3483654 Feb 2025.

2024

Z. Yan, K. Tang, J. Zhou, W.-F. Wong, "Low Latency Conversion of Artificial Neural Network Models to Rate-encoded Spiking Neural Networks", Accepted by IEEE Transactions on Neural Networks and Learning Systems. Dec 2024.
J. Shen, Y. Chen, W.-F. Wong, E.-C. Chang, "T-Edge: Trusted Heterogeneous Edge Computing", Accepted by The Annual Computer Security Applications Conference (ACSAC) 2024. Dec 2024.
Y. Fu, C. Chen, Y. Chen, W.-F. Wong, and B. He, "Vista: Vector Indexing and Search for Large-scale Imbalanced Datasets." Accepted by The 41st IEEE International Conference on Data Engineering (ICDE).
Y. Fu, X. Chen, W.-F. Wong, B. He, and C. Chen, "Optimizing the Number of Clusters for Billion-scale Quantization-based Nearest Neighbor Search." IEEE Transactions on Knowledge and Data Engineering (TKDE). vol. 36, issue 11, pp. 6786-6800. DOI: https://doi.org/10.1109/TKDE.2024.3408815 Nov 2024.
Z. Wang, T. Luo, R.S.M. Goh, W. Zhang and W.-F. Wong, "Optimizing for In-memory Deep Learning with Emerging Memory Technology." IEEE Transactions on Neural Networks and Learning Systems. vol. 35, no. 11, pp. 15306-15320. DOI: 10.1109/TNNLS.2023.3285488. Nov 2024.
Z. Yan, W. Chu, Y. Sheng, K. Tang, S. Wang, Y. Liu, and W.-F. Wong, "Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-Terminal Coding Sequences." American Chemical Society (ACS) Synthetic Biology, vol. 13, issue 9, pp. 2960-2968. DOI: https://doi.org/10.1021/acssynbio.4c00371. Sep 2024.
B. Gao, Z. Wang, Z. He, T. Luo, W.-F. Wong, Z. Zhou, "IMI: In-memory Multi-job Inference Acceleration for Large Language Models". Proceedings of the 53rd International Conference on Parallel Processing (ICPP). pp. 752-761. DOI: https://doi.org/10.1145/3673038.3673053. Aug 2024.
D. Li, M.M. Wong, Y.S. Chong, J. Zhou, M. Upadhyay, A. Balaji, A. Mani, W.-F. Wong, L.S. Peh, A.T. Do, B. Wang, "1.63pJ/SOP Neuromorphic Processor with Integrated Partial Sum Routers for In-Network Computing", IEEE Transactions on Very Large Scale Integration (VLSI) Systems. Preprint pp. 1-8.
K. Tang, Z. Yan, and W.-F. Wong. "OneSpike: Ultra-Low Latency Spiking Neural Networks." Proceedings of The 2024 International Joint Conference on Neural Networks (IJCNN 2024). DOI: https://doi.org/10.1109/IJCNN60899.2024.10651169. Jul 2024.
D. Gerlinghoff, B.C.M. Choong, R.S.M. Goh, W.-F. Wong, and T. Luo. "Table-Lookup MAC: Scalable Processing of Quantised Neural Networks in FPGA Soft Logic". (arXiv version) Proceedings of The 32nd ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2024). pp. 235-245. Apr 2024.
M. Upadhyay, R. Juneja, W.-F. Wong, and L.-S. Peh, "NOVA: NoC-based Vector Unit for Mapping Attention Layers on a CNN Accelerator". Design, Automation, and Test in Europe (DATE 2024).

2023

Q. Wang, Y. Chen, W.-F. Wong, and B. He. "HongTu: Scalable Full-Graph GNN Training on Multiple GPUs." Proceedings of 2024 ACM SIGMOD/PODS International Conference on Management of Data. vol. 1, issue 4, article no.: 246, pp. 1-27. Dec 2023.
D. Gerlinghoff, T. Luo, R.S.M. Goh, and W.-F. Wong. "Desire Backpropagation: A Lightweight Training Algorithm for Multi-Layer Spiking Neural Networks based on Spike-Timing-Dependent Plasticity". Neurocomputing. Vol. 560, Article No. 126773. DOI: https://doi.org/10.1016/j.neucom.2023.126773. Dec 2023.
H. Zhang, N.M. Ho, Y.P. Dogukan, P. Chen, M. Wahib, T.T. Nguyen, J. Meng, R.S.M. Goh, S. Matsuoka, T. Luo, and W.F. Wong. "Simeuro: A Hybrid CPU-GPU Parallel Simulator for Neuromorphic Computing Chips." IEEE Transactions on Parallel and Distributed Systems. vol. 34, iss. 10, pp. 2767-2782. Oct 2023.
Z. Yan, J. Zhou and W.-F. Wong, "CQ+ Training: Minimizing Accuracy Loss in Conversion from Convolutional Neural Networks to Spiking Neural Networks." IEEE Transactions on Pattern Analysis and Machine Intelligence. vol. 45, no. 10, pp. 11600-11611. Oct 2023.
M.T.L. Aung, D. Gerlinghoff, C. Qu, L. Yang, T. Huang, R.S.M. Goh, T. Luo and W.F. Wong. "DeepFire2: A Convolutional Spiking Neural Network Accelerator on FPGAs." IEEE Transactions on Computers. vol. 72, pp. 2847-2857. DOI: 10.1109/TC.2023.3272284. Oct 2023.
Z. Yan, S. Wang, K. Tang and W.-F. Wong, "Efficient Hyperdimensional Computing." 2023 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2023). Lecture Notes in Computer Science book series (LNAI, volume 14170). pp. 141-155. Springer, Cham. Sep 2023.
T. Luo, W.F. Wong, R.S.M. Goh, A.T. Do, Z. Chen, H. Li, W. Jiang, and W. Yau, "Achieving Green AI with Energy-Efficient Deep Learning Using Neuromorphic Computing". Communications of the ACM, vol. 66, no. 7, pp. 52-57. Jul 2023.
H. Tan, X. Chen, B. He, and W.F. Wong, "LightRW: FPGA Accelerated Graph Dynamic Random Walks," Proceedings of 2023 ACM SIGMOD/PODS International Conference on Management of Data. vol. 1, iss. 1, article no. 90. pp. 1-27. Jun 2023.
T. Huang, J. Xu, T. Luo, X. Gu, R.S.M. Goh, and W.F. Wong. "Benchmarking Quantum(-inspired) Annealing Hardware on Practical Use Cases," IEEE Transactions on Computers. vol. 72, pp. 1692-1705. DOI: 10.1109/TC.2022.3219257. Jun 2023.
C. Chen, Y. Wang, J. Yang, Y. Liu, M. Lu, Z. Zheng, B. He, W.-F. Wong, L. You, P. Sun, Y. Zhao, F. Hu, and A. Rudoff, "OpenEmbedding: A Distributed Parameter Server for Deep Learning Recommendation Model using Persistent Memory." Proceedings of the IEEE International Conference on Data Engineering 2023 (ICDE 2023) pp. 2976-2987, doi: 10.1109/ICDE55515.2023.00228. Apr 2023.
H. De Silva, H. Tan, N.M. Ho, J.L. Gustafson, and W.F. Wong, "Towards a Better 16-Bit Number Representation for Training Neural Networks." Proceedings of Next Generation Arithmetic: 4th International Conference (CoNGA 2023). Springer-Verlag, Berlin, Heidelberg, 19-37. https://doi.org/10.1007/978-3-031-32180-1_8. Singapore. Mar 2023.
N.M. Ho, D-T. Nguyen, J.L. Gustafson, and W.F. Wong, "Bedot: Bit Efficient Dot Product for Deep Generative Models." Proceedings of Next Generation Arithmetic: 4th International Conference (CoNGA 2023). Springer-Verlag, Berlin, Heidelberg, 19-37. https://doi.org/10.1007/978-3-031-32180-1_2. Singapore. Mar 2023.

2022

X. Chen, F. Cheng, H. Tan, Y. Chen, B. He, W.F. Wong and D. Chen, "ThunderGP: Resource-Efficient Graph Processing Framework on FPGAs with HLS", ACM Transactions on Reconfigurable Technology and Systems (TRETS). Vol. 15, iss. 4, article No. 44, pp. 1-31, https://doi.org/10.1145/3517141. Dec 2022.
X. Chen, Y. Chen, F. Cheng, H. Tan, B. He, W.F. Wong, "ReGraph: Scaling Graph Processing on HBM-enabled FPGAs with Heterogeneous Pipelines", Presented at The 55th IEEE/ACM International Symposium on Microarchitecture (MICRO). Chicago, IL, U.S.A., Oct 2022.
M. Upadhyay, R. Juneja, B. Wang, J. Zhou, W.F. Wong and L.S. Peh, "REACT: A Heterogeneous Reconfigurable Neural Network Accelerator with Software-Configurable NoCs for Training and Inference on Wearables", The 59th Design Automation Conference (DAC). (Presentation slides) San Francisco, CA, U.S.A. Jul 2022.
Z. Yan, J. Zhou, and W.F. Wong, "EEG Classification with Spiking Neural Network: Smaller, Better, More Energy Efficient." Smart Health. Vol. 24, 100261, Jun 2022.
N.M. Ho, H. De Silva, J.L. Gustafson, and W.F. Wong, "Qtorch+: Next Generation Arithmetic for Pytorch Machine Learning." The Third International Conference on Next Generation Arithmetic (CoNGA). Lecture Notes in Computer Science. No. 13253. pp. 31-49. Springer. Singapore. Mar 2022.
L. Yang, H. Zhang, T. Luo, C. Qu, M.T.L. Aung, Y. Cui, J. Zhou, M. Wong, J. Puc, A.T. Do, R.S.M. Goh, and W.F. Wong, "Coreset: Hierarchical Neuromorphic Computing Supporting Large-Scale Neural Networks with Improved Resource Efficiency," Neurocomputing. Vol. 474, pp. 128-140. Feb 2022.

2021

X. Chen, H. Tan, Y. Chen, B. He, W.F. Wong and D. Chen, "Skew-oblivious Data Routing for Data Intensive Applications on FPGAs with HLS." Proceedings of The 58th Design Automation Conference (DAC). pp. 937-942. https://doi.org/10.1109/DAC18074.2021.9586184. Dec 2021.
T. Luo, L. Yang, H. Zhang, C. Qu, X. Wang, Y. Cui, W.F. Wong, and R.S.M. Goh, "NC-Net: Efficient Neuromorphic Computing Using Aggregated Sub-nets on a Crossbar-based Architecture with Non-volatile Memory." IEEE Transactions on Computer Aided Design of Integrated Circuits & Systems (TCAD). DOI: 10.1109/TCAD.2021.3120068. Oct 2021.
D.T. Nguyen, N.M. Ho, M.S. Le, W.F. Wong, and I.J. Chang. "ZEM: Zero-cycle Bit-masking Module for Deep Learning Refresh-less DRAM". IEEE Access. vol. 9, pp. 93723-93733. Jul 2021.
J. Zhou, R. Ramanathan, and W.F. Wong, "Synthesis of the Dynamical Properties of Feedback Loops in Bio-pathways." IEEE/ACM Transactions on Computational Biology and Bioinformatics. vol. 18, pp. 1217-1226, May-June 2021.
M.T.L. Aung, C. Qu, L. Yang, R.S.M. Goh, T. Luo and W.F. Wong. "DeepFire: Acceleration of Convolutional Spiking Neural Network on Modern Field Programmable Gate Arrays". The 2021 International Conference on Field-Programmable Logic and Applications (FPL).
H. Tan, X. Chen, Y. Chen, B. He, and W.F. Wong, "ThundeRiNG: Generating Multiple Independent Random Number Sequences on FPGAs", Proceedings of The 35th ACM International Conference on Supercomputing (ICS-2021). pp. 115-126. Jun 2021.
X. Chen, H. Tan, Y. Chen, B. He, W.F. Wong and D. Chen, "ThunderGP: HLS-based Graph Processing Framework on FPGAs". Proceedings of The 29th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2021). https://doi.org/10.1145/3431920.3439290. pp. 69-80. Feb 2021.
N.M. Ho, D.T. Nguyen, H. De Silva, J.L. Gustafson, W.F. Wong and I.J. Chang, "Posit Arithmetic for the Training and Deployment of Generative Adversarial Networks." Design, Automation, and Test in Europe (DATE 2021). (PPT of presentation.) Mar 2021.
N.M. Ho and W.F. Wong, "Tensorox: Accelerating GPU Applications via Neural Approximation on Unused Tensor Cores". IEEE Transactions on Parallel and Distributed Systems. vol. 33, no. 2, pp. 429-443, Feb 2021.
Z. Yan, J. Zhou, and W.F. Wong, "Near Lossless Transfer Learning for Spiking Neural Networks." Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21). Virtual conference. Feb 2021.
N.M. Ho, H. De Silva, and W.F. Wong, "GRAM: A framework for dynamically mixing precisions in GPU applications". ACM Transactions on Architecture and Code Optimization. (TACO) Vol. 18, no. 2, article no. 19, pp. 1-24. Feb 2021.
C. Chen, J. Yang, M. Lu, T. Wang, Z. Zheng, Y. Chen, W. Dai, B. He, W.F. Wong, G. Wu, Y. Zhao, and A. Rudoff, "Optimizing In-memory Database Engine For AI-powered On-line Decision Augmentation Using Persistent Memory." Proceedings of the VLDB Endowment. vol. 14, no. 5. pp. 799-812. Jan 2021.
Z. Yan, J. Zhou, and W.F. Wong, "Energy Efficient ECG Classification with Spiking Neural Network." Biomedical Signal Processing and Control. Vol. 63, Article No. 102170. Jan 2021.

2020

B. Chen, D. Sun, J. Zhou, W.F. Wong, and Z. Ding, "A future intelligent traffic system with mixed autonomous vehicles and human-driven vehicles," Information Sciences, Vol. 529, pp. 59-72, Aug 2020.
Z. Wang, H. Zhang, T. Luo, W.F. Wong, A.T. Do, P. Vishnu, W. Zhang, and R.S.M. Goh, "NCPower: Power Modelling for NVM-based Neuromorphic Chip." Accepted by International Conference on Neuromorphic Systems 2020 (ICONS 2020). Virtual conference. July 2020.
B. Wang, J. Zhou, W.F. Wong and L.S. Peh, "Shenjing: A low power reconfigurable neuromorphic accelerator with partial-sum and spike networks-on-chip." Proceedings of Design, Automation, and Test in Europe (DATE 2020). Grenoble, France. Mar 2020.
T. Luo, X. Wang, C. Qu, M.K.F. Lee, W.T. Tang, W.F. Wong and R.S.M. Goh, "An FPGA-based Hardware Emulator for Neuromorphic Chip with RRAM." IEEE Transactions on Computer Aided Design of Integrated Circuits & Systems (TCAD). vol. 32, issue 9, pp. 438-450. DOI: 10.1109/TCAD.2018.2889670. Feb 2020.
C. Chen, Q. Wei, W.F. Wong, and C. Wang, "NV-Journaling: Locality-aware Journaling using Byte-addressable Non-volatile Memory." IEEE Transactions on Computers. Vol. 69, no. 2, pp. 288-299. DOI: 10.1109/TC.2019.2948004. Feb. 2020.
X. Chen, Y. Chen, R. Bajaj, J. He, B. He, W.F. Wong, and D. Chen, "Is FPGA useful for Hash Joins?" Conference on Innovative Data Systems Research (CIDR 2020). Amsterdam, The Netherlands. Jan 2020.

2019

J. Zhou, Y. Zhang, and W.F. Wong. "Fault Tolerant Stencil Computation on Cloud-based GPU Spot Instances." IEEE Transactions on Cloud Computing. vol. 7, no. 4, pp. 1013-1024, Oct-Dec 2019. DOI: 10.1109/TCC.2017.2710311.
J. Zhou, and W.F. Wong, "Resource efficient personalized ECG beat classification via temporal logic synthesis," Proceedings of The 19th IEEE International Conference on Bioinformatics and Bioengineering. (IEEE BIBE 2019). pp. 374-377. Athens, Greece. Oct 2019.
X. Chen, R. Bajaj, Y. Chen, J. He, B. He, W.F. Wong and D. Chen, "On-The-Fly Parallel Data Shuffling for Graph Processing on OpenCL based FPGA." Proceedings of The 2019 International Conference on Field-Programmable Logic and Applications (FPL 2019). pp. 67-73. Barcelona, Spain. Sep 2019.
H. De Silva, A. Santosa, N.M. Ho and W.F. Wong, "ApproxSymate: Path Sensitive Program Approximation using Symbolic Execution". Proceedings of the 20th ACM International Conference on Languages, Compilers, Tools and Theory of Embedded Systems (LCTES 2019). pp. 148-162. Phoenix, AZ, U.S.A. Jun 2019.
N.M. Ho, R. Vaddi and W.F. Wong, "Multi-objective Precision Optimization of Deep Neural Networks for Edge Devices." Proceedings of Design, Automation, and Test in Europe (DATE 19). pp. 1100-1105. DOI 10.23919/DATE.2019.8714785. PPT of presentation. Florence, Italy. Mar 2019.
Q. Cai, H. Zhang, W. Guo, G. Chen, B.C. Ooi, K.-L. Tan and W.F. Wong, "MemepiC: Towards a Unified In-Memory Big Data Management System," IEEE Transactions on Big Data (TBD). vol. 5, no. 1, pp. 4-17, Mar 2019.
M.K.F. Lee, Y. Cui, T. Somu, T. Luo, J. Zhou, W.T. Tang, W.F. Wong, and R.S.M. Goh, "A System-level Simulator for RRAM-based Neuromorphic Computing Chips." ACM Transactions on Architecture and Code Optimization. (TACO) Vol. 15, No. 4, Article No. 64, Jan 2019.

2018

H. De Silva, J. Gustafson, W.F. Wong, "Making Strassen Matrix Multiplication Safe," 25th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2018) pp. 173-182. Bengaluru, India. Dec 2018.
S. Rajadurai, J. Bosboom, W.F. Wong, and S. Amarasinghe, "Gloss: Seamless Live Reconfiguration and Reoptimization of Stream Programs." Proceedings of The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2018). pp. 98-112. Williamsburg, VA, U.S.A. Mar 2018. (PPT of talk)

2017

N.M. Ho, and W.F. Wong, "Exploiting half precision arithmetic in Nvidia GPUs," Proceedings of the 2017 IEEE High Performance Extreme Computing Conference (HPEC 2017). Waltham, MA, U.S.A. pp. 1-7. Sep 2017. DOI: 10.1109/HPEC.2017.8091072. (Best Paper Finalist) Github source code.
J. Zhou, R. Ramanathan, W.F. Wong and P.S. Thiagarajan, "Automated Property Synthesis of ODEs Based Bio-pathways Models". Proceedings of the 15th conference on Computational Methods for Systems Biology. Darmstadt, Germany. pp. 265-282. Sep 2017.
Z. Xie, Q. Cai, H.V. Jagadish, B.C. Ooi, and W.F. Wong. "Parallelizing Skip Lists for In-Memory Multi-Core Database Systems". Proceedings of the 2017 IEEE International Conference on Data Engineering (ICDE), San Diego, CA, U.S.A. pp. 119-122, Apr 2017.
N.M. Ho, E. Manogaran, W.F. Wong, and A. Anoosheh. "Efficient floating point precision tuning for approximate computing". Proceedings of The 2017 Asia and South Pacific Design Automation Conference (ASP-DAC). (PPT) Tokyo, Japan. pp. 63-68. Jan 2017. Github source code.

2016

C. Yao, D. Agrawal, G. Chen, Q. Lin, B.C. Ooi, W.F. Wong, and M. Zhang. "Exploiting Single-Threaded Model in Multi-core In-memory Systems," IEEE Transactions on Knowledge and Data Engineering. vol. 28, No. 10, pp. 2635-2650. Oct 2016.
C. Wang, and W.F. Wong. "TreeFTL: An Efficient Workload-adaptive Algorithm for RAM Buffer Management of NAND Flash-based Devices". IEEE Transactions on Computers. Vol. 65, No. 8, pp. 2618-2630. Aug 2016.

2015

R. Ramanathan, Y. Zhang, J. Zhou, W.F. Wong, and P.S. Thiagarajan, "Parallelized Parameter Estimation of Biological Pathway Models". Proceedings of 2015 Hybrid Systems Biology Workshop. Madrid, Spain. Lecture Notes in Computer Science. No. 9271. pp. 37-57. Springer. Sep 2015.
W.T. Tang, W.J. Tan, R.S.M. Goh, S.J. Turner, and W.F. Wong, "A Family of Bit-Representation-Optimized Formats for Fast Sparse Matrix-Vector Multiplication on the GPU". IEEE Transactions on Parallel and Distributed Systems. Vol. 26, No. 9, pp. 2373-2385. Sep 2015.
N.M. Ho, N. Thoai, and W.F. Wong, "Multi-agent simulation on multiple GPUs". Simulation Modelling Practice and Theory. Vol. 57, pp. 118-132. Sep 2015.
W.J. Tan, W.T. Tang, R.S.M. Goh, S.J. Turner, and W.F. Wong, "A Code Generation Framework for Targeting Optimized Library Calls for Multiple Platforms". IEEE Transactions on Parallel and Distributed Systems. vol. 26, no. 7, pp. 1789-1799. Jul 2015.
P. Roy, J. Wang, and W.F. Wong, "PAC: Program Analysis for Approximation-aware Compilation". Proceedings of 2015 ACM International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). pp. 69-78. Amsterdam, The Netherlands. Oct 2015.
H. Zhang, G. Chen, B.C. Ooi, W.F. Wong, S. Wu, and Y. Xia, "'Anti-Caching'-based Elastic Memory Management for Big Data". Proceedings of The 31st IEEE International Conference on Data Engineering. pp. 1268-1279. Seoul, South Korea. Apr 2015. (PPT of presentation)

2014

J. Wang, P. Roy, W.F. Wong, X. Bi and H. Li, "Optimizing MLC-based STT-RAM Caches by Dynamic Block Size Reconfiguration". Proceedings of The 32nd IEEE International Conference on Computer Design (ICCD). pp. 133-138. Seoul, South Korea. Oct 2014.
J. Bosboom, S. Rajadurai, W.F. Wong, and S. Amarasinghe, "StreamJIT: A Commensal Compiler for High-Performance Stream Programming". Proceedings of The 2014 ACM Conference on Object-Oriented Programming, Systems, Languages & Applications (OOPSLA). pp. 177-195. Portland, OR, U.S.A. Oct 2014.
P. Roy, M. Manoharan, and W.F. Wong, "EnVM: Virtual Memory Design for New Memory Architectures". Proceedings of ACM International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES). Article No. 12. New Delhi, India. Oct 2014.
H.P. Huynh, A. Hagiescu, Z.L. Ong, W.F. Wong, and R.S.M. Goh, "Mapping Streaming Applications onto GPU Systems". IEEE Transactions on Parallel and Distributed Systems. Vol. 25, No. 9, pp. 2374-2385. Sep 2014.
P. Roy, R. Ray, C. Wang and W.F. Wong, "ASAC: Automatic Sensitivity Analysis for Approximate Computing". Proceedings of the 2014 ACM SIGPLAN Conference on Languages, Compilers and Tools for Embedded Systems (LCTES). pp. 95-104. Edinburgh, U.K. Jun 2014.
Z. Sun, X. Bi, H. Li, W.F. Wong, and X. Zhu, "STT-RAM Cache Hierarchy With Multi-retention MTJ Designs." IEEE Transactions on VLSI Systems. vol. 22, no. 6, pp. 1281-1293. Jun 2014.
J. Wang, Y. Tim, W.F. Wong, Z.L. Ong, Z. Sun, H. Li, "A Coherent Hybrid SRAM and STT-RAM L1 Cache Architecture for Shared Memory Multicores". (PPT of presentation) Proceedings of The 2014 Asia and South Pacific Design Automation Conference (ASP-DAC). Paper 7A5. pp. 610-615. Singapore. Jan 2014.

2013

A. Hagiescu, B. Liu, R. Ramanathan, S.K. Palaniappan, Z. Cui, B. Chattopadhyay, P.S. Thiagarajan, and W.F. Wong, "GPU code generation for ODE-based applications with phased shared-data access patterns," ACM Transactions on Architecture and Code Optimization. Vol. 10, Issue 4, Article 55. Dec 2013.
W.T. Tang, W.J. Tan, R. Ray, Y.W. Wong, W. Chen, S-H. Kuo, R.S.M. Goh, S.J. Turner, and W.F. Wong, "Accelerating Sparse Matrix-Vector Multiplication on GPUs using Bit-Representation-Optimized Schemes." Proceedings of The 2013 International Conference on High Performance Computing, Networking, Storage and Analysis (SC 13). Article No. 26, Denver, CO, U.S.A. Nov 2013.
J. Wang, Y. Tim, W.F. Wong and H. Li, "A Practical Low-Power Memristor-based Analog Neural Branch Predictor". Proceedings of 2013 International Symposium on Low Power Electronics and Design (ISLPED). pp. 175-180. Beijing, P.R. China. Sep 2013.
C. Wang, and W.F. Wong, "SAW: Operating System-Assisted Wear Leveling in NAND Flash Devices". Proceedings of the 50th Design Automation Conference (DAC). Article 164, pp. 164:1-164:9. Austin, TX, U.S.A. Jun 2013.
W.T. Tang, W.J. Tan, R. Krishnamoorthy, Y.W. Wong, S-H. Kuo, R.S.M. Goh, S.J. Turner, and W.F. Wong, "Optimizing and Auto-Tuning Iterative Stencil Loops for GPUs with the In-Plane Method". Proceedings of 27th IEEE International Parallel and Distributed Processing Symposium (IPDPS 13). pp. 452-462. Boston, MA, U.S.A. May 2013.
Y. Chen, W.F. Wong, H. Li, C.K. Koh, Y. Zhang, and W. Wen, "On-chip Caches built on Multi-Level Spin-Transfer Torque RAM Cells and Its Optimizations". ACM Journal on Emerging Technologies in Computing Systems. (JETC) vol. 9, no. 2, pp. 16:1-16:22. May 2013.
C. Wang, and W.F. Wong, "TreeFTL: Efficient RAM Management for High Performance of NAND Flash-based Storage Systems". Proceedings of Design, Automation, and Test in Europe (DATE 13). pp. 374-379. Grenoble, France. Mar 2013.

2012

A. Al-Dujaili, F. Deragisch, A. Hagiescu and W.F. Wong. "Guppy: A GPU-like Soft-Core Processor". Proceedings of the International Conference on Field Programmable Technology 2012. (FPT 2012) pp. 57-60. Seoul, South Korea. Dec 2012.
W.T. Tang, Y. Wong, W.J. Tan, T. Dubrownik, R.S.M. Goh, S. Kuo, R. Duan, S. Turner, and W.F. Wong. "Tulipse: A Visualization Framework for User-Guided Parallelization". Proceedings of 2012 International European Conference on Parallel and Distributed Computing (EURO-PAR 2012). Lecture Notes in Computer Science, vol. 7484, pp. 4-15. Springer-Verlag. Rhodes Island, Greece. Aug 2012.
B. Liu, A. Hagiescu, S. Palaniappan, B. Chattopadhyay, Z. Cui, W.F. Wong, and P.S. Thiagarajan, "Approximate Probabilistic Analysis of Biopathway Dynamics". Bioinformatics. Vol. 28, No. 11. pp. 1508-1516. 2012.
C. Wang, and W.F. Wong, "Observational Wear Leveling: An Efficient Algorithm for Flash Memory Management". Proceedings of the 49th Design Automation Conference (DAC), pp. 235-242. San Francisco, CA, U.S.A. Jun 2012.
C. Wang, and W.F. Wong, "ADAPT: Efficient Workload-sensitive Flash Management Based on Adaptation, Prediction and Aggregation". Proceedings of the 28th IEEE Conference on Massive Data Storage (MSST). pp. 1-12. Pacific Grove, CA, U.S.A. Apr 2012.
C. Wang, and W.F. Wong, "Extending the Lifetime of NAND Flash Memory by Salvaging Bad Blocks". Proceedings of Design, Automation, and Test in Europe (DATE 12). pp. 260-263. Dresden, Germany. Mar 2012.
H.P. Huynh, A. Hagiescu, W.F. Wong, and R.S.M. Goh, "Scalable Framework for Mapping Streaming Applications onto Multi-GPU Systems". Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2012). pp. 1-10. New Orleans, LA, U.S.A. Feb 2012. (Presentation slides.)

2011

Z. Sun, X. Bi, H. Li, W.F. Wong, Z.L. Ong, X. Zhu, and W. Wu, "Multi-Retention Level STT-RAM Cache Designs with a Dynamic Refresh Scheme." Proceedings of The 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 44). pp. 329-338. Porto Alegre, Brazil. Dec 2011.
H. Li, X. Wang, Z.L. Ong, W.F. Wong, Y. Zhang, P. Wang, and Y. Chen, "Performance, Power and Reliability Tradeoffs of STT-RAM Cell Subjective to Architecture-level Requirement". IEEE Transactions on Magnetics. vol. 47, no. 10, pp. 2356-2359. Oct 2011.
Y. Chen, W.F. Wong, H. Li and C.K. Koh, "Processor Caches built using Multi-Level Spin-Transfer Torque RAM Cells". Proceedings of the 2011 International Symposium on Low Power Electronics and Design (ISLPED). pp. 73-78. Fukuoka, Japan. Aug 2011.
C.-T. Yeh, C.-H. Wang, I.J. Huang and W.F. Wong, "Internet-based Hardware/Software Co-design Framework for Embedded 3D Graphics Applications". EURASIP Journal on Advances in Signal Processing. 2011:25. Jul 2011.
A. Hagiescu, H.P. Huynh, W.F. Wong, and R.S.M. Goh, "Automated architecture-aware mapping of streaming applications onto GPUs". Proceedings of the 25th IEEE International Parallel and Distributed Processing Symposium. pp. 467-478. Anchorage, AK, U.S.A. May 2011. (Presentation slides.)
Q. Zhao, D. Koh, S. Raza, S. Amarasinghe, D. Bruening, and W.F. Wong, "Dynamic Cache Contention Detection in Multi-threaded Applications". Proceedings of the 2011 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE 2011). pp. 27-37. Newport Beach, CA, U.S.A. Mar 2011.
A. Hagiescu, and W.F. Wong, "Co-synthesis of FPGA-Based Application-Specific Floating Point SIMD Accelerators". Proceedings of the 19th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. (FPGA 2011) pp. 247-256. Monterey, CA, U.S.A. Mar 2011. (Presentation slides.) (Recording of talk given by Andrei at the conference.)
Z. Sun, C-T. Ye, and W.F. Wong, "A UML 2-based HW/SW Co-Design Framework for Body Sensor Network Applications". Proceedings of Design, Automation, and Test in Europe (DATE 11). pp. 1505-1508. Grenoble, France. Mar 2011.

2010

Q. Zhao, I. Cutcutache, W.F. Wong, "PiPA: Pipelined Profiling and Analysis on Multi-core Systems". ACM Transactions on Architecture and Code Optimization. Vol. 7, Issue 3, Article 13. Dec 2010.
E.J. Sim, W.F. Wong, G. Walla, T. Ziermann, and J. Teich, "Interprocedural Placement-Aware Configuration Prefetching for FPGA-based Systems", Proceedings of the 18th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2010). pp. 179-182. Charlotte, NC, U.S.A. May 2010.

2009

J.H. Perkins, S. Kim, S. Larsen, S. Amarasinghe, J. Bachrach, M. Carbin, C. Pacheco, F. Sherwood, S. Sidiroglou, G. Sullivan, W.F. Wong, Y. Zibin, M.D. Ernst, and M. Rinard, "Automatically Patching Errors in Deployed Software." Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP 2009), pp. 87-102. Big Sky, MT, U.S.A. Oct 2009.
C.K. Koh, W.F. Wong, Y. Chen and H. Li, "The Salvage Cache: A fault-tolerant cache architecture for next-generation memory technologies". Proceedings of the 27th IEEE International Conference on Computer Design, pp. 268-274. Lake Tahoe, CA, U.S.A. Oct 2009.
Z. Ge, T. Mitra, W.F. Wong, "A DVS-based Pipelined Reconfigurable Instruction Memory". Proceedings of the 46th Design Automation Conference (DAC), pp. 897-902. San Francisco, CA, U.S.A. Jul 2009.
A. Hagiescu, R.M. Rabbah, and W.F. Wong, "A Computing Origami: Folding Streams in FPGAs". Proceedings of the 46th Design Automation Conference, pp. 282-287. San Francisco, CA, U.S.A. Jul 2009.
C.K. Koh, W.F. Wong, Y. Chen and H. Li, "Tolerating process variations in large, set associative caches: The buddy cache." ACM Transactions on Architecture and Code Optimization. Vol. 6, No. 2, pp. 1-34. Jun 2009.
I. Cutcutache, T.T.N. Dang, W.K. Leong, S. Liu, K.D. Nguyen, L.T.X. Phan, E.J. Sim, Z. Sun, T.B. Tok, L. Xu, F.E.H. Tay, and W.F. Wong, "BSN Simulator: Optimizing Application Using System Level Simulation", Proceedings of the 6th International Workshop on Wearable and Implantable Body Sensor Networks (BSN 2009), pp. 9-14. Berkeley, CA, U.S.A., Jun 2009.
E.J. Sim, W.F. Wong, and J. Teich, "Optimal Placement-aware Trace-based Scheduling of Hardware Reconfigurations for FPGA Accelerators". Proceedings of the 17th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2009). pp. 279-282. Napa, CA, U.S.A. Apr 2009.
Z. Sun, and W.F. Wong, "A UML-Based Approach for Heterogeneous IP Integration", Proceedings of the 14th Asia and South Pacific Design Automation Conference (ASP-DAC). pp. 155-160. Yokohama, Japan. Jan 2009.

2008

E.J. Sim, T. Mitra, and W.F. Wong, "Defining neighborhood relations for fast spatial-temporal partitioning of applications on reconfigurable architectures". Proceedings of the 2008 IEEE International Conference on Field Programmable Technology (FPT 2008). pp. 121-128. Taipei, Taiwan. Dec 2008.
I. Cutcutache, and W.F. Wong, "Fast, frequency-based, integrated register allocation and instruction scheduling", Software: Practice and Experience. Vol. 38, Issue 11, pp. 1105-1126, Sep 2008.
K.D. Nguyen, I. Cutcutache, S. Sinnadurai, S. Liu, C. Basol, E.J. Sim, L.T.X. Phan, T.B. Tok, L. Xu, F.E.H. Tay, T. Mitra, and W.F. Wong, "Fast and Accurate Simulation of Biomonitoring Applications on a Wireless Body Area Network", Proceedings of the 5th International Workshop on Wearable and Implantable Body Sensor Networks (BSN 2008), pp. 145-148. Hong Kong, P.R.C., Jun 2008.
Q. Zhao, I. Cutcutache, and W.F. Wong, "PiPA: Pipelined Profiling and Analysis on Multi-core Systems". Proceedings of The 2008 International Symposium on Code Generation and Optimization (CGO 08), pp. 185-194. Boston, MA, U.S.A. Apr 2008.
Q. Zhao, R.M. Rabbah, S. Amarasinghe, L. Rudolph, and W.F. Wong, "How to do a million watchpoints: Efficient Debugging using Dynamic Instrumentation". The 17th International Conference on Compiler Construction (CC 2008). Lecture Notes in Computer Science, vol. 4959, pp. 147-162. Springer-Verlag. Budapest, Hungary. Apr 2008.

2007

K.D. Nguyen, P.S. Thiagarajan, and W.F. Wong, "A UML-based Design Framework for Time-triggered Applications". Proceedings of the 28th IEEE Real-Time Systems Symposium (RTSS 07). pp. 39-48. Tucson, Arizona, U.S.A. Dec 2007.
C.K. Koh, W.F. Wong, Y. Chen, and H. Li, "VOSCH: Voltage Scaled Cache Hierarchies". Proceedings of The 25th IEEE International Conference on Computer Design (ICCD 07). pp. 496-503. Lake Tahoe, U.S.A. Oct 2007.
Z. Ge, H.B. Lim, and W.F. Wong, "DRIM : A Low Power Dynamically Reconfigurable Instruction Memory Hierarchy for Embedded Systems", Proceedings of The 10th Design, Automation, and Test in Europe (DATE 07). pp. 1343-1348, Nice, France. Apr 2007.
Q. Zhao, R.M. Rabbah, S. Amarasinghe, L. Rudolph, and W.F. Wong, "Ubiquitous Memory Introspection". Proceedings of The 2007 International Symposium on Code Generation and Optimization (CGO). pp. 299-311. San Jose, U.S.A. Mar 2007.

2006

Y.Y. Leow, C.Y. Ng and W.F. Wong, "Generating Hardware from OpenMP Programs". Proceedings of The 2006 IEEE International Conference on Field Programmable Technology (FPT 2006). pp. 73-80. Bangkok, Thailand. Dec 2006.
Q. Zhao, J.E. Sim, W.F. Wong, and L. Rudolph, "DEP: Detailed Execution Profile", Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT 2006). pp. 154-163. Seattle, U.S.A. Sep 2006.
Y. Zhu, W.F. Wong, and S. Andrei, "Co-optimization of Performance and Power in Superscalar Processor Design", The 1st International Workshop on Embedded Software Optimization (ESO 2006). Lecture Notes in Computer Science, vol. 4097, pp. 868-878. Springer-Verlag. Seoul, South Korea. Aug 2006.
K.D. Nguyen, G.P.S. Koh, P.S. Thiagarajan, and W.F. Wong, "UML-Based Modeling of Time-triggered Applications". Presented at the 3rd International DAC Workshop UML for SoC Design (UML-SOC).

2005

K.K.K. Win, and W.F. Wong, "Cooperative Instruction Scheduling with Linear Scan Register Allocation". Proc. of The 12th Annual IEEE International Conference on High Performance Computing. Lecture Notes in Computer Science, vol. 3769, pp. 528-537. Springer-Verlag. Goa, India. Dec 2005.
W.F. Wong, "Targeted Data Prefetching". Proc. of the 10th Asia-Pacific Computer Systems Architecture Conference (ACSAC 05), Lecture Notes in Computer Science, vol. 3740, pp. 775-786. Springer-Verlag. Oct 2005.
E.J. Sim, T. Mitra, and W.F. Wong, "Compile-time Design Space Exploration for Dynamically Reconfigurable System-on-a-Chip". Invited presentation at the Optimizing Compiler Assisted SoC Assembly Workshop (OCASA). San Francisco, U.S.A. Sep 2005.
Q. Zhao, R.M. Rabbah, and W.F. Wong, "Dynamic Memory Optimization using Pool Allocation and Prefetching", Workshop on Binary Instrumentation and Applications. St. Louis, U.S.A. Sep 2005. Published in Computer Architecture News, vol. 33, no. 5, pp. 27-33. Dec 2005.
Y. Zhu, W.F. Wong, and C.K. Koh, "A Performance and Power Co-optimization Approach for Modern Processors". Proceedings of the 5th International Conference on Computer and Information Technology. pp. 822-828. Shanghai, P.R.C. Sep 2005.
Z. Ge, H.B. Lim, and W.F. Wong, "A Reconfigurable Instruction Memory Hierarchy for Embedded Systems". Proceedings of the 15th International Conference on Field Programmable Logic and Applications. pp. 7-12. Tampere, Finland. Aug 2005.
Y. Zhu, Z. Sun, A. Maxiaguine, and W.F. Wong, "Using UML 2.0 for System Level Design of Real Time SoC Platforms for Stream Processing". Proceedings of the 11th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications. pp. 154-159. Hong Kong. Aug 2005.
K.D. Nguyen, Z. Sun, P.S. Thiagarajan, and W.F. Wong, "Model-Driven SoC Design: The UML-SystemC Bridge" in "UML for SOC Design" edited by Grant Martin and Wolfgang Müller. pp. 175-197. ISBN 0-387-25744-6. Springer. July 2005.
Z. Sun, Y. Zhu, W.F. Wong, and S.K. Pilakkat, "Design of Clocked Circuits using UML". Proceedings of the Asia and South Pacific Design Automation Conference 2005 (ASP-DAC). pp. 901-904. Jan 2005.
Y. Zhu, W.F. Wong, and S. Andrei, "An Integrated Performance and Power Model For Superscalar Processor Designs." (Poster) Proceedings of the Asia and South Pacific Design Automation Conference 2005 (ASP-DAC). pp. 948-951. Jan 2005.

2004

K.D. Nguyen, Z. Sun, P.S. Thiagarajan, and W.F. Wong, "Model-driven SoC Design Via Executable UML to SystemC", Proceedings of the 25th IEEE International Real-Time Systems Symposium (RTSS). pp. 459-468. Dec 2004.
M.R. George, and W.F. Wong, "Windows CE for a Reconfigurable System-on-a-Chip Processor", Proceedings of the International Conference on Field-Programmable Technology 2004 (FPT). pp. 201-208. Dec 2004.
J.H. Pan, T. Mitra, and W.F. Wong, "Configuration Bitstream Compression for Dynamically Reconfigurable FPGAs", Proceedings of the International Conference on Computer Aided Design 2004 (ICCAD). pp. 766-773. Nov 2004.
R.M. Rabbah, M. Ekpanyapong, H. Sandanagobalane, and W.F. Wong, "Compiler-Orchestrated Prefetching via Speculation and Predication", Proc. of the Eleventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). pp. 189-198. Oct 2004.
A. Maxiaguine, Y. Zhu, S. Chakraborty, and W.F. Wong, "Tuning SoC Platforms for Multimedia Processing: Identifying Limits and Tradeoffs", Proceedings of the Second IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS Merged Conference). pp. 128-133. Sep 2004.
W.H. Tan, P.S. Thiagarajan, W.F. Wong, Y. Zhu and S.K. Pilakkat, "Synthesizable SystemC Code from UML Models". Presented at International Workshop on UML for SoC Design (USOC 2004). Sponsored by Design Automation Conference 2004. Jun 2004.
V-M Panait, A. Sasturkar, and W.F. Wong, "Static Identification of Delinquent Loads", Proc. of 2004 International Symposium on Code Generation and Optimization (CGO 2004), pp. 303-314. Mar 2004.

2003

Z. Ge, J. Liao, and W.F. Wong, "Compiling to FPGAs via an EPIC Compiler's Intermediate Representation", Proc. of IEEE International Conference on Field Programmable Technology (FPT 2003), pp. 431-434. Dec 2003.
J. Liao, W.F. Wong, and T. Mitra, "A Model for Hardware Realization of Kernel Loops", Proc. of 13th International Conference on Field Programmable Logic and Applications, Lecture Notes in Computer Science, vol. 2778, pp. 334-344. Springer-Verlag. Sep 2003.
L. Peng, W.F. Wong, and C.K. Yuen, "SilkRoad II: mixed paradigm cluster computing with RC_dag consistency", Parallel Computing, vol. 29, no. 8, pp. 1091-1115. Aug 2003.
L. Peng, W.F. Wong, and C.K. Yuen, "The Performance Model of SilkRoad - A Multithreaded DSM System for Clusters", DSM2003: Workshop on Distributed Shared Memory on Clusters, appeared in Proc. of the 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, pp. 495-501. May 2003.
L. Peng and W.F. Wong, "Memory Model Support for Mixed Programming Paradigm in SilkRoad", in Annual Review of Scalable Computing, C.K. Yuen (ed), vol. 5, pp. 65-91. ISBN: 981-238-369-7. Singapore University Press. 2003.

2002

V.S. Gheorghita, W.F. Wong, T. Mitra, and S. Talla, "A Co-simulation Study of Adaptive EPIC Computing," Proc. of IEEE International Conference on Field-Programmable Technology (FPT 2002), pp. 268-275. Dec 2002.
S.P. Seng, K.V. Palem, R.M. Rabbah, W.F. Wong, W. Luk, and P.Y.K. Cheung, "PD-XML: Extensible Markup Language for Processor Description", Proc. of IEEE International Conference on Field-Programmable Technology (FPT 2002), pp. 437-440. Dec 2002.
C.M. Tan, C.P. Tan, and W.F. Wong, "Shell over a Cluster (SHOC): Towards Achieving Single System Image via the Shell," Proc. of IEEE International Conference on Cluster Computing (CLUSTER 2002), pp. 28-36. Sep 2002.
L. Peng, W.F. Wong, and C.K. Yuen, "SilkRoad II: A Multi-Paradigm Runtime System for Cluster Computing", Proc. of IEEE International Conference on Cluster Computing (CLUSTER 2002) (Poster), pp. 443-444. Sep 2002.
J. Kim, W.F. Wong, and K.V. Palem, "A Framework for Data Prefetching using Off-line Training of Markovian Predictors", Proceedings of the International Conference on Computer Design (ICCD 2002), pp. 340-347. Sep 2002.
K. Puttaswamy, L.N. Chakrapani, K.W. Choi, Y.S. Dhillon, U. Diril, P. Korkmaz, K.K. Lee, J.C. Park, A. Chatterjee, P. Ellervee, V. Mooney, K. Palem and W.F. Wong, "Power-Performance Trade-Offs in second level memory used by an ARM-Like RISC Architecture," in Power Aware Computing, Rami Melhem and Robert Graybill, eds. pp. 211-226. Kluwer Academic/Plenum Publishers, May 2002.
Y. Zhu, and W.F. Wong, "Sensitivity Analysis of a Superscalar Processor Model". Proceedings of the Seventh Asia-Pacific Computer Systems Architectures Conference (ACSAC2002), Melbourne, Australia. Conferences in Research and Practice in Information Technology, 6. Lai, F. and Morris, J., Eds. pp. 109-118. Jan 2002.

2001

Y. Chobe, B. Narahari, R. Simha, and W.F. Wong, "Tritanium: Augmenting the Trimaran Compiler Infrastructure To Support IA-64 Code Generation". Proceedings of the First Workshop on Explicitly Parallel Instruction Computing (EPIC) Architectures and Compiler Techniques, pp. 76-79. Dec 2001.
L.N. Chakrapani, P. Korkmaz, V.J. Mooney III, K. Palem, and W.F. Wong, "The Emerging Power Crisis in Embedded Processors: What Can A Poor Compiler Do?" (Invited Talk), Proc. of International Conference on Compilers, Architectures, and Synthesis of Embedded Systems, pp. 176-180. Nov 2001.
K. Palem, S. Talla, and W.F. Wong, "Compiler Optimizations for Adaptive EPIC Processors", First International Workshop on Embedded Software, Lecture Notes in Computer Science, vol. 2211, pp. 257-273. Springer-Verlag. Oct 2001.

2000

L.F. Lau, A.L. Ananda, G. Tan, and W.F. Wong, "Gucha: Internet-based Parallel Computing using Java", Proc. of 4th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), pp. 397-408. Dec 2000.
L. Peng, W.F. Wong, M.D. Feng, and C.K. Yuen, "SilkRoad: A Multithreaded Runtime System with Software Distributed Shared Memory for SMP Cluster", Proc. of IEEE International Conference on Cluster Computing (CLUSTER 2000), pp. 243-249. Dec 2000.
L.F. Lau, A.L. Ananda, G. Tan, and W.F. Wong, "JAVM: Internet-based Parallel Computing using Java", in Annual Review of Scalable Computing, pp. 59-74. World Scientific Publisher. ISBN 981-02-4413-4. Dec 2000.
M.C. Ng and W.F. Wong, "ORION: An Adaptive Home-Based Software Distributed Shared Memory System", Proc. of 2000 International Conference on Parallel and Distributed Systems (ICPADS 2000). pp. 187-194. 2000.
Y. Zhu and W.F. Wong, "Modeling Architectural Improvements in Superscalar Processors" (Extended Abstract), Proc. of HPC-Asia 2000. vol. 1. pp. 28-30. 2000.

1999

W.F. Wong, "Optimizing Floating Point Operations in Scheme", Computer Languages. vol. 25. pp. 89-112. 1999.
K.S. Loh and W.F. Wong, "Multiple Context Multithreaded Superscalar Processor Architecture", Journal of Systems Architecture. vol. 46, no. 3, pp. 243-258. 1999.
W.F. Wong, "Source Level Static Branch Prediction", Computer Journal. vol. 42, no. 2, pp. 142-149. 1999.
M.C. Ng and W.F. Wong, "Adaptive Schemes for Home-based DSM Systems", Proceedings of the 1999 Workshop on Software Distributed Shared Memory. pp. 13-20. June 1999.
C.P. Tan, W.F. Wong and C.K. Yuen, "tmPVM - Task Migratable PVM", Proceedings of the 2nd Merged Symposium IPPS/SPDP. pp. 196-202. April 1999.

1998

K.S. Loh, M.K. Quek and W.F. Wong, "SPATS - Accurate and Flexible Simulation of Superscalar Processors", Computer Architecture '98: Selected papers of the 3rd Australasian Conference. J. Morris (ed). pp. 133-146. ISBN 981-3083-93-X. Springer-Verlag 1998.
Y. Zhu and W.F. Wong, "The Effect of Instruction Dependency on Superscalar Processor Performance", Computer Architecture '98: Selected papers of the 3rd Australasian Conference. J. Morris (ed). pp. 215-226. ISBN 981-3083-93-X. Springer-Verlag 1998.

1997

Y. Zhu and W.F. Wong, "Performance Analysis of Superscalar Processors using a Queueing Model", Computer Architecture '97: Selected papers of the 2nd Australasian Conference. R. Pose (ed). pp. 147-157. ISBN 981-3083-11-5. Springer-Verlag 1997.

1996

M.D. Feng, W.F. Wong and C.K. Yuen, "BaLinda Lisp: Design and Implementation", Computer Languages, vol. 22, no. 4, pp. 205-214. Dec 1996.
M.D. Feng, W.F. Wong and C.K. Yuen, "Highly Efficient Parallel Lisp Implementation on Distributed Systems", Parallel Computing: State-of-the-Art and Perspectives. E. D'Hollander, G.R. Joubert, F.J. Peters and D. Trystram (eds). pp. 319-326. ISBN 0-444-82490-1. Elsevier Science B.V. 1996.

1995

H. Imai, W.F. Wong and K.F. Loe (eds), Advances in Computing Techniques - Algorithms, Databases and Parallel Processing. ISBN 981-02-2501-6. World-Scientific 1995.
M.D. Feng, W.F. Wong and C.K. Yuen, "Compiling parallel Lisp for a shared memory multiprocessor", Proc. of 7th IASTED Conference on Parallel and Distributed Computing and Systems, pp. 487-490. Oct 1995.
M.D. Feng, W.F. Wong and C.K. Yuen, "Design and implementation of abstract machine for parallel Lisp compilation", Proc. of International Conference on Parallel Processing, pp. II-37-II-44. Aug 1995.
W.F. Wong and E. Goto, "Fast Evaluation of the Elementary Functions in Single Precision". IEEE Transactions on Computers. vol. 44, no. 3, pp. 453-458. Mar 1995.
W.F. Wong, Y. Oyanagi and E. Goto, "Evaluation of the Hitachi S-3800 Supercomputer using Six Benchmarks". International Journal of Supercomputer Applications and High Performance Computing. vol. 9, no. 1, pp. 58-70. Spring 1995.

1994

W.F. Wong and E. Goto, "A Simulation Study on the Interactions between Multithreaded Architectures and the Cache". International Journal of High Speed Computing. vol. 6, no. 2, pp. 343-356. 1994.
W.F. Wong, Y. Oyanagi and E. Goto, "Supercomputer Performance Evaluation using Six Benchmarks", Proc. of IEEE Region 10's Ninth Annual International Conference, vol. 2, pp. 1107-1111. Aug 1994.
S. Ohta, E. Goto, W.F. Wong and N. Yoshida, "Improvement and New Proposal on Fast Evaluation of Elementary Functions" (in Japanese), Joho Shori Gakkai Ronbunshi, (Journal of the Information Processing Society of Japan). vol. 35, no. 5, pp. 926-933. May 1994.
W.F. Wong and E. Goto, "Fast Hardware-Based Algorithms for Elementary Function Computations based on the Rectangular Multipliers", IEEE Transactions on Computers. vol. 43, no. 3, pp. 278-294. Mar 1994.
W.F. Wong and E. Goto, "Fast Evaluation of the Elementary Functions in Double Precision", Proc. of 27th IEEE Hawaii International Conference on System Sciences. vol. 1, pp. 349-358. Maui, Jan 1994.

1993

W.F. Wong, E. Goto and N. Yoshida, "Fast Evaluation of Elementary Functions" (in Japanese), Joho Shori Gakkai Ronbunshi, (Journal of the Information Processing Society of Japan). vol. 34, no. 7, pp. 1570-1579. Jul 1993.
W.F. Wong, "Survey of Parallel Lisp Dialects", contributed chapter in C.K. Yuen, Parallel Lisp Systems - A Study of Languages and Architectures, ISBN 0-412-45560-9, Chapman and Hall 1993.

1992

P. Spee, W.F. Wong, M. Sato and E. Goto, "Evaluation of the Continuation Bit in the Cyclic Pipeline Computer", Parallel Computing. vol. 18, no. 12, pp. 1346-1361. Dec 1992.
W.F. Wong and E. Goto, "Improving the Cache Performance of Multithreaded Architectures", Proc. of International Computer Symposium 1992, pp. 1189-1196. Taichung, Dec 1992.
W.F. Wong and C.K. Yuen, "A Model of Speculative Parallelism", Parallel Processing Letters. vol. 2, no. 2&3, pp. 265-272. Sep 1992.
W.F. Wong and E. Goto, "Division and Square-rooting using a Split Multiplier", Electronics Letters. vol. 28, no. 18, pp. 1758-1759. Aug 1992.

1991

W.F. Wong, E. Goto, Y. Oyanagi and N. Yoshida, "Six Benchmark Problems for Number Crunchers", Supercomputer. vol. VIII, no. 6, pp. 39-45. Nov 1991.
P. Spee, W.F. Wong and E. Goto, "Effects of Multiple Instruction Stream Execution on Cache Performance", International Journal of High Speed Computing. vol. 3, no. 2, pp. 135-155. 1991.
W.F. Wong, E. Goto, Y. Oyanagi and N. Yoshida, "Six Benchmark Problems for Number Crunchers", Proc. of the International Symposium on Supercomputing 1991. pp. 120-125. Fukuoka, Nov 1991.
W.F. Wong and E. Goto, "Fast Hardware-Based Algorithms for Elementary Function Computations", Proc. of the International Symposium on Supercomputing 1991. pp. 56-65. Fukuoka, Nov 1991.
P. Spee, W.F. Wong, M. Sato and E. Goto, "Evaluation of the Continuation Bit in the Cyclic Pipeline Computer". Poster Presentation at Parallel Computing 91. London, Sep 1991.

1990

W.F. Wong, J.J. Yee and C.K. Yuen, "A Data Driven, Direct Execution Architecture for A Parallel Lisp Dialect", Proc. of the 1990 U.K. Conference on Parallel Computing in Lisp. Twickenham, London, Nov 1990.
W.F. Wong and K.T. Lua, "A Preliminary Evaluation of a Massively Parallel Processor : GAPP", Microprocessing and Microprogramming. vol. 29, no. 1, pp. 53-62. Jul 1990.
C.K. Yuen and W.F. Wong, "BaLinda Lisp : A Parallel List-Processing Language", Proc. of the 2nd IEEE International Conference on Tools for Artificial Intelligence. pp. 618-624. Fairfax, U.S.A., 1990.
C.K. Yuen and W.F. Wong, "BIDDLE : The Design of a BIdirectional Data Driven Lisp Engine", Proc. of the 13th Australian Computer Science Conference. pp. 421-429. Melbourne, Feb 1990.

1989

W.F. Wong and C.K. Yuen, "BIDDLE : A BIdirectional Data Driven Lisp Engine", Proc. of the 1989 IEEE International Workshop on Tools for Artificial Intelligence. pp. 194-199. Fairfax, U.S.A., Dec 1989.
W.F. Wong and C.K. Yuen, "SARC - A Stack and Register Computer", Proc. of the 1989 International Conference on Computer Architecture and Digital Signal Processing. pp. 194-199. Hong Kong, Oct 1989.
W.F. Wong, "A Stack Addressing Scheme Based on Windowing", ACM SIGARCH Computer Architecture News, Vol 17, No. 1. pp. 63-69. Mar 1989.

Last updated Mar 31, 2026