25 November 2020 – Associate Professor Hahn Jungpil and Information Systems PhD graduate Peng Jiaxu recently won the Best Paper award in the Advances in Research Methods track at the International Conference on Information Systems (ICIS) 2020.
The annual conference of the Association for Information Systems brings together information systems researchers and academics from around the world. Originally due to be held in Hyderabad, India, the conference will instead be held online from 13 to 16 December.
In their paper, “Transfer Learning in Dynamic Data Environments: Trade-offs in Response to Changes”, A/P Hahn and Dr. Peng, who is now an Assistant Professor in the Department of Accounting Information Systems at Beijing’s Central University of Finance and Economics, analyse fundamental trade-offs in machine learning-based predictive analytics for dynamic data environments.
A/P Hahn explains that with recent advancements in big data analytics, artificial intelligence (AI) and machine learning, predictive analytics is being applied to wide-ranging domains, including fraud detection and credit default prediction. “But most organisations assume implicitly that once a prediction model has been successfully trained, the model will continue to produce accurate predictions. However, when the data environment changes – in other words, when data patterns change – we cannot simply continue to apply the predictive models trained with past, and perhaps no longer relevant, data,” said A/P Hahn. “The problem is that we often lack sufficient data to adjust our prediction models in changing data environments in a timely manner. Since information and data is often too scarce in the early stages of the new data regime, analysts face two fundamental trade-offs.”
These two trade-offs are identified in their paper as a bias-variance trade-off, and an exploration-exploitation trade-off.
“The first trade-off – the bias-variance trade-off – arises because data analysts can still make use of source data, such as historical data, to overcome the data scarcity issue – that is, to use the transfer learning strategy,” explained A/P Hahn, who is also the Head of the Department of Information Systems and Analytics at NUS Computing. “However, under a changing data environment, using transfer learning may still result in a biased model that is far from the target.”
Transfer learning is an approach used in machine learning, where a model trained for one task is reused to complete a different, but related, secondary task. In this case, the first task or model would be for the previous data regime, and the secondary task or model would be for making predictions in the new (changed) data regime.
“The second trade-off – the exploration-exploitation trade-off – arises because data analysts can wait to collect more timely relevant data to train a more accurate model for the target task. However, this results in the cost of worsening prediction performance before the delayed adjustment is made,” said A/P Hahn.
In their paper, A/P Hahn and Dr. Peng incorporate change into predictive analytics by proposing a general transfer learning framework based on sample selection. Changing data patterns are represented by a sample selection model, while the transfer learning strategy is operationalised based on the sample selection probability. They conducted simulation analyses to corroborate their theoretical analysis of overall trade-offs in a dynamic data environment.
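The bias-variance trade-off described above can be illustrated with a small calculation. The sketch below is a toy example and not the sample selection framework from the paper: it assumes the analyst only needs to estimate a mean in the new data regime, and blends a scarce target sample with a plentiful source sample using a weight `w`. The function names, the `shift` parameter (the gap between the old and new means) and the equal-variance assumption are all hypothetical choices made for illustration.

```python
def transfer_mse(w, shift, sigma, n_target, n_source):
    """Mean squared error of the blended estimator
    (1 - w) * mean(target sample) + w * mean(source sample).

    Assumes independent samples with a common standard deviation sigma,
    and a source mean that differs from the target mean by `shift`.
    """
    # Leaning on source data pulls the estimate towards the old regime.
    bias = w * shift
    # Leaning on source data also averages over far more observations.
    variance = (1 - w) ** 2 * sigma ** 2 / n_target + w ** 2 * sigma ** 2 / n_source
    return bias ** 2 + variance


def best_weight(shift, sigma, n_target, n_source, steps=100):
    """Grid-search the source weight in [0, 1] that minimises the MSE."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda w: transfer_mse(w, shift, sigma, n_target, n_source),
    )


if __name__ == "__main__":
    # 10 fresh observations from the new regime, 1000 historical ones.
    for shift in (0.0, 0.2, 5.0):
        w = best_weight(shift, sigma=1.0, n_target=10, n_source=1000)
        print(f"shift={shift}: best source weight = {w}")
```

Under these assumptions, the best weight on historical data is close to 1 when the regime has barely changed and falls towards 0 as the change grows, which is the tension A/P Hahn describes between overcoming data scarcity and ending up with a biased model.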
“I am very happy to receive this award, especially since there are many who appreciate our study. I appreciate the great discussions I had with Jungpil and the support from him in developing this study, as well as the help from many of my peers, our school, and NUS Information Technology,” said Dr. Peng.
Added A/P Hahn: “We’ve been working on this problem for a while and although we knew that this was a worthy problem to solve from the very beginning, we did struggle quite a bit to get the framing right. It’s always very rewarding when you get these ‘a-ha’ moments with challenging problems. I was also very fortunate to have worked with a very diligent student who didn’t give up and continued to push the boundaries.”