BIG DATA METHODS AND ANALYSIS

The reports that managers receive, the answers that are obtained from ad-hoc queries, and the visual representation of data on top management’s decisional dashboards are all made possible with the algorithms and techniques working behind the interface. Whether analytics tools deliver what decision makers truly need depends on what is built into them. Hence, our research group seeks to develop novel ways to build tools, methods, and designs to better handle the 4Vs of big data.

Figure 1. Research Themes in Big Data

 

In our quest for more effective business data analytics tools, methods, and designs, we focus not only on the algorithms and statistical approaches that these tools are built on, but also on how they must be contextualized. Our current topics of interest in the design of analytics tools include the processing of big data based on dynamic search queries (e.g., reverse dictionary and optimized search query algorithms) and the graphical representation of the big data collected (e.g., subgraph representation of an entire dataset and network visualization). Our projects in big data method design include identifying the core journals of the IS discipline and improving the accuracy of neural network rule extraction for credit scoring.

Sample Projects

Method for Core Journal Identification

Identifying a set of core information systems journals is a perennial interest of the information systems community. The primary approaches to this identification are surveys of academics, article-level citation analysis, and senior scholar consensus. An example of the last approach is the basket of eight journals identified by senior scholars of the Association for Information Systems (AIS). Our work on this topic proposes a different and efficient approach based on data published in Journal Citation Reports (JCR), which provides aggregate citation data for individual journals. While the findings provide general empirical support for the choice of the AIS basket of eight journals, they also indicate that five additional journals qualify as core information systems journals. Each of these journals frequently cites journals within this set and rarely cites individual journals outside it. Furthermore, a network centrality analysis of this set of journals reveals a high correlation between in-degree centrality and the perceived importance of journals. Overall, the study demonstrates the suitability of this method for identifying core journals in a discipline.

Table 1. Citations of the AIS Set of 8 Journals
Citing journal | To all journals | To IS journals (excluding self-citations) | %-Citation
Information Systems Research (ISR) | 4,568 | 403 | 8.8
MIS Quarterly (MISQ) | 4,822 | 428 | 8.9
Journal of Management Information Systems (JMIS) | 2,156 | 293 | 13.6
Journal of Information Technology (JIT) | 1,696 | 258 | 15.2
Information Systems Journal (ISJ) | 1,243 | 190 | 15.3
Journal of Strategic Information Systems (JSIS) | 1,909 | 317 | 16.6
European Journal of Information Systems (EJIS) | 3,037 | 534 | 17.6
Journal of the Association for Information Systems (JAIS) | 2,228 | 478 | 21.5
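
The %-Citation column in Table 1 can be recomputed from the two count columns. The sketch below is a minimal illustration, assuming the percentage is the number of citations to IS journals (excluding self-citations) divided by the number of citations to all journals. The function name `citation_share` and the dictionary layout are illustrative choices, not part of the study.

```python
# Counts copied from Table 1:
# (citations to all journals, citations to IS journals excluding self-citations)
TABLE_1 = {
    "ISR": (4568, 403),
    "MISQ": (4822, 428),
    "JMIS": (2156, 293),
    "JIT": (1696, 258),
    "ISJ": (1243, 190),
    "JSIS": (1909, 317),
    "EJIS": (3037, 534),
    "JAIS": (2228, 478),
}


def citation_share(to_all, to_is):
    """Percentage of a journal's citations that go to IS journals (assumed definition)."""
    return 100.0 * to_is / to_all


# Round to one decimal place, matching the precision used in the table.
shares = {name: round(citation_share(*counts), 1) for name, counts in TABLE_1.items()}

for name, pct in sorted(shares.items(), key=lambda item: item[1]):
    print(f"{name:5s} {pct:5.1f}")
```

Under that assumption, the computed values agree with the table to one decimal place.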

 

Neural Network Rule Extraction for Credit Scoring

This project presents an approach for sample selection using an ensemble of neural networks for credit scoring. The ensemble identifies samples that can be considered outliers by checking the classification accuracy of the neural networks on the original training data samples. Samples that are consistently misclassified by the neural networks in the ensemble are removed from the training dataset. The remaining data samples are then used to train and prune another neural network for rule extraction. Our experimental results on publicly available benchmark credit scoring datasets show that eliminating the outliers yields neural networks with higher predictive accuracy and a simpler structure than networks trained on the original dataset. A rule extraction algorithm is then applied to generate comprehensible rules from the neural networks. The extracted rules are more concise than the rules generated from networks trained on the original datasets.
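
The sample selection step described above can be sketched in a few lines of Python. This is a simplified illustration, not the published procedure: it assumes the ensemble members' predictions on the training data are already available as lists, and it treats "consistently misclassified" as "misclassified by at least a given fraction of the members". The function `select_samples`, the parameter `min_wrong_fraction`, and the toy data are all hypothetical.

```python
def select_samples(labels, ensemble_predictions, min_wrong_fraction=1.0):
    """Return the indices of training samples to keep.

    labels: true class label of each training sample.
    ensemble_predictions: one list of predicted labels per ensemble member.
    min_wrong_fraction: hypothetical threshold; a sample is treated as an
        outlier and dropped when at least this fraction of the members
        misclassify it (1.0 means every member gets it wrong).
    """
    if not ensemble_predictions:
        raise ValueError("the ensemble must contain at least one member")
    n_members = len(ensemble_predictions)
    keep = []
    for i, true_label in enumerate(labels):
        wrong = sum(1 for preds in ensemble_predictions if preds[i] != true_label)
        if wrong / n_members < min_wrong_fraction:
            keep.append(i)
    return keep


# Toy data (made up): five samples, three ensemble members.
labels = [1, 0, 1, 0, 1]
ensemble_predictions = [
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 0],
]

# The sample at index 2 is misclassified by all three members and is dropped.
kept = select_samples(labels, ensemble_predictions)
print(kept)
```

The retained indices would then be used to train and prune the network from which rules are extracted. A lower threshold removes more samples, so the choice trades cleaner training data against the risk of discarding legitimate but hard cases.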

Figure 2. Approach to Generate Comprehensible Rules from the Trained Neural Network

 

Representative Publications

  • Nguyen, H.D., and Poo, D.C.C. (2016) “Automated Mobile Health: Designing a Social Reasoning Platform for Remote Health Monitoring,” Proc. HCI International, Toronto, Canada, July 17-22.
  • Chan, H.C., Guness, V., and Kim, H.W. (2015) “A Method for Identifying Journals in a Discipline: An Application to Information Systems,” Information & Management, 52(2), 239-246.
  • Setiono, R., Azcarraga, A., and Hayashi, Y. (2015) “Using Sample Selection to Improve Accuracy and Simplicity of Rules Extracted from Neural Networks for Credit Scoring Applications,” International Journal of Computational Intelligence and Applications, 14(4), 1550021-1-20.
  • Setiono, R., Azcarraga, A., and Hayashi, Y. (2014) “Mofn Rule Extraction from Neural Networks Trained with Discretized Input,” Proc. International Joint Conference on Neural Networks, Beijing, China, July 6-11, 1079-1086.
  • Kajanan, S., Bao, Y., Datta, A., VanderMeer, D., and Dutta, K. (2014) “Efficient Automatic Search Query Formulation Using Phrase-Level Analysis,” Journal of the Association for Information Science and Technology, 65(5), 1058–1075.
  • Lu, X., Phan, T.Q., and Bressan, S. (2013) “Incremental Algorithms for Sampling Dynamic Graphs,” Database and Expert Systems Applications, 8055, 327–341.
  • Shaw, R., Datta, A., VanderMeer, D., and Dutta, K. (2013) “Building a Scalable Database-Driven Reverse Dictionary,” IEEE Transactions on Knowledge and Data Engineering, 25(3), 528–540.

THE BUSINESS OF BUSINESS ANALYTICS

Integrating analytics into an organization is fraught with challenges that must be overcome before the value of these tools and techniques can be realized. Our research in this area examines how business analytics can be integrated into the processes and decision making of an organization or institution and how such integration could yield tangible returns and value. For example, in one of our projects, we focus on quantifying the value of business analytics by constructing a competition model between firms with analytics capability and those without. The results reveal that investment in building analytics capability could improve a firm’s equilibrium price, market share, and profit. In another project, we studied how an organization should prepare itself for business analytics capability. Our view is that an organization’s resource allocation and resource orchestration processes need to be clearly understood before analytics capability can be effectively built and leveraged. Going forward, we are interested in identifying the boundary conditions for business analytics, that is, the set of conditions under which it is valuable. With such knowledge, firms, institutions, and even policy makers would be able to implement business analytics with a clear view of the expected returns and of the advantage it offers over competitors.

Sample Projects

Shareholder Reactions to BA Announcements

Despite the growing acceptance of business analytics (BA) as a tool for making smarter business decisions, past research has rarely investigated shareholder reactions to BA announcements. We use signaling theory and resource-based theory (RBT) as our theoretical lenses. The results show that BA announcements generate positive abnormal returns, thereby providing empirical evidence that shareholders view BA as beneficial. The results also suggest that characteristics that are more salient to shareholders are rewarded. Specifically, firms implementing BA systems from market-leading vendors obtain more positive stock market reactions than other firms. Announcements benefit overbought stocks more than oversold stocks, and generate higher positive returns for firms with high sales growth and high return on assets (ROA). Overall, the empirical evidence favors signaling theory over RBT.
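
The summary above does not say how abnormal returns were estimated. As an illustration only, a common event-study formulation is the market model: a stock's expected return is a linear function of the market return, fitted over an estimation window before the announcement, and the abnormal return is the actual return minus that expectation. The sketch below assumes this formulation; the function names and all return figures are made up for the example.

```python
def fit_market_model(stock_returns, market_returns):
    """Ordinary least squares fit of stock = alpha + beta * market."""
    n = len(stock_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((m - mean_m) * (s - mean_s)
              for s, m in zip(stock_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return alpha, beta


def cumulative_abnormal_return(stock_returns, market_returns, alpha, beta):
    """Sum of (actual - expected) daily returns over the event window."""
    return sum(s - (alpha + beta * m)
               for s, m in zip(stock_returns, market_returns))


# Estimation window (made-up daily returns): the stock follows
# 0.001 + 1.5 * market exactly, so the fit recovers those values.
market_est = [0.010, -0.005, 0.002, 0.007, -0.003]
stock_est = [0.001 + 1.5 * m for m in market_est]
alpha, beta = fit_market_model(stock_est, market_est)

# Event window (made-up): the stock gains an extra 2% on the announcement day.
market_evt = [0.004, -0.001, 0.003]
stock_evt = [0.001 + 1.5 * m for m in market_evt]
stock_evt[1] += 0.02

car = cumulative_abnormal_return(stock_evt, market_evt, alpha, beta)
print(f"alpha={alpha:.4f} beta={beta:.2f} CAR={car:.4f}")
```

A positive cumulative abnormal return around the announcement date is what "positive abnormal returns" refers to. In practice it would be averaged across many announcements and tested for statistical significance.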

Figure 3. Research Framework for Returns to BA Announcements

 

A Research Agenda for Exploring the Impact of BA on Organizations

Much attention is currently being paid in both the academic and practitioner literatures to the value that organisations could create through the use of big data and business analytics. We argue that while there is some evidence that investments in business analytics can create value, the thesis that ‘business analytics leads to value’ needs deeper analysis. In particular, we argue that the roles of organisational decision-making processes, including resource allocation processes and resource orchestration processes, need to be better understood in order to understand how organisations can create value from the use of business analytics. Specifically, we propose that the first-order effects of business analytics are likely to be on decision-making processes and that improvements in organisational performance are likely to be an outcome of superior decision-making processes enabled by business analytics. For this purpose, we identify prior research traditions in the Information Systems (IS) literature that discuss the potential of data and analytics to create value. This is to put into perspective the current excitement around ‘analytics’ and ‘big data’, and to position those topics within prior research traditions. We then draw on a number of existing literatures to develop a research agenda to understand the relationship between business analytics, decision-making processes, and organisational performance.

Figure 4. Research Model for the Effect of BA Use on Organizational Performance

 

Representative Publications

  • Teo, T.S.H., Nishant, R., and Koh, P.B.L. (forthcoming in 2016) “Do Shareholders Favor Business Analytics Announcements?” Journal of Strategic Information Systems.
  • Wu, J., Li, H., Lin, Z., and Goh, K.Y. (forthcoming in 2016) “How Big Data and Analytics Reshape the Wearable Device Market: The Context of E-Health,” International Journal of Production Research.
  • Sharma, R., Mithas, S., and Kankanhalli, A. (2014) “Transforming Decision-making Processes: A Research Agenda for Understanding the Impact of Business Analytics on Organizations,” European Journal of Information Systems, 23(4), 433-441. (Guest Editorial)

APPLICATION OF BIG DATA ANALYTICS

The advent of big data is revolutionizing not only business practices but also how academics conduct research. The availability of massive amounts of data is allowing researchers to tackle research problems that were previously out of reach. For example, in one project we triangulate LinkedIn data on over 200 million résumés with patent data from the US Patent and Trademark Office (USPTO) and financial data from financial databases to investigate whether investments in IT have an impact on entrepreneurial spawning. In another project, we analyze all individual plays in all games over 5 years of American football to understand whether momentum in organizational performance makes organizations more or less likely to take risks.

Big data does not necessarily mean data at a more macro scale. Oftentimes, big data involves more fine-grained data that allows the investigation of more micro processes. In one project, we analyze over 1,800 hours of eye-tracking data and triangulate it with video footage and website trace log data to understand the effectiveness of different designs of collaborative shopping websites. Overall, our research in this domain offers valuable insights into various applications of big data analytics.

Sample Projects

Impact of a Paywall on WOM via Social Media

Information goods providers such as print newspapers are experimenting with different pricing models for their online content. Despite prior research on the topic, it is still not clear how their information pricing strategy influences word-of-mouth (WOM) via social media, which has become a dominant channel for raising awareness about a newspaper’s articles and attracting new visitors to its website. Using The New York Times’ (NYT) paywall rollout as a natural experiment, our study examines how the implementation of a paywall by a firm (i.e., a shift from “free” to “for-a-fee”) influences the pattern and effectiveness of online WOM in social media. Our results indicate that implementing a paywall (i.e., charging for content that was earlier available for free) has a disproportionate impact on WOM for popular and niche articles, creating a longer tail in the WOM (i.e., content sharing) distribution. Further, we find that the impact of WOM on the NYT’s website traffic weakens significantly after the introduction of a paywall. These results show that a paywall has implications for product and promotion strategies. The study offers novel and important implications for the theory and practice of strategic use of social media and paywalls.

Figure 5. Inverse Relationship between Content Popularity and Usage

 

Momentum and Organizational Risk Taking

This study examines how momentum shapes organizational risk taking. We define momentum as a sustained and systematic trajectory in performance over time, and we argue that such trends impact interpretations of current performance as well as expectations of future performance. Drawing on the variable focus of attention model, we posit that momentum therefore directs the focus of organizational attention between concerns of aspirations, survival, and slack. Our conceptual model accounts for momentum that occurs within a performance period as well as that which occurs across periods. We propose that within- and across-period momentums are unique in terms of when and how each type impacts risk taking. We tested and found support for our hypotheses in the context of 22,603 play-by-play decisions made by the 32 teams of the National Football League during the 2000–2005 regular season games.

Figure 6. Proposed Moderating Effects of Within-Period Momentum

 

Representative Publications

  • Oh, H., Animesh, A., and Pinsonneault, A. (2016) “Free Versus For-A-Fee: The Impact of a Paywall on the Pattern and Effectiveness of Word-Of-Mouth via Social Media,” MIS Quarterly, 40(1), 31-56. (An earlier version of the paper won the best paper award at ICIS 2013.)
  • Yue, Y., Ma, X., and Jiang, Z. (2016) “Influence of Content Layout and Motivation on Users’ Herd Behavior in Social Discovery,” ACM Conference on Human Factors in Computing Systems (CHI), San Jose, CA, May 7-12.
  • Chen, Q., Huang, K.W., and Heng, C.S. (2014) “IT Investment: The Unexpected Effects on Entrepreneurial Spawning,” Proc. International Conference on Information Systems, Auckland, New Zealand, December 14-17.
  • Yue, Y., Ma, X., and Jiang, Z. (2014) “Share your View: Impact of Co-Navigation Support and Status Composition in Collaborative Online Shopping,” ACM Conference on Human Factors in Computing Systems (CHI), Toronto, Canada, April 26-May 1.
  • Lehman, D.W., and Hahn, J. (2013) “Momentum and Organizational Risk Taking: Evidence from the National Football League,” Management Science, 59(4), 852-868.