There’s power in hierarchy — but not what you expect

27 August 2019


Huang Ke-Wei

Associate Professor

Information Systems and Analytics


These days, it seems that whenever you’re thirsty and in need of a quick caffeine pick-me-up, there’s always a Starbucks close by, whether you’re running errands locally in the Singapore heartlands of Bedok or climbing the Badaling section of the Great Wall of China. Starbucks’ ubiquity isn’t just a figment of your imagination; it’s a fact backed by the firm’s latest sales figures.

At the end of July, Starbucks CEO Kevin Johnson announced that sales at their U.S. and Chinese stores performed so well that the company would be raising its full-year earnings and revenue forecast. The international coffee chain now expects earnings to hit 78 cents a share and revenue to exceed $6.82 billion by the year’s end — up from 72 cents and $6.67 billion respectively.

Forecasting earnings is an important part of doing business. It helps companies make decisions about where to take the firm in the future, informs creditors about the risk involved in loans, and indicates where investors should park their money.

“It’s an important question to consider and can often be a billion-dollar problem,” says Ke-Wei Huang, an associate professor at the NUS School of Computing who studies data mining and the economics of information systems. Analysts adopt various approaches to forecasting earnings, with some predicting the value directly. But with earnings comprising so many component variables, projection can be a tricky task. Instead, Huang argues for a bottom-up approach to forecasting.

“Look for hierarchical relationships,” he says. See if the variable in question, which he calls the “focal variable,” can be broken down into component variables. Company earnings, for example, can be calculated by subtracting total costs from total revenue. In turn, these two component variables can be further decomposed into their respective sub-components: gross profit and cost of goods sold for total revenue; operating and non-operating costs for total costs.

This decomposition, the process of breaking down the focal variable into its component variables, can help make earnings predictions more accurate. The challenge, however, is deciding which combination of component variables to use.
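To make the bottom-up idea concrete, here is a minimal Python sketch. The figures, the variable names, and the simple "repeat the last growth rate" forecaster are all hypothetical stand-ins chosen for illustration; they are not the prediction models used in the paper.

```python
def growth_forecast(history):
    """Forecast the next value by repeating the most recent growth rate.

    Illustrative only: a real system would use a trained prediction model.
    """
    return history[-1] * history[-1] / history[-2]


# Hypothetical figures (in $ millions) for one company over three periods.
revenue = [100.0, 120.0, 144.0]
cost = [80.0, 88.0, 96.8]
earnings = [r - c for r, c in zip(revenue, cost)]

# Direct approach: forecast the focal variable from its own history.
direct = growth_forecast(earnings)

# Bottom-up approach: forecast each component separately, then recombine
# them with the accounting identity earnings = revenue - cost.
bottom_up = growth_forecast(revenue) - growth_forecast(cost)

print(f"direct: {direct:.2f}, bottom-up: {bottom_up:.2f}")
```

The two approaches give different forecasts here because revenue and cost grow at different rates, a pattern that the earnings series alone obscures. Which one is closer to the truth depends on the data, which is why the choice of decomposition matters.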

“The number of potential ways to decompose is very large,” says Huang. Six component variables, a number that is “very, very small” in the real world, he says, will give you more than 500 possible solutions. Raise that figure to ten variables and you’ll have close to 120,000 solutions. Increase it further to 12 variables and you’ll have roughly 4.2 million possible combinations to consider.
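One way to see how quickly the search space grows is to count the ways of splitting a set of variables into groups, which is what the Bell numbers measure. Treating each decomposition as a set partition is an assumption made here for illustration. The resulting counts for ten and twelve variables agree with the figures quoted above, but the count for six variables (203) does not, so the paper may well enumerate decompositions differently.

```python
def bell_number(n):
    """Return the number of ways to partition a set of n items (the n-th Bell number)."""
    row = [1]  # first row of the Bell triangle
    for _ in range(n - 1):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Every further entry adds the entry above-left to the one just written.
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[-1]


for count in (6, 10, 12):
    print(count, "variables:", bell_number(count), "partitions")
```

Whatever the exact counting rule, the point stands: adding just a few variables multiplies the number of candidates many times over, so checking every one quickly becomes impractical.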

“The number of possible combination schemes can be astronomical even for relatively modest data sets,” write Huang and his PhD student Mengke Qiao in a paper published last December. Because this complexity is still too large for computers to process efficiently, it is crucial to identify which decompositions, or combinations of component variables, will help enhance prediction accuracy.

With a little help from Darwin

Huang and Qiao decided to tackle this challenge two to three years ago. To search for optimal decompositions of the focal variable, they turned to a technique called the “genetic algorithm.” Pioneered in the early 1970s by John Holland, a professor of electrical engineering and computer science at the University of Michigan, Ann Arbor, the method gained popularity nearly two decades later.

The genetic algorithm, as its name suggests, is inspired by Charles Darwin’s theory of natural selection. Borrowing from the concept that drives biological evolution and adapting it to computer science, Holland postulated a method that takes a group of candidate solutions and “evolves” it towards an optimal solution through successive iterations of eliminating weaker candidates.

Applying it to their forecasting problem, Huang and Qiao began by using prediction models to randomly generate a handful of possible component combinations, which they term solutions. “Once we create, for example, 10 solutions, we would then evaluate their performance and keep only the top two to three solutions,” explains Huang. “We then create the next generation of potential solutions based on those good ones, evaluate their performance again, and so on. So it’s evolutionary.”

New solutions are commonly generated using bio-inspired processes. For example, the original solution might be tweaked slightly, or two good solutions might be merged to create a new one — called mutation and crossover events, respectively.
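The loop Huang describes (generate, evaluate, keep the best, breed the next generation) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the authors' implementation: encoding a solution as a list of group labels, the `TARGET` grouping, and the `score` function are all hypothetical. In the real setting, the score would be the forecasting accuracy a decomposition achieves.

```python
import random

NUM_VARIABLES = 6      # component variables to assign to groups
NUM_GROUPS = 3         # hypothetical number of groups
POPULATION_SIZE = 10   # "for example, 10 solutions"
SURVIVORS = 3          # keep only the top few each generation
TARGET = [0, 0, 1, 1, 2, 2]  # hypothetical "ideal" grouping for the toy score


def score(solution):
    """Toy fitness: how many positions agree with TARGET."""
    return sum(a == b for a, b in zip(solution, TARGET))


def mutate(solution, rng):
    """Tweak one randomly chosen position (a mutation event)."""
    child = list(solution)
    child[rng.randrange(len(child))] = rng.randrange(NUM_GROUPS)
    return child


def crossover(first, second, rng):
    """Merge two parents at a random cut point (a crossover event)."""
    cut = rng.randrange(1, len(first))
    return first[:cut] + second[cut:]


def evolve(generations=30, seed=0):
    """Run the evolutionary loop and return the best solution found."""
    rng = random.Random(seed)
    population = [[rng.randrange(NUM_GROUPS) for _ in range(NUM_VARIABLES)]
                  for _ in range(POPULATION_SIZE)]
    for _ in range(generations):
        # Evaluate every candidate and keep only the best few.
        population.sort(key=score, reverse=True)
        parents = population[:SURVIVORS]
        # Build the rest of the next generation from the survivors.
        children = []
        while len(children) < POPULATION_SIZE - SURVIVORS:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=score)


best = evolve()
print(best, score(best))
```

Because the survivors are carried over unchanged, the best score can never get worse from one generation to the next. Nothing guarantees that the loop reaches the best possible solution, however.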

“The genetic algorithm model is very different and interesting from other methods,” says Huang. “It’s very good at trying to find complicated solutions.”

The team then tested their findings using real-world data. They took the top-performing decompositions obtained and combined them using a stacking method called Long Short-Term Memory (LSTM). They then applied it to predict the earnings of close to 600 U.S. companies, whose data was publicly listed on the Compustat North America database.

The tests showed that Huang’s techniques improved prediction performance over traditional forecasting models, called autoregressive models, which predict the focal variable directly. They also outperformed two “state-of-the-art” data mining algorithms, XGBoost and Random Forest.
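For readers unfamiliar with the baseline, an autoregressive model predicts the next value of a series from its own past values. Below is a minimal sketch of the simplest case, a first-order model with no intercept, fitted by least squares. The data and function names are hypothetical, and the article does not say which autoregressive specification the tests used.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in y[t] = phi * y[t-1] (no intercept)."""
    previous, current = series[:-1], series[1:]
    numerator = sum(p * c for p, c in zip(previous, current))
    denominator = sum(p * p for p in previous)
    return numerator / denominator


def forecast_next(series):
    """One-step-ahead forecast using the fitted coefficient."""
    return fit_ar1(series) * series[-1]


# Hypothetical earnings history that doubles every period.
history = [1.0, 2.0, 4.0, 8.0]
phi = fit_ar1(history)
next_value = forecast_next(history)
print(phi, next_value)
```

A model of this kind looks only at the history of the focal variable, so it cannot exploit any structure hidden in the components.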

Still, Huang strives to improve the technique. “The main weakness is that the solutions we find are good, but they may not be the best,” he says. And because the technique involves studying component variables, it takes longer than analysing the focal variable on its own.

“Because our method is much slower than predicting directly, the gain should be larger,” says Huang. “Otherwise the benefit doesn’t justify the cost.”

Moving forward, the team plans to conduct tests using a dataset with more component variables (so far, they’ve tested a maximum of six variables). They’re also looking to use other datasets that can be decomposed. They’re particularly interested in predicting Gross Domestic Product, which can be decomposed into numerous component variables, such as consumption, investment, government spending, imports and exports.

“Our method has improved the prediction performance so that’s why it’s an important problem to study,” says Huang. “But we still want to improve it further.”

Paper:
Hierarchical Accounting Variables Forecasting by Deep Learning Methods
