Can I Touch Your Code? How Clashing Coding Styles Quietly Undermine Open Source Software
Every day, millions of lines of code are written without a single human typing them. AI coding assistants have become the fastest-growing contributors to software projects worldwide – generating, completing, and committing code at a pace no human team can match. In 2025, 41% of all code written globally was AI-generated or AI-assisted – and the share was rising fast. Google and Microsoft both disclosed that around 30% of their new code was AI-generated, while Meta’s CEO indicated that AI would handle half of his company’s software development within a year. At some startups, entire codebases are now 95% machine-written.
This is the age of vibe coding: describe what you want, let the AI build it, and keep moving. The speed is intoxicating. But underneath the velocity, a quieter problem is taking shape.
AI tools don’t write code the way a seasoned contributor to a project would. They write code the way they were trained – drawing on patterns from across the internet, calibrated to be generally readable rather than locally consistent. When different developers use different AI tools, or prompt the same tool in different ways, each produces code with its own stylistic fingerprint. The result lands in the codebase and stays there. Nobody planned it. Nobody owns it. And increasingly, nobody notices – until the friction has already built up.
When hundreds of developers contribute to the same project, their individual coding habits leave fingerprints everywhere. New research from NUS Computing shows these clashing styles aren’t just an aesthetic nuisance. They’re a coordination problem that quietly undermines a project’s progress.
Open source software is often described as a miracle of modern collaboration. People who have never met, working across time zones, languages, and skill levels, somehow manage to build things that power the internet. Linux runs the majority of the world’s servers. Android sits in billions of pockets. The AI models reshaping industries lean heavily on open source libraries and frameworks.
No central authority, no fixed teams. And yet, it works.
But there’s a quieter problem sitting underneath all that success. It isn’t bugs, and it isn’t architectural complexity.
It’s something much smaller, and easier to overlook: the way code is written.
A study by Professor Hahn Jungpil, Provost’s Chair Professor in the Department of Information Systems and Analytics at the NUS School of Computing, together with Assistant Professor Zhiyi Wang from the University of Colorado Boulder (NUS IS PhD 2019), and Associate Professor Steven L. Johnson from the University of Virginia, introduces the concept of programming style inconsistency and puts hard numbers behind something the open source community has long felt but rarely examined – that the way code looks can shape how well a project performs.
Why Style Matters More Than You Think
To understand why something as seemingly superficial as coding style could affect a project’s success, it helps to understand how open source development actually works.
In most corporate teams, developers coordinate through meetings, project managers, shared schedules, and explicit task assignments. Open source is different. Contributors are typically volunteers. They may never meet. They work across time zones, often in their spare time, and rarely have access to synchronous communication.
So how do they coordinate? Largely through the code itself.
Developers read the codebase to figure out what’s been done, understand how things fit together, and decide where they can contribute. The code isn’t just a product. It’s the primary coordination tool. Researchers call this artifact-centric coordination: the idea that the software artifact, the codebase, mediates collaboration in the absence of traditional management structures.
This is where programming style becomes important. If the codebase is written in a consistent style, reading and navigating it is straightforward. Developers can focus on logic, architecture, and what needs to be done. But when styles clash, every encounter with unfamiliar formatting creates a small moment of friction. The developer has to pause, adjust, and mentally recalibrate before they can engage with the substance of the code.
Multiply that across thousands of files and dozens of contributors, and the cumulative cost becomes significant.
Measuring the Invisible
One reason programming style inconsistency has received so little attention is that it’s hard to measure. How do you objectively quantify how “messy” a codebase looks?
Professor Hahn’s team developed a rigorous method. Using static code analysis based on Google’s Closure Linter, they built a custom programme that reads source code and catalogues the stylistic choices made in every file. Their system tracked dozens of attributes across three categories: formatting (indentation, spacing, brace placement), readability (naming conventions, comment styles), and language usage (loop structures, keyword preferences).
Each source file was then represented as a style vector, a numerical profile of how the code in that file was written. Think of it as a fingerprint for coding style.
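The paper’s actual pipeline was built on Google’s Closure Linter and tracked dozens of attributes; as a rough, hypothetical sketch of the idea (not the study’s implementation), here is a tiny profiler that reduces a JavaScript file to three illustrative ratios covering the paper’s three categories, formatting, readability, and language usage:

```python
import re

def style_vector(source: str) -> tuple:
    """Reduce a JavaScript source string to a crude 3-attribute style vector.

    Illustrative only: the study's system catalogued dozens of attributes
    via static analysis, not the three toy ratios computed here.
    """
    lines = source.splitlines()

    # Formatting: of the indented lines, what share use tabs (vs spaces)?
    indented = [l for l in lines if l[:1] in (" ", "\t")]
    tab_ratio = (sum(l.startswith("\t") for l in indented) / len(indented)
                 if indented else 0.0)

    # Formatting: of the lines opening a block, what share put the brace
    # on the same line as code (K&R style) rather than on its own line?
    opens = [l for l in lines if l.rstrip().endswith("{")]
    same_line = sum(1 for l in opens if l.strip() != "{")
    brace_ratio = same_line / len(opens) if opens else 0.0

    # Readability: of declared names, what share are camelCase?
    names = re.findall(r"\b(?:var|let|const|function)\s+([A-Za-z_$][\w$]*)",
                       source)
    camel = sum(1 for n in names if re.search(r"[a-z][A-Z]", n))
    camel_ratio = camel / len(names) if names else 0.0

    return (tab_ratio, brace_ratio, camel_ratio)
```

A file written entirely with tabs, same-line braces, and mixed naming would map to a vector like `(1.0, 1.0, 0.5)`; two files with distant vectors "look" different to the next developer even when both are internally tidy.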
With these vectors, the team measured inconsistency at two levels:
- Component-level inconsistency captures how much styles clash within a single file, usually because multiple developers have edited it over time.
- System-level inconsistency captures how much styles vary across all the files in a project, reflecting how fragmented the codebase looks as a whole.
The distinction is important. A project could have files that are internally clean but written quite differently from each other. Or it could have files where styles are jumbled within each file but roughly similar across the system. Both create problems, but through different channels.
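Given per-file style vectors like the fingerprints described above, the two levels can be sketched with simple, hypothetical stand-in measures (the paper’s own formulas are not reproduced here): within a file, an attribute ratio near 0.5 signals two conventions mixed together, while across files, dispersion of the vectors signals a fragmented system.

```python
from statistics import pvariance

def component_inconsistency(vec: tuple) -> float:
    """Within-file clash: each attribute ratio near 0.5 means the file
    mixes two conventions; near 0 or 1 means it is internally consistent.
    Returns the average distance from the nearest 'pure' convention."""
    return sum(min(v, 1 - v) for v in vec) / len(vec)

def system_inconsistency(vectors: list) -> float:
    """Across-file fragmentation: average per-attribute variance of the
    style vectors around the project-wide mean."""
    dims = list(zip(*vectors))
    return sum(pvariance(d) for d in dims) / len(dims)
```

Under these toy measures, a file that consistently uses tabs and same-line braces scores zero within-file inconsistency, while a project split evenly between two opposite styles scores the maximum across-file dispersion, mirroring the two channels the study distinguishes.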
The team applied this approach to 1,817 JavaScript projects on GitHub, scanning more than five million source code files and tracking each project’s development activity month by month over several years.
What the Data Shows
The results were clear, and they held up across every statistical model and robustness check the researchers ran (21 models in total, using multiple measures of success).
Projects with greater programming style inconsistency showed significantly less technical progress. This was true at both levels: within files and across the system.
Technical progress here was measured in several ways: total lines of code changed, lines added, issues closed, pull requests merged, and code changes excluding cleanup commits. The pattern was consistent across all of them.
To put it simply: when a codebase is stylistically messy, developers contribute less to it. Not because the code doesn’t function, but because it becomes harder to work with.
The mechanism is cognitive. Inconsistent styles increase the mental effort required to comprehend code. It takes longer to figure out what’s going on, which makes it harder to plan and execute changes. The researchers call this added overhead instantiation costs. On top of that, developers tend to hold strong preferences about coding style, and encountering conventions that conflict with their own can be genuinely frustrating, reducing their motivation to continue.
One developer in the study’s dataset illustrated the problem in a GitHub issue. They wanted to contribute new features but couldn’t determine the project’s preferred coding standard. Some files used two-space indentation, others four spaces, and still others used tabs. Before writing a single new line of code, they had to stop and ask.
That kind of friction is invisible in most project metrics. But its effects accumulate over time.
The Role of Architecture and Work Patterns
The study goes further by examining whether common coordination strategies in open source can mitigate or worsen the problem. The answer depends on which strategy you’re talking about.
Think of it this way: imagine a large building where every room is designed by a different architect. If the rooms are self-contained, with their own entrances and no shared walls, you can use any one room without being confused by the others. That’s modularity: designing a codebase so its components are as independent as possible.
The study found that modularity does help, but only at the system level. When a project is well modularised, developers can work within a single component without having to navigate stylistic chaos in other parts of the codebase. The inconsistency is still there, but they’re less likely to encounter it. However, if the room you’re working in is itself a mess, the neatness of the building’s layout doesn’t help you much. Modularity did nothing to resolve clashing styles within individual files.
Now imagine the same building, but each architect works alone, at different times, without ever talking to the others. Each one builds on what the last person did, but nobody discusses conventions. That’s open superposition, the dominant work pattern in open source, where developers pick up tasks independently and layer their contributions one after another.
Open superposition made the style problem worse – at both levels. When developers work independently without shared conventions or explicit communication, the friction from clashing styles has no outlet. There’s no discussion to resolve it, no alignment to prevent it. Developers encounter the inconsistency, absorb the cognitive cost, and in many cases, contribute less or pull back entirely.
This is a nuanced finding. Open superposition is not inherently bad. It’s efficient, it respects developers’ autonomy, and it keeps projects moving. But it has a blind spot. When the codebase is difficult to engage with, independence tips into isolation, and style problems compound rather than resolve.
More Relevant Now Than Ever: The Age of AI Coding
When this research was conducted, AI coding assistants were a curiosity rather than a fixture. Today, they are everywhere – and the problem this study identified is being replicated at machine speed.
Coding standards were in place in only about 27% of observations across the dataset. The vast majority of the time, style was left entirely to individual preference. That was already a problem when the contributors were human. It becomes a more serious one when the contributors are AI tools that carry their own stylistic defaults – defaults shaped by training data from across the Internet, not by the conventions of the project they’re writing for.
The evidence is accumulating. A 2025 empirical study published in the Proceedings of the ACM on Software Engineering analysed code generated by five mainstream large language models and documented 24 distinct types of coding style inconsistency between AI output and human-written code – inconsistencies spanning formatting, naming conventions, and structural patterns that make the code harder for the next developer to read, understand, and build on.
Industry data tells the same story at scale. A December 2025 analysis of 470 open-source GitHub pull requests found that even in projects with formatters and linters already in place, AI-co-authored code showed elevated rates of spacing errors, indentation inconsistencies, structural drift, and naming mismatches – all of which increase the cognitive load on the next developer to touch that code. AI-assisted pull requests produced approximately 1.7 times more issues overall than their human-written equivalents.
A longitudinal study tracking 211 million lines of code across repositories owned by Google, Microsoft, and Meta found that as AI tool adoption rose, the proportion of code dedicated to refactoring – the activity that keeps a codebase clean and internally consistent – collapsed from 25 percent of changed lines in 2021 to under 10 percent by 2024. Code duplication increased approximately fourfold over the same period. AI tools, it turns out, are prolific adders of code and poor removers of redundancy.
What makes this particularly relevant to Professor Hahn’s research is the mechanism. The study found that open superposition – where contributors work independently, layering their changes without explicit coordination – amplifies the damage from style inconsistency. AI tools are the apotheosis of open superposition. Different developers using different tools, or the same tool with different prompts, produce code with different stylistic fingerprints. There is no discussion, no alignment, no shared convention. Each contribution lands and stays. That is precisely the dynamic this research warned about – and it is now playing out at machine speed.
The findings here are not a historical curiosity about the pre-AI era of open source. They are a diagnostic framework for what is happening right now – and a guide for what to do about it. The same mechanisms this research identified – the cognitive friction of inconsistent styles, the compounding effect of independent contributions without shared conventions, the limits of modularity as a fix – are the mechanisms through which AI tools are quietly degrading the codebases they are meant to improve.
The Bigger Picture
This matters because open source is not a niche activity. It underpins cloud computing, AI infrastructure, mobile operating systems, and much of the internet. The health of these projects depends on a steady flow of contributions from distributed developers, and anything that creates unnecessary friction in that flow has outsized consequences.
For project maintainers and platform designers, the practical takeaways are concrete. Investing in coding standards and automated style enforcement is not cosmetic overhead. It’s a coordination investment. Modular architecture helps contain style problems across the system, but teams also need strategies for managing within-file inconsistencies, something modularity alone cannot address. And when style inconsistency is already high, encouraging some explicit coordination between developers, rather than relying entirely on independent contributions, can help reduce the cognitive tax.
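One concrete, low-cost form of that coordination investment is a project-wide EditorConfig file, a widely supported format that most editors and CI linters respect. A minimal example (the specific choices below are illustrative, not a recommendation from the study) pins down exactly the decisions, two spaces or four, tabs or spaces, that left the developer in the earlier anecdote unable to start:

```ini
# .editorconfig – applies to every contributor's editor automatically
root = true

[*]
charset = utf-8
end_of_line = lf
trim_trailing_whitespace = true
insert_final_newline = true

[*.js]
indent_style = space
indent_size = 2
```

Because the file travels with the repository, it encodes the convention once, in the artifact itself, which is precisely where artifact-centric coordination says the signal needs to live.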
Platforms like GitHub have an important role to play too. Features that check for style inconsistency, flag drift introduced by AI-generated pull requests, and surface codebase-wide patterns would help maintainers see a problem that is currently invisible in most project metrics. The tools to do this exist. What has been missing is the research to make the case that they matter. That case has now been made.
A New Way to Think About Code
Technical discussions about software tend to focus on function, performance, and architecture. Style is usually treated as a matter of personal taste, something for linters to tidy up, not something that shapes how well a project actually works.
This research challenges that assumption. It suggests that the appearance of code, the patterns and conventions that determine how it reads on screen, is itself a coordination mechanism. When styles are consistent, developers can focus on what matters. When they clash, the codebase quietly pushes contributors away.
In open source, where progress depends on many people building on each other’s work, that makes all the difference. Sometimes, the smallest details, two spaces or four, braces here or there, carry more weight than anyone assumed.
Read the full study here: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=396169
