Mastering Codebase Scaling: A Guide for Large Developer Teams
How to Manage a Growing Engineering Team Without Sacrificing Productivity
A software engineer’s first project is usually a university assignment or a personal project for learning to program. It often starts with one person and one great idea for an app. At this stage, it feels like you can achieve anything. You can build something useful just by pulling an all-nighter. You don’t need to focus on quality. There is no friction. We all know that feeling.
With the power of one person, you can be fast. If you add a second pair of hands to a project, it will be quicker, but you won’t double your output. There will be more friction that will slow both developers down.
This friction can be managed in small teams. If your team consists of 3, 5, or 10 developers, you can still more or less control what is happening, although it is already difficult to know everything and keep everyone in the loop. When your team grows to 20, 50, or even over 100 developers, it becomes impossible to keep track of everything.
Issues such as a broken dev branch preventing the team from developing new code, missing tests, low-quality code, and mismatched requirements will arise. However, these challenges can be addressed using techniques employed by large-scale teams.
Working with Git
Managing a codebase at scale starts with an agreed process for working with Git. The branching model must let multiple developers work simultaneously and allow release branches and hotfixes to be created without disrupting ongoing development. Gitflow is an example of a straightforward branching model that works well for growing teams.
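To make the Gitflow rules concrete, here is a minimal Python sketch of where each branch type starts and where it is merged back. The branch names `main` (called `master` in the original Gitflow description) and `develop`, as well as the prefixes, are the conventional defaults and are assumed here; `merge_targets` is a hypothetical helper, not part of Git.

```python
# Conventional Gitflow branch types: where they start and where they end up.
# Branch names and prefixes are assumed defaults; adjust them to your setup.
GITFLOW_RULES = {
    "feature/": {"branch_from": "develop", "merge_into": ["develop"]},
    "release/": {"branch_from": "develop", "merge_into": ["main", "develop"]},
    "hotfix/": {"branch_from": "main", "merge_into": ["main", "develop"]},
}


def merge_targets(branch: str) -> list[str]:
    """Return the branches a finished Gitflow branch should be merged into."""
    for prefix, rule in GITFLOW_RULES.items():
        if branch.startswith(prefix):
            return list(rule["merge_into"])
    raise ValueError(f"'{branch}' does not follow the assumed Gitflow naming")
```

A hotfix starts from `main` and is merged into both `main` and `develop`, which is why it can ship without waiting for unfinished work on `develop`.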
It’s important to use clear commit messages, such as those following the Conventional Commits specification, but a well-structured pull request template is even more crucial, as it serves as both documentation and a code review aid.
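As an illustration, a Conventional Commits header has the shape `type(scope)!: description`, where the scope and the `!` breaking-change marker are optional. The Python sketch below checks only that first line. The list of allowed types is an assumption: the specification itself names only `feat` and `fix`, and the others follow a widely used convention. `check_commit_header` is a hypothetical helper.

```python
import re

# Assumed type list; the spec only mandates "feat" and "fix".
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

# <type>[(scope)][!]: <description>
HEADER_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def check_commit_header(header: str) -> list[str]:
    """Return a list of problems; an empty list means the header passes."""
    match = HEADER_PATTERN.match(header)
    if match is None:
        return ["header does not match '<type>(<scope>)!: <description>'"]
    problems = []
    if match["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match['type']}'")
    return problems
```

For example, `feat(auth): add token refresh` passes, while `fixed stuff` is rejected because it has no type prefix.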
A useful feature for reviews is the code owners file, which helps distribute responsibility among teams. If someone modifies code outside their designated part of the codebase, they must obtain additional sign-off on pull requests. This ensures that changes made by one team do not unintentionally break another team's module.
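The idea behind a code owners file can be sketched in a few lines of Python. This is a simplified model, not how any hosting platform implements it: real code owners files use gitignore-style patterns, while this sketch assumes plain path prefixes. It keeps the "last matching rule wins" behavior, and it assumes that authors do not need sign-off for code their own team owns. The team handles and paths are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OwnershipRule:
    path_prefix: str          # e.g. "packages/auth/"; "" matches everything
    owners: tuple[str, ...]   # hypothetical team handles


def owners_for(path: str, rules: list[OwnershipRule]) -> tuple[str, ...]:
    """Return the owners of a file; later rules override earlier ones."""
    matched: tuple[str, ...] = ()
    for rule in rules:
        if path.startswith(rule.path_prefix):
            matched = rule.owners
    return matched


def required_reviewers(
    changed_files: list[str],
    rules: list[OwnershipRule],
    author_teams: set[str],
) -> set[str]:
    """Collect owners whose sign-off is needed for files outside the author's teams."""
    required: set[str] = set()
    for path in changed_files:
        owners = owners_for(path, rules)
        if owners and not set(owners) & author_teams:
            required.update(owners)
    return required
```

With a rule that assigns `packages/auth/` to an auth team, a pull request from a billing developer that touches `packages/auth/token.py` would require the auth team's approval, while changes to the billing package would not need extra sign-off.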
Creating Shared Architecture
When collaborating with multiple teams, it’s easy to end up with mismatched requirements: feedback is missing, assumptions turn out to be wrong, and communication is difficult.
Big teams focus more on architecting code upfront, which creates space for discussion that can surface and resolve concerns early. While there are many architectural solutions and tools, the most important thing should be the review process. It can take the form of recurring meetings and should be focused on discussing design documents that describe future architectural changes.
This way, you can encourage communication and create a democratic way of making decisions.
Multiple Repositories or Monorepo
As the product and the team grow, you will face the challenge of dividing the code into smaller pieces.
This helps prevent conflicts when multiple developers modify the same file. The more modular the project, the lower the risk of issues. It’s also a good idea if the code is shared between different projects or if the goal is to open-source some part of the code.
The most straightforward idea would be to divide the project into separate packages, each living in its own repository, but that comes with a set of challenges.
Imagine that you moved a piece of code into a separate repository and publish it to a package registry. Let’s say it’s an “Auth” package.
If you find a bug in the “Auth” package, you need to commit the fix, update the package version, and publish. Now, you need to update the version of “Auth” in every package that depends on it. The more packages you have, and the more developers work on the same codebase, the more complex it gets.
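To get a feel for how this grows, the Python sketch below walks a dependency graph and lists every package that would need a version bump after “Auth” is republished. It assumes that packages pin exact versions, so the update has to ripple through transitive dependents as well. The package names other than “Auth” are made up for illustration, and `packages_to_update` is a hypothetical helper.

```python
from collections import deque


def packages_to_update(changed: str, depends_on: dict[str, set[str]]) -> list[str]:
    """List packages that directly or transitively depend on `changed`."""
    # Invert the graph: for each package, who depends on it?
    dependents: dict[str, set[str]] = {}
    for package, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(package)

    seen = {changed}
    order: list[str] = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        # Sorted only to make the output order deterministic.
        for dependent in sorted(dependents.get(current, ())):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order
```

Even in a tiny graph where an API client depends on “Auth” and two apps depend on the client, a single fix results in three follow-up updates, each with its own pull request, review, and release.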
To solve this issue, you can use a monorepo, where the project, or even multiple projects and separate packages, is stored in a single repository. That also creates new challenges, but there are tools that can help you:
manage dependencies, symlink the projects, and avoid duplication of dependencies,
share configs between the packages - type checks, linters, testing setup,
manage the deployments of independent packages and projects,
cache tasks to speed up their execution.
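The last item, task caching, usually relies on a simple idea: hash the inputs of a task, and if the same hash was seen before, reuse the stored result instead of running the task again. The Python sketch below shows that idea only; it is not how any specific monorepo tool works, and `TaskCache` and `fingerprint` are hypothetical names. It assumes a task's output is a string and that its inputs are fully described by a list of files.

```python
import hashlib
from pathlib import Path
from typing import Callable


def fingerprint(task_name: str, input_files: list[Path]) -> str:
    """Hash the task name plus the path and content of every input file."""
    digest = hashlib.sha256(task_name.encode())
    for path in sorted(input_files):
        digest.update(str(path).encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


class TaskCache:
    def __init__(self, cache_dir: Path) -> None:
        self.cache_dir = cache_dir
        cache_dir.mkdir(parents=True, exist_ok=True)

    def run(
        self,
        task_name: str,
        input_files: list[Path],
        task: Callable[[], str],
    ) -> tuple[str, bool]:
        """Return (output, cache_hit); the task runs only on a cache miss."""
        entry = self.cache_dir / fingerprint(task_name, input_files)
        if entry.exists():
            return entry.read_text(), True
        output = task()
        entry.write_text(output)
        return output, False
```

Running the same lint task twice on unchanged files executes it once; editing any input file changes the hash and forces a fresh run.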
Microservices, Micro-frontends, and Micro-apps
Working on a big codebase is rarely pleasant. There is a lot of friction when the boundaries between parts of the codebase are unclear. One way to address this issue is by splitting the codebase into more manageable chunks.
Deciding how to implement this approach is complex, but it offers several advantages. Teams can work independently on their parts of the app, tests run only on the affected module, and each part deploys independently. Development becomes smoother, pipelines run faster, and developers are no longer blocked when something goes wrong in a different team.
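Running tests only for the affected module depends on clear boundaries: if every module lives under its own directory, a pipeline can map changed files to modules and skip the rest. Here is a minimal Python sketch of that mapping. The module names and directory layout are hypothetical, and it deliberately ignores dependencies between modules.

```python
from pathlib import PurePosixPath


def modules_to_test(changed_files: list[str], module_roots: dict[str, str]) -> list[str]:
    """Return the names of modules that contain at least one changed file."""
    affected: set[str] = set()
    for file in changed_files:
        path = PurePosixPath(file)
        for module, root in module_roots.items():
            # Compares whole path components, so "apps/checkout-legacy"
            # is not mistaken for "apps/checkout".
            if path.is_relative_to(root):
                affected.add(module)
    return sorted(affected)
```

A change that touches only the checkout app and the shared UI package would trigger two test suites and leave every other module's pipeline untouched.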
Storybook - Shared Frontend Components
If the team builds multiple projects, the best way to maintain a unified design is to create a design system with a shared component library. Separating the components into a new package and deciding whether to store it in its own repository or in a monorepo isn’t the only challenge that you will face. The most important problem will be communication.
You want to avoid component duplication, understand how the redesign will impact the apps, and ensure developers know which components they can use. All of these challenges can be addressed with well-prepared designs and thorough documentation. On top of that, you also want the stakeholders to be able to see real components instead of designs, and you need a way to catch visual regressions in components.
All of that can be achieved by using Storybook, which allows you to create a separate web or mobile app with all your components. The biggest advantage of this tool is the possibility to interact with the components. You can easily set configurable properties that every stakeholder or developer can modify in an easy-to-use UI.
Automatic Checks
When working with a huge team, it’s harder and harder to control what’s happening with the code. Even with great communication, well-maintained documentation, and experienced developers, mishaps happen.
Automatic checks should occur at two levels: locally on the developer's machine and in the CI/CD pipeline. Why split them? Checks on CI/CD can be expensive, not only because of costly server time but also because an automated check that takes an hour or two would significantly slow down the team.
That’s why you can move some checks to pre-push hooks, such as verifying that the commit message follows your team’s established practices, running tests for modified modules, linting, type checking, or any other scripts that ensure nothing is overlooked before the commit undergoes CI/CD verification.
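A pre-push hook is just an executable script that Git runs before pushing; if it exits with a non-zero status, the push is aborted. The Python sketch below shows one way such a hook could chain several checks and stop at the first failure. `run_checks` is a hypothetical helper, and the commands a team would pass in (linter, type checker, test runner) depend entirely on the project.

```python
import subprocess
import sys


def run_checks(checks: list[tuple[str, list[str]]]) -> int:
    """Run each (name, command) in order; return the first failing exit code, or 0."""
    for name, command in checks:
        print(f"pre-push: running {name}...")
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"pre-push: {name} failed, push aborted", file=sys.stderr)
            return result.returncode
    return 0
```

A hook script would call this with the team's own commands and pass the result to `sys.exit`, so cheap checks such as linting run first and slower ones only run if those pass.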
The downside to local checks is that they can usually be skipped (for example, with git push --no-verify), and some people don’t have the patience to wait for all the checks. That’s why the most important ones, tests and security scans, should happen on a remote server.
Summary
Scaling the codebase is a difficult challenge, and many teams struggle with making their developers productive. The massive amount of friction that appears with each new developer working on the same code has to be resolved by careful planning and improving the developer experience.
The key to solving this issue is establishing shared communication about the architecture, splitting the codebase into smaller, more manageable parts, and leveraging automation. Most importantly, teams should prioritize feedback from developers, as they are the first to spot issues and can assess whether changes are truly beneficial or just adding unnecessary complexity.
Great article, a really comprehensive overview of the techniques we should all be using!
One challenge that I have (and I admit I'm biased) is that I don't really think of Gitflow as being straightforward. Especially if you compare it to other, much simpler strategies that work well with a CI/CD flow such as GitHub flow or trunk-based