Does team size impact app performance?

In 2014, Code Climate published a blog post investigating the impact of team size on code quality. I was curious: is there a correlation between team size and app performance as well?

Determining if an app is well-performing is more subjective than the GPA-like results a code quality analysis provides. I'll look at a number of traits that characterize apps with a healthy performance profile:

The dataset

I'm grabbing a day's worth of data from Ruby apps monitored by Scout. Team size is based on the number of users associated with a Scout account.

To reduce the impact of outliers, I've:

This gives us many hundreds of apps. While this is a small slice of web apps in the wild, it's the first analysis like this that I've found.

Do larger teams reduce N+1 queries?

An N+1 query is the same database query repeated once per record: typically one query loads a list, then one more query runs for each of its N items. These can usually be combined into a single query that significantly reduces the overall execution time. I was curious if larger teams - which generally have more involved code review processes - reduce the number of significant N+1 queries (significant = the sum of query time is 300 ms or greater).
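To make the "significant" threshold concrete, here is a minimal sketch of how repeated queries in a single request could be flagged. The query log format, the `significant_n_plus_ones` helper, and the `min_calls` cutoff are all assumptions for illustration; this is not how Scout detects N+1s.

```ruby
# Threshold from the text: summed query time of 300 ms or more.
SIGNIFICANT_MS = 300

# queries: array of [normalized_sql, duration_ms] pairs for one web request.
# This data shape and the min_calls cutoff are hypothetical.
def significant_n_plus_ones(queries, min_calls: 2, threshold_ms: SIGNIFICANT_MS)
  queries
    .group_by { |sql, _ms| sql } # bucket identical (normalized) statements
    .map { |sql, calls| { sql: sql, calls: calls.size, total_ms: calls.sum { |_sql, ms| ms } } }
    .select { |q| q[:calls] >= min_calls && q[:total_ms] >= threshold_ms }
end

# One query to load the posts, then one comments query per post.
queries = [["SELECT * FROM posts LIMIT ?", 12.0]] +
          Array.new(50) { ["SELECT * FROM comments WHERE post_id = ?", 7.5] }

significant_n_plus_ones(queries).each do |q|
  puts "#{q[:calls]} calls, #{q[:total_ms]} ms total: #{q[:sql]}"
end
```

In a Rails app, eager loading with ActiveRecord's `includes` is the usual way to collapse the repeated per-record query into a single one.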

To see if there was a correlation between the number of N+1s and team size, I calculated the Pearson correlation coefficient between team size and the number of N+1 queries for each app. The result: 0.13. The coefficient can range from -1 to 1, so this is only a weak positive correlation, and it certainly doesn't point to larger teams being more effective at reducing N+1s.
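For readers who want to run the same kind of check on their own numbers, here is a small sketch of the Pearson calculation in plain Ruby. The `pearson` helper and the sample arrays are illustrative assumptions; the values are made up and are not the data behind the 0.13 result.

```ruby
# Pearson correlation coefficient between two equally sized numeric arrays.
# Returns a Float between -1 and 1 (NaN if either array has no variation).
def pearson(xs, ys)
  unless xs.size == ys.size && xs.size > 1
    raise ArgumentError, "need two arrays of the same size with at least 2 values"
  end

  mean_x = xs.sum.to_f / xs.size
  mean_y = ys.sum.to_f / ys.size

  # Sum of the products of each pair's deviations from the means.
  covariance = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  spread_x = Math.sqrt(xs.sum { |x| (x - mean_x)**2 })
  spread_y = Math.sqrt(ys.sum { |y| (y - mean_y)**2 })

  covariance / (spread_x * spread_y)
end

# Made-up values, one entry per app.
team_sizes  = [1, 2, 3, 4, 5, 6, 7, 8]
n_plus_ones = [9, 14, 10, 13, 20, 15, 11, 14]

puts pearson(team_sizes, n_plus_ones).round(2)
```

A value near 0 means team size tells you little about the N+1 count; values near 1 or -1 would indicate a strong positive or negative linear relationship.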

Team Size    Avg. N+1 Queries Per App
1-2          10.9
3-4          12.2
5-6          17.8
7-8          12.6

Our data shows that larger teams are actually a bit more likely to write N+1 queries, although the trend isn't steady: the average peaks at team sizes of 5-6.
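The per-bucket averages in the table above could be produced along these lines. This is a sketch under assumptions: the record shape, the `bucket_for` and `average_by_bucket` helpers, and the sample values are hypothetical.

```ruby
# Map a team size to a two-wide bucket label such as "1-2" or "3-4".
def bucket_for(team_size)
  low = ((team_size - 1) / 2) * 2 + 1 # integer division rounds down
  "#{low}-#{low + 1}"
end

# apps: array of hashes, each with :team_size plus the metric being averaged.
def average_by_bucket(apps, metric)
  apps
    .group_by { |app| bucket_for(app[:team_size]) }
    .transform_values { |group| (group.sum { |app| app[metric] }.to_f / group.size).round(1) }
end

# Made-up sample records, not the dataset from this post.
apps = [
  { team_size: 1, n_plus_ones: 8 },
  { team_size: 2, n_plus_ones: 12 },
  { team_size: 3, n_plus_ones: 15 },
  { team_size: 6, n_plus_ones: 20 }
]

average_by_bucket(apps, :n_plus_ones).each do |bucket, avg|
  puts "#{bucket}: #{avg}"
end
```

The same helper works for any per-app metric, so the later tables in this post could reuse it with a different key.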

Do larger teams reduce memory bloat?

Many languages - Ruby included - are unlikely to free memory once it is allocated. Allocating more memory is slow and Ruby's interpreter assumes that if more memory is needed once, it's likely to be needed again. This makes Rails apps sensitive to memory bloat: one web request that triggers increased memory usage will have a long-running impact on the memory usage of the app.

Like N+1s, I was curious if larger teams are more effective at reducing memory bloat. To gather this data, I fetched the number of requests that triggered 100 MB+ memory increases. Interestingly, the Pearson correlation coefficient of endpoints triggering memory bloat vs. team size was 0.12, almost the same as the N+1 coefficient.

Team Size    Avg. 100 MB+ Requests Per App
1-2          5.24
3-4          5.17
5-6          7.5
7-8          11.3

Like N+1s, our data shows that larger teams are actually a bit more likely to run into memory bloat.

Do larger teams build apps with more predictable response times?

Widely varying response times for the same controller-action cause two problems:

  1. Technical: it's more difficult to scale and capacity plan
  2. Customer experience: it's more difficult to ensure consistent performance of the app across different customer sizes

Do larger teams build more predictable apps? To determine this, I looked at the median ratio of the 95th percentile response time vs. mean response time for the top 5 most time-consuming controller-actions in each app. The Pearson coefficient was 0.17, indicating another slight positive correlation between team size and less predictable apps.
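As a rough illustration of that measure, the sketch below computes the 95th percentile to mean ratio for each of an app's top controller-actions and takes the median. Several details are assumptions rather than the method used for this post: the nearest-rank percentile, ranking "most time-consuming" by total time, and the helper names and sample data.

```ruby
# Nearest-rank percentile of a list of numbers (pct between 0 and 100).
def percentile(values, pct)
  sorted = values.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

def mean(values)
  values.sum.to_f / values.size
end

def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# endpoints: hash of controller-action name => array of response times in ms.
# "Most time-consuming" is taken here as the largest total time (an assumption).
def volatility(endpoints, top: 5)
  busiest = endpoints.max_by(top) { |_name, times| times.sum }
  ratios = busiest.map { |_name, times| percentile(times, 95) / mean(times) }
  median(ratios)
end

# Made-up response times for a hypothetical app.
endpoints = {
  "UsersController#index" => [80, 90, 100, 110, 900],
  "PostsController#show"  => [40, 50, 60]
}

puts volatility(endpoints).round(2)
```

A ratio close to 1 means the slowest requests look much like the average one; the higher the ratio, the more a few slow requests stand apart from the typical response time.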

Team Size    Response Time 95th Percentile to Mean Ratio (Median)
1-2          3.82
3-4          5.18
5-6          4.12
7-8          4.88

Our data shows that larger teams are actually a bit more likely to build apps with more volatile response times.

Which service - Ruby, database, or HTTP calls - correlates most with unpredictable apps?

Next, I was curious to see which service layer correlated most closely with a high ratio of 95th percentile response time to mean response time. This might show which service is the greatest trigger of this response time volatility.

Time spent in database calls had the strongest correlation with this ratio (0.36), followed by time spent in Ruby (0.2) and HTTP calls (0.06).

Our data shows that database queries are a significant trigger of high response time variability across team sizes.

How does this compare to Code Climate's Code Quality Results?

From Code Climate:

"It appears easier to achieve a 4.0 GPA if you are a solo developer and additionally, the density of teams greater than 10 concentrates under the 3.0 GPA mark."

Like code quality, it appears that it gets harder to build apps with a healthy performance profile as your team size grows.

There are more questions I'd like to investigate:
