statistics

Greg's picture

Contributors to Drupal.org CVS since 2000

One measure of the momentum of the fine Drupal project is the number of people who are creating contributed modules on drupal.org.

The Drupal contributed projects are stored in CVS, and database tables on drupal.org keep track of each change by each person. At the request of some fine folks who are working on important things, I got interested in the trend in the number of people committing code to the drupal.org CVS server. Here is the data, graphed as the number of committers per month. This is not the number of commits, which would show how active those people are, but the number of people, which shows how large the group doing this work is.

Also, this is only about the contributed module and theme area and not about Drupal core. Drupal core commits are done by a very small group of people after that group reviews the code contributed by hundreds of contributors. So, this really shows the activity in the non-core projects.
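The difference between counting committers and counting commits can be sketched in a few lines of Python. This is only an illustration: the rows and the `committers_per_month` helper are hypothetical, and the real numbers come from the drupal.org commit log tables, whose schema is not shown here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical commit log rows: (committer, commit date). Not real data.
commits = [
    ("alice", date(2006, 6, 3)),
    ("alice", date(2006, 6, 17)),
    ("bob", date(2006, 6, 20)),
    ("alice", date(2006, 7, 1)),
]


def committers_per_month(rows):
    """Count distinct committers (not commits) for each (year, month)."""
    people = defaultdict(set)
    for name, day in rows:
        # A set keeps each person once per month, however often they commit.
        people[(day.year, day.month)].add(name)
    return {month: len(names) for month, names in sorted(people.items())}


print(committers_per_month(commits))
```

In the sample rows, June has three commits but only two committers, which is the quantity the graph tracks.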

Contributors to drupal.org contributed module repository

I've labeled 4 points on the graph.

1. 2006 through Drupal 5.0 slump

Point 1 shows a peak in June 2006, followed by a slowdown until the trough in August 2006 and then some small increases until December 2006. Then there is a huge increase in people in January and February of 2007, which is also when Drupal 5.0 was released.

2. 2007 Follows a similar contribution trend



Contributors to Drupal 7.x - End of Code Freeze Edition

Last week was the amazing Do It With Drupal conference and Angela Byron wanted some updated contributor statistics for her presentation. So, I analyzed the commit messages for Drupal core to find who has been helping out and once again the process and the data are getting better and better.

This time I'm using direct database information from the CVS commit log tables and using PHP to parse it, which makes it easier to create rules for fixing usernames or eliminating bad data. I also pulled in company information from groups.drupal.org to get a rough sense of which companies, as a group, are contributing the most to Drupal core. And, thanks to Dreditor, the commit messages are getting cleaner and include information about the people who have reviewed patches.
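As a rough sketch of what such parsing rules might look like (in Python rather than the PHP actually used), the snippet below pulls credited names out of commit messages and normalizes them. The message shape `#<issue> by <name>, <name>: <summary>`, the `ALIASES` table, and the sample messages are all assumptions made for illustration, not the real parser or real data.

```python
import re
from collections import Counter

# Assumed message shape: "#<issue> by <name>, <name>: <summary>".
CREDIT = re.compile(r"#\d+ by ([^:]+):")

# Hypothetical clean-up rules mapping misspellings to one canonical username.
ALIASES = {"damien tournaud": "damien tournoud", "davereid": "dave reid"}


def credited_names(message):
    """Return the normalized usernames credited in one commit message."""
    match = CREDIT.search(message)
    if not match:
        return []  # no credit line found: skip rather than guess
    names = []
    for raw in re.split(r",|\s+and\s+", match.group(1)):
        name = raw.strip().lower()
        if name:
            names.append(ALIASES.get(name, name))
    return names


def count_mentions(messages):
    """Tally how often each username is credited across all messages."""
    counts = Counter()
    for message in messages:
        counts.update(credited_names(message))
    return counts


# Made-up messages, for illustration only.
sample = [
    "- Patch #111 by catch, sun: fixed a notice.",
    "#222 by Damien Tournaud and catch: added tests.",
    "Updated CHANGELOG.",
]
print(count_mentions(sample).most_common())
```

Keeping the alias rules in a plain dictionary means a bad username found in the data can be fixed by adding one entry and re-running the count.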

Remember, none of this data is really perfectly accurate, but it gives us a tangible sense of what is going on.

Attached are a CSV file and an OpenOffice.org spreadsheet with the data. They show the uid of the user from groups.drupal.org, their name, their organization (if they specified one), the number of times they were mentioned as an author of a patch, the number of times they were mentioned as a reviewer of a patch, and the commit ID where they were mentioned. The commit ID is useful when chasing down bad data so that I can improve the parser. So, if you find a problem please let me know the CID value so I can improve the parser. There's a chance that this could eventually make it onto drupal.org itself, but I'd like to improve the process first to understand whether or not that makes sense.

Enough with the process - it's time to name names!

Top 10 patch contributors to Drupal 7 core

Username Patches
catch 267
sun 238
damien tournoud 213
chx 159
yched 150
dave reid 145
pwolanin 141
boombatower 113
c960657 93


Contributors to Drupal 7.x - Code Freeze Looming Update

The code freeze for Drupal 7.x is looming large on the horizon. From that point on we will be limited in what kinds of changes we can get into Drupal core. For some the code freeze is a time of relief: it means we are down to bug fixes and the final release should be coming soon. For others it is a hard time - bug fixing isn't always as fun as adding new features.

So, as we head into feature freeze it seemed like a good time to run some statistics on who has been contributing the most to Drupal 7.x so far.

Contributors to Drupal 7.x. Through August 10th

Following on from previous times that I've run these stats, I've published documentation of the process to get the data on groups.drupal.org. This time I went straight to the commit messages stored in database tables on drupal.org. This has the benefit of counting new files as well as old files (the previous runs only counted changes to existing files).

So, who are the top 10 people based on the number of times their name is in a commit message?

Name Commit mentions
Damien Tournoud 192
catch 179
chx 123
pwolanin 113
Dave Reid 109
boombatower 95
yched 77
c960657 57
drewish 56
Berdir 56

The total number of mentions is 3133, so those top 10 account for roughly a third of all mentions. On the flip side, people with 3 or fewer mentions account for roughly 16%, and that includes 222 people who are mentioned in only one message. This is a fairly typical "long tail" distribution: the people who are most involved do a lot of the work, but the people who only get mentioned a few times each are still responsible for a large number of commits when aggregated together.

Commit mentions Count of people with that number
1 222
2 80
3 38
4 17
5 15
6 13
7 8
8 7
9 5
10 6
11 1
12 3
13 3
14 5
15 2
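The shares quoted above can be checked with a few lines of Python. The counts are copied from the two tables in this post; the variable names are only illustrative, and mentions are treated as a rough proxy for contribution, as in the text.

```python
# Mention counts of the top 10 people, copied from the first table.
top_10 = [192, 179, 123, 113, 109, 95, 77, 57, 56, 56]

# Total number of mentions, as stated in the text.
total_mentions = 3133

# Number of people with exactly 1, 2 or 3 mentions, from the second table.
people_by_mentions = {1: 222, 2: 80, 3: 38}

top_10_share = sum(top_10) / total_mentions

# Each person with n mentions contributes n mentions to the tail.
tail_mentions = sum(n * people for n, people in people_by_mentions.items())
tail_share = tail_mentions / total_mentions

print(f"top 10: {top_10_share:.1%}, 3 or fewer mentions: {tail_share:.1%}")
```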


Drupal 7: Who is Providing Patches for the Next Release?

Quick update: this data misses any files added since Drupal 6.0 was created. With the new database and testing systems, that's a lot of files! So, these numbers need to be updated to include that data. They still give a good idea of the people who worked on everything except for Tests and DBTNG.

Let's face it: we're human and nothing gets our blood flowing like a little old fashioned competition. During the release of Drupal 6 I helped out to analyze the code and provide some statistics about the release. I published the method and the data that found some pretty interesting information:

  • There were about 206 contributors when measured this way
  • The top 10 individuals were credited in almost 40% of the patches
  • People who were credited on only 1 or 2 patches still provided just over 10% of the code for Drupal.

Recently someone asked me to run statistics again for Drupal 7 so far. Thanks to the attention to detail of the fine Drupal 7 maintainers (webchick and Dries), the commit messages give us all the info we need to see who has been involved in the code that is ultimately committed.

Drupal 7 Contributors So far

So, who are the current leaders in the race towards making Drupal 7 the most tested and usable release? Here are the top 5 individuals. As you can see these 5 people were involved in almost 25% of the patches.

Name Patches % of total Cumulative %
catch 46 6.19% 6.19%
pwolanin 40 5.38% 11.57%
Damien Tournoud 35 4.71% 16.29%
Dave Reid 33 4.44% 20.73%
chx 31 4.17% 24.90%
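The percentage columns in the table above can be reproduced with a short Python sketch. The patch counts are copied from the table; the total of 743 patches is an assumption inferred from the first row (46 patches at 6.19%), since the post does not state the total, and `cumulative_rows` is a hypothetical helper name.

```python
from itertools import accumulate

# Patch counts copied from the table above, in ranked order.
patches = {
    "catch": 46,
    "pwolanin": 40,
    "Damien Tournoud": 35,
    "Dave Reid": 33,
    "chx": 31,
}

# Assumption: inferred from 46 patches being 6.19%; not stated in the post.
TOTAL_PATCHES = 743


def cumulative_rows(counts, total):
    """Return (name, patches, % of total, cumulative %) for each person."""
    running = accumulate(counts.values())
    return [
        (name, count, round(100 * count / total, 2), round(100 * so_far / total, 2))
        for (name, count), so_far in zip(counts.items(), running)
    ]


for name, count, pct, cum_pct in cumulative_rows(patches, TOTAL_PATCHES):
    print(f"{name:<16} {count:>3} {pct:6.2f}% {cum_pct:6.2f}%")
```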


Drupal Download Statistics - January 2008 Data

Every quarter I try to munge and analyze the download data. The data for January is now available. Views continues its reign at the top of the module list. Images and WYSIWYG remain popular. Popular themes continue to be dominated by those that start with letters at the beginning of the alphabet.

Most Popular Drupal Modules

