Code Archeology research

Finding patterns in software ecosystems. This is a landing page for posting raw results. For more context, you probably want to visit my blog.

Apache Java Corpus

This corpus consists of all Apache projects on GitHub that are marked as Java. The list roughly agrees with the Java projects published on the Apache Software Foundation site.
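As a rough illustration of the selection rule, here is a minimal sketch that filters repository metadata by primary language. It assumes each record is a dictionary with `name` and `language` fields, as in the repository objects returned by the GitHub REST API. The function name `select_java_repos` and the sample records are hypothetical, and this is not necessarily how the corpus was actually assembled (for example, whether forks or archived repositories were excluded is not stated here).

```python
from typing import Iterable


def select_java_repos(repos: Iterable[dict]) -> list[str]:
    """Return the sorted names of repositories whose primary language is Java."""
    return sorted(
        repo["name"]
        for repo in repos
        # GitHub reports a single primary language per repository; it may be None.
        if repo.get("language") == "Java"
    )


# Hypothetical sample metadata for illustration only, not real corpus data.
sample = [
    {"name": "example-a", "language": "Java"},
    {"name": "example-b", "language": "Python"},
    {"name": "example-c", "language": None},
]

print(select_java_repos(sample))
```

Filtering on the primary language alone means mixed-language projects are included or excluded according to how GitHub classifies them, which may be one reason the list only roughly agrees with other sources.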

Carfax Corpus

This is my company's codebase. Source code is not available for independent verification, but summary results are published.

Qualitas Corpus

The Qualitas Corpus is something of a standard in code quality research.