Here are a few open-source practices which I find help to work comfortably over the long term, minimising both the low-value problems to solve and the cognitive load of the work. I would use them if I were to start a new project.
Note that this list is not exhaustive and is being completed progressively.
Organisation and project
- Use a repository per project: if you have several projects, split them into several repositories
- Avoid submodules: if you need access to some dependencies, do the work needed to make them properly available in your development setup
- Make your organization and project visuals beautiful and comfortable, and define the scope of your project concisely
- Introduce a `pre-commit` setup with linters, formatters, typo fixers, etc.
- Come up with an appropriate OSI-approved license for your project’s goals
- Craft a Contributor License Agreement
- Do a SWOT analysis for each of the dependencies you are using
Pull Request
- Enforce conventional commits for PR titles: this makes generating release changelogs easier
- Enforce the “Squash & Merge” method for PRs: this makes the git log readable, and readers can refer to the PR to understand the changes in detail
- Open draft PRs as soon as possible to make your work discoverable while communicating that your PR is not yet ready for review
- Create PR templates for contributors
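As a minimal sketch of how conventional-commit PR titles could be enforced, the check below validates a title of the form `type(scope)!: description`. The function name `check_pr_title` and the list of allowed types are assumptions made for illustration; a project would adapt them to its own convention and run the check in CI.

```python
import re

# Assumed set of allowed types; adjust it to the project's own convention.
ALLOWED_TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)"             # commit type, e.g. "feat"
    r"(?:\((?P<scope>[^()\s]+)\))?"  # optional scope in parentheses
    r"(?P<breaking>!)?"              # optional breaking-change marker
    r": (?P<description>\S.*)$"      # colon, space, non-empty description
)


def check_pr_title(title: str) -> bool:
    """Return True if the title looks like a conventional commit header."""
    match = TITLE_RE.match(title)
    return match is not None and match.group("type") in ALLOWED_TYPES
```

With “Squash & Merge”, the PR title becomes the commit header on the main branch, so validating the title alone is enough to keep the history parseable by changelog tools.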
Commit
- Sign commits
- Use the body of the message to motivate the changes and to explain them when they are non-trivial
- Use atomic “Conventional Commits” as much as possible in PRs
- Credit co-authors of changes using the `Co-authored-by:` trailer in the commit message. To ease this, one can create a `.gitmessage` file which is referenced in one’s git configuration
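To illustrate the layout of the `Co-authored-by:` trailer, here is a small sketch that appends one trailer per co-author to a commit message. The helper name `add_co_authors` and the sample names and addresses are hypothetical; the only convention relied upon is that trailers form the last paragraph of the message, one `Co-authored-by: Name <email>` line per person.

```python
def add_co_authors(message: str, co_authors: list[tuple[str, str]]) -> str:
    """Append one Co-authored-by trailer per (name, email) pair."""
    trailers = [f"Co-authored-by: {name} <{email}>" for name, email in co_authors]
    if not trailers:
        return message
    # Trailers are the last paragraph, separated from the body by a blank line.
    return message.rstrip("\n") + "\n\n" + "\n".join(trailers) + "\n"


if __name__ == "__main__":
    print(add_co_authors(
        "fix: handle empty input",
        [("Jane Doe", "jane@example.com"), ("John Roe", "john@example.com")],
    ))
```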
Issues and triaging
- Create issue templates for users to provide a reproducer of their issue and information about their environment
- Dedicate time to triaging new issues
- Dedicate time to triaging stale PRs
- Dedicate time to triaging stale issues
Reviewing
- Suggest changes using the passive voice and questions
- Specify whether suggestions are required or whether they are nitpicks
- Use suggestions on GitHub when possible to make reviews actionable
- Use permalinks of parts of the code-base for precise communication
- Implement each suggestion of the review in a separate commit: it makes the life of the reviewer easier. One can use `git add -p` to interactively select which changes to stage for a particular commit.
Security
- Specify the security policy using a `SECURITY.md` file
- Make sure all maintainers use Two Factor Authentication
Test suite and quality assurance
- Measure test coverage and aim for more than 90%
- Use the latest builds of dependencies every time for development, to catch and fix issues as soon as possible
- Measure technical debt and monitor it
- Flag flaky tests and set aside time to resolve their root causes
- Monitor performance regressions in PRs
- Identify low-value tasks and set up systems so that engineers do not have to perform them in a hurry
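One possible way to flag flaky tests automatically is sketched below, under the assumption that CI results can be exported as `(commit, test name, passed)` records. A test is treated as flaky when it both passed and failed on the same commit, since the code did not change between those runs. The record layout and the function name `find_flaky_tests` are hypothetical.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return the sorted names of tests with mixed outcomes on one commit.

    `runs` is an iterable of (commit_sha, test_name, passed) tuples.
    """
    outcomes = defaultdict(set)
    for commit, test, passed in runs:
        outcomes[(commit, test)].add(bool(passed))
    # Both True and False seen for the same (commit, test) pair: flaky.
    return sorted({test for (_, test), seen in outcomes.items() if len(seen) == 2})
```

A test that fails consistently on one commit and passes consistently on another is not reported, because that pattern points to a real regression or fix and not to flakiness.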
Versioning and Release
- Use Semantic Versioning
- Automate releases, generating the changelog from the commit history (for example using git-cliff)
- Have branches for minor or major versions to backport bug fixes if needed
- Assess whether to perform a minor or a major release based on what has reached the development branch (using branches for minor versions)
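The link between conventional commits and Semantic Versioning can be sketched as follows: a breaking change bumps the major version, a `feat` bumps the minor version, and a `fix` bumps the patch version. This is only an illustration under simplifying assumptions (plain `MAJOR.MINOR.PATCH` strings, no pre-release or build metadata, no special handling of `0.x` versions); the function name `next_version` is hypothetical.

```python
import re

# Header of a conventional commit: type, optional scope, optional "!".
HEADER_RE = re.compile(r"^(?P<type>[a-z]+)(?:\([^()]*\))?(?P<breaking>!)?: ")


def next_version(current: str, commit_messages: list[str]) -> str:
    """Compute the next version from the commits merged since `current`."""
    major, minor, patch = (int(part) for part in current.split("."))
    level = 0  # 0: no release, 1: patch, 2: minor, 3: major
    for message in commit_messages:
        match = HEADER_RE.match(message)
        if match is None:
            continue  # not a conventional commit: ignored in this sketch
        if match.group("breaking") or "BREAKING CHANGE:" in message:
            level = max(level, 3)
        elif match.group("type") == "feat":
            level = max(level, 2)
        elif match.group("type") == "fix":
            level = max(level, 1)
    if level == 3:
        return f"{major + 1}.0.0"
    if level == 2:
        return f"{major}.{minor + 1}.0"
    if level == 1:
        return f"{major}.{minor}.{patch + 1}"
    return current
```

Running this over the squashed commits that reached the development branch since the last tag gives a first answer to whether the next release should be a patch, a minor, or a major one.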
C++
- Consider whether you really have to use this language over more recent alternatives such as Rust
- Enforce a sane subset of the language. I tend to agree with most of Chromium’s conventions. In particular, consider whether you need particular features from the next standard to effectively deliver value for end-users or to make your team’s life easier.
- Use all possible sanitizers in CI; you might need different workflows because some of them cannot be enabled together
- Use hardening mode for debug builds
- Make sure to use `libtool`’s binary versioning convention for your projects’ ABI
- Take care to expose only the symbols which have to be exposed, by decorating the public API (adopt on other platforms the practices which are necessary on Windows)
- Use `#pragma once` over include-guard macros in headers
- Test against all implementations of the standard library, compiler toolchains, and OSes
- For existing projects with Python bindings, try to port the bindings to nanobind rather than keeping their existing solution (pybind11, protocol buffers, Cython, SWIG, …), as nanobind’s advantages are undeniable. On a new project, design it to use nanobind (it might require slight adaptations of your implementations).
- Use at least C++17 (best tradeoff between coverage of what has been implemented upstream and usability; nanobind requires C++17, for instance)
- If you want to distribute your project for some platforms, make sure that the dependencies you use are actually designed for the distribution model you are targeting.
- If you distribute multithreaded implementations, make sure that the thread pools which are used across the stack are consistent with the ones of your dependencies (e.g. some distributions of implementations of BLAS like OpenBLAS can be built with `pthread` or OpenMP)
- Strip symbols before distributing ELF files
- `perf(1)` is the perfect tool to understand hotspots; speedscope and Firefox Profiler also come in handy for analysing records. Make sure to compile with frame pointers if you perform performance analysis. If you want to know the original source code lines of the hotspots, compile with debug symbols and use `perf-annotate(1)`
- Use Compiler Explorer for understanding how each compiler generates code for the targeted platform.
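The `libtool` versioning convention mentioned in the list above uses a `current:revision:age` triple, which is not a semantic version. The sketch below models the update rules as I understand them from the libtool manual, together with the file suffix libtool derives from the triple on Linux. The class and method names are hypothetical, and other platforms map the triple differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionInfo:
    current: int   # most recent interface number implemented
    revision: int  # implementation number of the current interface
    age: int       # number of previous interfaces still supported

    def bump(self, *, code_changed=False, interfaces_added=False,
             interfaces_removed_or_changed=False) -> "VersionInfo":
        """Apply the update rules once, just before a public release."""
        current, revision, age = self.current, self.revision, self.age
        if code_changed or interfaces_added or interfaces_removed_or_changed:
            revision += 1
        if interfaces_added or interfaces_removed_or_changed:
            current += 1
            revision = 0
        if interfaces_added:
            age += 1
        if interfaces_removed_or_changed:
            age = 0  # backward compatibility is broken
        return VersionInfo(current, revision, age)

    def linux_suffix(self) -> str:
        """Suffix after ".so." on Linux: (current - age).age.revision."""
        return f"{self.current - self.age}.{self.age}.{self.revision}"


if __name__ == "__main__":
    v = VersionInfo(3, 2, 1)
    print(v.linux_suffix())
    print(v.bump(interfaces_added=True).linux_suffix())
```

The first component of the suffix only changes when interfaces are removed or changed, which is what lets binaries linked against an older compatible release keep loading the newer library.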
Python
- With the advent of code generation, some scripting languages like Python are losing their relevance (fast development and accessibility) for some applications: prefer using them for scripts and bindings only
- If you distribute open-source scientific software, distribute it on conda-forge. The Python wheel format and PyPI are being adapted to better support scientific packages via the WheelNext project. pypackaging-native comes in handy to learn more about packaging scientific software
- Design your project so that it can be distributed for free-threaded builds
- Come up with stubs for your Python APIs, but mind their maintenance cost. Also prefer an intuitive API over a complex pseudo-type system for your APIs’ arguments.