Here is a non-exhaustive, continually growing list of open-source contributions (e.g. Pull Requests I had the chance to author or co-author).
Note that contributions come in many flavours and need not be technical.
scikit-learn
100+ Pull Requests, including:
- ⭐ PairwiseDistancesReduction: some innermost cornerstone implementations
- Improve the creation of KDTree and BallTree from \(\mathcal{O}(n^2)\) to \(\mathcal{O}(n)\)
400+ reviews, including 100+ reviews involving Cython, e.g.:
- Common Private Loss Module with Tempita
- Scalable MiniBatchKMeans plus cln / fixes / refactoring
- Monotonic Constraints for Tree-based models
- Online implementation of non-negative matrix factorization
- Interaction constraints for HGBT
- Array API support to LinearDiscriminantAnalysis
Events:
- 🇪🇺 🐍 EuroPython 2022 Talk and Sprint
- 🇪🇺 🔬 EuroScipy 2022 Talk and Sprint
- 📖 see the slides
- PyLadies mentoring sessions
- 🪐 JupyterCon 2023
- 🌳 Scientific Python 2023 Developer Summit
- 🇪🇺 🔬 EuroScipy 2023 Maintainer Track and Sprint
SciPy
Contributions:
- Fix KMeans++ initialisation slowness
- Faster implementation of scipy.sparse.block_diag
- Fix MLE for Nakagami (nakagami_gen.fit)
Reviews:
- Dinic’s algorithm for maximum_flow
- Speed up sparse.csgraph.dijkstra
- Extending _distance_pybind with additional distance metrics
- Minimum Cost Flow
Scientific Python
Miscellaneous
- My very first contribution was a pragmatic performance improvement to a C++ implementation of a fingerprint extraction algorithm.