I started working on the library as soon as I joined Inria, in late 2020, and have been contributing ever since.
At its inception, the library had one objective: encoding non-normalized tabular data. For a congress at École Normale Supérieure Paris-Saclay, I created a short video showcasing the use cases and tools provided by the library at the time.
You should check it out if you’re interested in what the lib does!
Starting in 2023, we expanded its scope significantly, and I’m happy to have contributed to the decisions (both technical and philosophical), to the technology, and to its promotion at multiple conferences and events!
Although it’s not my full-time job anymore, I’m still contributing to the library on and off, most often by giving input on its direction and on technical decisions.
Thanks for reading! Check out the project on GitHub!
- skrub: prepping tables for machine learning. In Proceedings of the 15th European Conference on Python in Science, 2023
- dirty_cat: a library for machine learning on dirty categorical data. In Proceedings of the 14th European Conference on Python in Science, 2022
- dirty_cat: a Python package for Machine Learning on Dirty Categorical Data. In Proceedings of the 1st Paris-Saclay University Multidisciplinary Junior Congress, 2022