Overview
I actively promote open source software in the field of machine learning. I have co-organized workshops on machine learning open source software. Furthermore, I was involved in establishing a new track in the Journal of Machine Learning Research focusing on open source software, for which I am now one of the action editors. Recently, I became a Debian Developer, and I now package machine learning and other scientific software for Debian. Finally, I am the main author of the open source Shogun machine learning toolbox, which includes all of the algorithmic implementations I used and developed in my research.
Machine Learning Open Source Software
Debian
In September 2008, I became an official Debian developer. I would like to take the opportunity to thank my sponsor Torsten Werner for all his work and endless patience, which made creating Debian packages fun. We still meet occasionally at C-Base to do package maintenance and discuss open source and Debian-related issues. Without Ana Beatriz Guerrero López, who took on the burden of being my Application Manager for almost 10 months, I would not have made it into the project, so thanks!
Given my machine learning background, I am naturally interested in packaging all kinds of machine learning software for Debian. Since machine learning software benefits from other scientific fields such as operations research, I package software from those fields as well. Together with Aramian Wasilek, I am packaging the Coin-OR mathematical programming software. The full list of packages I maintain can be seen here.
Machine Learning Data Set Repository
Patrik Hoyer, Cheng Soon Ong, Mikio Braun and I are behind an effort to standardize machine learning data formats and to create a machine learning data set repository called mldata.org, closely modeled on mloss.org. Our aim is to provide a service to the machine learning community that enables reproducible research with as little effort as possible. People will be able to upload and download data sets in various formats, including our newly developed HDF5-based standard format. On such data sets they can define learning tasks, group tasks into challenges, and submit solutions to particular tasks to get live feedback. We will publicly announce this website at the NIPS*2010 demo session.