Torch7
Torch7 provides a Matlab-like environment for state-of-the-art machine learning algorithms. It is easy to use and provides a very efficient implementation, thanks to an easy and fast scripting language (Lua) and an underlying C/OpenMP/CUDA implementation.

Torch7's plotting capabilities.
Torch7: A Matlab-like Environment for Machine Learning
Torch7 is not officially released yet, but it is now close to final. Torch7 supersedes Torch5. For those who were using my XLearn packages: they have been ported to Torch7 and are now distributed as Torch packages (installable via Torch's own package manager, torch-pkg).
If you want to use Torch7, and cannot wait for the official release, you can easily get started by following the installation instructions on this temporary page. In a couple of lines, it amounts to:
$ git clone https://github.com/andresy/torch
$ cd torch
$ mkdir build; cd build
$ cmake ..
$ make
$ [sudo] make install
In terms of code base, Torch7 and all our third-party modules are hosted on GitHub. Here's a non-exhaustive list:
- Torch7
- an image toolbox
- a toolbox for graphs on images
- an unsupervised learning package
- an optimization package (tuned for stochastic problems)
- a Matlab-Torch7 interface
- a parallel computing env for Lua
- a camera interface package
- the neuFlow compiler/toolkit
Note: all of the packages listed above are packaged as Torch packages. Given a proper Torch7 install, you can easily install any of them, like this:
$ torch-pkg install image imgraph parallel
Torch7 is the official successor of the very cool Torch5, an original work from Ronan Collobert. It is now developed and maintained by Ronan Collobert, Koray Kavukcuoglu and me. Here is a short paper that describes Torch7. It will be presented at the Big Learn workshop this year. One of the coolest features of Torch7 is speed:

Torch7 is fast. Benchmarks of Torch7 (red stripes) versus Theano (solid green), while training various neural network architectures with SGD. We considered a single CPU core, OpenMP with 4 cores, and GPU alternatives. Performance is given in number of examples processed per second (higher is better). "batch" means 60 examples were fed at a time when training with SGD.
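To make the metric in that caption concrete, here is a minimal sketch of how "examples processed per second" can be measured for SGD with and without batching. This is not the benchmark code behind the figure: it is plain Python rather than Torch7/Lua, it trains a toy linear model on synthetic data, and every name in it (`make_data`, `sgd_epoch`, `examples_per_second`) is hypothetical. Only the batch size of 60 comes from the caption. The numbers it prints are not comparable to the ones in the plot.

```python
import random
import time


def make_data(n_examples, n_features, seed=0):
    """Build a synthetic linear-regression dataset (stand-in data, not the benchmark's)."""
    rng = random.Random(seed)
    true_w = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
    xs = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
          for _ in range(n_examples)]
    ys = [sum(w * x for w, x in zip(true_w, row)) for row in xs]
    return xs, ys


def sgd_epoch(weights, xs, ys, batch_size, lr=0.05):
    """Run one pass of mini-batch SGD on squared error; return examples processed."""
    n_features = len(weights)
    processed = 0
    for start in range(0, len(xs), batch_size):
        batch_x = xs[start:start + batch_size]
        batch_y = ys[start:start + batch_size]
        grad = [0.0] * n_features
        for row, target in zip(batch_x, batch_y):
            err = sum(w * x for w, x in zip(weights, row)) - target
            for j in range(n_features):
                grad[j] += err * row[j]
        # Average the gradient over the batch, then take a single step.
        for j in range(n_features):
            weights[j] -= lr * grad[j] / len(batch_x)
        processed += len(batch_x)
    return processed


def examples_per_second(batch_size, n_examples=600, n_features=8, epochs=3):
    """Time a few epochs and report throughput in examples per second."""
    xs, ys = make_data(n_examples, n_features)
    weights = [0.0] * n_features
    start = time.perf_counter()
    total = 0
    for _ in range(epochs):
        total += sgd_epoch(weights, xs, ys, batch_size)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    return total / elapsed


if __name__ == "__main__":
    # batch size 1 is plain SGD; 60 matches the "batch" setting in the caption
    for bs in (1, 60):
        print("batch size %d: %.0f examples/s" % (bs, examples_per_second(bs)))
```

With batching, the weights are updated once per 60 examples instead of once per example, which is why the two settings are reported separately in benchmarks of this kind.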