There are a number of open source Machine Learning packages available for Python, including Statsmodels, PyML, PyBrain, MLPy, milk, and scikit-learn. These packages come with implementations of various regression and classification algorithms, from Logistic Regression and Support Vector Machines to Random Forest and Neural Networks. Each package has its own pros and cons, and some packages focus on particular algorithms. scikit-learn is my personal favourite ML Python package. Some other useful packages that I have installed include SciPy, a library of scientific computing routines; NumPy, an N-dimensional array package; Matplotlib for 2D plotting; Pandas for data structures and analysis; and IPython, an interactive console for running Python code.
R has a mature set of statistical packages on offer which can be called from Python. Thus, I set about installing R, then RPy2, and finally RMetrics, which is the most comprehensive R package for analysing financial time series. I decided to build R from source rather than download a binary. Use wget to download the source.
-bash$ wget http://ftp.heanet.ie/mirrors/cran.r-project.org/src/base/R-3/R-3.0.1.tar.gz
Extract using tar, and enter the source directory.
-bash$ tar -zxvf R-3.0.1.tar.gz ; cd R-3.0.1
You must configure the build with the --enable-R-shlib option, as this builds R as a shared library, which is a prerequisite for the RPy2 installation.
-bash$ ./configure --prefix=$HOME/.local --enable-R-shlib
The R make process can take a while, so I put it into the background and detach from it with disown so that it does not terminate if I close my shell. I redirect stdout and stderr to a text file.
-bash$ make &> make.txt &
-bash$ disown -h
I can then keep track of the make progress by tailing this file.
-bash$ tail -f make.txt
Once the make is complete I install.
-bash$ make install
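Before moving on to rpy2 it can be worth confirming that the shared library was actually built. The following is a minimal sketch, not part of the original steps: the helper name find_libr is hypothetical, and the directory layout it checks (lib/R/lib or lib64/R/lib under the prefix, with the file named libR.so or libR.dylib) is an assumption that may differ on your platform.

```python
from pathlib import Path
from typing import Optional


def find_libr(prefix: Path) -> Optional[Path]:
    """Return the first R shared library found under prefix, or None."""
    # Assumed layout: <prefix>/lib/R/lib or <prefix>/lib64/R/lib,
    # depending on the platform. Adjust if your install differs.
    for libdir in ("lib", "lib64"):
        for name in ("libR.so", "libR.dylib"):
            candidate = prefix / libdir / "R" / "lib" / name
            if candidate.is_file():
                return candidate
    return None


if __name__ == "__main__":
    # The prefix matches the --prefix=$HOME/.local passed to ./configure.
    found = find_libr(Path.home() / ".local")
    if found:
        print(found)
    else:
        print("libR not found; was R configured with --enable-R-shlib?")
```

If nothing is found, the likely causes are a different install layout or a build that was configured without --enable-R-shlib.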
With R successfully installed I download the latest rpy2 package and extract.
-bash$ wget -O rpy2-2.3.1.tar.gz "http://sourceforge.net/projects/rpy/files/latest/download?source=files"
-bash$ tar -zxvf rpy2-2.3.1.tar.gz ; cd rpy2-2.3.1
Next, update the relevant environment variables in your .bash_profile. These will vary depending on your installation, so check the rpy2 installation guidelines for details. Finally, install!
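As a rough illustration of what those .bash_profile entries might look like, here is a small sketch that prints candidate export lines for a given install prefix. The helper name rpy2_env is hypothetical, and the choice of variables (PATH, R_HOME, LD_LIBRARY_PATH) and the lib64 directory are assumptions about a typical Linux install, not something the rpy2 guidelines are quoted as requiring.

```python
from pathlib import PurePosixPath


def rpy2_env(prefix: PurePosixPath, libdir: str = "lib64") -> dict:
    """Build candidate environment settings for an R install under prefix."""
    # Assumption: R's home directory is <prefix>/<libdir>/R, and the
    # shared library lives in its lib subdirectory.
    r_home = prefix / libdir / "R"
    return {
        "PATH": f"{prefix / 'bin'}:$PATH",
        "R_HOME": str(r_home),
        "LD_LIBRARY_PATH": f"{r_home / 'lib'}:$LD_LIBRARY_PATH",
    }


# Print lines that could be pasted into .bash_profile after checking them
# against your own installation.
for name, value in rpy2_env(PurePosixPath("$HOME/.local")).items():
    print(f'export {name}="{value}"')
```

On systems where R installs under lib rather than lib64, pass libdir="lib" instead.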
-bash$ python setup.py install
I then ran a test Python script from the rpy2 introduction.
import rpy2.robjects as robjects
pi = robjects.r['pi']
print(pi[0])
And the script output the value of pi as expected!
-bash$ python rp2_test.py
3.141592653589793
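Looking up a constant is the simplest possible check. A slightly fuller sketch, in the same spirit as the test script above, passes Python data to an R function and reads the result back. The function name r_mean is hypothetical, and the fallback to None when rpy2 cannot be imported is my own addition so that the script still runs on a machine without rpy2.

```python
try:
    import rpy2.robjects as robjects
except Exception:  # rpy2 is missing, or R could not be located at import time
    robjects = None


def r_mean(values):
    """Compute the mean of values with R's mean(); None if rpy2 is unavailable."""
    if robjects is None:
        return None
    r_vector = robjects.FloatVector(values)  # copy the floats into an R numeric vector
    result = robjects.r['mean'](r_vector)    # look up R's mean function and call it
    return result[0]                         # R returns a length-one vector


if __name__ == "__main__":
    print(r_mean([1.0, 2.0, 3.0]))
```

The same pattern (look up a function by name through robjects.r, then call it with R vectors) should carry over to other R functions.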
Next I installed the excellent RMetrics from the R shell.
-bash$ R
> source("http://www.rmetrics.org/Rmetrics.R")
> install.Rmetrics()