A beginner’s guide to PyML

6 May

Hi all,

today I’m back with a simple and short tutorial on using PyML.

For a start, PyML is a machine learning library written in Python. It’s fairly straightforward and easy to use, and I have found it useful for my daily machine learning needs.

To put it in the simplest terms, machine learning can generally be divided into supervised and unsupervised learning. Supervised learning is where we have the ‘correct’ answers (labels) for the training data, whereas no correct answers are available in unsupervised learning.
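To make the difference concrete, here is a tiny plain-Python sketch (it does not use PyML). The data values and the nearest_label helper are made up for illustration: the labelled list is what a supervised learner gets to see, while the unlabelled list is all an unsupervised learner would have.

```python
# Supervised learning: every vector comes paired with its 'correct' answer.
labelled = [([1, 2], 'True'), ([8, 9], 'False')]

# Unsupervised learning: the same vectors, but no answers attached.
unlabelled = [[1, 2], [8, 9]]


def nearest_label(point, examples):
    """Return the label of the closest labelled example.

    Hypothetical helper for illustration only: it compares squared
    Euclidean distances and copies the label of the nearest vector.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best_vector, best_label = min(examples, key=lambda ex: dist2(point, ex[0]))
    return best_label


print(nearest_label([2, 3], labelled))
```

Without the labels there is nothing to copy an answer from, which is why unsupervised methods look for structure in the vectors themselves instead.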

I’m currently focused on supervised learning, and hence this short tutorial is for supervised learning.

Firstly, you need to install PyML, and you can get it from here [ http://sourceforge.net/projects/pyml/ ]

Download the source files, untar or unzip, change directory into the folder and run the command:

sudo python setup.py install

*Be sure to use PyML version 0.7.9, as this tutorial is based on that version.
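If you are not sure whether the install went through, a quick check like the one below can help. This is only a sketch: is_installed is a helper name I made up, and it only tells you whether the module can be imported, not which version you have.

```python
def is_installed(module_name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False


# Prints True once PyML is importable by the Python interpreter you are running.
print(is_installed("PyML"))
```

Run it with the same Python interpreter you used for setup.py, otherwise the result can be misleading.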

 

Next, you can start using PyML by importing the library and its components:

from PyML import *
# this imports the most frequently used components of PyML

 

We can create training data by using a list of lists. Take for example:


X = [[1,2,3,4,5],[2,3,4,5,6],[3,4,5,6,7]]
# where each inner list represents an arbitrary vector

 

Next, we use the VectorDataSet object:

data = VectorDataSet(X,L=['True','False','True'])

Note that the L argument takes a list of labels, i.e. the correct answers for the vectors in X. Most importantly, there must be exactly one label for every vector.
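Since a mismatch between vectors and labels is an easy mistake to make, it can be worth checking the lists before handing them to VectorDataSet. The check_labels function below is my own helper, not part of PyML, and the equal-length check on the vectors is an extra assumption about what you want from your data.

```python
def check_labels(vectors, labels):
    """Raise ValueError unless there is one label per vector
    and every vector has the same number of features."""
    if len(vectors) != len(labels):
        raise ValueError(
            "expected %d labels, got %d" % (len(vectors), len(labels)))
    lengths = set(len(v) for v in vectors)
    if len(lengths) > 1:
        raise ValueError(
            "vectors have inconsistent lengths: %s" % sorted(lengths))
    return True


X = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]
L = ['True', 'False', 'True']

# Passes for the example data, so it is safe to go on to VectorDataSet(X, L=L).
check_labels(X, L)
```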

Now that we have the data prepared, let’s start training the classifier!

Create an SVM() object, then train it on the data:


s = SVM()
s.train(data)

 

We can save the trained model should we want to (fileName is a string holding the path to write to):

s.save(fileName)

 

Now that we have completed the preparation work, it’s time to test the classifier.

We start off by loading the model which we have saved:


from PyML.classifiers.svm import loadSVM
loadedSVM = loadSVM(fileName,data)

 

We then create some test data and test the classifier:


testX = [[...],[...]] # some arbitrary test data
testL = ['True','False'... ] # some arbitrary labels for the test data
testData = VectorDataSet(testX,L=testL)
r = loadedSVM.test(testData)
print r # this shows the results
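To get a feel for what a test result summarises, here is a plain-Python sketch of the simplest measure, the fraction of test labels the classifier got right. The accuracy function and the example label lists are made up for illustration and are not PyML's own results API.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("no labels to compare")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return float(correct) / len(actual)


# Hypothetical predictions compared against hypothetical true labels.
predicted = ['True', 'False', 'True', 'True']
actual = ['True', 'False', 'False', 'True']
print(accuracy(predicted, actual))  # 3 of 4 match, so this prints 0.75
```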

 

 

That’s all folks! Hope this helps!

 

An interesting Pricing Table design by FanBridge

25 Apr

I was browsing around the web and came across a really interesting Pricing Table design by FanBridge:

Notice how the design reflects the positive relation between the price and the number of features?

 

Looks pretty cool.

A Promising Realtime, Live Updating JavaScript Framework – Meteor

11 Apr

Just saw a new JavaScript framework worth mentioning: the Meteor JavaScript Framework [ http://www.meteor.com ]

You can view the demos here : http://www.meteor.com/examples/

What makes it really different is the ‘live page update’ feature; see the demos to get a feel for what I mean.

Sounds really cool to me.

You can follow them on Twitter: https://twitter.com/#!/meteorjs

Finally, developers can follow them on GitHub: https://github.com/meteor/meteor

Nice website on mobile user interface

8 Apr

A nice website on mobile user interfaces that I happened to bump into.

http://mobile-patterns.com

 

Management. The Steve Jobs Way.

30 Dec

A nice article on management, the Steve Jobs way.

 

http://www.openforum.com/articles/steve-jobs-style-everything-we-know-about-managing-is-wrong

 

How not to write a CS paper

14 Dec

Wow, a short but interesting read :

http://www.cs.cmu.edu/~jrs/sins.html

It talks about the sins of writing a CS paper.

Google’s New Sign In, Log In Page

20 Aug

Google now has a new sign in and login page.

You just have to go to a Google account page, and then take a look at the bottom.

You might be given a choice to try the new sign in/login page.

 

Here’s a screenshot of it if you do not have access to it:

A Twitter Storm is Approaching

6 Aug

It seems that Twitter is going to open source a very powerful data processing framework which is superficially similar to Hadoop.

The article from Twitter states that:

“A Storm cluster is superficially similar to a Hadoop cluster. Whereas on Hadoop you run “MapReduce jobs”, on Storm you run “topologies”.”

Go check out the article right now.

Best Resources for Learning Chrome Extension Development. A summary.

3 Aug


I was required to build an internal tool for migrating data from a deprecated platform to a platform based on WordPress.

Due to various complications, the migration of data could not be done via server-side programmatic means.

Therefore I looked into Google Chrome extensions as a viable means to perform this task.

So here’s a summary of the articles/resources I have read to get my job done:

If you are a beginner:

Useful examples on how to use the APIs

Useful Answers to Various Chrome Extension Development Issues:

Best Resources to Learn about Node.js. A Summary.

3 Aug

I discovered Node.js some time ago (when it was at version 0.2.x), and a few days ago I finally had the time to dabble with it. The following is a summary of the resources I have collected and found useful while learning about Node.js.

If you are just starting out, you can check out:

If you are into using frameworks for Node.js, here are two you should try:

For people who are into Social Authentication, you can refer to:

Need support for various databases? Here are some for NoSQL databases:

For people who are slightly more advanced:

I’ll update this post when I discover more useful resources for Node.js.

Cheers!

Wordpress Code Snippet by Allan Collins