Thursday, October 6, 2011

The Perceptron

In the field of pattern classification, the purpose of a classifier is to use the object's characteristics to identify which class it belongs to. The Perceptron is a classifier and it is one of the simplest kind of Artificial Neural Network. In this post we will see a Python implementation of the Perceptron. The code that we will see implements the schema represented below.


In this schema, an object (pattern) is represented by a vector x and its characteristics (features) are represented by the vector's elements x_1 and x_2. We call the vector w, with elements w_1 and w_2, the weights vector. The values x_1 and x_2 are the input of the Perceptron. When we activate the Perceptron each input is multiplied by the respective weight and then summed. This produces a single value that it is passed to a threshold step function. The output of this function is the output of the Perceptron. The threshold step function has only two possible output: 1 and -1. Hence, when it is activated his reponse indicates that x belong to the first class (1) or the second (-1).

A Perceptron can be trained and we have to guide his learning. In order to train the Perceptron we need something that the Perceptron can imitate, this data is called train set. So, the perceptron learns as follow: an input pattern is shown, it produces an output, compares the output to what the output should be, and then adjusts its weights. This is repeated until the Perceptron converges to the correct behavior or a maximum number of iteration is reached.

The following Python class implements the Percepron using the Rosenblatt training algorithm.
from pylab import rand,plot,show,norm

class Perceptron:
 def __init__(self):
  """ perceptron initialization """
  self.w = rand(2)*2-1 # weights
  self.learningRate = 0.1

 def response(self,x):
  """ perceptron output """
  y = x[0]*self.w[0]+x[1]*self.w[1] # dot product between w and x
  if y >= 0:
   return 1
  else:
   return -1

 def updateWeights(self,x,iterError):
  """
   updates the weights status, w at time t+1 is
       w(t+1) = w(t) + learningRate*(d-r)*x
   where d is desired output and r the perceptron response
   iterError is (d-r)
  """
  self.w[0] += self.learningRate*iterError*x[0]
  self.w[1] += self.learningRate*iterError*x[1]

 def train(self,data):
  """ 
   trains all the vector in data.
   Every vector in data must have three elements,
   the third element (x[2]) must be the label (desired output)
  """
  learned = False
  iteration = 0
  while not learned:
   globalError = 0.0
   for x in data: # for each sample
    r = self.response(x)    
    if x[2] != r: # if we have a wrong response
     iterError = x[2] - r # desired response - actual response
     self.updateWeights(x,iterError)
     globalError += abs(iterError)
   iteration += 1
   if globalError == 0.0 or iteration >= 100: # stop criteria
    print 'iterations',iteration
    learned = True # stop learning
Perceptrons can only classify data when the two classes can be divided by a straight line (or, more generally, a hyperplane if there are more than two inputs). This is called linear separation. Here is a function that generates a linearly separable random dataset.
def generateData(n):
 """ 
  generates a 2D linearly separable dataset with n samples. 
  The third element of the sample is the label
 """
 xb = (rand(n)*2-1)/2-0.5
 yb = (rand(n)*2-1)/2+0.5
 xr = (rand(n)*2-1)/2+0.5
 yr = (rand(n)*2-1)/2-0.5
 inputs = []
 for i in range(len(xb)):
  inputs.append([xb[i],yb[i],1])
  inputs.append([xr[i],yr[i],-1])
 return inputs
And now we can use the Perceptron. We generate two dataset, the first one is used to train the classifier (train set), and the second one is used to test it (test set):
trainset = generateData(30) # train set generation
perceptron = Perceptron()   # perceptron instance
perceptron.train(trainset)  # training
testset = generateData(20)  # test set generation

# Perceptron test
for x in testset:
 r = perceptron.response(x)
 if r != x[2]: # if the response is not correct
  print 'error'
 if r == 1:
  plot(x[0],x[1],'ob')  
 else:
  plot(x[0],x[1],'or')

# plot of the separation line.
# The separation line is orthogonal to w
n = norm(perceptron.w)
ww = perceptron.w/n
ww1 = [ww[1],-ww[0]]
ww2 = [-ww[1],ww[0]]
plot([ww1[0], ww2[0]],[ww1[1], ww2[1]],'--k')
show()
The script above gives the following result:


The blue points belong to the first class and the red ones belong to the second. The dashed line is the separation line learned by the Perceptron during the training.

11 comments:

  1. it seems like current implementation doesn't work with dots, scattered it full quadrant [-1,1]x[-1,1].

    ReplyDelete
  2. I have referenced your code and shared the modified code on Github(https://github.com/Honghe/perceptron).
    If your don't like please inform me.
    Thx!

    ReplyDelete
  3. How would you modify the code to plot for a vector with 3 features instead of 2

    ReplyDelete
    Replies
    1. You have to draw a 3D plot. Maybe this could help you:

      http://glowingpython.blogspot.it/2012/12/3d-stem-plot.html

      Pay attention, in this case the separation surface is a plane and not a line.

      Delete
  4. Thank you, that was helpful. How would you calculate the z cordinate ie. "ww3" following your example. Thank you for your time.

    ReplyDelete
    Replies
    1. You should find the plane orthogonal to the vector percepton.w. Let me know if you succeed :)

      Delete
  5. :-) will try, thanks

    ReplyDelete
  6. Cool ... Very nicely blogged ...

    ReplyDelete
  7. What can I do if I have a 4 class data with x and y features?

    ReplyDelete
    Replies
    1. Hi, you can use a one vs all strategy: https://en.wikipedia.org/wiki/Multiclass_classification

      I'd recommend you to use the sklearn implementation: http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html

      Delete