
Commit 718fca6

Create SVM.py
added svm implementation
1 parent 95e3ef0 commit 718fca6

File tree

1 file changed: +178 -0 lines changed

SVM.py

Lines changed: 178 additions & 0 deletions
# -*- coding: utf-8 -*-
"""
Created on Sat Jan 02 12:40:14 2016

@author: Pavitrakumar
"""

from __future__ import division
import numpy as np
from sklearn import datasets
import cvxopt
import cvxopt.solvers
from sklearn.model_selection import train_test_split  #moved from sklearn.cross_validation in newer scikit-learn

#generating a linearly separable data set.
#Target values = -1 or 1.

mean1 = np.array([0, 2])
mean2 = np.array([2, 0])
cov = np.array([[0.8, 0.6], [0.6, 0.8]])
X1 = np.random.multivariate_normal(mean1, cov, 100)
y1 = np.ones(len(X1))
X2 = np.random.multivariate_normal(mean2, cov, 100)
y2 = np.ones(len(X2)) * -1
X = np.vstack((X1, X2))
y = np.hstack((y1, y2))

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)


#X_train is a 150x2 matrix (2 features, 150 training points)
#y_train is a 150x1 vector of labels (-1 or 1)

n_samples = X_train.shape[0]
n_features = X_train.shape[1]

def kernel(x, y):
    return np.dot(x, y)

K = np.zeros((n_samples, n_samples))
for i in range(n_samples):
    for j in range(n_samples):
        K[i, j] = kernel(X_train[i], X_train[j])

#K is a (150,150) matrix
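
#Side note (an addition for illustration, not part of the original commit): because the
#kernel is linear, the same Gram matrix can be built in a single vectorized call, which is
#much faster for larger training sets. The result matches the double loop above.
K_vectorized = np.dot(X_train, X_train.T)
assert np.allclose(K, K_vectorized)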

#look at the lecture notes: each term of the P matrix is like y1*y1*(X1^T*X1)...
#we have already calculated the (Xi^T*Xj) part in matrix K.
#now we need to calculate the yi*yj part - since yi is just a scalar, we simply compute the outer product.

Y = np.outer(y_train, y_train)

#now to combine K and Y into P

P = cvxopt.matrix(Y * K)
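
#Quick consistency check (an addition for illustration, not in the original commit):
#element (i, j) of Y*K is y_i*y_j*(Xi^T*Xj), i.e. the Gram matrix of the label-scaled
#points y_i*Xi, which gives a compact way to verify the construction above.
Z = y_train[:, None] * X_train
assert np.allclose(Y * K, np.dot(Z, Z.T))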


#the quadratic objective we are trying to minimize is:
# (1/2) x^T*P*x + Q^T*x
#(here the QP variables x are the Lagrange multipliers alpha_n of the dual problem)
#we have computed P already, now for Q:

#according to the final minimization equation, the coefficient of each alpha_n is -1,
#so we pass a column vector of -1s as Q

Q = cvxopt.matrix(-1 * np.ones(n_samples))


#the solver also takes other optional args for the constraints

#cvxopt.solvers.qp(P, Q, G, h, A, b)

A = cvxopt.matrix(y_train, (1, n_samples), tc='d') #reshaping (1x150 matrix)
b = cvxopt.matrix(0.0)

#The above A and b correspond to the general form Ax = b, i.e. the constraint y^T*x = 0 in our equation.

#for general QP using cvxopt refer to: https://courses.csail.mit.edu/6.867/wiki/images/a/a7/Qp-cvxopt.pdf

#G and h general form: Gx <= h. Our constraint is 0 <= xn <= C in the soft-margin case; in the hard-margin case we only have 0 <= xn.

#in the hard-margin case, 0 <= xn is not in the form Gx <= h. To convert it to that form, we rewrite it as:
# (-1)*xn <= 0
# so, G will be a matrix with -1 on the diagonal and h will be a 150x1 vector of 0s.
# The RHS is 0, so h is simply a vector of zeros.
# Why does the G matrix have a diagonal of -1s?
# Our goal is to find the "support vectors", and these support vectors are a subset of the existing data points.
# (the support vectors define the margin of the hyperplane we are trying to find)
# So each data point corresponds to one unknown variable, and its resulting value decides whether it is a support vector or not.
#If we have 5 data points, the constraints are (-1)*x1 <= 0, ..., (-1)*x5 <= 0; so each row of G has a single -1 (e.g. [-1,0,0,0,0] .... [0,0,0,0,-1]) and h = [0,0,0,0,0].
G = cvxopt.matrix(np.diag(np.ones(n_samples) * -1)) # 150x150
h = cvxopt.matrix(np.zeros(n_samples)) # 150x1

#in the soft-margin case, xn lies in a range, i.e. 0 <= xn <= C
#similar to before, where we had only one inequality constraint to take care of, here we have 2.
#that is, 0 <= xn <= C can be split in 2 (we need to split because constraints to the QP solver have to be in the form Gx <= h):
# 0 <= xn  ==>  (-1)*xn <= 0 (is in the form Gx <= h) .....(1)
# xn <= C  ==>  (1)*xn <= C (is in the form Gx <= h, no need to change signs) .....(2)



#So, the matrix G is the previous matrix stacked (vertically) with another diagonal matrix of 1s

tmp1 = np.diag(np.ones(n_samples) * -1) # = G in the hard-margin case (only using (1))
tmp2 = np.identity(n_samples) # using (2) (a single 1*xn per row)
G = cvxopt.matrix(np.vstack((tmp1, tmp2))) # vertically stack the 2 matrices above
#G is now a 300x150 matrix

#Similarly, since xn <= C here, the h vector also has to account for that constraint:
#it is the previous 150-vector of zeros stacked (horizontally) with a 150-vector filled with the value C.

C = 2
tmp1 = np.zeros(n_samples) # = h in the hard-margin case (only using (1))
tmp2 = np.ones(n_samples) * C # using (2)
h = cvxopt.matrix(np.hstack((tmp1, tmp2))) # 300x1 vector
#h is a 300x1 vector [0,0,0,....,0,0,C,C,....,C,C,C]
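
#A small sanity check (added for illustration, not part of the original commit):
#with the soft-margin constraints stacked, G should have 2*n_samples rows and h should
#have 2*n_samples entries. cvxopt matrices expose their shape via the .size attribute.
assert G.size == (2 * n_samples, n_samples)
assert h.size == (2 * n_samples, 1)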



#Now we have all the required variables to pass to the QP solver; it returns the alphas, and the non-zero ones identify the support vectors.

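#Optional (an addition, not part of the original commit): cvxopt prints its iteration log
#by default; it can be silenced with the documented solver option below.
cvxopt.solvers.options['show_progress'] = False
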
solution = cvxopt.solvers.qp(P, Q, G, h, A, b)


a = np.ravel(solution['x'])
#in the above vector, the entries that are non-zero (above a very small cut-off value - we have chosen 1e-5) correspond to the support vectors.

sv = a > 1e-5 # [True if the condition is satisfied, else False]
ind = np.arange(len(a))[sv] #getting the indices of the support vectors in the training set
#ex: ind = [1,4,12,59,30] <- the 1st, 4th, 12th etc. training points are all support vectors.
a = a[sv] # alphas of the support vectors only

sv_x = X_train[sv] #note: index the *training* set - the mask has length n_samples
sv_y = y_train[sv]

#now we need to compute the intercept (b)
#b = ym - sigma_n alpha_n*yn*K(Xn,Xm), where alpha_n > cutoff and K(Xn,Xm) is the kernelized input
#basically, it is b = actual - predicted (the error)
#since we are only using a linear kernel, K(Xi,Xj) is Xi^T.Xj (the K matrix)

b = 0

for i in range(len(a)):
    ym = sv_y[i]
    xm = sv_x[i]
    b += ym
    b -= np.sum(a * sv_y * K[sv, ind[i]])
b /= len(a)

#computing the weight vector:
w = np.zeros(n_features)
#w = sigma_n alpha_n * Xn * yn, where [Xn, yn] are the data points of the support vectors and alpha_n is the alpha of that support vector.
#since we are using a linear kernel, we can recover an explicit weight vector.
for i in range(len(a)):
    w += a[i] * sv_x[i] * sv_y[i]


y_predict = np.zeros(len(X_test))
for i in range(len(X_test)):
    #similar to the formula for b - just re-arrange it and compute the kernel against the test set this time.
    s = 0
    for a_val, xm, ym in zip(a, sv_x, sv_y):
        #print(a_val)                      #uncomment for per-sample debugging
        #print(xm)
        #print(kernel(X_test[i], xm))
        s += a_val * ym * kernel(X_test[i], xm)
    y_predict[i] = s

y_predict = np.sign(y_predict + b)

print("accuracy is", (sum(y_predict == y_test) / len(y_predict)))

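#Equivalent check (an addition for illustration, not part of the original commit): with a
#linear kernel, the support-vector sum above collapses to the explicit weight vector w,
#so the same predictions can be obtained in one vectorized expression.
y_predict_linear = np.sign(np.dot(X_test, w) + b)
print("accuracy (using w directly):", np.mean(y_predict_linear == y_test))
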
#right now, this only works on binary classification tasks, but the one-vs-all technique can be used to handle multi-class classification tasks.
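
#A minimal sketch (an addition, not part of the original commit) of how the steps above
#could be wrapped into a reusable trainer so that a one-vs-rest (one-vs-all) loop becomes
#possible. The helper name train_linear_svm and the demo below are assumptions for
#illustration only; with just the two classes generated above it reduces to the binary case.
def train_linear_svm(X_tr, y_tr, C_val=2.0):
    n = X_tr.shape[0]
    K_tr = np.dot(X_tr, X_tr.T)                                  #linear-kernel Gram matrix
    P_ = cvxopt.matrix(np.outer(y_tr, y_tr) * K_tr)
    Q_ = cvxopt.matrix(-1 * np.ones(n))
    G_ = cvxopt.matrix(np.vstack((np.diag(np.ones(n) * -1), np.identity(n))))
    h_ = cvxopt.matrix(np.hstack((np.zeros(n), np.ones(n) * C_val)))
    A_ = cvxopt.matrix(y_tr, (1, n), tc='d')
    b_ = cvxopt.matrix(0.0)
    alpha = np.ravel(cvxopt.solvers.qp(P_, Q_, G_, h_, A_, b_)['x'])
    mask = alpha > 1e-5
    w_ = np.sum((alpha[mask] * y_tr[mask])[:, None] * X_tr[mask], axis=0)
    b0 = np.mean(y_tr[mask] - np.dot(X_tr[mask], w_))
    return w_, b0

#one-vs-rest: train one binary SVM per class (+1 for that class, -1 for the rest),
#then predict the class whose decision value is largest.
classes = np.unique(y_train)
models = [train_linear_svm(X_train, np.where(y_train == c, 1.0, -1.0)) for c in classes]
scores = np.column_stack([np.dot(X_test, w_) + b0 for (w_, b0) in models])
ovr_predict = classes[np.argmax(scores, axis=1)]
print("one-vs-rest accuracy:", np.mean(ovr_predict == y_test))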

0 commit comments
