openModeller id: SVM
Current version: 0.5 Developer(s): Renato De Giovanni in collaboration with Ana Carolina Lorena
Accepts Categorical Maps: no
Requires absence points: no
Author(s): Vladimir N. Vapnik
Support vector machines map input vectors to a higher dimensional space where a maximal separating hyperplane is constructed. Two parallel hyperplanes are constructed, one on each side of the hyperplane that separates the data. The separating hyperplane is the one that maximises the distance between the two parallel hyperplanes. The assumption is that the larger the margin (the distance between these parallel hyperplanes), the lower the generalisation error of the classifier will be. The model produced by support vector classification depends only on a subset of the training data, because the cost function for building the model ignores training points that lie beyond the margin.

Content retrieved from Wikipedia on the 13th of June, 2007: http://en.wikipedia.org/w/index.php?title=Support_vector_machine&oldid=136646498.

The openModeller implementation of SVMs makes use of the libsvm library version 2.85: Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

Release history:
version 0.1: initial release
version 0.2: new parameter to specify the number of pseudo-absences to be generated; upgraded to libsvm 2.85; fixed memory leaks
version 0.3: when absences are needed and the number of pseudo-absences to be generated is zero, it defaults to the number of presences
version 0.4: included missing serialization of C
version 0.5: fixed the indication of whether the algorithm needs normalized environmental data, which was not working when the algorithm was loaded from an existing model
1) Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer-Verlag.
2) Schölkopf, B., Smola, A., Williamson, R. and Bartlett, P.L. (2000). New support vector algorithms. Neural Computation, 12, 1207-1245.
3) Schölkopf, B., Platt, J.C., Shawe-Taylor, J., Smola, A.J. and Williamson, R.C. (2001). Estimating the support of a high-dimensional distribution. Neural Computation, 13, 1443-1471.
4) Cristianini, N. and Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and other kernel-based learning methods. Cambridge University Press.
SVM type
openModeller id: SvmType
Type of SVM: 0 = C-SVC, 1 = Nu-SVC, 2 = one-class SVM
Data type: integer Domain: [0.0, 2.0] Typical value: 0
Kernel type
openModeller id: KernelType
Type of kernel function: 0 = linear: u'*v, 1 = polynomial: (gamma*u'*v + coef0)^degree, 2 = radial basis function: exp(-gamma*|u-v|^2)
Data type: integer Domain: [0.0, 4.0] Typical value: 2
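The three kernel formulas listed above can be written out directly. The following is a minimal Python sketch for illustration only; the function names and the sample vectors are hypothetical and are not part of openModeller or libsvm.

```python
import math

def dot(u, v):
    """Inner product u'*v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    # KernelType 0: u'*v
    return dot(u, v)

def polynomial_kernel(u, v, gamma, coef0, degree):
    # KernelType 1: (gamma*u'*v + coef0)^degree
    return (gamma * dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma):
    # KernelType 2: exp(-gamma*|u-v|^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

# Two hypothetical samples, each with values for two environmental layers.
u = [0.2, 0.8]
v = [0.5, 0.4]
print(linear_kernel(u, v))
print(polynomial_kernel(u, v, gamma=0.5, coef0=0.0, degree=3))
print(rbf_kernel(u, v, gamma=0.5))
```

Note that the radial basis function kernel equals 1 for identical vectors and decays towards 0 as the vectors move apart, at a rate controlled by gamma.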
Degree
openModeller id: Degree
Degree in kernel function (only for polynomial kernels).
Data type: integer Domain: [0.0, oo] Typical value: 3
Gamma
openModeller id: Gamma
Gamma in kernel function (only for polynomial and radial basis kernels). When set to zero, the default value will actually be 1/k, where k is the number of layers.
Data type: real Domain: [-oo, oo] Typical value: 0
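The zero-means-default rule for Gamma can be sketched as follows. This is an illustrative Python snippet, not the openModeller source; the function name effective_gamma and its arguments are assumptions made for the example.

```python
def effective_gamma(gamma, num_layers):
    """Return the gamma actually used in the kernel function.

    A gamma of zero is replaced by 1/k, where k is the number of
    environmental layers; any other value is used unchanged.
    """
    if num_layers <= 0:
        raise ValueError("num_layers must be a positive integer")
    if gamma == 0:
        return 1.0 / num_layers
    return gamma

print(effective_gamma(0, 4))    # → 0.25
print(effective_gamma(0.1, 4))  # → 0.1
```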
Coef0
openModeller id: Coef0
Coef0 in kernel function (only for polynomial kernels).
Data type: real Domain: [-oo, oo] Typical value: 0
Cost
openModeller id: C
Cost (only for C-SVC types).
Data type: real Domain: [0.001, oo] Typical value: 1
Nu
openModeller id: Nu
Nu (only for Nu-SVC and one-class SVM).
Data type: real Domain: [0.001, 1.0] Typical value: 0.5
Probabilistic output
openModeller id: ProbabilisticOutput
Indicates if the output should be a probability instead of a binary response (only available for C-SVC and Nu-SVC).
Data type: integer Domain: [0.0, 1.0] Typical value: 1
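Several of the parameters described so far only take effect for particular SVM or kernel types. The following Python sketch collects those "only for" notes in one place; the function and constant names are hypothetical, and the snippet summarises the descriptions on this page rather than reproducing openModeller code.

```python
# Numeric codes as listed under "SVM type" and "Kernel type".
C_SVC, NU_SVC, ONE_CLASS = 0, 1, 2
LINEAR, POLYNOMIAL, RBF = 0, 1, 2

def relevant_parameters(svm_type, kernel_type):
    """List the parameter ids that apply to a given SVM/kernel combination."""
    params = ["SvmType", "KernelType"]
    if kernel_type == POLYNOMIAL:
        # Degree and Coef0 are only for polynomial kernels.
        params += ["Degree", "Gamma", "Coef0"]
    elif kernel_type == RBF:
        # Gamma is for polynomial and radial basis kernels.
        params += ["Gamma"]
    if svm_type == C_SVC:
        params.append("C")
    if svm_type in (NU_SVC, ONE_CLASS):
        params.append("Nu")
    if svm_type in (C_SVC, NU_SVC):
        params.append("ProbabilisticOutput")
    return params

print(relevant_parameters(C_SVC, RBF))
print(relevant_parameters(ONE_CLASS, LINEAR))
```

With the typical values (C-SVC with a radial basis kernel), only Gamma, C and ProbabilisticOutput influence the model besides the two type selectors.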
Number of pseudo-absences
openModeller id: NumberOfPseudoAbsences
Number of pseudo-absences to be generated (only for C-SVC and Nu-SVC when no absences have been provided). When absences are needed, a value of zero defaults to the number of presence points.
Data type: integer Domain: [0.0, oo] Typical value: 0
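The rule for deciding how many pseudo-absences to generate can be sketched as below. This is a minimal Python illustration of the behaviour described above, assuming SvmType codes 0 (C-SVC) and 1 (Nu-SVC) are the types that need absences; the function name and arguments are hypothetical.

```python
def pseudo_absences_to_generate(svm_type, num_presences, num_absences, requested):
    """Return how many pseudo-absence points would be generated.

    svm_type: 0 = C-SVC, 1 = Nu-SVC, 2 = one-class SVM
    requested: value of the NumberOfPseudoAbsences parameter
    """
    needs_absences = svm_type in (0, 1)
    if not needs_absences or num_absences > 0:
        # One-class SVM uses presences only; provided absences are used as-is.
        return 0
    if requested == 0:
        # Zero defaults to the number of presence points.
        return num_presences
    return requested

print(pseudo_absences_to_generate(0, 120, 0, 0))   # → 120
print(pseudo_absences_to_generate(0, 120, 0, 50))  # → 50
print(pseudo_absences_to_generate(2, 120, 0, 50))  # → 0
```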
The following images show three models in the environmental space (temperature x precipitation) generated with the same presence points (Thalurania furcata boliviana localities dataset) but with different parameters. Since SVM C-SVC needs absence points, the first model included a set of pseudo-absence points that were randomly generated in areas of the environmental space distant from the presence points:
fig. 1: SVM C-SVC with default parameters. Pseudo-absence points are displayed in red.
fig. 2: SVM one-class with Nu=0.5
fig. 3: SVM one-class with Nu=0.05