Support vector machine classifiers built using imperfect training data

Tapan P Bagchi; Rahul Samant; Milan Joshi

Abstract

This paper extends the utility of asymmetric soft margin support vector machines by analytically modeling imperfect class labeling in the training data. It first uses Receiver Operating Characteristic (ROC) computations to establish the strong relationship between a support vector machine's performance and its ability to classify examples correctly, even in the presence of misclassified training examples. It then uses statistically designed experiments to reveal that misclassification also affects training quality, and hence performance, though not as strongly. Still, our results strongly support those striving to develop the best-trained support vector machine for applications such as medical diagnostics, because misclassifications shrink the decision boundary distance and increase generalization error. The study also asserts that the real-life costs of making wrong classifications should be incorporated into the support vector machine design optimization objective.
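
As a rough illustration of the two quantities the abstract refers to, the sketch below computes an ROC area and a cost-weighted error rate for a handful of classifier scores. It is a minimal sketch, not the authors' procedure: the function names `roc_auc` and `expected_cost`, the labels, the scores, and the cost values are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Illustrative sketch only: labels, scores, and costs are made-up values,
# not data or settings taken from the paper.

def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) statistic.

    labels: true classes coded as +1 / -1; scores: classifier decision values.
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == -1]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def expected_cost(labels, scores, threshold, cost_fn, cost_fp):
    """Average misclassification cost when predicting +1 for score >= threshold.

    cost_fn: assumed cost of a false negative; cost_fp: assumed cost of a
    false positive. Unequal values model asymmetric real-life costs.
    """
    total = 0.0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else -1
        if y == 1 and pred == -1:
            total += cost_fn
        elif y == -1 and pred == 1:
            total += cost_fp
    return total / len(labels)


labels = [1, 1, 1, -1, -1, -1]
scores = [0.9, 0.6, 0.2, 0.4, 0.1, -0.3]

auc = roc_auc(labels, scores)
# A missed positive is assumed to cost five times as much as a false alarm.
cost_at_zero = expected_cost(labels, scores, threshold=0.0, cost_fn=5.0, cost_fp=1.0)
cost_at_half = expected_cost(labels, scores, threshold=0.5, cost_fn=5.0, cost_fp=1.0)
```

With these assumed costs, the lower threshold gives the smaller average cost even though it produces more errors in total (two false positives against one false negative), which is the kind of trade-off a cost-aware design objective is meant to capture.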