Abstract

This paper presents a new, flexible approach to predicting the gender of writers from their handwriting samples. Handwriting features such as slant, curvature, line separation, chain code, and character shapes can be extracted using different methods. As a result, the combined multi-feature sets often contain irrelevant and redundant features, and some features conflict with one another, which degrades classification accuracy and increases computing cost. This paper proposes a feature selection approach, named kernel mutual information (KMI), that reduces these redundancies and conflicts and extracts an optimal subset of features from the writing samples produced by male and female writers. To ensure that KMI can be applied to a variety of features, this paper also describes the handwriting segmentation and handwritten text recognition technology used. Classification is carried out using a Support Vector Machine (SVM) on two databases. The first comes from the ICDAR 2013 competition on gender prediction and provides samples in both Arabic and English. The second is the Registration-Document-Form (RDF) database, in Chinese. The proposed method and the compared methods were evaluated on both databases. The results highlight the importance of feature selection for gender prediction from handwriting.