Image source: www.ie.edu
Machine learning is a branch of artificial intelligence
concerned with learning from data. It covers the development of
algorithms and computer systems that can automatically learn from
data, identify patterns, and predict outcomes with little or no
human intervention. Machine learning is applied in a wide range of
fields that deal with massive quantities of data.
There are four basic steps in a machine
learning task. The first is data collection, where the raw data can take
the form of images, sound, or text files. Some applications, such as biometrics,
may require a specific acquisition device, for example to capture finger vein
images. The second step is data preparation, which selects data of sufficient
quality. Raw data may contain outliers, noise, or missing information, so
it is important to fix these issues before training. Specific techniques or
algorithms are also employed at this stage to extract useful information from
the raw data. For example, Principal Component Analysis is a common technique
for extracting important features from images [1].
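The data preparation step can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the method used in [1]: the function names, the sample values, the median-based fill, and the outlier threshold of 3.5 are all assumptions made for the example. The last function estimates the first principal component with power iteration, which is one simple way to approximate what Principal Component Analysis computes.

```python
import math
import statistics


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def remove_outliers(values, threshold=3.5):
    """Drop values whose modified z-score exceeds the threshold.

    The score is based on the median absolute deviation (MAD), which is
    less sensitive to extreme values than the standard deviation.
    The threshold of 3.5 is an assumed, commonly used default.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return list(values)  # no spread to measure, keep everything
    return [v for v in values if 0.6745 * abs(v - median) / mad <= threshold]


def first_principal_component(rows, iterations=100):
    """Estimate the direction of largest variance by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed starting vector; must not be orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical raw measurements with two missing entries and one outlier.
raw = [4.1, 3.9, None, 4.3, 40.0, 4.0, None, 4.2]
filled = fill_missing(raw)
clean = remove_outliers(filled)

# Hypothetical two-feature samples that vary along the line y = 2x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component = first_principal_component(samples)
```

Here `clean` keeps the seven plausible readings and drops the value 40.0, while `component` points along the direction in which the two features vary together. In practice a numerical library would normally be used for PCA; the pure Python version is only meant to make the idea concrete.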
The third step is choosing an appropriate algorithm that
works with the data. Over the years, researchers have developed algorithms for
specific types of data: some are suitable for images, while others are
better suited to text. In this step, the data is divided into two
blocks: training and testing. The training set makes up the majority of the
data and is used to build a model, while the testing set is used to evaluate
the model's performance. In general, machine learning algorithms are categorized
as supervised or unsupervised. A supervised technique predicts outputs by
learning from labeled input data [2]. In an unsupervised technique, by contrast,
all the data is unlabeled and the algorithm studies the
structure of the data to predict the output. The fourth and final step is model
evaluation, which measures the performance of the trained model on the testing
set. A number of criteria can be used to assess the strengths and weaknesses of
the model, such as storage reduction, noise tolerance, generalization accuracy,
and time requirements [3].
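The split, training, and evaluation steps can be illustrated with a small supervised example. The sketch below uses a 1-nearest-neighbor classifier, chosen here only because it is simple; the 80/20 split, the fixed random seed, the function names, and the toy data are all assumptions for illustration. It measures generalization accuracy only and does not cover the other criteria mentioned above.

```python
import math
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle the labeled samples and return (training set, testing set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, point):
    """Return the label of the training sample closest to the given point."""
    nearest = min(train, key=lambda sample: math.dist(sample[0], point))
    return nearest[1]


def accuracy(train, test):
    """Fraction of testing samples whose predicted label matches the true label."""
    correct = sum(predict_1nn(train, features) == label
                  for features, label in test)
    return correct / len(test)


# Hypothetical labeled data: two well-separated groups of 2-D points.
data = []
for i in range(10):
    data.append(((i * 0.1, (i % 3) * 0.1), "a"))
    data.append(((10 + i * 0.1, 10 + (i % 3) * 0.1), "b"))

train_set, test_set = train_test_split(data)
score = accuracy(train_set, test_set)
```

With 20 samples, the training set holds 16 and the testing set holds 4. Because the two groups are far apart, every testing sample is classified correctly in this toy case; real data is rarely this clean, which is why evaluation on a held-out testing set matters.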
References:
[1] M. S. Mohd Asaari, S. A. Suandi, and B. A.
Rosdi, “Fusion of Band Limited Phase Only Correlation and Width Centroid
Contour Distance for finger based biometrics,” Expert Syst. Appl., vol.
41, no. 7, pp. 3367–3382, Jun. 2014.
[2] J. S. Sánchez, R. Barandela, A. I.
Marqués, R. Alejo, and J. Badenas, “Analysis of new techniques to obtain
quality training sets,” Pattern Recognit. Lett., vol. 24, no. 7, pp.
1015–1022, 2003.
[3] F. Herrera, “Prototype Selection for
Nearest Neighbor Classification: Taxonomy and Empirical Study,” vol. 34, no.
3, pp. 417–435, 2012.
By: Nordiana Mukahar