Identification Of Outliers In Oxazolines AND Oxazoles High Dimension Molecular Descriptor Dataset Using Principal Component Outlier Detection Algorithm And Comparative Numerical Study Of Other Robust Estimators (1312.2861v1)
Abstract: Outlier detection has been in use for the past decade. It is an emerging topic with robust applications in the medical and pharmaceutical sciences. Outlier detection is used to detect anomalous behaviour in data, and typical problems in bioinformatics can be addressed with it. A computationally fast method for detecting outliers is presented that is particularly effective in high dimensions. The PrCmpOut algorithm makes use of simple properties of principal components to detect outliers in the transformed space, leading to significant computational advantages for high-dimensional data. This procedure requires considerably less computational time than existing methods for outlier detection. The properties of this estimator (outlier error rate (FN), non-outlier error rate (FP), and computational cost) are analyzed and compared with those of other robust estimators described in the literature through simulation studies. Numerical evidence based on the Oxazolines and Oxazoles molecular descriptor dataset shows that the proposed method performs well in a variety of situations of practical interest. It is thus a valuable companion to existing outlier detection methods.
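
The general idea of detecting outliers in principal-component space, and of scoring a detector by its FN and FP rates, can be illustrated with a minimal Python sketch. This is not the PrCmpOut implementation: the function names (`robust_scale`, `pca_outlier_flags`, `error_rates`), the median/MAD scaling, the 99% variance threshold, the median + 3·MAD cutoff, and the synthetic data are all assumptions chosen for illustration only.

```python
import numpy as np


def robust_scale(X):
    """Centre each column by its median and scale by its MAD (assumed choice)."""
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    mad[mad == 0] = 1.0  # avoid division by zero for constant columns
    return (X - med) / mad


def pca_outlier_flags(X, var_explained=0.99, cutoff=3.0):
    """Flag rows whose distance in robustly rescaled PC space is unusually large."""
    Z = robust_scale(np.asarray(X, dtype=float))
    # Principal directions from the SVD of the robustly standardised data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    var_ratio = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(var_ratio, var_explained)) + 1
    # Project onto the first k components and rescale each score robustly.
    scores = robust_scale(Z @ Vt[:k].T)
    dist = np.sqrt(np.sum(scores**2, axis=1))
    # Hypothetical cutoff: median distance plus `cutoff` times the MAD.
    d_med = np.median(dist)
    d_mad = 1.4826 * np.median(np.abs(dist - d_med))
    return dist > d_med + cutoff * d_mad


def error_rates(flags, truth):
    """Return (FN, FP): share of true outliers missed, share of clean rows flagged."""
    fn = float(np.mean(~flags[truth]))
    fp = float(np.mean(flags[~truth]))
    return fn, fp


# Synthetic stand-in for a descriptor matrix: 200 clean rows and 20 shifted rows
# in 50 dimensions. These numbers are illustrative, not taken from the paper.
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 50))
shifted = rng.normal(loc=10.0, size=(20, 50))
X = np.vstack([clean, shifted])
truth = np.r_[np.zeros(200, dtype=bool), np.ones(20, dtype=bool)]

flags = pca_outlier_flags(X)
fn, fp = error_rates(flags, truth)
print(f"FN rate: {fn:.3f}, FP rate: {fp:.3f}")
```

In this sketch the FN rate is the fraction of planted outliers that go unflagged and the FP rate is the fraction of clean observations that are wrongly flagged, which mirrors the two error rates named in the abstract. The computational appeal of working in the transformed space is that the per-component rescaling and the distance are cheap once the decomposition is available.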