Borisova I.A., Kutnenko O.A.
Cleaning Data Sets with Diagnostic Errors in the High-Dimensional Feature Spaces
Mathematical Biology & Bioinformatics. 2019;14(2):464-476.
doi: 10.17537/2019.14.464.
References
- de Waal T., Pannekoek J., Scholtus S. Handbook of Statistical Data Editing and Imputation. Hoboken, New Jersey: John Wiley and Sons, Inc.; 2011. 456 p. doi: 10.1002/9780470904848.ch1
- Barnett V., Lewis T. Outliers in Statistical Data. Chichester: John Wiley and Sons; 1994. 584 p.
- Osborne J.W. Best Practices in Data Cleaning: A Complete Guide to Everything You Need to Do Before and After Collecting Your Data. 1st Edition. Los Angeles: SAGE Publications, Inc.; 2013. 296 p. doi: 10.4135/9781452269948
- Farcomeni A., Greco L. Robust Methods for Data Reduction. Chapman and Hall/CRC; 2015. 297 p.
- Teng C.M. A comparison of noise handling techniques. In: Proceedings of the Fourteenth International Florida Artificial Intelligence Research Society Conference. 2001:269–273.
- Aggarwal C.C., Yu P.S. Outlier detection for high dimensional data. In: Proc. ACM SIGMOD Int. Conf. on Management of Data. California, USA; 2001. doi: 10.1145/375663.375668
- Guyon I., Weston J., Barnhill S., Vapnik V. Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning. 2002;46(1):389–422. doi: 10.1023/A:1012487302797
- Breunig M.M., Kriegel H.-P., Ng R.T., Sander J.R. LOF: Identifying Density-based Local Outliers. In: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. 2000. P. 93–104. doi: 10.1145/335191.335388
- Liu F.T., Ting K.M., Zhou Z.-H. Isolation forest. In: Proceedings of ICDM’08. Eighth IEEE International Conference on Data Mining. 2008. P. 413–422. doi: 10.1109/ICDM.2008.17
- Kriegel H.P., Schubert M., Zimek A. Angle-based outlier detection in high-dimensional data. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM. 2008. P. 444–452. doi: 10.1145/1401890.1401946
- Yang Y., Wu X., Zhu X. Dealing with Predictive-but-Unpredictable Attributes in Noisy Data Sources. In: Proceedings of 8th European Conference on Principles and Practice of Knowledge Discovery in Databases. Springer; 2004. doi: 10.1007/978-3-540-30116-5_43
- Brodley C.E., Friedl M.A. Identifying Mislabeled Training Data. Journal of Artificial Intelligence Research. 1999;11:131–167. doi: 10.1613/jair.606
- Borisova I.A., Kutnenko O.A. The Problem of Correction Diagnostic Errors in the Target Attribute With the Function of Rival Similarity. Mathematical Biology and Bioinformatics. 2018;13(1):38–49. doi: 10.17537/2018.13.38
- Borisova I.A., Kutnenko O.A. Outliers detection in datasets with misclassified objects. Machine Learning and Data Analysis. 2015;1(11):1632–1641 (in Russ.).
- Prostate Cancer Dataset. http://www.bioinf.ucd.ie/people/ian/Singh.txt (accessed January 2019).
- Zagoruiko N.G., Borisova I.A., Dyubanov V.V., Kutnenko O.A. Methods of recognition based on the function of rival similarity. Pattern Recognition and Image Analysis. 2008;18(1):1–6. doi: 10.1134/S105466180801001X
- Zagoruiko N.G. Kognitivnyi analiz dannykh (Cognitive analysis of data). Novosibirsk; 2013. 186 p. (in Russ.).
- Arkad'ev A.G., Braverman E.M. Obuchenie mashiny klassifikatsii ob''ektov (Teaching a machine to classify objects). Moscow; 1971. 112 p. (in Russ.).
- Vorontsov K.V., Koloskov A.O. Iskusstvennyi intellekt (Artificial intelligence). 2006;2:30–33 (in Russ.).
- Shlezinger M.I. In: Chitaiushchie avtomaty i raspoznavanie obrazov (Reading machines and pattern recognition): collection of scientific papers. Kiev; 1965. P. 46–61 (in Russ.).
- Zagoruiko N.G. Prikladnye metody analiza dannykh i znanii (Advanced Methods of Data and Knowledge Analysis). Novosibirsk; 1999. 270 p. (in Russ.).
- Subbotin S.O. The complex of characteristic and criteria of comparison of training. Mathematical Machines and Systems. 2010;1:25–39 (in Russ.).
- Zagoruiko N.G., Kutnenko O.A. Recognition methods based on the AdDel algorithm. Pattern Recognition and Image Analysis. 2004;14(2):198–204.