Volume 14   Issue 2   Year 2019
Borisova I.A., Kutnenko O.A.

Cleaning Data Sets with Diagnostic Errors in the High-Dimensional Feature Spaces

Mathematical Biology & Bioinformatics. 2019;14(2):464-476.

doi: 10.17537/2019.14.464.

References

 

  1. de Waal T., Pannekoek J., Scholtus S. Handbook of Statistical Data Editing and Imputation. Hoboken, New Jersey: John Wiley and Sons, Inc.; 2011. 456 p. doi: 10.1002/9780470904848.ch1
  2. Barnett V., Lewis T. Outliers in Statistical Data. Chichester: John Wiley and Sons; 1994. 584 p.
  3. Osborne J.W. Best Practices in Data Cleaning: A Complete Guide to Everything You Need to Do Before and After Collecting Your Data. 1st ed. Los Angeles: SAGE Publications, Inc.; 2013. 296 p. doi: 10.4135/9781452269948
  4. Farcomeni A., Greco L. Robust Methods for Data Reduction. Chapman and Hall/CRC; 2015. 297 p.
  5. Teng C.M. A comparison of noise handling techniques. In: Proceedings of the Fourteenth International Florida Artificial Intelligence Research Society Conference. 2001. P. 269–273.
  6. Aggarwal C.C., Yu P.S. Outlier detection for high dimensional data. In: Proc. ACM SIGMOD Int. Conf. on Management of Data. California, USA; 2001. doi: 10.1145/375663.375668
  7. Guyon I., Weston J., Barnhill S., Vapnik V. Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning. 2002;46(1):389–422. doi: 10.1023/A:1012487302797
  8. Breunig M.M., Kriegel H.-P., Ng R.T., Sander J.R. LOF: Identifying Density-based Local Outliers. In: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. 2000. P. 93–104. doi: 10.1145/335191.335388
  9. Liu F.T., Ting K.M., Zhou Z.-H. Isolation forest. In: Proceedings of ICDM’08. Eighth IEEE International Conference on Data Mining. 2008. P. 413–422. doi: 10.1109/ICDM.2008.17
  10. Kriegel H.P., Schubert M., Zimek A. Angle-based outlier detection in high-dimensional data. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM. 2008. P. 444–452. doi: 10.1145/1401890.1401946
  11. Yang Y., Wu X., Zhu X. Dealing with Predictive-but-Unpredictable Attributes in Noisy Data Sources. In: Proceedings of 8th European Conference on Principles and Practice of Knowledge Discovery in Databases. Springer; 2004. doi: 10.1007/978-3-540-30116-5_43
  12. Brodley C.E., Friedl M.A. Identifying Mislabeled Training Data. Journal of Artificial Intelligence Research. 1999;11:131–167. doi: 10.1613/jair.606
  13. Borisova I.A., Kutnenko O.A. The Problem of Correction Diagnostic Errors in the Target Attribute With the Function of Rival Similarity. Mathematical Biology and Bioinformatics. 2018;13(1):38–49. doi: 10.17537/2018.13.38
  14. Borisova I.A., Kutnenko O.A. Outliers detection in datasets with misclassified objects. Machine Learning and Data Analysis. 2015;1(11):1632–1641 (in Russ.).
  15. Prostate Cancer Dataset. http://www.bioinf.ucd.ie/people/ian/Singh.txt (accessed January 2019).
  16. Zagoruiko N.G., Borisova I.A., Dyubanov V.V., Kutnenko O.A. Methods of recognition based on the function of rival similarity. Pattern Recognition and Image Analysis. 2008;18(1):1–6. doi: 10.1134/S105466180801001X
  17. Zagoruiko N.G. Kognitivnyi analiz dannykh (Cognitive analysis of data). Novosibirsk; 2013. 186 p. (in Russ.).
  18. Arkad'ev A.G., Braverman E.M. Obuchenie mashiny klassifikatsii ob''ektov (Training a machine to classify objects). Moscow; 1971. 112 p. (in Russ.).
  19. Vorontsov K.V., Koloskov A.O. Iskusstvennyi intellekt (Artificial intelligence). 2006;2:30–33 (in Russ.).
  20. Shlezinger M.I. In: Chitaiushchie avtomaty i raspoznavanie obrazov (Reading machines and pattern recognition): collection of scientific papers. Kiev; 1965. P. 46–61 (in Russ.).
  21. Zagoruiko N.G. Prikladnye metody analiza dannykh i znanii (Advanced Methods of Data and Knowledge Analysis). Novosibirsk; 1999. 270 p. (in Russ.).
  22. Subbotin S.O. The complex of characteristic and criteria of comparison of training. Mathematical Machines and Systems. 2010;1:25–39 (in Russ.).
  23. Zagoruiko N.G., Kutnenko O.A. Recognition methods based on the AdDel algorithm. Pattern Recognition and Image Analysis. 2004;14(2):198–204.
published in Russian


 

  Copyright IMPB RAS © 2005-2024