Discipline: Computer Sciences and Information Management
Subcategory: Computer Science & Information Systems
Terrance Lagree - Morgan State University
Co-Author(s): Martina Taylor
Medical images of diverse modalities are essential sources of information for research and education, and they are available in large numbers in online repositories, commonly known as image atlases, and in the biomedical literature. Beyond their use in clinical settings, authors of journal articles frequently use images to illustrate important concepts. To enable effective search of the diverse images presented in medical journal articles, it would be advantageous for a retrieval system to first automatically identify the image type or modality (e.g., X-ray, MRI, ultrasound). Successful classification of images would greatly enhance the performance of the retrieval system by filtering out irrelevant images. To date, only a few biomedical literature-based retrieval systems allow users to limit search results to a particular modality; moreover, the modality is typically extracted from the caption and is often missing or incorrect. In addition to text features, features based on image content, such as color, texture, and shape, might improve classification or modality-detection accuracy because of their complementary nature.
Motivated by the successful use of text and image content features in other domains, such as natural photographic images on Flickr, we propose a multimodal biomedical image classification approach for collections of full-text journal articles. The proposed approach uses text and image features extracted from the relevant components of journal articles and feeds the resulting feature vector to a supervised-learning-based classification scheme. For text features, keywords are extracted from the image caption and from the title and abstract of the journal article, and each image is represented as a ‘bag of words’ in the vector-space model of information retrieval. For content-based image features, various color- and texture-related features are extracted as histograms and moments to represent images perceptually. Finally, the image and text features are combined into a multimodal feature vector that serves as input to a support vector machine (SVM) classifier. Classification was performed with WEKA (an open-source machine learning and data mining toolkit written in Java), and results were evaluated on a collection of 9,500 images of eight different modalities (e.g., CT, X-ray, MRI), with multimodal features yielding improved accuracies of more than 90%. Our results suggest that combining text and content-based image features provides better classification accuracy than using either feature type alone. In the future, we will explore more advanced feature extraction and representation approaches and investigate multiple classifiers and classifier-combination schemes with different feature inputs to further improve modality detection.
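As a rough illustration of this fusion pipeline, the sketch below extracts a simple HSV color histogram and gray-level moments from each image, builds bag-of-words text features from the captions, concatenates the two vectors, and trains an SVM with a hold-out evaluation. It is written in Python with scikit-learn and OpenCV rather than WEKA, and the captions, image paths, labels, and parameter values are hypothetical placeholders, not the actual data or settings used in this study.

# Illustrative sketch of the multimodal (text + image) fusion approach.
# Assumptions: Python with scikit-learn and OpenCV instead of WEKA; the
# captions, image paths, labels, and parameters below are placeholders.
import numpy as np
import cv2
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def color_texture_features(image_path, bins=8):
    """HSV color histogram plus gray-level moments (mean, std, skewness)
    as simple perceptual color/texture descriptors."""
    img = cv2.imread(image_path)
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, [bins] * 3,
                        [0, 180, 0, 256, 0, 256]).flatten()
    hist /= hist.sum() + 1e-9                     # normalize the histogram
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).astype(np.float64)
    moments = np.array([gray.mean(), gray.std(),
                        ((gray - gray.mean()) ** 3).mean()])
    return np.concatenate([hist, moments])

# Placeholder inputs: one caption (with article title/abstract keywords
# appended), one image path, and one modality label per figure.
captions = ["axial chest CT lung nodule ...",
            "posteroanterior chest X-ray ...",
            "abdominal CT with contrast ...",
            "X-ray of fracture, left femur ..."]
image_paths = ["figs/img1.png", "figs/img2.png",
               "figs/img3.png", "figs/img4.png"]
labels = ["CT", "X-ray", "CT", "X-ray"]

# Text features: bag-of-words (TF-IDF weighted) in the vector-space model.
vectorizer = TfidfVectorizer(stop_words="english", max_features=2000)
X_text = vectorizer.fit_transform(captions).toarray()

# Content-based image features.
X_image = np.vstack([color_texture_features(p) for p in image_paths])

# Early fusion: concatenate text and image vectors into one multimodal vector.
X = np.hstack([X_text, X_image])

# Supervised SVM classification and a simple hold-out evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, stratify=labels, random_state=0)
clf = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))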
References: T. M. Lehmann, M. O. Güld, T. Deselaers, D. Keysers, H. Schubert, K. Spitzer, H. Ney, and B. B. Wein, “Automatic categorization of medical images for content-based retrieval and data mining,” Comput. Med. Imag. Graph., vol. 29, pp. 143–155, 2005.
H. Müller, A. G. S. de Herrera, J. Kalpathy-Cramer, D. Demner-Fushman, S. Antani, and I. Eggel, “Overview of the ImageCLEF 2012 medical image retrieval and classification tasks,” in CLEF 2012 Working Notes (Online Working Notes/Labs/Workshop), 2012.
Funder Acknowledgement(s): This study was supported, in part, by a grant from NSF HBCU-UP awarded to Md Mahmudur Rahman, Assistant Professor, Computer Science Department, Morgan State University, Baltimore, Maryland.
Faculty Advisor: Md Mahmudur Rahman, md.rahman@morgan.edu
Role: I extracted and analyzed the color and texture features from the images in the dataset, performed the classification experiments with WEKA, and evaluated the results.