Automatic Classification of Computing Literatures via Article and Reference Correlation

Oluwafemi Oriola; Lawrence Ojo; Ojonoka Atawodi

doi:10.11648/j.ajcst.20220504.12

Peer-Reviewed

Published in American Journal of Computer Science and Technology (Volume 5, Issue 4)

Received: 17 September 2022 Accepted: 29 September 2022 Published: 21 October 2022

Abstract

Automatic literature classification via machine learning has attracted increasing attention in various research circles, especially the computing community, because of the availability of a large body of research articles in diverse fields. Existing works have largely drawn features from segments of articles, such as abstracts, contents, and metadata, with little or no attention to references. This paper posits that correlating article and reference features would enhance the performance of machine learning algorithms. We therefore exploited the correlation of the TF-IDF of articles and references, using association rule-based and cosine similarity-based correlation methods, for the classification of computing literature. We focused on the Adekunle Ajasin University Research Repository. The research articles in the database were labelled by experienced computing professionals according to the ACM and Denning taxonomies. Logistic Regression, Support Vector Machine, and Multilayer Perceptron Neural Network with N-gram features were explored as classifiers. For the ACM taxonomy, the highest accuracy and F1-score were 0.56 and 0.41, respectively, for association rule-based correlation; 0.62 and 0.51 for similarity-based correlation; and 0.59 and 0.46 for the existing article-based classification. For Denning’s taxonomy, the highest accuracy and F1-score were 0.41 and 0.40, respectively, for association rule-based correlation; 0.41 and 0.36 for similarity-based correlation; and 0.38 and 0.37 for the existing article-based classification. These results suggest that both correlation methods show better prospects than the popular abstract-based classification method for the automatic classification of computing literature.
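The cosine similarity-based correlation of article and reference TF-IDF named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, the tokenizer, or how the resulting score is passed to the classifiers, so everything below is an assumption: the smoothed IDF formula, the whitespace tokenizer, the function names, and the toy texts are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(documents):
    """Return one sparse {term: weight} dict per document.

    Uses relative term frequency and a smoothed IDF,
    log((1 + N) / (1 + df)) + 1, which is an assumption of this sketch.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data: each article text is paired with the concatenated
# text of its reference list at the same index.
articles = [
    "neural network training for image classification",
    "query optimization in relational database systems",
]
references = [
    "deep neural network models image recognition",
    "transaction processing relational database index",
]

# Fit TF-IDF over articles and references together so both share one vocabulary.
vectors = tfidf_vectors(articles + references)
article_vecs = vectors[:len(articles)]
reference_vecs = vectors[len(articles):]

# One similarity score per article, measured against its own references.
similarities = [
    cosine_similarity(a, r) for a, r in zip(article_vecs, reference_vecs)
]
# Score of the first article against the second article's references.
cross = cosine_similarity(article_vecs[0], reference_vecs[1])
```

In this toy data an article scores higher against its own reference list than against an unrelated one. A score of this kind could serve as an extra feature alongside N-gram features, though how the authors actually combine them is not stated in the abstract.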

Published in	American Journal of Computer Science and Technology (Volume 5, Issue 4)
DOI	10.11648/j.ajcst.20220504.12
Page(s)	204-209
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2022. Published by Science Publishing Group

Keywords

Computing, Research Articles, Machine Learning, Classification, Reference Features

Cite This Article

APA Style

Oluwafemi Oriola, Lawrence Ojo, Ojonoka Atawodi. (2022). Automatic Classification of Computing Literatures via Article and Reference Correlation. American Journal of Computer Science and Technology, 5(4), 204-209. https://doi.org/10.11648/j.ajcst.20220504.12

ACS Style

Oluwafemi Oriola; Lawrence Ojo; Ojonoka Atawodi. Automatic Classification of Computing Literatures via Article and Reference Correlation. Am. J. Comput. Sci. Technol. 2022, 5(4), 204-209. doi: 10.11648/j.ajcst.20220504.12

AMA Style

Oluwafemi Oriola, Lawrence Ojo, Ojonoka Atawodi. Automatic Classification of Computing Literatures via Article and Reference Correlation. Am J Comput Sci Technol. 2022;5(4):204-209. doi: 10.11648/j.ajcst.20220504.12

@article{10.11648/j.ajcst.20220504.12,
  author = {Oluwafemi Oriola and Lawrence Ojo and Ojonoka Atawodi},
  title = {Automatic Classification of Computing Literatures via Article and Reference Correlation},
  journal = {American Journal of Computer Science and Technology},
  volume = {5},
  number = {4},
  pages = {204-209},
  doi = {10.11648/j.ajcst.20220504.12},
  url = {https://doi.org/10.11648/j.ajcst.20220504.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajcst.20220504.12},
 year = {2022}
}

TY  - JOUR
T1  - Automatic Classification of Computing Literatures via Article and Reference Correlation
AU  - Oluwafemi Oriola
AU  - Lawrence Ojo
AU  - Ojonoka Atawodi
Y1  - 2022/10/21
PY  - 2022
N1  - https://doi.org/10.11648/j.ajcst.20220504.12
DO  - 10.11648/j.ajcst.20220504.12
T2  - American Journal of Computer Science and Technology
JF  - American Journal of Computer Science and Technology
JO  - American Journal of Computer Science and Technology
SP  - 204
EP  - 209
PB  - Science Publishing Group
SN  - 2640-012X
UR  - https://doi.org/10.11648/j.ajcst.20220504.12
VL  - 5
IS  - 4
ER  -

Author Information

Oluwafemi Oriola
Department of Computer Science, Adekunle Ajasin University, Akungba-Akoko, Nigeria

Lawrence Ojo
Department of Computer Science, Adekunle Ajasin University, Akungba-Akoko, Nigeria

Ojonoka Atawodi
School of Computing, University of Southern Mississippi, Hattiesburg, US
