Publication details

2021, ...SUMMER SCHOOL FRANCESCO TURCO. PROCEEDINGS

Using Natural Language Processing to uncover main topics in defect recognition literature (04c Conference paper in journal)

Bernabei M., Colabianchi S., Costantino F., Patriarca R.

Defect detection is particularly important in plant engineering, where high-quality production depends on minimizing the number of defective parts. In recent years, interest in the subject has grown considerably, and many methods and approaches have been proposed for defect recognition. As a result, researchers working on defect recognition face a growing number of articles, which slows them down in identifying the articles relevant to their interests. This work aims to provide a baseline classification of articles based on emerging issues such as the investigated material, the production typology in which the material is included, and the type of analysis to be performed. To this end, the paper proposes an automatic solution based on text mining techniques. Specifically, the study applies Natural Language Processing (NLP) to articles’ titles, abstracts, and keywords using two different approaches: the K-Means clustering algorithm and Latent Dirichlet Allocation (LDA). K-Means is used to cluster the collection of documents into related groups based on the content of the individual documents. LDA, instead, is used to classify the papers through topic modeling. Articles were collected from the Scopus database. The scope of the research is limited to journal and conference articles published in English, excluding articles classified as reviews as well as book chapters, books, notes, and errata.
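The K-Means step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the four-document corpus, the whitespace tokenisation, the smoothed IDF weighting, the fixed initial centroids, and the helper names `tfidf_vectors` and `kmeans` are all assumptions made for illustration, and only the standard library is used. The LDA step is not covered here.

```python
import math
from collections import Counter

# Hypothetical toy corpus; these are NOT records from the study's Scopus collection.
docs = [
    "steel surface defect detection with convolutional neural network",
    "weld defect detection in steel using neural network",
    "fabric defect inspection by texture image analysis",
    "texture analysis for textile fabric defect inspection",
]

def tfidf_vectors(texts):
    """Return L2-normalised TF-IDF vectors (lists of floats) and the vocabulary."""
    tokenised = [t.lower().split() for t in texts]
    vocab = sorted({w for toks in tokenised for w in toks})
    n = len(texts)
    # Document frequency: number of texts in which each word occurs.
    df = Counter(w for toks in tokenised for w in set(toks))
    # Smoothed IDF (an assumed weighting, chosen for this sketch).
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)
        vec = [tf[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors, vocab

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, init_indices, max_iter=100):
    """Lloyd's algorithm; centroids start at the documents given by init_indices."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

vectors, vocab = tfidf_vectors(docs)
# Fixed starting points keep the sketch deterministic (an assumption, not the paper's setup).
labels, centroids = kmeans(vectors, init_indices=[0, 2])

# Show the three highest-weighted terms of each cluster centroid.
for c, centroid in enumerate(centroids):
    top = sorted(zip(centroid, vocab), reverse=True)[:3]
    print(c, [w for _, w in top])
```

On this toy corpus the two steel-related documents fall into one cluster and the two fabric-related documents into the other. A real literature collection would need proper tokenisation, stop-word removal, and a principled choice of the number of clusters.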

Research group: Industrial systems engineering
