Abstract

IMPORTANCE: The marked growth and fragmentation of bibliographic databases, large parts of which pertain to medical subspecialties, has created an opportunity to identify new areas of research using citations at the interface of subspecialty information. Bibliographic databases such as PubMed are useful to researchers who wish to identify specific citations of interest. However, because of their size, they are not well suited to identifying new areas of research.
OBJECTIVE: To present a method and two computer applications that identify areas for new research by finding abstracts at the interface between subspecialty parts of PubMed.
DESIGN: We present a new method and computer applications that aim to ameliorate the problem by examining all PubMed abstracts that match general search terms. After text-mining algorithms extract all non-trivial words from the abstracts, the researcher can repeatedly cluster the publications by the words their abstracts have in common, to find unusual or unexpected combinations of words that may lead to new research. When single words are not descriptive enough to identify unique and unexpected ideas for potential new research, we allow the extraction of principal phrases from those abstracts instead. We define a principal phrase as a phrase that is common by itself (i.e., not common only as part of another common phrase), does not cross punctuation marks, and is informative (e.g., "and this disease" is not an informative phrase).
FINDINGS: We present four examples of identifying new research areas by examining PubMed results for the searches "takotsubo", "embolic stroke" excluding "atrial fibrillation", "impedance mismatch", and "aortic and stenosis". New areas of research were identified, including comparisons of the clinical picture and pathophysiology of Takotsubo with scorpion envenomation, and the importance of impedance mismatch in the pulmonary and renal circulation.
CONCLUSION AND RELEVANCE: We have developed a method and two computer applications that mine words and/or principal phrases from abstracts retrieved from PubMed or other databases to identify new ideas for research.
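
The principal-phrase definition in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the `min_count` threshold, the `max_len` limit, the rule that an informative phrase neither starts nor ends with a stop word, and the use of per-abstract counts to decide whether a phrase is "common by itself" are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical list of trivial words; the abstract does not give the actual list.
STOP_WORDS = {"a", "an", "and", "the", "this", "that", "of", "in", "on",
              "with", "to", "is", "was", "for", "by", "when"}


def candidate_phrases(text, max_len=4):
    """Yield word n-grams (2..max_len words) that do not cross punctuation."""
    for segment in re.split(r'[.,;:!?()\[\]"]+', text.lower()):
        words = re.findall(r"[a-z0-9'-]+", segment)
        for n in range(2, max_len + 1):
            for i in range(len(words) - n + 1):
                yield tuple(words[i:i + n])


def is_informative(phrase):
    """Assumed heuristic: reject phrases that start or end with a stop word."""
    return phrase[0] not in STOP_WORDS and phrase[-1] not in STOP_WORDS


def contains(longer, shorter):
    """Return True if `shorter` occurs as a contiguous run inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def principal_phrases(abstracts, min_count=2, max_len=4):
    """Return {phrase: number of abstracts containing it} for principal phrases."""
    counts = Counter()
    for abstract in abstracts:
        # Count each phrase once per abstract (document frequency).
        counts.update(set(candidate_phrases(abstract, max_len)))
    common = {p: c for p, c in counts.items()
              if c >= min_count and is_informative(p)}
    result = {}
    for p, c in common.items():
        # Approximation of "common by itself": drop p when a longer common
        # phrase contains it and appears in exactly as many abstracts.
        absorbed = any(len(q) > len(p) and cq == c and contains(q, p)
                       for q, cq in common.items())
        if not absorbed:
            result[" ".join(p)] = c
    return result
```

With two abstracts that both mention "aortic valve stenosis", the shorter phrases "aortic valve" and "valve stenosis" are absorbed by the longer one, and "and this disease" is rejected as uninformative because it starts with a stop word.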

Original language: English (US)
Title of host publication: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2020
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450379649
DOIs
State: Published - Sep 21 2020
Event: 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2020 - Virtual, Online, United States
Duration: Sep 21 2020 to Sep 24 2020

Publication series

Name: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2020

Conference

Conference: 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2020
Country/Territory: United States
City: Virtual, Online
Period: 9/21/20 to 9/24/20

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Software
  • Biomedical Engineering
  • Health Informatics

Keywords

  • Abstracts
  • Clustering
  • Phrase Mining
  • Phrases
  • Text Mining
