Unsupervised Machine Learning Approach for Summarizing Textual Information
April 25, 2023
How effective can a machine learning algorithm be at discovering patterns in textual data, and how comparable are its results to those of emergent qualitative coding? According to a recent study published in the International Journal of Qualitative Methods, “Emergent Coding and Topic Modeling: A Comparison of Two Qualitative Analysis Methods on Teacher Focus Group Data,” Topic Modeling (TM) is a viable method for coding qualitative data quickly. Westat researchers Atsushi Miyaoka, MA, Lauren Decker-Woodrow, PhD, and Nancy Hartman, PhD, co-authored this research.
The key findings of their research are that TM and emergent qualitative coding showed a high level of agreement, that TM did not capture the more nuanced information that qualitative coding did, that TM can be used before qualitative analysis to identify nodes, and that TM can be used after qualitative analysis to add validity.
“The rapid nature of technology allowed us for faster coding. We saw how effective Topic Modeling is in processing and summarizing findings. Furthermore, we used R, an open-source software (i.e., no fees or licenses), so there is no cost loss,” notes Atsushi Miyaoka.