Emil Rijcken
Nov 7, 2021

--

Hi Silvia, thank you for your message and for sharing your GitHub link!

Do you know how NPMI is calculated here? The finding surprises me since C_v incorporates NPMI.
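
For reference, here is a minimal sketch of the usual NPMI definition, NPMI(w1, w2) = log(P(w1, w2) / (P(w1) P(w2))) / -log P(w1, w2). It assumes probabilities are estimated from boolean document co-occurrence, with no sliding window or smoothing. The helper names `npmi` and `topic_npmi` and the toy corpus are hypothetical, and the implementation in your repository may well differ, which is why I am asking.

```python
import math
from itertools import combinations


def npmi(docs, w1, w2):
    """NPMI of two words; docs is a list of sets of tokens."""
    n = len(docs)
    p1 = sum(w1 in d for d in docs) / n
    p2 = sum(w2 in d for d in docs) / n
    p12 = sum((w1 in d) and (w2 in d) for d in docs) / n
    if p12 == 0.0:
        return -1.0  # convention: the words never co-occur
    if p12 == 1.0:
        return 1.0  # degenerate case: both words occur in every document
    pmi = math.log(p12 / (p1 * p2))
    return pmi / -math.log(p12)


def topic_npmi(docs, top_words):
    """Mean NPMI over all pairs of a topic's top words (assumed aggregation)."""
    pairs = list(combinations(top_words, 2))
    return sum(npmi(docs, a, b) for a, b in pairs) / len(pairs)


# Toy corpus, for illustration only.
docs = [{"cat", "dog"}, {"cat", "dog"}, {"stock"}, {"stock", "market"}]
print(topic_npmi(docs, ["cat", "dog"]))
```

Under this definition, words that always appear together score 1, independent words score 0, and words that never co-occur score -1. Other choices of window, smoothing, or aggregation would give different numbers.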

Based on the link, I still prefer C_v. If NPMI is used as a topic coherence measure, it must also fix the other coherence 'dimensions' (segmentation, probability estimation, and aggregation). If that is the case, it is one of the configurations tested in Roeder's paper (https://dl.acm.org/doi/abs/10.1145/2684822.2685324?casa_token=-izNxeG935wAAAAA:eLas7MiX6Sa0MR96ftl1DkzWDrqs5Ht_8K-umnBETLndyvXryVG11WJAguBJ3NSCj-k44Wunk7SuGQ). In this paper, the authors compare all possible configurations on various datasets and find that C_v correlates most strongly with human interpretation. Different coherence scores should correlate positively with each other, but I favour the one with the highest correlation to human interpretation.
