Emil Rijcken
Nov 7, 2021

Hi Silvia, thank you for your message and for sharing your GitHub link!

Do you know how NPMI is calculated here? The finding surprises me since C_v incorporates NPMI.

Based on the link, I still prefer C_v. If NPMI is used as a topic coherence measure, it must also fix a configuration for the other coherence 'dimensions' (segmentation, probability estimation, and aggregation, alongside the confirmation measure). If that is the case, it is one of the configurations tested in Röder et al.'s paper (https://dl.acm.org/doi/abs/10.1145/2684822.2685324?casa_token=-izNxeG935wAAAAA:eLas7MiX6Sa0MR96ftl1DkzWDrqs5Ht_8K-umnBETLndyvXryVG11WJAguBJ3NSCj-k44Wunk7SuGQ). In that paper, the authors systematically compare the possible configurations on various datasets and find that C_v correlates most strongly with human interpretation. Although different coherence scores should correlate positively with one another, I favour the one that correlates most strongly with human interpretation.
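To make the 'dimensions' point concrete, here is a minimal sketch of one possible NPMI-based coherence configuration. It is an illustration under my own assumptions, not the calculation used in the linked repository or the exact setup from the paper: word pairs are scored one against one, probabilities are estimated from document-level co-occurrence (no sliding window), and pair scores are averaged with the arithmetic mean. The names `npmi` and `topic_npmi` are hypothetical.

```python
import math
from itertools import combinations


def npmi(p_i, p_j, p_ij):
    """Normalised PMI of two words from marginal and joint probabilities."""
    if p_ij == 0.0:
        return -1.0  # convention: the words never co-occur
    if p_ij == 1.0:
        return 1.0   # both words occur in every document; avoids log(1) = 0 in the denominator
    pmi = math.log(p_ij / (p_i * p_j))
    return pmi / -math.log(p_ij)


def topic_npmi(top_words, documents):
    """Mean NPMI over all pairs of a topic's top words.

    Assumed configuration: pairwise segmentation, document-level
    (boolean) probability estimation, arithmetic-mean aggregation.
    """
    docs = [set(doc) for doc in documents]
    n_docs = len(docs)

    def prob(*words):
        # Fraction of documents containing all the given words.
        return sum(all(w in doc for w in words) for doc in docs) / n_docs

    scores = [
        npmi(prob(a), prob(b), prob(a, b))
        for a, b in combinations(top_words, 2)
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    corpus = [
        ["cat", "dog"],
        ["cat", "dog"],
        ["car", "road"],
        ["car", "road"],
    ]
    print(topic_npmi(["cat", "dog"], corpus))
    print(topic_npmi(["cat", "car"], corpus))
```

Changing any one of these choices (for example, a sliding window instead of whole documents) gives a different 'NPMI' score, which is why I would want to know the exact configuration before comparing it with C_v.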

Written by Emil Rijcken

PhD candidate in Natural Language Processing