Word Error Rate
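Word error rate (WER) is a common metric for evaluating speech recognition systems; lower values are better. The predicted text (the hypothesis) is first aligned with the reference (correct) text at the word level using dynamic programming, typically a minimum edit distance alignment. WER is then defined as WER = (S + D + I) / N, where S is the number of substitutions, D is the number of deletions, I is the number of insertions, and N is the number of words in the reference text. Because insertions are counted against the reference length, WER can exceed 1.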
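The definition can be illustrated with a minimal sketch in Python. It assumes whitespace tokenization, exact case-sensitive word matching, and unit cost for every substitution, deletion, and insertion; real evaluation tools usually normalize text first. The function names wer_counts and wer are hypothetical, chosen for this example only.

```python
def wer_counts(reference: str, hypothesis: str) -> tuple[int, int, int]:
    """Return (S, D, I) for one minimum-cost word-level alignment."""
    ref = reference.split()  # assumption: words are separated by whitespace
    hyp = hypothesis.split()

    # dp[i][j] = (cost, S, D, I) for aligning ref[:i] with hyp[:j]
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # only deletions can reach an empty hypothesis
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # only insertions can come from an empty reference

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                # Matching words cost nothing; carry the diagonal forward.
                dp[i][j] = dp[i - 1][j - 1]
                continue
            c, s, d, n_ins = dp[i - 1][j - 1]
            substitution = (c + 1, s + 1, d, n_ins)
            c, s, d, n_ins = dp[i - 1][j]
            deletion = (c + 1, s, d + 1, n_ins)
            c, s, d, n_ins = dp[i][j - 1]
            insertion = (c + 1, s, d, n_ins + 1)
            # Tuples compare by total cost first, so this picks a cheapest path.
            dp[i][j] = min(substitution, deletion, insertion)

    _, s, d, n_ins = dp[len(ref)][len(hyp)]
    return s, d, n_ins


def wer(reference: str, hypothesis: str) -> float:
    """Compute WER = (S + D + I) / N, with N the reference word count."""
    n = len(reference.split())
    if n == 0:
        raise ValueError("reference must contain at least one word")
    s, d, n_ins = wer_counts(reference, hypothesis)
    return (s + d + n_ins) / n


if __name__ == "__main__":
    ref_text = "the cat sat on the mat"
    hyp_text = "the cat sit on mat"
    print(wer_counts(ref_text, hyp_text))  # (1, 1, 0): one substitution, one deletion
    print(wer(ref_text, hyp_text))         # 2 errors over 6 reference words
```

In the example, "sat" is recognized as "sit" (one substitution) and the second "the" is dropped (one deletion), so WER = (1 + 1 + 0) / 6, or about 0.33. When several alignments share the minimum cost, the total S + D + I is the same for all of them, but the split among S, D, and I may depend on how ties are broken.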