Variable: axEvalUtil

const axEvalUtil: object

Type declaration

emScore()

emScore: (prediction, groundTruth) => boolean

Calculates the Exact Match (EM) score between a prediction and ground truth.

The EM score is a strict metric that checks whether the predicted answer matches the ground truth exactly. It is commonly used in tasks such as question answering.

Parameters

prediction: string

The predicted text.

groundTruth: string

The actual correct text.

Returns

boolean

A boolean indicating if the prediction exactly matches the ground truth.
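As an illustration of the idea, the following is a minimal sketch of an exact-match check, not the library's implementation. The names `normalizeSketch` and `exactMatchSketch` are hypothetical, and the normalization step (lowercasing, dropping ASCII punctuation, collapsing whitespace) is an assumption; this page does not state whether `emScore` normalizes its inputs before comparing them.

```typescript
// Hypothetical helper: lowercase, replace ASCII punctuation with spaces,
// and collapse runs of whitespace. This normalization is an assumption.
function normalizeSketch(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical stand-in for an exact-match metric:
// true only when the normalized strings are identical.
function exactMatchSketch(prediction: string, groundTruth: string): boolean {
  return normalizeSketch(prediction) === normalizeSketch(groundTruth);
}

// Usage: differences in case and punctuation are ignored, different words are not.
const sameAnswer = exactMatchSketch("Paris.", "paris");
const differentAnswer = exactMatchSketch("Paris", "London");
```

Under these assumptions, a prediction that differs from the ground truth by even one word scores as a miss, which is why EM is usually reported alongside a softer metric such as F1.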

f1Score()

f1Score: (prediction, groundTruth) => number

Calculates the F1 score between a prediction and ground truth.

The F1 score is the harmonic mean of precision and recall. It is widely used in NLP because it accounts for both false positives and false negatives, giving a single balanced measure of how closely a prediction matches the ground truth.

Parameters

prediction: string

The predicted text.

groundTruth: string

The actual correct text.

Returns

number

The F1 score as a number.
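A token-level F1 score of this kind can be sketched as follows. This is an assumption-laden illustration rather than the library's code: `tokenizeSketch` and `tokenF1Sketch` are hypothetical names, and splitting on whitespace after lowercasing is an assumed tokenization. Precision is the share of predicted tokens found in the ground truth, recall is the share of ground-truth tokens found in the prediction, and F1 is 2 * precision * recall / (precision + recall).

```typescript
// Hypothetical tokenizer: lowercase and split on whitespace (an assumption).
function tokenizeSketch(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
}

// Hypothetical token-overlap F1 between a prediction and a ground truth.
function tokenF1Sketch(prediction: string, groundTruth: string): number {
  const predTokens = tokenizeSketch(prediction);
  const truthTokens = tokenizeSketch(groundTruth);
  if (predTokens.length === 0 || truthTokens.length === 0) return 0;

  // Count ground-truth tokens so a repeated word is matched
  // at most as many times as it occurs in the ground truth.
  const remaining: Record<string, number> = Object.create(null);
  for (const t of truthTokens) remaining[t] = (remaining[t] ?? 0) + 1;

  let overlap = 0;
  for (const t of predTokens) {
    const left = remaining[t] ?? 0;
    if (left > 0) {
      overlap += 1;
      remaining[t] = left - 1;
    }
  }
  if (overlap === 0) return 0;

  const precision = overlap / predTokens.length;
  const recall = overlap / truthTokens.length;
  return (2 * precision * recall) / (precision + recall);
}

// Usage: "the cat" vs "the cat sat" has precision 1 and recall 2/3.
const partialScore = tokenF1Sketch("the cat", "the cat sat");
```

With precision 1 and recall 2/3, the usage example evaluates to approximately 0.8, showing how F1 gives partial credit where an exact-match check would not.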

novelF1ScoreOptimized()

novelF1ScoreOptimized: (history, prediction, groundTruth, returnRecall) => number

Calculates a novel F1 score, taking into account a history of interaction and excluding stopwords.

This metric extends the F1 score by taking the preceding context into account and filtering out common words that could skew the assessment of the prediction’s quality. It is intended for conversational models and other cases where historical context is relevant.

Parameters

history: string

The historical context or preceding interactions.

prediction: string

The predicted text.

groundTruth: string

The actual correct text.

returnRecall: boolean = false

If true, return the recall score instead of the F1 score.

Returns

number

The novel F1 score, or the recall score when returnRecall is true.
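One plausible reading of this metric is sketched below, under stated assumptions: tokens that are stopwords or that already appear in the history are dropped from both the prediction and the ground truth, and F1 (or recall) is computed on what remains. The names `novelTokensSketch`, `STOPWORDS_SKETCH`, and `novelF1Sketch` are hypothetical, the stopword list is a small illustrative one, and the library's actual filtering rules may differ.

```typescript
// Small illustrative stopword list (hypothetical; not the library's list).
const STOPWORDS_SKETCH = ["a", "an", "the", "is", "are", "of", "to", "in", "and", "it"];

// Hypothetical tokenizer: lowercase, drop ASCII punctuation, split on whitespace.
function novelTokensSketch(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((t) => t.length > 0);
}

// Hypothetical "novel" F1: only tokens that are neither stopwords
// nor already present in the history count toward the score.
function novelF1Sketch(
  history: string,
  prediction: string,
  groundTruth: string,
  returnRecall: boolean = false
): number {
  const excluded: Record<string, boolean> = Object.create(null);
  for (const t of STOPWORDS_SKETCH.concat(novelTokensSketch(history))) excluded[t] = true;

  const predTokens = novelTokensSketch(prediction).filter((t) => !excluded[t]);
  const truthTokens = novelTokensSketch(groundTruth).filter((t) => !excluded[t]);
  if (predTokens.length === 0 || truthTokens.length === 0) return 0;

  // Counted overlap, so repeated tokens are not over-credited.
  const remaining: Record<string, number> = Object.create(null);
  for (const t of truthTokens) remaining[t] = (remaining[t] ?? 0) + 1;
  let overlap = 0;
  for (const t of predTokens) {
    const left = remaining[t] ?? 0;
    if (left > 0) {
      overlap += 1;
      remaining[t] = left - 1;
    }
  }
  if (overlap === 0) return 0;

  const precision = overlap / predTokens.length;
  const recall = overlap / truthTokens.length;
  return returnRecall ? recall : (2 * precision * recall) / (precision + recall);
}

// Usage: words repeated from the question earn no credit; only "paris" is novel.
const historyText = "Where is the Eiffel Tower?";
const novelScore = novelF1Sketch(historyText, "The Eiffel Tower is in Paris", "Paris, France");
const novelRecall = novelF1Sketch(historyText, "The Eiffel Tower is in Paris", "Paris, France", true);
```

In this sketch a prediction that only restates the history scores 0, which is the behavior the description suggests: the metric rewards information that is new relative to the conversation so far.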

Defined in

src/ax/dsp/eval.ts:143