Evaluation Guide

Overview

The Videlicet evaluation feature allows you to measure and score the difference between a manually translated page and a machine-translated page. It is assumed that the manually translated page is of higher quality than the machine-translated page. Based on the scores, you can adjust your prompt and rerun the evaluation to generate a more accurate translation. Videlicet uses Levenshtein distance to measure the difference between the two pages.
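To illustrate the metric, the sketch below shows one common way a Levenshtein edit distance can be normalized into a percentage similarity score. This is an illustrative example only; the function names are hypothetical and do not reflect Videlicet's actual scoring implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                 # deletion
                curr[j - 1] + 1,             # insertion
                prev[j - 1] + (ca != cb),    # substitution (free if chars match)
            ))
        prev = curr
    return prev[-1]

def similarity(reference: str, candidate: str) -> float:
    """Normalize the distance to a 0-100% similarity score."""
    if not reference and not candidate:
        return 100.0
    dist = levenshtein(reference, candidate)
    return 100.0 * (1 - dist / max(len(reference), len(candidate)))
```

Under this normalization, a score of 83% would mean the machine translation differs from the manual translation by edits totaling roughly 17% of the longer page's length.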

How to create an evaluation

Navigate to the Evals tab on the main toolbar. This page lists all previously created evaluations. To create a new evaluation, select the “New Eval” button.

Select the prompt to use for evaluation

Once in the Evals screen, select the Prompt, Scorers, and Extraction to use for the evaluation.

The Prompt is a previously generated prompt found in the Prompts screen. The label associated with this prompt will already have a manually translated page.

Select the extraction that you want to use with the evaluation.

You can also view extractions on the Prompt page.

Ensure that the page has a manual translation by viewing it on the Browse page.

After you run the evaluation by clicking the “Evaluate” button, the results appear below.

In this example, the average Levenshtein similarity score was 83%. You can modify the prompt and rerun the evaluation to produce a more accurate translation.