Evals
Automatic Evaluation
Automatic evaluations in Murnitur AI are initiated directly from the user interface, providing a streamlined process for assessing LLM performance.
Initial Setup
- Upload a Test Dataset: Begin by uploading an evaluation dataset in CSV format. The file can contain additional headers of any kind, but it must include the following columns:
  - context
  - ground_truth
  - retrieval_context (optional)
- Download the Template: To ensure your dataset is correctly formatted, you can download the template here.
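Before uploading, it can help to check the CSV header locally. The following is a minimal sketch, not part of Murnitur AI: the function name `missing_columns` is hypothetical, and it assumes that `context` and `ground_truth` are required while `retrieval_context` is the only optional column.

```python
import csv
import io
from typing import List

# Assumption: these two columns are mandatory; retrieval_context is optional.
REQUIRED_COLUMNS = {"context", "ground_truth"}


def missing_columns(csv_text: str) -> List[str]:
    """Return the required column names absent from the CSV header, sorted."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return sorted(REQUIRED_COLUMNS - present)


if __name__ == "__main__":
    # Hypothetical sample: an extra "id" header is allowed alongside the required ones.
    sample = "id,context,ground_truth\n1,Some context,Expected answer\n"
    missing = missing_columns(sample)
    if missing:
        print("Missing required columns:", ", ".join(missing))
    else:
        print("Header looks valid.")
```

To check a real file, read its contents with `open(path, newline="", encoding="utf-8").read()` and pass the text to `missing_columns`. The check covers column names only, not the values in each row.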
Evaluation Run
Go to AI Evaluations in the sidebar and click the New Evaluation Run button. The run then proceeds through the following steps:
- Choose Preset
- Configure LLM Model
- Select Evaluation Dataset
- Choose Evaluation Metrics
- Result