Automatic Evaluation - Murnitur

Automatic evaluations in Murnitur AI are started directly from the user interface, which streamlines the process of assessing LLM performance.

Initial Setup

Upload a Test Dataset: Begin by uploading an evaluation dataset in CSV format. The file can contain any additional headers, but it must include the context and ground_truth columns listed below; retrieval_context is optional:
- context
- ground_truth
- retrieval_context (optional)
Download the Template: To ensure your dataset is correctly formatted, you can download the template here.
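
Before uploading, it can help to confirm locally that the CSV header contains the required columns. The following is a minimal sketch that assumes only the column names listed above; the function names (read_header, missing_required_columns) are hypothetical, and this check is a local convenience, not Murnitur's own validation.

```python
import csv
import io

# Column names come from the dataset requirements above. The check itself
# is an illustrative local pre-check, not part of Murnitur.
REQUIRED_COLUMNS = ("context", "ground_truth")
OPTIONAL_COLUMNS = ("retrieval_context",)


def read_header(csv_text: str) -> list[str]:
    """Return the first row of a CSV document with whitespace stripped."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty document yields an empty header
    return [name.strip() for name in header]


def missing_required_columns(csv_text: str) -> list[str]:
    """List the required columns that are absent from the CSV header."""
    header = set(read_header(csv_text))
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Extra headers such as "question" are allowed; only the required ones matter.
sample = "question,context,ground_truth\nWhat is 2+2?,Basic arithmetic,4\n"
print(missing_required_columns(sample))
```

To check a file on disk, pass its contents in, for example missing_required_columns(Path("dataset.csv").read_text(encoding="utf-8-sig")) with pathlib.Path imported; the file name here is a placeholder. An empty result means both required columns are present.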

Evaluation Run

Go to AI Evaluations in the sidebar and click on the New Evaluation Run button.

Choose Preset

Configure LLM Model

Select Evaluation Dataset

Choose Evaluation Metrics

Result
