Evaluate Text Result
The Evaluate Text Result node allows you to evaluate text outputs against specific criteria using Griptape's Eval Engine. This node is useful for validating AI-generated content, checking factual accuracy, or assessing the quality of text outputs.
Inputs
-
Examples (Property): Choose from preset examples or create your own evaluation
-
Options:
- Choose a preset..
- Paraphrase
- Factual
- Analogy
-
-
Input (Input/Property): The input text to be evaluated
- Supports multiline text input
-
Expected Output (Input/Property): The expected or reference output text
- Single line text input
-
Actual Output (Input/Property): The actual output text to be evaluated
- Single line text input
-
Criteria (Input/Property): The evaluation criteria to use
- Supports multiline text input
- Example: "Does the output accurately paraphrase the input without losing meaning?"
Outputs
-
Score (Output): A float value between 0 and 1 representing the evaluation score
- 1.0 indicates perfect match
- 0.0 indicates complete mismatch
-
Reason (Output): A detailed explanation of the evaluation result
- Provides feedback on why the score was given
- Explains any discrepancies found
Example Usage
Paraphrase Evaluation
Input: "The quick brown fox jumps over the lazy dog."
Expected Output: "A swift brown fox leaps above a sleeping dog."
Actual Output: "A fast fox jumps over a dog that's not awake."
Criteria: "Does the output accurately paraphrase the input without losing meaning?"
Factual Evaluation
Input: "The capital of France is Paris."
Expected Output: "Paris is the capital city of France."
Actual Output: "France's capital is Paris."
Criteria: "Is the output factually correct based on the input?"
Analogy Evaluation
Input: "A bird is to sky as a fish is to ______."
Expected Output: "water"
Actual Output: "concrete"
Criteria: "Does the output correctly complete the analogy?"
Notes
- The node uses Griptape's Eval Engine to perform the evaluation
- The evaluation is based on the provided criteria
- The score is normalized between 0 and 1
- The reason provides detailed feedback about the evaluation
- You can use preset examples or create your own custom evaluations