Module 7 Overview

Theme

Evaluation for NLP systems

Essential Question

Why are output quality and factuality hard to measure?

Module Components

Book prose: conceptual framing, domain scenario, methods, and failure modes
Assignment: evidence-backed production of a specific artifact
Slides: presentation sequence for seminar or lecture delivery
Narration: spoken version of the slide flow
Instructor notes: facilitation plan, discussion prompts, and grading cues
Rubric: criteria for evaluating the module artifact
Notebook: executable lab aligned with the module theme using synthetic support messages, retrieval snippets, intent labels, and factuality checks

Module Artifact

An NLP evaluation packet covering task framing, retrieval and evaluation design, and deployment guardrails, focused on evaluation for NLP systems. Students create an evaluation set with rubrics and automated checks.
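
As a rough illustration of what "an evaluation set with rubrics and automated checks" might look like, the sketch below pairs one synthetic support message with an intent label, a retrieval snippet, and a simple string-based factuality check. Everything here is an assumption made for illustration: the names `EvalItem`, `RUBRIC`, `check_intent`, `check_facts`, and `run_checks`, the example data, and the substring-matching heuristic are not taken from the module's notebook.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EvalItem:
    """One synthetic evaluation case (all field names are hypothetical)."""
    message: str                 # synthetic support message
    expected_intent: str         # gold intent label
    snippet: str                 # retrieval snippet the reply should rely on
    required_facts: List[str] = field(default_factory=list)


# Hypothetical rubric for human graders; the automated checks below
# cover only part of what a rubric like this would ask for.
RUBRIC: Dict[str, str] = {
    "factuality": "Every claim in the reply is supported by the snippet.",
    "intent": "The reply addresses the customer's actual request.",
}


def check_intent(item: EvalItem, predicted_intent: str) -> bool:
    """Exact-match comparison against the gold intent label."""
    return predicted_intent == item.expected_intent


def check_facts(item: EvalItem, reply: str) -> bool:
    """Crude factuality proxy: each required fact string appears in the reply."""
    text = reply.lower()
    return all(fact.lower() in text for fact in item.required_facts)


def run_checks(item: EvalItem, predicted_intent: str, reply: str) -> Dict[str, bool]:
    """Run all automated checks for one item and return named results."""
    return {
        "intent_match": check_intent(item, predicted_intent),
        "facts_present": check_facts(item, reply),
    }


item = EvalItem(
    message="How long do I have to return my order?",
    expected_intent="returns",
    snippet="Orders can be returned within 30 days of delivery.",
    required_facts=["30 days"],
)

good = run_checks(item, "returns", "You can return your order within 30 days of delivery.")
bad = run_checks(item, "returns", "You can return your order within 90 days.")
print(good)
print(bad)
```

Substring matching is deliberately simplistic: it misses paraphrases and cannot detect unsupported extra claims, which is one concrete reason factuality is hard to measure automatically and why a human-graded rubric still matters alongside checks like these.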

Professional Setting

Students work as if advising a product team that is evaluating an NLP workflow before using it in customer-facing communication. Their work must be intelligible to a product manager, a support lead, a privacy reviewer, and a model evaluator.