Module 1: Text preprocessing and linguistic signals
AINS6004 — Natural Language Processing
Essential Question
What is lost and gained when language becomes data?
Scenario
A product team is evaluating an NLP workflow before using it in customer-facing communication.
Stakeholders: product manager, support lead, privacy reviewer, and model evaluator
Core Moves
Define the decision boundary
Compare baseline and alternative
Interpret evidence and assumptions
Identify failure modes
Recommend next action
Lab & Assignment
Compare tokenization choices on a small corpus.
Artifact: an NLP evaluation packet covering task framing, retrieval/evaluation design, and deployment guardrails, focused on text preprocessing and linguistic signals and grounded in the tokenization comparison.
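One possible starting point for the tokenization comparison is sketched below. It is a minimal illustration and not a prescribed lab solution: the three-message corpus is invented, the function names (`whitespace_tokenize`, `regex_tokenize`, `char_tokenize`, `summarize`) are hypothetical, and the regular expression is only one of many reasonable word-level rules. It uses the Python standard library alone and reports token count, vocabulary size, and type-token ratio for each choice.

```python
import re
from collections import Counter

# Hypothetical mini-corpus of support messages (invented for illustration).
CORPUS = [
    "Hi, I can't log in to my account!",
    "My order #1234 hasn't arrived yet.",
    "Thanks, the re-set link worked.",
]


def whitespace_tokenize(text):
    """Split on runs of whitespace; punctuation stays attached to words."""
    return text.split()


def regex_tokenize(text):
    """Lowercase, keep words with internal apostrophes or hyphens whole,
    and emit every other non-space character as its own token."""
    pattern = r"[a-z0-9]+(?:['-][a-z0-9]+)*|[^\sa-z0-9]"
    return re.findall(pattern, text.lower())


def char_tokenize(text):
    """Treat every non-whitespace character as a token."""
    return [ch for ch in text if not ch.isspace()]


def summarize(tokenizer, corpus):
    """Return token count, vocabulary size, and type-token ratio."""
    counts = Counter(tok for doc in corpus for tok in tokenizer(doc))
    n_tokens = sum(counts.values())
    n_types = len(counts)
    return {
        "tokens": n_tokens,
        "types": n_types,
        "ttr": round(n_types / n_tokens, 3),
    }


if __name__ == "__main__":
    tokenizers = [whitespace_tokenize, regex_tokenize, char_tokenize]
    for tokenizer in tokenizers:
        print(tokenizer.__name__, summarize(tokenizer, CORPUS))
    # Show how one sentence differs across the three choices.
    for tokenizer in tokenizers:
        print(tokenizer.__name__, tokenizer(CORPUS[0]))
```

Comparing the summaries side by side connects back to the essential question: whitespace splitting keeps "account!" and "My"/"my" as distinct types, the regex rule merges case and separates punctuation, and character tokens shrink the vocabulary while discarding word boundaries. Which of those losses is acceptable depends on the decision the product team in the scenario needs to make.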