Mid-Term Project - DATS 2102: Data Visualization for Data Science

A project assignment for the first half of the course (Weeks 1–6).


🎯 Objectives

By mid-semester, you will:

  • Apply foundational data visualization techniques (Weeks 1–6).
  • Demonstrate mastery of:
    • environment setup & reproducible notebooks,
    • tidy data principles & visual encodings,
    • distributions & variation,
    • wrangling with pandas,
    • perception-based design principles,
    • fair and effective comparisons.
  • Produce a mini data story using 2–3 datasets.

📖 Project Description

Select a real-world dataset (from provided sources or external datasets of interest). Using the tools and concepts learned in the first six weeks, create a narrative notebook that:

  1. Introduces the dataset and research question(s).
  2. Cleans, reshapes, and prepares the data for visualization, demonstrating core pandas wrangling: selection/filtering, sorting, grouping + aggregation, joins/merges, and tidy reshaping.
  3. Produces at least 6 visualizations (aim for 6–8), including:
    • At least one distribution plot (histogram/KDE/boxplot/ECDF).
    • At least one comparison plot (dot plot, slope chart, or small multiples).
    • At least one of your own visualizations revised and improved by reflecting on perception principles, showing how thoughtful design choices enhance clarity and fairness.
    • At least one visualization with clear text/labels/annotations.
  4. Applies best practices for choice of color, scales, and labeling.
  5. Provides a written narrative explaining insights, choices, and design considerations.
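
The wrangling operations named in step 2 can be sketched on a toy example. This is only an illustration, not a template for the project: the tables `scores` and `sections` and all column names are hypothetical and stand in for whatever dataset you choose.

```python
import pandas as pd

# Hypothetical toy tables; names and values are illustrative only.
scores = pd.DataFrame({
    "student": ["a", "b", "c", "d"],
    "section": ["s1", "s1", "s2", "s2"],
    "quiz1": [8, 6, 9, 7],
    "quiz2": [7, 9, 10, 5],
})
sections = pd.DataFrame({
    "section": ["s1", "s2"],
    "instructor": ["Lee", "Kim"],
})

# Selection/filtering: keep rows meeting a condition and only the needed columns.
passed = scores.loc[scores["quiz1"] >= 7, ["student", "section", "quiz1"]]

# Tidy reshaping: wide (one column per quiz) to long (one row per observation).
tidy = scores.melt(
    id_vars=["student", "section"],
    value_vars=["quiz1", "quiz2"],
    var_name="quiz",
    value_name="score",
)

# Join/merge: attach section-level attributes; validate guards against
# accidental row duplication from a non-unique key in `sections`.
tidy = tidy.merge(sections, on="section", how="left", validate="many_to_one")

# Grouping + aggregation: one summary row per instructor and quiz.
summary = tidy.groupby(["instructor", "quiz"], as_index=False).agg(
    mean_score=("score", "mean"),
    n=("score", "size"),
)

# Sorting: order the summary for display.
summary = summary.sort_values("mean_score", ascending=False).reset_index(drop=True)
print(summary)
```

The long-format `tidy` table is usually the more convenient input for plotting libraries, since each visual encoding can map to a single column.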

📦 Deliverables

  • Jupyter Notebook with all code, markdown explanations, and charts.
  • Rendered HTML file (via Quarto).
  • A short reflective essay (300–500 words) addressing:
    • What challenges did you face in cleaning/visualizing the data?
    • How did perception/design principles guide your choices?
    • Which visualization best communicates your main insight, and why?

📊 Suggested Datasets


πŸ—“οΈ Timeline

  • Final Submission (Deadline: March 22)

🧾 Grading Rubric (20 pts total)

  • Data Wrangling & Preparation (4 pts): Appropriate cleaning, filtering, and reshaping.
  • Variety of Visualizations (5 pts): Includes required chart types; demonstrates range.
  • Application of Principles (4 pts): Perception, scales, baselines, labeling.
  • Narrative & Reflection (4 pts): Clear storyline; thoughtful discussion of design choices.
  • Technical Quality (3 pts): The notebook runs cleanly, is reproducible, and is well-organized.
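
As one possible reading of the "Application of Principles" criterion (scales, baselines, labeling), the sketch below draws a bar chart with a zero baseline, labeled axes with units, and direct value labels. It assumes matplotlib as the plotting library, and the group names and values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script; omit in a notebook
import matplotlib.pyplot as plt

# Hypothetical values, for illustration only.
groups = ["A", "B", "C"]
values = [42, 45, 39]

fig, ax = plt.subplots(figsize=(5, 3))
bars = ax.bar(groups, values, color="#4c72b0")

# Baseline: bar length encodes the value, so the y-axis starts at zero.
ax.set_ylim(0, max(values) * 1.15)

# Labeling: axis titles with units and a descriptive chart title.
ax.set_xlabel("Group")
ax.set_ylabel("Mean score (points)")
ax.set_title("Mean score by group")

# Direct labels above each bar reduce the need to read values off the axis.
for bar, value in zip(bars, values):
    ax.annotate(
        f"{value}",
        xy=(bar.get_x() + bar.get_width() / 2, value),
        xytext=(0, 3),
        textcoords="offset points",
        ha="center",
    )

# Remove non-data ink that does not aid comparison.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

fig.tight_layout()
```

A truncated y-axis would exaggerate the small differences between these groups; if those differences are the point, a dot plot is generally a fairer choice than bars on a non-zero baseline.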

✅ Submission Checklist

Before submitting, make sure:

  • Your assignment fulfills all the requirements listed above.
  • The visualizations and your reflections in the Jupyter notebook are properly organized and displayed.
  • You have used Quarto to render the notebook to HTML and zipped the files for submission.