Capability target
Design and critique an ML analysis pipeline for connectomics that includes feature rationale, evaluation plan, leakage controls, and interpretation limits.
Why this module matters
ML can accelerate connectomics analysis, but naive workflows produce misleading biological claims. This module emphasizes model validity, not just model performance.
Concept set
1) Feature engineering defines the hypothesis space
Technical: feature choices encode assumptions about what variation is biologically meaningful.
Plain language: your model can only learn what your features allow.
Misconception guardrail: adding more features does not automatically improve science; irrelevant features add noise and overfitting risk.
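To make concept 1 concrete, here is a minimal sketch of a fragment feature extractor in which every feature encodes a stated geometric assumption. The feature names, units, and formulas are illustrative choices for this module, not a connectomics standard.

```python
import math

# Hypothetical neurite-fragment features; every feature encodes an
# explicit assumption about what variation is biologically meaningful.
def fragment_features(coords, radii):
    """coords: list of (x, y, z) skeleton nodes in nm; radii: radii in nm."""
    seg = [math.dist(a, b) for a, b in zip(coords, coords[1:])]
    path_length = sum(seg)                       # assumes length is informative
    euclidean = math.dist(coords[0], coords[-1])
    mean_r = sum(radii) / len(radii)
    var_r = sum((r - mean_r) ** 2 for r in radii) / len(radii)
    return {
        "path_length_nm": path_length,
        # Tortuosity encodes the assumption that winding geometry separates
        # classes; a perfectly straight fragment gives a ratio of 1.
        "tortuosity": path_length / max(euclidean, 1e-9),
        "mean_radius_nm": mean_r,
        # Radius variability as a proxy for caliber changes along the neurite.
        "radius_cv": math.sqrt(var_r) / max(mean_r, 1e-9),
    }
```

If tortuosity turns out not to separate your classes, that is evidence about the encoded assumption, not only about the model.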
2) Evaluation must match biological use
Technical: metrics should align with downstream decisions (for example, class-specific recall for rare but critical cell types).
Plain language: high overall accuracy can still fail where it matters most.
Misconception guardrail: one summary metric is never enough; report metrics tied to each downstream decision.
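A toy example of the gap between a summary metric and a biologically targeted one, assuming a rare but critical class:

```python
# Overall accuracy vs class-specific recall on a toy imbalanced task.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, cls):
    # Fraction of true `cls` samples the model actually recovered.
    rel = [(t, p) for t, p in zip(y_true, y_pred) if t == cls]
    return sum(t == p for t, p in rel) / len(rel)

# 95 common fragments, 5 rare ones; the model always predicts "common".
y_true = ["common"] * 95 + ["rare"] * 5
y_pred = ["common"] * 100

print(accuracy(y_true, y_pred))        # 0.95 -> looks strong
print(recall(y_true, y_pred, "rare"))  # 0.0  -> fails where it matters
```

The headline number passes most benchmarks while every rare fragment is missed, which is exactly the failure mode this concept guards against.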
3) Leakage and shift are endemic in connectomics
Technical: spatial adjacency, reconstruction provenance, and shared preprocessing can leak signal across train/test splits.
Plain language: your model may be “cheating” without obvious signs.
Misconception guardrail: a random split does not guarantee valid generalization estimates when samples are spatially or provenance-correlated.
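One common mitigation is to split by spatial block rather than by fragment, so adjacent, correlated fragments never straddle the train/test boundary. A minimal sketch, assuming block ids derived from your reconstruction's coordinate grid:

```python
import random

# Leakage-aware split: hold out whole spatial blocks, not individual
# fragments. `block_of` maps each sample id to its spatial block id.
def block_split(samples, block_of, test_frac=0.2, seed=0):
    blocks = sorted({block_of[s] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(blocks)
    n_test = max(1, int(len(blocks) * test_frac))
    test_blocks = set(blocks[:n_test])
    train = [s for s in samples if block_of[s] not in test_blocks]
    test = [s for s in samples if block_of[s] in test_blocks]
    return train, test
```

Reconstruction provenance and shared preprocessing can be handled the same way: group by the leaking variable and hold out whole groups.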
Hidden curriculum scaffold
Unspoken ML norms that trainees need made explicit:
justify split strategy before training.
report failure cases with examples, not only aggregate metrics.
include model-card style limitations and intended use.
Mentoring supports:
provide leakage checklist template.
require one “where model fails” figure.
review scientific usefulness, not just benchmark score.
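A model-card-style limitations block can be as simple as plain data versioned alongside the model. The field names below are an assumption loosely following the model-card reporting idea; adapt them to your lab's template:

```python
# Sketch of a model-card-style limitations block, kept as plain data so it
# can be versioned with the model. Field names and entries are illustrative.
model_card = {
    "intended_use": "Prioritize neurite fragments for human proofreading.",
    "not_supported": [
        "Automated merge/split decisions without human review",
        "Data from a different staining or imaging protocol",
    ],
    "known_failure_modes": [
        "Low recall on rare cell types underrepresented in training",
        "Degraded performance near volume boundaries",
    ],
    "evaluation_scope": "Held-out spatial blocks from the same volume only",
}
```

Requiring this block in review forces the "intended use" and "where it fails" conversations before deployment, not after.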
Core workflow: connectomics ML protocol
Define task and biological decision context.
Construct feature set with rationale and preprocessing log.
Choose split strategy that blocks leakage pathways.
Train baseline + candidate models and compare error profiles.
Report metrics, limitations, and deployment constraints.
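Steps 4 and 5 of the protocol can be sketched with trivial stand-in models; the point is comparing per-class error profiles rather than headline scores. The data and the length-threshold rule below are toy assumptions:

```python
from collections import Counter

# Compare a majority baseline against a simple candidate rule by asking
# *where* each model fails, per class, on held-out data.
def majority_baseline(train_labels):
    top = Counter(train_labels).most_common(1)[0][0]
    return lambda feats: top

def length_rule(feats):
    # Toy candidate: long fragments are "axon", short ones "dendrite".
    return "axon" if feats["path_length_nm"] > 500 else "dendrite"

def error_profile(model, held_out):
    """Per-class error counts: where the model fails, not just how often."""
    prof = Counter()
    for feats, label in held_out:
        if model(feats) != label:
            prof[label] += 1
    return dict(prof)

train_labels = ["dendrite"] * 6 + ["axon"] * 4
test_set = [
    ({"path_length_nm": 900}, "axon"),
    ({"path_length_nm": 120}, "dendrite"),
    ({"path_length_nm": 700}, "axon"),
]
base = majority_baseline(train_labels)
print(error_profile(base, test_set))         # baseline misses every axon
print(error_profile(length_rule, test_set))  # candidate recovers the axons
```

The baseline's error profile makes the "where model fails" figure almost automatic, which is what the reporting step asks for.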
60-minute tutorial run-of-show
**00:00-08:00** Task framing and leakage examples
**08:00-20:00** Feature rationale workshop
**20:00-34:00** Split strategy and baseline modeling
**34:00-46:00** Error analysis and biologically relevant metrics
**46:00-56:00** Model-card limitation writing
**56:00-60:00** Competency checkpoint
Studio activity: leakage-resistant ML mini-pipeline
Scenario: You need to classify neurite fragments into coarse categories for downstream proofreading prioritization.
Tasks
Propose feature set and leakage-safe split design.
Train one baseline and one improved model (or provide a pseudocode plan).
Report two standard metrics and one biologically targeted metric.
Draft a model limitation statement with non-supported use cases.
Expected outputs
Feature + split design sheet.
Metric table with interpretation notes.
Limitation statement.
Assessment rubric
Minimum pass
Feature and split decisions are justified.
Metrics include at least one biologically targeted criterion.
Limitation statement is specific and actionable.
Strong performance
Identifies and mitigates likely leakage channels.
Uses error analysis to propose next data improvements.
Distinguishes exploratory model from deployment-ready model.
Common failure modes
Leakage-prone random splits for spatially correlated data.
Overfocus on aggregate accuracy.
Claims of biological insight unsupported by model diagnostics.