Case Study
This project started as a model experiment. The workflow, dataset boundary, and limitations are stated explicitly so the story stays honest for readers of the case study and for anyone reproducing the results from the repo. A live demo runs on a separate host so you can try uploads in the browser.
The demo is a narrow snake vs no-snake experiment and should not be treated as species identification or field-safe classification.
Open the live demo in a new tab. Upload a photo and get a narrow snake vs no-snake prediction from the current public build.
Note: the demo may take a few seconds to wake on first load while the host cold-starts.
The live demo uses a real-photo iNaturalist-trained Keras model. The model file stays out of normal git history, but the release package mirrors the model, deployment config, held-out metrics, threshold sweep, confusion matrix, training curves, sample predictions, and checksum manifest.
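A checksum manifest is only useful if someone actually checks it. The sketch below shows one way to verify downloaded release files against such a manifest. It assumes a `sha256sum`-style text file with one `<hex digest>  <relative path>` entry per line; the release package's real manifest format and file names are not stated here, so `verify_manifest` and the layout it reads are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large model file need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path: Path) -> dict[str, bool]:
    """Compare each listed digest with the file on disk.

    Assumed (hypothetical) manifest format: '<hex digest>  <relative path>'
    per line, with paths relative to the manifest's own directory.
    """
    root = manifest_path.parent
    results = {}
    for line in manifest_path.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.strip()
        target = root / name
        # A missing file counts as a failed check rather than raising.
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

Calling `verify_manifest(Path("release/SHA256SUMS"))` (a placeholder path) would return a mapping from file name to pass/fail, which makes a mismatched model file visible before it is loaded.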
The held-out split contains 255 snake images and 1,028 no-snake images, and the reported metrics use a decision threshold of 0.76. The snake recall on that split is why this page keeps the safety boundary explicit: the demo can miss snakes and is not field-safe wildlife software.
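To make the link between threshold and missed snakes concrete, here is a minimal sketch of how confusion counts and snake recall follow from a chosen threshold. It assumes the model outputs a single score per image that is read as "probability of snake" and that a score at or above the threshold predicts snake; the direction of that comparison, the function names, and the toy scores are assumptions for illustration, not the project's evaluation code or data.

```python
def confusion_at(scores, labels, threshold):
    """Count TP/FP/FN/TN, treating score >= threshold as a 'snake' prediction.

    labels use 1 for snake and 0 for no-snake (an assumed encoding).
    """
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted_snake = score >= threshold
        if predicted_snake and label == 1:
            tp += 1
        elif predicted_snake:
            fp += 1
        elif label == 1:
            fn += 1  # a missed snake
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def snake_recall(counts):
    """Fraction of true snake images that were flagged; None if there are none."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else None


# Toy scores for illustration only, not the held-out split.
scores = [0.95, 0.80, 0.70, 0.40, 0.90, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

for threshold in (0.50, 0.76):
    counts = confusion_at(scores, labels, threshold)
    print(threshold, counts, snake_recall(counts))
```

On these toy values, raising the threshold from 0.50 to 0.76 drops snake recall from 0.75 to 0.5 while the false-positive count stays the same, which is the kind of trade-off a threshold sweep is meant to expose.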
Most of the value is in dataset hygiene, fixed evaluation splits, confusion-driven review, and refusing to let aggregate accuracy hide weak classes. That discipline transfers directly to larger vision projects where bad predictions fail quietly in production.
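A quick piece of arithmetic shows why aggregate accuracy can hide a weak class on a split shaped like this one. The sketch uses the held-out class counts quoted on this page and a deliberately useless, hypothetical model that always answers "no snake"; the `summarize` helper is illustrative and not taken from the repo.

```python
SNAKE, NO_SNAKE = 255, 1028  # held-out class counts quoted on this page


def summarize(tp, fn, tn, fp):
    """Report aggregate accuracy next to per-class recall."""
    total = tp + fn + tn + fp
    snake_recall = tp / (tp + fn)
    no_snake_recall = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / total,
        "snake_recall": snake_recall,
        "no_snake_recall": no_snake_recall,
        # Mean of per-class recalls; not inflated by the majority class.
        "balanced_accuracy": (snake_recall + no_snake_recall) / 2,
    }


# Hypothetical degenerate model: every image is predicted "no snake".
always_no_snake = summarize(tp=0, fn=SNAKE, tn=NO_SNAKE, fp=0)

for name, value in always_no_snake.items():
    print(f"{name}: {value:.3f}")
```

This baseline scores roughly 80% accuracy (1,028 of 1,283 images) with zero snake recall and a balanced accuracy of 0.5, so a headline accuracy figure alone says little about whether snakes are being caught.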
Data
Capture raw images and their label sources. This is where class imbalance and label quality risks start.
| Artifact | Purpose |
|---|---|
| Stratified train/val split | Keep class ratios stable so headline metrics track the same class mix across runs. |
| Augmentation policy (logged) | Log image-level changes so augmentation stays comparable run to run. |
| Confusion matrix + structured error review | Surface which classes get confused before touching model depth or width. |
| Run folder (config + metrics snapshot) | Reproduce any reported number without guessing which code version produced it. |
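As an illustration of the first row, a stratified split can be sketched in a few lines of standard-library Python. This is a minimal sketch under assumptions, not the repo's split code: the function name, the 20% validation fraction, and the seed are all hypothetical, and a real pipeline might use a library routine instead.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) with about val_fraction of each class in val."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the same split is produced every run
    train_idx, val_idx = [], []
    for label in sorted(by_class):  # sorted for a deterministic class order
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels with a 1:4 class ratio, for illustration only.
labels = ["snake"] * 20 + ["no_snake"] * 80
train_idx, val_idx = stratified_split(labels)
val_snakes = sum(labels[i] == "snake" for i in val_idx)
print(len(train_idx), len(val_idx), val_snakes)
```

Because each class is shuffled and cut separately, the validation set keeps the same 1:4 ratio as the full toy set, and rerunning with the same seed returns the same indices, which is what lets metrics from different runs be compared.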
The repo holds the full training flow, split configuration, and evaluation scripts. The live demo is intentionally narrow so visitors can try the behavior without mistaking it for a general-purpose classifier.