Public app plus private research track

SmartBreeds.io

Dog-breed photo search backed by a calibrated fine-grained vision study on a Tsinghua Dogs subset.

0.968 Global RAPS coverage at mean set size 2.59.
0.9165 Label-conditional coverage after Mondrian evaluation.
0.60 Worst observed class, bluetick, under label-conditional RAPS.

Current research result

Aggregate calibration is strong. Class coverage is still the hard part.

The Tsinghua100 dense run uses 8,000 images across 100 breeds. DINOv2-small prototype scores are temperature-scaled, then used to build RAPS conformal prediction sets, which are evaluated for coverage and set size.

  • Top-1 accuracy is 0.846 and ECE is 0.051 after temperature scaling.
  • Selected RAPS reaches 0.968 aggregate coverage with mean set size 2.59.
  • Per-class disaggregation exposes weak breeds instead of hiding them behind aggregate metrics.
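
To make the RAPS step concrete, here is a minimal sketch of how such prediction sets can be built from temperature-scaled probabilities. It is not the project's code: the function names, `lam`, `k_reg`, and `alpha` are illustrative assumptions (the source does not state the selected hyperparameters), and the randomized tie-breaking of the published RAPS method is omitted for brevity.

```python
import math

def raps_scores(probs, lam=0.01, k_reg=1):
    """Non-randomized RAPS score for every class of one probability vector.

    Score of a class = probability mass of all classes ranked at or above it
    plus a penalty of lam for each rank beyond k_reg.
    """
    order = sorted(range(len(probs)), key=lambda j: -probs[j])
    scores = [0.0] * len(probs)
    cumulative = 0.0
    for rank, j in enumerate(order, start=1):
        cumulative += probs[j]
        scores[j] = cumulative + lam * max(0, rank - k_reg)
    return scores

def conformal_quantile(cal_scores, alpha):
    """Finite-sample quantile: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:  # too few calibration points for this alpha
        return float("inf")
    return sorted(cal_scores)[k - 1]

def calibrate(cal_probs, cal_labels, alpha=0.1, lam=0.01, k_reg=1):
    """Threshold from the true-label scores of a held-out calibration split."""
    true_scores = [raps_scores(p, lam, k_reg)[y]
                   for p, y in zip(cal_probs, cal_labels)]
    return conformal_quantile(true_scores, alpha)

def prediction_set(probs, qhat, lam=0.01, k_reg=1):
    """All classes whose score is within the threshold."""
    scores = raps_scores(probs, lam, k_reg)
    chosen = [j for j, s in enumerate(scores) if s <= qhat]
    if not chosen:  # design choice in this sketch: never return an empty set
        chosen = [max(range(len(probs)), key=lambda j: probs[j])]
    return chosen
```

Aggregate coverage is then the fraction of test images whose true breed lands in its set, and mean set size is the average length of those sets. Those are the two quantities reported as 0.968 and 2.59.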

Image: synthetic dog lineup for the SmartBreeds research preview.

96-second research preview

The case study embeds the video with captions, the charts, and the weak-class tables.

Measured, not inflated

The useful story is narrower than a product claim.

SmartBreeds is not presented as a final benchmark or a deployed guarantee. The result is a reproducible calibration study with visible failure modes.

8,000 Tsinghua100 subset images in the dense run.
0.846 Top-1 accuracy from DINOv2-small prototypes.
0.051 Expected calibration error after temperature scaling.
2.59 Mean selected RAPS set size at 0.968 coverage.

Reliability visuals

Charts are part of the claim boundary.

These figures are public-safe research visuals. Dataset-derived dog photos stay private until the license review is complete.

Reliability diagram for the Tsinghua100 dense run
Temperature scaling brings confidence closer to observed accuracy.
Per-class coverage chart for Tsinghua100
Per-class coverage reveals weak breeds that aggregate coverage can hide.
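
For readers unfamiliar with temperature scaling, the sketch below shows the idea behind the first chart: a single scalar T divides the scores before the softmax and is chosen to minimize negative log-likelihood on held-out data. The grid search and all function names are assumptions for illustration, as is treating the prototype scores as logits; the source does not say how T was fitted.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def mean_nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the true labels at this temperature."""
    total = 0.0
    for logits, y in zip(logit_rows, labels):
        total -= math.log(softmax(logits, temperature)[y])
    return total / len(labels)

def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature on a grid that minimizes held-out NLL."""
    if grid is None:
        grid = [0.05 * i for i in range(1, 201)]  # 0.05 to 10.0 (assumed range)
    return min(grid, key=lambda t: mean_nll(logit_rows, labels, t))
```

A fitted T above 1 softens overconfident scores and a T below 1 sharpens underconfident ones. The top-1 prediction is unchanged either way, so accuracy stays the same while ECE can move.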

Reliability readout

Quantity                Value                              Scope
ECE                     0.0508                             2,000-image test split
Top-1 accuracy          0.8455                             DINOv2-small prototypes
0.9-1.0 confidence bin  0.968 confidence / 0.976 accuracy  918 predictions
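
As a reading aid for the readout above, ECE is the weighted average over confidence bins of the gap between mean confidence and accuracy. The sketch below is an illustrative implementation with equal-width bins; the bin count of 10 and the function names are assumptions, not details taken from the study.

```python
def bin_contribution(count, total, mean_confidence, accuracy):
    """One bin's share of ECE: its weight times its |accuracy - confidence| gap."""
    return (count / total) * abs(accuracy - mean_confidence)

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins [i/n_bins, (i+1)/n_bins); 1.0 goes in the last bin."""
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0.0] * n_bins
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += 1.0 if ok else 0.0
    total = len(confidences)
    ece = 0.0
    for count, conf_sum, hit_sum in zip(counts, conf_sums, hit_sums):
        if count:  # empty bins contribute nothing
            ece += bin_contribution(count, total, conf_sum / count, hit_sum / count)
    return ece
```

If the 918 predictions in the 0.9-1.0 bin come from the same 2,000-image test split (an assumption), `bin_contribution(918, 2000, 0.968, 0.976)` is about 0.0037. Under that assumption the highest-confidence bin accounts for only a small part of the 0.0508 total, and the rest comes from lower-confidence bins.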

Lowest global RAPS classes

Breed            Coverage  Mean set size
great_dane       0.80      3.45
lhasa            0.85      2.35
tibetan_mastiff  0.85      2.50
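
A table like this comes from disaggregating coverage and set size by true label. The following sketch shows one way to compute it; the function names and the data layout (one prediction set per test image, paired with its true breed) are assumptions for illustration.

```python
from collections import defaultdict

def per_class_report(pred_sets, labels):
    """Map each true label to (coverage, mean set size) over its test images."""
    counts = defaultdict(int)
    hits = defaultdict(int)
    sizes = defaultdict(int)
    for pred_set, label in zip(pred_sets, labels):
        counts[label] += 1
        hits[label] += 1 if label in pred_set else 0
        sizes[label] += len(pred_set)
    return {label: (hits[label] / counts[label], sizes[label] / counts[label])
            for label in counts}

def weakest_classes(report, k=3):
    """The k labels with the lowest coverage, lowest first."""
    return sorted(report.items(), key=lambda item: item[1][0])[:k]
```

If the 2,000-image test split is balanced across the 100 breeds (an assumption; the source does not say), each breed has about 20 test images, so a single miss moves that breed's coverage by 0.05. The reported values of 0.80 and 0.85 are consistent with that granularity, which is worth keeping in mind when ranking weak classes.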

Next research gate

Weak-class coverage gets priority over bigger claims.

The next work targets lhasa, tibetan mastiff, great dane, and the worst Mondrian class (bluetick, at 0.60 label-conditional coverage). The research question is whether structured pooling can tighten sets without burying class-specific failures.
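
For context on why pooling is even a question, label-conditional (Mondrian) calibration computes a separate threshold per class from that class's own calibration scores. The sketch below shows only that baseline, not the structured pooling under study; the names and the dictionary layout are assumptions.

```python
import math
from collections import defaultdict

def class_thresholds(true_scores, labels, alpha):
    """Per-class conformal threshold from each class's true-label scores."""
    by_class = defaultdict(list)
    for score, label in zip(true_scores, labels):
        by_class[label].append(score)
    thresholds = {}
    for label, scores in by_class.items():
        n = len(scores)
        k = math.ceil((n + 1) * (1 - alpha))
        # With too few calibration points the threshold is infinite,
        # meaning the class is always included in the set.
        thresholds[label] = float("inf") if k > n else sorted(scores)[k - 1]
    return thresholds

def mondrian_set(class_scores, thresholds):
    """Include a label when its score is within that label's own threshold.

    class_scores maps label -> nonconformity score for one test input.
    Labels never seen in calibration are always included.
    """
    return [label for label, score in class_scores.items()
            if score <= thresholds.get(label, float("inf"))]
```

The trade-off is visible in the quantile rule: at alpha = 0.1, for example, a class needs at least 9 calibration scores before its threshold is finite, and even then the threshold rests on very few points. Pooling related classes would give each threshold more data, at the risk of averaging away exactly the class-specific failures the per-class tables are meant to expose.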