Bias vs. Variance

Statistics

I don't know about others, but when I started digging into machine learning I had trouble understanding bias and variance until I found a nice target-shooting analogy.

Target

Our little target chart :) along with a few parameters (color and marker size) for the hits.

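  import numpy as np   # supplies the random samples used for the hits
  color = hue(1)       # color of the hit markers
  size = 50            # marker size of each hit
  # start from an empty plot, then add three concentric rings around (0, 0)
  p = plot([], figsize=7)
  p += plot(circle((0,0), 1))
  p += plot(circle((0,0), 3))
  p += plot(circle((0,0), 6))
  p  # display the empty target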

High bias and high variance

The worst-case scenario: hits are all over the place (high variance) and far from the center of the target, skewed toward the top right (high bias).

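  variance = 5   # side length of the square the hits fall in
  bias = 3       # offset of that square's lower-left corner from the center
  # 15 uniform hits with x and y in [bias, bias + variance)
  samples = variance * np.random.random_sample((15, 2)) + bias
  hbhv = p + plot(point(samples, rgbcolor=color, size=size))
  hbhv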

../bvsv-hbhv.png
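
To put numbers on what the picture shows, here is a minimal sketch that measures how far the average hit lies from the center and how widely the hits scatter around that average. It assumes the same sampling scheme as the snippet above but uses only the Python standard library instead of numpy; the helper names `shoot` and `summarize` and the fixed seed are my own choices for illustration, not part of the original session.

```python
import random
from statistics import mean, pvariance

def shoot(variance, bias, n=15, seed=0):
    """Mimic `variance * random_sample((n, 2)) + bias` with the stdlib (assumed equivalent)."""
    rng = random.Random(seed)
    return [(variance * rng.random() + bias, variance * rng.random() + bias)
            for _ in range(n)]

def summarize(hits):
    """Return (mean hit, distance of the mean hit from (0, 0), total variance)."""
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    center = (mean(xs), mean(ys))
    offset = (center[0] ** 2 + center[1] ** 2) ** 0.5   # empirical "bias"
    spread = pvariance(xs) + pvariance(ys)               # empirical "variance"
    return center, offset, spread

hits = shoot(variance=5, bias=3)
center, offset, spread = summarize(hits)
print(f"mean hit = ({center[0]:.2f}, {center[1]:.2f})")
print(f"offset from center = {offset:.2f}, spread = {spread:.2f}")
```

A large `offset` corresponds to high bias and a large `spread` to high variance, which is what the scattered, top-right cloud looks like on the chart.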

High bias and low variance

This time the shots are focused (low variance) within a small area that is still far from the center of the target (high bias).

  variance = 2
  bias = 3
  # 15 uniform hits with x and y in [3, 5)
  samples = variance * np.random.random_sample((15, 2)) + bias
  hblv = p + plot(point(samples, rgbcolor=color, size=size))
  hblv

../bvsv-hblv.png

Low bias and high variance

This time the hits are close to the center (low bias) but spread all over the place (high variance) again.

  variance = 4
  bias = 0.5
  # 15 uniform hits with x and y in [0.5, 4.5)
  samples = variance * np.random.random_sample((15, 2)) + bias
  lbhv = p + plot(point(samples, rgbcolor=color, size=size))
  lbhv

../bvsv-lbhv.png
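
One detail worth knowing: in the snippet above each coordinate is drawn uniformly from `bias` up to `bias + variance`, so the average hit sits at `bias + variance / 2` per axis (2.5 here) rather than at `bias` itself. If you want the two knobs to be independent, one possible variant is to center the square on `bias`. The sketch below is an assumption of mine, not the code used for the pictures in this post; `shoot_centered` is a hypothetical helper written with the standard library only.

```python
import random
from statistics import mean

def shoot_centered(variance, bias, n=15, seed=0):
    """Uniform hits in a square of side `variance` centered on (bias, bias)."""
    rng = random.Random(seed)
    return [(variance * (rng.random() - 0.5) + bias,
             variance * (rng.random() - 0.5) + bias)
            for _ in range(n)]

# Many shots, so the average settles close to its expected value.
hits = shoot_centered(variance=4, bias=0.5, n=1000)
mean_x = mean(x for x, _ in hits)
mean_y = mean(y for _, y in hits)
print(f"mean hit = ({mean_x:.2f}, {mean_y:.2f})")  # expected to land near (0.5, 0.5)
```

With this variant, changing `variance` widens or narrows the cloud without dragging its center away from the target.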

Low bias and low variance

The best-case scenario: focused (low variance) and close to the center (low bias).

  variance = 1.5
  bias = 0.5
  # 15 uniform hits with x and y in [0.5, 2.0)
  samples = variance * np.random.random_sample((15, 2)) + bias
  lblv = p + plot(point(samples, rgbcolor=color, size=size))
  lblv

../bvsv-lblv.png
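
The four pictures can also be tied together with one identity: the mean squared distance of the hits from the center equals the squared distance of the average hit from the center (bias squared) plus the scatter of the hits around their average (variance). The sketch below checks this numerically for the four parameter pairs used in this post. It is only an illustration under my own assumptions: it uses the standard library instead of numpy, a fixed seed, and the hypothetical helpers `shoot` and `decompose`.

```python
import random
from statistics import mean, pvariance

# (variance, bias) pairs copied from the four scenarios in this post
scenarios = {
    "high bias, high variance": (5, 3),
    "high bias, low variance": (2, 3),
    "low bias, high variance": (4, 0.5),
    "low bias, low variance": (1.5, 0.5),
}

def shoot(variance, bias, n=15, seed=0):
    """Uniform hits with x and y in [bias, bias + variance), as in the snippets."""
    rng = random.Random(seed)
    return [(variance * rng.random() + bias, variance * rng.random() + bias)
            for _ in range(n)]

def decompose(hits):
    """Return (mean squared miss, squared offset of the mean hit, spread)."""
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    mse = mean(x * x + y * y for x, y in hits)
    bias_sq = mean(xs) ** 2 + mean(ys) ** 2
    spread = pvariance(xs) + pvariance(ys)
    return mse, bias_sq, spread

results = {name: decompose(shoot(v, b)) for name, (v, b) in scenarios.items()}
for name, (mse, bias_sq, spread) in results.items():
    # The first number should equal the sum of the other two, up to rounding.
    print(f"{name:26s} miss^2={mse:6.2f}  bias^2={bias_sq:6.2f}  variance={spread:5.2f}")
```

Reading the printed rows side by side shows which term dominates the total miss in each scenario.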

Happy shooting!!!
