In the following illustration, we render the gradient of some $loss(x,y)$ with respect to $x$, where
⋅ $x∈ℝ^2$ is a pixel coordinate on screen (the hypothetical current value to be optimized),
⋅ $y∈ℝ^2$ is the mouse position (the target value).
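As a rough sketch of what such a rendering computes, the snippet below evaluates the gradient with respect to $x$ at every point of a small grid, for a fixed target $y$. The names `numerical_grad`, `gradient_field`, `grid_size`, `extent` and the placeholder loss `squared_distance` are assumptions made for illustration; the actual illustration may well use analytic gradients or automatic differentiation instead of finite differences.

```python
import numpy as np

def numerical_grad(loss, x, y, eps=1e-6):
    """Central-difference estimate of d loss(x, y) / d x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (loss(x + step, y) - loss(x - step, y)) / (2 * eps)
    return grad

def gradient_field(loss, y, grid_size=5, extent=1.0):
    """Evaluate the gradient at each point of a grid_size x grid_size grid."""
    coords = np.linspace(-extent, extent, grid_size)
    field = np.zeros((grid_size, grid_size, 2))
    for i, x0 in enumerate(coords):
        for j, x1 in enumerate(coords):
            field[i, j] = numerical_grad(loss, np.array([x0, x1]), y)
    return coords, field

def squared_distance(x, y):
    """Placeholder loss (an assumption): squared Euclidean distance."""
    d = x - y
    return float(np.dot(d, d))

# y plays the role of the mouse position (hypothetical value).
y = np.array([0.5, -0.5])
coords, field = gradient_field(squared_distance, y)
```

Each `field[i, j]` is the arrow one would draw at grid point `(coords[i], coords[j])`; a descent step moves $x$ along the negative of that arrow.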
mouse mode
grid size
amplitude
noise amplitude
noise frequency (space)
noise frequency (time)
display mode
motion length
motion speed
color (phase=|loss|)
color speed
$MseLoss(x,y) = mean( ⟨x-y|x-y⟩ )$
$log(MseLoss(x,y))$
$CosLoss(x,y) = 1 - CosSim(x,y) = 1 - \frac{⟨x|y⟩}{|x||y|}$
$DotLoss(x,y) = MSE(1, \comment{DotSim(x,y)}{:=⟨x|y⟩/⟨y|y⟩})$
$|1 - DotSim(x,y)|$ is a poor choice
$DotLoss(y,x) ≠ DotLoss(x,y)$
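The losses listed above can be written down directly for a single pair of vectors. This is a minimal sketch, assuming that the `mean` in $MseLoss$ runs over a batch of one pair and that $MSE(1, s)$ for a scalar $s$ reduces to $(1-s)^2$; the function names are chosen here for illustration only.

```python
import numpy as np

def cos_sim(x, y):
    """<x|y> / (|x| |y|)"""
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def dot_sim(x, y):
    """<x|y> / <y|y>: normalized by the target only."""
    return np.dot(x, y) / np.dot(y, y)

def mse_loss(x, y):
    """<x-y|x-y> for a single pair (assumed batch of one)."""
    d = x - y
    return np.dot(d, d)

def cos_loss(x, y):
    return 1.0 - cos_sim(x, y)

def dot_loss(x, y):
    """MSE(1, DotSim(x, y)) for a scalar similarity."""
    return (1.0 - dot_sim(x, y)) ** 2

x = np.array([3.0, 0.0])
y = np.array([1.0, 0.0])
```

With these values, $x$ and $y$ point in the same direction, so `cos_loss(x, y)` vanishes while `dot_loss(x, y)` does not: $DotLoss$ also penalizes the length mismatch. Swapping the arguments changes the normalization from $⟨y|y⟩$ to $⟨x|x⟩$, which is why `dot_loss(y, x)` differs from `dot_loss(x, y)`.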
$GenSim(x,y) = \frac{⟨x|y⟩}{|x|^α|y|^{2-α}}$ with $α∈[0,2]$ generalizes $DotSim$ ($α=0$) and $CosSim$ ($α=1$)
We minimize $GenLoss(x,y) = MSE(GenSim(x,y)_α,GenSim(y,y)_α)$
α
$GenLoss(x,y) = MSE(GenSim(x,y)_α,GenSim(y,y)_α)$
$GenLoss(y,x)$
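A short numerical check of the generalized similarity, again as a sketch for a single pair of vectors with $MSE$ taken as a squared difference of scalars. The function names and test vectors are assumptions for illustration.

```python
import numpy as np

def gen_sim(x, y, alpha):
    """<x|y> / (|x|^alpha * |y|^(2 - alpha)), with alpha in [0, 2]."""
    nx = np.linalg.norm(x)
    ny = np.linalg.norm(y)
    return np.dot(x, y) / (nx ** alpha * ny ** (2.0 - alpha))

def gen_loss(x, y, alpha):
    """Squared difference between GenSim(x, y) and GenSim(y, y)."""
    return (gen_sim(x, y, alpha) - gen_sim(y, y, alpha)) ** 2

x = np.array([3.0, 4.0])
y = np.array([1.0, 2.0])

dot_similarity = np.dot(x, y) / np.dot(y, y)
cos_similarity = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
```

Here `gen_sim(x, y, 0.0)` matches `dot_similarity` and `gen_sim(x, y, 1.0)` matches `cos_similarity`. The target term `gen_sim(y, y, alpha)` equals 1 for every $α$, since the numerator and denominator both reduce to $|y|^2$.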
β
$sigmoid((CosSim(x,y)-1)⋅β) ⋅ DotLoss(x,y)$
$sigmoid((CosSim(x,y)-1)⋅β) ⋅ ∇(DotLoss(x,y))$
$... + CosLoss$
$... + ∇CosLoss$
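The first two variants above differ in where the sigmoid gate is applied. One plausible reading, sketched below, is that gating the loss lets the gradient also flow through the gate, whereas gating $∇DotLoss$ treats the gate as a constant factor. This interpretation, the helper names, and the use of finite differences are all assumptions for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def cos_sim(x, y):
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def dot_loss(x, y):
    return (1.0 - np.dot(x, y) / np.dot(y, y)) ** 2

def gate(x, y, beta):
    """CosSim - 1 lies in [-2, 0], so the gate is at most 0.5."""
    return sigmoid((cos_sim(x, y) - 1.0) * beta)

def num_grad(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        g[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return g

def grad_of_gated_loss(x, y, beta):
    """Gradient of gate * DotLoss: the gate is differentiated too."""
    return num_grad(lambda v: gate(v, y, beta) * dot_loss(v, y), x)

def gated_grad(x, y, beta):
    """gate * grad(DotLoss): the gate only rescales the gradient."""
    return gate(x, y, beta) * num_grad(lambda v: dot_loss(v, y), x)

x = np.array([0.0, 2.0])   # orthogonal to y, so CosSim = 0
y = np.array([1.0, 0.0])
beta = 4.0
g_loss = grad_of_gated_loss(x, y, beta)
g_grad = gated_grad(x, y, beta)
```

For this orthogonal pair the two gradients differ noticeably, because the derivative of the gate contributes a term of its own. When $x$ is aligned with $y$, the cosine similarity is at its maximum, the gate's derivative vanishes, and both variants coincide.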
mse
cos
dot
dot-v2
dot-v3
normalize
softmax weight
weighted sum
softmax by gradient amplitude
max is a poor choice (compare gradients, not loss values)
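The option "softmax by gradient amplitude" is not specified further here. One possible reading, sketched below purely as an assumption, is to weight each loss's gradient by a softmax over the gradient norms and then take the weighted sum. The function names and the sample gradients are hypothetical.

```python
import numpy as np

def softmax(v):
    """Numerically stable softmax."""
    e = np.exp(v - np.max(v))
    return e / e.sum()

def combine_softmax_by_amplitude(grads):
    """Weight each gradient by softmax of its norm, then sum.

    This is an assumed interpretation, not a confirmed definition.
    """
    grads = np.asarray(grads, dtype=float)
    norms = np.linalg.norm(grads, axis=1)
    weights = softmax(norms)
    return weights, weights @ grads

# Hypothetical gradients of two different losses at the same x.
grads = [[3.0, 4.0], [0.0, 1.0]]
weights, combined = combine_softmax_by_amplitude(grads)
```

With norms 5 and 1, the first gradient receives almost all of the weight, so the combined direction stays close to it. A temperature parameter on the norms would soften this, but that is beyond what the option labels state.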