Negative Sampling

Two Kinds of Negative Sampling

The First Kind of Negative Sampling

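The samples are constructed as separate user-item pairs:

$$
\begin{aligned}
&\langle u, i_0 \rangle,\\
&\langle u, i_1 \rangle,\\
&\langle u, i_2 \rangle,\\
&\langle u, i_3 \rangle,\\
&\langle u, i_4 \rangle,\\
&\langle u, i_5 \rangle
\end{aligned}
$$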

where $i_0$ is the positive sample and $i_1, i_2, \ldots, i_5$ are the negative samples.

The loss is computed as

$$
\begin{aligned}
\text{loss}&=-p_0\prod_{i=1}^5(1-p_i)\\
\Rightarrow\text{loss}&=-\log(p_0)-\sum_{i=1}^5\log(1-p_i)\\
&=-\log\left(\frac{1}{1+\exp(-s_0)}\right)-\sum_{i=1}^5\log\left(1-\frac{1}{1+\exp(-s_i)}\right)
\end{aligned}
$$

Here $p_i=\frac{1}{1+\exp(-s_i)}$ is the sigmoid of the score $s_i$ of the $i$-th sample, and the second line replaces the product in the first line with its logarithm.
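As a minimal sketch of the loss above, the following computes the sum of sigmoid log-losses for one positive score and five negative scores. The function names and the example scores are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(s):
    """Logistic function 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))


def pointwise_loss(scores):
    """Loss of the first kind: scores[0] is the positive sample's score,
    scores[1:] are the negative samples' scores."""
    p = [sigmoid(s) for s in scores]
    positive_term = -math.log(p[0])
    negative_term = -sum(math.log(1.0 - p_i) for p_i in p[1:])
    return positive_term + negative_term


# Hypothetical scores s_0, ..., s_5 (not taken from the text).
scores = [2.0, -1.0, 0.5, -0.3, 0.0, -2.0]
print(pointwise_loss(scores))
```

Each of the six terms depends on a single score only, so every sample is treated as an independent binary classification.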

The Second Kind of Negative Sampling

The samples are constructed as a single tuple:

$$
\langle u, i_0, i_1, i_2, i_3, i_4, i_5 \rangle
$$

where $i_0$ is the positive sample and $i_1, i_2, \ldots, i_5$ are the negative samples.

The loss is computed as

$$
\begin{aligned}
&\text{loss}=-\frac{\exp(s_0)}{\sum_{i=0}^5\exp(s_i)}\\
\Rightarrow\ &\text{loss}=-\left(s_0-\log\sum_{i=0}^5\exp(s_i)\right)=-s_0+\log\sum_{i=0}^5\exp(s_i)
\end{aligned}
$$

Here the second line takes the logarithm of the softmax probability $p_0=\frac{\exp(s_0)}{\sum_{i=0}^5\exp(s_i)}$ that appears in the first line.
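The softmax form of the loss can be sketched as follows. Subtracting the maximum score before exponentiating is a standard numerical-stability step that is not part of the formula above; the function name and example scores are hypothetical.

```python
import math


def softmax_loss(scores):
    """Loss of the second kind: -s_0 + log(sum_i exp(s_i)),
    where scores[0] is the positive sample's score."""
    m = max(scores)
    # log-sum-exp computed in a numerically stable way
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return -scores[0] + log_sum_exp


# Hypothetical scores s_0, ..., s_5 (not taken from the text).
scores = [2.0, -1.0, 0.5, -0.3, 0.0, -2.0]
print(softmax_loss(scores))
```

Unlike the first kind, the normalizing term couples all six scores: changing any one of them changes the loss contribution of the positive sample.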

The Essential Difference Between the Two Kinds of Negative Sampling

Take the gradient of each of the two loss functions.

Gradient of the First Kind of Negative Sampling

$$
\begin{aligned}
\frac{\partial \text{loss}}{\partial s_0}&=\frac{\partial\left[-\log\left(\frac{1}{1+\exp(-s_0)}\right)-\sum_{j=1}^5\log\left(1-\frac{1}{1+\exp(-s_j)}\right)\right]}{\partial s_0}\\
&=-\frac{p_0(1-p_0)}{p_0}\\
&=-1+p_0\\
\left.\frac{\partial \text{loss}}{\partial s_i}\right|_{i=1,2,\ldots,5}&=\frac{\partial\left[-\log\left(\frac{1}{1+\exp(-s_0)}\right)-\sum_{j=1}^5\log\left(1-\frac{1}{1+\exp(-s_j)}\right)\right]}{\partial s_i}\\
&=\frac{p_i(1-p_i)}{1-p_i}\\
&=p_i
\end{aligned}
$$

The derivation above uses the property of the sigmoid derivative: $f'(z)=f(z)(1-f(z))$.

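Note that here

$$
\sum_{i=0}^5\frac{\partial \text{loss}}{\partial s_i}=-1+\sum_{i=0}^5p_i\neq0
$$

In general, the gradients with respect to the scores $s_i$ do not sum to 0, because the sigmoid probabilities $p_i$ are computed independently and need not sum to 1.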
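The gradient expressions for the first kind can be checked numerically. The sketch below compares the analytic gradient ($-1+p_0$ for the positive score, $p_i$ for each negative score) with a central finite difference and then sums the components. All names and the example scores are hypothetical.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def pointwise_loss(scores):
    """Loss of the first kind; scores[0] belongs to the positive sample."""
    p = [sigmoid(s) for s in scores]
    return -math.log(p[0]) - sum(math.log(1.0 - p_i) for p_i in p[1:])


def pointwise_grad(scores):
    """Analytic gradient: -1 + p_0 for the positive, p_i for each negative."""
    p = [sigmoid(s) for s in scores]
    return [p[0] - 1.0] + p[1:]


def numeric_grad(f, scores, eps=1e-6):
    """Central finite-difference approximation of the gradient of f."""
    grad = []
    for k in range(len(scores)):
        up = list(scores)
        down = list(scores)
        up[k] += eps
        down[k] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


# Hypothetical scores s_0, ..., s_5 (not taken from the text).
scores = [2.0, -1.0, 0.5, -0.3, 0.0, -2.0]
analytic = pointwise_grad(scores)
numeric = numeric_grad(pointwise_loss, scores)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))
grad_sum = sum(analytic)
print("max |analytic - numeric| =", max_gap)
print("sum of gradients =", grad_sum)
```

For these scores the sum of the gradients is clearly nonzero, which matches the observation that nothing ties the six sigmoid outputs together.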

Gradient of the Second Kind of Negative Sampling

$$
\begin{aligned}
\frac{\partial \text{loss}}{\partial s_0}&=\frac{\partial\left[-\left(s_0-\log\sum_{j=0}^5\exp(s_j)\right)\right]}{\partial s_0}\\
&=-1+\frac{\exp(s_0)}{\sum_{j=0}^5\exp(s_j)}\\
&=-1+p_0\\
\left.\frac{\partial \text{loss}}{\partial s_i}\right|_{i=1,2,\ldots,5}&=\frac{\partial\left[-\left(s_0-\log\sum_{j=0}^5\exp(s_j)\right)\right]}{\partial s_i}\\
&=\frac{\exp(s_i)}{\sum_{j=0}^5\exp(s_j)}\\
&=p_i
\end{aligned}
$$

Note that the gradients of all the variables sum to 0, because the softmax probabilities $p_i$ sum to 1:

$$
\sum_{i=0}^5\frac{\partial \text{loss}}{\partial s_i}=-1+\sum_{i=0}^5p_i=-1+1=0
$$

Therefore, the gradient updates of the scores $s_i$ constrain one another: they always sum to 0.
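The zero-sum property can be illustrated with a short sketch that evaluates the softmax gradient ($-1+p_0$ for the positive score, $p_i$ for each negative score) and adds up its components. The function names and the example scores are hypothetical.

```python
import math


def softmax(scores):
    """Softmax probabilities, shifted by the max score for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def softmax_grad(scores):
    """Gradient of -s_0 + log(sum_i exp(s_i)) with respect to each score."""
    p = softmax(scores)
    return [p[0] - 1.0] + p[1:]


# Hypothetical scores s_0, ..., s_5 (not taken from the text).
scores = [2.0, -1.0, 0.5, -0.3, 0.0, -2.0]
grad = softmax_grad(scores)
grad_sum = sum(grad)
print("gradient =", grad)
print("sum of gradients =", grad_sum)
```

The positive score receives a negative gradient and every negative score a positive one, and up to floating-point rounding the components cancel exactly: whatever is pushed onto the positive score is taken from the negatives.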

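This is the essential difference between the two kinds of negative sampling: whether the gradients with respect to the scores $s_i$ sum to 0.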
