Here I will explain how to compute particular cases of matrix derivatives using tensor (index) notation, in which repeated indices are summed over. First, I state a simple and quite intuitive rule for the derivative of one matrix component with respect to an arbitrary matrix component:
$$\frac{\partial X_{ij}}{\partial X_{kl}} = \delta_{ik}\,\delta_{jl} \tag{1}$$

We begin by differentiating simple expressions: the trace of the variable matrix itself, and traces of products of constant matrices with the variable matrix. Equation (1) and the product rule are the protagonists in all these manipulations.
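As a quick sanity check of rule (1), the following sketch builds the fourth-order tensor of derivatives in NumPy and confirms that an entry is 1 only when both index pairs coincide. The size `n` and the names `delta` and `D` are illustrative assumptions, not part of the derivation.

```python
import numpy as np

n = 3               # illustrative matrix size (assumption; any n works)
delta = np.eye(n)   # Kronecker delta stored as an n x n array

# D[i, j, k, l] = dX_ij / dX_kl = delta_ik * delta_jl
D = np.einsum('ik,jl->ijkl', delta, delta)

# The entry is 1 exactly when (i, j) == (k, l), and 0 otherwise.
for i, j, k, l in np.ndindex(n, n, n, n):
    expected = 1.0 if (i, j) == (k, l) else 0.0
    assert D[i, j, k, l] == expected
```

Storing the rule as an explicit tensor is wasteful for large `n`, but it makes every later derivation a plain contraction against `D`.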
$$\frac{\partial\,\mathrm{Tr}(X)}{\partial X} = I$$

To solve this first problem, notice that the trace of $X$ is written in tensor notation as $X_{ii}$, that is, with the first and last indices equal:
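$$\frac{\partial\,\mathrm{Tr}(X)}{\partial X} = \frac{\partial X_{ii}}{\partial X_{kl}} = \delta_{ik}\,\delta_{il} = \delta_{kl} = I$$

where the last equality holds because the derivative is a second-order tensor whose free indices are those of the component we differentiate with respect to, here $k$ and $l$.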
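The same contraction can be sketched in NumPy: setting the first two indices of the derivative tensor equal and summing over them should leave the identity. The names `n`, `delta`, `D`, and `grad_trace` are hypothetical, chosen only for this illustration.

```python
import numpy as np

n = 4               # illustrative size (assumption)
delta = np.eye(n)

# D[i, j, k, l] = dX_ij / dX_kl = delta_ik * delta_jl
D = np.einsum('ik,jl->ijkl', delta, delta)

# Tr(X) = X_ii, so differentiate by setting j = i and summing over i:
# sum_i delta_ik * delta_il = delta_kl
grad_trace = np.einsum('iikl->kl', D)

assert np.array_equal(grad_trace, np.eye(n))
```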
$$\frac{\partial\,\mathrm{Tr}(XA)}{\partial X} = A^T$$

For this case we use the fact that, when a matrix product is written in tensor notation, adjacent factors share an index:
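$$\frac{\partial\,\mathrm{Tr}(XA)}{\partial X} = \frac{\partial (X_{ij} A_{ji})}{\partial X_{kl}} = \frac{\partial X_{ij}}{\partial X_{kl}}\,A_{ji} = \delta_{ik}\,\delta_{jl}\,A_{ji} = A_{lk} = (A^T)_{kl} = A^T$$

The same problem but with the transpose is tackled in the same fashion: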
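$$\frac{\partial\,\mathrm{Tr}(X^T A)}{\partial X} = \frac{\partial (X^T_{ij} A_{ji})}{\partial X_{kl}} = \frac{\partial X_{ji}}{\partial X_{kl}}\,A_{ji} = \delta_{jk}\,\delta_{il}\,A_{ji} = A_{kl} = A$$

As for the product of $X$ with two constant matrices, $X$ now shares an index with each of its neighbours, so there are two index concatenations in tensor notation: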
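$$\frac{\partial\,\mathrm{Tr}(AXB)}{\partial X} = \frac{\partial (A_{ij} X_{jk} B_{ki})}{\partial X_{pq}} = \frac{\partial X_{jk}}{\partial X_{pq}}\,A_{ij} B_{ki} = \delta_{jp}\,\delta_{kq}\,A_{ij} B_{ki} = A_{ip} B_{qi} = B_{qi} A_{ip} = (BA)_{qp} = \big((BA)^T\big)_{pq} = (BA)^T = A^T B^T$$

and swapping $X$ for $X^T$: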
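$$\frac{\partial\,\mathrm{Tr}(AX^TB)}{\partial X} = \frac{\partial (A_{ij} X^T_{jk} B_{ki})}{\partial X_{pq}} = \frac{\partial X_{kj}}{\partial X_{pq}}\,A_{ij} B_{ki} = \delta_{kp}\,\delta_{jq}\,A_{ij} B_{ki} = A_{iq} B_{pi} = B_{pi} A_{iq} = (BA)_{pq} = BA$$

The remaining expressions are quadratic in $X$, so the product rule yields two terms in each case.

$$\begin{aligned}
\frac{\partial\,\mathrm{Tr}(X^2)}{\partial X} &= \frac{\partial (X_{ij} X_{ji})}{\partial X_{pq}} = \frac{\partial X_{ij}}{\partial X_{pq}}\,X_{ji} + X_{ij}\,\frac{\partial X_{ji}}{\partial X_{pq}} \\
&= \delta_{ip}\,\delta_{jq}\,X_{ji} + X_{ij}\,\delta_{jp}\,\delta_{iq} = X_{qp} + X_{qp} = 2\,(X^T)_{pq} = 2X^T
\end{aligned}$$

$$\begin{aligned}
\frac{\partial\,\mathrm{Tr}(X^2 B)}{\partial X} &= \frac{\partial (X_{ij} X_{jk} B_{ki})}{\partial X_{pq}} = \frac{\partial X_{ij}}{\partial X_{pq}}\,X_{jk} B_{ki} + X_{ij}\,\frac{\partial X_{jk}}{\partial X_{pq}}\,B_{ki} \\
&= \delta_{ip}\,\delta_{jq}\,X_{jk} B_{ki} + X_{ij}\,\delta_{jp}\,\delta_{kq}\,B_{ki} = X_{qk} B_{kp} + X_{ip} B_{qi} \\
&= (XB)_{qp} + B_{qi} X_{ip} = (XB)_{qp} + (BX)_{qp} = \big((XB + BX)^T\big)_{pq} = (XB + BX)^T
\end{aligned}$$

$$\begin{aligned}
\frac{\partial\,\mathrm{Tr}(X^T B X)}{\partial X} &= \frac{\partial (X_{ji} B_{jk} X_{ki})}{\partial X_{pq}} = \frac{\partial X_{ji}}{\partial X_{pq}}\,B_{jk} X_{ki} + X_{ji} B_{jk}\,\frac{\partial X_{ki}}{\partial X_{pq}} \\
&= \delta_{jp}\,\delta_{iq}\,B_{jk} X_{ki} + X_{ji} B_{jk}\,\delta_{kp}\,\delta_{iq} = B_{pk} X_{kq} + X_{jq} B_{jp} \\
&= (BX)_{pq} + (B^T)_{pj} X_{jq} = (BX)_{pq} + (B^T X)_{pq} = BX + B^T X
\end{aligned}$$

$$\begin{aligned}
\frac{\partial\,\mathrm{Tr}(X B X^T)}{\partial X} &= \frac{\partial (X_{ij} B_{jk} X_{ik})}{\partial X_{pq}} = \frac{\partial X_{ij}}{\partial X_{pq}}\,B_{jk} X_{ik} + X_{ij} B_{jk}\,\frac{\partial X_{ik}}{\partial X_{pq}} \\
&= \delta_{ip}\,\delta_{jq}\,B_{jk} X_{ik} + X_{ij} B_{jk}\,\delta_{ip}\,\delta_{kq} = B_{qk} X_{pk} + X_{pj} B_{jq} \\
&= X_{pk} (B^T)_{kq} + (XB)_{pq} = (XB^T)_{pq} + (XB)_{pq} = XB^T + XB
\end{aligned}$$

$$\begin{aligned}
\frac{\partial\,\mathrm{Tr}(AXBX)}{\partial X} &= \frac{\partial (A_{ij} X_{jk} B_{kl} X_{li})}{\partial X_{pq}} = A_{ij}\,\frac{\partial X_{jk}}{\partial X_{pq}}\,B_{kl} X_{li} + A_{ij} X_{jk} B_{kl}\,\frac{\partial X_{li}}{\partial X_{pq}} \\
&= A_{ij}\,\delta_{jp}\,\delta_{kq}\,B_{kl} X_{li} + A_{ij} X_{jk} B_{kl}\,\delta_{lp}\,\delta_{iq} = A_{ip} B_{ql} X_{li} + A_{qj} X_{jk} B_{kp} \\
&= (A^T)_{pi} (X^T)_{il} (B^T)_{lq} + (AXB)_{qp} = (A^T X^T B^T)_{pq} + \big((AXB)^T\big)_{pq} \\
&= A^T X^T B^T + B^T X^T A^T
\end{aligned}$$

$$\begin{aligned}
\frac{\partial\,\mathrm{Tr}(A^T X C A X)}{\partial X} &= \frac{\partial (A_{ji} X_{jk} C_{kl} A_{lm} X_{mi})}{\partial X_{pq}} = A_{ji}\,\frac{\partial X_{jk}}{\partial X_{pq}}\,C_{kl} A_{lm} X_{mi} + A_{ji} X_{jk} C_{kl} A_{lm}\,\frac{\partial X_{mi}}{\partial X_{pq}} \\
&= A_{ji}\,\delta_{jp}\,\delta_{kq}\,C_{kl} A_{lm} X_{mi} + A_{ji} X_{jk} C_{kl} A_{lm}\,\delta_{mp}\,\delta_{iq} = A_{pi} C_{ql} A_{lm} X_{mi} + A_{jq} X_{jk} C_{kl} A_{lp} \\
&= A_{pi} (X^T)_{im} (A^T)_{ml} (C^T)_{lq} + (A^T)_{pl} (C^T)_{lk} (X^T)_{kj} A_{jq} \\
&= (A X^T A^T C^T)_{pq} + (A^T C^T X^T A)_{pq} = A X^T A^T C^T + A^T C^T X^T A
\end{aligned}$$

$$\begin{aligned}
\frac{\partial\,\mathrm{Tr}(X^T B X C)}{\partial X} &= \frac{\partial (X_{ji} B_{jk} X_{kl} C_{li})}{\partial X_{pq}} = \delta_{jp}\,\delta_{iq}\,B_{jk} X_{kl} C_{li} + X_{ji} B_{jk}\,\delta_{kp}\,\delta_{lq}\,C_{li} \\
&= B_{pk} X_{kl} C_{lq} + X_{ji} B_{jp} C_{qi} = (BXC)_{pq} + (B^T)_{pj} X_{ji} (C^T)_{iq} \\
&= (BXC)_{pq} + (B^T X C^T)_{pq} = BXC + B^T X C^T
\end{aligned}$$

$$\begin{aligned}
\frac{\partial\,\mathrm{Tr}(A X B X^T C)}{\partial X} &= \frac{\partial (A_{ij} X_{jk} B_{kl} X_{ml} C_{mi})}{\partial X_{pq}} = A_{ij}\,\delta_{jp}\,\delta_{kq}\,B_{kl} X_{ml} C_{mi} + A_{ij} X_{jk} B_{kl}\,\delta_{mp}\,\delta_{lq}\,C_{mi} \\
&= A_{ip} B_{ql} X_{ml} C_{mi} + C_{pi} A_{ij} X_{jk} B_{kq} = (A^T)_{pi} (C^T)_{im} X_{ml} (B^T)_{lq} + (CAXB)_{pq} \\
&= (A^T C^T X B^T)_{pq} + (CAXB)_{pq} = A^T C^T X B^T + CAXB
\end{aligned}$$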
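Index manipulations like these are easy to get wrong by one transpose, so a numerical check is cheap insurance. The sketch below compares a closed-form gradient for each trace expression against a central-difference estimate on random square matrices. The helper `numerical_gradient`, the size `n`, the seed, and the step `eps` are all assumptions made for illustration.

```python
import numpy as np

def numerical_gradient(f, X, eps=1e-6):
    """Central-difference estimate of G[p, q] = d f(X) / d X[p, q]."""
    G = np.zeros_like(X)
    for p, q in np.ndindex(*X.shape):
        E = np.zeros_like(X)
        E[p, q] = eps
        G[p, q] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
n = 4                            # illustrative size (assumption)
X, A, B, C = (rng.standard_normal((n, n)) for _ in range(4))
tr = np.trace

# name -> (trace expression as a function of M, closed-form gradient at X)
cases = {
    "Tr(XA)":          (lambda M: tr(M @ A),               A.T),
    "Tr(X^T A)":       (lambda M: tr(M.T @ A),             A),
    "Tr(AXB)":         (lambda M: tr(A @ M @ B),           A.T @ B.T),
    "Tr(AX^T B)":      (lambda M: tr(A @ M.T @ B),         B @ A),
    "Tr(X^2)":         (lambda M: tr(M @ M),               2 * X.T),
    "Tr(X^2 B)":       (lambda M: tr(M @ M @ B),           (X @ B + B @ X).T),
    "Tr(X^T B X)":     (lambda M: tr(M.T @ B @ M),         B @ X + B.T @ X),
    "Tr(X B X^T)":     (lambda M: tr(M @ B @ M.T),         X @ B.T + X @ B),
    "Tr(AXBX)":        (lambda M: tr(A @ M @ B @ M),
                        A.T @ X.T @ B.T + B.T @ X.T @ A.T),
    "Tr(A^T X C A X)": (lambda M: tr(A.T @ M @ C @ A @ M),
                        A @ X.T @ A.T @ C.T + A.T @ C.T @ X.T @ A),
    "Tr(X^T B X C)":   (lambda M: tr(M.T @ B @ M @ C),
                        B @ X @ C + B.T @ X @ C.T),
    "Tr(A X B X^T C)": (lambda M: tr(A @ M @ B @ M.T @ C),
                        A.T @ C.T @ X @ B.T + C @ A @ X @ B),
}

# Largest entrywise gap between the numerical and closed-form gradients.
max_errors = {
    name: float(np.max(np.abs(numerical_gradient(f, X) - G)))
    for name, (f, G) in cases.items()
}

for name, err in max_errors.items():
    print(f"{name:16s} max abs error = {err:.2e}")
```

Every expression here is at most quadratic in $X$, so the central difference has no truncation error and any gap well above rounding noise points to a wrong closed form rather than to the step size.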