To illustrate this new derivative, denote by \( F \) the function with components \( f^j_i \) and by \( F' \) its Jacobian. Then
\( F(X) = \sum_i f^j_i (x^i) \)
is a 1-tensor (a vector): the functions are applied to the components of the vector, so the only remaining free index is \( j \). (If the functions were not applied to the vector, the expression would still carry the index \( i \) and would be a matrix.)
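As a minimal numerical sketch of this construction (the particular choice \( f^j_i(t) = A_{ji}\,t^2 \) is an assumption made for illustration, not part of the text):

```python
import numpy as np

# Sketch: F(X)_j = sum_i f_ji(x_i), where each f_ji is a scalar function
# applied to the i-th component. Assumed example: f_ji(t) = A[j, i] * t**2.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

def F(x):
    # apply f_ji to x_i and sum over i; only the free index j remains,
    # so the result is a 1-tensor (a vector)
    return np.array([sum(A[j, i] * x[i]**2 for i in range(len(x)))
                     for j in range(A.shape[0])])

x = np.array([1.0, 2.0])
print(F(x))  # a vector, not a matrix: the index i has been summed away
```

Dropping the sum over \( i \) (and the application to \( x^i \)) would leave both indices \( j \) and \( i \) free, i.e. a matrix.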
\( F'(X) = \sum_i {\nabla}_k \otimes f^j_i (x^i) \)
is a 2-tensor (a matrix), just like the Jacobian. But then we notice that
\( F''(X) = \sum_i {\nabla}_l \otimes {\nabla}_k \otimes f^j_i (x^i) \)
is a 3-tensor (not a matrix), yet it is just as useful as a derivative, because we can use it in a power series expansion. It contributes nothing for linear functions: if \( F \) is linear, then \( F''(X) = 0^j_{kl} \), while for non-linear functions it can be non-zero (although then \( f \) can no longer be written as a simple summation over components).
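We can check the vanishing of \( F'' \) for a linear function numerically. The sketch below estimates the 3-tensor \( F''(X) \), with indices \( (l, k, j) \), by central finite differences (the helper name and step size are assumptions for illustration):

```python
import numpy as np

# Sketch: estimate F''(X), a 3-tensor T[l, k, j] = d_l d_k F_j(X),
# by central finite differences. For a linear F it should vanish.
def second_derivative_tensor(F, x, h=1e-4):
    n = len(x)
    m = len(F(x))
    T = np.zeros((n, n, m))
    for l in range(n):
        for k in range(n):
            e_l = np.eye(n)[l] * h
            e_k = np.eye(n)[k] * h
            T[l, k] = (F(x + e_l + e_k) - F(x + e_l - e_k)
                       - F(x - e_l + e_k) + F(x - e_l - e_k)) / (4 * h**2)
    return T

A = np.array([[1.0, 2.0], [3.0, 4.0]])
linear_F = lambda x: A @ x
T = second_derivative_tensor(linear_F, np.array([0.5, -1.0]))
print(np.allclose(T, 0, atol=1e-5))  # True: F'' = 0 for a linear F
```

Replacing `linear_F` with any non-linear function (e.g. componentwise squaring) yields a non-zero tensor.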
So we have to rewrite the expressions above as:
\( F'(X) = {\nabla}_k \otimes F (X) \)
\( F''(X) = {\nabla}_l \otimes {\nabla}_k \otimes F (X) \)
But what we want for the power series expansion is not the derivatives in general, only the derivatives at zero, so we substitute \( X = 0 \):
\( F'(0^i) = {\nabla}_k \otimes F (0^i) \)
\( F''(0^i) = {\nabla}_l \otimes {\nabla}_k \otimes F (0^i) \)
Now every time we contract one of these tensors with a vector, we reduce its rank by 1, so \( \sum_k {\nabla}_k \otimes F (0^i) \otimes x^k \) is a 1-tensor (a vector), and \( \sum_l \sum_k {\nabla}_l \otimes {\nabla}_k \otimes F (0^i) \otimes x^k \otimes x^l \) is also a 1-tensor (a vector). All the terms therefore have the same dimensions, so they can be added, which is exactly what lets us form a power series from them.
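The rank-reducing contractions can be sketched with `np.einsum`. The concrete values of \( F(0) \), the 2-tensor \( J = F'(0) \), and the 3-tensor \( H = F''(0) \) below are assumptions for illustration, and the \( 1/2 \) coefficient is the usual Taylor-series factorial factor:

```python
import numpy as np

# Sketch: second-order expansion F(x) ≈ F(0) + J·x + (1/2) H·x·x,
# where J[k, j] = d_k F_j(0) (2-tensor) and H[l, k, j] = d_l d_k F_j(0)
# (3-tensor). Each contraction with x lowers the rank by one, so every
# term ends up a vector and they can all be added. Values are made up.
F0 = np.array([1.0, 0.0])                  # F(0)
J = np.array([[1.0, 0.0],
              [2.0, 1.0]])                 # J[k, j]
H = np.zeros((2, 2, 2))                    # H[l, k, j]
H[0, 0, 0] = 2.0                           # one non-linear term in F_0

def taylor2(x):
    first = np.einsum('kj,k->j', J, x)               # rank 2 -> rank 1
    second = 0.5 * np.einsum('lkj,k,l->j', H, x, x)  # rank 3 -> rank 1
    return F0 + first + second                       # all terms are vectors

print(taylor2(np.array([1.0, 1.0])))
```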