05/22/2008, 12:58 AM

Usually, most tensors are used for general relativity, and other metric space related activities, so each index is between 1-3 (or 1-4 for general relativity). But here, on the other hand, each index is between 1-2, which is the number of variables we are using, or the dimension of the vector space that the big-vector-function is defined over. So for the 3-dimensional case, a (n, m)-tensor would have components, and for the 2-dimensional case (which I used earlier), a (n, m)-tensor would have components. However, this still a special case, since tensor spaces are (in general) a Cartesian product (well, tensor product, technically) of vector spaces, so when all of the vector spaces have the same dimension, then the above is true, but it is possible to form the tensor product of a 2-D and a 4-D space, but I can't think of any examples of this.

Relating matrix notation and tensor notation might also be helpful. In general, matrix multiplication is:

in matrix notation and in tensor notation (see here).

Hmm, you said you needed more basics, so I will try and cover the basics. There are two major kinds of tensor multiplication. One is called tensor product (or "vector space" tensor product on MathWorld) and the other is called tensor contraction (same on MathWorld). In some respects, these are analogous to "column-row" matrix multiplication and "row-column" matrix multiplication respectively.

A "column-row" matrix multiplication (a special case of tensor product) can be visualized as:

and a "row-column" matrix multiplication (a special case of tensor contraction) can be visualized as:

So we can see that in general, the tensor product of an (a, b)-tensor and a (c, d)-tensor will give a (a+c, b+d)-tensor, because none of the indices will go away. We can also see that the tensor contraction will decrease the total rank of a single tensor by 2, so we can think of tensor contraction as the tensor product (putting everything in a single tensor, say A), with an extra step of summing indices of that tensor, like for example. Also, since tensor contraction always requires one subscript and one superscript, the tensor contraction of an (a, b)-tensor and a (c, d)-tensor will give a (a+c-1, b+d-1)-tensor, and the tensor contraction of a single (a, b)-tensor with itself will give a (a-1, b-1)-tensor. And just in case it isn't clear by now, an (a, b)-tensor has a "tensor rank" of (a + b), so some places will refer to tensor contraction resulting in a tensor of rank (a + b - 2).

While I was thinking about this, I was modeling with the Mathematica dot operator (see under Properties & Relations), which is a generalization of the dot product and matrix multiplication, only for tensors. Formally, there is no "single" tensor contraction, because the convention is that whenever you find a repeated index variable (for example: k) used as both a superscript and a subscript, then there is a tensor contraction that sums both of those over all possible values of k (for example: 1, 2, 3 for space vectors). However, the Mathematica dot operator is a special case of tensor contraction that assumes that the inner-most indices are being summed over.

So for complicated tensors (and ) this gives:

which is actually 4 tensor contractions (because it uses 4 repeated indices), but

is only 1 tensor contraction (which uses 1 repeated index), and this is what is meant by the generalized "dot" operator. The only problem with this kind of tensor contraction is that it can only be applied once, and it is really the first case that we need for tensor power series, not the second case. But because the first case is so hard to write, I invented the notation to express this. The reason why I chose this notation, is because it is similar to our iteration notation , which is supposed to be a sort of "power" where composition is the "multiplication". So noticing that it just seemed natural to use this notation. What do you think? Would be a better notation? I think it is hard to find a good notation, because the process of "wrapping" up the x's into a tensor requires (tensor product), but as soon as you try and do tensor contraction with , then it requires (Mathematica dot) with each x separately, I'm really not sure what the best way to write this is. Since tensor notation was designed for these difficulties, it might be best to stay with it.

I think one place where the notation gets really confusing is evaluating the multiple-gradient at zero (or whatever the expansion point is). Re-reading my earlier posts, I get the impression that F(0) could be interpreted as apply F to the zero vector, then take the multiple-gradients, but what is intended is to apply the multiple-gradients, then evaluate this "derivative"-like thing at the zero vector.

Relating matrix notation and tensor notation might also be helpful. In general, matrix multiplication is:

in matrix notation and in tensor notation (see here).

Hmm, you said you needed more basics, so I will try and cover the basics. There are two major kinds of tensor multiplication. One is called tensor product (or "vector space" tensor product on MathWorld) and the other is called tensor contraction (same on MathWorld). In some respects, these are analogous to "column-row" matrix multiplication and "row-column" matrix multiplication respectively.

A "column-row" matrix multiplication (a special case of tensor product) can be visualized as:

and a "row-column" matrix multiplication (a special case of tensor contraction) can be visualized as:

So we can see that in general, the tensor product of an (a, b)-tensor and a (c, d)-tensor will give a (a+c, b+d)-tensor, because none of the indices will go away. We can also see that the tensor contraction will decrease the total rank of a single tensor by 2, so we can think of tensor contraction as the tensor product (putting everything in a single tensor, say A), with an extra step of summing indices of that tensor, like for example. Also, since tensor contraction always requires one subscript and one superscript, the tensor contraction of an (a, b)-tensor and a (c, d)-tensor will give a (a+c-1, b+d-1)-tensor, and the tensor contraction of a single (a, b)-tensor with itself will give a (a-1, b-1)-tensor. And just in case it isn't clear by now, an (a, b)-tensor has a "tensor rank" of (a + b), so some places will refer to tensor contraction resulting in a tensor of rank (a + b - 2).

While I was thinking about this, I was modeling with the Mathematica dot operator (see under Properties & Relations), which is a generalization of the dot product and matrix multiplication, only for tensors. Formally, there is no "single" tensor contraction, because the convention is that whenever you find a repeated index variable (for example: k) used as both a superscript and a subscript, then there is a tensor contraction that sums both of those over all possible values of k (for example: 1, 2, 3 for space vectors). However, the Mathematica dot operator is a special case of tensor contraction that assumes that the inner-most indices are being summed over.

So for complicated tensors (and ) this gives:

which is actually 4 tensor contractions (because it uses 4 repeated indices), but

is only 1 tensor contraction (which uses 1 repeated index), and this is what is meant by the generalized "dot" operator. The only problem with this kind of tensor contraction is that it can only be applied once, and it is really the first case that we need for tensor power series, not the second case. But because the first case is so hard to write, I invented the notation to express this. The reason why I chose this notation, is because it is similar to our iteration notation , which is supposed to be a sort of "power" where composition is the "multiplication". So noticing that it just seemed natural to use this notation. What do you think? Would be a better notation? I think it is hard to find a good notation, because the process of "wrapping" up the x's into a tensor requires (tensor product), but as soon as you try and do tensor contraction with , then it requires (Mathematica dot) with each x separately, I'm really not sure what the best way to write this is. Since tensor notation was designed for these difficulties, it might be best to stay with it.

I think one place where the notation gets really confusing is evaluating the multiple-gradient at zero (or whatever the expansion point is). Re-reading my earlier posts, I get the impression that F(0) could be interpreted as apply F to the zero vector, then take the multiple-gradients, but what is intended is to apply the multiple-gradients, then evaluate this "derivative"-like thing at the zero vector.