09/28/2007, 03:31 AM

I've already touched on this, but I did want to describe somewhat briefly how I've accelerated convergence of Andrew's slog, with a more detailed description to follow when I have the time.

First of all, let's make the assumption that the solution to the infinite system exists, and furthermore, that the resultant power series has a non-zero radius of convergence. Although heuristically the evidence is fairly strong, we don't know enough to formally prove it. However, like the Riemann hypothesis, we can still make some solid conclusions which we know are true IF the hypothesis is true.

Anyway, based on these assumptions, let's try to figure out how the convergence of the smaller systems works. First, let's start with the infinite system. We'll call the matrix M, and the column vector [1, 0, 0, ...] we'll call B. The coefficients A of the slog are then the solution of the equation MA = B.

Now, below a certain row, say, row 100, let's replace the rows with the corresponding entry from an infinite identity matrix. So rows 1 through 100 will stay the same, but 101 to infinity are going to be changed. In order to do this, we replace the zero in the column vector B with the actual coefficient in A we're trying to solve for. This requires knowing the coefficients in A, obviously.

Now it should be clear how Andrew's slog is converging on the true solution. We're solving for the infinite system, using the modified format the I described above, but instead of replacing the values in B with the true values, we're replacing them with 0's. This introduces errors in the first 100 (or n) coefficients to compensate.

In order to improve convergence, what I've done is taken the coefficients for the two known singularities at 0.31813... + 1.3372357i and its conjugate, and put them in the B vector, rather than 0's. These coefficients aren't the true values from A, but they're much, much, much closer to the true values. This reduces the errors in the final solution by several orders of magnitude.

Notice that we don't actually calculate the solution to the full system. We end up with, for example, a 100 x infinity system. 100 rows, infinity columns. However, in practice, we truncate this down to perhaps 100x2000. And we don't even have to solve the full system. We can precalculate the 100x1900 system, starting at column 101, and subtract from B.

In other words, assume we're given K as a 100x2000 matrix, A as a column vector with 2000 rows, and B as a column vector with 100 rows. Then, using the known values of K, and the approximations for rows 101 through 2000 of A, we solve the new system:

Thus, we only solve a 100x100 system, which actually is solving a 2000x2000 system, with very, very good approximations of the coefficients 101 through 2000. And note that the right hand side can be precalculated in chunks, reducing memory requirements and making it feasible to solve large systems. For example:

Extending further, I can solve a 700x700 system, using a precomputed vector which gives me the solution to a 700x10,000 system, which itself is the solution to a 10,000x10x000 system with approximations for terms 701 to 10,000. As with the original system, the errors stack up about halfway through, so a 700x700 system is only really accurate out to about 300-400 terms. I should be careful about how I say this, because all 700 terms are at least as accurate as the approximating series I would get if I used the coefficients of the two singularities.

The residue is the part that is inaccurate, and by the 300th term, the coefficients seem to be about ten orders of magnitude smaller than the singularities I've removed. And by the 300th coefficient, I'm already dealing with extremely small coefficients as it is. So this 700x700 system is probably accurate to a hefty number of decimal places, at least 40, if not 60 or more, for complex inputs about a unit distance from the origin or less.

First of all, let's make the assumption that the solution to the infinite system exists, and furthermore, that the resultant power series has a non-zero radius of convergence. Although heuristically the evidence is fairly strong, we don't know enough to formally prove it. However, like the Riemann hypothesis, we can still make some solid conclusions which we know are true IF the hypothesis is true.

Anyway, based on these assumptions, let's try to figure out how the convergence of the smaller systems works. First, let's start with the infinite system. We'll call the matrix M, and the column vector [1, 0, 0, ...] we'll call B. The coefficients A of the slog are then the solution of the equation MA = B.

Now, below a certain row, say, row 100, let's replace the rows with the corresponding entry from an infinite identity matrix. So rows 1 through 100 will stay the same, but 101 to infinity are going to be changed. In order to do this, we replace the zero in the column vector B with the actual coefficient in A we're trying to solve for. This requires knowing the coefficients in A, obviously.

Now it should be clear how Andrew's slog is converging on the true solution. We're solving for the infinite system, using the modified format the I described above, but instead of replacing the values in B with the true values, we're replacing them with 0's. This introduces errors in the first 100 (or n) coefficients to compensate.

In order to improve convergence, what I've done is taken the coefficients for the two known singularities at 0.31813... + 1.3372357i and its conjugate, and put them in the B vector, rather than 0's. These coefficients aren't the true values from A, but they're much, much, much closer to the true values. This reduces the errors in the final solution by several orders of magnitude.

Notice that we don't actually calculate the solution to the full system. We end up with, for example, a 100 x infinity system. 100 rows, infinity columns. However, in practice, we truncate this down to perhaps 100x2000. And we don't even have to solve the full system. We can precalculate the 100x1900 system, starting at column 101, and subtract from B.

In other words, assume we're given K as a 100x2000 matrix, A as a column vector with 2000 rows, and B as a column vector with 100 rows. Then, using the known values of K, and the approximations for rows 101 through 2000 of A, we solve the new system:

Thus, we only solve a 100x100 system, which actually is solving a 2000x2000 system, with very, very good approximations of the coefficients 101 through 2000. And note that the right hand side can be precalculated in chunks, reducing memory requirements and making it feasible to solve large systems. For example:

Extending further, I can solve a 700x700 system, using a precomputed vector which gives me the solution to a 700x10,000 system, which itself is the solution to a 10,000x10x000 system with approximations for terms 701 to 10,000. As with the original system, the errors stack up about halfway through, so a 700x700 system is only really accurate out to about 300-400 terms. I should be careful about how I say this, because all 700 terms are at least as accurate as the approximating series I would get if I used the coefficients of the two singularities.

The residue is the part that is inaccurate, and by the 300th term, the coefficients seem to be about ten orders of magnitude smaller than the singularities I've removed. And by the 300th coefficient, I'm already dealing with extremely small coefficients as it is. So this 700x700 system is probably accurate to a hefty number of decimal places, at least 40, if not 60 or more, for complex inputs about a unit distance from the origin or less.