Improving convergence of Andrew's slog
#1
I've already touched on this, but I did want to describe somewhat briefly how I've accelerated convergence of Andrew's slog, with a more detailed description to follow when I have the time.

First of all, let's make the assumption that the solution to the infinite system exists, and furthermore, that the resulting power series has a non-zero radius of convergence. Although the heuristic evidence is fairly strong, we don't know enough to prove it formally. However, as with the Riemann hypothesis, we can still draw some solid conclusions which we know are true IF the hypothesis is true.

Anyway, based on these assumptions, let's try to figure out how the convergence of the smaller systems works. First, let's start with the infinite system. We'll call the matrix M, and the column vector [1, 0, 0, ...] we'll call B. The coefficients A of the slog are then the solution of the equation MA = B.
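
To make that concrete, here's a rough sketch of how the truncated "natural" system can be set up. This is illustrative Python/mpmath, not my actual SAGE code; it assumes the coefficient comparison of \( \mathrm{slog}\left(\exp(z)\right)=\mathrm{slog}(z)+1 \), which gives matrix entries \( n^m/m! \) minus the identity.

Code:
# Illustrative sketch (Python/mpmath), not my actual SAGE code.
# Writing slog(z) = c0 + sum_{n>=1} a_n z^n and comparing coefficients of z^m in
# slog(exp(z)) = slog(z) + 1 gives rows  sum_{n>=1} a_n (n^m/m! - [m==n]) = [m==0].
from mpmath import mp, mpf, factorial, matrix, lu_solve

mp.dps = 60  # working precision in decimal digits

def natural_slog_coeffs(N):
    """Solve the NxN truncation of the unaccelerated ('natural') system M A = B."""
    M = matrix(N, N)
    B = matrix(N, 1)
    for m in range(N):                 # row m: coefficient of z^m
        for n in range(1, N + 1):      # column n: unknown coefficient a_n
            M[m, n - 1] = mpf(n)**m / factorial(m) - (1 if m == n else 0)
    B[0] = 1                           # the column vector [1, 0, 0, ...]
    return lu_solve(M, B)              # approximations of a_1 .. a_N

# a = natural_slog_coeffs(100)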

Now, below a certain row, say row 100, let's replace each row with the corresponding row of an infinite identity matrix. So rows 1 through 100 will stay the same, but rows 101 to infinity are going to be changed. In order for this modified system to have the same solution, we must also replace the corresponding zeros in the column vector B with the actual coefficients of A we're trying to solve for. This requires knowing the coefficients of A, obviously.

Now it should be clear how Andrew's slog converges on the true solution. We're solving the infinite system, using the modified format that I described above, but instead of replacing the values in B with the true values, we're replacing them with 0's. This introduces errors in the first 100 (or n) coefficients to compensate.

In order to improve convergence, what I've done is taken the coefficients for the two known singularities at 0.31813... + 1.3372357i and its conjugate, and put them in the B vector, rather than 0's. These coefficients aren't the true values from A, but they're much, much, much closer to the true values. This reduces the errors in the final solution by several orders of magnitude.
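
For reference, the approximating coefficients I'm talking about are just the Taylor coefficients of the two logarithms at the primary fixed points. A rough sketch (illustrative only; the exact normalization and branch bookkeeping are the fiddly part):

Code:
# Illustrative sketch: the slog behaves like log(L - z)/log(L) plus the conjugate
# term near the primary fixed points (the multiplier at L is exp(L) = L, hence the
# 1/log(L) normalization).  Expanding log(1 - z/L) gives the n-th coefficient below.
from mpmath import mp, lambertw, log, re

mp.dps = 60
L = -lambertw(-1)   # primary fixed point of exp: 0.318131... + 1.337235...i

def singular_coeff(n):
    """Approximate n-th coefficient contributed by the two logarithmic singularities."""
    return -2 * re(1 / (n * L**n * log(L)))

# These approximations go into the B vector for rows 101 and beyond (and into the
# precomputed product over columns 101..2000 described below).
# tail = [singular_coeff(n) for n in range(101, 2001)]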

Notice that we don't actually calculate the solution to the full system. We end up with, for example, a 100 x infinity system: 100 rows, infinitely many columns. In practice, we truncate this down to perhaps 100x2000. And we don't even have to solve that full system: we can precalculate the product of the 100x1900 block (columns 101 through 2000) with the approximate coefficients and subtract it from B.

In other words, assume we're given K as a 100x2000 matrix, A as a column vector with 2000 rows, and B as a column vector with 100 rows. Then, using the known values of K, and the approximations for rows 101 through 2000 of A, we solve the new system:

\( K_{1\ldots100}\, A_{1\ldots100} \;=\; B \,-\, K_{101\ldots2000}\, A_{101\ldots2000} \)

Thus, we only solve a 100x100 system, which is effectively solving a 2000x2000 system with very, very good approximations of coefficients 101 through 2000. And note that the right hand side can be precalculated in chunks, reducing memory requirements and making it feasible to solve large systems. For example:

\( B' \;=\; B \,-\, K_{101\ldots500}\, A_{101\ldots500} \,-\, K_{501\ldots1000}\, A_{501\ldots1000} \,-\, K_{1001\ldots1500}\, A_{1001\ldots1500} \,-\, K_{1501\ldots2000}\, A_{1501\ldots2000} \)

\( A_{1\ldots100} \;=\; K_{1\ldots100}^{-1}\, B' \)
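
In code, the chunked precalculation looks something like this (again just a sketch; K, B and the approximate tail are assumed to be built already, and the names are placeholders, not my actual code):

Code:
# Illustrative sketch of the chunked right-hand-side precalculation.  K is the
# n_small x n_big truncation, B the right-hand side of length n_small, and A_tail
# the approximate coefficients for columns n_small+1 .. n_big (e.g. from the
# logarithms at the fixed points).
from mpmath import matrix, lu_solve

def accelerated_solve(K, B, A_tail, n_small, chunk=500):
    n_big = K.cols
    Bp = matrix(n_small, 1)
    for i in range(n_small):
        Bp[i] = B[i]
    # subtract K[:, j0:j1] * A_tail[j0:j1] one chunk at a time to bound memory use
    for j0 in range(n_small, n_big, chunk):
        j1 = min(j0 + chunk, n_big)
        for i in range(n_small):
            s = 0
            for j in range(j0, j1):
                s += K[i, j] * A_tail[j - n_small]
            Bp[i] -= s
    # then solve only the small n_small x n_small system for the leading coefficients
    K_small = matrix(n_small, n_small)
    for i in range(n_small):
        for j in range(n_small):
            K_small[i, j] = K[i, j]
    return lu_solve(K_small, Bp)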

Extending further, I can solve a 700x700 system using a precomputed vector which gives me the solution to a 700x10,000 system, which itself is the solution to a 10,000x10,000 system with approximations for terms 701 to 10,000. As with the original system, the errors stack up about halfway through, so a 700x700 system is only really accurate out to about 300-400 terms. I should be careful about how I say this, because all 700 terms are at least as accurate as the approximating series I would get if I used only the coefficients of the two singularities.

The residue is the part that is inaccurate, and by the 300th term, its coefficients seem to be about ten orders of magnitude smaller than those of the singularities I've removed. And by the 300th coefficient, I'm already dealing with extremely small coefficients as it is. So this 700x700 system is probably accurate to a hefty number of decimal places, at least 40, if not 60 or more, for complex inputs about a unit distance from the origin or less.
~ Jay Daniel Fox
#2
I'll post code when I've had a chance to clean it up, but I've been really busy the past couple weeks. Here are some coefficients I calculated two weeks ago, that I've honestly only just today had a chance to look at and perform a quick sanity test on. The first few (omitting the constant term):

Code:
0.9159460564995333939479691812942268437
0.2493545986721730438836546471260900921
-0.1104647597964313574537187937346440024
-0.09393625509985870822446364031599412950
0.01000323329323155623460883844375422560
0.03589792159454311060893719837092056919
0.006573401099605069030878509226706541076
-0.01230685951818438834898435047586546818
-0.006389802569157469189015008370967404941
0.003273589822817257105090732708082107253
0.003769202952828280970722223160084597794
-0.0002802170195369746576871369468148542093
-0.001775106557196463508225358976394413676
-0.0004279699575246649277874304606275712555
0.0006797232612443379504312897205488076911
0.0004127926181657687663755328531761878708
-0.0001865977837752200311706517694607367697
-0.0002535491984167313806722016634301260175
0.000007474329223085892546373262727349306521
0.0001231669079299400651864105820209693007

And I've attached all 700 coefficients (slog_700_accel.txt). Beyond this, just use the coefficients resulting from adding the basic logarithms at the fixed points.

I've also attached the coefficients for the "residue", for the first 700 terms (slog_700_residue.txt). To use these, simply calculate the logarithms for the two primary fixed points, then add the residue. Again, please note that I've omitted the constant term.

Using the residue negates the need for more than 700 terms. Picking branches must be done carefully, obviously; just adjust your constant term to ensure that slog(0) is -1, or whatever value you want. The residue is only valid inside a radius of convergence limited by the primary fixed points, though the root test for these would indicate that you can get decent precision a little further out, as long as you stay away from the singularities themselves. At any rate, it's best to use iterated logarithms or exponentials to find the point closest to the origin, evaluate at the chosen point, then iterate back to where you started (adding or subtracting an integer as necessary).
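
Here's a rough sketch of that evaluation strategy for real inputs (illustrative only: the coefficient list and the evaluation radius are placeholders, and branch handling is sidestepped by sticking to the real line):

Code:
# Illustrative sketch: pull the argument inside the disc of convergence with
# iterated logarithms (or push it there with exponentials), evaluate the series,
# then add back the number of iterations.  Real inputs only; coeffs[k] is the
# coefficient of z^(k+1), with the constant term chosen so that slog(0) = -1.
from mpmath import mp, mpf, exp, log

mp.dps = 60

def slog_series(z, coeffs, c0=-1):
    return c0 + sum(c * z**(k + 1) for k, c in enumerate(coeffs))

def slog_real(x, coeffs, radius=mpf('1.3')):
    x = mpf(x)
    shift = 0
    while x > radius:        # too far out: pull back with logarithms
        x = log(x)
        shift += 1
    while x < -radius:       # too far left: push forward with exponentials
        x = exp(x)
        shift -= 1
    return slog_series(x, coeffs) + shift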

As far as accuracy goes, it's only about 20 decimal digits or so per coefficient, at least for the first few hundred terms, but the total accuracy is much better, probably 40+ digits inside the unit circle about the origin. I haven't had time to really dig in and investigate it thoroughly yet, however.


Attached Files
.txt   slog_700_accel.txt (Size: 54.5 KB / Downloads: 1,234)
.txt   slog_700_residue.txt (Size: 53.5 KB / Downloads: 1,078)
~ Jay Daniel Fox
#3
And while it should be exceedingly obvious, I'll go ahead and mention anyway that these numbers were for base e.
~ Jay Daniel Fox
#4
Hmm, it would seem that my accelerated solution produces much better results than I previously had anticipated.

First, we have to remember that I've made an unproven (but very probably valid) assumption that Andrew's matrix solution converges on a solution with a logarithmic singularity at the upper and lower primary fixed points (at +/- 0.318...+1.337...*I), allowing acceleration as I've described.

So, to provide some very preliminary numerical evidence that this assumption is valid, I've solved a 1250x1250 unaccelerated system. For now, I'll call unaccelerated systems the "natural" solutions, using Henryk's suggestion.

I solved the 1250x1250 natural system at 2560 and 2816 bits, in order to calculate precision. In the 2560 system, every term had at least 994 bits of absolute precision (i.e., relative to 1.0), and at least 993 bits of relative precision (i.e., relative to the coefficient in question). I then used the 2816 bit solution, which should have approximately 256 additional bits of precision, though for the purposes of this experiment I probably only need 50-100 bits of relative precision.

Okay, the following graph might be a bit difficult to follow. The blue lines are the pseudo-relative error in bits of various natural solutions. The red lines are the error of various accelerated solutions.

The x axis represents the coefficient's index (0 being the first term, the \( z^1 \) coefficient). The y axis is the pseudo-relative error in bits, with a negative ten (-10) representing an error of \( \frac{1}{2^{10}}=\frac{1}{1024} \).

Finally, I should explain what I mean by pseudo-relative error. The coefficients themselves decrease approximately exponentially, but they have an approximately 7-term cycle that makes the graph of the relative error spike about twice every 7 terms, which makes analysis difficult. Therefore, I've used the magnitude of the distance of the primary fixed points from the origin as an exponential scaling factor (it's a little more complicated than that, but that's the gist of it).
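
For what it's worth, the scaling I'm using amounts to something like this (a sketch of the idea; the exact scaling I used is a bit more involved):

Code:
# Illustrative sketch of the pseudo-relative error: instead of dividing by the
# (oscillating) coefficient itself, divide by the smooth envelope ~ 1/(n*|L|^n)
# that the logarithmic singularities at the fixed points impose, then take log2.
from mpmath import mp, lambertw, fabs, log, mpf

mp.dps = 60
L = -lambertw(-1)          # primary fixed point of exp
absL = fabs(L)             # |L| ~ 1.3745, the radius of convergence

def pseudo_relative_error_bits(approx, reference):
    bits = []
    for n, (a, r) in enumerate(zip(approx, reference), start=1):
        err = fabs(a - r) * n * absL**n    # rescale by the singularity envelope
        bits.append(log(err, 2) if err > 0 else mpf('-inf'))
    return bits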

[Graph: pseudo-relative error in bits vs. coefficient index, for natural solutions (blue) and accelerated solutions (red).]

The blue lines, from darkest to lightest (and, of course, from most error to least error): 50, 100, 200, 300, and 500 terms of the natural solution.

The red lines, from darkest to lightest: 10, 20, 30, and 40.

Yes, that's right. The 10-term accelerated solution is about as accurate as the 150-term natural solution. The 20-term accelerated solution is about as accurate as the 500-term natural solution. The 30- and 40-term solutions don't look much different from each other, but don't be fooled: that's due to inaccuracies in the 1250-term natural solution.

I hope this provides strong numerical evidence that the natural solution converges, very slowly, on the accelerated solution. If you don't agree, the following graph will be somewhat pointless.

The following graph uses a 1200-term accelerated solution as its baseline for determining pseudo-relative error. The additional (lightest) blue line is for the 1250-term natural solution:

[Graph: same comparison against a 1200-term accelerated baseline, with the additional (lightest) blue line for the 1250-term natural solution.]

Assuming you agree that the acceleration technique is valid, you can see that the 20-term accelerated solution is more accurate than the 500-term natural solution, and the 30-term accelerated solution is more accurate than the 1250-term natural solution!!!

So how accurate is the 1200-term accelerated solution? I have no idea. If you take the square root of the number of terms in the various natural solutions, you get a number approximately equal to the size of a comparable accelerated system. If this trend holds, then in theory the 1200-term solution is more accurate than a one-million-term natural solution. However, I don't have enough data points to make so bold a claim. I'll hedge my bets and assume it's more accurate than a 100,000-term natural solution.

PS: Sorry I don't have labels on the graphs themselves. I'll add those at some point, once I figure out how to do it within SAGE.

Edit: Updated graphs, to reflect a correction to the exponential scaling I was using. I forgot a factor of n, where n is the index, because these are logarithmic singularities. The graphs now correctly show errors in the 0-bit range for the natural solutions, i.e., pseudo-relative errors as large as the terms themselves.
~ Jay Daniel Fox
#5
Hmm, even with the accelerated solutions converging as nicely as they do, it would seem that I'm only getting about 21-22 digits of accuracy from a 900-term solution (relative to a 1200-term solution, so granted, even that figure is only approximate).

Accuracy is better when we get close to integer tetrations. For \( {}^{\pi} e \), the 900- and 1200-term solutions agree to a little over 23 digits of accuracy. Therefore, I'm hopeful that the 1200-term solution is accurate to a few more digits than that, perhaps 25-27 digits.

So, to about 25 digits of accuracy, give or take, \( {}^{\pi} e \) is 37149801960.55698549872339920573987
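
For anyone wanting to reproduce that number, the evaluation amounts to inverting the slog. A sketch (not my actual code; slog here stands for any evaluator of the truncated series, e.g. the one sketched earlier):

Code:
# Illustrative sketch: with the normalization slog(0) = -1, slog(1) = 0, the
# superexponential is sexp = slog^{-1}, so e tetrated to pi is sexp(pi): solve
# slog(x) = frac(pi) near the origin, then apply exp floor(pi) = 3 times.
from mpmath import mp, pi, exp, floor, findroot

mp.dps = 60

def sexp(y, slog):
    n = int(floor(y))                 # integer part, re-applied as exponentials
    frac = y - n                      # fractional part, handled by the series
    x = findroot(lambda t: slog(t) - frac, 1.5)   # slog(x) = frac, x in [1, e)
    for _ in range(n):
        x = exp(x)
    return x

# sexp(pi, my_slog)   # should land near 3.7149801960...e10 with good coefficients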
~ Jay Daniel Fox
#6
I realize now why I was so sure that I'd get more accuracy out of my accelerated solutions.

It turns out I was almost right. You see, I can get hundreds of digits of precision out of my accelerated solutions, though only dozens of digits of accuracy.

Precision
I qualify precision to mean that \( \mathrm{slog}\left(\exp(z)\right)=\mathrm{slog}(z)+1 \).

In this sense, my accelerated 1200-term solution is precise to well over 400 decimal digits at about z=-0.55, with precision dropping off exponentially (linearly if measured in digits) as we move in either direction from that point (because either z or exp(z) is moving away from the origin). By z=-1 or z=0, precision has dropped off to about 170 digits, which is still twice what I'd originally hoped for (80 digits).
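
Measuring this is straightforward wherever both z and exp(z) are inside the disc of convergence. A sketch (the series evaluator is a placeholder for the attached coefficients, not my actual code):

Code:
# Illustrative sketch of the "precision" (defect) measurement: how many digits of
# slog(exp(z)) = slog(z) + 1 hold for the truncated series.  Only meaningful where
# both z and exp(z) lie inside the disc of convergence (e.g. z = -0.55).
from mpmath import mp, exp, log10, fabs

mp.dps = 600   # plenty of working precision, to resolve a ~400-digit defect

def slog_series(z, coeffs, c0=-1):
    return c0 + sum(c * z**(k + 1) for k, c in enumerate(coeffs))

def precision_digits(z, coeffs):
    defect = fabs(slog_series(exp(z), coeffs) - slog_series(z, coeffs) - 1)
    return -log10(defect) if defect > 0 else mp.inf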

Edit: If I include the additional terms for the logarithms used to accelerate the solution (i.e., if I use the "residue" and then add the logarithms), then precision is centered at z=0 and is essentially limited by the imprecision of the matrix solver for now.

I don't even see imprecision at 2048 bits until z=-0.398 to the left and z=0.324 to the right. By z=-0.5 to the left, precision is still in excess of 500 digits, and by 0.5 to the right, it's in excess of 450 digits. It's not until about -1.175 and 0.788 that I see precision drop below 100 digits.

As it is, this amount of precision is extremely good. If I could just find a way to get similar accuracy!


Accuracy
I qualify accuracy to mean how closely the slog of a finite solution, evaluated at non-integers (typically at z=0.5), matches the theoretical value of the natural solution of the infinite system (or at least the limit of finite solutions as the size goes to infinity, assuming such a limit exists). Accuracy is obviously harder to measure, but we can ballpark it to within a few orders of magnitude, for example.

In this sense, my accelerated 1200-term solution is accurate to about 20-25 digits at z=0.5, which is far less accuracy than I'd hoped for.
~ Jay Daniel Fox
#7
Wow! Just out of curiosity, Jay, have you made any progress on this front? This is quite fascinating. I read Andrew's paper, and while there are still open issues, the idea of an extension of tetration to real numbers that is both \( C^{\infty} \) and satisfies \( x[4]y = x^{x[4](y-1)} \) for all real y really appeals to me. I think the basic approach is in the right direction: in general, functions induced by the hyper operators have extreme growth rates, and something about this fact causes analysis to be very difficult. But their inverse functions have sane rates of growth (that is to say, extremely slow), and should be much easier to study.
#8
quickfur Wrote: the idea of an extension of tetration to real numbers that is both \( C^{\infty} \) and satisfies \( x[4]y = x^{x[4](y-1)} \) for all real y really appeals to me.

Not only \( C^{\infty} \) but even analytic! However, this still does not determine the function ... there are other choices, as you can read elsewhere on the forum *which I see you do already extensively*
#9
bo198214 Wrote:
quickfur Wrote: the idea of an extension of tetration to real numbers that is both \( C^{\infty} \) and satisfies \( x[4]y = x^{x[4](y-1)} \) for all real y really appeals to me.

Not only \( C^{\infty} \) but even analytic! However, this still does not determine the function ... there are other choices, as you can read elsewhere on the forum *which I see you do already extensively*
There are other choices that are analytic and satisfy \( x[4]y = x^{x[4](y-1)} \) for all real y (in the domain)? I thought the other alternatives were either less than \( C^{\infty} \) or only satisfied \( x[4]y = x^{x[4](y-1)} \) for integer y. I find the latter quite unappealing, because you could easily just use linear functions to interpolate between the integer values of the function, but that doesn't mean it's the "right" extension of the function to the reals!

As for reading this forum extensively... not really. I've only barely skimmed some of the most interesting-looking threads. Although, I did have quite an interest in the real analogs of the higher-order operations in the past.
#10
quickfur Wrote:There are other choices that are analytic and satisfies \( x[4]y = x^{x[4](y-1)} \) for all real y (in the domain)?

Absolutely, otherwise the problem of finding uniqueness criteria for the extension to the reals would already be solved.
In the above formula let \( f(x)=b[4]x \) for some base \( b \), then the equation is \( f(x+1)=b^{f(x)} \).
Now we can perturb \( f \) in the following way: let \( \phi \) be a periodic analytic function with \( \phi(x+1)=\phi(x) \) (for example \( \phi(x)=\sin(2\pi x) \)) and consider
\( g(x)=f(x+\phi(x)) \)
then
\( g(x+1)=f(x+1+\phi(x+1))=f(x+\phi(x)+1)=b^{f(x+\phi(x))}=b^{g(x)} \), so \( g \) is another analytic function satisfying this equation (and every other solution is a perturbation of \( f \) by such a periodic function).
If you choose \( \phi(x) \) small in amplitude, the function \( g(x) \) even remains strictly monotonically increasing.
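
To make the amplitude condition explicit: differentiating gives \( g'(x)=\left(1+\phi'(x)\right)\,f'\!\left(x+\phi(x)\right) \), so if \( f \) is strictly increasing then \( g \) stays strictly increasing as long as \( |\phi'(x)|<1 \); for \( \phi(x)=a\sin(2\pi x) \) that means \( |a|<\frac{1}{2\pi} \).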

