65816: Floating Point – Finding a Balance Between Speed and Precision

I first got interested in the 65816 because of floating=point support. I had just finished porting Cosmic Conquest, an old text-based strategy game, to my 6502 system and was looking for other old games to port. I remembered an old text-based Star Trek game but on tracking it down discovered that it used floating point. Looking at the various packages available for the 6502 I realized that adding floating-point support was going to take up a significant portion of my remaining ROM. I decided to hold off on floating-point until I had a less memory constrained system.

After building my 65816 system (and probably more importantly an emulator to support development on it) I decided to give floating-point a try. Floating-point isn’t that mysterious once you understand the basics of how to represent floating-point values in binary. As usual, Wikipedia has a good summary on the IEEE-754 Floating Point Standard. Basically, the standard breaks a floating-point value into a sign, an exponent and a mantissa, representing the significant digits of the value. Mathematical operations are performed normally on the mantissas, after shifting the smaller one to equalize the exponents. The operation’s resulting exponent and sign are determined by conventional means and the final result is “normalized” to the standard format.

Still, it’s nice to start off with a ready-made package and there are several targeting the 65xx. The code repository over at 6502.org lists several floating-point packages. I went with the package developed by Marco Granati specifically for the 65816. It’s a very complete floating-point package supporting all of the functions you’d expect to find on a scientific calculator as well as those required to convert from a string format and back.

It didn’t take long to port the source code over to the CA6 assembler and write support functions for my Forth operating system. My first real test of the package was to produce the Mandelbrot set, something I guess most hobbyists try out at one point or another. Here is my version using a modified version of the Rosetta Code for Forth.

Mandelbrot Set

üüüüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôòòòïèàèíâùùùùùùùùùùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôòòòòíêÛØêíïôôôôùùùùùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôòòòà·Îp cÄÛåòòòôôôô÷ùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôïïïííÝ      Áêíïòòòòòô÷÷÷ùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôôôôàà åêèÎØÛÛÖÄ   ÆØàÛå êïïïíÌï÷÷÷ùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôòòòïíèØ  B¼                à¨º∟Îèò÷÷÷÷ùùùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷òòòòòòïïïÝâ ¨                      Äèïòò÷÷÷÷ùùùùùùùùù
üüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷òòòòòòïïíêâ                           Áêíò÷÷÷÷÷ùùùùùùùù
üüüüüüüüüüüüüüüüü÷÷÷÷÷èíèåØèèíííêêàw                              ïô÷÷÷÷÷ùùùùùùù
üüüüüüüüüüüüüüüüüòííèèª É     Äââà                              Ýàòôô÷÷÷÷÷ùùùùùù
üüüüüüüüüüüüüüüüòòííØ            Ó                              âíòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüü    ¿            =                             Ýïòòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüü    ¿            =                             Ýïòòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüüòòííØ            Ó                              âíòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüüüòííèèª É     Äââà                              Ýàòôô÷÷÷÷÷ùùùùùù
üüüüüüüüüüüüüüüüü÷÷÷÷÷èíèåØèèíííêêàw                              ïô÷÷÷÷÷ùùùùùùù
üüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷òòòòòòïïíêâ                           Áêíò÷÷÷÷÷ùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷òòòòòòïïïÝâ ¨                      Äèïòò÷÷÷÷ùùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôòòòïíèØ  B¼                à¨º∟Îèò÷÷÷÷ùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôôôôàà åêèÎØÛÛÖÄ   ÆØàÛå êïïïíÌï÷÷÷ùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôïïïííÝ      Áêíïòòòòòô÷÷÷ùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôòòòà·Îp cÄÛåòòòôôôô÷ùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôòòòòíêÛØêíïôôôôùùùùùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôòòòïèàèíâùùùùùùùùùùùùùùùùùùùùùùùùùù

There was just a small problem though. The plot above took about 26 minutes to produce on my fastest PC using my 65816 emulator which I estimate runs at an equivalent of about 800 kHz on that machine. This would still take over 2 1/2 minutes running on my 65816 build at 8 MHz. That was just too slow to be of much interest.

I wanted something faster, so I looked for ways to speed up the floating-point package. I noted a few possibilities when I added the floating package (fp) to my system:

Use the system direct page. The fp package uses a dedicated direct page which must be shifted to and back for each fp operation.
Use a dedicated fp stack. The fp package uses registers which must be loaded and the result retrieved from for each fp operation.
Use 16-bit registers unless needed otherwise. The fp package requires every fp operation to be called with 8-bit registers. Since my system uses 16-bit registers, this required a switch before and after every fp operation.

But probably the biggest slowdown in my Mandelbrot calculation above was the floating-point package using 128-bits of precision. Unfortunately, it didn’t offer an option to use a lower precision. Thus, each floating-point operation required churning through a lot of bytes regardless of if they were needed or not for the particular calculation.

Did I really need 128-bits of precision? Curious, I decided to convert the key floating-point routines needed for the Mandelbrot calculation to 32-bit precision, incorporating the improvements noted above as well. Here the output with the 32-bit floating-point routines:

Mandelbrot Set with 32-bit Floating Point

üüüüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôòòòïèàèíâùùùùùùùùùùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôòòòòíêÛØêíïôôôôùùùùùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷ôôôôùôùùòòòà·ùp cÄÛåòùòôôôô÷ùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôïïïííÝ      Áêíïòòòòòô÷÷÷ùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôôôôàà åêèÎØÛÛÖÄ   ÆØàÛå êïïïíÌï÷÷÷ùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôòòòïíèØ  B¼                à¨º∟Îèò÷÷÷÷ùùùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷òòòòòòïïïÝâ â                      Äèïòò÷÷÷÷ùùùùùùùùù
üüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷òòòòòòïïíêâ   ù ùù                    Áêíò÷÷÷÷÷ùùùùùùùù
üüüüüüüüüüüüüüüüü÷÷÷÷÷èíèåØèèíííêêàw                              ïô÷÷÷÷÷ùùùùùùù
üüüüüüüüüüüüüüüüüòííèèª É     Äââà                     Ý        Ýàòôô÷÷÷÷÷ùùùùùù
üüüüüüüüüüüüüüüüòòííØ            Ó                              âíòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüü                                               Ýïòòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüü                                               Ýïòòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüüòòííØ            Ó                              âíòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüüüòííèèª É     Äââà                     Ý        Ýàòôô÷÷÷÷÷ùùùùùù
üüüüüüüüüüüüüüüüü÷÷÷÷÷èíèåØèèíííêêàw                              ïô÷÷÷÷÷ùùùùùùù
üüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷òòòòòòïïíêâ   ù ùù                    Áêíò÷÷÷÷÷ùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷òòòòòòïïïÝâ â                      Äèïòò÷÷÷÷ùùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôòòòïíèØ  B¼                à¨º∟Îèò÷÷÷÷ùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôôôôàà åêèÎØÛÛÖÄ   ÆØàÛå êïïïíÌï÷÷÷ùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôïïïííÝ      Áêíïòòòòòô÷÷÷ùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷ôôôôùôùùòòòà·ùp cÄÛåòùòôôôô÷ùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôòòòòíêÛØêíïôôôôùùùùùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôòòòïèàèíâùùùùùùùùùùùùùùùùùùùùùùùùùù

This took under 5 minutes to produce, or about 5.5 times faster than the 128-bit floating-point package version. It should run under 30 seconds on my hardware build running at 8 MHz. That’s more attractive, but no way am I going to be able to create a functional Mandelbrot viewer for my system.

But what about the loss of precision? Comparing the two plots you can see slight discrepancies in the 32-bit plot. A few of these are corrected by increasing the number of iterations per point (I used 100 iterations per point in the plots above), but not for others (like the anomaly in the center right of the 32-bit Mandelbrot set). Thus, it seems as if 32-bit precision won’t provide as precise a result even when looking at the course granularity shown above.

I’ll have to try it on something more useful. Did anyone say Star Trek?

Follow-up: What I thought was lack of precision was really a bug

The folks over at 6502.org pointed out that the anomalies in my 32-bit plot where likely due to a bug in my code rather than insufficient precision in 32-bit floating point. Interestingly, I was using the Mandelbrot plot to test my 32-bit floating point more fully, but I was too dense to realize that the anomalies pointed to a bug in my code rather than insufficient precision. Their comments provide additional tips on generating these types of plots, which should help speed up things even more.

Excel Formula to Convert Decimal to 32-bit IEEE-754 Floating Point

In trying to track down the anomalies in my 32-bit Mandelbrot plot I developed a spreadsheet to perform the same calculations. I then modified my code to provide a table of key calculations at the point of one of the anomalies. Unfortunately, these were in 32-bit IEEE-754 floating point format as I don’t have a 32-bit floating point to string routine yet. I started manually converting the values in my Excel spreadsheet to 32-bit IEEE-754 floating point format for comparison, but that got old pretty quickly. A Google search pointed me to a similar question. That solution was for 64-bit floating point, but with a bit of work I converted it to 32-bit.

Excel formula to covert decimal number in cell A1 to 32-bit IEEE-754 floating point

=IF(A1=0,0,CONCATENATE(DEC2HEX(BITRSHIFT(127+INT(LOG(ABS(A1),2))+IF(A1<0,2^8,0),1),2),DEC2HEX((ABS(A1)/(2^INT(LOG(ABS(A1),2)))-1)*(2^23)/2^8+BITAND(127+INT(LOG(ABS(A1),2))+IF(A1<0,2^8,0),1)*2^15,4),DEC2HEX(MOD((ABS(A1)/(2^INT(LOG(ABS(A1),2)))-1)*(2^23),2^8),2)))

This hasn’t been extensively tested so use at your own risk. One caveat is that this rounds toward zero rather than the nearest value. This caused a bit of a problem comparing with the output from my Mandelbrot code which spit out value rounded the other way. Allowing for this difference was trivial though.

Tracking Down the Problem(s)

Using the Excel tool, it was easy to identify what values were causing problems. I reproduced the problem by entering some of the values manually and performing the same calculations as in the Mandelbrot code. When entering a particular small negative value, I noticed that the value was stored as a positive value. That was strange. I had run many tests with no problems entering negative floating-point values. Turns out when adding a value to the floating-point stack I didn’t fully initialize the memory meaning at times some values picked up characteristics of values previously on the stack. Properly initializing the stack before adding a value solved this problem.

Running my Mandelbrot code again, I got the same error as before. Obviously, I had more than a single error. I noticed I could reproduce the problem by adding a very small number to a negative number. I then noticed that the result of the calculation was the internal representation of infinity. Stepping through the code for the problematic calculation I noticed that I was comparing an exponent in the Y register to an immediate value with the CMP instruction instead of the CPY instruction, a classic error. Correcting this and my anomalies went away. Here’s the updated 32-bit Mandelbrot plot.

Updated 32-bit Mandelbrot Plot

üüüüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôòòòïèàèíâùùùùùùùùùùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôòòòòíêÛØêíïôôôôùùùùùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôòòòà·Îp cÄÛåòòòôôôô÷ùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôïïïííÝ      Áêíïòòòòòô÷÷÷ùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôôôôàà åêèÎØÛÛÖÄ   ÆØàÛå êïïïíÌï÷÷÷ùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôòòòïíèØ  B¼                à¨º∟Îèò÷÷÷÷ùùùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷òòòòòòïïïÝâ ¨                      Äèïòò÷÷÷÷ùùùùùùùùù
üüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷òòòòòòïïíêâ                           Áêíò÷÷÷÷÷ùùùùùùùù
üüüüüüüüüüüüüüüüü÷÷÷÷÷èíèåØèèíííêêàw                              ïô÷÷÷÷÷ùùùùùùù
üüüüüüüüüüüüüüüüüòííèèª É     Äââà                              Ýàòôô÷÷÷÷÷ùùùùùù
üüüüüüüüüüüüüüüüòòííØ            Ó                              âíòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüü                 =                             Ýïòòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüü                 =                             Ýïòòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüüòòííØ            Ó                              âíòôô÷÷÷÷÷÷ùùùùù
üüüüüüüüüüüüüüüüüòííèèª É     Äââà                              Ýàòôô÷÷÷÷÷ùùùùùù
üüüüüüüüüüüüüüüüü÷÷÷÷÷èíèåØèèíííêêàw                              ïô÷÷÷÷÷ùùùùùùù
üüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷òòòòòòïïíêâ                           Áêíò÷÷÷÷÷ùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷òòòòòòïïïÝâ ¨                      Äèïòò÷÷÷÷ùùùùùùùùù
üüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôòòòïíèØ  B¼                à¨º∟Îèò÷÷÷÷ùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷ôôôôôôàà åêèÎØÛÛÖÄ   ÆØàÛå êïïïíÌï÷÷÷ùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôïïïííÝ      Áêíïòòòòòô÷÷÷ùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôôôòòòà·Îp cÄÛåòòòôôôô÷ùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôôôòòòòíêÛØêíïôôôôùùùùùùùùùùùùùùùùùùùùù
üüüüüüüüüüüüüüüüüüüüüüüüüüü÷÷÷÷÷÷÷÷÷÷÷÷÷÷ôôôôòòòïèàèíâùùùùùùùùùùùùùùùùùùùùùùùùùù

That looks a lot better and very similar to the 128-bit plot. Comparing closely you can see a few differences at the Mandelbrot set boundary, meaning the 128-bit precision is able to resolve a few more points at this resolution than 32-bit floating-point. However, the two plots are remarkably similar given the reduced precision.

I also found that you can get a fairly good plot using just 20 iterations (the plots above use 100 iterations to check if a point is in the set or not. Most points are resolved in a lot fewer interations. Here is a 32-bit plot using at most 20 interations per point.

Updated 32-bit Mandelbrot Plot with a maximum of 20 iterations per point

òòòòòòòòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØØØØØÌÌÌÌ¿¿¿²f¥råååååååååååååååååååååååååå
òòòòòòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØØØØÌÌÌÌÌÌ¿¿¿¿¥L?¥²ÌÌÌÌååååååååååååååååååååå
òòòòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØØØÌÌÌÌÌÌÌÌ¿¿¿f +    L⌂¿¿¿ÌÌÌÌØååååååååååååååååå
òòòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØØÌÌÌÌÌÌÌÌ²²²¥¥Y       ¥²¿¿¿¿¿ÌØØØåååååååååååååå
òòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØÌÌÌÌÌÌff ⌂+?LL3     ?fL⌂ ²²²¥ ²ØØØåååååååååååå
òòòòòòòòòòòòòòòòòòòØØØØØØØØØØÌÌÌ¿¿¿²¥?                    f   +¿ØØØØåååååååååå
òòòòòòòòòòòòòòòòòòòØØØØØØØØ¿¿¿¿¿¿²²²Yr                         ²¿¿ØØØØååååååååå
òòòòòòòòòòòòòòòòòòØØØØØØØ¿¿¿¿¿¿²²¥r                            ¥¿ØØØØØåååååååå
òòòòòòòòòòòòòòòòòØØØØØ¥⌂?¥¥¥f                               ²ÌØØØØØååååååå
òòòòòòòòòòòòòòòòò¿¥¥         rrf                              Yf¿ÌÌØØØØØåååååå
òòòòòòòòòòòòòòòò¿¿¥¥?            &                              r¥¿ÌÌØØØØØØååååå
òòòòòòòòòòòòòòòò                                               Y²¿¿ÌÌØØØØØØååååå
òòòòòòòòòòòòòòòò                                               Y²¿¿ÌÌØØØØØØååååå
òòòòòòòòòòòòòòòò¿¿¥¥?            &                              r¥¿ÌÌØØØØØØååååå
òòòòòòòòòòòòòòòòò¿¥¥         rrf                              Yf¿ÌÌØØØØØåååååå
òòòòòòòòòòòòòòòòòØØØØØ¥⌂?¥¥¥f                               ²ÌØØØØØååååååå
òòòòòòòòòòòòòòòòòòØØØØØØØ¿¿¿¿¿¿²²¥r                            ¥¿ØØØØØåååååååå
òòòòòòòòòòòòòòòòòòòØØØØØØØØ¿¿¿¿¿¿²²²Yr                         ²¿¿ØØØØååååååååå
òòòòòòòòòòòòòòòòòòòØØØØØØØØØØÌÌÌ¿¿¿²¥?                    f   +¿ØØØØåååååååååå
òòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØÌÌÌÌÌÌff ⌂+?LL3     ?fL⌂ ²²²¥ ²ØØØåååååååååååå
òòòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØØÌÌÌÌÌÌÌÌ²²²¥¥Y       ¥²¿¿¿¿¿ÌØØØåååååååååååååå
òòòòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØØØÌÌÌÌÌÌÌÌ¿¿¿f +    L⌂¿¿¿ÌÌÌÌØååååååååååååååååå
òòòòòòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØØØØÌÌÌÌÌÌ¿¿¿¿¥L?¥²ÌÌÌÌååååååååååååååååååååå
òòòòòòòòòòòòòòòòòòòòòòòòòòòØØØØØØØØØØØØØØÌÌÌÌ¿¿¿²f¥råååååååååååååååååååååååååå

This plot was completed about three times faster than the 100-iteration plot above. It’s definitely not as refined as the 100-iteration plot. Looking closely, you can see that the 20-iteration plot identifies some points as being in the Mandelbrot set that weren’t identified as such in the 100-iteration plot. For those points it just takes more iterations to verify that a point is actually in the set or not. But still, 20 iterations give a good bit of the boundary detail given the reduction in time to produce the plot.

Where floating-point precision really matters

In tracking down my errors, I noticed that my code deviated from the Excel version for certain calculations and that the deviation grew as iterations increased. This was due to rounding error and the difference in precision between how Excel represents values compared to 32-bit floating point, which can only represent between 6 and 9 digits. For example, here are the IEEE-754 32-bit representations for a given floating-point value.

IEEE-754 32-bit Value Precision

                       IEEE-754
2.16921189769717e-4    39637557
2.1692118976971e-4     39637557
2.169211897697e-4      39637557
2.16921189769e-4       39637557
2.1692118976e-4        39637557
2.1692118e-4           39637556
2.169211e-4            39637550

You can see that the 32-bit IEEE-754 values don’t change after a given number of significant digits.

Some other interesting examples:

Some values can’t be represented in 32-bit IEEE-754 floating point

           IEEE-754                   IEEE-754                   IEEE-754
8,388,608  4B000000       16,777,216  4B800000       33,553,432  4BFFFE0C
8,388,609  4B000001       16,777,217  4B800001       33,553,433  4BFFFE0D
8,388,610  4B000002       16,777,218  4B800001       33,553,434  4BFFFE0D
8,388,611  4B000003       16,777,219  4B800001       33,553,435  4BFFFE0E
8,388,612  4B000004       16,777,220  4B800002       33,553,436  4BFFFE0E

You can see that not all values can be represented in 32-bit IEEE-754 floating point. When a number can’t be represented exactly the extra ones are omitted, effectively rounding the value to the next representable value.

Internally, my 32-bit code uses an extra byte of “guard bits” to help overcome some rounding error. However, I still saw some significant rounding error in some of my Mandelbrot calculations compared to those in Excel which has fifteen digits of precision (similar to 64-bit IEEE-754 floating-point). But even when my code diverged due to rounding errors quite significantly from the Excel calculations, the Mandelbrot algorithm was very forgiving, almost always converging on the same value if sufficient iterations were allowed.