65816: Build 4 – Revisiting PLD Coding

In a long postscript to 65816: Build 4 – Adding Expanded RAM, I discussed some alternatives I tried in order to achieve the following memory map in my ATF22V10C PLD.

RAM       $0000-$FEFF
ROM       $FF00-$1FFFF
IO        $20000-$200FF
EXRAM     $20100-$7FFFF

There’s nothing magical about this memory map; it’s just something I wanted to try, for reasons I’ll discuss in an operating system post if I ever get to it.

I couldn’t get this map to fit in the PLD with the following logic.

FIELD Address = [A18..A8];

RAM       = Address:[0000..FEFF];
ROM       = Address:[FF00..1FFFF];
IO        = Address:[20000..200FF];
EXRAM     = Address:[20100..7FFFF];

!RAM_CS   = RAM # EXRAM;
!ROM_CS   = ROM & RW;
!IO_CS    = IO;

RAM_CS requires 70 product terms but the maximum the PLD supports per pin is 16.

This simpler logic wouldn’t work either.

FIELD Address = [A18..A8];
RAM       = Address:[0000..FFFF];
ROM       = Address:[FF00..1FFFF];
IO        = Address:[20000..200FF];
EXRAM     = Address:[20000..7FFFF];

!RAM_CS   = (RAM & !ROM) # (EXRAM & !IO);
!ROM_CS   = ROM & RW;
!IO_CS    = IO;

Notice the difference. The address regions are overlapping. Here RAM_CS requires 18 product terms. Closer, but not quite there.
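A quick way to confirm that the overlapping ranges select the same pages as the exact ones is to brute-force every value of the Address field. This is only a sketch in Python, not WINCUPL output. It assumes the field A18..A8 can be treated as an 11-bit page number (the address shifted right by 8), and the function names are made up for illustration.

```python
def exact_ram_select(page):
    """RAM # EXRAM using the non-overlapping ranges."""
    ram = 0x000 <= page <= 0x0FE      # $0000-$FEFF
    exram = 0x201 <= page <= 0x7FF    # $20100-$7FFFF
    return ram or exram

def overlapping_ram_select(page):
    """(RAM & !ROM) # (EXRAM & !IO) using the overlapping ranges."""
    ram = 0x000 <= page <= 0x0FF      # $0000-$FFFF
    rom = 0x0FF <= page <= 0x1FF      # $FF00-$1FFFF
    io = page == 0x200                # $20000-$200FF
    exram = 0x200 <= page <= 0x7FF    # $20000-$7FFFF
    return (ram and not rom) or (exram and not io)

# Compare the two formulations for all 2048 pages of the 11-bit field.
same_everywhere = all(
    exact_ram_select(page) == overlapping_ram_select(page)
    for page in range(0x800)
)
print(same_everywhere)
```

So the overlap changes how many product terms the compiler needs, not which addresses end up selecting RAM.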

The following logic did work though.

FIELD Address = [A18..A8];
ROM       = Address:[FF00..1FFFF];
IO        = Address:[20000..200FF];

!ROM_CS   = ROM & RW;
!IO_CS    = IO;
!RAM_CS   = ROM_CS & IO_CS;

Here RAM_CS requires a single product term. I later learned that a product term is a group of inputs or intermediate variables logically ANDed together. A pin needs multiple product terms when several such groups have to be ORed together to match the required logic. I’m sure there is more to it, but those seem to be the basics for address decoding.
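To make the idea concrete, here is a rough Python sketch (not anything WINCUPL produces) that splits an address range into aligned power-of-two blocks of pages. Each such block can be matched by a single AND of address bits, so the block count is a naive upper estimate of the product terms a range needs; a real minimizer may merge or overlap terms and do better. The helper names are hypothetical, and the field A18..A8 is assumed to act as an 11-bit page number.

```python
FIELD_BITS = 11  # A18..A8

def aligned_blocks(first_page, last_page):
    """Split an inclusive page range into aligned power-of-two blocks."""
    blocks = []
    page = first_page
    while page <= last_page:
        # Largest block that is still aligned at `page` (its lowest set bit).
        size = page & -page if page else 1 << FIELD_BITS
        # Shrink it until it no longer runs past the end of the range.
        while page + size - 1 > last_page:
            size //= 2
        blocks.append((page, size))
        page += size
    return blocks

def as_term(page, size):
    """Render a block as a bit pattern; 'x' marks address bits not tested."""
    free = size.bit_length() - 1
    bits = format(page, "0{}b".format(FIELD_BITS))
    return bits[:FIELD_BITS - free] + "x" * free

# ROM = Address:[FF00..1FFFF] covers pages 0x0FF through 0x1FF.
rom_terms = [as_term(p, s) for p, s in aligned_blocks(0x0FF, 0x1FF)]
print(rom_terms)  # ['00011111111', '001xxxxxxxx']
```

For the ROM range this gives two patterns: page $FF of bank 0 on its own, plus all of bank 1.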

I questioned, however, whether this was strictly equivalent to what I was trying to accomplish. Sure, it worked on my current, barebones operating system, but would something come up later to bite me?

I asked the folks over at 6502.org if the two were equivalent. Adrien Kohlbecker kindly went to the effort of proving that they in fact weren’t. The latter version also writes to RAM whenever ROM is written to. This didn’t bother me too much, and I was prepared to leave it there: my system doesn’t (or at least shouldn’t) write to ROM, and even if it did, it couldn’t corrupt the RAM my system was using, because the RAM and ROM involved share the same addresses.
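The difference can be reproduced with a small brute-force check. This Python sketch is not Adrien’s proof, just an illustration under the assumption that A18..A8 acts as an 11-bit page number and that ROM_CS and IO_CS feed back as pin levels (low when selected). The function names are invented for the example.

```python
def desired_ram_select(page, rw):
    """RAM selected per the intended map: RAM # EXRAM, regardless of RW."""
    return page <= 0x0FE or page >= 0x201

def single_term_ram_select(page, rw):
    """RAM selected per !RAM_CS = ROM_CS & IO_CS."""
    rom = 0x0FF <= page <= 0x1FF
    io = page == 0x200
    rom_cs = not (rom and bool(rw))   # pin level: low only on ROM reads
    io_cs = not io                    # pin level: low on any IO access
    return rom_cs and io_cs

# Collect every (page, rw) combination where the two versions disagree.
mismatches = [
    (page, rw)
    for page in range(0x800)
    for rw in (0, 1)
    if desired_ram_select(page, rw) != single_term_ram_select(page, rw)
]
print(len(mismatches), hex(mismatches[0][0]), hex(mismatches[-1][0]))
```

Every mismatch is a write (RW low) to a page in the ROM range, $FF00-$1FFFF, which is exactly the case described above.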

Others provided suggestions as well, which kept me working at the problem. GFoot suggested the following.

FIELD Address = [A18..A8];
RAM       = Address:[0000..FFFF] & ROM_CS;
ROM       = Address:[FF00..1FFFF];
IO        = Address:[20000..200FF];
EXRAM     = Address:[20100..7FFFF] & IO_CS;

!RAM_CS   = RAM # EXRAM;
!ROM_CS   = ROM & RW;
!IO_CS    = IO;

This worked as well. It turns out that the logic expressions for pins can also be used as variables in other logic expressions, and that these are especially effective because they add to the product terms available for producing a pin’s logic.

While researching one of GFoot’s suggestions on PinNodes, however, I came across a tutorial that discussed product term minimization. Wow, that sounded exactly like something I needed. With a bit of poking, I saw that I was using the Quick minimization option. The Quine-McCluskey option was more advanced and carried the recommendation “Use only if necessary”. Well, this seemed like as good a case for it as any. I gave it a try.

Compiling my desired memory map with the Quine-McCluskey option completed without error, using 13 product terms for RAM_CS and only 2 for ROM_CS. Other intermediate terms used just as many as before, but obviously the Quine-McCluskey minimization did its job and kept these from passing through to the output pins.

A lot of drama I guess to figure out some options for using WINCUPL. But I admit that I learned a lot in the process and that’s always a good thing.

One other thing I learned: don’t cut corners. GFoot asked if his code worked in WINCUPL’s simulation. I had to admit that I hadn’t prepared a simulation input file for the 65816 expanded memory map. It seemed redundant given that everything appeared to be working correctly. Now I realize I could probably have answered my original question if I had updated my simulation file. Just another task to do. I’m not really looking forward to it. The file doesn’t have a great format. It was a bit of a pain to create for a basic 6502 memory map, but I’d added a bunch of logic for the build 4 6502, and the conversion to the 65816 with an expanded memory map makes it worse. It’s on the to-do list. Will I ever get to it? Probably not.

PS

It’s worth noting that 6502.org member BigEd had a suggestion that just didn’t occur to me. It’s funny how I get stuck in a certain way of looking at things. I’d been trying to get a single page of ROM at the top of bank 0, at $FF00-$FFFF. I’d been able to get a 4k block to work at $F000-$FFFF with a straightforward address range coding, but the single page solution was a bit convoluted and not perfect (as discussed above). BigEd suggested a 1k block, $FC00-$FFFF, to fit within the 16 product term limit for my PLD. It compiled just fine with the overlapping address range I discussed above.

FIELD Address = [A18..A8];
RAM       = Address:[0000..FFFF];
ROM       = Address:[FC00..1FFFF];
EXRAM     = Address:[20000..7FFFF];
IO        = Address:[20000..200FF];

!RAM_CS   = (RAM & !ROM) # (EXRAM & !IO);
!ROM_CS   = ROM & RW;
!IO_CS    = IO;

Actually, for the system I’m building, a 1k block of ROM in bank 0 isn’t too bad, much better than 4k. It just didn’t occur to me to check if it would work. Of course, the 4k block would likely work just as well, but it would be far less satisfying from a development perspective.

PPS

BDD pointed out that my use of negative logic wasn’t standard and suggested that it might be a factor in the logic not fitting in the PLD, as WINCUPL would use extra product terms to do the conversion. Perhaps that is true in some version of WINCUPL, but in the test I ran the product terms used were the same whichever way the logic was expressed.

Here is the code with positive logic. Note that the pins are negated to show that they are active low.

Device   g22V10 ;

/* Input */
Pin 1        = CLK;
Pin 2        = RW;
Pin [3..10]  = [A15..A8];
Pin 11       = A16;
Pin 22       = A17;
Pin 13       = VIA_IRQ1;
Pin 14       = A18;
Pin 15       = ACIAs_IRQ;

/* Output */
Pin 16 = !CLKB;
Pin 17 = !WE;
Pin 18 = !ROM_CS;
Pin 19 = !RAM_CS;
Pin 20 = !OE;
Pin 21 = !IO_CS;
Pin 23 = !IRQ;

/* Local Variables */
FIELD Address = [A18..A8];

/* Logic */
/* This requires Quine-McCluskey minimization to compile */
RAM       = Address:[0000..FEFF];
ROM       = Address:[FF00..1FFFF];
IO        = Address:[20000..200FF];
EXRAM     = Address:[20100..7FFFF];

CLKB      = CLK;
WE        = CLK & RW;
OE        = CLK & !RW;
RAM_CS    = !(RAM # EXRAM);
ROM_CS    = !(ROM & RW);
IO_CS     = !IO;
IRQ       = !(VIA_IRQ1 & ACIAs_IRQ);

For me, this logic just seems off. Look at CLKB for example. The logic says it’s equal to CLK. What? That’s not right, it’s equal to the inverted clock. That’s what happens on the Pin entry.

I think the negative logic makes more sense.

Device   g22V10 ;

/* Input */
Pin 1        = CLK;
Pin 2        = RW;
Pin [3..10]  = [A15..A8];
Pin 11       = A16;
Pin 22       = A17;
Pin 13       = VIA_IRQ1;
Pin 14       = A18;
Pin 15       = ACIAs_IRQ;

/* Output */
Pin 16 = CLKB;
Pin 17 = WE;
Pin 18 = ROM_CS;
Pin 19 = RAM_CS;
Pin 20 = OE;
Pin 21 = IO_CS;
Pin 23 = IRQ;

/* Local Variables */
FIELD Address = [A18..A8];

/* Logic */
/* This requires Quine-McCluskey minimization to compile */
RAM       = Address:[0000..FEFF];
ROM       = Address:[FF00..1FFFF];
IO        = Address:[20000..200FF];
EXRAM     = Address:[20100..7FFFF];

CLKB      = !CLK;
!WE       = CLK & !RW;
!OE       = CLK & RW;
!RAM_CS   = RAM # EXRAM;
!ROM_CS   = ROM & RW;
!IO_CS    = IO;
IRQ       = VIA_IRQ1 & ACIAs_IRQ;

Here, CLKB is stated as specifically equal to the inverted clock signal and its Pin is simply equal to CLKB, not its inversion. I don’t need to look at two lines to figure out what’s going on. (As an aside, I wondered if this could all have been collapsed into

Pin 16 = !CLK;

The answer is no: you’ll get an error saying that the variable CLK is already defined. Obviously WINCUPL wants unique variables defined on the pins, with separate logic to indicate how the inputs and outputs are related.)

So, for now I’ll continue to express my logic in a “negative” way, pun intended. Even if there is some inefficiency in the PLD as a result, my view is, if it fits and works, who cares.

That last caveat, “it works”, is key of course, and GFoot pointed out that different coding could cause a difference in propagation delays. I didn’t test for this specifically, but judging from the WINCUPL DOC files, which were basically the same with respect to product terms used, I think WINCUPL is optimizing away the difference in logic. I suppose a final verification would be to check if the JED files were the same. I’ll leave that for another challenge.

PPPS

I decided to do the JED file comparison I mentioned above and, in the process, learned a lot more. Comparing the two JED files for the positive and negative logic coding shown above, I found that for the seven outputs changed from active high to active low, a total of 10 fuses were changed, either from not blown to blown or vice versa. While interesting, this wasn’t particularly helpful.
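For anyone wanting to repeat the comparison, a fuse-level diff can be scripted. The Python below is a minimal sketch, assuming both files are standard JEDEC text in which each `L` field gives a starting fuse number followed by 0/1 fuse states and every field ends with `*`. It ignores the default-state (`F`) field, so it also assumes both files list every fuse explicitly. The two sample strings are invented for illustration and are not the JED files from this build.

```python
import re

def parse_fuses(jed_text):
    """Return {fuse_number: '0' or '1'} from the L fields of JEDEC text."""
    fuses = {}
    # Every JEDEC field is terminated by '*'.
    for field in jed_text.split("*"):
        match = re.fullmatch(r"L(\d+)\s+([01\s]+)", field.strip())
        if not match:
            continue  # header, QF, F, C and other fields are skipped
        start = int(match.group(1))
        bits = "".join(match.group(2).split())
        for offset, bit in enumerate(bits):
            fuses[start + offset] = bit
    return fuses

def changed_fuses(jed_a, jed_b):
    """List the fuse numbers whose state differs between two JEDEC texts."""
    a, b = parse_fuses(jed_a), parse_fuses(jed_b)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))

# Made-up 8-fuse examples, only to exercise the functions.
sample_neg = "\x02demo*QF8*F0*L0000 1010*L0004 1111*C0000*\x03"
sample_pos = "\x02demo*QF8*F0*L0000 1011*L0004 0111*C0000*\x03"
print(changed_fuses(sample_neg, sample_pos))  # [3, 4]
```

With real files, the two strings would be replaced by the contents of the negative-logic and positive-logic JED files read from disk.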

Digging in the user’s manual and poking through WINCUPL’s options, I saw a mention of a fuse plot. Activating the option for the above runs, I discovered that the DOC file contains the details of the fuses used for each pin. Comparing the two fuse plots clarified what I was seeing in the JED files.

Switching from negative to positive logic for my seven output pins resulted in the following:

  • the mode fuse on four pins changed, and
  • one pair of fuses on three pins swapped from blown to not blown or vice versa.

Here’s a summary I prepared.

Pin     Neg     Pos         Mode Fuse   Swapped Fuses
16      CLKB    !CLKB           x           1
17      WE      !WE                         1
18      ROM_CS  !ROM_CS         -
19      RAM_CS  !RAM_CS         -
20      OE      !OE                         1
21      IO_CS   !IO_CS          -
23      IRQ     !IRQ

LEGEND    x : mode fuse not blown in pos, blown in neg
          - : mode fuse blown in pos, not blown in neg
          1 : one pair of fuses for pin swapped state

For the ATF22V10C at least, it seems each output pin has a mode fuse to indicate whether it’s active high or low. Whether or not it is used with positive or negative logic in the case above doesn’t reflect the use of more or fewer PLD resources, as it seems these fuses can’t be used for anything else.

The swapped fuses also don’t reflect the use of more or less resources since in the case above a pair of fuses are swapping state.

What I found most interesting and worthy of an update here:

  • not all of the mode fuses changed,
  • the mode fuse changes weren’t always in the same direction,
  • the pins with swapped fuses all had simple logic equivalencies, and,
  • CLKB was the only pin with a mode and swapped fuse change.

It seems as if WINCUPL is doing some internal optimizations (note I haven’t selected any of WINCUPL’s optimization options for this analysis).

Again, I’ve just looked at this for the ATF22V10C and basic address decoding. It’s possible that the logic used here is easier for WINCUPL to optimize and these results won’t hold for more complicated situations. In that case positive logic might lead to more efficient PLD usage.
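One way to cross-check the fuse table is to model both listings and compare the pin levels they produce. The Python sketch below does that under two assumptions that should be verified against the WINCUPL manual or simulator: that a `!` on a Pin declaration makes the physical pin the complement of the variable’s equation, and that the field A18..A8 behaves as an 11-bit page number. The function names are made up for illustration.

```python
from itertools import product

def regions(page):
    """Decode the 11-bit page number (A18..A8) into RAM, ROM, IO, EXRAM."""
    return (page <= 0x0FE, 0x0FF <= page <= 0x1FF,
            page == 0x200, page >= 0x201)

def positive_pins(clk, rw, page, via, acia):
    ram, rom, io, exram = regions(page)
    equations = {
        16: clk, 17: clk and rw, 18: not (rom and rw),
        19: not (ram or exram), 20: clk and not rw, 21: not io,
        23: not (via and acia),
    }
    # ASSUMPTION: every output is declared "Pin n = !NAME", so the pin
    # level is the complement of the equation's value.
    return {pin: not value for pin, value in equations.items()}

def negative_pins(clk, rw, page, via, acia):
    ram, rom, io, exram = regions(page)
    # Pins are declared without "!", and "!NAME = expr" means NAME = !expr.
    return {
        16: not clk, 17: not (clk and not rw), 18: not (rom and rw),
        19: not (ram or exram), 20: not (clk and rw), 21: not io,
        23: via and acia,
    }

def differing_pins():
    """Pins whose level differs for at least one input combination."""
    differing = set()
    flags = (False, True)
    for clk, rw, via, acia in product(flags, repeat=4):
        for page in range(0x800):
            pos = positive_pins(clk, rw, page, via, acia)
            neg = negative_pins(clk, rw, page, via, acia)
            differing.update(pin for pin in pos if pos[pin] != neg[pin])
    return sorted(differing)

print(differing_pins())  # [17, 18, 19, 20, 21]
```

Under those assumptions the two listings as printed give identical levels only on pins 16 and 23. Pins 18, 19 and 21 come out inverted, and pins 17 and 20 differ in the sense of RW, which is the same grouping as the mode-fuse and swapped-fuse columns in the table. If that reading is right, some of the fuse differences may reflect a difference in the logic itself rather than an optimization, so it would be worth checking in the simulator before relying on the positive-logic listing.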