Sunday 30 June 2013

My brain is hurting

I'm a bit prone to grandiosity.

The directory I'm working in at the moment is called "OS" - operating system. This on the right is an operating system. Can a watch from the early 80s with 2k of ROM and the equivalent of 64 bytes of RAM be said to have an "Operating System" ?

Well, anyway, I've been writing OS/BIOS type routines for some time today. Having got the Mk2 indirect write going, I've been tackling the bad 'un that has had me thinking for a while.

The Watchman has three digits (plus an extra '1', plus a colon) in its display area (so no 24 hour clock ....). These are wired up to six of the segment pins of the SM5 (a1, b1, a2, b2, a3, b3), which are multiplexed with H1-H4 to give 24 possible segment combinations, of which 23 are used (3 x 7 segments, 1 colon, 1 '1' digit).

The code I've been working on - which now works - does the following: it reads three nibbles from RAM (which are the three digits) and 2 bits from a fourth (one for the colon, one for the '1' digit). For the three digits in turn it looks up the seven segment pattern for that digit (there are 2 tables, it's a 4 bit processor) and stores it in the correct place on the display.

This is slightly harder than it looks, firstly because of the indirection thing again, and also because the digits are not in the correct order in memory, i.e. the digit order is designed to suit the LCD display, not to be easy to code for.
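As a rough illustration of what that routine does, here is a Python sketch. It is only a model: the segment patterns are the conventional a-g encoding, and the names (PATTERNS, SEG_LOW, SEG_HIGH, DISPLAY_SLOT, render_digits), the slot order and the flag bit positions are all assumptions for illustration, not the real Watchman layout. The two tables are my reading of "2 tables": a 7 bit pattern won't fit in one nibble.

```python
# Hypothetical model only: names, slot order and bit layout are illustrative.

# Conventional seven segment patterns for 0-9 (bit 0 = segment a ... bit 6 = g).
# Values 10-15 are treated as blank, which is an assumption.
PATTERNS = [0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F] + [0] * 6

# A 7 bit pattern needs two nibbles on a 4 bit CPU, hence two lookup tables.
SEG_LOW = [p & 0x0F for p in PATTERNS]
SEG_HIGH = [p >> 4 for p in PATTERNS]

# Assumed mapping from logical digit position to display slot. The real
# hardware order is different; this just shows the extra indirection.
DISPLAY_SLOT = [1, 0, 2]


def render_digits(ram):
    """Read three digit nibbles and one flags nibble from ram[0..3]."""
    display = {}
    for position in range(3):
        digit = ram[position] & 0x0F
        slot = DISPLAY_SLOT[position]
        display[slot] = (SEG_LOW[digit], SEG_HIGH[digit])
    flags = ram[3]
    display["colon"] = bool(flags & 1)  # assumed: bit 0 is the colon
    display["one"] = bool(flags & 2)    # assumed: bit 1 is the extra '1'
    return display
```

Calling render_digits([2, 3, 0, 3]) gives a dictionary with one (low nibble, high nibble) pair per display slot plus the two flags.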

The (hopefully) first and last version is here..... , it has all the characteristics of 4 bit assembler code - the bits of code are in the wrong order (there are four 0,1,2,3 and they are in the order 1,0,2,3), it's incomprehensible even with comments and it's very long winded to do something relatively simple (and there's an entirely different routine which clears the display stuck in the middle of it .....) and half of it is near identical to the other half (the two bits that do the digit->7 segment pattern->write look up)

But it does work :)

So, the next and last messy bit of code is the same sort of mapping, but this one is mapping a coordinate system onto the main bit of the display (the dots and circles). Even this far out, this too is going to be fun to write.

The other main bit is tone generation. Even with a 16kHz clock processor, it's going to be spending most of its time either doing nothing or generating sound. I would like this too to be automated.

A slightly less barmy answer

I had an idea last night. (Gives you some idea how dull my nights are).

The carry transfer method is a bit slow, the indexed jump method is a bit unwieldy. So why not combine the two ?

So, in the current implementation, the value to be written is halved and the LSB is stored in the carry, which leaves only 8 cases for the indexed jump rather than 16. It's about twice the speed of the bitshifting one (14 cycles rather than 33) and about half as long again, but still fits comfortably in one page.

What it is doing is (in C) something like the code below. It does look very bizarre as a way of copying into memory but it works because we can preserve the carry flag when loading 'addr' into the RAM address register, but we can't preserve the accumulator.

It's a recurring nightmare writing this that there's a really obvious simple solution to this which I can't see...... and these posts will then make me look really dim.

//
// effectively the same code as memory[addr] = acc
//
carry = acc % 2;          // put LSB of acc in carry bit
acc = acc / 2;            // rotate A right, i.e. halve it.
switch (acc) {
   case 0:     memory[addr] = 0+carry; break;
   case 1:     memory[addr] = 2+carry; break;
   ..
   case 7:     memory[addr] = 14+carry; break;
}
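
To convince myself the halve-and-jump trick really is the same as a plain store, here is a small Python model of it. The function names and the list of handlers (standing in for the eight jump targets) are made up for illustration; the point is only that the carry survives while the accumulator is reused.

```python
def indirect_write(memory, addr, acc):
    """Model of the combined method: halve acc, keep the LSB in carry."""
    carry = acc & 1   # LSB goes into the carry flag
    half = acc >> 1   # what is left selects one of 8 jump targets

    # Stand-in for the indexed jump: one handler per value of `half`,
    # each knowing only its own even constant (0, 2, ... 14).
    handlers = [lambda c, base=2 * n: base + c for n in range(8)]
    memory[addr] = handlers[half](carry)


def check_all():
    """Return True if every 4 bit value is stored unchanged."""
    for acc in range(16):
        memory = [0] * 16
        indirect_write(memory, 5, acc)
        if memory[5] != acc:
            return False
    return True
```

check_all() walks all sixteen possible accumulator values, which is the whole input space for a nibble.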

 

            

How to do indirect writes - the barmy answer.

Yes, I've finally come up with the answer.

Regular reader(s) (he posted optimistically) may remember I have a problem that I cannot do an indirect write in the SM-510.

For example, I cannot have a value in one memory location and use that as an address to store another value in.

I'd come to the conclusion that this was actually impossible to do, short of huge indexed jump type solutions. The reason being there were only two storage locations inside the CPU and one was needed to load a value into the other from memory, hence it was impossible to load a value into each from memory.

But I was wrong ! There is another storage location - the carry flag.

So what this solution does is rotate the value to be written into the carry flag, one bit at a time, then rotate that bit into the place where it is being saved. As you can imagine this isn't the most efficient method of transferring four bits, but it does work.

It is basically this four times, once for each bit in question:

    rot
    exc 0
    exbla
    exc 0
    rot
    exc 0
    exbla
    exc 0


It might actually be better in practice to do it via an indexed jump solution, because it's quicker, albeit not as demented and smart-alecky.

Here's a challenge if you want to stay awake. Set up a scenario - say Memory location 4 contains 7, which is the memory location you want to set to 6. So A = 6, B = 4 and RAM(B) = 7.  Now figure out how it works  to put 6 in RAM(7) :)

(It's actually simpler than it looks. The first rot shifts the source data bit into the carry, the second shifts it from the carry into the target data. The exc 0/exbla/exc 0 sequences set up the target data and unpick it again.)
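
If you'd rather not do the challenge by hand, here is a Python model of it. It assumes the instruction semantics as I understand them: rot rotates A right through the carry, exc 0 swaps A with RAM(B), and exbla swaps A with the low half of B. The class and method names are invented for the sketch, and only one 16 nibble bank of RAM is modelled.

```python
class SM5Model:
    """Tiny model of A, BL, carry and one RAM bank (assumed semantics)."""

    def __init__(self, a, bl, ram, carry=0):
        self.a, self.bl, self.c = a, bl, carry
        self.ram = list(ram)

    def rot(self):
        # Assumed: rotate A right through carry (C -> bit 3, bit 0 -> C).
        new_c = self.a & 1
        self.a = (self.c << 3) | (self.a >> 1)
        self.c = new_c

    def exc0(self):
        # Assumed: exchange A with the RAM nibble that BL points at.
        self.a, self.ram[self.bl] = self.ram[self.bl], self.a

    def exbla(self):
        # Assumed: exchange A with BL.
        self.a, self.bl = self.bl, self.a


def indirect_write(cpu):
    """Run the eight instruction sequence four times, once per bit."""
    for _ in range(4):
        cpu.rot(); cpu.exc0(); cpu.exbla(); cpu.exc0()   # source bit -> carry
        cpu.rot(); cpu.exc0(); cpu.exbla(); cpu.exc0()   # carry -> target
    return cpu


ram = [0] * 16
ram[4] = 7                               # location 4 holds the target address
cpu = indirect_write(SM5Model(a=6, bl=4, ram=ram))
```

After this runs, cpu.ram[7] holds the 6, location 4 still holds 7, and BL is back at 4. A itself is not preserved, it ends up holding leftover bits.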

Actually my current version adds 1 to RAM(B) (an incb/skip before the last exbla in the fourth bit shift) so it can be called sequentially to copy memory locations one after the other.

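The TMS1x00 gets round this with basically one instruction, TMY, which loads the Y register (its memory pointer) straight from memory without touching the accumulator.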

Saturday 29 June 2013

Exclusive Or the hard way.

I thought following yesterday's post you might want to see what exclusive or on a TMS1000 is like. This is it :)

(It isn't the most incomprehensible thing I've ever written. That goes to my multi-way scrolling and sprite engine for the TI83 calculator - about half the code was self modifying just to make it run fast enough)

ExclusiveOr:
        tcy     XORResult           ; clear XOR Result
        tcmiy   0                           
        tcy     XORCounter          ; set counter to 4
        tcmiy   4
ExclusiveOrLoop:
        tcy     XORResult           ; shift the result one bit left.
        tma
        amaac
        tam                         ; store it back.

        tcy     XORNibble1          ; read Nibble 1
        tma 
        tcy     XORNibble2          ; point at Nibble 2
        tbit1   3                   ; if bit 3 of Nibble 2 is set
        br      XOBit3Set           ; add 8 to A, i.e. xor the MSBs,
        br      XOBit3Unchanged     ; because add = exor with no carry in
XOBit3Set:                          ; (the carry in is why we can't just add
        a8aac                       ; the whole nibbles together)
XOBit3Unchanged:                    ; bit 3 of A is XOR bit 3 of N1,N2.
        alec    7                   ; if A <= 7 it is clear, so skip ahead
        br      XONextBit
        tcy     XORResult           ; otherwise set LSB of result;
        sbit    0                   ; this bit will be shifted into place
XONextBit:
        tcy     XORNibble1          ; shift nibble 1 left
        tma
        amaac
        tam
        tcy     XORNibble2          ; shift nibble 2 left
        tma
        amaac
        tam

        tcy     XORCounter          ; Decrement XOR Counter
        dman
        tam                         ; write back
        mnez                        ; check counter not zero
        br      ExclusiveOrLoop     ; and keep going till all four bits.
        tcy     XORResult           ; Load the result into A
        tma                                 
        retn
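
For anyone who doesn't read TMS1000, here is the same idea as a Python sketch. It isn't a line by line translation, and the function name is mine. It just follows the loop: shift the result left, xor the two top bits by adding 8, and shift both nibbles left by adding them to themselves.

```python
def xor_nibbles(n1, n2):
    """XOR two 4 bit values using only add, test and set, one bit at a time."""
    result = 0
    for _ in range(4):
        result = (result + result) & 0xF   # shift result left (tma / amaac)

        a = n1
        if n2 & 8:                         # tbit1 3 on nibble 2
            a = (a + 8) & 0xF              # a8aac: adding 8 flips bit 3 of A
        if a > 7:                          # alec 7 fails, so bit 3 is set
            result |= 1                    # sbit 0

        n1 = (n1 + n1) & 0xF               # shift nibble 1 left
        n2 = (n2 + n2) & 0xF               # shift nibble 2 left
    return result
```

Adding 8 only works for the top bit because there is no bit below it to carry in, which is why the loop has to bring each bit up to position 3 in turn.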

Kevin Savetz's book.

As a slightly relevant aside, I did get a prize in the last RC, kindly donated by Kevin Savetz (I'm guessing this is him :) )

If you ever want an entertaining read and you are a 70s/80s computer geek, it's highly recommended.

Actually a lot of the book could be about me, if you cross out where it says "Atari" and replace it with "Sharp MZ-80K and BBC Micro". I never owned any Atari kit before an Atari ST.

Friday 28 June 2013

Some minor changes and thoughts......

Firstly some minor updates. I have changed a couple of things on the assembler and emulator - the assembler now returns an error code on an error (handy for scripts) and the emulator now sounds somewhat better (pitch wise), but also worse, as I've made it do what the real thing does and modulate another tone. It now sounds generically "cheap buzzer" awful, a bit like playing music underwater.

Now, if you've read this far you are probably wondering who this bloke is (it isn't me, too much hair). This is a fellow called Alan Turing who did a lot of the early development work in computing, including working at Bletchley Park in the war.

He also had this concept of "Turing completeness" which was used to differentiate between "not really computers" like the code breaking machine Colossus or the work of the German Konrad Zuse. Both of these were nearly-but-not-quite computers; they missed things, which meant you couldn't write some programs with them - I think Zuse's early machines didn't have conditional branching.

I nearly abandoned this RetroChallenge because I think the SM-5, while not exactly not Turing complete, is broken.

When I was looking at the COP411 (a NatSemi MCU) I wondered why they had this odd, single, exchange memory location $1F with the accumulator instruction (XAD 3,15). It just looked weird. Why not allow the swap of any memory location with the accumulator directly ?

Now I know. The SM-5's big problem is that it can't do an indirect write very well.

You have two registers to work with, more or less, known as A and B, A is the accumulator and B is the memory pointer.  Most modern CPUs have an instruction like

sta $4032 

which for a 6502 stores the accumulator at location 0x4032. Most old fashioned 4 bitters don't have this. You load the "Memory Pointer" (B in Sharp/Nat Semi, XY in Texas) with the address, then write the accumulator, e.g.

lbi $3F
x 0

loads B with $3F and then saves A at that memory location (actually it swaps A and the memory location). 

The big problem with the SM-5 is that it doesn't have a mechanism, that I can work out, that allows you to do an indirect save (in any kind of sensible fashion).

For example, you might want to save the value in A in the memory location whose address is in $40, rather than $40 itself. So suppose memory location $40 contains $17, your 'save' would go to location $17.

I don't think you can do this on an SM-5.  You have the value in A, but to load B, you have to put the address ($40) in B, load memory location into A (then A will be $17), then copy it into B .... but in doing this you have overwritten A. 

If you save A so you can load B this is fine, except that to get A back you have to put the address of where you stored it in B.

That's why (I think) the COP411 has XAD 3,15. It allows you to load A without putting an address in B first.  I reckon this is a last minute "oh XXXX" design decision :)

There are ways round it but they are all bonkers - e.g. having 16 identical copies of

lbi <some value>
lda 0
lbi 4
x 0
rtn0

except for the save address (the 4) and doing an indexed jump to the right one, and things like that.  But you have to code for the processor you have.
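
To show the shape of that workaround, here is a Python sketch of the "16 copies" idea. SAVE_SLOT, make_store_routine and indirect_store are names invented for the sketch, and the list lookup stands in for the indexed jump. Each routine has its target address baked in, just like the hard-coded lbi in each copy. (On the real chip x 0 is a swap rather than a store; the model simplifies that.)

```python
SAVE_SLOT = 0  # hypothetical fixed location where A gets parked


def make_store_routine(target):
    """Build one of the 16 near identical routines for a fixed address."""
    def routine(ram):
        a = ram[SAVE_SLOT]   # lbi <save slot> / lda 0 : get the value back
        ram[target] = a      # lbi <target>    / x 0   : store it
    return routine


# One routine per possible target address in a 16 nibble bank.
STORE_ROUTINES = [make_store_routine(n) for n in range(16)]


def indirect_store(ram, a, pointer_addr):
    """Store `a` at the address held in ram[pointer_addr]."""
    ram[SAVE_SLOT] = a              # park A, since loading the pointer kills it
    target = ram[pointer_addr]      # the accumulator is now the address
    STORE_ROUTINES[target](ram)     # the "indexed jump" to the right copy


ram = [0] * 16
ram[4] = 7
indirect_store(ram, 6, 4)
```

After the call ram[7] holds 6. It works, but it costs sixteen copies of the routine in ROM to do what one instruction does elsewhere.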

Exclusive OR for example. Neither this nor the TMS1x00 has it (or AND or OR for that matter), and I needed it for the TMS1000 project (Simon). I ended up writing a sort of 'check and rotate in a loop' bit of code to just exclusive or two values, and it took nearly a whole page of the ROM space.


That's why the Sharp SM-5 series is broken (the other things like using T for every instruction going, I can live with .....)

TBH, I did seriously consider abandoning it for another project. But it's not called RetroChallenge for nothing.

 



Monday 24 June 2013

Another bug fix

The TM, TML and RTN instructions weren't doing their job, this is now fixed.

Additionally there is a procedure mechanism using TM which works as follows:

Definition

    proc pname
    ..
    (do stuff)
    ..
    rtn0

Call

    tm pname

it also has a pseudo operation 'extpage' which works like nextpage except that it skips pages 0 and 16 of ROM memory which are used for this mechanism. Procedures can be defined retrospectively. Only 32 (currently) are allowed.

The point of this is that you can write page independent code. The assembler takes care of patching up the procedure links (it does this via pages 0 and 16, and possibly later 17), and you don't have to bother with TML (which only works in certain pages). This can now just be ignored.

The stack is only two levels deep so it's still not going to be very structured though.....