paul@71 | 1 | The Acorn Electron ULA
|
paul@71 | 2 | ======================
|
paul@71 | 3 |
|
paul@46 | 4 | Principal Design and Feature Constraints
|
paul@46 | 5 | ----------------------------------------
|
paul@46 | 6 |
|
paul@116 | 7 | The features of the ULA are limited in sophistication by the amount of time
|
paul@116 | 8 | and resources that can be allocated to each activity supporting the
|
paul@116 | 9 | fundamental features and obligations of the unit. Maintaining a screen display
|
paul@116 | 10 | based on the contents of RAM itself requires the ULA to have exclusive access
|
paul@116 | 11 | to various hardware resources for a significant period of time.
|
paul@116 | 12 |
|
paul@116 | 13 | Whilst other elements of the ULA can in principle run in parallel with the
|
paul@116 | 14 | display refresh activity, they cannot also access the RAM at the same time.
|
paul@116 | 15 | Consequently, other features that might use the RAM must accept a reduced
|
paul@116 | 16 | allocation of that resource in comparison to a hypothetical architecture where
|
paul@116 | 17 | concurrent RAM access is possible at all times.
|
paul@46 | 18 |
|
paul@46 | 19 | Thus, the principal constraint for many features is bandwidth. The duration of
|
paul@46 | 20 | access to hardware resources is one aspect of this; the rate at which such
|
paul@46 | 21 | resources can be accessed is another. For example, the RAM is not fast enough
|
paul@46 | 22 | to support access more frequently than one byte per 2MHz cycle, and for screen
|
paul@46 | 23 | modes involving 80 bytes of screen data per scanline, there are no free cycles
|
paul@46 | 24 | for anything other than the production of pixel output during the active
|
paul@46 | 25 | scanline periods.
|
paul@46 | 26 |
|
paul@116 | 27 | Another constraint is imposed by the method of RAM access provided by the ULA.
|
paul@116 | 28 | The ULA is able to access RAM by fetching 4 bits at a time and thus managing
|
paul@116 | 29 | to transfer 8 bits within a single 2MHz cycle, this being sufficient to
|
paul@116 | 30 | provide display data for the most demanding screen modes. However, this
|
paul@116 | 31 | mechanism's timing requirements are beyond the capabilities of the CPU when
|
paul@116 | 32 | running at 2MHz.
|
paul@116 | 33 |
|
paul@116 | 34 | Consequently, the CPU will only ever be able to access RAM via the ULA at
|
paul@116 | 35 | 1MHz, even when the ULA is not accessing the RAM. Fortunately, when needing to
|
paul@116 | 36 | refresh the display, the ULA is still able to make use of the idle part of
|
paul@116 | 37 | each 1MHz cycle (or, rather, the idle 2MHz cycle unused by the CPU) to itself
|
paul@116 | 38 | access the RAM at a rate of 1 byte per 1MHz cycle (or 1 byte every other 2MHz
|
paul@116 | 39 | cycle), thus supporting the less demanding screen modes.
|
paul@116 | 40 |
|
paul@22 | 41 | Timing
|
paul@22 | 42 | ------
|
paul@22 | 43 |
|
paul@40 | 44 | According to 15.3.2 in the Advanced User Guide, there are 312 scanlines, 256
|
paul@40 | 45 | of which are used to generate pixel data. At 50Hz, this means that 128 cycles
|
paul@40 | 46 | are spent on each scanline (2000000 cycles / 50 = 40000 cycles; 40000 cycles /
|
paul@40 | 47 | 312 ~= 128 cycles). This is consistent with the observation that each scanline
|
paul@37 | 48 | requires at most 80 bytes of data, and that the ULA is apparently busy for 40
|
paul@37 | 49 | out of 64 microseconds in each scanline.
|
paul@22 | 50 |
|
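The scanline arithmetic above can be reproduced with a short sketch. The
constant names are illustrative and are not taken from any Acorn
documentation; the figures are those quoted in this section.

```python
# Scanline timing figures quoted above; the names are illustrative only.
CLOCK_HZ = 2_000_000      # RAM access clock (2MHz)
FIELD_RATE_HZ = 50        # fields per second
LINES_PER_FIELD = 312     # scanlines per field (Advanced User Guide, 15.3.2)
BYTES_PER_LINE = 80       # most demanding screen modes

cycles_per_field = CLOCK_HZ // FIELD_RATE_HZ            # 40000
cycles_per_line = cycles_per_field / LINES_PER_FIELD    # just over 128

# Treating a scanline as exactly 128 cycles gives the 64 microsecond line
# period, of which 80 cycles (one per byte) are spent fetching display data.
line_period_us = 128 * 1_000_000 / CLOCK_HZ
busy_period_us = BYTES_PER_LINE * 1_000_000 / CLOCK_HZ

print(cycles_per_field, round(cycles_per_line, 1), line_period_us, busy_period_us)
```
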
paul@78 | 51 | (Since the ULA is seeking to provide an image for an interlaced
|
paul@78 | 52 | 625-line display, there are in fact two "fields" involved, one providing 312
|
paul@78 | 53 | scanlines and one providing 313 scanlines. See below for a description of the
|
paul@78 | 54 | video system.)
|
paul@78 | 55 |
|
paul@33 | 56 | Access to RAM involves accessing four 64Kb dynamic RAM devices (IC4 to IC7,
|
paul@33 | 57 | each providing two bits of each byte) using two cycles within the 500ns period
|
paul@36 | 58 | of the 2MHz clock to complete each access operation. Since the CPU and ULA
|
paul@36 | 59 | have to take turns in accessing the RAM in MODE 4, 5 and 6, the CPU must
|
paul@36 | 60 | effectively run at 1MHz (since every other 500ns period involves the ULA
|
paul@115 | 61 | accessing RAM) during transfers of screen data.
|
paul@33 | 62 |
|
paul@115 | 63 | The CPU is driven by an external clock (IC8) whose 16MHz frequency is divided
|
paul@138 | 64 | by the ULA (IC1) depending on the screen mode in use. Each 16MHz cycle is
|
paul@115 | 65 | 62.5ns in duration. To access the memory, the following patterns
|
paul@115 | 66 | corresponding to 16MHz cycles are required:
|
paul@37 | 67 |
|
paul@99 | 68 | Time (ns): 0-------------- 500------------- ...
|
paul@99 | 69 | 2 MHz cycle: 0 1 ...
|
paul@99 | 70 | 16 MHz cycle: 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 ...
|
paul@99 | 71 | /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\ ...
|
paul@100 | 72 | ~RAS: /---\___________/---\___________ ...
|
paul@100 | 73 | ~CAS: /-----\___/-\___/-----\___/-\___ ...
|
paul@101 | 74 | Address events: A B C A B C ...
|
paul@139 | 75 | Data events: ...F ...S ...F ...S ...
|
paul@139 | 76 | ~WE: W W ...
|
paul@37 | 77 |
|
paul@101 | 78 | ~RAS ops: 1 0 1 0 ...
|
paul@101 | 79 | ~CAS ops: 1 0 1 0 1 0 1 0 ...
|
paul@101 | 80 |
|
paul@138 | 81 | Address ops: a.b. c. a.b. c. ...
|
paul@101 | 82 | Data ops: s f s f ...
|
paul@101 | 83 |
|
paul@139 | 84 | PHI OUT: ----\_______/------------------- ...
|
paul@139 | 85 | CPU (RAM): .....L ....D ...
|
paul@139 | 86 | RnW: .....R ...
|
paul@99 | 87 |
|
paul@139 | 88 | PHI OUT: ----\_______/-------\_______/--- ...
|
paul@139 | 89 | CPU (ROM): D .....L ....D .....L .... ...
|
paul@139 | 90 | RnW: .....R .....R ...
|
paul@97 | 91 |
|
paul@101 | 92 | ~RAS must be high for 100ns, ~CAS must be high for 50ns.
|
paul@101 | 93 | ~RAS must be low for 150ns, ~CAS must be low for 90ns.
|
paul@101 | 94 | Data is available 150ns after ~RAS goes low, 90ns after ~CAS goes low.
|
paul@101 | 95 |
|
paul@64 | 96 | Here, "A" and "B" respectively indicate the row and first column addresses
|
paul@64 | 97 | being latched into the RAM (on a negative edge for ~RAS and ~CAS
|
paul@64 | 98 | respectively), and "C" indicates the second column address being latched into
|
paul@64 | 99 | the RAM. Presumably, the first and second half-bytes can be read at "F" and
|
paul@64 | 100 | "S" respectively, and the row and column addresses must be made available at
|
paul@138 | 101 | "a" and "b" (and "c") respectively at the latest. The TM4164EC4 datasheet
|
paul@138 | 102 | suggests that the addresses can be made available as the ~RAS and ~CAS levels
|
paul@138 | 103 | are brought low. Data can be read at "f" and "s" for the first and second
|
paul@138 | 104 | half-bytes respectively.
|
paul@64 | 105 |
|
paul@64 | 106 | The TM4164EC4-15 has a row address access time of 150ns (maximum) and a column
|
paul@99 | 107 | address access time of 90ns (maximum), which appears to mean that ~RAS must be
|
paul@99 | 108 | held low for at least 150ns and that ~CAS must be held low for at least 90ns
|
paul@99 | 109 | before data becomes available. 150ns is 2.4 cycles (at 16MHz) and 90ns is 1.44
|
paul@99 | 110 | cycles. Thus, "A" to "F" is 2.5 cycles, "B" to "F" is 1.5 cycles, "C" to "S"
|
paul@99 | 111 | is 1.5 cycles.
|
paul@37 | 112 |
|
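The conversion of the datasheet access times into 16MHz cycles can be
sketched as follows. The rounding up to the next half cycle is an assumption
made here to match the "2.5 cycles" and "1.5 cycles" figures above; the
function names are illustrative.

```python
import math

CYCLE_NS = 62.5   # one 16MHz cycle

def to_cycles(ns):
    """Convert a duration in nanoseconds to 16MHz cycles."""
    return ns / CYCLE_NS

def to_half_cycles(ns):
    """Round a duration up to the next half cycle (assumed granularity)."""
    return math.ceil(to_cycles(ns) * 2) / 2

ROW_ACCESS_NS = 150      # TM4164EC4-15 row address access time (maximum)
COLUMN_ACCESS_NS = 90    # TM4164EC4-15 column address access time (maximum)

for name, ns in (("row", ROW_ACCESS_NS), ("column", COLUMN_ACCESS_NS)):
    print(name, to_cycles(ns), to_half_cycles(ns))
```
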
paul@38 | 113 | Note that the Service Manual refers to the negative edge of RAS and CAS, but
|
paul@38 | 114 | the datasheet for the similar TM4164EC4 product shows latching on the negative
|
paul@38 | 115 | edge of ~RAS and ~CAS. It is possible that the Service Manual also intended to
|
paul@38 | 116 | communicate the latter behaviour. In the TM4164EC4 datasheet, it appears that
|
paul@38 | 117 | "page mode" provides the appropriate behaviour for that particular product.
|
paul@38 | 118 |
|
paul@76 | 119 | The CPU, when accessing the RAM alone, apparently does not make use of the
|
paul@76 | 120 | vacated "slot" that the ULA would otherwise use (when interleaving accesses in
|
paul@76 | 121 | MODE 4, 5 and 6). It only employs a full 2MHz access frequency to memory when
|
paul@103 | 122 | accessing ROM (and potentially sideways RAM). The principal limitation is the
|
paul@103 | 123 | amount of time needed between issuing an address and receiving an entire byte
|
paul@103 | 124 | from the RAM, which is approximately 7 cycles (at 16MHz): much longer than the
|
paul@103 | 125 | 4 cycles that would be required for 2MHz operation.
|
paul@76 | 126 |
|
paul@139 | 127 | Write operations expose some uncertainty about the relationship between the
|
paul@139 | 128 | ULA's RAM access schedule and the PHI OUT clock. The Service Manual shows PHI
|
paul@139 | 129 | IN (which should be the ULA's PHI OUT signal) as being synchronised with ~RAS.
|
paul@139 | 130 | Since the CPU makes its address available potentially as late as 140ns after
|
paul@139 | 131 | its PHI2 clock goes low (this clock being broadly similar to PHI OUT), it
|
paul@139 | 132 | would make no sense to expect the ULA to be able to perform a memory access
|
paul@139 | 133 | immediately. What seems more likely is that the CPU makes data available, and
|
paul@139 | 134 | this is written during the next 2MHz cycle.
|
paul@139 | 135 |
|
paul@139 | 136 | For the CPU, "L" indicates the point at which an address is taken from the CPU
|
paul@139 | 137 | address bus, following a negative edge of PHI OUT, with "D" being the point at
|
paul@139 | 138 | which data may be asserted for writing, following a positive edge of PHI OUT.
|
paul@139 | 139 | Here, PHI OUT is driven at 1MHz. Given that ~WE needs to be driven low for
|
paul@139 | 140 | writing or high for reading, and thus propagates RnW from the CPU, this would
|
paul@139 | 141 | need to be done before data would be retrieved and, according to the TM4164EC4
|
paul@139 | 142 | datasheet, even as late as the column address is presented and ~CAS brought
|
paul@139 | 143 | low.
|
paul@139 | 144 |
|
paul@139 | 145 | It must be concluded that where accesses are interleaved between the CPU and
|
paul@139 | 146 | ULA, the CPU access begins concurrently with the ULA access, with the CPU
|
paul@139 | 147 | address and data retained by the ULA, and after the ULA access, the rest of
|
paul@139 | 148 | the CPU transaction occurs in the following 2MHz cycle.
|
paul@139 | 149 |
|
paul@57 | 150 | See: Acorn Electron Advanced User Guide
|
paul@57 | 151 | See: Acorn Electron Service Manual
|
paul@115 | 152 | http://chrisacorns.computinghistory.org.uk/docs/Acorn/Manuals/Acorn_ElectronSM.pdf
|
paul@57 | 153 | See: http://mdfs.net/Docs/Comp/Electron/Techinfo.htm
|
paul@76 | 154 | See: http://stardot.org.uk/forums/viewtopic.php?p=120438#p120438
|
paul@121 | 155 | See: One of the Most Popular 65,536-Bit (64K) Dynamic RAMs The TMS 4164
|
paul@121 | 156 | http://smithsonianchips.si.edu/augarten/p64.htm
|
paul@139 | 157 | See: https://www.mups.co.uk/project/hardware/acorn_electron/
|
paul@139 | 158 | See: Rockwell R650X and R651X Microprocessors (CPU)
|
paul@139 | 159 | See: http://wilsonminesco.com/6502primer/
|
paul@76 | 160 |
|
paul@119 | 161 | A Note on 8-Bit Wide RAM Access
|
paul@119 | 162 | -------------------------------
|
paul@119 | 163 |
|
paul@119 | 164 | It is worth considering the timing when 8 bits of data can be obtained at once
|
paul@119 | 165 | from the RAM chips:
|
paul@119 | 166 |
|
paul@119 | 167 | Time (ns): 0-------------- 500------------- ...
|
paul@119 | 168 | 2 MHz cycle: 0 1 ...
|
paul@119 | 169 | 8 MHz cycle: 0 1 2 3 0 1 2 3 ...
|
paul@119 | 170 | /-\_/-\_/-\_/-\_/-\_/-\_/-\_/-\_ ...
|
paul@119 | 171 | ~RAS: /---\___________/---\___________ ...
|
paul@119 | 172 | ~CAS: /-------\_______/-------\_______ ...
|
paul@119 | 173 | Address events: A B A B ...
|
paul@139 | 174 | Data events: ...E ...E ...
|
paul@139 | 175 | ~WE: W W ...
|
paul@119 | 176 |
|
paul@119 | 177 | ~RAS ops: 1 0 1 0 ...
|
paul@119 | 178 | ~CAS ops: 1 0 1 0 ...
|
paul@119 | 179 |
|
paul@139 | 180 | Address ops: a. b. a. b. ...
|
paul@119 | 181 | Data ops: f s f ...
|
paul@119 | 182 |
|
paul@139 | 183 | PHI OUT: ----\_______/-------\_______/--- ...
|
paul@139 | 184 | CPU: D .....L ....D .....L .... ...
|
paul@139 | 185 | RnW: .....R .....R ...
|
paul@119 | 186 |
|
paul@120 | 187 | Here, "E" indicates the availability of an entire byte.
|
paul@120 | 188 |
|
paul@119 | 189 | Since only one fetch is required per 2MHz cycle, instead of two fetches for
|
paul@119 | 190 | the 4-bit wide RAM arrangement, it seems likely that longer 8MHz cycles could
|
paul@119 | 191 | be used to coordinate the necessary signalling.
|
paul@119 | 192 |
|
paul@120 | 193 | Another conceivable simplification from using an 8-bit wide RAM access channel
|
paul@120 | 194 | with a single access within each 2MHz cycle is the possibility of allowing the
|
paul@120 | 195 | CPU to signal directly to the RAM instead of having the ULA perform the access
|
paul@124 | 196 | signalling on the CPU's behalf. Note that it is this more leisurely signalling
|
paul@124 | 197 | that would allow the CPU to conduct accesses at 2MHz: the "compressed"
|
paul@124 | 198 | signalling being beyond the capabilities of the CPU.
|
paul@120 | 199 |
|
paul@122 | 200 | Note that 16MHz cycles would still be needed for the pixel clock in MODE 0,
|
paul@122 | 201 | which needs to output eight pixels per 2MHz cycle, producing 640 monochrome
|
paul@122 | 202 | pixels per 80-byte line.
|
paul@122 | 203 |
|
paul@124 | 204 | An obvious consideration with regard to 8-bit wide access is whether the ULA
|
paul@124 | 205 | could still conduct the "compressed" signalling for its own RAM accesses:
|
paul@124 | 206 |
|
paul@124 | 207 | Time (ns): 0-------------- 500------------- ...
|
paul@124 | 208 | 2 MHz cycle: 0 1 ...
|
paul@124 | 209 | 16 MHz cycle: 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 ...
|
paul@124 | 210 | /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\ ...
|
paul@124 | 211 | ~RAS: /---\___________/---\___________ ...
|
paul@124 | 212 | ~CAS: /-----\___/-\___/-----\___/-\___ ...
|
paul@124 | 213 | Address events: A B C A B C ...
|
paul@139 | 214 | Data events: ...1 ...2 ...1 ...2 ...
|
paul@139 | 215 | ~WE: W W ...
|
paul@124 | 216 |
|
paul@124 | 217 | ~RAS ops: 1 0 1 0 ...
|
paul@124 | 218 | ~CAS ops: 1 0 1 0 1 0 1 0 ...
|
paul@124 | 219 |
|
paul@139 | 220 | Address ops: a.b. c a.b. c ...
|
paul@124 | 221 | Data ops: s f s f ...
|
paul@124 | 222 |
|
paul@139 | 223 | PHI OUT: ----\_______/-------\_______/--- ...
|
paul@139 | 224 | CPU: D .....L ....D .....L .... ...
|
paul@139 | 225 | RnW: .....R .....R ...
|
paul@124 | 226 |
|
paul@124 | 227 | Here, "1" and "2" in the data events correspond to whole byte accesses,
|
paul@124 | 228 | effectively upgrading the half-byte "F" and "S" events in the existing ULA
|
paul@124 | 229 | arrangement.
|
paul@124 | 230 |
|
paul@124 | 231 | Although the provision of access for the CPU would adhere to the relevant
|
paul@124 | 232 | timing constraints, providing only one byte per 2MHz cycle, the ULA could
|
paul@124 | 233 | obtain two bytes per cycle. This would then free up bandwidth for the CPU in
|
paul@124 | 234 | screen modes where the ULA would normally be dominant (MODE 0 to 3), albeit at
|
paul@124 | 235 | the cost of extra buffering. Such buffering could also be done for modes where
|
paul@124 | 236 | the bandwidth is shared (MODE 4 to 6), consolidating pairs of ULA accesses into
|
paul@124 | 237 | single cycles and freeing up an extra cycle for CPU accesses.
|
paul@124 | 238 |
|
paul@131 | 239 | A further consideration is whether the CPU and ULA could access the memory on
|
paul@131 | 240 | interleaved 4MHz cycles, thus replicating the arrangement used by the CPU and
|
paul@131 | 241 | Video ULA on the BBC Micro. One potential obstacle is that the apparent 4MHz
|
paul@131 | 242 | access rate employed by the ULA does not involve the complete process for
|
paul@131 | 243 | accessing the RAM: upon setting up the address and issuing the ~RAS signal,
|
paul@131 | 244 | the ULA is able to make a pair of column accesses on the same "row" of memory,
|
paul@134 | 245 | effectively achieving an average access rate of 4MHz in an 8-bit
|
paul@134 | 246 | configuration.
|
paul@131 | 247 |
|
paul@131 | 248 | However, if arbitrary pairs of column accesses were to be attempted, as would
|
paul@131 | 249 | be required by CPU and ULA interleaving, the ~RAS signal would need to be
|
paul@131 | 250 | re-issued with different addresses being set up. This would expand the time to
|
paul@131 | 251 | access a memory location to beyond the period of a 4MHz cycle, making it
|
paul@131 | 252 | impossible to employ interleaved accesses at such a rate.
|
paul@131 | 253 |
|
paul@134 | 254 | In conclusion, a strict interleaving strategy is not possible, but by using
|
paul@134 | 255 | pixel data buffering and employing two ULA accesses per 2MHz cycle to obtain
|
paul@134 | 256 | two bytes in that cycle, each adjacent 2MHz cycle can be given to the CPU,
|
paul@134 | 257 | thus achieving an effective throughput during display update periods of 3
|
paul@134 | 258 | bytes for every pair of cycles (2 bytes for the ULA, 1 byte for the CPU), and
|
paul@134 | 259 | thus 1.5 bytes per cycle, giving an illusion of 3MHz access to RAM.
|
paul@134 | 260 |
|
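The throughput figure above can be checked with a small sketch. The function
name and its parameters are hypothetical and serve only to illustrate the
arithmetic.

```python
from fractions import Fraction

CLOCK_MHZ = 2   # RAM access cycles per microsecond

def effective_rate_mhz(ula_bytes, cpu_bytes, cycles):
    """Bytes transferred per microsecond over a repeating group of 2MHz cycles."""
    return Fraction(ula_bytes + cpu_bytes, cycles) * CLOCK_MHZ

# Proposed scheme: the ULA fetches two bytes in one cycle, and the CPU
# transfers one byte in the adjacent cycle.
proposed = effective_rate_mhz(ula_bytes=2, cpu_bytes=1, cycles=2)

# Interleaved accesses as in MODE 4, 5 and 6: one byte each over two cycles.
interleaved = effective_rate_mhz(ula_bytes=1, cpu_bytes=1, cycles=2)

print(proposed, interleaved)
```
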
paul@135 | 261 | Some other considerations apply to introducing 8-bit wide access. The ULA
|
paul@135 | 262 | employs four pins for data transfer to and from the memory devices (RAM0..3),
|
paul@135 | 263 | and obviously another four pins would be needed in an 8-bit wide scheme.
|
paul@135 | 264 | However, there may have been a physical limitation on the number of pins
|
paul@135 | 265 | permissible on a ULA package or the device's socket. This would necessitate
|
paul@135 | 266 | the reassignment of pins, although few are readily available for such
|
paul@135 | 267 | reassignment.
|
paul@135 | 268 |
|
paul@135 | 269 | One approach might involve connecting the RAM devices to the CPU data bus,
|
paul@135 | 270 | with each line connecting to a different RAM chip. The signalling of the RAM
|
paul@135 | 271 | would remain under the control of the ULA, thus preventing the RAM devices
|
paul@135 | 272 | from interfering with other memory transfer operations, with the ROM
|
paul@135 | 273 | signalling also remaining under the ULA's control. One potential disadvantage
|
paul@135 | 274 | of this scheme would involve the elimination of the separate data paths
|
paul@135 | 275 | between the CPU and ROM and between the ULA and RAM.
|
paul@135 | 276 |
|
paul@135 | 277 | Another approach might involve reclaiming the keyboard input pins (KBD0..3) as
|
paul@135 | 278 | data pins for ULA access to RAM. This would necessitate the reorganisation of
|
paul@135 | 279 | the keyboard interface, perhaps integrating the keyboard matrix more directly
|
paul@135 | 280 | as a kind of ROM device. A bus transceiver could be used to isolate the
|
paul@135 | 281 | keyboard inputs, with a pin being used to control the transceiver, since the
|
paul@135 | 282 | keyboard data lines are pulled high. In effect, the transceiver would act as a
|
paul@135 | 283 | kind of output enable for the keyboard.
|
paul@135 | 284 |
|
paul@135 | 285 | To make the matrix appear within the sideways ROM region of the memory map,
|
paul@135 | 286 | A15 would need to be set to a high value and A14 to a low value. Signals A13
|
paul@135 | 287 | to A0 would then be brought low to select the appropriate column, with the
|
paul@135 | 288 | individual key states being made available via data lines, perhaps D3 to D0.
|
paul@135 | 289 | This mostly retains the existing addressing arrangement and scanning
|
paul@135 | 290 | mechanism. Internally, the ULA would continue to enable access to the keyboard
|
paul@135 | 291 | through the ROM paging mechanism, but instead of integrating separate data
|
paul@135 | 292 | pins into the CPU's data path, it would integrate the keyboard inputs using
|
paul@135 | 293 | the transceiver.
|
paul@135 | 294 |
|
paul@135 | 295 | Enhancement: Keyboard Matrix Scanning
|
paul@135 | 296 | -------------------------------------
|
paul@135 | 297 |
|
paul@135 | 298 | The keyboard scanning mechanism is presumably designed to be as inexpensive as
|
paul@135 | 299 | possible, being driven by software and avoiding extra logic, but at the
|
paul@135 | 300 | expense of occupying large regions of the memory map when paged in. A more
|
paul@135 | 301 | efficient mapping of the keyboard columns could possibly be done using
|
paul@135 | 302 | decoders such as the 74xx138 part which permits the decoding of three inputs
|
paul@135 | 303 | to select one of eight outputs. Using two of these parts, six address lines
|
paul@135 | 304 | would be dedicated to the keyboard columns as follows:
|
paul@135 | 305 |
|
paul@135 | 306 | A5...A3 select up to eight columns via one decoder
|
paul@135 | 307 | A2...A0 select up to eight columns via another decoder
|
paul@135 | 308 |
|
paul@135 | 309 | In this arrangement, only one of the two ranges of pins would be used at any
|
paul@135 | 310 | given time. If the ULA were to require a certain combination of the remaining
|
paul@135 | 311 | address bits, a region as small as 64 bytes could be dedicated to the
|
paul@135 | 312 | keyboard.
|
paul@135 | 313 |
|
paul@135 | 314 | A more efficient arrangement could be used by introducing logic that allows
|
paul@135 | 315 | the decoders to work together to address the keyboard:
|
paul@135 | 316 |
|
paul@135 | 317 | A2...A0 select up to eight columns via both decoders
|
paul@135 | 318 | A3 would enable one decoder if low and the other decoder if high
|
paul@135 | 319 |
|
paul@135 | 320 | With ULA constraints on the remaining address bits, a 16-byte region could be
|
paul@135 | 321 | used to represent the keyboard.
|
paul@135 | 322 |
|
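The combined decoder arrangement can be modelled in a few lines. This is
only a behavioural sketch under the assumption that A3 drives the enable
inputs of the two decoders in opposite senses; the function names are
hypothetical, and the outputs are modelled as active low, as on the 74xx138.

```python
def decode_138(address, enabled):
    """Model a 3-to-8 decoder: return eight active-low outputs."""
    outputs = [1] * 8
    if enabled:
        outputs[address & 0b111] = 0
    return outputs

def select_column(address):
    """Return sixteen column select lines for a 4-bit keyboard address."""
    a3 = (address >> 3) & 1
    lower = decode_138(address, enabled=(a3 == 0))   # columns 0 to 7
    upper = decode_138(address, enabled=(a3 == 1))   # columns 8 to 15
    return lower + upper

# Exactly one column line is brought low for each address in the region.
for address in range(16):
    assert select_column(address).count(0) == 1
print(select_column(0xB).index(0))
```
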
paul@135 | 323 | A further refinement might involve combining the existing columns into groups
|
paul@135 | 324 | of eight keys. This would reduce the number of columns to seven, requiring
|
paul@135 | 325 | only three address lines, with all eight data lines being used to read the
|
paul@135 | 326 | matrix.
|
paul@135 | 327 |
|
paul@135 | 328 | On the BBC Micro, the system 6522 VIA is used to monitor and read from the
|
paul@135 | 329 | keyboard. The memory locations involved with this chip are located in the
|
paul@135 | 330 | region from &FE40 to &FE5F inclusive, although the memory is allocated in a
|
paul@135 | 331 | way that is appropriate to operate that chip, as opposed to merely exposing
|
paul@135 | 332 | the keyboard matrix.
|
paul@135 | 333 |
|
paul@135 | 334 | Enhancement: Hardware Device Selection
|
paul@135 | 335 | --------------------------------------
|
paul@135 | 336 |
|
paul@135 | 337 | An alternative to the existing, rather cumbersome, sideways ROM mapping of the
|
paul@135 | 338 | keyboard might involve making it accessible via a hardware-related memory page
|
paul@135 | 339 | like page FE. With ULA addresses confined to FE0x, and with the ULA itself
|
paul@135 | 340 | having to trap accesses to page FE, the page selection signal might be brought
|
paul@135 | 341 | out of the ULA instead of any dedicated signal for the keyboard. Various
|
paul@135 | 342 | address lines corresponding to A7 through A4, or a subset of these, could be
|
paul@135 | 343 | fed into a decoder to permit the selection of other devices, with the keyboard
|
paul@135 | 344 | being one of these.
|
paul@135 | 345 |
|
paul@135 | 346 | Meanwhile, a more efficient keyboard mapping using the above matrix
|
paul@135 | 347 | enhancement would permit the different keyboard columns to appear as a group
|
paul@135 | 348 | of sixteen or eight bytes. Thus:
|
paul@135 | 349 |
|
paul@135 | 350 | A15...A8 select page FE
|
paul@135 | 351 | A7...A4 select a device or peripheral
|
paul@135 | 352 | A3...A0 select a register or keyboard column
|
paul@135 | 353 |
|
paul@135 | 354 | Conceivably, devices such as sound generators could be mapped to device
|
paul@135 | 355 | regions.
|
paul@135 | 356 |
|
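The proposed address layout can be illustrated with a decoding sketch. The
function name is hypothetical, and treating device 0 as the ULA merely
follows the observation above that ULA addresses are confined to FE0x.

```python
def decode_fe_page(address):
    """Split a 16-bit address into (device, register) if it lies in page FE."""
    page = (address >> 8) & 0xFF       # A15...A8
    if page != 0xFE:
        return None                    # not a hardware page access
    device = (address >> 4) & 0xF      # A7...A4 select a device or peripheral
    register = address & 0xF           # A3...A0 select a register or column
    return device, register

print(decode_fe_page(0xFE07))   # device 0 (the ULA in this scheme), register 7
print(decode_fe_page(0xFE4A))   # device 4, register or keyboard column 10
print(decode_fe_page(0x8000))   # outside page FE
```
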
paul@110 | 357 | CPU Clock Notes
|
paul@110 | 358 | ---------------
|
paul@110 | 359 |
|
paul@111 | 360 | "The 6502 receives an external square-wave clock input signal on pin 37, which
|
paul@111 | 361 | is usually labeled PHI0. [...] This clock input is processed within the 6502
|
paul@111 | 362 | to form two clock outputs: PHI1 and PHI2 (pins 3 and 39, respectively). PHI2
|
paul@111 | 363 | is essentially a copy of PHI0; more specifically, PHI2 is PHI0 after it's been
|
paul@111 | 364 | through two inverters and a push-pull amplifier. The same network of
|
paul@111 | 365 | transistors within the 6502 which generates PHI2 is also tied to PHI1, and
|
paul@111 | 366 | generates PHI1 as the inverse of PHI0. The reason why PHI1 and PHI2 are made
|
paul@111 | 367 | available to external devices is so that they know when they can access the
|
paul@111 | 368 | CPU. When PHI1 is high, this means that external devices can read from the
|
paul@111 | 369 | address bus or data bus; when PHI2 is high, this means that external devices
|
paul@111 | 370 | can write to the data bus."
|
paul@111 | 371 |
|
paul@111 | 372 | See: http://lateblt.livejournal.com/88105.html
|
paul@111 | 373 |
|
paul@110 | 374 | "The 6502 has a synchronous memory bus where the master clock is divided into
|
paul@110 | 375 | two phases (Phase 1 and Phase 2). The address is always generated during Phase
|
paul@110 | 376 | 1 and all memory accesses take place during Phase 2."
|
paul@110 | 377 |
|
paul@111 | 378 | See: http://www.jmargolin.com/vgens/vgens.htm
|
paul@110 | 379 |
|
paul@111 | 380 | Thus, the inverse of PHI OUT provides the "other phase" of the clock. "During
|
paul@111 | 381 | Phase 1" means when PHI1 is high (PHI0, and thus PHI2, being low) and "during
|
paul@111 | 382 | Phase 2" means when PHI2 is high.
|
paul@110 | 383 |
|
paul@76 | 384 | Bandwidth Figures
|
paul@76 | 385 | -----------------
|
paul@76 | 386 |
|
paul@76 | 387 | Using an observation of 128 2MHz cycles per scanline, 256 active lines and 312
|
paul@76 | 388 | total lines, with 80 cycles occurring in the active periods of display
|
paul@76 | 389 | scanlines, the following bandwidth calculations can be performed:
|
paul@76 | 390 |
|
paul@76 | 391 | Total theoretical maximum:
|
paul@76 | 392 | 128 cycles * 312 lines
|
paul@76 | 393 | = 39936 bytes
|
paul@76 | 394 |
|
paul@76 | 395 | MODE 0, 1, 2:
|
paul@76 | 396 | ULA: 80 cycles * 256 lines
|
paul@76 | 397 | = 20480 bytes
|
paul@76 | 398 | CPU: 48 cycles / 2 * 256 lines
|
paul@76 | 399 | + 128 cycles / 2 * (312 - 256) lines
|
paul@76 | 400 | = 9728 bytes
|
paul@76 | 401 |
|
paul@76 | 402 | MODE 3:
|
paul@76 | 403 | ULA: 80 cycles * 24 rows * 8 lines
|
paul@76 | 404 | = 15360 bytes
|
paul@76 | 405 | CPU: 48 cycles / 2 * 24 rows * 8 lines
|
paul@76 | 406 | + 128 cycles / 2 * (312 - (24 rows * 8 lines))
|
paul@76 | 407 | = 12288 bytes
|
paul@76 | 408 |
|
paul@76 | 409 | MODE 4, 5:
|
paul@76 | 410 | ULA: 40 cycles * 256 lines
|
paul@76 | 411 | = 10240 bytes
|
paul@76 | 412 | CPU: (40 cycles + 48 cycles / 2) * 256 lines
|
paul@76 | 413 | + 128 cycles / 2 * (312 - 256) lines
|
paul@76 | 414 | = 19968 bytes
|
paul@76 | 415 |
|
paul@76 | 416 | MODE 6:
|
paul@76 | 417 | ULA: 40 cycles * 24 rows * 8 lines
|
paul@76 | 418 | = 7680 bytes
|
paul@76 | 419 | CPU: (40 cycles + 48 cycles / 2) * 24 rows * 8 lines
|
paul@76 | 420 | + 128 cycles / 2 * (312 - (24 rows * 8 lines))
|
paul@76 | 421 | = 19968 bytes
|
paul@76 | 422 |
|
paul@76 | 423 | Here, the division by 2 for CPU accesses is performed to indicate that the CPU
|
paul@76 | 424 | only uses every other access opportunity even in uncontended periods. See the
|
paul@76 | 425 | 2MHz RAM Access enhancement below for bandwidth calculations that consider
|
paul@76 | 426 | this limitation removed.
|
paul@57 | 427 |
|
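The per-mode calculations above follow a common pattern, which can be
sketched as a single function. The function and variable names are
illustrative; the row and line counts are those used in the calculations
above.

```python
CYCLES_PER_LINE = 128   # 2MHz cycles per scanline
ACTIVE_CYCLES = 80      # cycles in the active period of a display scanline
TOTAL_LINES = 312

def bandwidth(ula_cycles, display_lines):
    """Return (ULA bytes, CPU bytes) per field for a screen mode."""
    ula = ula_cycles * display_lines
    # Active-period cycles not taken by the ULA (0 or 40) go to the CPU.
    cpu_active = ACTIVE_CYCLES - ula_cycles
    # Elsewhere, the CPU only uses every other access opportunity.
    cpu_inactive = (CYCLES_PER_LINE - ACTIVE_CYCLES) // 2
    cpu = ((cpu_active + cpu_inactive) * display_lines
           + CYCLES_PER_LINE // 2 * (TOTAL_LINES - display_lines))
    return ula, cpu

modes = {
    "MODE 0, 1, 2": (80, 256),
    "MODE 3": (80, 24 * 8),
    "MODE 4, 5": (40, 256),
    "MODE 6": (40, 24 * 8),
}
for name, (ula_cycles, lines) in modes.items():
    print(name, bandwidth(ula_cycles, lines))
```
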
paul@123 | 428 | A summary of the bandwidth figures is as follows (with extra timing details
|
paul@123 | 429 | described below):
|
paul@123 | 430 |
|
paul@123 | 431 |               Standard ULA   % Total   Slowdown   BBC-10s   BBC-34s
|
paul@123 | 432 | MODE 0, 1, 2  9728 bytes     24%       4.11       43s       105s
|
paul@123 | 433 | MODE 3        12288 bytes    31%       3.25       34s
|
paul@123 | 434 | MODE 4, 5     19968 bytes    50%       2          20s
|
paul@123 | 435 | MODE 6        19968 bytes    50%       2          20s       50s
|
paul@123 | 436 |
|
paul@123 | 437 | The review of the Electron in Practical Computing (October 1983) provides a
|
paul@123 | 438 | concise overview of the RAM access limitations and gives timing comparisons
|
paul@123 | 439 | between modes and BBC Micro performance. In the above, "BBC-10s" is the
|
paul@123 | 440 | measured or stated time given for a program taking 10 seconds on the BBC
|
paul@123 | 441 | Micro, whereas "BBC-34s" is the apparently measured time given for the
|
paul@123 | 442 | "Persian" program taking 34 seconds to complete on the BBC Micro, with a
|
paul@123 | 443 | "quick" mode presumably switching to MODE 6 using the ULA directly in order to
|
paul@123 | 444 | reduce display bandwidth usage while the program draws to the screen.
|
paul@123 | 445 | Evidently, the measured slowdown is slightly lower than the theoretical
|
paul@123 | 446 | slowdown, most likely due to the running time not being entirely dominated by
|
paul@123 | 447 | RAM access performance characteristics.
|
paul@123 | 448 |
|
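The percentage and slowdown columns can be derived from the CPU bandwidth
figures, and the "Persian" timings can be converted to slowdown factors for
comparison. This is only a sketch: the names are illustrative, and dividing
the Electron timings by the 34 second BBC Micro timing is an assumption about
how the comparison is best made.

```python
TOTAL_BYTES = 128 * 312   # theoretical maximum per field

cpu_bytes = {
    "MODE 0, 1, 2": 9728,
    "MODE 3": 12288,
    "MODE 4, 5": 19968,
    "MODE 6": 19968,
}

def share_percent(available):
    """CPU bandwidth as a percentage of the theoretical maximum."""
    return round(100 * available / TOTAL_BYTES)

def slowdown(available):
    """Theoretical slowdown relative to uncontended 2MHz access."""
    return round(TOTAL_BYTES / available, 2)

for mode, available in cpu_bytes.items():
    print(mode, share_percent(available), slowdown(available))

# Slowdown implied by the "Persian" timings (Electron seconds / BBC seconds).
persian = {"MODE 0, 1, 2": round(105 / 34, 2), "MODE 6": round(50 / 34, 2)}
print(persian)
```
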
paul@40 | 449 | Video Timing
|
paul@40 | 450 | ------------
|
paul@40 | 451 |
|
paul@40 | 452 | According to 8.7 in the Service Manual, and the PAL Wikipedia page,
|
paul@40 | 453 | approximately 4.7µs is used for the sync pulse, 5.7µs for the "back porch"
|
paul@40 | 454 | (including the "colour burst"), and 1.65µs for the "front porch", totalling
|
paul@40 | 455 | 12.05µs and thus leaving 51.95µs for the active video signal for each
|
paul@40 | 456 | scanline. As the Service Manual suggests in the oscilloscope traces, the
|
paul@40 | 457 | display information is transmitted more or less centred within the active
|
paul@40 | 458 | video period since the ULA will only be providing pixel data for 40µs in each
|
paul@40 | 459 | scanline.
|
paul@39 | 460 |
|
paul@39 | 461 | Each 62.5ns cycle happens to correspond to 64µs divided by 1024, meaning that
|
paul@39 | 462 | each scanline can be divided into 1024 cycles, although only 640 at most are
|
paul@40 | 463 | actively used to provide pixel data. Pixel data production should only occur
|
paul@40 | 464 | within a certain period on each scanline, approximately 262 cycles after the
|
paul@40 | 465 | start of hsync:
|
paul@40 | 466 |
|
paul@40 | 467 | active video period = 51.95µs
|
paul@40 | 468 | pixel data period = 40µs
|
paul@40 | 469 | total silent period = 51.95µs - 40µs = 11.95µs
|
paul@40 | 470 | silent periods (before and after) = 11.95µs / 2 = 5.975µs
|
paul@40 | 471 | hsync and back porch period = 4.7µs + 5.7µs = 10.4µs
|
paul@40 | 472 | time before pixel data period = 10.4µs + 5.975µs = 16.375µs
|
paul@40 | 473 | pixel data period start cycle = 16.375µs / 62.5ns = 262
|
paul@40 | 474 |
|
paul@40 | 475 | By choosing a number divisible by 8, the RAM access mechanism can be
|
paul@84 | 476 | synchronised with the pixel production. Thus, 256 is a more appropriate start
|
paul@84 | 477 | cycle, where the HS (horizontal sync) signal corresponding to the 4µs sync
|
paul@84 | 478 | pulse (or "normal sync" pulse as described by the "PAL TV timing and voltages"
|
paul@84 | 479 | document) occurs at cycle 0.
|
paul@84 | 480 |
|
paul@84 | 481 | To summarise:
|
paul@84 | 482 |
|
paul@84 | 483 | HS signal starts at cycle 0 on each horizontal scanline
|
paul@84 | 484 | HS signal ends approximately 4µs later at cycle 64
|
paul@84 | 485 | Pixel data starts approximately 12µs later at cycle 256
|
paul@84 | 486 |
|
paul@84 | 487 | "Re: Electron Memory Contention" provides measurements that appear consistent
|
paul@84 | 488 | with these calculations.
|
paul@40 | 489 |
|
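The horizontal timing calculation can be reproduced as follows, working in
nanoseconds to avoid rounding artefacts. The names are illustrative, and
rounding down to a multiple of 8 is merely one way of arriving at the start
cycle of 256 chosen above.

```python
CYCLE_NS = 62.5            # one 16MHz cycle

LINE_NS = 64000            # scanline period
SYNC_NS = 4700             # sync pulse
BACK_PORCH_NS = 5700       # including the colour burst
FRONT_PORCH_NS = 1650
PIXEL_NS = 40000           # period in which the ULA provides pixel data

active_ns = LINE_NS - (SYNC_NS + BACK_PORCH_NS + FRONT_PORCH_NS)
silent_each_side_ns = (active_ns - PIXEL_NS) / 2
start_ns = SYNC_NS + BACK_PORCH_NS + silent_each_side_ns

start_cycle = start_ns / CYCLE_NS
aligned_start_cycle = int(start_cycle) // 8 * 8   # assumed rounding rule
hs_end_cycle = 4000 / CYCLE_NS                    # 4 microsecond HS pulse

print(active_ns, start_ns, start_cycle, aligned_start_cycle, hs_end_cycle)
```
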
paul@40 | 490 | The "vertical blanking period", meaning the period before picture information
|
paul@78 | 491 | in each field, is 25 lines out of 312 (or 313) and thus lasts for 1.6ms. Of
|
paul@78 | 492 | this, 2.5 lines occur before the vsync (field sync) which also lasts for 2.5
|
paul@78 | 493 | lines. Thus, the first visible scanline on the first field of a frame occurs
|
paul@84 | 494 | half way through the 23rd scanline period measured from the start of vsync
|
paul@84 | 495 | (indicated by "V" in the diagrams below):
|
paul@40 | 496 |
|
paul@40 | 497 | 10 20 23
|
paul@40 | 498 | Line in frame: 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
|
paul@40 | 499 | Line from 1: 0 22 3
|
paul@40 | 500 | Line on screen: .:::::VVVVV::::: 12233445566
|
paul@40 | 501 | |_________________________________________________|
|
paul@40 | 502 | 25 line vertical blanking period
|
paul@40 | 503 |
|
paul@40 | 504 | In the second field of a frame, the first visible scanline coincides with the
|
paul@40 | 505 | 24th scanline period measured from the start of line 313 in the frame:
|
paul@40 | 506 |
|
paul@40 | 507 | 310 336
|
paul@40 | 508 | Line in frame: 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
|
paul@78 | 509 | Line from 313: 0 23 4
|
paul@40 | 510 | Line on screen: 88:::::VVVVV:::: 11223344
|
paul@40 | 511 | 288 | |
|
paul@40 | 512 | |_________________________________________________|
|
paul@40 | 513 | 25 line vertical blanking period
|
paul@40 | 514 |
|
paul@40 | 515 | In order to consider only full lines, we might consider the start of each
|
paul@40 | 516 | frame to occur 23 lines after the start of vsync.
|
paul@40 | 517 |
|
paul@40 | 518 | Again, it is likely that pixel data production should only occur on scanlines
|
paul@40 | 519 | within a certain period on each frame. The "625/50" document indicates that
|
paul@40 | 520 | only a certain region is "safe" to use, suggesting a vertically centred region
|
paul@84 | 521 | with approximately 15 blank lines above and below the picture. However, the
|
paul@84 | 522 | "PAL TV timing and voltages" document suggests 28 blank lines above and below
|
paul@84 | 523 | the picture. This would centre the 256 lines within the 312 lines of each
|
paul@84 | 524 | field and thus provide a start of picture approximately 5.5 or 5 lines after
|
paul@84 | 525 | the end of the blanking period or 28 or 27.5 lines after the start of vsync.
|
paul@84 | 526 |
|
paul@84 | 527 | To summarise:
|
paul@84 | 528 |
|
paul@84 | 529 | CSYNC signal starts at cycle 0
|
paul@84 | 530 | CSYNC signal ends approximately 160µs (2.5 lines) later at cycle 2560
|
paul@84 | 531 | Start of line occurs approximately 1632µs (25.5 lines) later at cycle 28672
|
paul@40 | 532 |
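The cycle numbers in the vertical summary follow from the 1024 cycles per
scanline established earlier. A minimal sketch, with illustrative names and
with the 28-line picture start taken from the centring argument above:

```python
CYCLES_PER_LINE = 1024   # 16MHz cycles per 64µs scanline
LINE_US = 64

def lines_to_cycles(lines):
    """Convert a (possibly fractional) number of scanlines to 16MHz cycles."""
    return int(lines * CYCLES_PER_LINE)

csync_end = lines_to_cycles(2.5)     # vsync lasts 2.5 lines
picture_start = lines_to_cycles(28)  # picture assumed 28 lines after vsync start

# Time between the end of the sync signal and the start of the picture.
gap_us = (picture_start - csync_end) * LINE_US / CYCLES_PER_LINE
print(csync_end, picture_start, gap_us)
```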
|
paul@57 | 533 | See: http://en.wikipedia.org/wiki/PAL
|
paul@57 | 534 | See: http://en.wikipedia.org/wiki/Analog_television#Structure_of_a_video_signal
|
paul@57 | 535 | See: The 625/50 PAL Video Signal and TV Compatible Graphics Modes
|
paul@57 | 536 | http://lipas.uwasa.fi/~f76998/video/modes/
|
paul@57 | 537 | See: PAL TV timing and voltages
|
paul@57 | 538 | http://www.retroleum.co.uk/electronics-articles/pal-tv-timing-and-voltages/
|
paul@57 | 539 | See: Line Standards
|
paul@57 | 540 | http://www.pembers.freeserve.co.uk/World-TV-Standards/Line-Standards.html
|
paul@84 | 541 | See: Horizontal Blanking Interval of 405-, 525-, 625- and 819-Line Standards
|
paul@84 | 542 | http://www.pembers.freeserve.co.uk/World-TV-Standards/HBI.pdf
|
paul@84 | 543 | See: Re: Electron Memory Contention
|
paul@84 | 544 | http://www.stardot.org.uk/forums/viewtopic.php?p=134109#p134109
|
paul@57 | 545 |
|
paul@56 | 546 | RAM Integrated Circuits
|
paul@56 | 547 | -----------------------
|
paul@56 | 548 |
|
paul@65 | 549 | Unicorn Electronics appears to offer 4164 RAM chips (as well as 6502 series
|
paul@65 | 550 | CPUs such as the 6502, 6502A, 6502B and 65C02). These 4164 devices are
|
paul@65 | 551 | available in 100ns (4164-100), 120ns (4164-120) and 150ns (4164-150) variants,
|
paul@73 | 552 | have 16 pins and address 65536 bits through a 1-bit wide channel. Similarly,
|
paul@73 | 553 | ByteDelight.com sell 4164 devices primarily for the ZX Spectrum.
|
paul@65 | 554 |
|
paul@56 | 555 | The documentation for the Electron mentions 4164-15 RAM chips for IC4-7, and
|
paul@64 | 556 | the Samsung-produced KM4164 series is apparently equivalent to the Texas
|
paul@56 | 557 | Instruments 4164 chips presumably used in the Electron.
|
paul@56 | 558 |
|
paul@56 | 559 | The TM4164EC4 series combines 4 64K x 1b units into a single package and
|
paul@57 | 560 | appears similar to the TM4164EA4 featured on the Electron's circuit diagram
|
paul@57 | 561 | (in the Advanced User Guide but not the Service Manual), and it also has 22
|
paul@56 | 562 | pins providing 3 additional inputs and 3 additional outputs over the 16 pins
|
paul@57 | 563 | of the individual 4164-15 modules, presumably allowing concurrent access to
|
paul@57 | 564 | the packaged memory units.
|
paul@56 | 565 |
|
paul@56 | 566 | As far as currently available replacements are concerned, the NTE4164 is a
|
paul@57 | 567 | potential candidate: according to the Vetco Electronics entry, it is
|
paul@57 | 568 | supposedly a replacement for the TMS4164-15 amongst many other parts. Similar
|
paul@57 | 569 | parts include the NTE2164 and the NTE6664, both of which appear to have
|
paul@57 | 570 | largely the same performance and connection characteristics. Meanwhile, the
|
paul@58 | 571 | NTE21256 appears to be a 16-pin replacement with four times the capacity that
|
paul@58 | 572 | maintains the single data input and output pins. Using the NTE21256 as a
|
paul@57 | 573 | replacement for all ICs combined would be difficult because of the single bit
|
paul@57 | 574 | output.
|
paul@56 | 575 |
|
paul@57 | 576 | Another device equivalent to the 4164-15 appears to be available under the
|
paul@57 | 577 | code 41662 from Jameco Electronics as the Siemens HYB 4164-2. The Jameco Web
|
paul@57 | 578 | site lists data sheets for other devices on the same page, but these are
|
paul@57 | 579 | different and actually appear to be provided under the 41574 product code (but
|
paul@57 | 580 | are listed under 41464-10) and appear to be replacements for the TM4164EC4:
|
paul@57 | 581 | the Samsung KM41464A-15 and NEC µPD41464 employ 18 pins, eliminating 4 pins by
|
paul@57 | 582 | employing 4 pins for both input and output.
|
paul@57 | 583 |
|
paul@64 | 584 | Pins I/O pins Row access Column access
|
paul@64 | 585 | ---- -------- ---------- -------------
|
paul@64 | 586 | TM4164EC4 22 4 + 4 150ns (15) 90ns (15)
|
paul@64 | 587 | KM41464AP 18 4 150ns (15) 75ns (15)
|
paul@64 | 588 | NTE21256 16 1 + 1 150ns 75ns
|
paul@64 | 589 | HYB 4164-2 16 1 + 1 150ns 100ns
|
paul@64 | 590 | µPD41464 18 4 120ns (12) 60ns (12)
|
paul@64 | 591 |
|
paul@40 | 592 | See: TM4164EC4 65,536 by 4-Bit Dynamic RAM Module
|
paul@136 | 593 | https://www.rocelec.com/part/REITM4164EC4-15L
|
paul@65 | 594 | See: Dynamic RAMS
|
paul@65 | 595 | http://www.unicornelectronics.com/IC/DYNAMIC.html
|
paul@73 | 596 | See: New old stock 8x 4164 chips
|
paul@73 | 597 | http://www.bytedelight.com/?product=8x-4164-chips-new-old-stock
|
paul@56 | 598 | See: KM4164B 64K x 1 Bit Dynamic RAM with Page Mode
|
paul@56 | 599 | http://images.ihscontent.net/vipimages/VipMasterIC/IC/SAMS/SAMSD020/SAMSD020-45.pdf
|
paul@57 | 600 | See: NTE2164 Integrated Circuit 65,536 X 1 Bit Dynamic Random Access Memory
|
paul@57 | 601 | http://www.vetco.net/catalog/product_info.php?products_id=2806
|
paul@56 | 602 | See: NTE4164 - IC-NMOS 64K DRAM 150NS
|
paul@56 | 603 | http://www.vetco.net/catalog/product_info.php?products_id=3680
|
paul@56 | 604 | See: NTE21256 - IC-256K DRAM 150NS
|
paul@56 | 605 | http://www.vetco.net/catalog/product_info.php?products_id=2799
|
paul@56 | 606 | See: NTE21256 262,144-Bit Dynamic Random Access Memory (DRAM)
|
paul@56 | 607 | http://www.nteinc.com/specs/21000to21999/pdf/nte21256.pdf
|
paul@57 | 608 | See: NTE6664 - IC-MOS 64K DRAM 150NS
|
paul@57 | 609 | http://www.vetco.net/catalog/product_info.php?products_id=5213
|
paul@57 | 610 | See: NTE6664 Integrated Circuit 64K-Bit Dynamic RAM
|
paul@57 | 611 | http://www.nteinc.com/specs/6600to6699/pdf/nte6664.pdf
|
paul@57 | 612 | See: 4164-150: MAJOR BRANDS
|
paul@57 | 613 | http://www.jameco.com/webapp/wcs/stores/servlet/Product_10001_10001_41662_-1
|
paul@57 | 614 | See: HYB 4164-1, HYB 4164-2, HYB 4164-3 65,536-Bit Dynamic Random Access Memory (RAM)
|
paul@57 | 615 | http://www.jameco.com/Jameco/Products/ProdDS/41662SIEMENS.pdf
|
paul@57 | 616 | See: KM41464A NMOS DRAM 64K x 4 Bit Dynamic RAM with Page Mode
|
paul@57 | 617 | http://www.jameco.com/Jameco/Products/ProdDS/41662SAM.pdf
|
paul@57 | 618 | See: NEC µ41464 65,536 x 4-Bit Dynamic NMOS RAM
|
paul@57 | 619 | http://www.jameco.com/Jameco/Products/ProdDS/41662NEC.pdf
|
paul@57 | 620 | See: 41464-10: MAJOR BRANDS
|
paul@57 | 621 | http://www.jameco.com/webapp/wcs/stores/servlet/Product_10001_10001_41574_-1
|
paul@39 | 622 |
|
paul@43 | 623 | Interrupts
|
paul@43 | 624 | ----------
|
paul@43 | 625 |
|
paul@43 | 626 | The ULA generates IRQs (maskable interrupts) according to certain conditions
|
paul@43 | 627 | and these conditions are controlled by location &FE00:
|
paul@43 | 628 |
|
paul@43 | 629 | * Vertical sync (bottom of displayed screen)
|
paul@43 | 630 | * 50Hz real time clock
|
paul@43 | 631 | * Transmit data empty
|
paul@43 | 632 | * Receive data full
|
paul@43 | 633 | * High tone detect
|
paul@43 | 634 |
|
paul@43 | 635 | The ULA is also used to clear interrupt conditions through location &FE05. Of
|
paul@43 | 636 | particular significance is bit 7, which must be set if an NMI (non-maskable
|
paul@43 | 637 | interrupt) has occurred and has thus suspended ULA access to memory, restoring
|
paul@43 | 638 | the normal function of the ULA.
|
paul@43 | 639 |
|
paul@43 | 640 | ROM Paging
|
paul@43 | 641 | ----------
|
paul@43 | 642 |
|
paul@43 | 643 | Accessing different ROMs involves bits 0 to 3 of &FE05. Some special ROM
|
paul@43 | 644 | mappings exist:
|
paul@43 | 645 |
|
paul@43 | 646 | 8 keyboard
|
paul@43 | 647 | 9 keyboard (duplicate)
|
paul@43 | 648 | 10 BASIC ROM
|
paul@43 | 649 | 11 BASIC ROM (duplicate)
|
paul@43 | 650 |
|
paul@43 | 651 | Paging in a ROM involves the following procedure:
|
paul@43 | 652 |
|
paul@43 | 653 | 1. Assert ROM page enable (bit 3) together with a ROM number n in bits 0 to
|
paul@43 | 654 | 2, corresponding to ROM number 8+n, such that one of ROMs 12 to 15 is
|
paul@43 | 655 | selected.
|
paul@43 | 656 | 2. Where a ROM numbered from 0 to 7 is to be selected, set bit 3 to zero
|
paul@43 | 657 | whilst writing the desired ROM number n in bits 0 to 2.
|
paul@43 | 658 |
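The procedure above can be sketched as a function producing the sequence of
values to write to the low four bits of &FE05. This is only an illustration:
the function name is hypothetical, the choice of ROM 12 as the intermediate
selection is arbitrary, and the assumption that ROMs 8 to 15 can be selected
with a single write is not stated in the procedure itself.

```python
ROM_PAGE_ENABLE = 0x08   # bit 3 of &FE05

def paging_writes(rom, intermediate=12):
    """Return the values to write to bits 0 to 3 of &FE05 to page in a ROM."""
    if not 0 <= rom <= 15:
        raise ValueError("ROM number must be in the range 0 to 15")
    if not 12 <= intermediate <= 15:
        raise ValueError("intermediate ROM must be one of 12 to 15")
    if rom & ROM_PAGE_ENABLE:
        # Assumption: ROMs 8 to 15 are selected directly with bit 3 set.
        return [rom]
    # Step 1: select one of ROMs 12 to 15 with the page enable bit set.
    # Step 2: write the desired ROM number with bit 3 clear.
    return [intermediate, rom]

print(paging_writes(3))
print(paging_writes(10))
```

Only the low nibble is modelled here; the upper bits of &FE05 are used for
clearing interrupt conditions and are left at zero in the returned values.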
|
paul@81 | 659 | See: http://stardot.org.uk/forums/viewtopic.php?p=136686#p136686
|
paul@81 | 660 |
|
paul@117 | 661 | Keyboard Access
|
paul@117 | 662 | ---------------
|
paul@117 | 663 |
|
paul@117 | 664 | The keyboard pages appear to be accessed at 1MHz just like the RAM.
|
paul@117 | 665 |
|
paul@117 | 666 | See: https://stardot.org.uk/forums/viewtopic.php?p=254155#p254155
|
paul@117 | 667 |
|
paul@37 | 668 | Shadow/Expanded Memory
|
paul@37 | 669 | ----------------------
|
paul@37 | 670 |
|
paul@37 | 671 | The Electron exposes all sixteen address lines and all eight data lines
|
paul@37 | 672 | through the expansion bus. Using such lines, it is possible to provide
|
paul@37 | 673 | additional memory - typically sideways ROM and RAM - on expansion cards and
|
paul@37 | 674 | through cartridges, although the official cartridge specification provides
|
paul@37 | 675 | fewer address lines and only seeks to provide access to memory in 16K units.
|
paul@37 | 676 |
|
paul@37 | 677 | Various modifications and upgrades were developed to offer "turbo"
|
paul@37 | 678 | capabilities to the Electron, permitting the CPU to access a separate 8K of
|
paul@37 | 679 | RAM at 2MHz, presumably preventing access to the low 8K of RAM accessible via
|
paul@37 | 680 | the ULA through additional logic. However, an enhanced ULA might support
|
paul@37 | 681 | independent CPU access to memory over the expansion bus by allowing itself to
|
paul@37 | 682 | be discharged from providing access to memory, potentially for a range of
|
paul@37 | 683 | addresses, and for the CPU to communicate with external memory uninterrupted.
|
paul@33 | 684 |
|
paul@72 | 685 | Sideways RAM/ROM and Upper Memory Access
|
paul@72 | 686 | ----------------------------------------
|
paul@72 | 687 |
|
paul@72 | 688 | Although the ULA controls the CPU clock, effectively slowing or stopping the
|
paul@72 | 689 | CPU when the ULA needs to access screen memory, it is apparently able to allow
|
paul@72 | 690 | the CPU to access addresses of &8000 and above - the upper region of memory -
|
paul@72 | 691 | at 2MHz independently of any access to RAM that the ULA might be performing,
|
paul@72 | 692 | only blocking the CPU if it attempts to access addresses of &7FFF and below
|
paul@72 | 693 | during any ULA memory access - the lower region of memory - by stopping or
|
paul@72 | 694 | stalling its clock.
|
paul@72 | 695 |
|
paul@72 | 696 | Thus, the ULA remains aware of the level of the A15 line, only inhibiting the
|
paul@72 | 697 | CPU clock if the line goes low, when the CPU is attempting to access the lower
|
paul@72 | 698 | region of memory.
|
paul@72 | 699 |
|
paul@79 | 700 | Hardware Scrolling (and Enhancement)
|
paul@79 | 701 | ------------------------------------
|
paul@0 | 702 |
|
paul@0 | 703 | On the standard ULA, &FE02 and &FE03 map to an address with 9 significant bits,
|
paul@0 | 704 | the least significant 6 bits being zero, thus limiting the scrolling
|
paul@0 | 705 | resolution to 64 bytes. An enhanced ULA could support a resolution of 2 bytes
|
paul@0 | 706 | using the same layout of these addresses.
|
paul@0 | 707 |
|
paul@0 | 708 | |--&FE02--------------| |--&FE03--------------|
|
paul@0 | 709 | XX XX 14 13 12 11 10 09 08 07 06 XX XX XX XX XX
|
paul@0 | 710 |
|
paul@0 | 711 | XX 14 13 12 11 10 09 08 07 06 05 04 03 02 01 XX
|
paul@0 | 712 |
|
paul@4 | 713 | Arguably, a resolution of 8 bytes is more useful, since the mapping of screen
|
paul@4 | 714 | memory to pixel locations is character oriented. A change in 8 bytes would
|
paul@4 | 715 | permit a horizontal scrolling resolution of 2 pixels in MODE 2, 4 pixels in
|
paul@4 | 716 | MODE 1 and 5, and 8 pixels in MODE 0, 3, 4 and 6. This resolution is actually
|
paul@4 | 717 | observed on the BBC Micro (see 18.11.2 in the BBC Microcomputer Advanced User
|
paul@4 | 718 | Guide).
|
paul@4 | 719 |
|
paul@4 | 720 | One argument for a 2 byte resolution is smooth vertical scrolling. A pitfall
|
paul@4 | 721 | of changing the screen address by 2 bytes is the change in the number of lines
|
paul@4 | 722 | from the initial and final character rows that need reading by the ULA, which
|
paul@9 | 723 | would need to maintain this state information (although this is a relatively
|
paul@9 | 724 | trivial change). Another pitfall is the complication that might be introduced
|
paul@9 | 725 | to software writing bitmaps of character height to the screen.
|
paul@4 | 726 |
|
paul@81 | 727 | See: http://pastraiser.com/computers/acornelectron/acornelectron.html
|
paul@81 | 728 |
|
paul@82 | 729 | Enhancement: Mode Layouts
|
paul@82 | 730 | -------------------------
|
paul@82 | 731 |
|
paul@82 | 732 | Merely changing the screen memory mappings in order to have Archimedes-style
|
paul@82 | 733 | row-oriented screen addresses (instead of character-oriented addresses) could
|
paul@82 | 734 | be done for the existing modes, but this might not be sufficiently beneficial,
|
paul@82 | 735 | especially since accessing regions of the screen would involve incrementing
|
paul@82 | 736 | pointers by amounts that are inconvenient on an 8-bit CPU.
|
paul@82 | 737 |
|
paul@82 | 738 | However, instead of using an Archimedes-style mapping, column-oriented screen
|
paul@82 | 739 | addresses could be more feasibly employed: incrementing the address would
|
paul@82 | 740 | reference the vertical screen location below the currently-referenced location
|
paul@82 | 741 | (just as occurs within characters using the existing ULA); instead of
|
paul@82 | 742 | returning to the top of the character row and referencing the next horizontal
|
paul@82 | 743 | location after eight bytes, the address would reference the next character row
|
paul@82 | 744 | and continue to reference locations downwards over the height of the screen
|
paul@82 | 745 | until reaching the bottom; at the bottom, the next location would be the next
|
paul@82 | 746 | horizontal location at the top of the screen.
|
paul@82 | 747 |
|
paul@82 | 748 | In other words, the memory layout for the screen would resemble the following
|
paul@82 | 749 | (for MODE 2):
|
paul@82 | 750 |
|
paul@82 | 751 | &3000 &3100 ... &7F00
|
paul@82 | 752 | &3001 &3101
|
paul@82 | 753 | ... ...
|
paul@82 | 754 | &3007
|
paul@82 | 755 | &3008
|
paul@82 | 756 | ...
|
paul@82 | 757 | ... ...
|
paul@82 | 758 | &30FF ... &7FFF
|
paul@82 | 759 |
|
paul@82 | 760 | Since there are 256 pixel rows, each column of locations would be addressable
|
paul@82 | 761 | using the low byte of the address. Meanwhile, the high byte would be
|
paul@82 | 762 | incremented to address different columns. Thus, addressing screen locations
|
paul@82 | 763 | would become a lot more convenient and potentially much more efficient for
|
paul@82 | 764 | certain kinds of graphical output.
|
paul@82 | 765 |
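To illustrate the difference in address calculation, the following sketch
compares the proposed column-oriented layout with the existing
character-oriented layout for MODE 2 (80 byte columns, 256 pixel rows, screen
memory starting at &3000). The function names are illustrative only.

```python
SCREEN_BASE = 0x3000   # MODE 2 screen start
BYTES_PER_LINE = 80    # byte columns across the screen in MODE 2

def column_oriented(col, row, base=SCREEN_BASE):
    """Proposed layout: the high byte selects the column, the low byte the row."""
    return base + (col << 8) + row

def character_oriented(col, row, base=SCREEN_BASE,
                       bytes_per_line=BYTES_PER_LINE):
    """Existing layout: 8 bytes per character cell, cells laid out in rows."""
    char_row, line_in_char = divmod(row, 8)
    return base + char_row * bytes_per_line * 8 + col * 8 + line_in_char

for col, row in [(0, 0), (0, 8), (1, 0), (79, 255)]:
    print(col, row,
          hex(column_oriented(col, row)),
          hex(character_oriented(col, row)))
```

In the column-oriented case no multiplication or division is needed, which
is the convenience for an 8-bit CPU that the text describes.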
|
paul@82 | 766 | One potential complication with this simplified addressing scheme arises with
|
paul@82 | 767 | hardware scrolling. Vertical hardware scrolling by one pixel row (not supported
|
paul@82 | 768 | with the existing ULA) would be achieved by incrementing or decrementing the
|
paul@82 | 769 | screen start address; by one character row, it would involve adding or
|
paul@82 | 770 | subtracting 8. However, the ULA only supports multiples of 64 when changing the
|
paul@82 | 771 | screen start address. Thus, if such a scheme were to be adopted, three
|
paul@82 | 772 | additional bits would need to be supported in the screen start register (see
|
paul@82 | 773 | "Hardware Scrolling (and Enhancement)" for more details). However, horizontal
|
paul@82 | 774 | scrolling would be much improved even under the severe constraints of the
|
paul@82 | 775 | existing ULA: only adjustments of 256 to the screen start address would be
|
paul@82 | 776 | required to produce single-location scrolling of as few as two pixels in MODE 2
|
paul@82 | 777 | (four pixels in MODEs 1 and 5, eight pixels otherwise).
|
paul@82 | 778 |
|
paul@82 | 779 | More disruptive is the effect of this alternative layout on software.
|
paul@82 | 780 | Presumably, compatibility with the BBC Micro was the primary goal of the
|
paul@82 | 781 | Electron's hardware design. With the character-oriented screen layout in
|
paul@82 | 782 | place, system software (and application software accessing the screen
|
paul@82 | 783 | directly) would be relying on this layout to run on the Electron with little
|
paul@82 | 784 | or no modification. Although it might have been possible to change the system
|
paul@82 | 785 | software to use this column-oriented layout instead, this would have incurred
|
paul@82 | 786 | a development cost and caused additional work porting things like games to the
|
paul@82 | 787 | Electron. Moreover, a separate branch of the software from that supporting the
|
paul@82 | 788 | BBC Micro and closer derivatives would then have needed maintaining.
|
paul@82 | 789 |
|
paul@82 | 790 | The decision to use the character-oriented layout in the BBC Micro may have
|
paul@82 | 791 | been related to the choice of circuitry and to facilitate a convenient
|
paul@82 | 792 | hardware implementation, and by the time the Electron was planned, it was too
|
paul@82 | 793 | late to do anything about this somewhat unfortunate choice.
|
paul@82 | 794 |
|
paul@89 | 795 | Pixel Layouts
|
paul@89 | 796 | -------------
|
paul@89 | 797 |
|
paul@89 | 798 | The pixel layouts are as follows:
|
paul@89 | 799 |
|
paul@89 | 800 | Modes Depth (bpp) Pixels (from bits)
|
paul@89 | 801 | ----- ----------- ------------------
|
paul@89 | 802 | 0, 3, 4, 6 1 7 6 5 4 3 2 1 0
|
paul@89 | 803 | 1, 5 2 73 62 51 40
|
paul@89 | 804 | 2 4 7531 6420
|
paul@89 | 805 |
|
paul@89 | 806 | Since the ULA reads a half-byte at a time, one might expect it to attempt to
|
paul@89 | 807 | produce pixels for every half-byte, as opposed to handling entire bytes.
|
paul@89 | 808 | However, the pixel layout is not conducive to producing pixels as soon as a
|
paul@89 | 809 | half-byte has been read for a given full-byte location: in 1bpp modes the
|
paul@89 | 810 | first four pixels can indeed be produced, but in 2bpp and 4bpp modes the pixel
|
paul@89 | 811 | data is spread across the entire byte in different ways.
|
paul@89 | 812 |
|
paul@89 | 813 | An alternative arrangement might be as follows:
|
paul@89 | 814 |
|
paul@89 | 815 | Modes Depth (bpp) Pixels (from bits)
|
paul@89 | 816 | ----- ----------- ------------------
|
paul@89 | 817 | 0, 3, 4, 6 1 7 6 5 4 3 2 1 0
|
paul@89 | 818 | 1, 5 2 76 54 32 10
|
paul@89 | 819 | 2 4 7654 3210
|
paul@89 | 820 |
|
paul@89 | 821 | Just as the mode layouts were presumably decided by compatibility with the BBC
|
paul@89 | 822 | Micro, the pixel layouts will have been maintained for similar reasons.
|
paul@89 | 823 | Unfortunately, this layout prevents any optimisation of the ULA for handling
|
paul@89 | 824 | half-byte pixel data generally.
|
paul@89 | 825 |
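The two arrangements can be compared with a small sketch that extracts pixel
values from a screen byte. The function names are illustrative; the bit
positions follow the two tables above, with the leftmost pixel listed first.

```python
def pixels_existing(byte, depth):
    """Decode pixels using the existing interleaved layout."""
    count = 8 // depth                 # pixels per byte
    pixels = []
    for i in range(count):
        value = 0
        for k in range(depth):
            # Bits of one pixel are spaced 'count' positions apart.
            bit = 7 - i - k * count
            value = (value << 1) | ((byte >> bit) & 1)
        pixels.append(value)
    return pixels

def pixels_alternative(byte, depth):
    """Decode pixels using the alternative layout with contiguous bits."""
    count = 8 // depth
    mask = (1 << depth) - 1
    return [(byte >> (8 - depth * (i + 1))) & mask for i in range(count)]

print(pixels_existing(0xAA, 4), pixels_alternative(0xF0, 4))
print(pixels_existing(0x88, 2), pixels_alternative(0xC0, 2))
```

With the existing layout, every pixel in a 2bpp or 4bpp byte depends on bits
from both half-bytes, whereas the alternative layout allows the leading
pixels to be produced from the upper half-byte alone.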
|
paul@79 | 826 | Enhancement: The Missing MODE 4
|
paul@79 | 827 | -------------------------------
|
paul@79 | 828 |
|
paul@79 | 829 | The Electron inherits its screen mode selection from the BBC Micro, where MODE
|
paul@79 | 830 | 3 is a text version of MODE 0, and where MODE 6 is a text version of MODE 4.
|
paul@79 | 831 | Neither MODE 3 nor MODE 6 is a genuine character-based text mode like MODE 7,
|
paul@79 | 832 | however, and they are merely implemented by skipping two scanlines in every
|
paul@79 | 833 | ten after the eight required to produce a character line. Thus, such modes
|
paul@79 | 834 | provide a 24-row display.
|
paul@79 | 835 |
|
paul@79 | 836 | In principle, nothing prevents this "text mode" effect being applied to other
|
paul@79 | 837 | modes. The 20-column modes are not well-suited to displaying text, which
|
paul@79 | 838 | leaves MODE 1 which, unlike MODEs 3 and 6, can display 4 colours rather than
|
paul@79 | 839 | 2. Although the need for a non-monochrome 40-column text mode is addressed by
|
paul@79 | 840 | MODE 7 on the BBC Micro, the Electron lacks such a mode.
|
paul@79 | 841 |
|
paul@79 | 842 | If the 4-colour, 24-row variant of MODE 1 were to be provided, logically it
|
paul@79 | 843 | would occupy MODE 4 instead of the current MODE 4:
|
paul@79 | 844 |
|
paul@79 | 845 | Screen mode Size (kilobytes) Colours Rows Resolution
|
paul@79 | 846 | ----------- ---------------- ------- ---- ----------
|
paul@79 | 847 | 0 20 2 32 640x256
|
paul@79 | 848 | 1 20 4 32 320x256
|
paul@79 | 849 | 2 20 16 32 160x256
|
paul@79 | 850 | 3 16 2 24 640x256
|
paul@79 | 851 | 4 (new) 16 4 24 320x256
|
paul@79 | 852 | 4 (old) 10 2 32 320x256
|
paul@79 | 853 | 5 10 4 32 160x256
|
paul@79 | 854 | 6 8 2 24 320x256
|
paul@79 | 855 |
|
paul@79 | 856 | Thus, for increasing mode numbers, the size of each mode would be the same or
|
paul@79 | 857 | less than the preceding mode.
|
paul@79 | 858 |
|
paul@128 | 859 | Enhancement: Display Mode Property Control
|
paul@128 | 860 | ------------------------------------------
|
paul@128 | 861 |
|
paul@128 | 862 | It is rather curious that the ULA supports the mode numbers directly in bits 3
|
paul@128 | 863 | to 5 of &FE07 since these would presumably need to be decoded in order to set
|
paul@128 | 864 | the fundamental properties of the display mode. These properties are as
|
paul@128 | 865 | follows:
|
paul@128 | 866 |
|
paul@128 | 867 | * Screen data retrieval rate: number of fetches per pair of 2MHz cycles
|
paul@128 | 868 | * Pixel colour depth
|
paul@128 | 869 | * Text mode vertical spacing
|
paul@128 | 870 |
|
paul@128 | 871 | From these, the following properties emerge:
|
paul@128 | 872 |
|
paul@129 | 873 | Property Influences
|
paul@129 | 874 | -------- ----------
|
paul@129 | 875 | Character row size (bytes) Retrieval rate
|
paul@129 | 876 |
|
paul@129 | 877 | Number of character rows Text mode setting
|
paul@129 | 878 |
|
paul@129 | 879 | Display size (bytes) Retrieval rate (character row size)
|
paul@129 | 880 | Text mode setting (number of rows)
|
paul@129 | 881 |
|
paul@129 | 882 | Pixel frequency Retrieval rate
|
paul@129 | 883 | Horizontal resolution (pixels) Colour depth
|
paul@128 | 884 |
|
paul@128 | 885 | One can imagine a register bitfield arrangement as follows:
|
paul@128 | 886 |
|
paul@129 | 887 | Field Values Formula
|
paul@129 | 888 | ----- ------ -------
|
paul@129 | 889 | Pixel depth 00: 1 bit per pixel log2(depth)
|
paul@129 | 890 | 01: 2 bits per pixel
|
paul@129 | 891 | 10: 4 bits per pixel
|
paul@129 | 892 |
|
paul@129 | 893 | Retrieval rate 0: twice 2 - fetches per cycle pair
|
paul@129 | 894 | 1: once
|
paul@129 | 895 |
|
paul@129 | 896 | Text mode enable 0: disable/off text mode enabled
|
paul@129 | 897 | 1: enable/on
|
paul@128 | 898 |
|
paul@128 | 899 | This arrangement would require four bits. However, one bit in &FE07 is
|
paul@128 | 900 | seemingly inactive and might possibly be reallocated.
|
paul@128 | 901 |
|
paul@128 | 902 | The resulting combination of properties would permit all of the existing modes
|
paul@128 | 903 | plus some additional ones, including the missing MODE 4 mentioned above. With
|
paul@128 | 904 | the bitfields above ordered from the most significant bits to the least
|
paul@128 | 905 | significant bits providing the low-level "mode" values, the following table
|
paul@128 | 906 | can be produced:
|
paul@128 | 907 |
|
paul@128 | 908 | Screen mode Depth Rate Text Size (K) Colours Rows Resolution
|
paul@128 | 909 | ----------- ----- ---- ---- -------- ------- ---- ----------
|
paul@128 | 910 | 0 (0000) 1 twice off 20 2 32 640x256 (MODE 0)
|
paul@128 | 911 | 1 (0001) 1 twice on 16 2 24 640x256 (MODE 3)
|
paul@128 | 912 | 2 (0010) 1 once off 10 2 32 320x256 (MODE 4)
|
paul@128 | 913 | 3 (0011) 1 once on 8 2 24 320x256 (MODE 6)
|
paul@128 | 914 | 4 (0100) 2 twice off 20 4 32 320x256 (MODE 1)
|
paul@128 | 915 | 5 (0101) 2 twice on 16 4 24 320x256
|
paul@128 | 916 | 6 (0110) 2 once off 10 4 32 160x256 (MODE 5)
|
paul@128 | 917 | 7 (0111) 2 once on 8 4 24 160x256
|
paul@128 | 918 | 8 (1000) 4 twice off 20 16 32 160x256 (MODE 2)
|
paul@128 | 919 | 9 (1001) 4 twice on 16 16 24 160x256
|
paul@128 | 920 | 10 (1010) 4 once off 10 16 32 80x256
|
paul@128 | 921 | 11 (1011) 4 once on 8 16 24 80x256
|
paul@128 | 922 |
|
paul@128 | 923 | The existing modes would be covered in a way that is incompatible with the
|
paul@128 | 924 | existing numbering, thus requiring a table in software, but additional text
|
paul@128 | 925 | modes would be provided for MODE 1, MODE 5 and MODE 2. An additional two lower
|
paul@128 | 926 | resolution modes would also be conceivable within this scheme, requiring the
|
paul@128 | 927 | stretching of 16MHz pixels by a factor of eight to yield 80 pixels per
|
paul@128 | 928 | scanline. The utility of such modes is questionable and such modes might not
|
paul@128 | 929 | be supported.
|
paul@128 | 930 |
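The table above can be derived mechanically from the proposed bitfields. The
following sketch decodes a 4-bit value into the corresponding properties; the
function name and dictionary keys are illustrative, and the row counts are
simply those given in the table.

```python
def decode_mode(value):
    """Decode a 4-bit low-level mode value into display properties."""
    depth_field = (value >> 2) & 3     # two most significant bits
    rate_field = (value >> 1) & 1      # 0: twice, 1: once per cycle pair
    text = bool(value & 1)             # text mode enable
    if not 0 <= value <= 15 or depth_field == 3:
        raise ValueError("unsupported mode value")
    depth = 1 << depth_field           # bits per pixel: 1, 2 or 4
    fetches = 2 - rate_field           # fetches per pair of 2MHz cycles
    bytes_per_line = 40 * fetches      # 80 or 40 bytes per scanline
    return {
        "depth": depth,
        "colours": 1 << depth,
        "fetches": fetches,
        "text": text,
        "rows": 24 if text else 32,
        "width": bytes_per_line * 8 // depth,
    }

for value in range(12):
    print(value, decode_mode(value))
```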
|
paul@76 | 931 | Enhancement: 2MHz RAM Access
|
paul@76 | 932 | ----------------------------
|
paul@76 | 933 |
|
paul@76 | 934 | Although the CPU and ULA both access RAM at 2MHz, the CPU only accesses RAM
|
paul@76 | 935 | every other 2MHz cycle even when it is not competing with the ULA (as if the
|
paul@76 | 936 | ULA still needed to access the RAM). One useful enhancement would therefore be
|
paul@76 | 937 | a mechanism to let the CPU take over the ULA cycles outside the ULA's period
|
paul@76 | 938 | of activity, comparable to the way the ULA takes over the CPU cycles in MODE 0
|
paul@76 | 939 | to 3.
|
paul@76 | 940 |
|
paul@76 | 941 | Thus, the RAM access cycles would resemble the following in MODE 0 to 3:
|
paul@76 | 942 |
|
paul@76 | 943 | Upon a transition from display cycles: UUUUCCCC (instead of UUUUC_C_)
|
paul@76 | 944 | On a non-display line: CCCCCCCC (instead of C_C_C_C_)
|
paul@76 | 945 |
|
paul@76 | 946 | In MODE 4 to 6:
|
paul@76 | 947 |
|
paul@76 | 948 | Upon a transition from display cycles: CUCUCCCC (instead of CUCUC_C_)
|
paul@76 | 949 | On a non-display line: CCCCCCCC (instead of C_C_C_C_)
|
paul@76 | 950 |
|
paul@76 | 951 | This would improve CPU bandwidth as follows:
|
paul@76 | 952 |
|
paul@118 | 953 | Standard ULA Enhanced ULA % Total Bandwidth Speedup
|
paul@118 | 954 | MODE 0, 1, 2 9728 bytes 19456 bytes 24% -> 49% 2
|
paul@118 | 955 | MODE 3 12288 bytes 24576 bytes 31% -> 62% 2
|
paul@118 | 956 | MODE 4, 5 19968 bytes 29696 bytes 50% -> 74% 1.5
|
paul@118 | 957 | MODE 6 19968 bytes 32256 bytes 50% -> 81% 1.6
|
paul@76 | 958 |
|
paul@118 | 959 | (Here, the uncontended total 2MHz bandwidth for a display period would be
|
paul@118 | 960 | 39936 bytes, being 128 cycles per line over 312 lines.)
|
paul@115 | 961 |
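The figures in the table can be reproduced with the following sketch. It
assumes 128 2MHz cycles per scanline, of which 80 fall within the 40µs pixel
data period, and it assumes 192 display lines for MODE 3 and MODE 6 (24 rows
of 8 lines), which is what the tabulated figures imply. The names used are
illustrative only.

```python
LINES = 312
CYCLES_PER_LINE = 128    # 2MHz cycles in a 64µs scanline
DISPLAY_CYCLES = 80      # 2MHz cycles in the 40µs pixel data period

def cpu_bandwidth(display_lines, shared, enhanced):
    """Return CPU RAM accesses (bytes) available per display period.

    shared:   True where the CPU interleaves with the ULA (MODE 4 to 6).
    enhanced: True where the CPU may use every cycle the ULA does not need.
    """
    during = DISPLAY_CYCLES // 2 if shared else 0
    rest = CYCLES_PER_LINE - DISPLAY_CYCLES
    blank_line = CYCLES_PER_LINE
    if not enhanced:
        # The standard ULA only grants the CPU every other cycle.
        rest //= 2
        blank_line //= 2
    return (display_lines * (during + rest)
            + (LINES - display_lines) * blank_line)

total = LINES * CYCLES_PER_LINE
for name, lines, shared in [("MODE 0, 1, 2", 256, False), ("MODE 3", 192, False),
                            ("MODE 4, 5", 256, True), ("MODE 6", 192, True)]:
    std = cpu_bandwidth(lines, shared, False)
    enh = cpu_bandwidth(lines, shared, True)
    print(name, std, enh, round(100 * std / total), round(100 * enh / total))
```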
|
paul@76 | 962 | With such an enhancement, MODE 0 to 3 experience a doubling of CPU bandwidth
|
paul@76 | 963 | because all access opportunities to RAM are doubled. Meanwhile, in the other
|
paul@76 | 964 | modes, some CPU accesses occur alongside ULA accesses and thus cannot be
|
paul@76 | 965 | doubled, but the CPU bandwidth increase is still significant.
|
paul@76 | 966 |
|
paul@103 | 967 | Unfortunately, the mechanism for accessing the RAM is too slow to provide data
|
paul@109 | 968 | within the time constraints of 2MHz operation. There is no time remaining in a
|
paul@118 | 969 | 2MHz cycle for the CPU to receive and process any retrieved data once the
|
paul@124 | 970 | necessary signalling has been performed.
|
paul@124 | 971 |
|
paul@124 | 972 | The only way for the CPU to be able to access the RAM quickly enough would be
|
paul@124 | 973 | to do away with the double 4-bit access mechanism and to have a single 8-bit
|
paul@124 | 974 | channel to the memory. This would require twice as many 1-bit RAM chips or a
|
paul@124 | 975 | different kind of RAM chip, but it would also potentially simplify the ULA.
|
paul@124 | 976 |
|
paul@124 | 977 | The section on 8-bit wide RAM access discusses the possibilities around
|
paul@124 | 978 | changing the memory architecture, also describing the possibility of ULA
|
paul@124 | 979 | accesses achieving two bytes per 2MHz cycle due to the doubling of the memory
|
paul@124 | 980 | channel, leaving every other access free for the CPU during the display period
|
paul@124 | 981 | in MODE 0 to 3...
|
paul@124 | 982 |
|
paul@124 | 983 | Standard display period: UUUUUUUU
|
paul@124 | 984 | Modified display period: UCUCUCUC
|
paul@124 | 985 |
|
paul@124 | 986 | ...and consolidating accesses in MODE 4 to 6:
|
paul@124 | 987 |
|
paul@124 | 988 | Standard display period: UCUCUCUC
|
paul@124 | 989 | Modified display period: UCCCUCCC
|
paul@124 | 990 |
|
paul@124 | 991 | Together with the enhancements for non-display periods, such an "Enhanced+ ULA"
|
paul@124 | 992 | would perform as follows:
|
paul@124 | 993 |
|
paul@124 | 994 | Standard ULA Enhanced+ ULA % Total Bandwidth Speedup
|
paul@124 | 995 | MODE 0, 1, 2 9728 bytes 29696 bytes 24% -> 74% 3.1
|
paul@124 | 996 | MODE 3 12288 bytes 32256 bytes 31% -> 81% 2.6
|
paul@124 | 997 | MODE 4, 5 19968 bytes 34816 bytes 50% -> 87% 1.7
|
paul@124 | 998 | MODE 6 19968 bytes 36096 bytes 50% -> 90% 1.8
|
paul@124 | 999 |
|
paul@124 | 1000 | Of course, the principal enhancement would be the wider memory channel, with
|
paul@124 | 1001 | more buffering in the ULA being its contribution to this arrangement.
|
paul@103 | 1002 |
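The "Enhanced+" figures can be derived directly from the access patterns
shown above by counting the CPU slots in each pattern. As before, this is a
sketch under assumptions: 80 of the 128 2MHz cycles per scanline belong to
the display period, and MODE 3 and MODE 6 are taken to have 192 display
lines, as the tabulated figures imply. The names are illustrative.

```python
LINES = 312
CYCLES_PER_LINE = 128
DISPLAY_CYCLES = 80

def cpu_slots(pattern, cycles):
    """Count CPU ('C') accesses when a pattern repeats over a span of cycles."""
    return cycles * pattern.count("C") // len(pattern)

def enhanced_plus(display_lines, display_pattern):
    """Return CPU accesses per display period with the modified patterns."""
    display_line = (cpu_slots(display_pattern, DISPLAY_CYCLES)
                    + cpu_slots("CCCCCCCC", CYCLES_PER_LINE - DISPLAY_CYCLES))
    blank_line = cpu_slots("CCCCCCCC", CYCLES_PER_LINE)
    return (display_lines * display_line
            + (LINES - display_lines) * blank_line)

print(enhanced_plus(256, "UCUCUCUC"))   # MODE 0, 1, 2
print(enhanced_plus(192, "UCUCUCUC"))   # MODE 3
print(enhanced_plus(256, "UCCCUCCC"))   # MODE 4, 5
print(enhanced_plus(192, "UCCCUCCC"))   # MODE 6
```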
|
paul@55 | 1003 | Enhancement: Region Blanking
|
paul@55 | 1004 | ----------------------------
|
paul@4 | 1005 |
|
paul@4 | 1006 | The problem of permitting character-oriented blitting in programs whilst
|
paul@4 | 1007 | scrolling the screen by sub-character amounts could be mitigated by permitting
|
paul@4 | 1008 | a region of the display to be blank, such as the final lines of the display.
|
paul@4 | 1009 | Consider the following vertical scrolling by 2 bytes that would cause an
|
paul@4 | 1010 | initial character row of 6 lines and a final character row of 2 lines:
|
paul@4 | 1011 |
|
paul@4 | 1012 | 6 lines - initial, partial character row
|
paul@4 | 1013 | 248 lines - 31 complete rows
|
paul@4 | 1014 | 2 lines - final, partial character row
|
paul@4 | 1015 |
|
paul@4 | 1016 | If a routine were in use that wrote 8-line bitmaps to the partial character
|
paul@4 | 1017 | row now split in two, it would be advisable to hide one of the regions in
|
paul@4 | 1018 | order to prevent content appearing in the wrong place on screen (such as
|
paul@4 | 1019 | content meant to appear at the top "leaking" onto the bottom). Blanking 6
|
paul@4 | 1020 | lines would be sufficient, as can be seen from the following cases.
|
paul@4 | 1021 |
|
paul@4 | 1022 | Scrolling up by 2 lines:
|
paul@4 | 1023 |
|
paul@4 | 1024 | 6 lines - initial, partial character row
|
paul@4 | 1025 | 240 lines - 30 complete rows
|
paul@4 | 1026 | 4 lines - part of 1 complete row
|
paul@4 | 1027 | -----------------------------------------------------------------
|
paul@4 | 1028 | 4 lines - part of 1 complete row (hidden to maintain 250 lines)
|
paul@4 | 1029 | 2 lines - final, partial character row (hidden)
|
paul@4 | 1030 |
|
paul@4 | 1031 | Scrolling down by 2 lines:
|
paul@4 | 1032 |
|
paul@4 | 1033 | 2 lines - initial, partial character row
|
paul@4 | 1034 | 248 lines - 31 complete rows
|
paul@4 | 1035 | ----------------------------------------------------------
|
paul@4 | 1036 | 6 lines - final, partial character row (hidden)
|
paul@4 | 1037 |
|
paul@24 | 1038 | Thus, in this case, region blanking would impose a 250 line display with the
|
paul@24 | 1039 | bottom 6 lines blank.
|
paul@24 | 1040 |
|
paul@55 | 1041 | See the description of the display suspend enhancement for a way of blanking
|
paul@74 | 1042 | lines that is more efficient than merely blanking the palette, since it allows
|
paul@74 | 1043 | the CPU to perform useful work during the blanking period.
|
paul@74 | 1044 |
|
paul@74 | 1045 | To control the blanking or suspending of lines at the top and bottom of the
|
paul@74 | 1046 | display, a memory location could be dedicated to the task: the upper 4 bits
|
paul@74 | 1047 | could define a blanking region of up to 15 lines at the top of the screen,
|
paul@74 | 1048 | whereas the lower 4 bits could define such a region at the bottom of the
|
paul@74 | 1049 | screen. If more lines were required, two locations could be employed, allowing
|
paul@74 | 1050 | the top and bottom regions to occupy the entire screen.
|
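A minimal sketch of how software might pack and unpack such a control byte, assuming each 4-bit field holds a plain line count (0 to 15). The function names are hypothetical and no register address is implied.

```python
def pack_blanking(top_lines, bottom_lines):
    """Pack two 4-bit line counts: top in bits 7-4, bottom in bits 3-0."""
    if not (0 <= top_lines <= 15 and 0 <= bottom_lines <= 15):
        raise ValueError("each region must fit in 4 bits (0 to 15 lines)")
    return (top_lines << 4) | bottom_lines


def unpack_blanking(value):
    """Return (top_lines, bottom_lines) from a packed control byte."""
    return (value >> 4) & 0x0F, value & 0x0F
```

For the 250 line display described earlier, with the bottom 6 lines blank, `pack_blanking(0, 6)` gives &06.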
paul@55 | 1051 |
|
paul@55 | 1052 | Enhancement: Screen Height Adjustment
|
paul@55 | 1053 | -------------------------------------
|
paul@24 | 1054 |
|
paul@24 | 1055 | The height of the screen could be configurable in order to reduce screen
|
paul@24 | 1056 | memory consumption. This is not quite done in MODE 3 and 6 since the start of
|
paul@24 | 1057 | the screen appears to be rounded down to the nearest page, but by reducing the
|
paul@24 | 1058 | height by amounts more than a page, savings would be possible. For example:
|
paul@24 | 1059 |
|
paul@24 | 1060 | Screen width Depth Height Bytes per line Saving in bytes Start address
|
paul@24 | 1061 | ------------ ----- ------ -------------- --------------- -------------
|
paul@24 | 1062 | 640 1 252 80 320 &3140 -> &3100
|
paul@24 | 1063 | 640 1 248 80 640 &3280 -> &3200
|
paul@24 | 1064 | 320 1 240 40 640 &5A80 -> &5A00
|
paul@24 | 1065 | 320 2 240 80 1280 &3500
|
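The figures in the table can be reproduced with a short sketch, assuming that screen memory ends at &8000, that the standard height is 256 lines, and that the start address is rounded down to a 256-byte page. The function name is illustrative.

```python
SCREEN_END = 0x8000   # assumption: screen memory extends to the top of RAM
PAGE_SIZE = 0x100     # start addresses are rounded down to a page boundary
FULL_HEIGHT = 256     # lines in the standard bitmapped modes


def reduced_screen(height, bytes_per_line):
    """Return (saving in bytes, exact start, page-aligned start)."""
    saving = (FULL_HEIGHT - height) * bytes_per_line
    start = SCREEN_END - height * bytes_per_line
    return saving, start, start & ~(PAGE_SIZE - 1)


for height, bytes_per_line in [(252, 80), (248, 80), (240, 40), (240, 80)]:
    saving, start, aligned = reduced_screen(height, bytes_per_line)
    print(f"{height} lines x {bytes_per_line} bytes: "
          f"saving {saving}, start &{start:04X} -> &{aligned:04X}")
```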
paul@0 | 1066 |
|
paul@55 | 1067 | Screen Mode Selection
|
paul@55 | 1068 | ---------------------
|
paul@55 | 1069 |
|
paul@55 | 1070 | Bits 3, 4 and 5 of address &FE*7 control the selected screen mode. For a wider
|
paul@55 | 1071 | range of modes, the other bits of &FE*7 (related to sound, cassette
|
paul@55 | 1072 | input/output and the Caps Lock LED) would need to be reassigned, with bit 0
|
paul@55 | 1073 | potentially being made available for use.
|
paul@55 | 1074 |
|
paul@58 | 1075 | Enhancement: Palette Definition
|
paul@58 | 1076 | -------------------------------
|
paul@0 | 1077 |
|
paul@0 | 1078 | Since all memory accesses go via the ULA, an enhanced ULA could employ more
|
paul@0 | 1079 | specific addresses than &FE*X to perform enhanced functions. For example, the
|
paul@0 | 1080 | palette control is done using &FE*8-F and merely involves selecting predefined
|
paul@0 | 1081 | colours, whereas an enhanced ULA could support the redefinition of all 16
|
paul@0 | 1082 | colours using specific ranges such as &FE18-F (colours 0 to 7) and &FE28-F
|
paul@0 | 1083 | (colours 8 to 15), where a single byte might provide 8 bits per pixel colour
|
paul@0 | 1084 | specifications similar to those used on the Archimedes.
|
paul@0 | 1085 |
|
paul@4 | 1086 | The principal limitation here is actually the hardware: the Electron has only
|
paul@4 | 1087 | a single output line for each of the red, green and blue channels, and if
|
paul@4 | 1088 | those outputs are strictly digital and can only be set to a "high" or "low"
|
paul@4 | 1089 | value, then only the existing eight colours are possible. If a modern ULA were
|
paul@81 | 1090 | able to output analogue values (or values at well-defined points between the
|
paul@81 | 1091 | high and low values, such as the half-on value supported by the Amstrad CPC
|
paul@81 | 1092 | series), it would still need to be assessed whether the circuitry could
|
paul@81 | 1093 | successfully handle and propagate such values. Various sources indicate that
|
paul@81 | 1094 | only "TTL levels" are supported by the RGB output circuit, and since there are
|
paul@81 | 1095 | 74LS08 AND logic gates involved in the RGB component outputs from the ULA, it
|
paul@81 | 1096 | is likely that the ULA is expected to provide only "high" or "low" values.
|
paul@4 | 1097 |
|
paul@58 | 1098 | Short of adding extra outputs from the ULA (either additional red, green and
|
paul@81 | 1099 | blue outputs or a combined intensity output), another approach might involve
|
paul@81 | 1100 | some kind of modulation where an output value might be encoded in multiple
|
paul@81 | 1101 | pulses at a higher frequency than the pixel frequency. However, this would
|
paul@81 | 1102 | demand additional circuitry outside the ULA, and component RGB monitors would
|
paul@81 | 1103 | probably not be able to take advantage of this feature; only UHF and composite
|
paul@81 | 1104 | video devices (the latter with the composite video colour support enabled on
|
paul@81 | 1105 | the Electron's circuit board) would potentially benefit.
|
paul@58 | 1106 |
|
paul@51 | 1107 | Flashing Colours
|
paul@51 | 1108 | ----------------
|
paul@51 | 1109 |
|
paul@51 | 1110 | According to the Advanced User Guide, "The cursor and flashing colours are
|
paul@51 | 1111 | entirely generated in software: This means that all of the logical to physical
|
paul@51 | 1112 | colour map must be changed to cause colours to flash." This appears to suggest
|
paul@51 | 1113 | that the palette registers must be updated upon the flash counter - read and
|
paul@51 | 1114 | written by OSBYTE &C1 (193) - reaching zero and that some way of changing the
|
paul@51 | 1115 | colour pairs to be any combination of colours might be possible, instead of
|
paul@52 | 1116 | having colour complements as pairs.
|
paul@52 | 1117 |
|
paul@52 | 1118 | It is conceivable that the interrupt code responsible does the simple thing
|
paul@54 | 1119 | and merely inverts the current values for any logical colours (LC) for which
|
paul@54 | 1120 | the associated physical colour (as supplied as the second parameter to the VDU
|
paul@54 | 1121 | 19 call) has the top bit of its four bit value set. These top bits are not
|
paul@52 | 1122 | recorded in the palette registers but are presumably recorded separately and
|
paul@52 | 1123 | used to build bitmaps as follows:
|
paul@52 | 1124 |
|
paul@54 | 1125 | LC  2 colour  4 colour  16 colour  4-bit value for inversion
|
paul@54 | 1126 | --  --------  --------  ---------  -------------------------
|
paul@54 | 1127 | 0   00010001  00010001  00010001   1, 1, 1
|
paul@54 | 1128 | 1   01000100  00100010  00010001   4, 2, 1
|
paul@54 | 1129 | 2             01000100  00100010   4, 2
|
paul@54 | 1130 | 3             10001000  00100010   8, 2
|
paul@54 | 1131 | 4                       00010001   1
|
paul@54 | 1132 | 5                       00010001   1
|
paul@54 | 1133 | 6                       00100010   2
|
paul@54 | 1134 | 7                       00100010   2
|
paul@54 | 1135 | 8                       01000100   4
|
paul@54 | 1136 | 9                       01000100   4
|
paul@54 | 1137 | 10                      10001000   8
|
paul@54 | 1138 | 11                      10001000   8
|
paul@54 | 1139 | 12                      01000100   4
|
paul@54 | 1140 | 13                      01000100   4
|
paul@54 | 1141 | 14                      10001000   8
|
paul@54 | 1142 | 15                      10001000   8
|
paul@54 | 1143 |
|
paul@54 | 1144 | Inversion value calculation:
|
paul@54 | 1145 |
|
paul@54 | 1146 | 2 colour formula: 1 << (colour * 2)
|
paul@54 | 1147 | 4 colour formula: 1 << colour
|
paul@54 | 1148 | 16 colour formula: 1 << (((colour & 2) >> 1) + ((colour & 8) >> 2))
|
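The inversion values can be sketched in code as follows. This is only an illustration of the speculated scheme, not the actual interrupt code; the function names are hypothetical, and the 16 colour shift is written so that it reproduces the values in the table (bit 1 of the logical colour selects between adjacent bits, bit 3 selects the upper pair).

```python
def inversion_value(colour, colours_in_mode):
    """Return the 4-bit inversion value for a logical colour."""
    if colours_in_mode == 2:
        shift = colour * 2
    elif colours_in_mode == 4:
        shift = colour
    elif colours_in_mode == 16:
        shift = ((colour & 2) >> 1) + ((colour & 8) >> 2)
    else:
        raise ValueError("expected a 2, 4 or 16 colour mode")
    return 1 << shift


def inversion_bitmap(colour, colours_in_mode):
    """Repeat the 4-bit value in both nibbles, as in the bitmap columns."""
    value = inversion_value(colour, colours_in_mode)
    return (value << 4) | value


def group_inversion(flashing_colours, colours_in_mode):
    """Combine the bitmaps of the flashing logical colours in one group."""
    combined = 0
    for colour in flashing_colours:
        combined |= inversion_bitmap(colour, colours_in_mode)
    return combined
```

A palette register value would then be EORed with the combined bitmap, for example `register_value ^ group_inversion([0, 10], 16)`.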
paul@52 | 1149 |
|
paul@53 | 1150 | For example, where logical colour 0 has been mapped to a physical colour in
|
paul@53 | 1151 | the range 8 to 15, a bitmap of 00010001 would be chosen as its contribution to
|
paul@53 | 1152 | the inversion operation. (The lower three bits of the physical colour would be
|
paul@53 | 1153 | used to set the underlying colour information affected by the inversion
|
paul@53 | 1154 | operation.)
|
paul@53 | 1155 |
|
paul@52 | 1156 | An operation in the interrupt code would then combine the bitmaps for all
|
paul@52 | 1157 | logical colours in 2 and 4 colour modes, with the 16 colour bitmaps being
|
paul@52 | 1158 | combined for groups of logical colours as follows:
|
paul@52 | 1159 |
|
paul@54 | 1160 | Logical colours
|
paul@54 | 1161 | ---------------
|
paul@52 | 1162 | 0, 2, 8, 10
|
paul@52 | 1163 | 4, 6, 12, 14
|
paul@52 | 1164 | 5, 7, 13, 15
|
paul@52 | 1165 | 1, 3, 9, 11
|
paul@52 | 1166 |
|
paul@52 | 1167 | These combined bitmaps would be EORed with the existing palette register
|
paul@52 | 1168 | values in order to perform the value inversion necessary to produce the
|
paul@52 | 1169 | flashing effect.
|
paul@51 | 1170 |
|
paul@54 | 1171 | Thus, in the VDU 19 operation, the appropriate inversion value would be
|
paul@54 | 1172 | calculated for the logical colour, and this value would then be combined with
|
paul@54 | 1173 | other inversion values in a dedicated memory location corresponding to the
|
paul@54 | 1174 | colour's group as indicated above. Meanwhile, the palette channel values would
|
paul@54 | 1175 | be derived from the lower three bits of the specified physical colour and
|
paul@54 | 1176 | combined with other palette data in dedicated memory locations corresponding
|
paul@54 | 1177 | to the palette registers.
|
paul@54 | 1178 |
|
paul@72 | 1179 | Interestingly, although flashing colours on the BBC Micro are controlled by
|
paul@72 | 1180 | toggling bit 0 of the &FE20 control register location for the Video ULA, the
|
paul@72 | 1181 | actual colour inversion is done in hardware.
|
paul@72 | 1182 |
|
paul@55 | 1183 | Enhancement: Palette Definition Lists
|
paul@55 | 1184 | -------------------------------------
|
paul@4 | 1185 |
|
paul@4 | 1186 | It can be useful to redefine the palette in order to change the colours
|
paul@4 | 1187 | available for a particular region of the screen, particularly in modes where
|
paul@4 | 1188 | the choice of colours is constrained, and if an increased colour depth were
|
paul@4 | 1189 | available, palette redefinition would be useful to give the illusion of more
|
paul@4 | 1190 | than 16 colours in MODE 2. Traditionally, palette redefinition has been done
|
paul@4 | 1191 | by using interrupt-driven timers, but a more efficient approach would involve
|
paul@4 | 1192 | presenting lists of palette definitions to the ULA so that it can change the
|
paul@4 | 1193 | palette at a particular display line.
|
paul@4 | 1194 |
|
paul@4 | 1195 | One might define a palette redefinition list in a region of memory and then
|
paul@4 | 1196 | communicate its contents to the ULA by writing the address and length of the
|
paul@4 | 1197 | list, along with the display line at which the palette is to be changed, to
|
paul@4 | 1198 | ULA registers such that the ULA buffers the list and performs the redefinition
|
paul@4 | 1199 | at the appropriate time. Throughput/bandwidth considerations might impose
|
paul@4 | 1200 | restrictions on the practical length of such a list, however.
|
paul@4 | 1201 |
|
paul@128 | 1202 | A simple form of palette definition might be useful in text modes. Within the
|
paul@128 | 1203 | blank region between lines, the foreground palette could be changed to apply
|
paul@128 | 1204 | to the next line. Palette values could be read from a table in RAM, perhaps
|
paul@128 | 1205 | preceding the screen data, with 24 2-byte entries providing palette
|
paul@128 | 1206 | redefinition support in 2- and 4-colour modes.
|
paul@128 | 1207 |
|
paul@79 | 1208 | Enhancement: Display Synchronisation Interrupts
|
paul@79 | 1209 | -----------------------------------------------
|
paul@79 | 1210 |
|
paul@79 | 1211 | When completing each scanline of the display, the ULA could trigger an
|
paul@79 | 1212 | interrupt. Since this might impact system performance substantially, the
|
paul@79 | 1213 | feature would probably need to be configurable, and it might be sufficient to
|
paul@79 | 1214 | have an interrupt only after a certain number of display lines instead.
|
paul@79 | 1215 | Permitting the CPU to take action after eight lines would allow palette
|
paul@79 | 1216 | switching and other effects to occur on a character row basis.
|
paul@79 | 1217 |
|
paul@79 | 1218 | The ULA provides an interrupt at the end of the display period, presumably so
|
paul@79 | 1219 | that software can schedule updates to the screen, avoid flickering or tearing,
|
paul@79 | 1220 | and so on. However, some applications might benefit from an interrupt at, or
|
paul@79 | 1221 | just before, the start of the display period so that palette modifications or
|
paul@79 | 1222 | similar effects could be scheduled.
|
paul@79 | 1223 |
|
paul@55 | 1224 | Enhancement: Palette-Free Modes
|
paul@55 | 1225 | -------------------------------
|
paul@4 | 1226 |
|
paul@4 | 1227 | Palette-free modes might be defined where bit values directly correspond to
|
paul@4 | 1228 | the red, green and blue channels, although this would mostly make sense only
|
paul@4 | 1229 | for modes with depths greater than the standard 4 bits per pixel, and such
|
paul@4 | 1230 | modes would require more memory than MODE 2 if they were to have an acceptable
|
paul@4 | 1231 | resolution.
|
paul@4 | 1232 |
|
paul@55 | 1233 | Enhancement: Display Suspend
|
paul@55 | 1234 | ----------------------------
|
paul@4 | 1235 |
|
paul@4 | 1236 | Especially when writing to the screen memory, it could be beneficial to be
|
paul@4 | 1237 | able to suspend the ULA's access to the memory, instead producing blank values
|
paul@4 | 1238 | for all screen pixels until a program is ready to reveal the screen. This is
|
paul@4 | 1239 | different from palette blanking since with a blank palette, the ULA is still
|
paul@4 | 1240 | reading screen memory and translating its contents into pixel values that end
|
paul@4 | 1241 | up being blank.
|
paul@4 | 1242 |
|
paul@4 | 1243 | This function is reminiscent of a capability of the ZX81, albeit necessary on
|
paul@4 | 1244 | that hardware to reduce the load on the system CPU which was responsible for
|
paul@62 | 1245 | producing the video output. By allowing display suspend on the Electron, the
|
paul@62 | 1246 | performance benefit would be derived from giving the CPU full access to the
|
paul@62 | 1247 | memory bandwidth.
|
paul@4 | 1248 |
|
paul@125 | 1249 | Note that since the CPU is only able to access RAM at 1MHz, there is no
|
paul@125 | 1250 | possibility to improve performance beyond that achieved in MODE 4, 5 or 6
|
paul@125 | 1251 | normally. However, if faster RAM access were to be made possible (see the
|
paul@125 | 1252 | discussion of 8-bit wide RAM access), the CPU could benefit from freeing up
|
paul@125 | 1253 | the ULA's access slots entirely.
|
paul@125 | 1254 |
|
paul@74 | 1255 | The region blanking feature mentioned above could be implemented using this
|
paul@74 | 1256 | enhancement instead of employing palette blanking for the affected lines of
|
paul@74 | 1257 | the display.
|
paul@74 | 1258 |
|
paul@63 | 1259 | Enhancement: Memory Filling
|
paul@63 | 1260 | ---------------------------
|
paul@63 | 1261 |
|
paul@63 | 1262 | A capability that could be given to an enhanced ULA is that of permitting the
|
paul@63 | 1263 | ULA to write to screen memory as well as being able to read from it. Although
|
paul@63 | 1264 | such a capability would probably not be useful in conjunction with the
|
paul@63 | 1265 | existing read operations when producing a screen display, and insufficient
|
paul@63 | 1266 | bandwidth would exist to do so in high-bandwidth screen modes anyway, the
|
paul@63 | 1267 | capability could be offered during a display suspend period (as described
|
paul@63 | 1268 | above), permitting a more efficient mechanism to rapidly fill memory with a
|
paul@63 | 1269 | predetermined value.
|
paul@63 | 1270 |
|
paul@63 | 1271 | This capability could also support block filling, where the limits of the
|
paul@63 | 1272 | filled memory would be defined by the position and size of a screen area,
|
paul@63 | 1273 | although this would demand the provision of additional registers in the ULA to
|
paul@63 | 1274 | retain the details of such areas and additional logic to control the fill
|
paul@63 | 1275 | operation.
|
paul@63 | 1276 |
|
paul@69 | 1277 | Enhancement: Region Filling
|
paul@69 | 1278 | ---------------------------
|
paul@69 | 1279 |
|
paul@69 | 1280 | An alternative to memory writing might involve indicating regions using
|
paul@69 | 1281 | additional registers or memory where the ULA fills regions of the screen with
|
paul@69 | 1282 | content instead of reading from memory. Unlike hardware sprites which should
|
paul@69 | 1283 | realistically provide varied content, region filling could employ single
|
paul@69 | 1284 | colours or patterns, and one advantage of doing so would be that the ULA need
|
paul@69 | 1285 | not access memory at all within a particular region.
|
paul@69 | 1286 |
|
paul@69 | 1287 | Regions would be defined on a row-by-row basis. Instead of reading memory and
|
paul@69 | 1288 | blitting a direct representation to the screen, the ULA would read region
|
paul@69 | 1289 | definitions containing a start column, region width and colour details. There
|
paul@69 | 1290 | might be a certain number of definitions allowed per row, or the ULA might
|
paul@69 | 1291 | just traverse an ordered list of such definitions with each one indicating the
|
paul@71 | 1292 | row, start column, region width and colour details.
|
paul@71 | 1293 |
|
paul@71 | 1294 | One could even compress this information further by requiring only the row,
|
paul@71 | 1295 | start column and colour details with each subsequent definition terminating
|
paul@71 | 1296 | the effect of the previous one. However, one would also need to consider the
|
paul@71 | 1297 | convenience of preparing such definitions and whether efficient access to
|
paul@71 | 1298 | definitions for a particular row might be desirable. It might also be
|
paul@71 | 1299 | desirable to avoid having to prepare definitions for "empty" areas of the
|
paul@71 | 1300 | screen, effectively making the definition of the screen contents employ
|
paul@71 | 1301 | run-length encoding and employ only colour plus length information.
|
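A rough sketch of the run-length form of such definitions, where each row is a list of (colour, length) pairs. This models only the data representation; how a ULA would store or fetch the pairs is not specified here, and the function names are hypothetical.

```python
def encode_row(pixels):
    """Convert a row of pixel colours into (colour, length) runs."""
    runs = []
    for colour in pixels:
        if runs and runs[-1][0] == colour:
            runs[-1] = (colour, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((colour, 1))               # start a new run
    return runs


def expand_row(runs, width):
    """Expand (colour, length) runs back into a row of pixel colours."""
    pixels = []
    for colour, length in runs:
        pixels.extend([colour] * length)
    if len(pixels) != width:
        raise ValueError("runs do not cover the row exactly")
    return pixels
```

A row of alternating pixel values yields one run per pixel, so in that case the encoding grows rather than shrinks.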
paul@69 | 1302 |
|
paul@69 | 1303 | One application of region filling is that of simple 2D and 3D shape rendering.
|
paul@69 | 1304 | Although it is entirely possible to plot such shapes to the screen and have
|
paul@69 | 1305 | the ULA blit the memory contents to the screen, such operations consume
|
paul@69 | 1306 | bandwidth both in the initial plotting and in the final transfer to the
|
paul@69 | 1307 | screen. Region filling would reduce such bandwidth usage substantially.
|
paul@69 | 1308 |
|
paul@71 | 1309 | This way of representing screen images would make certain kinds of images
|
paul@71 | 1310 | unfeasible to represent - consider alternating single pixel values which could
|
paul@71 | 1311 | easily occur in some character bitmaps - even if an internal queue of regions
|
paul@71 | 1312 | were to be supported such that the ULA could read ahead and buffer such
|
paul@71 | 1313 | "bandwidth intensive" areas. Thus, the ULA might be better served providing
|
paul@71 | 1314 | this feature for certain areas of the display only as some kind of special
|
paul@71 | 1315 | graphics window.
|
paul@71 | 1316 |
|
paul@55 | 1317 | Enhancement: Hardware Sprites
|
paul@55 | 1318 | -----------------------------
|
paul@0 | 1319 |
|
paul@0 | 1320 | An enhanced ULA might provide hardware sprites, but this would be done in a
|
paul@0 | 1321 | way that is incompatible with the standard ULA, since no &FE*X locations are
|
paul@34 | 1322 | available for allocation. To keep the facility simple, hardware sprites would
|
paul@34 | 1323 | have a standard byte width and height.
|
paul@34 | 1324 |
|
paul@34 | 1325 | The specification of sprites could involve the reservation of 16 locations
|
paul@34 | 1326 | (for example, &FE20-F) specifying a fixed number of eight sprites, with each
|
paul@34 | 1327 | location pair referring to the sprite data. By limiting the ULA to dealing
|
paul@34 | 1328 | with a fixed number of sprites, the work required inside the ULA would be
|
paul@35 | 1329 | reduced since it would avoid having to deal with arbitrary numbers of sprites.
|
paul@0 | 1330 |
|
paul@35 | 1331 | The principal limitation on providing hardware sprites is that of having to
|
paul@35 | 1332 | obtain sprite data, given that the ULA is usually required to retrieve screen
|
paul@35 | 1333 | data, and given the lack of memory bandwidth available to retrieve sprite data
|
paul@35 | 1334 | (particularly from multiple sprites supposedly at the same position) and
|
paul@35 | 1335 | screen data simultaneously. Although the ULA could potentially read sprite
|
paul@35 | 1336 | data and screen data in alternate memory accesses in screen modes where the
|
paul@35 | 1337 | bandwidth is not already fully utilised, this would result in a degradation of
|
paul@35 | 1338 | performance.
|
paul@34 | 1339 |
|
paul@55 | 1340 | Enhancement: Additional Screen Mode Configurations
|
paul@55 | 1341 | --------------------------------------------------
|
paul@24 | 1342 |
|
paul@24 | 1343 | Alternative screen mode configurations could be supported. The ULA has to
|
paul@24 | 1344 | produce 640 pixel values across the screen, with pixel doubling or quadrupling
|
paul@24 | 1345 | employed to fill the screen width:
|
paul@24 | 1346 |
|
paul@24 | 1347 | Screen width Columns Scaling Depth Bytes
|
paul@24 | 1348 | ------------ ------- ------- ----- -----
|
paul@24 | 1349 | 640 80 x1 1 80
|
paul@24 | 1350 | 320 40 x2 1, 2 40, 80
|
paul@24 | 1351 | 160 20 x4 2, 4 40, 80
|
paul@24 | 1352 |
|
paul@24 | 1353 | It must also use at most 80 byte-sized memory accesses to provide the
|
paul@24 | 1354 | information for the display. Given that characters must occupy an 8x8 pixel
|
paul@24 | 1355 | array, if a configuration featuring anything other than 20, 40 or 80 character
|
paul@24 | 1356 | columns is to be supported, compromises must be made such as the introduction
|
paul@24 | 1357 | of blank pixels either between characters (such as occurs between rows in MODE
|
paul@24 | 1358 | 3 and 6) or at the end of a scanline (such as occurs at the end of the frame
|
paul@55 | 1359 | in MODE 3 and 6). Consider the following configuration:
|
paul@24 | 1360 |
|
paul@24 | 1361 | Screen width Columns Scaling Depth Bytes Blank
|
paul@24 | 1362 | ------------ ------- ------- ----- ------ -----
|
paul@24 | 1363 | 208 26 x3 1, 2 26, 52 16
|
paul@24 | 1364 |
|
paul@24 | 1365 | Here, if the ULA can triple pixels, a 26 column mode with either 2 or 4
|
paul@24 | 1366 | colours could be provided, with 16 blank pixel values (out of a total of 640)
|
paul@24 | 1367 | generated either at the start or end (or split between the start and end) of
|
paul@24 | 1368 | each scanline.
|
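The arithmetic behind such configurations can be sketched as follows, assuming 8-pixel-wide characters, 640 output pixel values and the limit of 80 byte-sized accesses per scanline. The function name is illustrative.

```python
OUTPUT_PIXELS = 640      # pixel values the ULA must produce per scanline
CHAR_WIDTH = 8           # characters occupy an 8x8 pixel array
MAX_BYTES_PER_LINE = 80  # at most 80 byte-sized accesses per scanline


def mode_geometry(columns, scaling, depth):
    """Return (screen width, bytes per line, blank pixel values)."""
    width = columns * CHAR_WIDTH
    used = width * scaling
    bytes_per_line = columns * depth   # 8 pixels at `depth` bits per pixel
    if used > OUTPUT_PIXELS or bytes_per_line > MAX_BYTES_PER_LINE:
        raise ValueError("configuration exceeds the scanline limits")
    return width, bytes_per_line, OUTPUT_PIXELS - used
```

With 26 columns and pixel tripling, `mode_geometry(26, 3, 1)` gives a width of 208, 26 bytes per line and 16 blank pixel values.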
paul@24 | 1369 |
|
paul@55 | 1370 | Enhancement: Character Attributes
|
paul@55 | 1371 | ---------------------------------
|
paul@24 | 1372 |
|
paul@24 | 1373 | The BBC Micro MODE 7 employs something resembling character attributes to
|
paul@24 | 1374 | support teletext displays, but depends on circuitry providing a character
|
paul@24 | 1375 | generator. The ZX Spectrum, on the other hand, provides character attributes
|
paul@24 | 1376 | as a means of colouring bitmapped graphics. Although such a feature is very
|
paul@24 | 1377 | limiting as the sole means of providing multicolour graphics, in situations
|
paul@24 | 1378 | where the choice is between low resolution multicolour graphics or high
|
paul@24 | 1379 | resolution monochrome graphics, character attributes provide a potentially
|
paul@24 | 1380 | useful compromise.
|
paul@24 | 1381 |
|
paul@24 | 1382 | For each byte read, the ULA must deliver 8 pixel values (out of a total of
|
paul@24 | 1383 | 640) to the video output, doing so by either emptying its pixel buffer on a
|
paul@24 | 1384 | pixel per cycle basis, or by multiplying pixels and thus holding them for more
|
paul@24 | 1385 | than one cycle. For example, for a screen mode 640 pixels wide:
|
paul@24 | 1386 |
|
paul@24 | 1387 | Cycle:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
|
paul@24 | 1388 | Reads:   B                       B
|
paul@24 | 1389 | Pixels:  0  1  2  3  4  5  6  7  0  1  2  3  4  5  6  7
|
paul@24 | 1390 |
|
paul@24 | 1391 | And for a screen mode having 320 pixels in width:
|
paul@24 | 1392 |
|
paul@24 | 1393 | Cycle:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
|
paul@24 | 1394 | Reads:   B
|
paul@24 | 1395 | Pixels:  0  0  1  1  2  2  3  3  4  4  5  5  6  6  7  7
|
paul@24 | 1396 |
|
paul@24 | 1397 | However, in modes where fewer than 80 bytes are required to generate the pixel
|
paul@24 | 1398 | values, an enhanced ULA might be able to read additional bytes between those
|
paul@24 | 1399 | providing the bitmapped graphics data:
|
paul@24 | 1400 |
|
paul@24 | 1401 | Cycle:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
|
paul@24 | 1402 | Reads:   B                       A
|
paul@24 | 1403 | Pixels:  0  0  1  1  2  2  3  3  4  4  5  5  6  6  7  7
|
paul@24 | 1404 |
|
paul@24 | 1405 | These additional bytes could provide colour information for the bitmapped data
|
paul@24 | 1406 | in the following character column (of 8 pixels). Since it would be desirable
|
paul@24 | 1407 | to apply attribute data to the first column, the initial 8 cycles might be
|
paul@24 | 1408 | configured to not produce pixel values.
|
paul@24 | 1409 |
|
paul@35 | 1410 | Attribute data need only be read for the first row of pixels of each
|
paul@35 | 1411 | character. The subsequent rows would have the same attribute information
|
paul@35 | 1412 | applied to them, although this would require the attribute data to be stored
|
paul@35 | 1413 | in some kind of buffer. Thus, the following access pattern would be observed:
|
paul@35 | 1414 |
|
paul@112 | 1415 | Reads: A B _ B _ B _ B _ B _ B _ B _ B ...
|
paul@112 | 1416 |
|
paul@112 | 1417 | In modes 3 and 6, the blank display lines could be used to retrieve attribute
|
paul@112 | 1418 | data:
|
paul@112 | 1419 |
|
paul@112 | 1420 | Reads (blank): A _ A _ A _ A _ A _ A _ A _ A _ ...
|
paul@112 | 1421 | Reads (active): B _ B _ B _ B _ B _ B _ B _ B _ ...
|
paul@112 | 1422 | Reads (active): B _ B _ B _ B _ B _ B _ B _ B _ ...
|
paul@112 | 1423 | ...
|
paul@112 | 1424 |
|
paul@112 | 1425 | See below for a discussion of using this for character data as well.
|
paul@35 | 1426 |
|
paul@24 | 1427 | A whole byte used for colour information for a whole character would result in
|
paul@35 | 1428 | a choice of 256 colours, and this might be somewhat excessive. By only reading
|
paul@35 | 1429 | attribute bytes at every other opportunity, a choice of 16 colours could be
|
paul@35 | 1430 | applied individually to two characters.
|
paul@24 | 1431 |
|
paul@24 | 1432 | Cycle:  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
paul@24 | 1433 | Reads:  B               A               B               -
|
paul@24 | 1434 | Pixels: 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7
|
paul@24 | 1435 |
|
paul@35 | 1436 | Further reductions in attribute data access, offering 4 colours for every
|
paul@35 | 1437 | character in a four character block, for example, might also be worth
|
paul@34 | 1438 | considering.
|
paul@34 | 1439 |
|
paul@24 | 1440 | Consider the following configurations for screen modes with a colour depth of
|
paul@24 | 1441 | 1 bit per pixel for bitmap information:
|
paul@24 | 1442 |
|
paul@35 | 1443 | Screen width Columns Scaling Bytes (B) Bytes (A) Colours Screen start
|
paul@35 | 1444 | ------------ ------- ------- --------- --------- ------- ------------
|
paul@35 | 1445 | 320 40 x2 40 40 256 &5300
|
paul@35 | 1446 | 320 40 x2 40 20 16 &5580 -> &5500
|
paul@35 | 1447 | 320 40 x2 40 10 4 &56C0 -> &5600
|
paul@35 | 1448 | 208 26 x3 26 26 256 &62C0 -> &6200
|
paul@35 | 1449 | 208 26 x3 26 13 16 &6460 -> &6400
|
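The screen start addresses in the table appear to follow from a simple calculation, sketched here under stated assumptions: 256 display lines, attribute bytes stored once per 8-line character row, screen memory ending at &8000, and the start rounded down to a page. The function name is illustrative.

```python
SCREEN_END = 0x8000   # assumption: screen memory extends to the top of RAM


def attribute_screen_start(bitmap_bytes, attribute_bytes, lines=256,
                           char_height=8):
    """Return (exact start, page-aligned start) for an attribute mode.

    bitmap_bytes is the number of bitmap (B) bytes per scanline and
    attribute_bytes the number of attribute (A) bytes per character row.
    """
    rows = lines // char_height
    size = bitmap_bytes * lines + attribute_bytes * rows
    start = SCREEN_END - size
    return start, start & 0xFF00


for b, a in [(40, 40), (40, 20), (40, 10), (26, 26), (26, 13)]:
    start, aligned = attribute_screen_start(b, a)
    print(f"B={b} A={a}: &{start:04X} -> &{aligned:04X}")
```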
paul@34 | 1450 |
|
paul@113 | 1451 | Enhancement: Text-Only Modes using Character and Attribute Data
|
paul@113 | 1452 | ---------------------------------------------------------------
|
paul@112 | 1453 |
|
paul@112 | 1454 | In modes 3 and 6, the blank display lines could be used to retrieve character
|
paul@112 | 1455 | and attribute data instead of trying to insert it between bitmap data accesses,
|
paul@112 | 1456 | but this data would then need to be retained:
|
paul@112 | 1457 |
|
paul@112 | 1458 | Reads: A C A C A C A C A C A C A C A C ...
|
paul@112 | 1459 | Reads: B _ B _ B _ B _ B _ B _ B _ B _ ...
|
paul@112 | 1460 |
|
paul@112 | 1461 | Only attribute (A) and character (C) reads would require screen memory
|
paul@112 | 1462 | storage. Bitmap data reads (B) would involve either accesses to memory to
|
paul@112 | 1463 | obtain character definition details or could, at the cost of special storage
|
paul@112 | 1464 | in the ULA, involve accesses within the ULA that would then free up the RAM.
|
paul@112 | 1465 | However, the CPU would not benefit from having any extra access slots due to
|
paul@112 | 1466 | the limitations of the RAM access mechanism.
|
paul@112 | 1467 |
|
paul@113 | 1468 | A scheme without caching might be possible. The same line of memory addresses
|
paul@113 | 1469 | might be visited over and over again for eight display lines, with an index
|
paul@113 | 1470 | into the bitmap data being incremented from zero to seven. The access patterns
|
paul@113 | 1471 | would look like this:
|
paul@113 | 1472 |
|
paul@113 | 1473 | Reads: C B C B C B C B C B C B C B C B ... (generate data from index 0)
|
paul@113 | 1474 | Reads: C B C B C B C B C B C B C B C B ... (generate data from index 1)
|
paul@113 | 1475 | Reads: C B C B C B C B C B C B C B C B ... (generate data from index 2)
|
paul@113 | 1476 | Reads: C B C B C B C B C B C B C B C B ... (generate data from index 3)
|
paul@113 | 1477 | Reads: C B C B C B C B C B C B C B C B ... (generate data from index 4)
|
paul@113 | 1478 | Reads: C B C B C B C B C B C B C B C B ... (generate data from index 5)
|
paul@113 | 1479 | Reads: C B C B C B C B C B C B C B C B ... (generate data from index 6)
|
paul@113 | 1480 | Reads: C B C B C B C B C B C B C B C B ... (generate data from index 7)
|
paul@113 | 1481 |
|
paul@113 | 1482 | The bandwidth requirements would be the sum of the accesses to read the
|
paul@113 | 1483 | character values (repeatedly) and those to read the bitmap data to reproduce
|
paul@113 | 1484 | the characters on screen.
|
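The scheme without caching might be modelled as follows. This is a sketch only: it assumes that character definitions are stored as 8 consecutive bytes per character starting at a hypothetical `font_base`, and that `read` is any callable returning the byte at an address.

```python
def scanline_reads(row_address, columns, index, font_base, read):
    """Yield the bitmap byte for each column of one display line.

    Each column costs two reads: the character value (C) from screen
    memory, then one line of its definition (B) selected by index 0 to 7.
    """
    for column in range(columns):
        code = read(row_address + column)           # C read
        yield read(font_base + code * 8 + index)    # B read


def character_row(row_address, columns, font_base, read):
    """Visit the same line of character values for eight display lines."""
    return [list(scanline_reads(row_address, columns, index, font_base, read))
            for index in range(8)]
```

Each display line thus costs two reads per column, which is the sum of accesses described above.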
paul@113 | 1485 |
|
Enhancement: MODE 7 Emulation using Character Attributes
--------------------------------------------------------

If the scheme of applying attributes to character regions were employed to
emulate MODE 7, in conjunction with the MODE 6 display technique, the
following configuration would be required:

  Screen width  Columns  Rows  Bytes (B)  Bytes (A)  Colours  Screen start
  ------------  -------  ----  ---------  ---------  -------  ------------
  320           40       25    40         20         16       &5ECC -> &5E00
  320           40       25    40         10         4        &5FC6 -> &5F00

Although this requires much more memory than MODE 7 (8500 bytes versus MODE
7's 1000 bytes), it does not need much more memory than MODE 6, and it would
at least make a limited 40-column multicolour mode available as a substitute
for MODE 7.

Using the text-only enhancement with caching of data or with repeated reads of
the same character data line for eight display lines, the storage requirements
would be diminished substantially:

  Screen width  Columns  Rows  Bytes (C)  Bytes (A)  Colours  Screen start
  ------------  -------  ----  ---------  ---------  -------  ------------
  320           40       25    40         20         16       &7A24 -> &7A00
  320           40       25    40         10         4        &7B1E -> &7B00
  320           40       25    40         5          2        &7B9B -> &7B00
  320           40       25    40         0          (2)      &7C18 -> &7C00
  640           80       25    80         40         16       &7448 -> &7400
  640           80       25    80         20         4        &763C -> &7600
  640           80       25    80         10         2        &7736 -> &7700
  640           80       25    80         0          (2)      &7830 -> &7800

Note that the colours describe the locally defined attributes for each
character. When no attribute information is provided, the colours are defined
globally.

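The storage figures and screen start addresses in tables of this kind follow
from simple arithmetic. The Python sketch below is illustrative only: the
function name and its parameters are invented here, and it assumes that screen
memory ends at &8000 and that the attribute bytes are stored once per
character row.

```python
def text_mode_layout(columns, rows, attr_bytes, lines_per_row=1, top=0x8000):
    """Return (size, start, page-aligned start) for a character-based screen.

    lines_per_row is 8 where bitmap data is stored for every pixel row (the
    MODE 6 technique) and 1 where only character codes are stored.
    """
    size = rows * (columns * lines_per_row + attr_bytes)
    start = top - size
    # Round the start address down to a 256-byte page boundary.
    return size, start, start & 0xFF00


# 40 columns, 16 colours, bitmap data held in screen memory.
size, start, aligned = text_mode_layout(40, 25, 20, lines_per_row=8)
print(size, "&%04X -> &%04X" % (start, aligned))   # 8500 &5ECC -> &5E00

# 80 columns, 16 colours, character codes only.
size, start, aligned = text_mode_layout(80, 25, 40)
print(size, "&%04X -> &%04X" % (start, aligned))   # 3000 &7448 -> &7400
```

Each character row is treated as a block of character or bitmap bytes followed
by its attribute bytes; a different arrangement in memory would give the same
totals but not necessarily the same addressing.
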
Enhancement: Character Generator Support and Vertical Scaling
-------------------------------------------------------------

When generating a picture, the ULA traverses screen memory, obtaining 40 or 80
bytes of pixel data for each scanline. It then proceeds to the next row of
pixel data for each successive scanline, with the exception of the text modes
where scanlines may be blank (for which the row address does not advance).
This arrangement provides a conventional bitmapped graphics display.

However, the ULA could instead facilitate the use of character generators. The
principles involved can be demonstrated by the Jafa Mode 7 Mark 2 Display Unit
expansion for the Electron, which feeds the pixel data from a MODE 4 screen
to an SAA5050 character generator to create a MODE 7 display. The solution
adopted involves the replication of 40 bytes of character data across as many
pixel rows as is necessary for the character generator to receive the
appropriate character data for all scanlines in any given character row. If
only a single 40-byte row of character data were to be present for the first
scanline of a character row, the character generator would only produce the
first scanline (or the uppermost pixels of the characters) correctly, with the
rest of the character shapes being ill-defined.

Here, the ULA could facilitate the use of memory-efficient character mode
representations (such as MODE 7) by holding the row address for a number of
scanlines, thus providing the same row of screen data for those scanlines,
then advancing to the next row. Visualised in terms of pixel data, it would be
like providing a display with a very low vertical resolution. Indeed, being
able to reduce the vertical resolution of a display mode by a factor of eight
or ten would be equivalent to the above character generation technique in
terms of the ULA's screen reading activities.

By combining this vertical scaling or scanline replication with a circuit
switchable between bitmapped graphics output and character graphics output,
MODE 7 support could be made available, potentially as a hardware option
separate from the ULA.

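The effect of holding the row address can be sketched as follows. This is a
hypothetical model rather than a description of any actual ULA logic: the
function name, the hold parameter and the &7C00 start address (chosen to suit
a 1000-byte character screen ending at &8000) are assumptions made purely for
illustration.

```python
def scanline_row_addresses(screen_start, row_bytes, rows, hold):
    """Yield the row start address presented for each successive scanline."""
    for row in range(rows):
        address = screen_start + row * row_bytes
        # The same row of screen data is supplied for 'hold' scanlines.
        for _ in range(hold):
            yield address


addresses = list(scanline_row_addresses(0x7C00, 40, 25, 10))
print(len(addresses))                           # 250
print(["&%04X" % a for a in addresses[8:12]])   # ['&7C00', '&7C00', '&7C28', '&7C28']
```

With a hold of 1 the generator degenerates to the conventional behaviour of
advancing to a new row of data on every scanline.
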
Enhancement: 40-Column Text Modes by Interleaving Screen and Bitmap Accesses
----------------------------------------------------------------------------

Suggested here: https://stardot.org.uk/forums/viewtopic.php?p=393243#p393243

The ULA could be run in high-bandwidth mode to fetch character codes from
screen memory in one cycle and then to use the character code to look up a
pixel row of a character bitmap, reading that bitmap slice in the following
cycle. The bitmap would be converted to pixel values that would then be
emitted over the subsequent two cycles concurrently with the preparation of
the next character's pixels.

  2MHz cycle: 0 1 2 3 4 5 ...
  Reads:      C B C B C B ...
  Pixels:         a   b   ...

The memory access to bitmap data would be computed as follows, assuming the
normal eight pixel height and single-byte encoding of character bitmaps:

  bitmap address = bitmap table base + (character code * 8) + bitmap row

Each successive pixel row on the screen would expose the appropriate row in
the character bitmap, with this "bitmap row" looping from 0 to 7 repeatedly.
Spacing between character lines could be introduced as already done in MODE 6.

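The address computation and the alternating C and B reads can be modelled in a
few lines of Python. This is only a sketch: the function names, the use of a
dictionary to stand in for RAM, and the &3000 table base and &7C00 row address
in the example are all hypothetical choices, not details of any proposed
hardware.

```python
CHAR_HEIGHT = 8  # pixel rows per character, one byte per row


def bitmap_address(table_base, char_code, bitmap_row):
    """Address of one pixel row of a character's bitmap."""
    return table_base + char_code * CHAR_HEIGHT + bitmap_row


def scanline_reads(row_address, memory, table_base, scanline, columns=40):
    """Yield (C address, B address) pairs for one scanline of a text row."""
    bitmap_row = scanline % CHAR_HEIGHT   # loops from 0 to 7 repeatedly
    for column in range(columns):
        c_address = row_address + column
        char_code = memory[c_address]     # the C read
        yield c_address, bitmap_address(table_base, char_code, bitmap_row)


memory = {0x7C00: 0x41, 0x7C01: 0x42}
for c, b in scanline_reads(0x7C00, memory, 0x3000, scanline=11, columns=2):
    print("C &%04X  B &%04X" % (c, b))
# C &7C00  B &320B
# C &7C01  B &3213
```

Blank scanlines between character lines, as in MODE 6, are not modelled here;
they would simply suppress the reads for the scanlines concerned.
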
Enhancement: Compressed Character Data
--------------------------------------

Another observation about text-only modes is that they only need to store a
restricted set of bitmapped data values. Encoding this set of values in a
smaller unit of storage than a byte could possibly help to reduce the amount
of storage and bandwidth required to reproduce the characters on the display.

Enhancement: High Resolution Graphics and Larger Colour Depths
--------------------------------------------------------------

Screen modes with higher resolutions and larger colour depths might be
possible, but this would in most cases involve the allocation of more screen
memory, and the ULA would probably then be obliged to page in such memory for
the CPU to be able to sensibly access it all. Higher resolutions would also
involve a faster pixel clock.

However, we may consider a doubled colour depth, with the higher bandwidth
transfers that this requires being provided by a ULA having an 8-bit data bus
to access the RAM and utilising two "page mode" transfers per 2MHz cycle. If
such transfers were to access consecutive bytes in the same memory region (for
example, bytes &3000 and &3001), this would require a change to the
arrangement of screen memory, also incurring changes to the memory map for
larger modes:

  (&3000 &3001) (&3010 &3011) ...
  (&3002 &3003) (&3012 &3013)
  ...           ...
  (&300E &300F) (&301E &301F)

If such transfers were to access two adjacent columns of bytes (for example,
bytes &3000 and &3008), this would still require a change in the step size
across the screen memory and incur memory map changes for larger modes, and
the method for programs to update the screen would be more complicated:

  (&3000 &3008) (&3010 &3018) ...
  (&3001 &3009) (&3011 &3019)
  ...           ...
  (&3007 &300F) (&3017 &301F)

However, such transfers could instead map the device address bit that is
toggled between transfers to the most significant system memory address bit.
Thus, bits in adjacent locations within each RAM device would actually reside
in different memory regions:

  (&3000 &B000) (&3008 &B008) ...
  (&3001 &B001) (&3009 &B009)
  ...           ...
  (&3007 &B007) (&300F &B00F)

Since &B000 is &3000 combined with &8000 (the latter asserting the uppermost
address bit), address &B000 can be considered as &3000 in an upper memory
bank.

Other mechanisms might be employed to allow programs to access the uppermost
bank, but the ULA would be able to access it trivially and unconditionally.

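The last of these arrangements, where the toggled address bit appears as the
most significant system address bit, can be expressed as a small Python
sketch. The names used are invented for illustration, and the model assumes
the usual eight-byte character cell layout of the existing screen modes.

```python
BANK_BIT = 0x8000  # most significant bit of a 16-bit system address


def paired_addresses(address):
    """Return the two system addresses fetched together in one 2MHz cycle."""
    lower = address & 0x7FFF
    return lower, lower | BANK_BIT


def character_cell(base, column):
    """Address pairs for the eight pixel rows of one character cell."""
    start = base + column * 8
    return [paired_addresses(start + row) for row in range(8)]


for lower, upper in character_cell(0x3000, 1)[:2]:
    print("(&%04X &%04X)" % (lower, upper))
# (&3008 &B008)
# (&3009 &B009)
```

Since the lower bank keeps its existing layout, programs written for the
current screen modes would still find their data where they expect it, with
the additional colour depth information residing in the upper bank.
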
Enhancement: Genlock Support
----------------------------

The ULA generates a video signal in conjunction with circuitry producing the
output features necessary for the correct display of the screen image.
However, it appears that the ULA drives the video synchronisation mechanism
instead of reacting to an existing signal. Genlock support might be possible
if the ULA were made to be responsive to such external signals, resetting its
address generators upon receiving synchronisation events.

Enhancement: Improved Sound
---------------------------

The standard ULA reserves &FE*6 for sound generation and cassette input/output
(with bits 1 and 2 of &FE*7 being used to select either sound generation or
cassette I/O), thus making it impossible to support multiple channels within
the given framework. The BBC Micro employs &FE40-&FE4F (the system VIA) for
sound control, and an enhanced ULA could adopt this interface.

The BBC Micro uses the SN76489 chip to produce sound, and the entire
functionality of this chip could be emulated for enhanced sound, with a subset
of the functionality exposed via the &FE*6 interface.

See: http://en.wikipedia.org/wiki/Texas_Instruments_SN76489
See: http://www.smspower.org/Development/SN76489

Enhancement: Waveform Upload
----------------------------

As with a hardware sprite function, waveforms could be uploaded or referenced
using locations as registers referencing memory regions.

Enhancement: Sound Input/Output
-------------------------------

Since the ULA already controls audio input/output for cassette-based data, it
would have been interesting to entertain the idea of sampling and output of
sounds through the cassette interface. However, a significant amount of
circuitry is employed to process the input signal for use by the ULA and to
process the output signal for recording.

See: http://bbc.nvg.org/doc/A%20Hardware%20Guide%20for%20the%20BBC%20Microcomputer/bbc_hw_03.htm#3.11

Enhancement: BBC ULA Compatibility
----------------------------------

Although some new ULA functions could be defined in a way that is also
compatible with the BBC Micro, the BBC ULA is itself incompatible with the
Electron ULA: in the BBC memory map, &FE00-7 is reserved for the video
controller and controls various functions specific to the 6845, whereas
&FE08-F is reserved for the serial controller. It therefore becomes possible
to disregard compatibility where compatibility is already disregarded for a
particular area of functionality.

&FE20-F maps to video ULA functionality on the BBC Micro which provides
control over the palette (using address &FE21, compared to &FE08-F on the
Electron) and other system-specific functions. Since the location usage is
generally incompatible, this region could be reused for other purposes.

Enhancement: Increased RAM, ULA and CPU Performance
---------------------------------------------------

More modern implementations of the hardware might feature faster RAM coupled
with an increased ULA clock frequency in order to increase the bandwidth
available to the ULA and to the CPU in situations where the ULA is not needed
to perform work. A ULA employing a 32MHz clock would be able to complete the
retrieval of a byte from RAM in only 250ns and thus be able to enable the CPU
to access the RAM for the following 250ns even in display modes requiring the
retrieval of a byte for the display every 500ns. The CPU could, subject to
timing issues, run at 2MHz even in MODEs 0, 1 and 2.

A scheme such as that described above would have a similar effect to the
scheme employed in the BBC Micro, although the latter made use of RAM with a
higher bandwidth in order to complete memory transfers within 250ns and thus
permit the CPU to run continuously at 2MHz.

Higher bandwidth could potentially be used to implement exotic features such
as RAM-resident hardware sprites or indeed any feature demanding RAM access
concurrent with the production of the display image.

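The timing argument can be checked with some simple arithmetic. The sketch
below rests on an assumption that is not stated explicitly in this document,
namely that fetching a byte occupies a fixed eight ULA clock periods; this is
inferred from a 16MHz ULA needing a whole 2MHz cycle (500ns) to obtain one
byte. The names are invented for illustration.

```python
DISPLAY_SLOT_NS = 500   # one display byte is needed every 2MHz cycle
CLOCKS_PER_BYTE = 8     # assumed: ULA clock periods needed to fetch a byte


def byte_fetch_ns(ula_clock_mhz):
    """Time taken to fetch one byte from RAM, in nanoseconds."""
    return CLOCKS_PER_BYTE * 1000 // ula_clock_mhz


def cpu_window_ns(ula_clock_mhz):
    """Time left in each display slot for a CPU access to RAM."""
    return DISPLAY_SLOT_NS - byte_fetch_ns(ula_clock_mhz)


for mhz in (16, 32):
    print(mhz, byte_fetch_ns(mhz), cpu_window_ns(mhz))
# 16 500 0
# 32 250 250
```

Whether the RAM itself could sustain transfers at the faster rate is a
separate matter that this calculation does not address.
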
Enhancement: Multiple CPU Stacks and Zero Pages
-----------------------------------------------

The 6502 maintains a stack for subroutine calls and register storage in page
&01. Although the stack register can be manipulated using the TSX and TXS
instructions, thereby permitting the maintenance of multiple stack regions and
thus the potential coexistence of multiple programs each using a separate
region, only programs that make little use of the stack (perhaps avoiding
deeply-nested subroutine invocations and significant register storage) would
be able to coexist without overwriting each other's stacks.

One way that this issue could be alleviated would involve the provision of a
facility to redirect accesses to page &01 to other areas of memory. The ULA
would provide a register that defines a physical page for the use of the CPU's
"logical" page &01, and upon any access to page &01 by the CPU, the ULA would
change the asserted address lines to redirect the access to the appropriate
physical region.

By providing an 8-bit register, mapping to the most significant byte (MSB) of
a 16-bit address, the ULA could then replace any MSB equal to &01 with the
register value before the access is made. Where multiple programs coexist,
upon switching programs, the register would be updated to point the ULA to the
appropriate stack location, thus providing a simple memory management unit
(MMU) capability.

In a similar fashion, zero page accesses could also be redirected so that code
could run from sideways RAM and have zero page operations redirected to "upper
memory" - for example, to page &BE (with stack accesses redirected to page
&BF, perhaps) - thereby permitting most CPU operations to occur without
inadvertent accesses to "lower memory" (the RAM) which would risk stalling the
CPU as it contends with the ULA for memory access.

Such facilities could also be provided by a separate circuit between the CPU
and ULA in a fashion similar to that employed by a "turbo" board, but unlike
such boards, no additional RAM would be provided: all memory accesses would
occur as normal through the ULA, albeit redirected when configured
appropriately.

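The redirection described above amounts to substituting the most significant
byte of certain addresses. A minimal Python model is given below; the class
and its attribute names are hypothetical, and no particular register locations
are implied.

```python
class PageRedirector:
    """Model of MSB substitution for zero page and stack page accesses."""

    def __init__(self, zero_page=0x00, stack_page=0x01):
        # Logical page -> physical page; the defaults leave accesses unchanged.
        self.pages = {0x00: zero_page, 0x01: stack_page}

    def translate(self, address):
        """Return the physical address for a 16-bit CPU address."""
        msb, lsb = address >> 8, address & 0xFF
        return (self.pages.get(msb, msb) << 8) | lsb


redirector = PageRedirector(zero_page=0xBE, stack_page=0xBF)
print("&%04X" % redirector.translate(0x0070))   # &BE70
print("&%04X" % redirector.translate(0x01FF))   # &BFFF
print("&%04X" % redirector.translate(0x3000))   # &3000
```

Switching between programs would then involve little more than updating the
two page values, with all other addresses passing through unaltered.
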
ULA Pin Functions
-----------------

The functions of the ULA pins are described in the Electron Service Manual. Of
interest to video processing are the following:

CSYNC (low during horizontal or vertical synchronisation periods, high
otherwise)

HS (low during horizontal synchronisation periods, high otherwise)

RED, GREEN, BLUE (pixel colour outputs)

CLOCK IN (a 16MHz clock input, 4V peak to peak)

PHI OUT (a 1MHz, 2MHz or stopped clock signal for the CPU)

More general memory access pins:

RAM0...RAM3 (data lines to/from the RAM)

RA0...RA7 (address lines for sending both row and column addresses to the RAM)

RAS (row address strobe setting the row address on a negative edge - see the
timing notes)

CAS (column address strobe setting the column address on a negative edge -
see the timing notes)

WE (sets write enable with logic 0, read with logic 1)

ROM (select data access from ROM)

CPU-oriented memory access pins:

A0...A15 (CPU address lines)

PD0...PD7 (CPU data lines)

R/W (indicates CPU write with logic 0, CPU read with logic 1)

Interrupt-related pins:

NMI (CPU request for uninterrupted 1MHz access to memory)

IRQ (signal event to CPU)

POR (power-on reset, resetting the ULA on a positive edge and asserting the
CPU's RST pin)

RST (master reset for the CPU signalled on power-up and by the Break key)

Keyboard-related pins:

KBD0...KBD3 (keyboard inputs)

CAPS LOCK (control status LED)

Sound-related pins:

SOUND O/P (sound output using internal oscillator)

Cassette-related pins:

CAS IN (cassette circuit input, between 0.5V and 2V peak to peak)

CAS OUT (pseudo-sinusoidal output, 1.8V peak to peak)

CAS RC (detect high tone)

CAS MO (motor relay output)

÷13 IN (~1200 baud clock input)

ULA Socket
----------

The socket used for the ULA is a 3M/TexTool 268-5400 68-pin socket.

References
----------

See: http://bbc.nvg.org/doc/A%20Hardware%20Guide%20for%20the%20BBC%20Microcomputer/bbc_hw.htm

About this Document
-------------------

The most recent version of this document and accompanying distribution should
be available from the following location:

http://hgweb.boddie.org.uk/ULA

Copyright and licence information can be found in the docs directory of this
distribution - see docs/COPYING.txt for more information.