The Faster Computer Part 2.0
After a lot of consideration, I came up with a draft of a real
performance PC. The computer I have drafted isn't much different
from current designs; it just applies them a little differently from
the norm. Imagine a computer that didn't have to waste clock cycles
swapping out memory blocks in its L2 cache. Or, for that matter, didn't
need a cache at ALL. What if the front-side bus could be dedicated to
interfacing with all the cool gadgets in your PC, and not bogged down
by the memory? Imagine a 200MHz computer that could blow away the
fastest and greatest 1GHz machine available.
Current computers use the CPU's front-side bus (FSB) for EVERYTHING!
As a result, current computers have a huge bottleneck in the width and
speed of the FSB. They limit the throughput of the CPU to how fast it
can get data through its FSB: to RAM, hard disk, graphics card, etc.
What about the peripheral interfaces? They're nowhere close to what
the CPUs can do. Not much can be done about the standard interfaces
like ISA, PCI, etc. It is up to big corporations to agree on standards
and apply them together. As for the other stuff, like hard drives,
well, there is only so fast you can spin a steel disk until you run
into problems with overheating bearings, platter tensile strength, and
shorter product lifespan. In my opinion, HDD manufacturers have done
amazing things with hard drives and their transfer rates.
What's left to improve? The motherboard and the FSB. Computer
manufacturers have had no problem inventing all kinds of CPU sockets,
slots, cards, etc. Well, let's pick at the problems with the FSB and
try to fix them.
The manufacturers have made a big fuss about the backside L2 cache
bus. Well, what if that were RAM, and not just another cache layered in
front of memory? Most of the traffic that goes across the FSB is
fetching instructions and data from memory to process. If the memory
were on its own backside bus, the FSB could be more responsive to all
the computer's peripherals. The CPU would have a more direct link to
the system memory. The system would definitely improve in performance.
The catch-22: peripherals and ports use memory addresses to function,
so the chipset would have to trap these accesses and route them
directly to the L2-RAM. Well, current PCs do this already. The
graphics port is not a standard VGA card, so the chipset traps signals
for a nonexistent CGA/EGA/VGA card and routes them to the graphics
port. We could expand this option to work with all base memory access.
It could definitely work, and it's not too difficult to implement;
we'll just rename the backside L2-Cache bus the backside L2-RAM bus.
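
The trapping idea above can be sketched as a small address decoder.
This is only a toy model under my own assumptions: the table layout,
the function name `route_access`, and the target names are made up for
illustration, and it ignores ROM and every other region a real chipset
has to decode. The 640 kB base memory and the legacy VGA window at
A0000h-BFFFFh are the only ranges taken from the standard PC memory map.

```python
# Hypothetical address map: (first address, last address, destination).
ADDRESS_MAP = [
    (0x00000, 0x9FFFF, "l2_ram"),         # 640 kB of base memory
    (0xA0000, 0xBFFFF, "graphics_port"),  # legacy VGA memory window
]


def route_access(address, default="l2_ram"):
    """Return the destination a trapped memory access is forwarded to.

    Anything not listed in ADDRESS_MAP falls through to `default`,
    which is a simplification for this sketch.
    """
    for first, last, target in ADDRESS_MAP:
        if first <= address <= last:
            return target
    return default


if __name__ == "__main__":
    for addr in (0x01000, 0xA0000, 0xB8000, 0x100000):
        print(hex(addr), "->", route_access(addr))
```

Expanding the trap to "all base memory access" then amounts to adding
rows to the table rather than changing the decoder itself.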
Wait a minute, why stop there? We could change the CPU's cache to
actual RAM, like L1-RAM: on the chip, extremely fast, and there goes
all the memory load from cache refreshing. Almost all Windows computers
have a basic 640kB of base memory, some without any memory DIMMs
installed. Let's go further: most "E" machines are sold with more than
32 megs of memory, so let's make the L1-RAM 64 megs, or 128 megs. Then
you expand by adding memory to the L2-RAM slots. That works quite well.
What would this look like?
This CPU and motherboard have all the buses of current PCs. They're
just configured a little bit differently. Granted, the ALU, execution
handler, and registers are not functional; it's more for demonstration.
The current CPUs have very good cores in them; we're just picking at
the deficiencies of the rest of the system.
This CPU would use the FSB controller (CTRLR) to access RAM. I
mentioned that this would free up the FSB; it does. The CPU is the
computer's primary router of data. When a program transfers data from
RAM to something else, it goes through the CPU. That means one clock
cycle to read one byte of data from memory, and one clock cycle (at
least) to send the data to something else. The reverse order applies to
get data from something else and put it in RAM. The backside RAM system
reduces this operation to as little as one clock cycle. Thus, the FSB
merely interfaces with something else, and not the RAM. The FSB
controller is the CPU's interface engine to the rest of the computer,
so it does not matter where the RAM is accessed from, chipset or CPU;
the RAM access happens just as fast either way. The chipset route does
take up precious bandwidth on the FSB from everything else. So using
the on-chip FSB controller is a very good option; besides, it allows
some compatibility trapping to be implemented.
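
To put rough numbers on the cycle counting above, here is a minimal
sketch. It assumes one byte moves per clock cycle, as in the paragraph,
and that a backside RAM read can overlap with an FSB send. The function
names are hypothetical, and real buses add latency that this ignores.

```python
def shared_fsb_cycles(num_bytes):
    """Cycles to copy RAM -> device when RAM also sits on the FSB.

    Every byte crosses the FSB twice: once from RAM to the CPU and
    once from the CPU to the device.
    """
    return 2 * num_bytes


def backside_ram_cycles(num_bytes):
    """Cycles to copy RAM -> device when RAM has its own backside bus.

    Byte i is sent over the FSB while byte i + 1 is read from RAM, so
    only the very first read is not overlapped with a send.
    """
    if num_bytes == 0:
        return 0
    return num_bytes + 1


if __name__ == "__main__":
    for n in (1, 16, 1024):
        print(n, "bytes:", shared_fsb_cycles(n), "vs", backside_ram_cycles(n))
```

For long transfers the backside figure approaches one cycle per byte,
and the FSB itself is only busy for the sends.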
So when this computer boots up, it merely uses the FSB to read data
from the mass-storage controller (hard drive), and the CPU's FSB
controller puts the data into memory. There is no need for cache here;
the execution system can get data from the L1-RAM just as fast as it
needs it, without blocking everything else from talking to the CPU.
The L2-RAM has an extremely wide interface, so running at a mere 50MHz,
the L2-RAM can easily keep pace with a 200MHz CPU.
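
How wide is "extremely wide"? A quick back-of-the-envelope sketch,
assuming the CPU consumes a fixed number of bytes per clock (one byte
by default, matching the byte-per-cycle figure used earlier). The
function name and parameters are my own, for illustration only.

```python
def required_width_bytes(cpu_mhz, ram_mhz, cpu_bytes_per_clock=1):
    """Bytes the RAM bus must move per RAM clock to match the CPU.

    Demand is cpu_mhz * cpu_bytes_per_clock (in MB/s); the result is
    that demand divided by the RAM clock, rounded up to whole bytes.
    """
    demand = cpu_mhz * cpu_bytes_per_clock
    return -(-demand // ram_mhz)  # integer ceiling division


if __name__ == "__main__":
    print(required_width_bytes(200, 50))     # byte-wide CPU path
    print(required_width_bytes(200, 50, 4))  # 32-bit CPU path
```

Under these assumptions a 50MHz L2-RAM bus has to be four times as wide
as the CPU's own data path to keep up with a 200MHz core.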
The interface is extremely wide... Notice that all the lines in the
core system are doubled: one path for send and another for receive.
That means I can send and receive data simultaneously, at double the
throughput of the clock cycle. Well, that means a 200MHz computer
moving one byte per clock in each direction can transfer 400MB/s.
There is a paradox here: how can one CPU handle twice the data
throughput of its core system? Because it is sending and receiving
simultaneously. This removes the need to temporarily store data
between read and write cycles. You see, as an operation is completed it
sends the data to where it needs to go, and at the same time the CPU is
getting the next instruction and data. Granted, not all executions will
require a send of data; the sending of data does not stop the CPU from
getting data to
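
The 400MB/s figure comes from simple arithmetic, sketched below. It
assumes one byte per clock on each path. The `send_fraction` parameter
is my own addition to illustrate the point that not every operation
produces data to send, and both function names are hypothetical.

```python
def peak_throughput_mb_s(clock_mhz, bytes_per_clock=1, full_duplex=True):
    """Peak transfer rate in MB/s with separate send and receive paths."""
    paths = 2 if full_duplex else 1
    return clock_mhz * bytes_per_clock * paths


def effective_throughput_mb_s(clock_mhz, send_fraction, bytes_per_clock=1):
    """Transfer rate when only some operations send a result.

    The receive path is assumed busy every clock; the send path is
    busy only for `send_fraction` of the clocks (0.0 to 1.0).
    """
    return clock_mhz * bytes_per_clock * (1 + send_fraction)


if __name__ == "__main__":
    print(peak_throughput_mb_s(200))
    print(peak_throughput_mb_s(200, full_duplex=False))
    print(effective_throughput_mb_s(200, 0.25))
```

So 400MB/s is the ceiling; the real figure sits somewhere between 200
and 400 depending on how often results are sent back out.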
This design presents RAM manufacturers with two possible standards for
producing memory modules. One is a simple single read/write bus
operating at the CPU's clock speed. The other is a more complex set of
multi-write/multi-read buses operating at a fraction of the CPU's
clock speed. The former is what I believe manufacturers will be more
willing to produce. The latter requires built-in protection from
writing to the same address simultaneously, and a form of
parallel-to-serial bus caching.
Multi-wide RAM module.
Single-wide RAM module.
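
The write protection a multi-wide module would need can be sketched as
a tiny arbiter. This is an illustration under my own assumptions, not a
real module design: the "lowest port number wins" policy, the function
name `arbitrate_writes`, and the request format are all made up here.

```python
def arbitrate_writes(memory, requests):
    """Apply one cycle of writes, allowing one writer per address.

    `requests` is a list of (port, address, value) tuples issued in
    the same cycle. The lowest-numbered port wins a conflict; the
    losing requests are returned so they can be retried next cycle.
    """
    claimed = set()
    stalled = []
    for port, address, value in sorted(requests, key=lambda r: r[0]):
        if address in claimed:
            stalled.append((port, address, value))
        else:
            claimed.add(address)
            memory[address] = value
    return stalled


if __name__ == "__main__":
    ram = {}
    retry = arbitrate_writes(ram, [(1, 0x10, 7), (0, 0x10, 5), (2, 0x20, 9)])
    print(ram, retry)
```

A single-wide module never sees two writes in the same cycle, so it
needs none of this logic.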
Part of the reason the single-wide module is easier to implement is
that there is no need to protect against multiple writes to the same
address. Reading from an address that is being written to, should this
happen, can be sped up by forwarding the data back out of the control
logic, rather than waiting for the write cycle to finish before reading
the data from the DRAM. Any other cell can be read at the same time
writes are happening.
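
The forwarding trick can be modeled in a few lines. This is a minimal
sketch assuming a single write in flight at a time; the class and
method names are hypothetical, and a dictionary stands in for the DRAM
cells.

```python
class SingleWideModule:
    """Toy model of a single-wide module with read forwarding."""

    def __init__(self):
        self.cells = {}      # stands in for the DRAM array
        self.pending = None  # (address, value) of the write in flight

    def start_write(self, address, value):
        """Latch a write in the control logic; DRAM is not updated yet."""
        self.pending = (address, value)

    def finish_write(self):
        """Complete the write cycle by committing the latched data."""
        if self.pending is not None:
            address, value = self.pending
            self.cells[address] = value
            self.pending = None

    def read(self, address):
        """Read a cell, forwarding latched data if it targets `address`."""
        if self.pending is not None and self.pending[0] == address:
            return self.pending[1]  # forwarded from the control logic
        return self.cells.get(address, 0)


if __name__ == "__main__":
    module = SingleWideModule()
    module.start_write(0x10, 42)
    print(module.read(0x10))  # served before the DRAM write completes
    module.finish_write()
```

Reads of any other address go straight to the cells, which matches the
point that unrelated cells can be read while a write is in progress.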
My email (copy and paste): email@example.com
or find me on Yahoo Messenger under the status "coffee is good".