Re: NVIDIA 8800 integer performance



"Phil Carmody" <thefatphil_demunged@xxxxxxxxxxx> wrote in message news:873b3n84os.fsf@xxxxxxxxxxxxxxxxxxxxxxx
> My money is on the 32 bit multiply being implemented in microcode
> as 4 16*16->32 multiplies, as the [u]mul24 (24*24->low32)
> instructions take only 2 clock ticks.

But to do a 32-bit multiply as four 16-bit multiplies, you'd also have to do at least two 32-bit additions, which also cost 2 cycles each, so it would be 12 cycles total (4*2 for the multiplies plus 2*2 for the additions). Or are you assuming that the ALUs also have an unpublished 2-cycle 16-bit multiply-and-add instruction?
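To make the operation count concrete, here is a minimal C sketch of a 32-bit multiply built from 16*16->32 partial products. The function name is made up for illustration, and this says nothing about what the 8800 hardware or its microcode actually does. One thing the sketch brings out: if only the low 32 bits are wanted, the high*high partial product lands entirely at bit 32 and above, so three multiplies are enough; the fourth is only needed for a full 64-bit result.

```c
#include <stdint.h>

/* Hypothetical helper: low 32 bits of a*b using only 16x16->32 multiplies.
 * With a = ah*2^16 + al and b = bh*2^16 + bl:
 *   a*b = ah*bh*2^32 + (ah*bl + al*bh)*2^16 + al*bl
 * The ah*bh term does not touch the low 32 bits, so it is omitted. */
uint32_t mul32_lo_from_16(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;

    uint32_t lo    = al * bl;            /* multiply 1 */
    uint32_t cross = ah * bl + al * bh;  /* multiplies 2 and 3, addition 1 */

    /* Only the low 16 bits of cross survive the shift; unsigned
     * wraparound in the sums is well defined in C. */
    return lo + (cross << 16);           /* shift, addition 2 */
}
```

Counted this way it is three multiplies and two additions, plus the masks and shifts. At the 2 cycles per operation quoted in this thread that would be 10 cycles before the masks and shifts are counted, and their cost is not given here.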

BTW, what is the story behind the stream of nonsensical messages being posted constantly to this group with a certain three-letter subject prefix? Some kind of denial-of-service attack against the newsgroup? An out-of-control AI experiment?
