Re: Salsa20 altivec timings

tomstdenis_at_gmail.com
Date: 09/28/05


Date: 27 Sep 2005 16:55:27 -0700


xmath wrote:
> It's interesting to note that, due to unavoidable data-dependencies, a
> Salsa20 round must take at least 12 cycles, as absolute CPU-independent
> minimum (unless it offers combined add-rotate, rotate-xor, or xor-add
> instructions that execute in one cycle, something I've never seen in
> any CPU).

ARM offers add-rotate but not the others [it can xor-rotate but not
rotate-xor].
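
For anyone following along, here is roughly where the 12 comes from, as a minimal C sketch rather than anyone's tuned code. The names rotl32 and salsa20_quarterround are just mine for illustration. Each quarter-round is four add, rotate, xor steps, and every step needs the result of the one before it, so the chain is 4 x 3 = 12 dependent operations unless the CPU can fuse some of them into a single one-cycle instruction.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return (v << n) | (v >> (32 - n));
}

/*
 * One Salsa20 quarter-round on x[0..3], updated in place.
 * Illustrative helper name, not taken from any particular library.
 * Every line consumes the word written by the line above it, so the
 * add -> rotate -> xor triples cannot overlap within one quarter-round.
 */
static void salsa20_quarterround(uint32_t x[4])
{
    x[1] ^= rotl32(x[0] + x[3], 7);   /* dependent ops 1-3   */
    x[2] ^= rotl32(x[1] + x[0], 9);   /* dependent ops 4-6   */
    x[3] ^= rotl32(x[2] + x[1], 13);  /* dependent ops 7-9   */
    x[0] ^= rotl32(x[3] + x[2], 18);  /* dependent ops 10-12 */
}
```

Feeding it x = {1, 0, 0, 0} leaves x = {0x08008145, 0x00000080, 0x00010200, 0x20500000}. The four quarter-rounds of a round touch disjoint words, so they can run side by side, but that only helps throughput; the 12-deep chain inside each one stays.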

Still not bad cycle counts. Cool!

Tom


