Bit manipulation instructions

Hardware-level Bit manipulation instructions From Wikipedia, the free encyclopedia

Bit manipulation instruction sets perform bit manipulation in hardware, as single instructions, rather than as the sequences of several instructions that a software implementation requires.[1] Several leading and historic architectures have bit manipulation instructions, including ARM, WDC 65C02, the TX-2 and the Power ISA.[2]
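As a minimal sketch of the software side of that comparison, the C function below counts set bits with a loop. The function name is hypothetical and the loop is one common technique (clearing the lowest set bit each iteration), not a method taken from the cited sources. A CPU with a population count instruction can typically produce the same result with a single instruction.

```c
#include <stdint.h>

/* Software population count (illustrative name, not from the article).
   Each iteration clears the lowest set bit, so the loop runs once
   per 1 bit in x. */
unsigned popcount32_sw(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

For example, `popcount32_sw(0xF0F0u)` walks through eight iterations, one per set bit.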

Bit manipulation instructions are usually divided into subsets, because individual instructions can be costly to implement in hardware when the target application does not justify them. Conversely, if an instruction is justified, performance may suffer when it is excluded. Carrying out the cost-benefit analysis is a complex task: one of the most comprehensive efforts in bit manipulation was a collaboration headed by Claire Wolf, providing justifications, use-cases, C code, proofs and Verilog for each proposed instruction.[3][4]

Practical examples include bit banging of GPIO using a low-cost embedded controller such as the WDC 65C02, 8051 or Microchip PIC. At the slow clock rates of these CPUs, if bit set, clear and test instructions were not available, the low-cost CPU would not be viable for the target application.
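The bit set, clear and test operations mentioned above can be sketched in C as follows. This is an illustration only: `gpio_port`, `DATA_PIN`, `CLOCK_PIN` and all function names are hypothetical, and the "port" is an ordinary variable standing in for a memory-mapped register. On a microcontroller with dedicated bit instructions, each helper would usually compile to a single instruction instead of a read-modify-write sequence.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped 8-bit GPIO port. */
volatile uint8_t gpio_port = 0;

/* Hypothetical pin assignments for this sketch. */
enum { DATA_PIN = 0, CLOCK_PIN = 1 };

void bit_set(volatile uint8_t *reg, unsigned bit)
{
    *reg |= (uint8_t)(1u << bit);
}

void bit_clear(volatile uint8_t *reg, unsigned bit)
{
    *reg &= (uint8_t)~(1u << bit);
}

bool bit_test(const volatile uint8_t *reg, unsigned bit)
{
    return ((*reg >> bit) & 1u) != 0;
}

/* Bit-bang one byte, most significant bit first: place each bit on
   the data pin, then pulse the clock pin high and low. */
void bitbang_send_byte(uint8_t value)
{
    for (int i = 7; i >= 0; i--) {
        if ((value >> i) & 1u)
            bit_set(&gpio_port, DATA_PIN);
        else
            bit_clear(&gpio_port, DATA_PIN);
        bit_set(&gpio_port, CLOCK_PIN);
        bit_clear(&gpio_port, CLOCK_PIN);
    }
}
```

Every transmitted bit costs three port updates here, which is why single-instruction bit set and clear matter at low clock rates.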

Note:

In something of a Wikipedia fourth-wall break: GPUs and other highly specialised tasks such as cryptography tend to result in extremely specialised instructions, without which performance would be poor. Examples include AES instruction set extensions, which cannot be used for any other purpose. GPUs such as Larrabee[5] and Nyuzi attempted to "dial back" this practice to some extent, only to discover why it is done: performance is poor otherwise.

This page is not about such specialised instructions, nor about their functionality. It categorises the general-purpose bit manipulation instructions found in CPUs and CPU families that happen to greatly improve the performance or power consumption of specific algorithms. An example is rotate: cryptography makes heavy use of it, but it has many other practical uses elsewhere.

If you encounter any type of bit manipulation instruction, or any CPU that has one, feel free to add it below, bearing in mind that the page's primary purpose is categorisation, not explicit functional description. A helpful task for future readers would be to add pages describing the functionality to the "See also" section. Enjoy the end of the Fourth Wall...

Hardware bit manipulation

All the architectures below have instruction subsets and groups where the bit manipulation is provided in hardware.

Intel and AMD (x86)

Power ISA

Power ISA has a large range of bit manipulation instructions,[6] largely due to its history and relationship with IBM mainframes and the z/Architecture:

  • Popcount
  • Count leading zeros
  • Bit-extract and bit-deposit
  • 8x8 bit permute (bpermd)
  • 8-bit bitwise ternary logic instruction xxeval,[7] similar to AVX-512
  • SWAR-style 8, 16 and 32-bit parity instructions
  • Bit-matrix multiply and transpose, which are otherwise computationally very expensive
  • Strategic instructions for accelerating packed BCD
  • Power ISA v3.1 also introduced a number of additional bit manipulation instructions, including swapping the order of bytes within half-words, words, and the whole 64-bit register
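To illustrate why bit-extract and bit-deposit benefit from hardware support, here is a minimal software sketch in C. It assumes the commonly used right-justified definition (extract gathers the masked bits into the low end of the result, deposit scatters low bits out to the mask positions). The function names are hypothetical, and the exact operand conventions of any particular ISA should be checked against its manual.

```c
#include <stdint.h>

/* Gather the bits of src selected by mask into the low bits of the
   result, preserving their order (illustrative, not ISA-exact). */
uint64_t bit_extract64(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    uint64_t out_bit = 1;
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1); /* lowest set bit of mask */
        if (src & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1;                     /* drop that mask bit */
    }
    return result;
}

/* Scatter the low bits of src to the positions selected by mask. */
uint64_t bit_deposit64(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    uint64_t in_bit = 1;
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1);
        if (src & in_bit)
            result |= lowest;
        in_bit <<= 1;
        mask &= mask - 1;
    }
    return result;
}
```

Both loops run once per set bit in the mask, so a full 64-bit mask costs 64 iterations in software.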

IBM System/360 and successors

Some clarification is needed for this section: IBM z/Architecture has scalar bit manipulation instructions and, up to the 11th edition, also had vector processor bit manipulation instructions. The scalar instructions of modern z/Architecture machines, starting at the z13, are the same scalar bit manipulation instructions as those of the IBM 3090 and ESA/390 (only the IBM/370 Vector facility was dropped by the 11th revision of z/Architecture).

Previous revisions of z/Architecture had an optional Vector processor facility that came from the IBM 3090. However, it was dropped as of the z13, by the 15th edition, which has packed SIMD instructions instead.[8] IBM 370 also had the same vector instructions.

z/Architecture Scalar

These instructions are part of the 11th edition[9] z/Architecture, and are also present in IBM 370 and ESA/390:

  • and-complement and others,
  • bit-extract and deposit,[10]
  • popcount,[11]
  • count leading and trailing zeros,[12]
  • a range of bit, byte and masked insert instructions,[13]
  • comprehensive rotate and insert instructions including masked rotate-and-OR,[14] and shift,[15]
  • comprehensive packed BCD,[16]
  • memory-based test-and-set and various masked-test set/clear bit operations, which move or copy a single bit into Condition Codes.[17]
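As a rough software equivalent of the count leading and trailing zeros entries above, the sketch below uses simple loops in C. The function names are hypothetical, and returning 64 for a zero input is an assumption of this sketch; real instruction sets differ in how they define that case.

```c
#include <stdint.h>

/* Count zero bits above the most significant 1 bit.
   Returns 64 for x == 0 (an assumption of this sketch). */
unsigned clz64_sw(uint64_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 64;
    while ((x & (UINT64_C(1) << 63)) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Count zero bits below the least significant 1 bit.
   Returns 64 for x == 0 (an assumption of this sketch). */
unsigned ctz64_sw(uint64_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 64;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}
```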

IBM 3090 Vector

  • Vector count leading zeros vclz, count trailing zeros vctz[18] and population count vpopct[19]
  • Vector test under mask vtm[20]: sets a Condition Code by testing all elements of one register against a second vector used as a mask. The code indicates whether the masked bits are all zero, all one, or a mix of both.
  • Vector GF(2) multiply and multiply-accumulate, vgfm,[21] also known as carry-less multiply
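Carry-less (GF(2) polynomial) multiplication can be sketched for a single scalar pair in C as below. It follows the shift-and-add structure of ordinary multiplication, with XOR in place of addition so that no carries propagate. The function name and the 32-bit operand width are choices made for this illustration, not a description of vgfm itself.

```c
#include <stdint.h>

/* 32 x 32 -> 64 bit carry-less multiply: treats a and b as
   polynomials over GF(2) and XORs shifted copies of a together. */
uint64_t clmul32(uint32_t a, uint32_t b)
{
    uint64_t result = 0;
    for (unsigned i = 0; i < 32; i++) {
        if ((b >> i) & 1u)
            result ^= (uint64_t)a << i;
    }
    return result;
}
```

For example, `clmul32(3, 3)` gives 5 rather than 9, because (x + 1)(x + 1) = x^2 + 1 when coefficients are added modulo 2.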

ARM

  • ARM11 has bitwise test-AND (a bitmasked test) and test-XOR; standard logical bitwise operations including OR-complement; byte, halfword and bit reversing; and conditional byte selection/merging. Shift and rotate are available on Operand2.[22]
  • ARM Cortex-A has bit-field set, clear, extract and reverse.[23]
  • ARM A64 has SWAR-style half-word byte-swapping, bit-field insert and extract, and bit-reversing.[24]
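The bit-reversing and SWAR-style byte-swapping operations listed above can be sketched in portable C as follows. The function names are hypothetical, and the swap-by-halves approach shown is one well-known software technique rather than a description of how any ARM core implements these instructions.

```c
#include <stdint.h>

/* Reverse all 32 bits by swapping progressively larger groups:
   adjacent bits, pairs, nibbles, bytes, then halfwords. */
uint32_t bit_reverse32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    x = (x >> 16) | (x << 16);
    return x;
}

/* SWAR-style swap of the two bytes inside each 16-bit halfword. */
uint32_t byte_swap_halfwords32(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
}
```

The five mask-and-shift steps of `bit_reverse32` are what a single hardware bit-reverse instruction replaces.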

RISC-V

In its standard extensions, RISC-V has scalar bitwise operations including logical and arithmetic shift, but no rotate. These omissions are compensated for by additional extensions.

  • RISC-V Zb* extensions contain a significant number of bit manipulation instructions.[25] The four groups are broken down into useful categories (the integer subset has min/max, rotate and popcount, for example), and have well-researched justifications for their inclusion and the improvements they bring.[26]
  • The RISC-V Vector Extension (RVV) has instructions that qualify as hardware-level bit manipulation, but on vector masks rather than scalar registers as is normally the case. For example, a vector-mask popcount is available.[27] RVV also has per-element bitwise operations.[28]
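Where no rotate instruction is available, a rotate is typically built from two shifts and an OR. The C sketch below shows that construction. The function names are hypothetical, and masking the shift amounts is a choice made here to avoid undefined behaviour in C when the rotate count is 0 or a multiple of 32.

```c
#include <stdint.h>

/* Rotate left by n bits using two shifts and an OR. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Rotate right by n bits using two shifts and an OR. */
uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

Many compilers recognise this pattern and emit a single rotate instruction when the target has one.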

Embedded Microcontrollers

Intel

  • The 8086 has TEST, as well as bitwise operations.[29]
  • The 8051 has SETB, CLR and CPL (set, clear and invert bit instructions), and a considerable percentage of its instructions are bit manipulation.[30] Also included are OR-complement and AND-complement, which are also present in RISC-V Zb*.[31]
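AND-complement and OR-complement each fuse a bitwise NOT with a second logical operation. A minimal C sketch follows. The function names are hypothetical, and the choice to complement the second operand is an assumption of this sketch, since operand conventions vary between instruction sets.

```c
#include <stdint.h>

/* AND-complement: a AND (NOT b). Clears in a every bit set in b. */
uint32_t andn32(uint32_t a, uint32_t b)
{
    return a & ~b;
}

/* OR-complement: a OR (NOT b). Sets in a every bit clear in b. */
uint32_t orn32(uint32_t a, uint32_t b)
{
    return a | ~b;
}
```

A common use is clearing a set of flag bits, for example `andn32(flags, mask)`, without first materialising the inverted mask in a separate instruction.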

MOS 6502

  • The WDC 65C02 added bit-manipulation: set, reset and test on individual bits.
  • Rockwell added similar extensions (RMB, SMB, BBR and BBS) to the R65C00 series.[32]

Microchip PICs

Others

See also

References
