Bit Representation

Data Type Sizes

The following table describes how many bytes are needed to represent each data type.
Recall that 1 byte is 8 bits.

 Type           Standard 32-bit   Standard 64-bit    x86-64
 Char                  1                 1              1
 Short                 2                 2              2
 Int                   4                 4              4
 Long                  4                 8              8
 Float                 4                 4              4
 Double                8                 8              8
 Long Double           -                 -           10 / 16
 Pointer               4                 8              8
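
A quick way to check this table on your own machine is the sizeof operator. A minimal sketch in C (the comments show the sizes expected in the x86-64 column; the other columns will differ as the table shows):

 #include <stdio.h>

 int main(void) {
     /* Print the storage size of each type, in bytes. */
     printf("char:        %zu\n", sizeof(char));        /* 1 */
     printf("short:       %zu\n", sizeof(short));       /* 2 */
     printf("int:         %zu\n", sizeof(int));         /* 4 */
     printf("long:        %zu\n", sizeof(long));        /* 8 */
     printf("float:       %zu\n", sizeof(float));       /* 4 */
     printf("double:      %zu\n", sizeof(double));      /* 8 */
     printf("long double: %zu\n", sizeof(long double)); /* typically 16 (10 used) */
     printf("pointer:     %zu\n", sizeof(void *));      /* 8 */
     return 0;
 }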

Two's Complement

Also known as "signed integer" representation.

At the bit level, everything is computed and stored the same way as unsigned.
However, when the number is presented to the user, it is interpreted differently.

Effectively:

  • Check the largest (leftmost) bit.
    • If it's 0, then compute normally, the same as unsigned.
    • If it's 1, the value is negative, so proceed to the next steps.
  • Drop off the largest (leftmost) bit, as we know the value is negative.
  • Invert the remaining bits.
  • Add 1 to these inverted bits.
  • Read the resulting value as your number, negated.

Two's Complement Examples

Positive Example: 0110

We read this normally, so we have:
0 + 4 + 2 + 0 = 6
Our resulting value is 6.

Negative Example: 1110

Our leftmost bit is 1, so we know it's negative.
First, we drop this leftmost bit, giving us 110.
Next, we invert our bits, giving 001.
Now we add 1 to our inverted value:

 001
+  1
 ---
 010

Our final binary value is 010. We can now read this as a negative number, giving:
-(0 + 2 + 0) = -2
Our resulting value is -2.
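
The same procedure can be written out in C. This is a minimal sketch, not something from this page: the helper name from_twos_complement and the fixed 4-bit width are chosen only to mirror the examples above.

 #include <stdio.h>

 /* Interpret the low `width` bits of `bits` as a two's complement number. */
 int from_twos_complement(unsigned bits, int width) {
     unsigned sign_bit = 1u << (width - 1);
     if ((bits & sign_bit) == 0)
         return (int)bits;               /* Leftmost bit is 0: read normally. */

     /* Leftmost bit is 1: drop it, invert the remaining bits, add 1,
        then read the result as a negative number. */
     unsigned rest = bits & (sign_bit - 1);
     unsigned mag  = ((~rest) & (sign_bit - 1)) + 1;
     return -(int)mag;
 }

 int main(void) {
     printf("%d\n", from_twos_complement(0x6, 4));  /* 0110 ->  6 */
     printf("%d\n", from_twos_complement(0xE, 4));  /* 1110 -> -2 */
     return 0;
 }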

Floats

IEEE floats take the form (-1)^s × M × 2^E where:

  • s - Sign bit. Determines whether the value is positive or negative.
  • M - Significand. For normalized values, a fractional value in the range [1.0, 2.0).
  • E - Exponent. Weights the value by a power of 2.

Standard precision options are stored in memory as follows:

                                             s        exp          frac
 32-bit (Single Precision)                 1 bit     8 bits       23 bits
 64-bit (Double Precision)                 1 bit    11 bits       52 bits
 80-bit (Extended Precision, Intel only)   1 bit    15 bits    63 or 64 bits
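
As a sketch of how these fields sit in a 32-bit float, the bits can be reinterpreted as an unsigned integer and masked apart (this assumes 32-bit float and unsigned int, which holds on the platforms in the table above; -6.25 is just an example value):

 #include <stdio.h>

 int main(void) {
     /* Reinterpret the float's bit pattern as an unsigned integer. */
     union { float f; unsigned u; } v = { .f = -6.25f };

     unsigned s    = (v.u >> 31) & 0x1;       /*  1 sign bit  */
     unsigned exp  = (v.u >> 23) & 0xFF;      /*  8 exp bits  */
     unsigned frac =  v.u        & 0x7FFFFF;  /* 23 frac bits */

     /* -6.25 = (-1)^1 × 1.5625 × 2^2, so s=1, exp=2+127=129, frac=0x480000. */
     printf("s=%u exp=%u frac=0x%06X\n", s, exp, frac);
     return 0;
 }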

Floats come in two major forms, "Normalized" or "Denormalized", depending on the bits in the exp field.

Normalized Values

Normalized values occur when exp != 000...0 and exp != 111...1.

First, calculate the bias, which is 2^(k-1) - 1, where k represents the number of exponent bits.

  • Single Precision: 127
  • Double Precision: 1023

Determine M (from frac):

  • Has an implied leading 1, so M = 1.frac (in binary).
  • Minimum value of 1.0 when frac = 000...0
  • Maximum value of nearly 2.0 when frac = 111...1

Determine E (from exp):

  • E = exp - bias

For examples of decimal to float representation, see https://www.youtube.com/watch?v=LXF-wcoeT0o
For examples of float to decimal representation, see https://www.youtube.com/watch?v=8afbTaA-gOQ
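
Putting the normalized-value rules together for single precision, here is a sketch that recomputes a float's value from its fields (bias = 127 and 23 frac bits are hard-coded; ldexp(M, E) computes M × 2^E):

 #include <math.h>
 #include <stdio.h>

 int main(void) {
     union { float f; unsigned u; } v = { .f = 6.75f };

     unsigned s    = (v.u >> 31) & 0x1;
     unsigned exp  = (v.u >> 23) & 0xFF;
     unsigned frac =  v.u        & 0x7FFFFF;

     int    E = (int)exp - 127;                   /* E = exp - bias    */
     double M = 1.0 + frac / (double)(1u << 23);  /* implied leading 1 */
     double value = (s ? -1.0 : 1.0) * ldexp(M, E);

     printf("%f\n", value);  /* prints 6.750000 */
     return 0;
 }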

Denormalized Values

Denormalized values occur when exp = 000...0 or exp = 111...1.

When exp = 000...0:

  • If frac = 000...0, then the float represents zero (+0 or -0, depending on the sign bit).
  • Otherwise, the float represents a number very close to zero (smaller in magnitude than any normalized value). In this case E = 1 - bias and M = frac, with no implied leading 1.

When exp = 111...1 (often called the special values):

  • If frac = 000...0, then the float represents infinity (+∞ or -∞, depending on the sign bit).
  • Otherwise, the float represents NaN (Not-a-Number).
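
A sketch that classifies a raw 32-bit pattern into these cases (the helper name classify is made up for this example; it assumes the 1/8/23 single-precision layout described above):

 #include <stdio.h>

 /* Classify a raw IEEE single-precision bit pattern. */
 const char *classify(unsigned bits) {
     unsigned exp  = (bits >> 23) & 0xFF;
     unsigned frac =  bits        & 0x7FFFFF;

     if (exp == 0)
         return frac == 0 ? "zero" : "denormalized";
     if (exp == 0xFF)
         return frac == 0 ? "infinity" : "NaN";
     return "normalized";
 }

 int main(void) {
     printf("%s\n", classify(0x00000000));  /* zero              */
     printf("%s\n", classify(0x00000001));  /* denormalized      */
     printf("%s\n", classify(0x7F800000));  /* infinity          */
     printf("%s\n", classify(0x7FC00000));  /* NaN               */
     printf("%s\n", classify(0x40C80000));  /* normalized (6.25) */
     return 0;
 }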