Bit Representation
Brodriguez (talk | contribs) (Formatting) |
Brodriguez (talk | contribs) (Correct links) |
Latest revision as of 15:01, 4 February 2020
Data Type Sizes
The following table describes how many bytes are needed to represent each data type.
Recall that 1 byte is 8 bits.
Data Type | Standard 32-bit | Standard 64-bit | x86-64 |
---|---|---|---|
Char | 1 | 1 | 1 |
Short | 2 | 2 | 2 |
Int | 4 | 4 | 4 |
Long | 4 | 8 | 8 |
Float | 4 | 4 | 4 |
Double | 8 | 8 | 8 |
Long Double | - | - | 10 / 16 |
Pointer | 4 | 8 | 8 |
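As a rough way to check these sizes on your own machine, the sketch below uses Python's standard `ctypes` module to query the C compiler's type sizes. The `C_TYPES` mapping and the `type_sizes` helper are names made up for this illustration, and the results for `long`, `long double`, and pointers depend on the platform.

```python
import ctypes

# Map each C type name from the table to its ctypes equivalent.
C_TYPES = {
    "char": ctypes.c_char,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
    "long double": ctypes.c_longdouble,
    "pointer": ctypes.c_void_p,
}


def type_sizes():
    """Return {type name: size in bytes} for the current platform."""
    return {name: ctypes.sizeof(ctype) for name, ctype in C_TYPES.items()}


if __name__ == "__main__":
    for name, size in type_sizes().items():
        print(f"{name:12} {size} bytes")
```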
Two's Complement
Also known as "signed integer" representation.
At the bit level, values are stored and added the same way as unsigned values.
The difference lies in how the bit pattern is interpreted when the value is read back.
Effectively:
- Check the largest (leftmost) bit.
- If it's 0, then compute normally, the same as unsigned.
- If it's 1, the value is negative, so proceed to the next steps.
- Drop off the largest (leftmost) bit, as we know the value is negative.
- Invert remaining bits.
- Add 1 to these inverted bits.
- Read the result as your number, and negate it.
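The steps above can be sketched in Python as follows. The function name `decode_twos_complement` is made up for this illustration, and the input is assumed to be a non-empty string of `0` and `1` characters.

```python
def decode_twos_complement(bits: str) -> int:
    """Decode a bit string such as '1110' as a two's complement integer."""
    if bits[0] == "0":
        # Leading 0: read as an ordinary unsigned value.
        return int(bits, 2)
    # Leading 1: the value is negative. Drop the leading bit.
    rest = bits[1:]
    # Invert the remaining bits.
    inverted = "".join("1" if b == "0" else "0" for b in rest)
    # Add 1 (a lone sign bit has no remaining bits, so its magnitude is 1).
    magnitude = int(inverted, 2) + 1 if inverted else 1
    # Read the result as a negative number.
    return -magnitude
```

For instance, `decode_twos_complement("1000")` follows the same steps and gives -8, the most negative 4-bit value.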
Two's Complement Examples
Positive Example: 0110
We read this in normally, so we would have:
0 + 4 + 2 + 0 = 6
Our resulting value is 6.
Negative Example: 1110
Our leftmost bit is 1, so we know it's negative.
First, we drop this leftmost bit, giving us 110.
Next, we invert our bits, giving 001.
Now we add 1 to our inverted value:
001 + 1 = 010
Our final binary value is 010. We can now read this as a negative number, giving:
-(0 + 2 + 0) = -2
Our resulting value is -2.
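Both examples can be cross-checked with an equivalent view of the same representation: every bit keeps its usual power-of-2 weight, except the leftmost bit, whose weight is negative. The sketch below assumes a bit-string input, and the function name `twos_complement_by_weights` is made up for this illustration.

```python
def twos_complement_by_weights(bits: str) -> int:
    """Sum the bit weights, treating the leftmost bit's weight as negative."""
    n = len(bits)
    total = 0
    for i, b in enumerate(bits):
        weight = 1 << (n - 1 - i)  # 8, 4, 2, 1 for a 4-bit value
        if i == 0:
            weight = -weight  # the leftmost bit counts as -2^(n-1)
        total += weight * int(b)
    return total
```

For 1110 this computes -8 + 4 + 2 + 0, which matches the result of the drop, invert, and add 1 procedure.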
Floats
IEEE floats take the form (-1)^s * M * 2^E
where:
- s - Sign bit. Determines whether the value is positive or negative.
- M - Significand. Generally a fractional value in the range [1.0, 2.0).
- E - Exponent. Scales the value by a power of 2.
Standard precision options are stored in memory as follows:
Precision | s | exp | frac |
---|---|---|---|
32-bit (Single Precision) | 1 Bit | 8 Bits | 23 Bits |
64-bit (Double Precision) | 1 Bit | 11 Bits | 52 Bits |
80-bit (Extended Precision. Intel Only) | 1 Bit | 15 Bits | 63/64 Bits |
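To see the single precision layout in practice, the following sketch packs a Python float into 32 bits with the standard `struct` module and splits out the three fields. The helper name `float32_fields` is made up for this illustration, and only the 32-bit row of the table is covered.

```python
import struct


def float32_fields(value: float):
    """Return (s, exp, frac) for value stored as a 32-bit float."""
    # Pack as a big-endian 32-bit float, then reread the same 4 bytes
    # as an unsigned 32-bit integer so the bits can be masked.
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    s = raw >> 31             # 1 sign bit
    exp = (raw >> 23) & 0xFF  # 8 exponent bits
    frac = raw & 0x7FFFFF     # 23 fraction bits
    return s, exp, frac
```

For example, `float32_fields(1.0)` gives `(0, 127, 0)`: a positive sign, an exp field of 127, and an all-zero frac field.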
Floats come in two major types, "Normalized" or "Denormalized", depending on the bits in the exp field.
Normalized Values
Normalized values occur when exp != 000...0 and exp != 111...1.
First, calculate the bias, which is 2^(k-1) - 1, where k represents the number of exponent bits.
- Single Precision: 127
- Double Precision: 1023
Determine M (from frac):
- Has implied leading 1.
- Minimum value of 1.0 when frac = 000...0
- Maximum value of nearly 2.0 when frac = 111...1
Determine E (from exp):
- E = exp - bias
For examples of decimal to float representation, see https://www.youtube.com/watch?v=8afbTaA-gOQ
For examples of float to decimal representation, see https://www.youtube.com/watch?v=LXF-wcoeT0o
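Putting the normalized rules together, here is a minimal sketch that rebuilds a value from its raw fields. The function name `decode_normalized` and its parameters are made up for this illustration. The defaults assume single precision (8 exponent bits, 23 fraction bits); pass 11 and 52 for double precision.

```python
def decode_normalized(s: int, exp: int, frac: int,
                      exp_bits: int = 8, frac_bits: int = 23) -> float:
    """Compute (-1)^s * M * 2^E for a normalized encoding."""
    all_ones = (1 << exp_bits) - 1
    if exp == 0 or exp == all_ones:
        raise ValueError("exp is all 0s or all 1s: not a normalized value")
    bias = (1 << (exp_bits - 1)) - 1  # 2^(k-1) - 1
    M = 1 + frac / (1 << frac_bits)   # implied leading 1, then the fraction
    E = exp - bias
    return (-1) ** s * M * 2.0 ** E
```

For example, the single precision fields s = 0, exp = 124, frac = 0x200000 give M = 1.25 and E = -3, so the value is 1.25 * 2^-3 = 0.15625.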
Denormalized Values
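Denormalized values occur when exp = 000...0. The pattern exp = 111...1 is also not normalized; it encodes special values.
When exp = 000...0:
- If frac = 000...0, then the float represents zero.
- Otherwise, the float represents a value very close to zero. There is no implied leading 1, so M = 0.frac, and E = 1 - bias.
When exp = 111...1:
- If frac = 000...0, then the float represents infinity (positive or negative, depending on s).
- Otherwise, the float represents NaN (Not-a-Number).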
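The cases above can be sketched as a small classifier for single precision values. The function name `classify_float32` and the returned labels are made up for this illustration, and the input is assumed to fit in single precision range.

```python
import struct


def classify_float32(value: float) -> str:
    """Classify value by the exp and frac fields of its 32-bit encoding."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    exp = (raw >> 23) & 0xFF
    frac = raw & 0x7FFFFF
    if exp == 0:
        # exp = 000...0: zero or a value very close to zero
        return "zero" if frac == 0 else "denormalized"
    if exp == 0xFF:
        # exp = 111...1: infinity or NaN
        return "infinity" if frac == 0 else "NaN"
    return "normalized"
```

For example, `classify_float32(float("inf"))` returns `"infinity"`, while `classify_float32(1e-45)` returns `"denormalized"` because that value is too small to be stored with an implied leading 1.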