Author:ys2310
Graduated in spring 2008 from a very old university in New York City.
A single-precision binary floating-point number is stored in 32 bits.
The exponent is biased by $2^{8-1} - 1 = 127$ in this case (exponents in the range −126 to +127 are representable; see the above explanation to understand why biasing is done). An exponent of −127 would be biased to the value 0, but this is reserved to encode that the value is a denormalized number or zero. An exponent of 128 would be biased to the value 255, but this is reserved to encode an infinity or not a number (NaN). See the chart above.
For normalized numbers, which are the most common, the exponent field holds the biased exponent and the fraction field holds the significand without its most significant bit.
The number has value v:
$v = s \times 2^{e} \times m$
Where
s = +1 (positive numbers) when the sign bit is 0
s = −1 (negative numbers) when the sign bit is 1
e = Exp − 127 (in other words the exponent is stored with 127 added to it, also called "biased with 127")
m = 1.fraction in binary (that is, the significand is the binary number 1 followed by the radix point followed by the binary bits of the fraction). Therefore, 1 ≤ m < 2.
In the example shown above, the sign is zero, the exponent is −3, and the significand is 1.01 (in binary, which is 1.25 in decimal). The represented number is therefore $+1.25 \times 2^{-3}$, which is +0.15625.
Double precision is essentially the same except that the fields are wider: 1 sign bit, 11 exponent bits, and 52 fraction bits (64 bits in total).
The fraction part is much larger, while the exponent is only slightly larger. The standard's designers considered precision more important than range.
NaNs and Infinities are represented with Exp being all 1s (2047). If the fraction part is all zero then it is Infinity, else it is NaN.
For normalized numbers the exponent bias is 1023 (so e = Exp − 1023). For denormalized numbers the exponent is −1022, the minimum exponent for a normalized number; it is not −1023 because normalized numbers have a leading 1 digit before the binary point and denormalized numbers do not. As before, both infinity and zero are signed.
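The float type can represent magnitudes from roughly $\pm 10^{-38}$ to $10^{38}$ (the largest value is just under $2^{128}$), with about 7 significant decimal digits (a relative precision of $2^{-23}$).
The double type can represent magnitudes from roughly $\pm 10^{-308}$ to $10^{308}$ (the largest value is just under $2^{1024}$), with about 15 significant decimal digits (a relative precision of $2^{-52}$).
For ordinary calculations the float type is sufficient, but when you handle extremely large numbers or need high precision, it is better to declare your variables as double.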
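The int type, which holds integers, covers the range −2147483648 to 2147483647 (assuming a 32-bit int).
Care is needed when the result of a calculation becomes huge, as with factorials (for example, 5! = 120 and 10! = 3628800).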
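| Type | Holds | ※1 | Representable range | Significant digits |
| --- | --- | --- | --- | --- |
| int | Integers | %d | −2147483648 to 2147483647 | --- |
| float | Real numbers | %f | $\pm 10^{-38}$ to $10^{38}$ | 7 digits |
| double | Real numbers | %f | $\pm 10^{-308}$ to $10^{308}$ | 15 digits |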