Confusion with storage capacity (powers of 10 and 2)

24

9

I was taking a look at a HDD and I found a document (from Toshiba, link: 2.5-Inch SATA HDD mq01abdxxx) that says:

"One Gigabyte (1GB) means 10^9 = 1,000,000,000 bytes using powers of 10. A computer operating system, however, reports storage capacity using powers of 2 for the definition of 1GB = 2 ^30 = 1,073,741,824 bytes, and therefore shows less storage capacity."

Then powers of 10 are bigger than powers of 2, OK.

Example 10^2 = 100 and 2^2 = 4.

But I do not understand the document which says for the same storage capacity:

1GB is 1,000,000,000 bytes (powers of 10) and 1,073,741,824 bytes (powers of 2), then: it shows less storage capacity (the powers of 2). Why is it less? If I see for 1GB more storage capacity in powers of 2 than powers of 10.

learnprogramming

Posted 2016-05-25T09:48:36.700

Reputation: 473

Question was closed 2016-05-28T16:38:19.580

13"Why is it less? If I see for 1GB more storage capacity in powers of 2 than powers of 10." Your height in inches gives the smaller number than the same height in centimeters, just because there is more "length capacity" in inch than in centimeter. So, for the fixed value to express: the larger the unit, the lower the number. – Kamil Maciorowski – 2016-05-25T11:01:56.283

4It's not less, it's the same value, represented in two different bases. – Ramhound – 2016-05-25T12:55:06.197

2You can't simply say that 10^2 = 100 and 2^2 = 4. You have to calculate what 100 would be in base 2. – Ramhound – 2016-05-25T13:16:43.363

4"A computer operating system" - Mine doesn't... Or, actually, it uses MB (base10) in the GUI, but MiB (base2) in the CLI. Just to keep things interesting. – marcelm – 2016-05-25T16:07:32.217

2They're saying that "powers of 10" are smaller than the similar magnitude powers of 2. E.g., 1000 (10^3) < 1024 (2^10). And 1000000 (10^6) < 1048576 (2^20). So to a drive manufacturer, your 1 terabyte hard drive has (at least) 1,000,000,000,000 bytes (and actually a little bit more) while to an operating system utility reporting on space 1 terabyte is 1,099,511,627,776 bytes. So the OS will report your 1TB hard drive as 931GB, or a bit more. (Or maybe not, see @marcelm above.) – davidbak – 2016-05-25T16:48:52.083

https://en.wikipedia.org/wiki/Binary_prefix – fr13d – 2016-05-25T18:15:37.103

Suppose you have a hard drive with 1,073,741,824 bytes. In the powers-of-two system that would be written as 1GB. In the powers-of-ten system that would be written as 1.073GB. So if the manufacturer decides to label their hard drives in the power-of-ten system, it looks like you're getting 0.073GB extra. – user253751 – 2016-05-25T20:03:05.987

It's the same size. HDD manufacturers round up because a 4.656GB HDD wouldn't sell, even though that's about the size of a 5GB HDD. – Ramhound – 2016-05-25T22:00:13.400

It was even worse for floppy disks which in some cases used mega=1000*1024. – Chris H – 2016-05-26T08:40:51.690

compare 10^2 and 2^7, not 2^2. Same to 10^3 and 2^10 – phuclv – 2016-05-26T14:14:02.870

"Because Math". Using GB for powers of 2 is also deprecated. Powers of 2 for Gigabytes are now officially to be listed as "GiB". – Brian Knoblauch – 2016-05-27T14:15:57.887

First, thank you for your replies :) . Then I was focusing in result instead of powers. I think that I understand: For powers of 2 I need a higher exponent to represent 1GB (30 in this case, so I need: 2x2x2x2... to 30 times). For powers of 10 I need a lower exponent: only 10^9, so If I represent power of 2 with same exponent: 2^9 I would have less quantity than power of 10. Then that is the reason that the document says: 1GB in power of 2 shows less storage capacity. – learnprogramming – 2016-06-02T14:09:36.160

Answers

58

The historical reason for using powers of 2 is that the CPU accesses memory and hard disks through an address space made up of address lines carrying binary code. Hardware producers chose the names this way:

2^10 = 1024 and as it's almost 1000 then call it 1 Kilobyte

2^20 = 1048576 bytes and as it's almost 1000000 then call it 1 Megabyte

For the normal user it is nonsense and cumbersome. In addition the prefixes "kilo", "mega", etc. come into conflict with the International System of Units (SI) standard where “1 kiloWatt” means 10^3 or 1000 Watts.

To solve the problem, in the year 2000 the International Electrotechnical Commission (IEC) proposed a notation scheme for units based on powers of 2, now found in the standard ISO/IEC 80000-13.

The new names were created by replacing the second syllable of the old name with ‘bi’ (short for ‘binary’, referring to ‘2’). A kilobyte of 1024 bytes is now a kibibyte, and so on. The new units also got corresponding symbols, so ‘10 kibibytes’ is written as 10 KiB instead of 10 kB. This is the correspondence table:

Notation      Symbol    Value
1 kilobyte    1 kB      10^3  = 1000 bytes
1 megabyte    1 MB      10^6  = 1000000 bytes
1 gigabyte    1 GB      10^9  = 1000000000 bytes
1 terabyte    1 TB      10^12 = 1000000000000 bytes


1 kibibyte    1 KiB     2^10 = 1024 bytes
1 mebibyte    1 MiB     2^20 = 1048576 bytes
1 gibibyte    1 GiB     2^30 = 1073741824 bytes
1 tebibyte    1 TiB     2^40 = 1099511627776 bytes

16 years later a lot of hardware and software vendors still refer to the base-2 units with their SI names. A “megabyte” can mean either 1000000 bytes or 1048576 bytes.

If you buy a 100 GB hard drive, the capacity is 100 x 10^9 = 10^11 bytes. But, and this is the big but, the operating system will report the drive as having a capacity of only 93 GB, because it divides by 2^30: 10^11 / 2^30 ≈ 93.13. You bought a 100 gigabyte drive, which is equivalent to a 93 gibibyte drive. The operating system is the one that uses the wrong notation.
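As a rough check of that arithmetic, here is a minimal Python sketch. The helper name `advertised_gb_to_gib` is made up for illustration, and it assumes the operating system simply divides the byte count by 2^30, which not every OS does:

```python
GB = 10**9    # SI gigabyte, as used on the drive label
GiB = 2**30   # IEC gibibyte, as used by many operating systems


def advertised_gb_to_gib(gigabytes):
    """Convert decimal gigabytes to binary gibibytes (hypothetical helper)."""
    return gigabytes * GB / GiB


print(round(advertised_gb_to_gib(100), 2))   # 100 GB drive -> 93.13
print(round(advertised_gb_to_gib(1000), 2))  # 1 TB drive   -> 931.32
```

The number of bytes never changes; only the size of the unit it is divided by does.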

Drive manufacturers hide this issue with disclaimers and explanations that always lead to the conclusion that “actual formatted capacity may be less”.

jcbermu

Posted 2016-05-25T09:48:36.700

Reputation: 15 868

1

Comments are not for extended discussion; this conversation has been moved to chat. – Journeyman Geek – 2016-05-27T02:14:22.867

21

In short: it was all about marketing.

jcbermu explained it well, but I don't agree with the reasons behind all of that.

Since every computer system uses binary, bits and bytes are naturally counted in powers of 2. So the operating system or software is not at fault for the confusion. Everything is binary here.

It's the fault of HDD manufacturers for stating HDD capacities in powers of 10, which robs you of quite some practical GB. A 20GB HDD will actually be able to store 18GB and so forth... a 1TB drive will actually hold ~930GB. The 'bibyte' mockery was invented to try to prevent some of the confusion, but it utterly failed to be adopted in practice.

Overmind

Posted 2016-05-25T09:48:36.700

Reputation: 8 562

10It's because the bytes on the disk "settled after shipping". – davidbak – 2016-05-25T16:51:54.617

2True. I've never heard anyone saying "I've upgraded to 16 gibi RAM". I don't think manufacturers are responsible for the mess but they do profit on it for sure. Back in 80s and 90s, computer users knew what is The Difference between kilogram and kilobyte and why. Nowadays, who knows that computers run on binary arithmetics? – Crowley – 2016-05-25T17:03:29.687

@Crowley That is the fault of today's computer users for not being educated about what they're using, the same way one might not realize the difference between HDTV and Full HD. – Andy – 2016-05-25T23:15:39.300

@Crowley: For the 1024 prefix there's no ambiguity if one distinguishes the prefixes "K", pronounced "kay", and "k", pronounced "kilo". The problem is that the SI prefixes larger than that are normally uppercase, so using uppercase for binary isn't a useful distinction when abbreviating. – supercat – 2016-05-26T02:35:17.863

4It's not about marketing, and never was. Hard drives and floppies have always been sold using the real SI-prefix, because it never made sense to use another base. – pipe – 2016-05-26T12:28:43.267

1-1, terrible. It's the fault of HDD manufacturers to state the HDD capacities in ^10 system, which robs you of quite some practical GB. No, neither of those things are true. HD manufacturers are the ones who've been doing it right all along, using the actual, correct definition of the units. It's not their fault that developers, memory manufacturers and whomever else have been using the SI units inaccurately. And, of course, what unit the storage space is measured in doesn't "rob you of some practical GB" or actually alter the capacity in any way whatsoever. – HopelessN00b – 2016-05-26T16:01:00.397

1@pipe: A 720KB floppy held exactly 1,440 blocks of 512 bytes each. Likewise with other sizes measured in KB. So far as I can tell, the most common meaning of "MB" with magnetic storage media was 1,024,000 bytes, making a 1.44MB floppy exactly twice as big as a 720KB one. – supercat – 2016-05-26T17:44:00.633

@pipe floppies were sold in two ways in my experience. An "unformatted" capacity which was a crude theoretical capacity and later a "formatted" capacity which was measured in binary kilobytes or "hybrid" megabytes (where one "hybrid" megabyte is 1024000 bytes) – plugwash – 2016-05-26T19:28:54.480

@HopelessN00b No, HD manufacturers changed how they defined MB in the 90s. Previously they matched what RAM manufacturers were doing. – Andy – 2016-05-26T23:00:22.643

@Andy Do you have anything to back that up? I just checked my Quantum ProDrive from 1990. It advertises 52 MB and has exactly 52311040 bytes of storage. If it had been 52 MiB, it would have had at least 54525952 bytes. – pipe – 2016-05-27T07:12:04.203

@pipe Sounds like your drive backs me up (it's formatted so some space is lost due to formatting, and possibly you have some bad sectors on a drive so old). If things were as you claim, you'd have 52,000,000 bytes, not 52,311,040. I do distinctly remember replacing a 20MB drive (20971592 bytes) with an ~500MB drive, which then featured the now standard disclaimer (and was my first drive which wasn't 500MB by powers of 2). Do you really believe that HD capacity ALONE has followed the SI definition from the beginning, when NOTHING – Andy – 2016-05-27T22:27:18.873

1else in the computer work did (RAM, floppy drives)? Even today, is the 8MB cache the drives have even a power of 10? How about the 6Gbit/s (600MB data bandwidth)? I don't think that 600MB referenced there is a power of 10. Why JUST the number of bits on an HD platter went with power of 10, when nothing else in the computer industry has? Please. – Andy – 2016-05-27T22:33:06.607

1@Andy The only thing that did not use the standard SI prefix is RAM. I've already demonstrated that for example a DD floppy had exactly 1000000 byte of raw capacity. 6 Gbit/s SATA is exactly 6000000000 bit/s, encoded with 10 bits per byte. Network speed is the standard base 10 unit, same with processor speed. The number of bits on a HD platter depends on the geometry of the disc, so a power-of-two makes no sense at all, and never did. Even RAM was sometimes power-of-10, which can be seen in old magazines where computers have "65 k" etc. – pipe – 2016-05-28T08:24:35.680

@pipe You haven't demonstrated anything, and this page also seems to disagree with you: https://en.wikipedia.org/wiki/List_of_floppy_disk_formats (note the column header that says KB = 1024 bytes). Or this page: https://en.wikipedia.org/wiki/Serial_ATA#SATA_revision_3.2_.2816_Gbit.2Fs.2C_1969_MB.2Fs.29 Note that 16 Gbit/s is also referred to as 1969MB, and 1969 is the power of 2 number. Network speeds are more commonly discussed in bits per second, not bytes, so I'm not sure what your argument is here since we're talking about BYTES. – Andy – 2016-05-28T15:33:55.837

A byte is also an arbitrarily defined size, which also happens to be a power of 2, but there's no other reason than history that a byte remains defined as 8 bits. But since a byte is a power of two, it's logical that prefixes for larger numbers of bytes also be powers of two. The only thing in computers that's traditionally used normal SI units is clock speed, and Hz were well defined before computers were invented (and Hz really has nothing to do with digital theory). – Andy – 2016-05-28T15:38:42.807

1@Andy I'm done discussing this in the comment section. I have posted technical datasheet, I have posted exact block sizes from real drives. You have posted a made-up drive size to support your arguments, but a hard drive with that size was never made. You have demonstrated that you are not willing to check your sources (SATA 3.2 uses PCI Express with a different encoding - still SI-prefix, with a 130->128 bit encoding), so further discussion is pointless. – pipe – 2016-05-29T09:19:17.820

16

jcbermu's answer is good, but I want to approach this from a different angle.

1GB is 1,000,000,000 bytes (powers of 10) and 1,073,741,824 bytes (powers of 2), then: it shows less storage capacity (the powers of 2). Why is it less? If I see for 1GB more storage capacity in powers of 2 than powers of 10.

A storage medium -- any storage medium -- can store a specific number of accessible bits. Usually in general purpose computing, it's expressed as bytes or some multiple of bytes, but if you start looking at for example memory ICs (integrated circuits, chips), you will see their memory capacity expressed in terms of accessible bits.

A hard disk will store some specific number of bits or bytes which, for technical reasons, are addressed in terms of sectors. For example, a 4 TB drive might have 7,814,037,168 sectors of 512 bytes each, which works out to a storage capacity of 4,000,787,030,016 bytes. That's what you actually get. (In practice, you then lose some of that to the computer's bookkeeping information: file system, journal, partitioning, etc. However, the bytes are still there, you just can't use them to store files, because they are needed to store the data that effectively allows you to store the files.)

Of course, the number 4,000,787,030,016 is somewhat unwieldy. For that reason, we choose to represent this information in some other way. But as jcbermu illustrated, we choose to do so in two different ways: in powers of ten, or powers of two.

In powers of ten, 4,000,787,030,016 bytes is 4.000787030016 * 10^12 bytes, which rounds quite nicely; with four significant digits, it rounds to 4.001 TB, for the SI definition of "tera": 10^12. Our hard disk can store more than 4 * 10^12 bytes, so in SI terms, it is a 4 terabyte storage device.

In powers of two, 4,000,787,030,016 bytes is 3.638694607 * 2^40 bytes, which doesn't round quite so nicely. It also looks like a smaller quantity, because 3.639 is less than 4.001, and that is bad for marketing (who wants to buy a 3.6 TB drive when the manufacturer next door sells a 4.0 TB drive for the same price?). With the binary prefix, this is 3.6 "tebibytes", where the "bi" indicates that it's a base-two quantity.

In reality, however, it's exactly the same number of bytes; the number is only expressed differently! If you do the math again, you will see that 3.638694607 * 2^40 = 4.000787030016 * 10^12, so you get the same storage capacity in the end.
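The figures in this example can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the variable names are chosen for illustration:

```python
SECTORS = 7_814_037_168   # sector count from the example drive
BYTES_PER_SECTOR = 512

total_bytes = SECTORS * BYTES_PER_SECTOR

in_tb = total_bytes / 10**12   # SI terabytes (powers of ten)
in_tib = total_bytes / 2**40   # IEC tebibytes (powers of two)

print(total_bytes)             # 4000787030016
print(f"{in_tb:.3f} TB")       # 4.001 TB
print(f"{in_tib:.3f} TiB")     # 3.639 TiB
```

Both printed capacities come from the same `total_bytes`; only the divisor differs.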

a CVn

Posted 2016-05-25T09:48:36.700

Reputation: 26 553

1Nicely explained, but the number of sectors (like 7,8[...] in your example) is chosen by the manufacturers so that the capacity ends up to the desired value. They could just make 8 Billion sectors, 8589934592 or any other number to end up with a true capacity value, but that's not good for business. Since the difference is technically possible, as a manufacturer I'd make a big market hit: a logo TrueCapacity(r) or TrueSpace(r) and it's guaranteed that sales would increase due to this marketing maneuver and the other manufacturer(s) would have to follow (and would be unprepared to do so). – Overmind – 2016-05-26T05:26:44.783

@Overmind: That is one possible marketing technique. Similar to Aerial Communications which had per-second billing (before T-Mobile bought them out). If you were in charge of marketing for a storage device manufacturer, I would guess that strategy could be one that you decide to consider pursuing. – TOOGAM – 2016-05-26T06:11:46.950

I found this answer to provide me the most clarity (perhaps). So, there's no actual need for it be a power of 2? There's nothing special about most of the storage medium sizes being powers of 2? – Abdul – 2016-06-01T18:02:02.993

1@Abdul Most (consumer) storage devices have user-accessible capacities that are not, in terms of bytes (or by implication also bits), a power of two. Like Overmind stated above, HDDs' exact capacities can be largely randomly selected as long as they meet marking requirements. SSDs tend to be closer to 2^n, because flash memory chips are made in sizes that are often in whole powers of two (because they have address lines and such things which makes that an advantage), but due to overprovisioning not all flash capacity will be accessible from software external to the built-in flash controller. – a CVn – 2016-06-01T19:27:13.533

5

Other answers have addressed the historical reason for the difference, but it seems to me like you are asking about the difference according to the mathematics.

You are correct that one power of 10 is larger than one power of 2, and that conversely one gigabyte (10^9 bytes) is smaller than one gibibyte (2^30 bytes).

The reversal of sizes is explained by the fact that there are more powers in one gibibyte (30 powers) than there are powers in one gigabyte (9 powers). It turns out that the number of powers has a larger effect on the final size than does the size of each individual power.
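One way to see this is to rewrite 2^30 as a power of ten using logarithms. The sketch below is just an illustration of that comparison, with variable names chosen here:

```python
import math

# 2**30 == 10**(30 * log10(2)), so compare that exponent with 9.
exponent_base10 = 30 * math.log10(2)

print(exponent_base10)   # roughly 9.03, slightly more than 9
print(2**30 > 10**9)     # True: the gibibyte is the larger unit
print(2**30 / 10**9)     # ratio of the two units, about 1.074
```

Thirty factors of 2 amount to a little more than nine factors of 10, which is why the gibibyte ends up about 7% larger than the gigabyte.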

As to why the reported size of a disk is smaller when measured in gibibytes (2^30) than when measured in gigabytes (10^9), it is natural that when measuring a fixed quantity, a larger unit of measure gives a smaller number. For example, consider height in inches versus height in centimetres. Because one inch is larger than one centimetre, the same height will measure fewer inches (e.g. 72 inches) than centimetres (e.g. 183 centimetres). The height is the same physical distance in both cases, but each measurement just gives a different number according to the unit of measure.

Edward Peek

Posted 2016-05-25T09:48:36.700

Reputation: 151