Size of files in Windows OS. (It's KB or kB?)

20

9

One "kilobyte" (KB) is 1024 bytes in the JEDEC standard, whereas the definition has shifted, in most contexts, to mean 1000 bytes (kB) in accordance with SI. To resolve this difference, binary prefixes (KiB) are used.

So we have 3 choices for using prefixes - JEDEC, IEC (both in Binary), and Metric (in Decimal).

My questions are-

  1. What prefix standard does Windows use in showing the file size? (surely it's not IEC standard)
  2. Why does Windows OS show sizes of files in KB (using a capital alphabet "K") when it's a small alphabet "k" for a Kilo in SI units?

A capital "K" represents Kelvin in SI system of units.

Am I missing something here in understanding?

a.s.

Posted 2015-07-09T05:06:52.457

Reputation: 327

1Why would you assume that SI conventions have to apply to non-SI units? The last time I checked, byte wasn't an SI unit, nor SI-derived. – Luaan – 2015-07-09T10:09:50.653

9@Luaan: SI conventions are the most common conventions by far for units, even for non-SI units. For example, they're trying to run the LHC at 13 TeV, but eV (electron volt) is not SI. When you say that the ambient noise level is 40 dB, the B (bel) is not SI either. – Dietrich Epp – 2015-07-09T11:46:56.063

2@DietrichEpp: That's still physics. The byte is not a unit of physics; physicists measure information as entropy (unit: J/K). – MSalters – 2015-07-09T12:58:32.397

3

Relevant: https://xkcd.com/394/

– basic6 – 2015-07-09T12:59:55.733

3@MSalters: The "Bel" is not physics, it is an abstract unit like the byte. – Dietrich Epp – 2015-07-09T13:05:43.067

@DietrichEpp: As a physics engineer working in acoustics, I'm entirely familiar with the (deci)Bel. Audio is very much physics. ITYM that Bel is dimensionless, not abstract. That's hardly unique in physics. So is the mol, so are Cv (streamline) values, Reynolds numbers, ... – MSalters – 2015-07-09T13:45:01.747

2I do mean "abstract" in that the Bel does not correspond to any concrete (or physical) system. It is only used to express a ratio, not even a ratio of something in particular (like power). For example, in digital signal processing the dB will be used to express ratios of digital signals which have no physical units to begin with. So I strongly disagree with the notion that decibel is a "physical" unit, or connected to physics in any special way. – Dietrich Epp – 2015-07-09T14:08:35.180

1K meaning Kelvin isn't a possibility here, because there are no unit suffixes. The possibilities for interpreting KB are: K (prefix) B (unit), or KB (unit). kK would be kilo-Kelvins. The non-standard upper-case K prefix is weird, but apparently allowed with B for storage size, by JEDEC. (see @txtechhelp's answer). – Peter Cordes – 2015-07-10T07:17:40.707

Answers, useful comments posted here, and this link really cleared up my confusion. KB (with a big k) does not exist.

– a.s. – 2015-07-10T09:26:55.980

2Re the whole K thing: I own a graduated cylinder that measures liquid volume in mL. A lowercase m represents "meters" in the SI system of units. So mL must mean "meter liters"?! What am I missing? – Quuxplusone – 2015-07-10T14:36:10.443

1@Quuxplusone: The short forms for SI units (such as ml for milliliter) are symbols, not abbreviations. Here 'm' is a prefix for the metric unit 'Liter'. – a.s. – 2015-07-10T16:22:11.200

2@Luaan: No, the byte is not an SI unit. However, SI states that the SI prefixes may be used with non-SI units, and that they should never be used to mean anything other than their SI meanings. – Jamie Hanrahan – 2015-07-11T05:02:08.233

Answers

40

I'll answer your question as directly as possible since the usage KB vs. KiB vs. kB vs. kb will quickly spawn an off-topic debate as that naming convention war has been going on for decades now.

1.) What prefix standard Windows use in showing file size? (surely it's not IEC standard)

Actually it's the JEDEC 100B.01 standard, which means that KB (kilobyte) is 1024 bytes.

2.) Why Windows OS show size of files in KB (using a capital alphabet "K") when it's a small alphabet "k" for a Kilo in SI units.

Again, because it's the JEDEC 100B.01 standard for unit prefixes for semiconductor storage capacity; it's not an SI unit of measure and thus does not have the same meaning.

The lowercase k can be synonymous with uppercase K when dealing with kilo / kibi; for giga, mega and tera, JEDEC, ISO and BIPM SI prefix norms define them to be uppercase G, M and T respectively. Lowercase g, m and t are used only in informal situations, when context provides the meaning (as in I just swapped out my 1gb NIC or my 2tb hdd isn't working), and are per se invalid.

A capital "K" represents Kelvin in SI system of units. Am I missing something here in understanding?

Yes, a capital K represents Kelvin when you are specifically talking about measurements of temperature and dealing with SI units of measure. However, we are dealing with semiconductor storage capacity, and I would not say I have 512 KB of RAM and mean I have 512 Kelvin Bytes of RAM. Further, it really depends on context to know when/how to differentiate between the IEC/JEDEC and SI units of measuring KB/MB/GB/etc.

Most OS's and the vast majority of devices that deal with memory/storage use the prefixes K for Kilo to mean 1024 bytes, so when I get RAM that says it's a 4GB module, I know it's 4 Gibi-Bytes (4*1024*1024*1024) and not Giga-Bytes (4*1000*1000*1000).

The major exception to this is in drive capacities; when I purchase a thumb drive or hard drive, I know when it says 32GB, it means 32 Giga-Bytes (32*1000*1000*1000) and not Gibi-Bytes (32*1024*1024*1024), even though my OS will report it in Gibi-Bytes (and thus take my drive from 32GB to an effective 29.8 GiB drive). Also note that there are some flavors of Linux that like to use the KB to mean 1000 bytes, regardless of context, and this can get somewhat confusing as not all applications in the same OS will report the sizes the same. Most device makers will usually put a disclaimer somewhere on the "box" (or website etc.) to denote what they are meaning when they say KB/GB/etc, like on hard drive boxes that have the disclaimer of *1GB = 1000000000 bytes.
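The arithmetic in the two paragraphs above can be checked with a minimal sketch. The constant and function names are illustrative only and do not come from Windows or any vendor tool; the "4GB" RAM module is assumed to use the binary meaning and the "32GB" drive the decimal one, as described above.

```python
# Illustrative names only; nothing here is a Windows or vendor API.
GIB = 1024**3         # binary gigabyte (GiB), the JEDEC "GB" used for RAM
GB_DECIMAL = 1000**3  # decimal gigabyte, as printed on drive packaging

ram_bytes = 4 * GIB            # a "4GB" RAM module
drive_bytes = 32 * GB_DECIMAL  # a "32GB" thumb drive or hard drive


def to_gib(byte_count: int) -> float:
    """Express a raw byte count in binary gigabytes (GiB)."""
    return byte_count / GIB


print(ram_bytes)                      # 4294967296
print(round(to_gib(drive_bytes), 1))  # 29.8
```

The second figure is the 29.8 "GB" that an OS using binary units would report for the advertised 32GB drive.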

If you're ever confused on what style your OS is reporting to you as, you can always look at how many bytes a file is and then do the math to see what your OS is telling you (the 'size of file', not 'size on disk' as those are different things); if your OS can't tell you the raw byte count, there are bigger issues beyond what suffix it's using.
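"Doing the math" can be sketched roughly as follows. The function `guess_base` is a hypothetical helper, not part of any OS; it simply picks whichever of 1000 or 1024 reproduces the displayed figure more closely. Displayed sizes are rounded, so the guess is only meaningful for files large enough that the two conventions give visibly different numbers.

```python
def guess_base(raw_bytes: int, displayed: float, power: int = 1) -> int:
    """Guess whether a displayed size used base 1000 or base 1024.

    raw_bytes -- exact size in bytes ('size of file', not 'size on disk')
    displayed -- the number the OS shows, e.g. 1500 for "1,500 KB"
    power     -- 1 for KB, 2 for MB, 3 for GB, ...
    """
    # Distance between the displayed value and what each convention predicts.
    errors = {
        base: abs(raw_bytes / base**power - displayed)
        for base in (1000, 1024)
    }
    return min(errors, key=errors.get)


print(guess_base(1_536_000, 1500))                 # 1024
print(guess_base(1_536_000, 1536))                 # 1000
print(guess_base(32_000_000_000, 29.8, power=3))   # 1024
```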

Or as Randall put it: kilobyte (https://xkcd.com/394/)

txtechhelp

Posted 2015-07-09T05:06:52.457

Reputation: 3 317

7

"Most OS's and the vast majority of devices that deal with memory/storage use the prefixes K for Kilo to mean 1024 bytes" Starting with 10.6, OS X no longer does. That's a fairly significant OS.

– Sören Kuklau – 2015-07-09T12:36:40.960

@SörenKuklau - Actually, OS X still does, but it also allows you to switch the units to base 10. Windows does not. I am sure somebody, at some point in time, has modified Linux to do it also. – Ramhound – 2015-07-09T12:54:35.107

2@Ramhound: Could you find a source for this? OS X seems to use the correct prefixes by default on my system, and I see no option to change it. There is an option to select metric or US units, but no option for using the binary prefixes. – Dietrich Epp – 2015-07-09T13:09:16.723

@DietrichEpp - Looks like there is a script that can do it, nothing official, but I recall reading about an option to switch back and forth. This seems to indicate that option shouldn't exist if it did, so it likely was removed, considering the length of time since 10.6

– Ramhound – 2015-07-09T13:17:53.550

1MacOS and some Linux distros like Ubuntu have switched to decimal prefixes to make file size consistent with HDD size. KB = 1000 bytes and GB = 1000³ bytes – phuclv – 2015-07-09T14:59:10.340

4Hard drives are not the "exception." The "GB" on a DVD is in decimal gigabytes. Decimal prefixes are also used for tape capacities, network speeds ("gigabit Ethernet" is 1000^3 bits/s), CPU and bus clock speeds and bandwidth ratings, and in the old days, the so-called "baud rate" on serial ports. If anything, RAM is the exception with nearly every other product in the field using decimal prefixes. For some reason Windows Explorer decided to go with the JEDEC convention instead of the one used by the makers of the hard drives that contain the files Explorer is telling you about. – Jamie Hanrahan – 2015-07-09T19:05:51.797

2@JamieHanrahan: Drive storage has historically used sectors with a power-of-two size, and allocation chunks that were a power-of-two number of sectors. A 360K floppy held 720 sectors of 512 bytes each; a "1.44MB" floppy was 2,880 such sectors [the "megabyte" was 1,024,000 bytes]. Only after drive capacities got larger did the megabyte shrink. – supercat – 2015-07-09T20:47:46.013

1

@supercat Not so. Hard drives have almost always been marketed using decimal prefixes and have come in sizes easily expressed that way, going back to the first, the IBM 350 RAMAC ("5 million characters"). Later IBM drives used a "count key data" format in which the size of a "block" was up to the site—often a multiple of 80. Though modern drives do use binary-sized sectors there is nothing else in them that encourages binary prefixes—not with zone bit recording, not-power-of-2 cylinder counts, and possibly three active surfaces. See https://en.wikipedia.org/wiki/Binary_prefix#Disk_drives

– Jamie Hanrahan – 2015-07-09T21:06:02.563

All filesystems I'm familiar with (FAT, ext2/3/4, xfs, NTFS, some others) allocate space for files in power-of-2 chunks, usually 4kiB. Not coincidentally, that's also the virtual memory page size on x86. So a 4097 byte file uses 2x the disk space of a 4096 byte file (except on filesystems like reiserfs with tail-packing, or maybe with transparent compression.) iso9660, being a read-only FS, might pack files head-to-tail with no internal fragmentation, I forget and couldn't find the answer quickly. – Peter Cordes – 2015-07-10T07:27:44.613
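The 4096 versus 4097 byte example in the comment above can be sketched as a rounding-up calculation. This is a simplified model, assuming a fixed 4 KiB allocation unit and that an empty file occupies no clusters; it ignores tail packing, compression, and any filesystem-specific handling of very small files. The function name is hypothetical.

```python
def size_on_disk(file_size: int, cluster_size: int = 4096) -> int:
    """Round a file size up to a whole number of allocation units."""
    if file_size == 0:
        return 0  # assumption: an empty file occupies no data clusters
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


print(size_on_disk(4096))  # 4096
print(size_on_disk(4097))  # 8192
```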

Hard drives still allocate 512 byte sectors or 4096 byte sectors. You can't just allocate 1000 bytes of hard drive space. The only miracle I see is that those supposed values they churn out (e.g. 32,000,000,000 bytes) can end up in a module where bytes aren't wasted, because the BIOS support itself uses binary bases for calculating offsets into a drive. – phyrfox – 2015-07-10T12:03:49.497

1@phyrfox There is no such thing as "using binary bases for calculating"; that is a mindset that will lead you astray. For representing the numbers, yes. But for calculating, arithmetic is arithmetic: the inputs and answers are the same whether they're encoded in decimal, binary, or anything else. Re "wasted space", non-issue: if you look at any drive's true size in bytes you'll find it's enough bigger (if necessary) to be divisible by the block size. Just btw, 32,000,000,000 IS divisible by 4096, and modern OSs don't use "BIOS support" for anything beyond the initial stages of booting. – Jamie Hanrahan – 2015-07-10T19:23:59.377

@JamieHanrahan Computers use binary for everything. They just make it decimal when they want to show us something. And, like I said, sectors are indeed 512 or 4096 bytes in size, not 500 or 4000. I do concede that I didn't realize that hard drive sizes that are even "millions" do happen to be divisible by 512, though. – phyrfox – 2015-07-10T20:09:14.970

1@phyrfox No, computers do not use binary for "everything". Many accounting applications, including Excel, use decimal representation of numbers so that e.g. 1/10 can be exactly represented (in binary it cannot be, it's a repeating fraction), and round-off errors and similar occur in exactly the same way as with a calculator or manual calculation. Historically speaking, many important computers used decimal representations internally for all numbers—even memory addresses! Regardless, a "500 GB" hard drive is much, much closer to 500x1000³ bytes than it is to 500 x 1024³. – Jamie Hanrahan – 2015-07-10T20:24:14.310

"All filesystems I'm familiar with (FAT, ext2/3/4, xfs, NTFS, some others) allocate space for files in power-of-2 chunks, usually 4kiB." That's an implementation detail that should be entirely irrelevant to how data is represented to the user. The user's question is not, "how is my data stored?", but "how much space do I have left?". – Sören Kuklau – 2015-07-10T20:43:52.283

1@SörenKuklau And "how much can I store on my drive?" Another example where this confuses the user: Suppose you have a pair of "750 GB" HDs and you combine them into a RAID-0 array. The resulting capacity in SI prefixes is of course 1.5 TB. But Windows will show the original drives as 698 GB (really 698 GiB). Now the poor human may well be used to this and will expect the array to be 1396 GB, or 1.396 TB... but Windows will display it as 1.36 TB. Not a big diff, but they will again blame Windows or the drive maker or somebody for lying. – Jamie Hanrahan – 2017-01-22T16:15:49.460
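The numbers in the RAID-0 example above can be reproduced with a short sketch. It assumes each "750 GB" drive holds exactly 750 x 1000³ bytes and that the displayed "GB" and "TB" are really GiB and TiB; real drives and real arrays will differ slightly.

```python
drive_bytes = 750 * 1000**3    # one "750 GB" drive, decimal meaning
array_bytes = 2 * drive_bytes  # two drives striped as RAID-0

drive_gib = drive_bytes / 1024**3  # what is labeled "GB" on screen
array_tib = array_bytes / 1024**4  # what is labeled "TB" on screen

print(int(drive_gib))       # 698
print(round(array_tib, 2))  # 1.36
```

The second divisor is 1024⁴ rather than 1024³ x 1000, which is why doubling 698 and moving the decimal point (1.396) does not match the displayed 1.36.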

@RickBrant .. as a low level Engineer I can assure you that all computers "see" only 1's and 0's. Just because Excel or your calculator can show you decimal points doesn't mean they "understand" floating point numbers. Additionally, the BIOS is used for well more than just "initial booting". Though given the OP's question, I think binary math and analyzing logic gates is something to be left for another day.

– txtechhelp – 2017-07-21T10:12:49.837

@SörenKuklau that's only for Finder; as I stated, it's application dependent .. load up Terminal and it's still base 2 .. just leading to further confusion .. – txtechhelp – 2017-07-21T10:19:58.187

@txtechhelp Maybe Linux does things differently, but... the Windows NT devs made a very early decision to not use platform firmware code for anything beyond loading the NT VBR code. And they never have since. Not under UEFI either; once UEFI firmware has loaded bootmgr.exe from the EFI system partition, UEFI firmware code is never executed again until the next reboot. ACPI "methods" would seem to be an exception but they are not, as they do not involve actually running machine code that's part of the firmware. They are written in a bytecode that is interpreted by the acpi.sys driver. – Jamie Hanrahan – 2017-07-22T07:34:37.633

@txtechhelp I understand that everything in a computer is just 1s and 0s...even if you happen to be implementing an adder - or, say, a frequency counter - that works in BCD. :D My point is that the rules of arithmetic, what results you get from what operations on what inputs, don't change just because the machine that implements them is counting in base 2 instead of base 10. How you implement the rules changes for different bases, but the answers don't change, only their representations. This is why I think "using binary bases for calculating" (emph. added) is the wrong model. – Jamie Hanrahan – 2017-07-22T08:03:02.947

Amazing answer and love the Randall reference. For years I always wondered why 'size of file' and 'size on disk' were always different, just never looked it up. – RoLYroLLs – 2018-04-04T03:37:40.627

14

In Windows Explorer, KB means kilobyte in the binary sense: 1024 bytes. Explorer uses the capital 'K' to “indicate” binary, as opposed to the lower-case 'k', which is the standard SI kilo- prefix meaning 1000.

Raymond Chen's blog post Why does Explorer use the term KB instead of KiB? explains why Windows does not use KiB.

If you look around you, you'll find that nobody (to within experimental error) uses the terms kibibyte and KiB. When you buy computer memory, the amount is specified in megabytes and gigabytes, not mebibytes and gibibytes. The storage capacity printed on your blank CD is indicated in megabytes. Every document on the Internet (to within experimental error) which talks about memory and storage uses the terms kilobyte/KB, megabyte/MB, gigabyte/GB, etc. You have to go out of your way to find people who use the terms kibibyte/KiB, mebibyte/MiB, gibibyte/GiB, etc.

Explorer is just following existing practice. Everybody (to within experimental error) refers to 1024 bytes as a kilobyte, not a kibibyte. If Explorer were to switch to the term kibibyte, it would merely be showing users information in a form they cannot understand, and for what purpose?

Alexey Ivanov

Posted 2015-07-09T05:06:52.457

Reputation: 3 900