Decode a Microsoft MS-DOS 5.0 FAT directory entry

27

5

The Microsoft FAT file system has a directory table to represent which "files" are in which "folders" on the disk. For the time, these entries crammed a lot of information into a small amount of bits. There are a bunch of technical specifications over on Wiki for the curious, but the challenge here is going to focus on a "simple" decoding of an entry.

Each entry consists of a 32-byte binary word, broken up into several sections. For consistency in this challenge, we'll be using the MS-DOS 5.0 version, the bytes are ordered as big endian, and we're calling byte 0x00 as the left-most, and byte 0x1F as the right-most.

Below is a brief schematic of the relevant sections, and what should be the output for each section (in bold).

  • The first 11 bytes are the file name in ASCII format (this is where the famous 8.3 filename comes from -- 8 bytes for the file name, 3 bytes for the extension). These are straight ASCII encoding, and should be output as ASCII with a period (.) between.
    • Note: both the 8 and the 3 parts are padded with spaces to make a full-length entry. The output should ignore spaces (i.e., don't output them).
    • The file extension may be empty (i.e., all spaces), in which case the output should not output the dot.
    • Since ASCII only uses the lower 7 bits, the bytes will all have a leading 0.
  • The next byte (0x0b) is a bitmask of the following:
    • 0x01 Read Only - output RO
    • 0x02 Hidden - output H
    • 0x04 System - output S
    • 0x08 Volume Label - output VL. File size (below) should be output as 0, regardless of its actual entry.
    • 0x10 Subdirectory - output SD. File size (below) should be output as 0, regardless of its actual entry.
    • 0x20 Archive - output A
    • 0x40 Device - ignored for this challenge.
    • 0x80 Reserved - ignored for this challenge.
    • Since this is a bitmask, multiple flags are possible - all applicable outputs should be concatenated together in any order. For example, 0xff could be ROHSVLSDA (or any other combination).
  • The next two bytes (0x0c and 0x0d) are not used under MS-DOS 5.0.
  • The next two bytes (0x0e and 0x0f) are the creation time as follows:
    • Bits 15 to 11 are the hours in 24-hour format - output 00 to 23
    • Bits 10 to 5 are the minutes - output 00 to 59
    • Bits 4 to 0 are the seconds/2 - output 00 to 58 (note that seconds are only in two-second resolution)
    • For clarification: hhhhhmmmmmmsssss when written big-endian.
  • The next two bytes (0x10 and 0x11) are the creation date as follows:
    • Bits 15 to 9 are the year - output 1980 for 0 up to 2107 for 127
    • Bits 8 to 5 are the months - output 1 to 12 (with or without leading zero)
    • Bits 4 to 0 are the day - output 0 to 31 (with or without leading zero)
    • For clarification: yyyyyyymmmmddddd when written big-endian.
  • The next two bytes (0x12 and 0x13) are the last access date. While used in MS-DOS 5.0, we're ignoring this portion for this challenge.
  • The next two bytes (0x14 and 0x15) are not used by MS-DOS 5.0.
  • The next two bytes (0x16 and 0x17) are the last modified time, following the same format as the creation time, above.
  • The next two bytes (0x18 and 0x19) are the last modified date, following the same format as the creation date, above.
  • The next two bytes (0x1a and 0x1b) are the cluster location of the file on disk. We're ignoring this portion for this challenge.
  • The final four bytes (0x1c, 0x1d, 0x1e, and 0x1f) are the file size - output as an unsigned integer, unless the VL or SD flags are set (above), in which case output 0.

Visual representation

0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
\______________________________FILENAME________________________________________________/\_ATTR_/\___NOTUSED____/\_CREATIONTIME_/\_CREATIONDATE_/\__LASTACCESS__/\___NOTUSED____/\_MODIFIEDTIME_/\_MODIFIEDDATE_/\___NOTUSED____/\___________FILESIZE___________/

Input

  • A single 32-byte word (i.e., 256 bits), in whatever format is convenient.
    • This could be as a string of 1 and 0, as several unsigned ints, an array of Boolean values, etc.
    • Please specify in your answer what format you're using for input.
    • You cannot take multiple input (i.e., an array pre-broken into the relevant byte sizes) unless that is the only way for your language to take input. Parsing the input is part of the challenge.
  • You can assume input to be valid (for example, you don't need to perform date checking to verify that the date is valid).
  • Bytes that are unused can be all 0, all 1, etc., just so long as they are present. In the below examples, I used all 0 for the unused bytes.

Output

Either printed to screen or returned, the following:

  • The filename as an ASCII string
  • The file attributes as an ASCII string
  • The creation time and creation date, with appropriate separators (colons, slashes, something to distinguish the components)
  • The modified time and modified date, again with appropriate separators
  • The file size

The output can be a space-separated or newline-separated single string, separate elements in an array, etc. Please specify in your answer how your output is formatted.

Rules

  • Standard I/O formats are acceptable.
  • Either a full program or a function are acceptable.
  • Standard loopholes are forbidden.
  • This is , so all usual golfing rules apply, and the shortest code wins.
  • Built-ins that perform exactly this function are forbidden.

Examples

0111000001110010011011110110011101110010011000010110110101101101011010010110111001100111000001100000000000000000101000100100010001001000110101000000000000000000000000000000000010100010010001000100100011010100000000000000000000000000000000001101000000000000

programm.ing HS 20:18:08 2016/06/20 20:18:08 2016/06/20 53248

0010000000100000001000000010000001110000011100000110001101100111001000000010000000100000000101000000000000000000010111010110110000111101100111110000000000000000000000000000000010100010010001000100100011010100000000000000000011110000000100111111001011100001

ppcg SDS 11:43:24 2010/12/31 20:18:08 2016/06/20 0

AdmBorkBork

Posted 2016-06-20T18:18:02.383

Reputation: 41 581

Is it okay if there's excess whitespace around the flags? Eg, would SD S be a valid flag set? – Morgan Thrapp – 2016-06-21T17:15:42.480

@MorganThrapp Sure, that should be fine. – AdmBorkBork – 2016-06-21T17:23:47.433

Out of curiosity, did you get a lot of experience interacting with MS-DOS 5.0 back in the day, or were you just really bored on Wikipedia one day? – cat – 2016-06-24T15:47:12.603

3@cat Some of both. I've been heavily interested in computers since I was about 5 and received a Commodore VIC-20. One of my grad-level computer science projects about 10 years ago was to build our own filesystem, so I researched a lot for that. For this, I pulled a bunch from Wiki and pared it down to something that could be a challenge. – AdmBorkBork – 2016-06-24T15:52:29.870

Answers

6

Verilog, 513 670 617 bytes

Runs using IVerilog. No special compile flags needed.

This is a monster of nested definitions, bit-twiddling, and the annoyance of having to flip bit order due to the endianness (otherwise either the string doesn't print, or the number bit order is wrong). The input is taken through the i port, which is the usual way of taking input into a Verilog module. $display is used to print to standard out. 6 bytes could be saved if leading zeros were not required for the timestamp.

`define r(o,b)wire[3:0]o;assign o={i[b],i[b+1],i[b+2],i[b+3]}; 
`define R(t,a,b,c,d,s)`r(a,s)`r(b,s+4)`r(c,s+8)`r(d,s+12)wire[15:0]t;assign t={a,b,c,d};
`define p(m,k)i[90+m]?"k":"",
`define F(a,b)"a a a b\t b%d"
module f(input[0:255]i);`R(d,q,w,e,r,112)`R(D,Q,W,E,R,128)`R(s,z,x,c,v,224)`R(S,Z,X,C,V,240)`R(p,t,y,u,o,176)`R (A,b,n,m,l,192)always@(i)$display(`F(%s%s%s,%02d:%02d:%02d%d/%d/%d),i[0:63],i[64:87]=="   "?" ":".",i[64:87],`p(5,RO)`p(4,H)`p(3,S)`p(2,VL)`p(1,SD)`p(0,A)d[15:11],d[10:5],d[4:0]*2,D[15:9]+1980,D[8:5],D[4:0],p[15:11],p[10:5],p[4:0]*2,A[15:9]+1980,A[8:5],A[4:0],|i[91:92]?0:{s,S});endmodule

Demo

Testbench (Not scored):

`timescale 1ns / 1ps

module ftest;
reg [0:255] i;
f uut (
.i(i)
);
initial begin
    i=256'b0111000001110010011011110110011101110010011000010110110101101101011010010110111001100111000001100000000000000000101000100100010001001000110101000000000000000000000000000000000010100010010001000100100011010100000000000000000000000000000000001101000000000000;
    #100;
i=256'b0010000000100000001000000010000001110000011100000110001101100111001000000010000000100000000101000000000000000000010111010110110000111101100111110000000000000000000000000000000010100010010001000100100011010100000000000000000011110000000100111111001011100001;     
end

endmodule

Example run:

$ iverilog design.sv testbench.sv  && vvp a.out  
programm.ing   HS      20:18:08       2016/ 6/20      53248
    ppcg        S  SD  11:43:24       2010/12/31          0

Reinstate Monica - ζ--

Posted 2016-06-20T18:18:02.383

Reputation: 288

5

Python, 485, 479, 442, 438, 431, 429, 418, 402, 395, 391, 370 bytes.

Saved 19 bytes thank to Cᴏɴᴏʀ O'Bʀɪᴇɴ reminding me that I can assign functions to a letter.

Saved 6 bytes thanks to FryAmTheEggman's suggestion to clean up the bitmask filter.

Saved 21 bytes thanks to W0lf's awesome Ruby answer forcing me to golf this down some more. ;)

This is an absolute monster. Pretty sure I can cut it down a bit more, but it's getting pretty close to being golfed out.

a=input()
j=''.join
n=lambda v:int(v,2)
f=j('RO H S VL SD A'.split()[i]for i in range(6)if n(a[88:96])&2**i)
print(j(chr(n(a[x:x+8])).strip()+'.'*(x==56)for x in range(0,88,8)).strip('.'),f,j('%02d:%02d:%02d'%(n(a[b-11:b-6]),n(a[b-6:b]),n(a[b:b+6]))+' %d/%d/%d '%(n(a[b+6:b+12])+1980,n(a[b+12:b+16]),n(a[b+16:b+21]))for b in[123,187]),n(a[208:])*(1-('V'in f or'D'in f)))

Morgan Thrapp

Posted 2016-06-20T18:18:02.383

Reputation: 3 574

maybe you could assign int to a char? or maybe make a function that performs str int. – Conor O'Brien – 2016-06-20T19:55:07.807

@CᴏɴᴏʀO'Bʀɪᴇɴ Good idea! – Morgan Thrapp – 2016-06-20T19:55:34.117

The space between or 'SD' can be removed, I think – Conor O'Brien – 2016-06-20T20:02:14.680

@CᴏɴᴏʀO'Bʀɪᴇɴ Yup, just did that. – Morgan Thrapp – 2016-06-20T20:02:48.807

Wow. Quite a bit shorter than I expected answers to be. – AdmBorkBork – 2016-06-20T20:03:35.843

@TimmyD Really? This is way longer than I expected my answer to be. – Morgan Thrapp – 2016-06-20T20:03:55.683

You think this is a monster? Wait until you see my Java answer...

– R. Kap – 2016-06-23T23:41:02.727

4

Haskell, 781 710 bytes

Thanks to BlackCap for some simplification

w n=('0'<$[1..2-length a])++a where a=show n
x s=tail.foldr(\a b->s:a++b)""
t=snd.span(==' ')
y a|all(==' ')a=""|0<1='.':t a
nm=(\(a,b)->t a++y b).splitAt 8
ms n(r,s)|n`mod`2^(r+1)`div`2^r>0=s|0<1=""
tm n=x ':'[w$n`div`2^11,w$n`mod`2^11`div`32,w$2*n`mod`64]
dt n=x '/'[w$1980+n`div`2^9,w$n`mod`2^9`div`32,w$n`mod`32]
pa s=x ' '[nm.map(toEnum.v.take 8).takeWhile(not.null)$iterate(drop 8)a,t,dt$v i,tm$v g,dt$v o,tm$v m,show u,"\n"]where{z n=splitAt(8*n);(a,b)=z 11 s;(c,d)=z 1 b;(e,f)=z 2 d;(g,h)=z 2 f;(i,j)=z 2 h;(k,l)=z 4 j;(m,n)=z 2 l;(o,p)=z 2 n;(q,r)=z 2 p;t=(zip [0..](words"RO H S VL SD A")>>=).ms$v c;u|any(`elem`t)"LD"=0|0<1=v r;v=foldl((+).(2*))0.map(read.pure).filter(`elem`"01")}
main=interact pa

This additionally allows garbage (like a newline character) to appear after the input.

Fox

Posted 2016-06-20T18:18:02.383

Reputation: 341

Got you some low hanging fruit: http://pastebin.com/X69jH75f

– BlackCap – 2016-09-22T21:35:27.337

3

Java, 1721 1587 1573 1560 1511 bytes:

import java.util.*;class A{int Q(String a,int b){return Integer.parseInt(a,b);}String P(int a){return Integer.toString(a);}ArrayList<String>B=new ArrayList<String>();void J(String O){B.add(O);}String TD(String l,String p,int a,int c,int d){String X,Y,Z;X=Y=Z=new String();int i=0;for(char F:l.toCharArray()){if(i<a){X+=F;}if(a<=i&i<c){Y+=F;}if(c<=i){Z+=F;}i++;}String[]H=new String[3];H[0]=P(d+Q(X,2));H[1]=P(Q(Y,2));H[2]=(p==":")?P(Q(Z,2)*2):P(Q(Z,2));int T=0;for(String A:H){H[T]=(A.length()<2)?"0"+A:A;T++;}return H[0]+p+H[1]+p+H[2];}String D(String i){String K=new String();int L=0;for(char Y:i.toCharArray()){if(L%8<1){K+=" ";}K+=Y;L++;}String[]C=K.split(" ");String[]z={"RO","H","S","VL","SD","A"};int[]l={1,2,4,8,16,32};Map<Integer,String>U=new HashMap<Integer,String>();for (int e=0;e<l.length;e++){U.put(l[e],z[e]);}String[]N={":","/",":","/"};int[]M={15,17,23,25};int[]O={5,7,5,7};int[]P={0,1980,0,1980};for(int y=1;y<9;y++){if((char)Q(C[y],2)!=' '){J(Character.toString((char)Q(C[y],2)));}}for(int v=9;v<12;v++){if((char)Q(C[v],2)!=' '){if(!B.contains(".")){J(".");}J(Character.toString((char)Q(C[v],2)));}}J(" ");int T=(char)Q(C[12],2);while(T>0){int H=l[0];for(int V:l){if(V<=T){H=V;}}J(U.get(H));T-=H;}for(int w=0;w<4;w++){J(" ");J(TD(C[M[w]]+C[M[w]+1],N[w],O[w],11,P[w]));}J(" ");if(B.contains("SD")||B.contains("VL")){J("0");}else{J(P(Q(C[29]+C[30]+C[31]+C[32],2)));}return String.join("",B);}public static void main(String[]a){A H=new A();System.out.print(H.D(new Scanner(System.in).next()));}}

Takes input in the format of 32 byte binary string. Outputs in the format of a space separated string. This may be a very very long answer, but I'm still not disappointed. I'm just happy I was able to implement this in Java. I will still try to golf this as much as I can, though.

Try It Online! (Ideone)

R. Kap

Posted 2016-06-20T18:18:02.383

Reputation: 4 730

1+1 because using Java for low-level problems is pleasantly ironic (much like my Haskell) – Fox – 2016-07-02T16:49:29.540

2

PHP, 301 288 bytes

for($b=unpack('A8f/A3e/Cl/n/Nc/N/Nm/n/Ns',$argn);$i<8;$f.=1<<$i++&$b[l]?[RO,H,S,VL,SD,A][$i-1]:'');echo trim($b[f].'.'.$b[e],' .')," $f ",($e=function($a){echo date('H:i:s Y/m/d ',mktime($a>>27&31,$a>>21&63,$a>>15&62,$a>>5&15,$a&31,1980+($a>>9&127)));})($b[c]),$e($b[m]),$b[l]&24?0:$b[s];

Try it online!

Input is a 32 byte word string via STDIN, output to STDOUT.

-13 bytes as a standalone program.

640KB

Posted 2016-06-20T18:18:02.383

Reputation: 7 149

2

Stax, 111 bytes

¼ΘUßU'ïMo^ø¬├▓> I¬i⌠·╥.↕¥½ßqS,=frT`d_`&&↓⌠ÉûÆiü=┌-< │∟Φ☼⌐¢3²Bu╜lJ╛§≥╪║ε┐╓ù♫╨Z░╖!¥É:╬Çß═╤às8Q←φ,ºï◘≥Ä£}èΦ╡FÉçø¶É

Run and debug it

recursive

Posted 2016-06-20T18:18:02.383

Reputation: 8 616

Oops, my mistake. I'll make a fix. – recursive – 2019-08-10T21:27:18.730

1It's working correctly now at the cost of 3 bytes. – recursive – 2019-08-10T21:40:35.280

2

Ruby, 344 bytes

m=gets
s=->b,l{b.slice!(0,l).to_i 2}
t=->b{'%02d:%02d:%02d %d/%d/%d'%[s[b,5],s[b,6],2*s[b,5],s[b,7]+1980,s[b,4],s[b,5],]}
i=(0..q=32).map{|i|m[i*8,8].to_i 2}
c=i.map(&:chr).join
n=c[0,8].strip
e=c[8,3].strip
e>?!&&n<<?.+e
f=''
6.times{|j|i[11][j]>0&&f<<%w(RO H S VL SD A)[j]}
$><<[n,f,t[m[112,q]],t[m[176,q]],(f[/VL|SD/]?0:m[-q,q].to_i(2))]*' '

The slightly more readable version is available here.

Online test: http://ideone.com/Fww1Rw

Cristian Lupascu

Posted 2016-06-20T18:18:02.383

Reputation: 8 369

1Nice one! Looks like I need to golf off another 27 bytes. ;) – Morgan Thrapp – 2016-06-22T13:40:14.847

2

JavaScript (ES6), 369

(b,Z=n=>n>9?n:'0'+n,W=n=>s[n]<<8|s[n+1],U=n=>[Z((t=W(n))>>11)+`:${Z(t>>5&63)}:`+Z(t%32*2),((t=W(n+2))>>9)+1980+`/${t>>5&15}/`+t%32],X=l=>String.fromCharCode(...s.slice(p,p+=l)).trim(),s=b.match(/.{8}/g).map(x=>+('0b'+x)),p=0)=>[X(8)+((x=X(3))?'.'+x:x),[...b].map((b,i)=>'A,SD,VL,S,H,RO'.split`,`[z=(2*z|b)%4294967296,i*b-90]||'',z=0).join``,U(14),U(22),b[92]|b[91]?0:z]

Less golfed

(b,
  Z=n=>n>9?n:'0'+n, // zero pad
  W=n=>s[n]<<8|s[n+1], // get word
  U=n=>[
   Z((t=W(n))>>11)+`:${Z((t>>5&63)}:`+Z(t%32*2),  // decode time
   ((t=W(n+2))>>9)+1980+`/${t>>5&15}/`+t%32 // decode date
  ],
  X=l=>String.fromCharCode(...s.slice(p,p+=l)).trim(), // extract space padded string
  s=b.match(/.{8}/g).map(x=>+('0b'+x)), // convert bits to bytes
  p=0
)=>
  [ X(8)+((x=X(3))?'.'+x:x),
    [...b].map((b,i)=>'A,SD,VL,S,H,RO'.split`,`[i*b-90]||'').join``,
    [...b].slice(-32).map((b,i)=>z=2*z|b,z=0), // this line merged with the preceding one in the golfed code
    U(14),U(22),
    b[92]|b[91]?0:z
  ]

Test

f=(b,Z=n=>n>9?n:'0'+n,W=n=>s[n]<<8|s[n+1],U=n=>[Z((t=W(n))>>11)+`:${Z(t>>5&63)}:`+Z(t%32*2),((t=W(n+2))>>9)+1980+`/${t>>5&15}/`+t%32],X=l=>String.fromCharCode(...s.slice(p,p+=l)).trim(),s=b.match(/.{8}/g).map(x=>+('0b'+x)),p=0)=>[X(8)+((x=X(3))?'.'+x:x),[...b].map((b,i)=>'A,SD,VL,S,H,RO'.split`,`[z=(2*z|b)%4294967296,i*b-90]||'',z=0).join``,U(14),U(22),b[92]|b[91]?0:z]

O.textContent+='\n'+f('0111000001110010011011110110011101110010011000010110110101101101011010010110111001100111000001100000000000000000101000100100010001001000110101000000000000000000000000000000000010100010010001000100100011010100000000000000000000000000000000001101000000000000')
O.textContent+='\n'+f('0010000000100000001000000010000001110000011100000110001101100111001000000010000000100000000101000000000000000000010111010110110000111101100111110000000000000000000000000000000010100010010001000100100011010100000000000000000011110000000100111111001011100001')
<pre id=O></pre>

edc65

Posted 2016-06-20T18:18:02.383

Reputation: 31 086

Ok, so I just ran this in Safari and got Script error.. But for some reason in Firefox it seems to work perfectly. I wonder why... – R. Kap – 2016-06-23T07:13:19.067

@R.Kap probably Safari is less ES6 compatible than Firefox. https://kangax.github.io/compat-table/es6/

– edc65 – 2016-06-23T07:35:28.207

1

Perl, 249 bytes

Takes 32 bytes as input, output is separated by newlines. unpack is perfect for this kind of binary structure parsing.

($f,$e,$a,$C,$M,$s)=unpack"A8A3CxxNx4Nx2N",<>;$f=~s/ //g;$e=~s/ //g;printf"%s
@{[map+(RO,H,S,VL,SD,A)[$a&1<<$_?$_:9],0..5]}
"."%02d:%02d:%02d %d/%d/%d
"x2 .$s*!($a&24),$f.".$e"x!!$e,map{$_>>27,$_>>21&63,$_>>15&62,$_/512%128+1980,$_>>5&15,$_&31}$C,$M

Some highlights:

  • The aforementioned unpack.
  • The turtle operator @{[]} allows to interpolate code in a string. It actually creates an array reference that is then dereferenced.
  • "$str1"x!!$str2 is a nice way to return $str1 only if $str2 is a non-empty string.

Below is a version that works on real directory entries, with little-endian fields, and only ignoring right padding on the filename and extension (so, e.g. " ppcg" doesn't have its initial whitespace removed) (254 bytes)

($f,$e,$a,$C,$M,$s)=unpack"A8A3CxxVx4Vx2V",<>;$f=~s/ +$//;$e=~s/ +$//;printf"%s
@{[map+(RO,H,S,VL,SD,A)[$a&1<<$_?$_:9],0..5]}
"."%02d:%02d:%02d %d/%d/%d
"x2 .$s*!($a&24),$f.".$e"x!!$e,map{$_>>11&31,$_>>5&63,2*$_&63,($_>>25)+1980,$_>>21&15,$_>>16&31}$C,$M

ninjalj

Posted 2016-06-20T18:18:02.383

Reputation: 3 018