Java (JDK), 47044 (gzipped file) + 943 (code) + 1 (file) + 67 (imports) = 48055 bytes (previously 295709)
Examining the file manually (with some help from Notepad++), I found that it contains 976 unique entries, built from 36 unique characters (plus newlines):
#(),/123456789:ABCDEFGNXabdghijmnsu
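The same inspection can be done programmatically. The following is a minimal sketch, not the method used for this answer (that was manual): the class name `ChordStats`, its two helpers, and the five-line sample are all made up for illustration.

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class ChordStats {
    // Distinct non-empty lines of the text, in sorted order.
    static SortedSet<String> uniqueEntries(String text) {
        SortedSet<String> entries = new TreeSet<>();
        for (String line : text.split("\\r?\\n")) {
            if (!line.isEmpty()) entries.add(line);
        }
        return entries;
    }

    // Distinct characters used in the text, ignoring line terminators.
    static SortedSet<Character> alphabet(String text) {
        SortedSet<Character> chars = new TreeSet<>();
        for (char c : text.toCharArray()) {
            if (c != '\r' && c != '\n') chars.add(c);
        }
        return chars;
    }

    public static void main(String[] args) {
        // Hypothetical sample; the real input is the challenge's chord file.
        String sample = "C:maj\r\nA:min\r\nC:maj\r\nG:7\r\nN\r\n";
        System.out.println(uniqueEntries(sample).size() + " unique entries");
        System.out.println(alphabet(sample).size() + " unique characters");
    }
}
```

Run against the full chord file, a check like this should reproduce the entry and character counts quoted above, assuming the file uses the same line endings throughout.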
I then looked for common patterns, and created a dictionary as follows (key = value):
:maj = ¬
:min = `
\r\nA = "
\r\nB = £
\r\nC = $
\r\nD = %
\r\nE = ^
\r\nF = &
\r\nG = *
\r\nN = _
\r\nX = -
:sus = +
:hdim = =
:dim = [
(9) = }
(#9) = ]
:7 = {
:5 = ~
:aug = ;
#11 = @
b7 = '
maj7 = <
b13 = >
:11 = ?
(11) = \
b:9 = Z
¬^¬ = H
¬%¬ = I
¬$¬ = J
¬"¬ = K
£b¬ = L
¬*¬ = M
b¬"b = O
¬&¬ = P
b¬^b = Q
+4(', = R
¬£ = S
+4(') = T
%b¬ = U
£`£ = V
`7%`7 = W
7"` = Y
"b{] = c
*`7 = e
:13^b = f
`7$ = k
%` = l
^` = o
"` = p
b`7^ = q
b{% = r
cc = t
oo = v
&#~ = w
__ = x
YY = y
&#¬ = z
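I then applied find-and-replace with those items in order to build the compressed string. The code below reverses the process, walking the dictionary from the last entry to the first and replacing each value with its key: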
()->{String s = "A LONG STRING THAT I CAN'T PASTE HERE - SEE TIO LINK";
String[] d=new String[]{"&#¬","z","YY","y","__","x","&#~","w","oo","v","cc","t","b{%","r","b`7^","q","\"`","p","^`","o","%`","l","`7$","k",":13^b","f","*`7","e","\"b{]","c","7\"`","Y","`7%`7","W","£`£","V","%b¬","U","+4(')","T","¬£","S","+4(',","R","b¬^b","Q","¬&¬","P","b¬\"b","O","¬*¬","M","£b¬","L","¬\"¬","K","¬$¬","J","¬%¬","I","¬^¬","H","b:9","Z","(11)","\\",":11","?","b13",">","maj7","<","b7","'","#11","@",":aug",";",":5","~",":7","{","(#9)","]","(9)","}",":dim","[",":hdim","=",":sus","+","\r\nX","-","\r\nN","_","\r\nG","*","\r\nF","&","\r\nE","^","\r\nD","%","\r\nC","$","\r\nB","£","\r\nA","\"",":min","`",":maj","¬"};
for (int i=0;i<d.length;i+=2){s=s.replace(d[i+1],d[i]);}return s;}
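The lambda above only performs the decoding direction. As an ungolfed illustration of both directions, here is a minimal sketch that uses five entries taken from the dictionary (`:maj`, `:min`, `\r\nA`, `\r\nC` and `¬$¬`). The class name `DictionaryCodec` and the sample input are assumptions for illustration, and `¬` is written as `\u00AC` to keep the source ASCII-only. The round trip is only lossless if none of the replacement symbols occur in the original text.

```java
public class DictionaryCodec {
    // Key/value pairs in ENCODING order; later keys may contain earlier values.
    // "\u00AC" is the NOT SIGN used as the replacement for ":maj".
    static final String[] PAIRS = {
        ":maj", "\u00AC",
        ":min", "`",
        "\r\nA", "\"",
        "\r\nC", "$",
        "\u00AC$\u00AC", "J"
    };

    // Apply every substitution key -> value, first pair to last.
    static String encode(String s) {
        for (int i = 0; i < PAIRS.length; i += 2) {
            s = s.replace(PAIRS[i], PAIRS[i + 1]);
        }
        return s;
    }

    // Undo the substitutions value -> key, last pair to first.
    static String decode(String s) {
        for (int i = PAIRS.length - 2; i >= 0; i -= 2) {
            s = s.replace(PAIRS[i + 1], PAIRS[i]);
        }
        return s;
    }

    public static void main(String[] args) {
        // Hypothetical four-line sample, not taken from the real file.
        String original = "C:maj\r\nC:maj\r\nA:min\r\nC:maj";
        String packed = encode(original);
        System.out.println(original.length() + " chars -> " + packed.length() + " chars");
        System.out.println("round trip ok: " + decode(packed).equals(original));
    }
}
```

The golfed version gets the same effect with a single forward loop by storing the array in reverse order with each pair swapped.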
TIO (sort of).
EDIT
By compressing the string, as suggested in the comments, this answer can then be made shorter.
Using the GZIPped version of the string in a file "f" (size 45708 bytes), the code can then be as follows:
import java.io.*;
import java.nio.file.*;
import java.util.zip.*;
()->{String s="",l;try{BufferedReader b=new BufferedReader(new InputStreamReader(new GZIPInputStream(new ByteArrayInputStream(Files.readAllBytes(Paths.get("f"))))));while((l=b.readLine())!=null){s+=l;}}catch(Exception e){}String[] d=new String[]{THE SAME DICTIONARY AS THE PREVIOUS CODE - REDACTED HERE TO MAKE ANSWER SHORT ENOUGH};for (int i=0;i<d.length;i+=2){s=s.replace(d[i+1],d[i]);}return s;}
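For reference, the compression step can be sketched as an in-memory round trip with `java.util.zip`. The answer does not say which tool produced the file "f", so this is only one possible way to do it. The class name `GzipRoundTrip`, the choice of UTF-8 (the golfed code relies on the platform default charset) and the sample string are assumptions, and `InputStream.readAllBytes` requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {
    // Compress a string to GZIP bytes (UTF-8 is an assumption of this sketch).
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompress GZIP bytes back into the original string.
    static String gunzip(byte[] compressed) {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the dictionary-encoded string.
        String encoded = "CJ\"`$".repeat(1000);
        byte[] packed = gzip(encoded);
        System.out.println(encoded.length() + " chars -> " + packed.length + " bytes");
        System.out.println("round trip ok: " + gunzip(packed).equals(encoded));
    }
}
```

Writing the result of `gzip(...)` to disk, for example with `Files.write(Paths.get("f"), packed)`, would give a file that the golfed reader above can open.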
That's .... a really long list. Does the output have to be in order? – Jo King – 2020-01-24T04:47:22.643
are compression/decompression libraries allowed? – Hymns For Disco – 2020-01-24T06:54:07.723
Could you include a brief description of the chord format? Understanding what C#:sus4(b7,9) exactly means may help to compress it. – Arnauld – 2020-01-24T07:42:52.223
What prevents us from reading the file and printing it? – RGS – 2020-01-24T07:55:08.327
@RGS You can do that. As per the rules described in this meta post, your score would be your_code_size + 802213 + 1. – Arnauld – 2020-01-24T08:05:14.687
Ha ha, seriously! Clever way to ask us to find the shortest logic in pop music, will we be cited in the academic paper? :D Joke aside, what is N? Also I think it would have been better with simpler patterns extracted, or chords used in the songs rather than the whole songs.. – Kaddath – 2020-01-24T08:47:00.947
@JoKing - Yes, the output must be in order. – Dustin G. Mixon – 2020-01-24T09:51:05.733
@HymnsForDisco - Yes, compression/decompression libraries are allowed, but presumably, the best compressions will leverage insights from music theory. – Dustin G. Mixon – 2020-01-24T09:53:19.357
@DustinG.Mixon I'm hugely interested, being a musician, but quite discouraged by the 125000+ lines of data, I have a work too :D is there a version with songs titles just for info? Another interesting part would be the relative aspect of notes, as the same progression can be transposed, but that's another matter, the way notes are written make this a bit difficult – Kaddath – 2020-01-24T10:14:59.060
@Kaddath - For the record, the file size is not unprecedented for this community. Song titles (and other interesting information) are available at The McGill Billboard Project website under "Index." – Dustin G. Mixon – 2020-01-24T10:23:46.063
This challenge would be more interesting if (like the Moby Dick challenge) we had to output something close to the file, with score depending on both code size and number of errors. – Robin Ryder – 2020-01-24T16:38:09.753