Remove Whitespace from a Java Program

6

2

In this challenge, you will be given the source code of a Java program. Your task is to write a function or full program that will remove the whitespace from this program. The whitespace-removed program should work exactly the same as the original program. Your program can be in any language. In this challenge, every space, tab, and newline will be considered whitespace.

Summary

Input can be taken through command-line arguments, function parameters, or STDIN/equivalent. Output the input program with its whitespace removed to STDOUT/equivalent, along with an integer giving the number of whitespace characters removed. Your score is (number of bytes in your code) - (whitespace characters removed from the sample program).

Sample Program (You can remove 104 whitespace characters from this)

class Condition {
    public static void main(String[] args) {
        boolean learning = true;

        if (learning == true) {
            System.out.println("Java programmer");
        }
        else {
            System.out.println("What are you doing here?");
        }
    }
}

http://pastebin.com/0kPdqZqZ

Whitespace-free version: class Condition{public static void main(String[]args){boolean learning=true;if(learning==true){System.out.println("Java programmer");}else{System.out.println("What are you doing here?");}}}
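The whitespace-free version above can be reproduced with a small string-aware pass. What follows is only a sketch in Python, not any answerer's method: the names `strip_whitespace`, `is_word`, and `TOKEN` are made up for illustration, and it assumes the input has no comments, no character literals, and no adjacent operators that rely on whitespace (such as `a + +b`). With LF line endings it removes 93 characters from the sample; the figure of 104 is reached when each newline is counted as \r\n, as discussed in the comments.

```python
import re

SAMPLE = '''class Condition {
    public static void main(String[] args) {
        boolean learning = true;

        if (learning == true) {
            System.out.println("Java programmer");
        }
        else {
            System.out.println("What are you doing here?");
        }
    }
}'''

# Matches either a double-quoted string literal (with backslash escapes)
# or a run of whitespace. String literals are tried first so that the
# whitespace inside them is never seen as a separate match.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\s+')


def is_word(ch):
    """True for characters that can be part of a Java identifier or keyword."""
    return ch.isalnum() or ch == "_" or ch == "$"


def strip_whitespace(src):
    """Return (minified source, number of characters removed)."""
    out = []
    pos = 0
    for m in TOKEN.finditer(src):
        out.append(src[pos:m.start()])
        text = m.group()
        if text.startswith('"'):
            out.append(text)  # string literal: keep verbatim
        else:
            before = src[m.start() - 1] if m.start() > 0 else ""
            after = src[m.end()] if m.end() < len(src) else ""
            # Keep a single space only where two word characters would
            # otherwise merge (e.g. "boolean learning").
            if is_word(before) and is_word(after):
                out.append(" ")
        pos = m.end()
    out.append(src[pos:])
    result = "".join(out)
    return result, len(src) - len(result)


minified, removed = strip_whitespace(SAMPLE)
print(minified)
print(removed)  # 93 with LF line endings
print(strip_whitespace(SAMPLE.replace("\n", "\r\n"))[1])  # 104 with CRLF
```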

Clarifications from Comments (see below)

  • The programs passed as input will be at the same whitespace complexity as the sample program.
  • The programs will not contain comments.

Closing question

Due to the amount of confusion surrounding the question, I have decided to close the competition and accept @MartinBüttner's answer. Sorry for any misunderstanding. However, I believe the answers are useful as a tool for Java golfers, which is why I did not delete the question.

GamrCorps

Posted 2015-10-09T19:34:46.057

Reputation: 7 058

Question was closed 2015-10-10T08:08:48.397

Note: If anyone believes this to be a code golf rather than a code challenge question, feel free to change the tag. – GamrCorps – 2015-10-09T19:35:19.273

Does the program just have to work with this single test case? If not, how do we ensure that the program is actually correct for all possible Java programs? I'm sure there must be tons of edge cases in a language as elaborate as Java. (You don't even have weird mixes of strings, character literals and comments.) – Martin Ender – 2015-10-09T19:39:04.290

Make it work with programs on the same level of complexity as the sample one (same indentation rules, etc.). I never would expect programs that could handle removing all 93 whitespace characters from even the sample. – GamrCorps – 2015-10-09T19:51:32.043

I hope someone does write an overly long program that detects/removes them all. I'm sure it won't win on score, but it would be nice to have a snippet to run on my golfs instead of doing it manually (or with some online minifier) ;) – Geobits – 2015-10-09T19:55:17.560

What about statements broken over multiple lines? Like int\nfoo=5;? – Martin Ender – 2015-10-09T20:59:37.813

@MartinBüttner statements like those will be contained in a single line. – GamrCorps – 2015-10-09T21:13:53.853

@mbomb007 Strings should remain as is. – GamrCorps – 2015-10-09T21:14:15.840

Can you show us the fully whitespace-fixed version, please? – ETHproductions – 2015-10-09T21:49:26.453

Isn't it actually possible to remove 104 whitespace characters from the example, including newlines? Tested on a raw wget of the linked file and I'm pretty sure I removed 104 characters, and it still compiles. – daniero – 2015-10-09T22:13:31.707

@daniero if you can manage to remove that much, feel free to edit the post – GamrCorps – 2015-10-09T22:14:40.187

@daniero I believe you've counted the newlines as \r\n, giving 22 characters for removing newlines, instead of just 11. – Martin Ender – 2015-10-10T08:00:05.977

I'm still not sure what "the same whitespace complexity" means. Can there be double spaces inside strings? Can there be double spaces inside statements or expressions, e.g. int<SP><SP>foo = 5;, where <SP> is a space (otherwise the browser would display only one)? Currently the answers seem to make vastly different assumptions, so I'm putting this on hold as unclear until this is sorted out. – Martin Ender – 2015-10-10T08:07:40.923

@MartinBüttner That's probably what I did. I used wc -c to check the difference, on the linked raw file from pastebin. It's not really stated how to deal with this, and my platform doesn't recognize \r\n as one character. I think counting bytes is a better option. I also addressed this in my own answer post below. – daniero – 2015-10-10T12:22:52.247

@daniero sorry for the confusion, please read the new section in the question – GamrCorps – 2015-10-10T14:15:04.730

Answers

6

CJam, 25 bytes - 83 removed = -58

Since Java has neither multiline strings nor significant indentation, we can simply remove all leading spaces. Also, since there won't be any comments and statements will not be split across lines (as per the OP's comments), I think I should be able to remove all newlines as well.

That makes for safe and short code with pretty decent savings for the test file:

q_,\N/{S\+e`1>e~}%_oNos,-

Test it here.

For the test case it yields

class Condition {public static void main(String[] args) {boolean learning = true;if (learning == true) {System.out.println("Java programmer");}else {System.out.println("What are you doing here?");}}}
83
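For readers who don't know CJam, the idea described above (strip each line's leading spaces, then join the lines without newlines) can be sketched in Python. This is an illustrative port of the approach, not a translation of the CJam code; the function name `strip_indent_and_newlines` is hypothetical.

```python
SAMPLE = '''class Condition {
    public static void main(String[] args) {
        boolean learning = true;

        if (learning == true) {
            System.out.println("Java programmer");
        }
        else {
            System.out.println("What are you doing here?");
        }
    }
}'''


def strip_indent_and_newlines(src):
    """Drop leading spaces on every line and join the lines together.

    Returns (resulting source, number of characters removed).
    Assumes LF line endings, no comments, and no statement split
    across lines, as clarified by the OP.
    """
    result = "".join(line.lstrip(" ") for line in src.split("\n"))
    return result, len(src) - len(result)


program, removed = strip_indent_and_newlines(SAMPLE)
print(program)
print(removed)  # 83: 72 indentation spaces plus 11 newlines
```

Spaces inside a line are never touched, so string literals stay intact at the cost of leaving the remaining removable spaces in place.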

Martin Ender

Posted 2015-10-09T19:34:46.057

Reputation: 184 808

C̶a̶n̶'̶t̶ ̶y̶o̶u̶ ̶a̶l̶s̶o̶ ̶r̶e̶m̶o̶v̶e̶ ̶n̶e̶w̶l̶i̶n̶e̶s̶?̶ ̶D̶o̶n̶'̶t̶ ̶k̶n̶o̶w̶ ̶C̶J̶a̶m̶ ̶t̶h̶o̶u̶g̶h̶,̶ ̶s̶o̶ ̶m̶a̶y̶b̶e̶ ̶i̶t̶ ̶w̶i̶l̶l̶ ̶b̶e̶ ̶w̶o̶r̶t̶h̶ ̶l̶e̶s̶s̶ ̶p̶o̶i̶n̶t̶s̶.̶ Sorry, realized that you can't always remove newlines in java. – Zereges – 2015-10-09T19:59:32.407

@Zereges If I can (waiting for clarification from the OP) it will actually save a byte of code (I'm explicitly writing the newlines back with the second N). – Martin Ender – 2015-10-09T20:03:01.323

Well, I found out that my counterexample did not work, so maybe you can remove newlines. You can also remove spaces before and after the following symbols: { } ( ) [ ] , ; = . and maybe some more – Zereges – 2015-10-09T20:11:16.977

@Zereges only if they don't appear inside strings. – Martin Ender – 2015-10-09T20:15:10.563

The Java Language Specification does not permit newlines in string literals. edit: oh, I see - that apparently wasn't a point of confusion. – JohnE – 2015-10-09T20:19:02.087

@MartinBüttner Right, probably not worth the bytes. – Zereges – 2015-10-09T21:08:06.667

@Mego Good point, I missed that. Fixed. – Martin Ender – 2015-10-09T21:17:56.093

3

JavaScript (ES6), 72 bytes - 92 removed = -20

(partially invalid?)

x=>(z=x.replace(/\s*[{}()=;]\s*/g,y=>y.trim()))+`
`+(x.length-z.length)

The output:

class Condition{public static void main(String[] args){boolean learning=true;if(learning==true){System.out.println("Java programmer");}else{System.out.println("What are you doing here?");}}}

I believe the one space I could still remove is the one in (String[] args), but this would not be worth it. This is pretty much the lowest score JS can get (for this specific case, anyway).

ETHproductions

Posted 2015-10-09T19:34:46.057

Reputation: 47 880

You forgot to print the amount of whitespace removed (just like I did) – daniero – 2015-10-09T22:51:00.503

For input programs with a string literal containing any of {}()=; next to whitespace, it alters the string. – DankMemes – 2015-10-10T02:01:01.473
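The problem raised in the comment above is easy to reproduce. Here is a minimal Python sketch that applies the same regular expression as the answer to a line containing a string literal; the function name `trim_around_punctuation` and the sample line are made up for illustration.

```python
import re


def trim_around_punctuation(src):
    """Remove whitespace on either side of { } ( ) = ; anywhere in src.

    Same pattern as the JavaScript answer; it has no notion of string
    literals, so it rewrites their contents as well.
    """
    return re.sub(r"\s*[{}()=;]\s*", lambda m: m.group().strip(), src)


line = 'String s = "a = b; c";'
print(trim_around_punctuation(line))  # String s="a=b;c";
```

The string's contents change from `a = b; c` to `a=b;c`, so the minified program would no longer behave like the original.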

2

Ruby, 63 bytes - 104 removed = -41

edit: 104 bytes that is.

$><<j=(i=$<.read).gsub(/(?<=\W)\s|\s(?=\W)/,"")
p i.size-j.size

Removes all newlines, so the printed number at the end comes on the same line too:

$ curl -s http://pastebin.com/raw.php?i=0kPdqZqZ | ruby remove_whitespace.rb 
class Condition{public static void main(String[]args){boolean learning=true;if(learning==true){System.out.println("Java programmer");}else{System.out.println("What are you doing here?");}}}104

daniero

Posted 2015-10-09T19:34:46.057

Reputation: 17 193

Did you count the newlines as \r\n? Otherwise, I'm not sure how you'd get 104. – Martin Ender – 2015-10-10T07:59:31.350

Also, this would fail if there were two significant spaces in a row anywhere (e.g. in a string or int<SP><SP>foo = 5;) – Martin Ender – 2015-10-10T08:02:20.460

@MartinBüttner I used wc before and after - curl -s http://pastebin.com/raw.php?i=0kPdqZqZ | wc -c gives a difference of 104 if I don't print the numbers afterwards ("104\n"). As for the "significant spaces", there are none in the example so I think we're good (with "the same whitespace complexity" mentioned in the question). edit: I agree with putting the question on hold, this is all unclear. – daniero – 2015-10-10T12:13:16.053
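The double-space failure mentioned in the comments can be checked with a Python port of the same lookaround pattern (a sketch only; `strip_near_nonword` is a hypothetical name). Each whitespace character is removed if the character before or after it is a non-word character, and a neighbouring space counts as one.

```python
import re


def strip_near_nonword(src):
    """Remove any whitespace character adjacent to a non-word character."""
    return re.sub(r"(?<=\W)\s|\s(?=\W)", "", src)


print(strip_near_nonword("int foo = 5;"))   # int foo=5;
print(strip_near_nonword("int  foo = 5;"))  # intfoo=5;
```

With a single space the declaration survives, but with two spaces each one sees the other as a non-word neighbour, both are removed, and `int` and `foo` merge.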

1

Python, 92 bytes - 88 removed = 4

import re,sys
p=sys.stdin.read()
s=re.sub('\s+(=|{|}|(|))\s+',r'\1',p)
print len(p)-len(s),s

Test it here

Once again, regex to the rescue. There's a strange discrepancy between ideone and my python install - I'm getting 90 chars removed.

local python

jprog.py source:

#!/usr/bin/env python

print """class Condition {
    public static void main(String[] args) {
        boolean learning = true;

        if (learning == true) {
            System.out.println("Java programmer");
        }
        else {
            System.out.println("What are you doing here?");
        }
    }
}"""

Mego

Posted 2015-10-09T19:34:46.057

Reputation: 32 998

1

Java 8 : 79 Bytes

s->{System.out.print(s.replaceAll("(\\s+|\\n)?(\\{|\\}|=+|;)(\\s+|\\n)", "$2"));}

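The regular expression uses grouping to remove the extra whitespace around curly braces, semicolons, and equals signs (used in assignments or equality checks). Whitespace before the symbol is optional, but whitespace after it is required for a match.

Without the Java string escaping, the regular expression is something like (\s+|\n)?({|}|=+|;)(\s+|\n)

It produces the following output, which is a valid Java program.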

class Condition{public static void main(String[] args){boolean learning=true;if (learning==true){System.out.println("Java programmer");}else{System.out.println("What are you doing here?");}}}
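To see why the space in `if (` survives in the output above, here is the same pattern run through Python's `re` module on a shorter snippet. This is an illustrative port, not the answer's code; the function name `collapse_around_symbols` and the snippet are assumptions.

```python
import re

# Optional whitespace, then one of { } ; or a run of '=', then required whitespace.
PATTERN = re.compile(r"(\s+|\n)?(\{|\}|=+|;)(\s+|\n)")


def collapse_around_symbols(src):
    """Replace each match with just the symbol (group 2)."""
    return PATTERN.sub(r"\2", src)


snippet = "if (learning == true) {\n    done = true;\n}"
print(collapse_around_symbols(snippet))  # if (learning==true){done=true;}
```

Parentheses are not in the symbol group, so whitespace next to `(` is only removed when it also borders one of the listed symbols.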

CoderCroc

Posted 2015-10-09T19:34:46.057

Reputation: 337

right tool for the job lol – Rohan Jhunjhunwala – 2016-07-25T00:39:29.980