What exactly was your question?

19

1

Introduction

When you are given a boring text, you just want to get it over with as quickly as possible. Let's observe the following text:

Ens colligi timenda etc priorem judicem. De quascunque ii at contingere 
repugnaret explicetur intellectu. Adjuvetis hoc fortassis suspicari opportune 
obversari vix eam? Dei praemia prudens hominum iii constet requiri haberem. Ima 
sane nemo modi fuit lus pro dem haud. Vestro age negare tactum hoc cui lor. Ne et 
ut quod id soli soni deus. At constare innumera is occurret ea. Nia calebat seu 
acquiro fraudem effingo dicimus.

Note: This text has newlines for readability, while the test cases you need to handle don't have newline characters.

One way to skim text is to find questions that the writer has put into the text. For example, there is 1 question in the text above. Namely:

Adjuvetis hoc fortassis suspicari opportune obversari vix eam?

This can be done by splitting the text into sentences. A sentence will always end with one of the following punctuation symbols: .?! (others don't have to be handled). Any symbol other than these is part of the sentence.
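As a rough, ungolfed illustration of the splitting idea described above, here is a minimal Python sketch. The function names `split_sentences` and `questions` are made up for this example and are not part of the challenge; the sketch also assumes that trailing text without a terminator can simply be dropped.

```python
def split_sentences(text):
    """Split text into sentences, each ending in '.', '?' or '!'."""
    sentences = []
    current = ""
    for ch in text:
        current += ch
        if ch in ".?!":
            sentences.append(current)
            current = ""
    # Assumption: leftover text with no terminator is not a sentence, so it is dropped.
    return sentences


def questions(text):
    """Keep only the sentences that end in a question mark."""
    return [s.strip() for s in split_sentences(text) if s.endswith("?")]


print(questions("Ens colligi. Adjuvetis hoc vix eam? Dei praemia."))
```

Stripping the surrounding spaces is optional here, since the challenge allows leading and trailing spaces in the output.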


The Task

Given a line of text consisting of only

  • letters (A-Za-z)
  • numbers (0-9)
  • spaces
  • punctuation (,.;:?!) (a sentence cannot start with one of these).

Output the questions that are in the text in any reasonable format. You may assume that the text will always have at least 1 question. Outputting trailing and leading spaces before and after a question is allowed.

Important: Next to a punctuation symbol, there will never be another punctuation symbol (e.g. ?? is invalid and does not need to be handled).


Test cases

In the format:

Input
Output(s)

The test cases:

Huh? I haven't heard what you just said. Could you repeat that please?
Huh?
Could you repeat that please?

plz can i haz cheesburgr? i am cat pls.
plz can i haz cheesburgr?

This is a badly formatted question.Can u please help me,or my friends,with formatting this question    ?thankyou.
Can u please help me,or my friends,with formatting this question    ?

a.b.c.d?
d?

Does this question have a question mark? yes
Does this question have a question mark?

Why example.com resolves to 127.0.0.1 in 99.9 percent of cases?
9 percent of cases?

A? b? c? d!
A?
b?
c?
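For anyone who wants to check a submission locally, the test cases above can be written down as a small table. This is only a sketch: `CASES`, `find_questions` and `check` are hypothetical names, and `find_questions` is just one possible reference solver, not the intended solution. Outputs are stripped before comparing because leading and trailing spaces are allowed.

```python
import re

# (input, expected questions) pairs copied from the test cases above.
CASES = [
    ("Huh? I haven't heard what you just said. Could you repeat that please?",
     ["Huh?", "Could you repeat that please?"]),
    ("plz can i haz cheesburgr? i am cat pls.",
     ["plz can i haz cheesburgr?"]),
    ("This is a badly formatted question.Can u please help me,or my friends,"
     "with formatting this question    ?thankyou.",
     ["Can u please help me,or my friends,with formatting this question    ?"]),
    ("a.b.c.d?", ["d?"]),
    ("Does this question have a question mark? yes",
     ["Does this question have a question mark?"]),
    ("Why example.com resolves to 127.0.0.1 in 99.9 percent of cases?",
     ["9 percent of cases?"]),
    ("A? b? c? d!", ["A?", "b?", "c?"]),
]


def find_questions(text):
    # One possible reference solver: a run of non-terminators followed by '?'.
    return re.findall(r"[^.?!]*\?", text)


def check(solver):
    """Return True if solver reproduces every expected output (ignoring outer spaces)."""
    return all([q.strip() for q in solver(text)] == expected
               for text, expected in CASES)


print(check(find_questions))
```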

This is code-golf, so the submission with the lowest number of bytes wins!

Adnan

Posted 2016-07-19T22:51:28.807

Reputation: 41 965

2.?! Others must not be handled, as your rules specify, but you have said others don't have to be handled. – Erik the Outgolfer – 2016-07-20T07:27:50.167

No testcase with more than one question? – edc65 – 2016-07-20T13:47:30.453

@edc65 The first test case has 2 questions, but I'll add another test case. – Adnan – 2016-07-20T13:48:14.577

Answers

9

Retina, 13 11 bytes

!`[^.?!]*\?

Try it online!

!`       print all matches
[^.?!]*  any number of non-ending-punctuation symbols
\?       followed by a question mark

Thanks to @MartinEnder for 2 bytes!

Doorknob

Posted 2016-07-19T22:51:28.807

Reputation: 68 138

5

Python, 46 Bytes

import re
f=lambda s:re.findall("[^!?.]*\?",s)

Call with:

f("your string here")

output on tests:

['Can u please help me,or my friends,with formatting this question    ?', 'Can u please help me,or my friends,with formatting this question    ?', ' Huh?', ' Could you repeat that please?', ' plz can i haz cheesburgr?', 'd?', 'Does this question have a question mark?', '9 percent of cases?', 'A?', ' b?', ' c?']

another idea, 77 bytes (in python3 you'd need a list around filter):

import re
f=lambda s:filter(lambda x:x[-1]=="?",re.split("(?<=[\.\?!]).",s))

I'm new to this, so this could probably be much shorter.

-17 (!) bytes thanks to Martin

-2 bytes by matching anything that is not "!","?" or "." (Getting close to the shell solutions, but I doubt I could save much more)

KarlKastor

Posted 2016-07-19T22:51:28.807

Reputation: 2 352

1Welcome to Programming Puzzles and Code Golf! Very nice first answer :). – Adnan – 2016-07-20T10:50:04.690

I don't think you need that lookbehind at all and neither do you need to make the [\w,:; ]* ungreedy (because that group can't go past a punctuation character anyway), and then you also don't need to prepend . to your input. You can also shorten the remaining character class to [^.!?]. – Martin Ender – 2016-07-20T11:42:43.827
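The point about greediness can be checked directly. The sketch below is only an illustration with a made-up variable layout: it compares the greedy and the ungreedy form of the shortened character class on one of the test inputs. Since the class excludes `?` itself, the run can only end right before the question mark either way, so both patterns should find the same matches.

```python
import re

text = "Huh? I haven't heard what you just said. Could you repeat that please?"

# Greedy run of non-terminators followed by '?'.
greedy = re.findall(r"[^.!?]*\?", text)
# Same pattern with an ungreedy (lazy) quantifier.
lazy = re.findall(r"[^.!?]*?\?", text)

print(greedy)
print(greedy == lazy)
```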

Thank you, Martin! I have tried around with this a bit, but didn't see the obvious. – KarlKastor – 2016-07-20T19:06:31.790

4

JavaScript, 35 24 bytes

a=>a.match(/[^.?!]*\?/g)

Returns all substrings that start after a ., ?, or ! (or the beginning of the text) and end in a ?.

Business Cat

Posted 2016-07-19T22:51:28.807

Reputation: 8 927

Urgh. And I thought I did good with 40 bytes. Good Job OP and @MartinEnder – MayorMonty – 2016-07-29T01:56:23.447

3

V, 12 bytes

Í[^.!?]*[.!]

Try it online!

A very straightforward answer.

Í             "Remove every occurrence, on every line
 [^.!?]       "Of any character that isn't '.', '!', or '?'
       *      "Repeated any number of times
        [.!]  "Followed by a '.' or a '!'

Thankfully, handling newlines or verifying all test cases does not add any bytes.
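For readers who don't know V, the same "delete everything that isn't a question" idea can be approximated in Python with `re.sub`. This is a sketch under the assumption that the V regex behaves like the equivalent Python one; `keep_questions` is a name invented for this example.

```python
import re


def keep_questions(text):
    # Delete every run of non-terminators that ends in '.' or '!'.
    # Whatever is left over is what the pattern could not remove.
    return re.sub(r"[^.!?]*[.!]", "", text)


print(keep_questions("Huh? I haven't heard what you just said. Could you repeat that please?"))
print(keep_questions("a.b.c.d?"))
```

Note that in this Python sketch, text after the last terminator is not matched by the pattern and therefore stays in the output.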

James

Posted 2016-07-19T22:51:28.807

Reputation: 54 537

3

Shell utilities, 43 38 bytes

Thanks to rexkogitans for trimming 5 bytes!

tr ? "\n"|sed "s/.*[\.!]//;s/.\+/&?/"

Pipe input in, like this:

echo Huh? I haven't heard what you just said. Could you repeat that please?|tr ? "\n"|sed "s/.*[\.!]//;s/.\+/&?/"

If it matters, I'm using:

  • GNU tr 5.3.0
  • GNU sed 4.2.1
  • Microsoft's cmd.exe, shipped with Windows 5.1.2600

Joe

    Posted 2016-07-19T22:51:28.807

    Reputation: 895

    1I've never submitted an answer using multiple utilities before, so if I'm doing something wrong, let me know. – Joe – 2016-07-19T23:52:09.673

    Do you use windows versions of the GNU utilities, or a shell for windows? – Erik the Outgolfer – 2016-07-20T08:02:23.457

    tr ? "\n"|sed "s/.*[\.!]//;s/.\+/&?/" saves 5 Bytes (two added in tr and 7 saved in sed - this was tested in bash). /g is not necessary, as it is processed line by line. – rexkogitans – 2016-07-20T08:53:54.983

    @EʀɪᴋᴛʜᴇGᴏʟғᴇʀ, I'm using the Windows ports of the GNU utilities. – Joe – 2016-07-20T21:15:50.750

    @rexkogitans, thanks! I forgot about []; my first attempt at doing that looked something like \(\.|!\). – Joe – 2016-07-20T21:16:45.767

    @SirBidenXVII If you were to do it with groups (not necessarily good in this case), you could use the -r option to sed, which allows you to write (\.|!) – someonewithpc – 2016-07-22T22:29:26.057

    3

    Jelly, 16 bytes

    f€“.?!”0;œṗfÐf”?
    

    Try it online! or verify all test cases

    Dennis

    Posted 2016-07-19T22:51:28.807

    Reputation: 196 637

    28 bytes, isn't it? (16 UTF-8 chars) – Fabio Iotti – 2016-07-20T06:09:20.473

    6@bruce965 Jelly uses a custom code page that encodes each of the 256 characters it understands as a single byte each. The bytes link in the header points to it. – Dennis – 2016-07-20T06:10:47.567

    Oh, cool! I'm not a codegolfer yet, so I'm not aware of these tricks, sorry for the question. – Fabio Iotti – 2016-07-20T06:33:31.710

    4@bruce965 For the record, it's not really a trick: the language could just as well use ISO 8859-1 (or some other existing single-byte encoding) and be just as powerful, but using a custom code page allows you to use more easily typable characters and better mnemonics than if you had to code with control characters for example. At the end of the day, it's just a stream of bytes, where every byte has been assigned some meaning. – Martin Ender – 2016-07-20T07:18:19.537

    2OK, "trick" might have sounded with a bad connotation, I should have said "stratagem" or something. I couldn't find any better word than "trick". – Fabio Iotti – 2016-07-20T07:55:59.773

    @bruce965 Have a look here so you'll be prepared for more odd glyphs. – Adám – 2016-07-29T13:21:26.877
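The two byte counts discussed in these comments can be reproduced with a few lines of Python. This is just a sketch: Python ships no codec for Jelly's code page, so only the character count (which equals the byte count under a one-byte-per-character code page) and the UTF-8 size are computed. The NFC normalization is an extra precaution in case the text was copied in decomposed form.

```python
import unicodedata

# The Jelly program from the answer, normalized so each glyph is one code point.
program = unicodedata.normalize("NFC", "f€“.?!”0;œṗfÐf”?")

chars = len(program)                       # one byte each in a single-byte code page
utf8_bytes = len(program.encode("utf-8"))  # size if the file were saved as UTF-8

print(chars, utf8_bytes)
```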

    2

    Perl 5.10, 21 18 bytes (with -n flag)

    say m/[^?.!]+\?/g
    

    Straightforward implementation of the question.

    Try it here!

    Paul Picard

    Posted 2016-07-19T22:51:28.807

    Reputation: 863

    You can get rid of the leading m of your regex, and then you'll be able to remove the space between say and / – Dada – 2016-07-22T11:01:20.067

    2

    Ruby 1.9, 17 bytes

    $_=$F
    

    A 5-byte program that must be invoked with the following command line options:

    paF[^?]*[.!]
    

    xsot

    Posted 2016-07-19T22:51:28.807

    Reputation: 5 069

    I didn't know Ruby flags could be wrestled with in such a manner, +1! Feels kind of odd, though, since consecutive questions will be together as one string within that array while other questions are separate, right? Unless there's a Ruby 1.9 quirk I'm not aware of. – Value Ink – 2016-07-29T07:14:05.270

    @ValueInk The contents of the array will be concatenated so the program outputs a single string, not an array literal. You can try it out at http://golf.shinh.org/check.rb which has ruby 1.9. The flags can be set in the shebang. – xsot – 2016-07-29T09:26:17.723

    Aha, that explains why you need 1.9 since 2.0 and up output it to look like an actual array. – Value Ink – 2016-07-29T21:11:33.183
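To make the discussion above a bit more concrete, here is a rough Python approximation of what the flags appear to do: split the input on the separator pattern `[^?]*[.!]` and print the remaining fields joined together. This is only a sketch of the idea, not a faithful model of Ruby 1.9's flag handling, and `split_fields` and `ruby_like` are names made up for this example.

```python
import re


def split_fields(text):
    # Use "anything up to a '.' or '!'" as the field separator.
    return re.split(r"[^?]*[.!]", text)


def ruby_like(text):
    # Concatenate the fields into a single output string.
    return "".join(split_fields(text))


print(split_fields("Huh? I haven't heard what you just said. Could you repeat that please?"))
print(split_fields("A? b? c? d!"))
```

In this approximation, consecutive questions end up in the same field because no separator occurs between them, which is the behaviour Value Ink asked about.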

    1

    Batch, 174 bytes

    @echo off
    set/ps=
    set t=
    :l
    set c=%s:~0,1%
    set t=%t%%c%
    if "%c%"=="?" echo %t%&set t=
    if "%c%"=="!" set t=
    if "%c%"=="." set t=
    set s=%s:~1%
    if not "%s%"=="" goto l
    

    Reading a line from STDIN is a byte shorter than using set s=%*.

    Neil

    Posted 2016-07-19T22:51:28.807

    Reputation: 95 035

    1

    PowerShell v4+, 43 bytes

    ([regex]::Matches($args,'[^?!.]*\?')).Value
    

    Really straightforward. Takes input $args and feeds that in as the first parameter to a .NET [regex]::Matches(...) static function. The regex we're matching is [^?!.]*\? -- that is, any number of non-sentence-ending characters that are followed by a question mark. The static function returns an array of objects detailing what capture group, index, etc., but we only want the .Values, so the return is encapsulated in parens and we call that property. This is where the v4+ requirement comes into play, as in prior versions you'd need to instead do something like a loop |%{$_.Value} or |Select Value to get the appropriate properties.

    Example without the parens and .Value

    PS C:\Tools\Scripts\golfing> .\what-exactly-was-your-question.ps1 "Huh? I haven't heard what you just said! Could you repeat that please?"
    
    Groups   : {Huh?}
    Success  : True
    Captures : {Huh?}
    Index    : 0
    Length   : 4
    Value    : Huh?
    
    Groups   : { Could you repeat that please?}
    Success  : True
    Captures : { Could you repeat that please?}
    Index    : 40
    Length   : 30
    Value    :  Could you repeat that please?
    

    Example with the parens and .Value

    PS C:\Tools\Scripts\golfing> .\what-exactly-was-your-question.ps1 "Huh? I haven't heard what you just said! Could you repeat that please?"
    Huh?
     Could you repeat that please?
    

    AdmBorkBork

    Posted 2016-07-19T22:51:28.807

    Reputation: 41 581

    1

    Python 3, 91 bytes

    def f(x,a=0):
     for n in range(len(x)):
      if x[n]in".!":a=n+1
      if x[n]is"?":print(x[a:n+1])
    

    Saves 1 byte in Python 2:

    def f(x,a=0):
     for n in range(len(x)):
      if x[n]in".!":a=n+1
      if x[n]is"?":print x[a:n+1]
    

    Daniel

    Posted 2016-07-19T22:51:28.807

    Reputation: 6 425