#Hashtag_or_not

25

2

In this code golf challenge, you will verify hashtags!

#What_your_code_should_do

Input is a string. Output a truthy value if it is a valid hashtag, and a falsy value otherwise.

We define a string as a valid Hashtag if ...

  • It starts with a hash (#).
  • It doesn't have a digit right after the hash (e.g. #2016USElection isn't a valid hashtag).
  • It doesn't have any "special characters" (i.e. any character that isn't a letter, an underscore (_) or a digit).

You can assume that the input only contains ASCII characters. (It would be unfair if we did Unicode too.)

#Rules

Basic rules apply.

#Examples

Truthy:

#
#e
#_ABC
#thisisanunexpectedlylongstringxoxoxoxo
#USElection2016

Falsy:

Hello, World!
#12thBday
#not-valid
#alsoNotValid!
#!not_a_hash
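
The rules and examples above can be captured in a short, non-golfed reference check. This is only a sketch: the name `is_hashtag` is made up for illustration and is not part of the challenge, and the code assumes ASCII input, as the challenge states.

```python
import string

# Characters allowed after the leading '#': ASCII letters, digits, underscore.
ALLOWED = set(string.ascii_letters + string.digits + "_")


def is_hashtag(s: str) -> bool:
    """Return True if s is a valid hashtag under the challenge's rules.

    `is_hashtag` is a hypothetical helper name, not part of the challenge.
    """
    # Rule 1: the string must start with a hash.
    if not s.startswith("#"):
        return False
    body = s[1:]
    # Rule 2: no digit right after the hash (a lone '#' has no body at all).
    if body and body[0] in string.digits:
        return False
    # Rule 3: only letters, digits and underscores after the hash.
    return all(c in ALLOWED for c in body)


if __name__ == "__main__":
    for case in ["#", "#e", "#_ABC", "#USElection2016", "#12thBday", "#not-valid"]:
        print(repr(case), is_hashtag(case))
```

Golfed answers only need to agree with this on the listed test cases; how they get there is up to the language.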

user54200

Posted 2016-07-18T11:08:40.170

Reputation:

10Is # really a valid hashtag? – Adám – 2016-07-18T11:35:04.900

1@Adám Why not?? – None – 2016-07-18T11:35:33.777

4Is #öäü valid? – chrki – 2016-07-18T12:15:08.980

Should the empty string return falsy? Or are we allowed to require at least one character? – owacoder – 2016-07-18T13:00:55.117

7# is not a valid hashtag on any system, Facebook or Twitter, and it also breaks the rules set. I'm also not sure #_ABC is valid on them, but I'm not certain of that. – Martin Barker – 2016-07-18T13:23:40.987

3I assume an alphabet means ascii uppercase or lowercase letter? i.e. abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ? – Rɪᴋᴇʀ – 2016-07-18T14:14:25.387

7A # is not a hashtag. It's a hash. It, followed by a string is what social media networks refer to as a hashtag. It's a tag, which starts with a hash. – i-CONICA – 2016-07-18T15:43:03.847

1Will we receive empty string as an input? – Leaky Nun – 2016-07-18T16:44:10.110

Does printing nothing count as False? – kenorb – 2016-07-18T22:26:58.177

@LeakyNun No, you won't. – None – 2016-07-19T06:51:12.333

2This is similar to the definition of a valid identifier in many languages. Not sure if abusing eval would be possible though. – gcampbell – 2016-07-19T12:40:39.883

You should either allow - or disallow both - and _, in my opinion. – haykam – 2016-07-20T02:58:42.363

@Peanut Why? The challenge works perfectly well as is, and _ is treated as a word character in a lot of contexts while - almost never is. – Martin Ender – 2016-07-20T07:12:56.613

Answers

19

Retina, 12 bytes

^#(?!\d)\w*$

Prints 1 for hashtags and 0 otherwise.

Try it online! (The first line enables a linefeed-separated test suite.)

Not much to explain here, this is quite a literal implementation of the definition: ^ and $ are just anchors ensuring that the match covers the entire string, # checks that the string starts with a #, (?!\d) ensures that the next character isn't a digit (without advancing the regex engine's position), \w* checks that we can reach the end of the string with zero or more letters, digits or underscores.

By default, Retina counts the number of matches of the given regex, which is why this gives 1 for valid hash tags and 0 otherwise.
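
For readers without Retina at hand, the same pattern and the counting behaviour can be imitated in Python. This is a rough sketch, assuming Python's `re` module as a stand-in for the .NET regex engine; `count_matches` is a hypothetical name, and `re.ASCII` is added so that `\w` and `\d` only cover ASCII, which is all the challenge requires.

```python
import re

# Same pattern as the Retina answer. re.ASCII restricts \w and \d to ASCII.
HASHTAG = re.compile(r"^#(?!\d)\w*$", re.ASCII)


def count_matches(s: str) -> int:
    """Mimic the counting described above: how many times the regex matches."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return len(HASHTAG.findall(s))


if __name__ == "__main__":
    for case in ["#", "#USElection2016", "#12thBday", "Hello, World!"]:
        print(repr(case), count_matches(case))
```

Because the pattern is anchored at both ends, the count can only ever be 0 or 1 for a single-line input.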

Martin Ender

Posted 2016-07-18T11:08:40.170

Reputation: 184 808

In Perl, (?!\d) is (?=\D)... but I don't know how you've written Retina. Is it possible you could use (?\D) without the = and save a byte? (If not, is it worth editing the language so that's doable?) – msh210 – 2016-07-19T15:08:00.137

2@msh210 (?!\d) is different from (?=\D) in that the latter requires some character after the current position while the former is satisfied with the end of the string. Regardless of that, adjusting the regex flavour is currently not possible (since I'm just handing the regex to .NET's regex engine), but making such changes is on the roadmap somewhere (very far) down the line. – Martin Ender – 2016-07-19T15:13:16.180

1That said, I don't think I'll make the = optional. The entire (?...) syntax was chosen for extensibility, in that the character after the ? is never optional and determines which kind of group this is, and I don't think I want to give up that extensibility. – Martin Ender – 2016-07-19T15:13:17.837

(re your first comment) Duh, of course, I shoulda noted that. But it's irrelevant to this answer. (re your second) Yeah, makes sense. There is after all also (?{ and (?? and (?< (both for capturing groups and for lookbehind) and (?- and (?1 and of course the basic (?:. And maybe some I've missed. – msh210 – 2016-07-19T15:19:45.870
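
The difference between (?!\d) and (?=\D) raised in the comments above shows up on the lone "#" input. A small sketch using Python's `re` module (an assumption for illustration; the answer itself runs on .NET's engine, though both lookarounds behave the same way here):

```python
import re

# Negative lookahead: "the next character is not a digit" (also true at the end).
negative = re.compile(r"^#(?!\d)\w*$", re.ASCII)
# Positive lookahead: "there is a next character, and it is a non-digit".
positive = re.compile(r"^#(?=\D)\w*$", re.ASCII)

# A lone '#' has no next character at all.
lone_hash_negative = negative.search("#") is not None
lone_hash_positive = positive.search("#") is not None

if __name__ == "__main__":
    print("(?!\\d) accepts '#':", lone_hash_negative)
    print("(?=\\D) accepts '#':", lone_hash_positive)
```

On inputs with at least one character after the hash, the two patterns agree.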

6

Perl, 22 bytes

21 bytes code +1 for -p

$_=/^#([a-z_]\w*)?$/i

Prints 1 if it's a valid hashtag, empty string otherwise.

Usage

perl -pe '$_=/^#([a-z_]\w*)?$/i' <<< '#'
1
perl -pe '$_=/^#([a-z_]\w*)?$/i' <<< '#_test'
1
perl -pe '$_=/^#([a-z_]\w*)?$/i' <<< '#1test'

Saved 2 bytes thanks to Martin Ender (and another 4 using his lookaround method)


Perl, 18 bytes

17 bytes code +1 for -p

Using Martin's lookaround this can be much shorter!

$_=/^#(?!\d)\w*$/

Dom Hastings

Posted 2016-07-18T11:08:40.170

Reputation: 16 415

You copied Martin's one and edited it, right? – None – 2016-07-19T06:54:34.337

@MatthewRoh The second answer uses Martin's mechanism yes. He did say I could use it, but I didn't want it to be my main answer as I didn't come up with it myself! I've added it for comparison. Retina still beats Perl easily in this type of challenge! – Dom Hastings – 2016-07-19T06:56:11.837

6

JavaScript (ES6), 25 bytes

s=>/^#(?!\d)\w*$/.test(s)

F = s => /^#(?!\d)\w*$/.test(s)
input.oninput = () => result.innerHTML = input.value ? F(input.value) ? '\ud83d\udc8e' : '\ud83d\udca9' : '\ud83d\udcad';
#input, #result {
  vertical-align: middle;
  display: inline-block;
}
#input {
  line-height: 2em;
}
#result {
    font-size: 2em;
}
<input id="input" type="text"/> <span id="result">&#x1f4ad</span>

George Reith

Posted 2016-07-18T11:08:40.170

Reputation: 2 424

5

C, 80 bytes

Function f() takes the string as an argument and modifies int *b to either 1 or 0 to indicate truthy/falsy.

f(char*p,int*b){for(*b=(*p==35)&&!isdigit(p[1]);*p;++p)*b&=isalnum(*p)||*p==95;}

If the string always has at least one character (i.e. never an empty string), a byte can be shaved off for 79 bytes:

f(char*p,int*b){for(*b=(*p==35)&&!isdigit(p[1]);*++p;)*b&=isalnum(*p)||*p==95;}

owacoder

Posted 2016-07-18T11:08:40.170

Reputation: 1 556

5

Python 3, 41 bytes

import re
re.compile('#(?!\d)\w*$').match

Gábor Fekete

Posted 2016-07-18T11:08:40.170

Reputation: 2 809

This should be absolutely fine. Since match objects are truthy and None is falsey, I think dropping the bool() is okay. – Lynn – 2016-07-19T14:42:19.173

Yeah I thought about that, thanks for clarifying it! – Gábor Fekete – 2016-07-19T14:44:04.663

This generates truthy value for “#fix me Gábor” too. BTW, I see the rules are getting ignored by others too, but this we used to consider a snippet, which usually is not accepted as answer unless the question explicitly allows them. – manatwork – 2016-07-19T14:57:25.263

Thanks, I rewrote it to handle the case you wrote and made it a lambda function. – Gábor Fekete – 2016-07-19T15:00:22.107

2How about re.compile('#(?!\d)\w*$').match? It’s acceptable to drop the f=, BTW. – Lynn – 2016-07-19T16:19:34.320

Oh, nice idea, it will be then a function too! – Gábor Fekete – 2016-07-19T17:04:00.747
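
The comments above rely on two properties of Python's `re` module: a compiled pattern's `match` method only matches at the start of the string (so no `^` is needed), and it returns either a match object or `None`, which are truthy and falsy respectively. A small usage sketch; the name `f` is arbitrary, and a raw string is used here to avoid the invalid-escape warning that newer Python versions emit for '\d' in a plain string.

```python
import re

# The submission is a bound method; binding it to a name makes it callable as f(s).
f = re.compile(r'#(?!\d)\w*$').match

valid = f('#USElection2016')      # a match object, which is truthy
invalid = f('#fix me Gábor')      # None: the space stops \w* before the end
not_at_start = f('hello#world')   # None: match() only tries position 0

if __name__ == "__main__":
    print(bool(valid), invalid, not_at_start)
```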

4

Brachylog, 55 bytes

"#"|h"#",?b@lL'(eE,@A:"1234567890":"_"c'eE),@A:"_"ce~hL

This uses no regex.

Explanation

Main predicate, Input (?) is a string

  "#"                           ? = "#"
|                               Or
  h"#",                         First character of ? is "#"
  ?b@lL                         L is the rest of the chars of ? lowercased
  '(                            It is not possible for any char of L that...
    eE,                           Call this char E
    @A:"1234567890":"_"c          Concatenate the lowercase alphabet with the digits and "_"
    'eE                           E is not a member of that concatenated string
  ),
  @A:"_"c                       Concatenate the lowercase alphabet with "_"
  e~hL                          One char of that concatenated string is the first char of L
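
The predicate explained above can be read as the following Python sketch. It is a loose transcription made for illustration, not generated from the Brachylog code; `brachylog_style` is a hypothetical name.

```python
import string

LOWER = string.ascii_lowercase


def brachylog_style(s: str) -> bool:
    """Follow the structure of the explanation: '"#"' or the longer alternative."""
    # First alternative: the input is exactly "#".
    if s == "#":
        return True
    # Second alternative: the first character must be "#".
    if s[:1] != "#":
        return False
    # L: the remaining characters, lowercased (non-empty at this point).
    rest = s[1:].lower()
    allowed = LOWER + "1234567890" + "_"
    # No character of L may fall outside letters, digits and "_".
    if any(ch not in allowed for ch in rest):
        return False
    # The first character of L must be a letter or "_" (so not a digit).
    return rest[0] in LOWER + "_"


if __name__ == "__main__":
    for case in ["#", "#_ABC", "#12thBday", "#not-valid"]:
        print(repr(case), brachylog_style(case))
```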

Fatalize

Posted 2016-07-18T11:08:40.170

Reputation: 32 976

4

Python 3, 103 93 bytes

all((c=='_'or c.isalpha()*i>0)^(i<1and'#'==c)^(c.isdigit()*i>1)for i,c in enumerate(input()))

The # being True killed me here: I had to enumerate the string to avoid an index error on the single-character input.

atlasologist

Posted 2016-07-18T11:08:40.170

Reputation: 2 945

1+1. Nice! I completely forgot the isalpha() method on my py3 answer :D "#" being true, also destroyed me. – Yytsi – 2016-07-18T13:59:51.637

4

PowerShell v2+, 25 bytes

$args-match'^#(?!\d)\w*$'

Using Martin's regex, just wrapped up in PowerShell's -match operator coupled with the input $args. For truthy/falsey values, this will return the string itself on a match (a truthy value) or nothing on a non-match (a falsey value). This is because when a comparison operator is applied against an array, it returns anything that satisfies that operator.

A couple examples (wrapped in a [bool] cast to make the output more clear):

PS C:\Tools\Scripts\golfing> [bool](.\hashtag-or-not.ps1 '#2016Election')
False

PS C:\Tools\Scripts\golfing> [bool](.\hashtag-or-not.ps1 'Hello, World!')
False

PS C:\Tools\Scripts\golfing> [bool](.\hashtag-or-not.ps1 '#')
True

PS C:\Tools\Scripts\golfing> [bool](.\hashtag-or-not.ps1 '')
False

PS C:\Tools\Scripts\golfing> [bool](.\hashtag-or-not.ps1 '#USElection2016')
True
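
The array behaviour described above (the operator returns the elements that satisfy it, and an empty result is falsey) can be imitated in Python. This is only an analogy, not how PowerShell is implemented: `match_filter` is a hypothetical helper, and `re.IGNORECASE` stands in for -match being case-insensitive by default.

```python
import re


def match_filter(args, pattern):
    """Return the elements of args that match pattern, like -match on an array."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [a for a in args if rx.search(a)]


PATTERN = r'^#(?!\d)\w*$'
kept = match_filter(['#USElection2016'], PATTERN)   # the input comes back
dropped = match_filter(['#2016Election'], PATTERN)  # nothing comes back

if __name__ == "__main__":
    print(kept, bool(kept))
    print(dropped, bool(dropped))
```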

AdmBorkBork

Posted 2016-07-18T11:08:40.170

Reputation: 41 581

3

Mathematica, 52 46 43 bytes

Saved 9 bytes thanks to @MartinEnder.

StringMatchQ@RegularExpression@"#(?!\d)\w*"

Function. Takes a string as input, and returns True or False as output. Pretty simple, just matches against the regex /#(?!\d)\w*/.

LegionMammal978

Posted 2016-07-18T11:08:40.170

Reputation: 15 731

I have reason to believe that this won't work for inputs like hello#world since you don't have the beginning and end string anchors. I don't know Mathematica though so I'm not sure. – Value Ink – 2016-07-18T17:21:38.910

All righty, I can live with that. Have your +1 – Value Ink – 2016-07-18T20:39:56.363
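
The concern in the comments is about anchoring. StringMatchQ tests whether the whole string matches the pattern, which is why the anchors can be dropped. Python draws the same distinction between `re.search` and `re.fullmatch`; the sketch below uses Python purely as a stand-in to illustrate that difference.

```python
import re

# The same regex, without ^ or $ anchors.
pattern = re.compile(r'#(?!\d)\w*', re.ASCII)

# search() looks for the pattern anywhere in the string.
found_inside = pattern.search('hello#world') is not None
# fullmatch() requires the whole string to match, like a whole-string test.
whole_string = pattern.fullmatch('hello#world') is not None

if __name__ == "__main__":
    print("search:", found_inside)
    print("fullmatch:", whole_string)
```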

3

Python3 - 156 128 bytes

lambda n:n=="#"or(n[0]=="#")*all(any([47<ord(c)<58,64<ord(c)<91,ord(c)==95,96<ord(c)<123])for c in n[1:]+"0")*~(47<ord(n[1])<58)

A solution that doesn't use regex. 0 is falsey and every other value is truthy.

Thanks to @LeakyNun for saving bytes!

Yytsi

Posted 2016-07-18T11:08:40.170

Reputation: 3 582

@LeakyNun I had to remove the +0 after n[1:], but sadly, still didn't work :/ Gave false to "#d". – Yytsi – 2016-07-18T15:32:37.537

@LeakyNun still doesn't work :( Again, had to remove +0 but fails on "#d". I tested it on Python3 though. Not sure if it will work on Python2 – Yytsi – 2016-07-18T15:37:18.523

@LeakyNun Just plain false. – Yytsi – 2016-07-18T15:47:05.977

@LeakyNun Throws IndexOutOfRange for "#" and False for "#d". – Yytsi – 2016-07-18T16:10:29.940

lambda n:n=="#"or(n[0]=="#")*all(any([47<ord(c)<58,64<ord(c)<91,ord(c)==95,96<ord(c)<123])for c in n[1:]+"0")*~(47<ord(n[1])<58) for 128 bytes. Proof that it works – Leaky Nun – 2016-07-18T16:44:47.097

@LeakyNun Now that works. The return values are pretty odd: "12" --> 0, "#" --> true, "#ab" --> -1, "#1b" --> -2. Can I just treat every output other than "True" as falsey? – Yytsi – 2016-07-18T16:50:52.403

Here is the meta post for reference. – Leaky Nun – 2016-07-18T17:01:11.500

@LeakyNun Thanks! I managed to golf my solution down to 131 bytes, but yours is currently better, so I'll take it. – Yytsi – 2016-07-18T21:34:05.893

No, Falsey is 0, every other value will be truthy, in this case. – Leaky Nun – 2016-07-19T07:39:11.450

@LeakyNun Oops, I'll fix it, thanks! – Yytsi – 2016-07-19T10:38:00.590
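
The odd return values discussed in the comments come from ~, which is bitwise NOT: applied to a bool it yields -1 or -2, and both are truthy in Python. The sketch below binds the lambda as posted to `f` and compares it with a variant `g` that uses `not` instead. `g` is an assumption made for illustration, not the author's fix.

```python
# The lambda as posted, bound to a name so it can be called.
f = lambda n:n=="#"or(n[0]=="#")*all(any([47<ord(c)<58,64<ord(c)<91,ord(c)==95,96<ord(c)<123])for c in n[1:]+"0")*~(47<ord(n[1])<58)

# Hypothetical variant: logical negation makes the last factor 0 or 1.
g = lambda n:n=="#"or(n[0]=="#")*all(any([47<ord(c)<58,64<ord(c)<91,ord(c)==95,96<ord(c)<123])for c in n[1:]+"0")*(not 47<ord(n[1])<58)

if __name__ == "__main__":
    for case in ["12", "#", "#ab", "#1b"]:
        # Show the raw value and its truthiness for both versions.
        print(repr(case), f(case), bool(f(case)), g(case), bool(g(case)))
```

With `~`, an input such as "#1b" still produces a non-zero (truthy) value, which is the problem Leaky Nun points out.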

3

Dyalog APL, 22 20 bytes

Without RegEx:

{0≤⎕NC 1↓⍵,⎕A}∧'#'=⊃

-2 thanks to ngn

Adám

Posted 2016-07-18T11:08:40.170

Reputation: 37 779

1Oh, wow. There are still people who know APL. It's 37 years since I used it! – Auspex – 2016-07-19T10:39:25.977

@Auspex APL is alive and well, but quite a few features have been added in those years. – Adám – 2016-07-19T11:17:43.370

3

Octave, 37 56 54 43 bytes

Thanks to @LuisMendo for removing 8 bytes!

@(s)s(1)==35&(isvarname(s(2:end))|nnz(s)<2)

Not very golfy, but very built-inny.
Edit: The original code accepted strings with no leading '#'. I guess I should have stuck with regex.
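
The same built-in trick carries over to other languages that can test for a valid identifier. As a loose Python analogue (an assumption for illustration, not a port of the Octave code), `str.isidentifier` plays the role of isvarname. One caveat: `isidentifier` also accepts non-ASCII letters, which is harmless only because the challenge guarantees ASCII input. The name `is_hashtag_ident` is hypothetical.

```python
def is_hashtag_ident(s: str) -> bool:
    """'#' followed by nothing, or by something that is a valid identifier."""
    # s[:1] avoids an IndexError on the empty string.
    return s[:1] == "#" and (len(s) < 2 or s[1:].isidentifier())


if __name__ == "__main__":
    for case in ["#", "#_ABC", "#USElection2016", "#12thBday", "Hello, World!"]:
        print(repr(case), is_hashtag_ident(case))
```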

Test suite on ideone.

beaker

Posted 2016-07-18T11:08:40.170

Reputation: 2 349

3

Python 2, 79 bytes

lambda x:x=='#'or(1>x[1].isdigit())&x[1:].replace('_','').isalnum()&('#'==x[0])

First golfing attempt. Ungolfed version:

def f(x):
    if x == '#':
        return True
    else:
        return x[0]=='#' and x[1:].replace('_','').isalnum() and not x[1].isdigit()

Cowabunghole

Posted 2016-07-18T11:08:40.170

Reputation: 1 590

Nice answer, and welcome to the site! – James – 2016-07-18T17:24:44.943

2

Google Sheets, 30 bytes

An anonymous worksheet function that takes input from cell A1, checks it against the RE2 expression, and outputs the result to the calling cell.

=RegexMatch(A1,"^#([a-z_]\w*)?

Taylor Scott

Posted 2016-07-18T11:08:40.170

Reputation: 6 709

2

Lua, 59 55 54 bytes

Code

s=arg[1]print(load(s:sub(2).."=0")and s:sub(1,1)=="#")

How it works:

  1. Check if the rest of the characters can be a valid Lua identifier (identifiers in Lua follow the same rules as hashtags).
  2. Check if the first character is a #.

Takes input from the command line. Prints true if the string is a valid hashtag; otherwise it prints nil (or false when the first character is not a #).

xaxa

Posted 2016-07-18T11:08:40.170

Reputation: 21

1

Excel VBA, 54 bytes

Anonymous VBE immediate window function that takes input from cell [A1], checks if the value of the cell matches the Like pattern, and outputs a Boolean to the VBE immediate window.

?Not[Left(A1,2)]Like"[#]#"And[A1]Like"[#][_a-zA-z0-9]*

Taylor Scott

Posted 2016-07-18T11:08:40.170

Reputation: 6 709

1

Haskell, 79 bytes

import Data.Char
f(a:r)=a=='#'&&r==""||r!!0>'@'&&all(\c->c=='_'||isAlphaNum c)r

Try it online!

Laikoni

Posted 2016-07-18T11:08:40.170

Reputation: 23 676

1

05AB1E, 18 bytes

Code:

¬'#Qs¦A«¬d_sDžjKQP

Uses the CP-1252 encoding. Try it online!

Adnan

Posted 2016-07-18T11:08:40.170

Reputation: 41 965

1

Standard ML, 121 118 107 bytes

(fn#"#"::a=>(fn x::r=>x> #"@"andalso List.all(fn#"_"=>1=1|c=>Char.isAlphaNum c)a|e=>1=1)a|e=>1=0)o explode;

Try it online! Functional solution without using regex. Declares an anonymous function which is bound to the implicit result identifier it.

> val it = fn : string -> bool    
- it "#valid_hash";
> val it = true : bool

Laikoni

Posted 2016-07-18T11:08:40.170

Reputation: 23 676

4isAlphaNum$orelse that's rather threatening... – cat – 2016-07-18T15:57:06.130

@cat this might be the sole positive thing one can say about such verbose boolean operators as orelse and andalso. – Laikoni – 2016-07-18T17:38:32.807

2It's like, AlphaNum, orelse!! (orelse what?) – cat – 2016-07-18T17:43:35.820

One might consider the o explode at the end to be quite threatening too ... – Laikoni – 2016-07-18T17:48:14.777

orelse explode D: – cat – 2016-07-18T18:09:47.977

1SML seems quite scary, I don't think I could handle that all day :c – cat – 2016-07-18T18:10:33.007

1

Pyke, 19 bytes

\#.^It,!It\_D-PRkq|

Try it here!

Quick fix for tonight

Blue

Posted 2016-07-18T11:08:40.170

Reputation: 26 661

1@kenorb rebooted it, ping me if any more issues – Blue – 2016-07-18T22:15:32.107

#123 still returns nothing; shouldn't it return 0? – kenorb – 2016-07-18T22:25:09.200

1Nothing is a boolean false – Blue – 2016-07-18T22:25:52.813

1

Python 3, 97 Bytes 70 Bytes 56 Bytes

lambda x:s=x[2:];b=x[1];all(x!="#",[x[0]=="#",any[b.isalpha(),"_"in b],any[s.isalnum(),"_"in s]])

(Code changed) Human readable

x=input()
if x[0]=="#" and x[1].isalpha() and str(x)[2:].isalnum():
    print(True)
else:
    print(False)

Dignissimus - Spammy

Posted 2016-07-18T11:08:40.170

Reputation: 449

Nice answer, and welcome to the site! Functions are also allowed, so you could shorten this quite a bit with lambda x:all(True==[x[0]=="#",x[1].isalpha(),x[2:].isalpha()]) – James – 2016-07-18T20:46:21.100

No problem, glad I could help! – James – 2016-07-18T21:01:06.650

1

I hate to be the bringer of bad news, but doesn't this fail for '#', which the OP says is truthy? Won't it also fail if the hashtag contains any underscores, which are false under isalpha? – TheBikingViking – 2016-07-18T21:23:31.497

@TheBikingViking sorry, I will try to fix this now – Dignissimus - Spammy – 2016-07-18T21:26:44.140

It might be worth marking your answer as non-competing until it is fixed, to prevent any downvoting. – TheBikingViking – 2016-07-18T21:29:49.793

@TheBikingViking Is it okay now? – Dignissimus - Spammy – 2016-07-18T22:04:38.547

Try running it past all the test cases, and see if it returns the correct output. – TheBikingViking – 2016-07-18T22:07:53.817

2@TheBikingViking That's not what non-competing means. Non-competing is not an excuse for an invalid submission. The correct procedure is to delete the answer, fix it, then undelete it. – Mego – 2016-07-18T22:44:47.087

@Mego Ah, okay, Thanks for pointing that out. – TheBikingViking – 2016-07-19T16:09:10.713

1

GNU grep, 15 + 2 = 17 bytes

grep -Ei '^#([a-z_]\w*)?$'

Test:

$ echo '#
#e
#_ABC
#thisisanunexpectedlylongstringxoxoxoxo
#USElection2016
Hello, World!
#12thBday
#not-valid
#alsoNotValid!' | grep -Ei '^#([a-z_]\w*)?$'

Output:

#
#e
#_ABC
#thisisanunexpectedlylongstringxoxoxoxo
#USElection2016

Jordan

Posted 2016-07-18T11:08:40.170

Reputation: 5 001

1

Sed, 19 + 2 = 21 bytes

/^#([a-z_]\w*)?$/Ip

This filters out all non-hashtags and outputs valid hashtags.

Run as sed -rn '/^#([a-z_]\w*)?$/Ip'. Quit with Ctrl + D (send EOF).

someonewithpc

Posted 2016-07-18T11:08:40.170

Reputation: 191

1

Ruby, 16 + 1 (n flag) = 17 bytes (previously 16 + 3 = 19)

Uses 0 as truthy and nil as falsy.

p~/^#(?!\d)\w*$/

Run it as ruby -ne 'p~/^#(?!\d)\w*$/'. Thanks to @manatwork for fixing the bash error when running the program.

Value Ink

Posted 2016-07-18T11:08:40.170

Reputation: 10 608

1

Do yourself a favor and always enclose code in single quotes. Otherwise the shell will attempt (or even worse, successfully perform) all kind of expansions. (Regarding the current issue with !, see Event Designators in man bash.) – manatwork – 2016-07-19T07:55:16.820

0

Clojure, 130 135 132 bytes

  • +5 bytes to deal with an NPE that happened when the string consisted of only a hashtag.

  • -2 bytes by using Character/isLetterOrDigit.

(fn[s](let[[h & r]s n(map int r)](and(= h\#)(not(<= 48(or(first n)0)57))(every? #(or(Character/isLetterOrDigit^long %)(= 95 %))n))))

Ungolfed:

(defn hashtag? [s]
  (let [[h & r] s
        codes (map int r)]
    (and (= h \#)
         (not (<= 48 (or (first codes) 0) 57))
         (every?
           #(or (Character/isLetterOrDigit ^long %)
                (= 95 %))
           codes))))

Carcigenicate

Posted 2016-07-18T11:08:40.170

Reputation: 3 295

Whoops, this actually gives a NPE for "#". Give me a sec. – Carcigenicate – 2016-11-29T20:08:18.410

0

Java 8, 57 54 28 bytes

s->s.matches("#(?!\\d)\\w*")

Port of Martin Ender's Retina answer to save a few bytes and match added test cases.
Note that String#matches always matches the entire String, so no need for ^...$.

Try it here.

Kevin Cruijssen

Posted 2016-07-18T11:08:40.170

Reputation: 67 575

0

C#, 92 bytes

s=>s[0]=='#'&s.Length>1&&(s[1]<48|s[1]>57)&s.Skip(1).All(x=>char.IsLetterOrDigit(x)|x=='_');

C# lambda (Predicate) where input is a string and output is a bool.

Try it online!

aloisdg moving to codidact.com

Posted 2016-07-18T11:08:40.170

Reputation: 1 767

0

Lua, 39 bytes

print(arg[1]:match("^#[%a_][%a_%d]*$"))

Straightforward copypasta of the match description. Outputs a falsy nil if the input is not a hashtag, and outputs the (truthy) hashtag back otherwise.

Can be shortened by one more byte by using find, if outputting a list of two values (which is truthy) doesn't break the rules.

Oleg V. Volkov

Posted 2016-07-18T11:08:40.170

Reputation: 171

I think this won't match a # on its own. – Martin Ender – 2016-07-19T14:44:48.100

@MartinEnder, of course. It shouldn't. None of the top answers do that either. Also http://codegolf.stackexchange.com/questions/85619/hashtag-or-not/85797#comment210296_85619 – Oleg V. Volkov – 2016-07-20T00:09:01.037

Whether # is a hashtag on Twitter or Facebook is irrelevant to this challenge. The specification is very clear on the fact that # should be considered a hashtag for the purposes of this challenge. And while I haven't checked all of the answers, all I did check do accept # as a hashtag, so I'm not sure which top answers you're referring to. – Martin Ender – 2016-07-20T07:12:13.173
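
The disagreement above comes down to whether the part after the # is required. Translated into Python regular expressions as a sketch (Lua patterns cannot apply a quantifier to a group, so the second pattern is not a drop-in Lua fix; both pattern names are made up for illustration):

```python
import re

# Direct translation of the Lua pattern: at least one character after '#'.
required = re.compile(r'^#[A-Za-z_][A-Za-z_0-9]*$')
# Making the tail optional also accepts a lone '#', as the challenge specifies.
optional = re.compile(r'^#([A-Za-z_][A-Za-z_0-9]*)?$')

if __name__ == "__main__":
    for case in ["#", "#e", "#12thBday"]:
        print(repr(case), bool(required.search(case)), bool(optional.search(case)))
```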