Get a word's individuality!

8

0

I love /usr/share/dict/words; it's so handy! I use it for all my programs, whenever I can! You're going to take advantage of this ever-so-useful file by testing a word's individuality.


Input

  • A word; defined in this challenge as any string of characters
  • /usr/share/dict/words in some format; you may hard-code it, read it from disk, take it as a second argument, or whatever makes the most sense for your solution

Output

  • A word's individuality (see below)

A word's individuality is derived from the following equation:

<the number of words for which it is a substring> / <length of the word>

Let's take a look at an example: hello. There are 12 words that contain hello as a substring; dividing by 5 (hello's length) gives an individuality of 12/5, or 2.4.
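
As a minimal sketch of the formula above, assuming the word list is already in memory as a list of strings (the name `individuality` and the tiny sample list are hypothetical, not part of the challenge):

```python
def individuality(word, words):
    """Number of entries containing `word` as a substring, divided by len(word)."""
    # `in` on strings is a literal substring test, so a word also counts itself.
    matches = sum(1 for entry in words if word in entry)
    return matches / len(word)


# Hypothetical stand-in for /usr/share/dict/words.
sample_words = ["hello", "hellos", "othello", "shell", "world"]

print(individuality("hello", sample_words))  # 3 matches / 5 letters
```

An empty word has length 0, so this sketch would raise `ZeroDivisionError` for it; the challenge does not say how that case should be handled.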


P.S. This is code-golf, so the lower the individuality score, the more individual

Because individuality is a long word, your program must be as short as possible

Good Luck!


Test Cases

You can use this convenient Node.js script, which meets the challenge requirements, to test your code. It is also how I generated the test cases:

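var fs = require("fs");
var word = process.argv[2]; // the word to score, passed as the first CLI argument

process.stdout.write("Reading file...");
fs.readFile("/usr/share/dict/words", function(err, contents) {
  // Check for a read error before reporting success
  if (err) throw err;
  console.log("Done");

  // One dictionary entry per line
  var words = contents.toString().split("\n");

  // Count the entries that contain the word as a substring
  var substrings = words.filter(w => w.indexOf(word) > -1).length;
  var length     = word.length;

  console.log(`${word} => ${substrings} / ${length} = ${substrings / length}`);
});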

Test Cases:

hello => 12 / 5 = 2.4
individuality => 1 / 13 = 0.07692307692307693
redic => 52 / 5 = 10.4
ulous => 200 / 5 = 40
challen => 15 / 7 = 2.142857142857143
ges => 293 / 3 = 97.66666666666667
hidden => 9 / 6 = 1.5
words => 12 / 5 = 2.4
aside => 8 / 5 = 1.6
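
The listed cases can be checked for internal consistency without the word list. The sketch below only redoes the arithmetic on the lines above; it does not verify the match counts themselves, which depend on the local copy of the file. `parse_case` and `inconsistent_cases` are hypothetical helper names.

```python
CASES = """\
hello => 12 / 5 = 2.4
individuality => 1 / 13 = 0.07692307692307693
redic => 52 / 5 = 10.4
ulous => 200 / 5 = 40
challen => 15 / 7 = 2.142857142857143
ges => 293 / 3 = 97.66666666666667
hidden => 9 / 6 = 1.5
words => 12 / 5 = 2.4
aside => 8 / 5 = 1.6
"""


def parse_case(line):
    """Split 'word => count / length = score' into typed fields."""
    word, rest = line.split(" => ")
    fraction, score = rest.split(" = ")
    count, length = fraction.split(" / ")
    return word, int(count), int(length), float(score)


def inconsistent_cases(text):
    """Return the words whose stated length or score does not add up."""
    bad = []
    for line in text.splitlines():
        word, count, length, score = parse_case(line)
        if length != len(word) or count / length != score:
            bad.append(word)
    return bad


print(inconsistent_cases(CASES))  # an empty list means every line is consistent
```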

MayorMonty

Posted 2016-11-23T01:28:04.750

Reputation: 778

Shouldn't it be the other way around? To make it more individual, have a higher individuality score? – Gabriel Benamy – 2016-11-23T02:21:03.453

Probably, but making breaking changes to the challenge when people may have started golfing would be unwise – MayorMonty – 2016-11-23T02:22:40.333

Can we use other word lists instead? I think this one is easier to use (being a Windows user). The list is apparently not as long, so the individuality will be higher, but this doesn't alter the challenge the way I see it.

– Stewie Griffin – 2016-11-23T07:43:29.587

Is a word a substring of itself? – FlipTack – 2016-11-23T08:12:01.620

I assume a case-insensitive match? – zeppelin – 2016-11-23T08:58:17.997

In order to keep answers uniform, I request you use the same word list, but keep in mind that you can accept it in any way. Getting the word list is not part of the challenge – MayorMonty – 2016-11-23T17:06:14.327

@Flp.Tkc Yes, basically anything that can be grepped – MayorMonty – 2016-11-23T17:07:24.277

@zeppelin in most wordlists, all the entries are lowercase. Additionally you can assume that all input will be lowercase – MayorMonty – 2016-11-23T17:08:41.733

@MayorMonty Ok, great. I asked because the /dict/words file is not all lowercase (at least not on my system). – zeppelin – 2016-11-23T18:22:49.730

Answers

1

05AB1E, 9 bytes

#vy²å}ON/

Try it online!

#         Separate by newlines or spaces.
 vy       For each entry in the dictionary.
   ²å     1 if the second argument is a substring of the current word, 0 otherwise.
     }    End loop.
      O   Sum ones and zeros.
       N  Get list size. 
        / Divide.

Magic Octopus Urn

Posted 2016-11-23T01:28:04.750

Reputation: 19 422

Looks like yours is going to be the shortest, but I'll give it a week or two – MayorMonty – 2016-11-25T03:51:44.867

3

Bash, 41, 39, 34, 33, 26 bytes

EDIT:

  • Converted from function to a script
  • One byte off by removing the ignore case flag
  • Replaced wc -l with grep -c, saving 5 bytes. Thanks @Riley !

A rather trivial solution in bash + coreutils

Golfed

bc -l<<<`grep -c $1`/${#1}

Test

>cat /usr/share/dict/words| ./test ulous
7.60000000000000000000

>grep -i ulous /usr/share/dict/words | wc -l
38
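
For readers less familiar with the shell idioms, here is a rough Python counterpart of the pipeline, assuming the word list arrives on a text stream with one entry per line. `grep -c` counts matching lines rather than total occurrences, and the sketch mirrors that; the function name and the in-memory stream are illustrative only.

```python
import io


def individuality_from_stream(word, stream):
    """Count lines containing `word` (like `grep -c`), then divide by its length."""
    # Each line counts at most once, even if the word occurs in it several times.
    matching_lines = sum(1 for line in stream if word in line.rstrip("\n"))
    return matching_lines / len(word)


# Hypothetical stand-in for `cat /usr/share/dict/words`.
fake_stdin = io.StringIO("fabulous\nulous ulous\nplain\n")
print(individuality_from_stream("ulous", fake_stdin))  # 2 matching lines / 5
```

With real input, `sys.stdin` could be passed in place of the `StringIO` object.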

zeppelin

Posted 2016-11-23T01:28:04.750

Reputation: 7 884

Would grep -ic $1 work instead of grep -i $1|wc -l? – Riley – 2016-11-23T14:37:32.680

True! (I always thought this was a GNU extension, but it turns out to be a POSIX option.) Thank you! – zeppelin – 2016-11-23T15:04:16.680

2

Python 3, 52 49 bytes

-3 bytes thanks to Kade, by assuming w is the word list as a list:

f=lambda s,w:w>[]and(s in w[0])/len(s)+f(s,w[1:])

Previous solution:

lambda s,w:sum(s in x for x in w.split('\n'))/len(s)

Assumes w to be the word list. I chose Python 3 because my word list contains some non-ASCII characters, and Python 2 does not like them.
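
The recursive lambda adds 1/len(s) for each matching entry and bottoms out when `w>[]` is `False`, which counts as 0 in the final addition. The sketch below compares it with the `sum` version on a made-up list. One caveat: each entry costs one level of recursion, so on a list much longer than Python's default recursion limit (1000 frames) the recursive form should be expected to raise `RecursionError` unless the limit is raised.

```python
import math

# The 49-byte recursive answer, taking the word list as a Python list.
f = lambda s, w: w > [] and (s in w[0]) / len(s) + f(s, w[1:])

# The earlier 52-byte answer, taking the word list as one newline-joined string.
g = lambda s, w: sum(s in x for x in w.split('\n')) / len(s)

sample = ["ages", "pages", "gesture", "goose"]  # hypothetical word list

recursive = f("ges", sample)
iterative = g("ges", "\n".join(sample))

# Adding 1/3 three times and dividing 3 by 3 may differ in the last bit,
# so compare with a tolerance instead of ==.
print(math.isclose(recursive, iterative))
```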

Karl Napf

Posted 2016-11-23T01:28:04.750

Reputation: 4 131

Since you are allowed to take the wordlist in any reasonable format, couldn't this work for 50 bytes: f=lambda s,w:w>[]and (s in w[0])/len(s)+f(s,w[1:]) – Kade – 2016-11-23T13:17:33.750

I should note you can remove the space between and and ( to make it 49 bytes. – Kade – 2016-11-23T13:23:50.993

@Kade awesome! Nice abuse of the lax requirements. – Karl Napf – 2016-11-23T13:27:13.087

@Dopapp No, because this would not match substrings – Karl Napf – 2016-11-23T18:49:11.990

2

Perl 6, 45 36 33 32 bytes

wordlist as a filename f, 45 bytes

->$w,\f{grep({/:i"$w"/},f.IO.words)/$w.chars}

wordlist as a list l, 36 bytes

->$w,\l{grep({/:i"$w"/},l)/$w.chars}

using placeholder variables, and reverse (R) meta-operator, 33 bytes

{$^w.chars R/grep {/:i"$w"/},$^z}

using .comb to get a list of characters, rather than .chars to get a count, 32 bytes

{$^w.comb R/grep {/:i"$w"/},$^z}

Expanded:

{             # block lambda with placeholder parameters 「$w」 「$z」

  $^w         # declare first parameter ( word to search for )
  .comb       # list of characters ( turns into count in numeric context )

  R[/]        # division operator with parameters reversed

  grep        # list the values that match ( turns into count in numeric context )

    {         # lambda with implicit parameter 「$_」
      /       # match against 「$_」
        :i    # ignorecase
        "$w"  # the word as a simple string
      /
    },

    $^z       # declare the wordlist to search through
              #( using a later letter in the alphabet
              #  so it is the second argument )
}
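
The `:i` adverb makes this answer case-insensitive, which matters on systems where the word list contains capitalized entries (as noted in the comments on the question). A rough Python rendering under that assumption is sketched below. Lower-casing both sides is one simple way to get a similar effect and is not claimed to match Perl 6's case folding in every corner case; the names are illustrative.

```python
def individuality_ignorecase(word, words):
    """Case-insensitive variant: compare lower-cased copies of both strings."""
    needle = word.lower()
    # Build the list of matches, then use its length as the count, similar in
    # spirit to the grep result being used in numeric context.
    matches = [entry for entry in words if needle in entry.lower()]
    return len(matches) / len(word)


sample = ["Hello", "hello", "Othello", "help"]  # hypothetical word list

print(individuality_ignorecase("hello", sample))      # 3 entries match
print(sum("hello" in entry for entry in sample) / 5)  # case-sensitive: 2 match
```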

Brad Gilbert b2gills

Posted 2016-11-23T01:28:04.750

Reputation: 12 713

1

awk: 31 bytes

The word is passed to awk as the variable w, and the word list is given as input (on stdin or as a file argument):

$0~w{N++}END{print N/length(w)}

Sample output:

 $ awk -vw=hello '$0~w{N++}END{print N/length(w)}' /usr/share/dict/words
 2.4
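
Note that `$0~w` treats `w` as a regular expression rather than a literal string, so a word containing metacharacters such as `.` may match more entries than a plain substring test would. Since the challenge defines a word as any string of characters, the difference can matter. The Python sketch below illustrates the two behaviours on a made-up list; Python's `re` syntax is not identical to awk's, so this is an analogy rather than a reimplementation.

```python
import re

sample = ["abc", "a.c", "xa.cx", "ac"]  # hypothetical word list
word = "a.c"

# Regex interpretation: "." matches any single character.
as_regex = sum(1 for entry in sample if re.search(word, entry))

# Literal interpretation: "." only matches a dot.
as_literal = sum(1 for entry in sample if word in entry)

# Escaping the word makes the regex behave like the literal test.
escaped = sum(1 for entry in sample if re.search(re.escape(word), entry))

print(as_regex / len(word), as_literal / len(word), escaped / len(word))
```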

Adam

Posted 2016-11-23T01:28:04.750

Reputation: 591

1

PHP, 54 bytes

Assumes the word list in $w.

<?=count(preg_grep("/$argv[1]/",$w))/strlen($argv[1]);

Alex Howansky

Posted 2016-11-23T01:28:04.750

Reputation: 1 183

0

Clojure, 53 bytes

Not that exciting :/

#(/(count(filter(fn[w](.contains w %))W))(count %)1.)

That 1. is there to convert a rational into a float. I pre-loaded the words into W as follows:

(def W (map clojure.string/lower-case (clojure.string/split (slurp "/usr/share/dict/words") #"\n")))
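
Clojure's `/` on two integers yields an exact ratio, which is why the extra `1.` is needed to get a decimal. Python has no built-in ratio literal, but the same distinction can be sketched with `fractions.Fraction` from the standard library, using counts taken from the question's test cases:

```python
from fractions import Fraction

# Exact rational, comparable to what integer division yields in Clojure.
exact = Fraction(12, 5)  # "hello": 12 matches / 5 letters
print(exact)             # stays a ratio

# Converting to float, comparable to the effect of the trailing 1.
print(float(exact))

# The same for "individuality": 1 match / 13 letters.
print(float(Fraction(1, 13)))
```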

NikoNyrh

Posted 2016-11-23T01:28:04.750

Reputation: 2 361