How to find words that are near other words?

I am going to create a dictionary in which users can find words that are close to each other in pronunciation and in spelling.

For example, when a user searches for "near", my site should return other words that are close to it, like near, pear, dear, rear, here.

How to do it?

Like this website: rhymezone

KIMIA

Posted 2014-08-16T04:57:54.737

Reputation: 3

http://en.wikipedia.org/wiki/Levenshtein_distance , http://en.wikipedia.org/wiki/String_distance_metric , http://en.wikipedia.org/wiki/Phonetic_algorithm – gronostaj – 2014-08-16T07:48:04.697

Answers

I think this is tolerably complex. It's a branch of Natural Language Processing (not the other NLP, Neuro Linguistic Pap). You need to be able to match "here" and "hear" - so you need to break the words down into their elements and map them to a phonetic equivalence (h-"ere" and h-"ere" for both). Some spellings have variant phonetics (cough, plough, through), some groups are homophones with quite different spellings (the there, they're, their group), and pronunciation can differ by region (in the UK, the long or short 'a' in garage, graph and glass, for example). So when you're making phonetic equivalences, a word doesn't always map to a single one.

Once you've got phonetic equivalences, you can use a variety of heuristics to score how different two of them are, and return the words with the smallest difference. For poetry, you probably need to weight word endings very heavily - rhyming mostly depends on them. You may want alliterative runs for poetry, too - a different weighting, one that favours word beginnings, would probably be needed for euphonious alliteration.
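One way such an ending-weighted heuristic could look is sketched below. It is only an illustration: the function name, the `decay` parameter and the one-symbol-per-phoneme lists are assumptions, not an established algorithm. The two sequences are compared from the back, and each step toward the front of the word counts for less.

```python
def ending_weighted_similarity(a, b, decay=0.5):
    """Score two phoneme sequences between 0 and 1.

    Positions are compared from the end of the word, and the weight of
    each position shrinks by `decay` per step toward the front, so a
    shared ending matters more than a shared beginning.
    """
    score = 0.0
    total = 0.0
    weight = 1.0
    for x, y in zip(reversed(a), reversed(b)):
        total += weight
        if x == y:
            score += weight
        weight *= decay
    # Positions that exist in only one of the words count as mismatches.
    for _ in range(abs(len(a) - len(b))):
        total += weight
        weight *= decay
    return score / total if total else 0.0


# Hypothetical phoneme lists, one symbol per sound.
near = ["n", "i", "r"]
dear = ["d", "i", "r"]
pear = ["p", "e", "r"]

print(ending_weighted_similarity(near, dear))  # higher: the whole ending matches
print(ending_weighted_similarity(near, pear))  # lower: only the final sound matches
```

For alliteration, the same idea could be applied without the `reversed` calls, so that the start of the word carries the most weight.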

I'd join one of the free online university courses on NLP - there are several, currently. They give a much better grounding in modern techniques for parsing language. :)

JezC

Posted 2014-08-16T04:57:54.737

Reputation: 550

You can compare strings of the same length and allow for one (or two) mismatched characters.
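A minimal sketch of that same-length comparison (essentially a Hamming distance with a threshold). The function names and the small word list are made up for the illustration.

```python
def mismatches(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def near_spellings(query: str, words: list[str], max_mismatches: int = 1) -> list[str]:
    """Return the words of the same length that differ in at most max_mismatches letters."""
    return [
        w for w in words
        if w != query
        and len(w) == len(query)
        and mismatches(query, w) <= max_mismatches
    ]


WORDS = ["near", "pear", "dear", "rear", "here", "neat", "bird"]
print(near_spellings("near", WORDS))  # ['pear', 'dear', 'rear', 'neat']
```

Note what the spelling-only comparison gets wrong on this list: it misses "here" (three letters differ) and it returns "neat", which doesn't rhyme with "near" at all.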

Looking at the website you mention, though, I'd suggest mapping the words to their phonetic representations and then searching those, treating the strings as character arrays and starting the comparison from the end.

To build on your example:

near -> \'nir\
pear -> \'per\
dear -> \'dir\
rear -> \'rir\ or \'rer\
here -> \'hir\

(I'm ripping off the Merriam-Webster online for the phonetic notation, here)

The mapping, I'm afraid, has to be done as a look-up table and can't be computed from the spelling, because English doesn't have very consistent pronunciation rules...

Anyway, once you've mapped your words, you can compare their final phonemes: in this case, you may want to look for words that end in "ir\" (which would exclude "pear").
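Here is a small sketch of that look-up-and-compare step, using the mappings from the example above. It assumes one character per phoneme, which only works for this toy notation; a real table would need a proper phoneme list per word. The names `PRONUNCIATIONS`, `rhymes` and `tail` are made up for the illustration.

```python
# Hand-written look-up table: each word maps to one or more pronunciations.
PRONUNCIATIONS = {
    "near": ["nir"],
    "pear": ["per"],
    "dear": ["dir"],
    "rear": ["rir", "rer"],  # two accepted pronunciations
    "here": ["hir"],
}


def rhymes(query, table=PRONUNCIATIONS, tail=2):
    """Return the words that share their last `tail` symbols with the query.

    A word matches if any of its pronunciations ends like any
    pronunciation of the query.
    """
    endings = {p[-tail:] for p in table.get(query, [])}
    return sorted(
        word for word, prons in table.items()
        if word != query and any(p[-tail:] in endings for p in prons)
    )


print(rhymes("near"))  # ['dear', 'here', 'rear']
print(rhymes("pear"))  # ['rear']
```

Because "rear" has two entries, it shows up for both "near" and "pear".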

This method doesn't look terribly efficient: if I could use some disk space, I'd save the searches for future reference, so when a second user looks for all the words that rhyme with "near", the application just loads the saved search - since dictionaries usually don't evolve too fast.
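Saving the searches could be as simple as the sketch below, which keeps the results in a JSON file. The class name, the file name and the `toy_search` stand-in are all assumptions for the example; the real search would go where `toy_search` is.

```python
import json
import tempfile
from pathlib import Path


class SavedSearches:
    """Keeps search results in a JSON file so repeated queries skip the search."""

    def __init__(self, path, search):
        self.path = Path(path)
        self.search = search  # the slow function to call when nothing is saved
        # Load earlier results if the file already exists.
        self.saved = json.loads(self.path.read_text()) if self.path.exists() else {}

    def lookup(self, word):
        if word not in self.saved:
            self.saved[word] = self.search(word)
            self.path.write_text(json.dumps(self.saved))
        return self.saved[word]


calls = []  # records which words actually triggered a search


def toy_search(word):
    """Stand-in for the real rhyme search; the results are hard-coded."""
    calls.append(word)
    return {"near": ["dear", "here", "rear"]}.get(word, [])


with tempfile.TemporaryDirectory() as folder:
    cache = SavedSearches(Path(folder) / "rhymes.json", toy_search)
    print(cache.lookup("near"))  # runs the search and saves the result
    print(cache.lookup("near"))  # answered from the saved results
    print(calls)                 # the search ran only once
```

Rewriting the whole file on every new word is fine for a sketch, but a larger site would probably want a database table instead. The saved results also need to be cleared whenever the dictionary itself changes.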

Lorenzo Zanetti

Posted 2014-08-16T04:57:54.737

Reputation: 1