Create an autocorrect program

-2

1

People sometimes make spelling eorrors. A lto. Your challenge is to write a program that can take a string with certain spelling errors, such as the switching of I and E, and correct those mistakes. This has to be able to make changes to any combination of characters in any string, and fix the spelling errors. This is a , the winner will be chosen in three days. If I need to clarify any additional details, please do say, and I will add it to my challenge. Good luck!

Rules

  • You may read correctly-spelled words from a file.
  • You will read the string from standard input.
  • You need only correct basic spelling errors.
  • It's ok if your program makes a few errors.

DatEpicCoderGuyWhoPrograms

Posted 2014-07-09T21:05:36.030

Reputation: 362

Question was closed 2014-07-10T06:47:57.547

2Is it OK if we read correctly spelled words from a file? And are we reading from stdin? – golfer9338 – 2014-07-09T21:19:16.607

@golfer9338 Yes and yes. – DatEpicCoderGuyWhoPrograms – 2014-07-09T21:38:57.483

3Yes, there are some things which you need to clarify. What do you mean by "has to be able to make changes to any combination of characters in any string"? Which errors does it need to correct? What about overcorrection? What about other languages? – Peter Taylor – 2014-07-09T22:02:32.400

@PeterTaylor It needs to be able to fix basic spelling errors. Such as the swapping of I and E. Like misspelling their. It doesn't matter what language you make it in. In the case of overcorrecting, that's okay if it makes a few mistakes, even advanced autocorrect like the one on my iphone makes mistakes. – DatEpicCoderGuyWhoPrograms – 2014-07-09T22:07:00.827

1You need a set of production rules like "If you see this, it should be replaced with this." Making it a popularity contest is not an excuse for completely leaving out the details of the task you have asked us to do. Btw, golfer9338 added a neat idea: read correctly spelled words from a file. His edit was not perfect, but I approved it anyway because I think that his idea could salvage this challenge. – Rainbolt – 2014-07-11T02:01:34.497

It's still completely opaque what you mean by "has to be able to make changes to any combination of characters in any string". – Peter Taylor – 2014-07-11T10:29:47.803

Answers

3

for a real "good" correction, we need to know the set of words which are "reasonable". In a program editor, this depends on the word before the cursor. This may be either a class or variable name, or the name of the function to be called (some syntax awareness required here). In a general text editor, this would be a list of words from a dictionary.

In the following code, I assume that the list of "good" words is found in "candidates", in Smalltalk this would be one of:

Smalltalk allClasses collect:[:cls | cls name]

for class names, or

Smalltalk allClasses collectAll:[:cls | cls methodDictionary keys] as:Set

for the names of all methods.

Of course, in a real world corrector, this would take into account the current scope to restrict the set of possible candidates.

The code then selects the best candidate by its editing distance, which is the distance between two strings based on the weighted number of insert, delete and replace operations required to change one string onto another. All Smalltalks provide a "spellAgainst:" method in their String class, which is based on levenshtein or a similar algorithm.

"the following is of course very Smalltalk dialect specific,
 as the GUI frameworks are slightly different.
 It assumes there is an editor object, which represents the code edit
 window."

candidates := <one of the above>.
wordToFix := editor wordBeforeCursor.

best := candidates detectMax:[:each | each spellAgainst: wordToFix].

editor selectWordBeforeCursor.
editor replaceSelectionBy:best.

PS: one of the coolest features of Smalltalk is that the IDE is part of your program and vice versa (they are the same), so you can actually add the above code fragment to your editor's keyPress method and have it work immediately!

blabla999

Posted 2014-07-09T21:05:36.030

Reputation: 1 869

2

Python

This uses Levenshtein distance, but doesn't account for non-alpha characters and case. Corrects single misspelled words. Enter a string on stdin, output on stdout, and have a newline separated list of words in a file called words.txt in the same directory.

Code

# Open list of correcly-spelled words.
wordFile = open("words.txt")
threshold = 8
listOfWords = input().split()
index = 0

def lev(a, b):
    if min(len(a), len(b)) == 0:
        return max(len(a), len(b))
    else:
        return min(lev(a[:-1], b) + 1, lev(a, b[:-1]) + 1, 
            lev(a[:-1], b[:-1]) + int(not a[-1] == b[-1]))

for x in listOfWords:
    replacement = (x, threshold + 1)

    for word in wordFile:
        x = x.lower()
        word = word[:-1].lower()

        if x == word:
            replacement = (x, 0)
            break # Some words may actually be spelled correctly!

        d = lev(x, word)
        if (d < threshold) and (replacement[1] > d):
            replacement = (word, d)

    listOfWords[index] = replacement[0]
    index += 1
    wordFile.seek(0)

print(*listOfWords)

Bugfix

All words are corrected now.

golfer9338

Posted 2014-07-09T21:05:36.030

Reputation: 411

Python3, I take it? – DatEpicCoderGuyWhoPrograms – 2014-07-09T23:34:27.583

1@DatEpicCoderGuyWhoPrograms Yes. – golfer9338 – 2014-07-10T00:50:26.653