In this program you will write a spam predictor based on maximum
likelihood.  All input comes from the console in the format shown
below. The variables are defined after the format, and examples follow
after that.

n_spam
spam_word1 spam_word2 ...

n_ham
ham_word1 ham_word2 ...


midlow midhigh

n_msg
message_word1  message_word2 ...
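
Because the examples below put blank lines between some fields and
wrap the message across lines, one way to read this format is to
ignore line structure and consume whitespace-separated tokens. The
following is a minimal sketch under that assumption; parse_input and
take_words are hypothetical names, not part of the assignment.

```python
def parse_input(text):
    """Parse the whole console input, given as one string.

    Assumption: fields are separated by arbitrary whitespace, so blank
    lines and wrapped word lists need no special handling.
    """
    tokens = iter(text.split())

    def take_words():
        # Read a count, then exactly that many words.
        count = int(next(tokens))
        return [next(tokens) for _ in range(count)]

    spam_words = take_words()
    ham_words = take_words()
    midlow = float(next(tokens))
    midhigh = float(next(tokens))
    message = take_words()
    return spam_words, ham_words, midlow, midhigh, message


# Example #2 from this document, passed as a string for illustration.
sample = "2\nspam yes\n\n2\nspam not\n\n.45 .55\n\n4\nyes spam not boo\n"
parsed = parse_input(sample)
```

In a real program the string would come from the console, for example
via sys.stdin.read().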


n_spam is an integer giving the number of words in the spam word
list. Words can be repeated, and each repetition counts as a separate
occurrence in a spam message.  Following n_spam are the n_spam spam
words.


n_ham is an integer giving the number of words in the ham word list.
Words can be repeated, and each repetition counts as a separate
occurrence in a ham message.  Following n_ham are the n_ham ham words.
Some of these words may also appear in the spam list, and some may
not.


midlow is a floating point number.
midhigh is a floating point number.
Any word whose probability of being spam falls in the region between
midlow and midhigh is ignored.


n_msg is an integer giving the number of words in the message that
you will classify as spam or not. Following n_msg are the n_msg words
of the message to be classified. If a word was listed as ham or spam
above, include it in the computation as long as its probability does
not fall in the midlow-midhigh region. If a word does not appear in
the corpus, ignore it. Use Laplace smoothing to compute the
probability of a word being spam; in the walkthrough below this is
(spam occurrences + 1) / (spam occurrences + ham occurrences + 2).
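
The smoothing and filtering rules above can be sketched as follows.
This is an illustration, not a reference solution: the function name
spam_probabilities is hypothetical, and treating the midlow and
midhigh boundaries as inclusive is an assumption, since the text does
not say whether the endpoints count as "between".

```python
from collections import Counter


def spam_probabilities(spam_words, ham_words, midlow, midhigh):
    """Return {word: P(word is spam)} for words outside the mid region."""
    spam_counts = Counter(spam_words)
    ham_counts = Counter(ham_words)
    probs = {}
    for word in set(spam_counts) | set(ham_counts):
        s = spam_counts[word]  # occurrences in the spam list (0 if absent)
        h = ham_counts[word]   # occurrences in the ham list (0 if absent)
        # Add-one smoothing over the two outcomes, spam and ham.
        p = (s + 1) / (s + h + 2)
        # Assumption: endpoints are treated as inside the ignored region.
        if midlow <= p <= midhigh:
            continue
        probs[word] = p
    return probs


# Corpus from Example #2: "spam" lands at 0.5 and is dropped.
example_probs = spam_probabilities(["spam", "yes"], ["spam", "not"], 0.45, 0.55)
```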


Now a walkthrough of Example #2 below:
There are two spam words in the corpus: spam and yes
There are two ham words in the corpus: spam and not
Let's compute the probability of each word being spam:
P(spam) = (1 + 1) / (2 + 2) = 2/4 = 0.5
P(yes)  = (1 + 1) / (1 + 2) = 2/3 = 0.666667
P(spam) = 0.5 (same word as above, now seen in the ham list)
P(not)  = (0 + 1) / (1 + 2) = 1/3 = 0.333333

Now, spam is tossed out because its probability falls between .45 and
.55. The other two words (yes and not) are the only ones used to
classify messages in this system.

This example has a message with 4 words: yes spam not boo.  spam and
boo are not in our list of words with probabilities (spam was tossed
out above, and boo never appears in the corpus), so they are tossed.

alpha = P(yes) * P(not) = 0.666667 * 0.333333 = 0.222222
beta = (1 - P(yes)) * (1 - P(not)) = 0.333333 * 0.666667 = 0.222222
So the final result is:
alpha / (alpha + beta) = 0.222222 / (0.222222 + 0.222222) = 0.5

It should not be surprising that this system cannot classify this
message conclusively.
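
The alpha/beta computation from the walkthrough can be sketched as
running products over the message. The name classify is hypothetical,
and probs is assumed to be a dictionary of already filtered word
probabilities. Two further assumptions are labeled in the code: every
occurrence of a word in the message contributes a factor, and the
printed line only approximates the spacing of the example output.

```python
def classify(message, probs):
    """Return alpha / (alpha + beta) for the usable words in message."""
    alpha = 1.0  # running product of P(word)
    beta = 1.0   # running product of 1 - P(word)
    for word in message:
        p = probs.get(word)
        if p is None:
            # Word is unknown or was filtered out, so it is ignored.
            continue
        # Assumption: a repeated message word contributes each time.
        alpha *= p
        beta *= 1.0 - p
        # Assumption: spacing here is illustrative, not the exact format.
        print("P(%s)=%g alpha=%g beta=%g" % (word, p, alpha, beta))
    result = alpha / (alpha + beta)
    print("Result= %g" % result)
    return result


# Example #2: only "yes" and "not" survive the filtering step.
result2 = classify(["yes", "spam", "not", "boo"], {"yes": 2 / 3, "not": 1 / 3})
```

With no usable words, alpha and beta both stay at 1 and this sketch
returns 0.5; the assignment text does not specify that case.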


Some examples:

--------------------
| EXAMPLE INPUT #1 |
--------------------
10

spam urgent important sweepstakes free chance win important free not

10

ham not spam please how help close correct correct free


.45 .55


20

this is the message to classify as spam or ham do you think your
system classify this correct or not


---------------------
| EXAMPLE OUTPUT #1 |
---------------------

P(ham)=0.333333	 alpha=0.333333 beta=0.666667
P(correct)=0.25	  alpha=0.0833333 beta=0.5
Result= 0.142857


--------------------
| EXAMPLE INPUT #2 |
--------------------
2
spam yes

2
spam not

.45 .55

4
yes spam not boo


---------------------
| EXAMPLE OUTPUT #2 |
---------------------
P(yes)=0.666667	 alpha=0.666667 beta=0.333333
P(not)=0.333333	  alpha=0.222222 beta=0.222222
Result= 0.5