JoeEmail - 0.9.107
Step By Step
How It Works
Set up your account(s)
When you first start JoeEmail, you will be prompted to enter the account information needed to log into your mail server. JoeEmail supports POP3 mail servers. You can enter additional accounts by clicking on the "Accounts" item under the "File" menu. Mail from every account is delivered to your Inbox. When you reply to an email, the reply uses the return address of the account the original message came in on. New messages will be from the account you designate as your "default" account.
Once you have one or more accounts set up you can start to receive mail and update your Spam database. Incoming messages go to your inbox first. Depending on the spam handling options you've set (see below), spam messages will then be either routed to the spam box, routed to the Deleted box (and will be deleted next time you start JoeEmail), or just plain wiped out immediately.
When you first begin to use JoeEmail, most messages will be determined to NOT be Spam. This is because words that are not yet in the database are heavily biased towards non-spam- this bias helps to minimize the chance of a false positive: an email that is NOT Spam but that JoeEmail thinks IS Spam. For any email filter, false positives are the absolute biggest worry.
In spite of this bias, I have found that after you get a few spams in your database, you can start to see a number of false positives. It is EXTREMELY IMPORTANT that you monitor your spam box carefully at first- mark any non-spam messages right away! This is especially true for HTML formatted mail. Spammers seem to love HTML and you will find that often HTML tags or attributes end up being the most damning evidence of spam. This can send innocent mail straight to spam hell in a hurry- so BE CAREFUL!
I have just added some statistics to help determine when a database becomes more "mature" and can routinely detect the majority of spam with a very low chance of false positives. Currently my "real" database has ~500 spams and ~100 non-spams and is doing a very good job. I would very much appreciate hearing back from anybody with their experience!
One more warning about false positives- JoeEmail uses probabilities to guess which mails are spams and which are not. Let me repeat two words: probabilities and guess. There is no guarantee! It's like quantum mechanics and philosophy: you can't be absolutely sure. What I can say is that I'm using this program myself and it's working for me.
Okay, on to the details of operation!
The Spam Database
For JoeEmail to work, you need to get some data into your Spam Database. The Spam Database keeps track of the number of times tokens (i.e. words) have appeared in messages that you've designated as either Spam or Not Spam. Using these data, JoeEmail determines a probability that an incoming message is Spam. As you mark more messages as either Spam or Not Spam, your database will become more and more accurate. After as few as a couple dozen messages you will find that most spams are automatically flagged and easily deleted.
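As a rough illustration of the bookkeeping described above, here is a minimal Python sketch. It is not JoeEmail's actual code: the two in-memory counters, the `record_message` helper, and the whitespace tokenizer are all assumptions made for the example (the real counts live in Spam.mdb).

```python
from collections import Counter

# Hypothetical in-memory stand-ins for the token counts kept in the Spam Database.
spam_counts = Counter()
good_counts = Counter()

def record_message(text, is_spam):
    """Add every token of one flagged message to the spam or non-spam tally."""
    tokens = text.lower().split()  # naive whitespace tokenizer (an assumption)
    target = spam_counts if is_spam else good_counts
    target.update(tokens)

# Two messages flagged as Spam and one flagged as Not Spam.
record_message("Buy cheap pills now", is_spam=True)
record_message("Cheap pills cheap prices", is_spam=True)
record_message("Lunch now or later", is_spam=False)

print(spam_counts["cheap"], good_counts["cheap"])  # 3 0
```

The more messages you flag on each side, the more these tallies tell the two kinds of mail apart.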
Here is how you update your Spam Database:
The "Delete As Spam" button
To mark messages as Spam, click the check-box in the list corresponding to your Spam messages. Once you have selected one or more messages, hit the "Delete As Spam" button. This will count the number of word tokens in the messages and update your database as necessary. The next time email is analyzed, the new probabilities will be figured in. All marked messages are deleted from the server the next time JoeEmail is started. To delete them from the server immediately, choose "Eradicate Deleted Mail" from the "Edit" menu.
Any Spam mail that shows up with a low spam probability should be deleted as spam.
The "Not Spam" button
It is also important to mark messages that are not spam. The "Not Spam" button works the same as the "Delete As Spam" button, except it updates the non-spam probabilities and does NOT delete the emails.
The "Manage Spam" Menu options
Add Text to Spam/Non-Spam Corpus: Use these options to enter old emails or other words into the bad or the good pile. This option works best when you copy and paste the full text source of a prior email (i.e. include the full header if you can).
Retokenize DB: "Tokenization" is what I call the process by which I split a string of text (i.e. an email message) into tokens which are then used to calculate probabilities. All of the messages you flag as spam or not spam are stored in tblMessages in your Spam.mdb database. The entire text is stored as it was in the email. This allows you to take advantage of improvements in the algorithms by re-using (re-tokenizing) everything you have marked so far. When your database starts to get big, this option will start to take a while. It's not something that you will want to run every day.
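To make the idea concrete, here is a small Python sketch of tokenizing and re-tokenizing. The token pattern, the `tokenize` and `retokenize` names, and the list of (text, is_spam) pairs standing in for tblMessages are all assumptions for illustration. JoeEmail's real tokenizer rules are not described here.

```python
import re

def tokenize(text):
    """Split text into lowercase tokens of letters, digits, '$', apostrophes and '-'."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())

def retokenize(stored_messages):
    """Rebuild both token tallies from the stored full text of every flagged message.

    stored_messages is a list of (text, is_spam) pairs, a hypothetical
    stand-in for the rows kept in tblMessages.
    """
    spam, good = {}, {}
    for text, is_spam in stored_messages:
        counts = spam if is_spam else good
        for token in tokenize(text):
            counts[token] = counts.get(token, 0) + 1
    return spam, good

print(tokenize("Win $100 now!"))  # ['win', '$100', 'now']
```

Because the full text is kept, changing `tokenize` and running `retokenize` again gives every old message the benefit of the new rules. It also has to walk every stored message, which is why it gets slow on a big database.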
Mail Client Functions
The "Delete" button
Use the "Delete" button to simply delete an email off the server. It is not necessary to delete all spam using the "Delete as Spam" button, though as mentioned above you should "Delete as Spam" any spam messages that have a low spam probability. All marked messages are deleted from the server the next time JoeEmail is started. To delete them from the server immediately, choose "Eradicate Deleted Mail" from the "Edit" menu.
To quickly delete Spam, hit Ctrl-M (marks messages with 99% Spam probability) and then hit delete. Poof- gone!
The "New", "Reply", and "Reply to All" buttons
Well, these work pretty much as you would think. Unlike the Delete, Delete As Spam, and Not Spam buttons, however, New, Reply and Reply to All are based on the selected record from the message list- not the checked one(s).
Composing and Sending messages
Hopefully most of this will be self-explanatory. Type addresses or names from your address book into the "To" or "CC" or "BCC" boxes (JoeEmail will try to autofill if it can). Enter your subject and the body of the message. When you are done, click the "Send" button.
You can choose which account to send your message from with the drop list at the bottom of the message. New messages will come from your "default" account; replies will come from the account they were sent to.
There is a spell checker. You can click the button to check or, if you have enabled it in the options, the message will be checked when you send.
A message first goes to your "Outbox". Once it successfully gets to your outgoing mail server, the message will be moved to your "Sent" box.
From the "Edit" menu, choose "Options and Preferences" to bring up a dialog box with all the settings you can set.
Auto-Refresh: When enabled, JoeEmail will connect, download, analyze, mark (if auto-mark enabled), and delete (if auto-delete enabled) messages from all accounts at the interval specified.
Show HTML: This will show the message as HTML (if it has an HTML component).
Don't show spam as HTML: Viewing messages as HTML can be a way for spammers to verify your email address. It is strongly recommended that you enable this option, which suppresses the HTML view for any messages that are likely to be spam.
Show as XML: When enabled the message will be displayed as an XML document.
Show Headers: This will include the raw mail header in the text view of the email. This is mostly for debugging, but it might be useful for you. I don't know...
Show Calc Details: This option shows the 15 words and their spam probabilities used to determine the overall spam probability of the message. I find this quite interesting!
Play Audio: I have become addicted to the Star Trek bosun's whistle alerting me to new mail. You can use your own sound if you like. Or not. Whatever.
Remove downloaded messages: Check this box to have messages that you've downloaded removed from your mail server after the specified number of days. If you select 0 days, messages will be removed the next time you start JoeEmail.
Check spelling on send: If checked, all message bodies will be checked for spelling when you press the "Send" button. Spell check is currently disabled.
Show in System Tray: When enabled, JoeEmail will display an icon in the Windows System Tray and will not be in the task bar when minimized.
Spam Calc Options & Statistics
The "Spam Calc Options & Statistics" tab in the Options and Preferences form shows you the status of your Spam Database and allows you to change some Spam sorting and calculation options.
Spam Threshold: All messages with a spam probability this high or higher will be shown in red and automatically routed to the Spam box.
Non-spam Threshold: All messages with a spam probability this low or lower will be shown in black.
Number of "interesting" tokens: The spam calculation will look at this many of the most indicative tokens to combine with Bayes' Rule. Fifteen is the default.
Assumed Probability for new tokens: Use this option to specify the spam probability to use for tokens that are not in the spam database. The default is 20%.
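The options above can be pictured with a short Python sketch. This is one common way to combine per-token probabilities with Bayes' Rule, not necessarily the exact calculation JoeEmail performs. The `spam_probability` function, the `token_probs` lookup, the 0.01/0.99 clamping, and the choice of "distance from 0.5" as the measure of how indicative a token is are all assumptions.

```python
from math import prod

def spam_probability(token_probs, tokens, n_interesting=15, unknown_prob=0.20):
    """Combine the most indicative token probabilities into one spam score.

    token_probs maps a token to its individual spam probability; it is a
    hypothetical stand-in for a lookup in the Spam Database.
    """
    # Each distinct token counts once; tokens missing from the database
    # get the assumed probability for new tokens.
    probs = [token_probs.get(t, unknown_prob) for t in sorted(set(tokens))]
    # Clamp so that a single token can never force a score of exactly 0 or 1.
    probs = [min(0.99, max(0.01, p)) for p in probs]
    # Assumption: "interesting" means farthest from a neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:n_interesting]
    spam_side = prod(top)
    good_side = prod(1.0 - p for p in top)
    return spam_side / (spam_side + good_side)

known = {"viagra": 0.99, "free": 0.90, "meeting": 0.01}
print(spam_probability(known, ["viagra", "free", "offer"]))
print(spam_probability(known, ["meeting", "agenda"]))
```

With a low assumed probability for new tokens, a message made up mostly of unseen words scores well below the Spam Threshold, which matches the bias against false positives described earlier.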
The following two items should be used with caution! You should have high confidence in your database before enabling them as they can delete messages before you have a chance to glance at them. Ultimately that is the goal of JoeEmail, but you should feel comfortable that your spam database is accurate before proceeding.
Delete incoming spams: Checking this box will cause spams to be moved from your inbox to the deleted items box instead of the spam box. Only messages with a probability >= your Spam Threshold will be moved.
Remove spams from server: Same as the above, but the messages in the deleted items box will be removed from the server before moving new spams to the deleted items box. So you will have your refresh interval to review messages before they are automatically removed from your server.
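The routing decision can be summed up in a few lines of Python. This is only a sketch: the `route_message` name, the box names as strings, and the 0.90 default threshold are assumptions for the example, and the "Remove spams from server" step is not modeled.

```python
def route_message(spam_prob, spam_threshold=0.90, delete_incoming_spams=False):
    """Return the name of the box a freshly analyzed message ends up in."""
    if spam_prob < spam_threshold:
        return "Inbox"  # below the Spam Threshold: left alone
    # At or above the threshold: Spam box, or Deleted box if the option is on.
    return "Deleted" if delete_incoming_spams else "Spam"

print(route_message(0.95))                              # Spam
print(route_message(0.95, delete_incoming_spams=True))  # Deleted
print(route_message(0.40, delete_incoming_spams=True))  # Inbox
```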
JoeEmail keeps track of several statistics to give you an idea how well it thinks it's doing. Statistics include:
Emails Identified: Number of emails assigned a spam probability- basically the number of emails downloaded from your mail servers. Used as the denominator in the Effectiveness % calculation.
Spams Deleted: The number of spams with a spam probability greater than or equal to your Spam Threshold that have been deleted. Does not directly affect the Effectiveness % calc.
Spams Missed: The number of messages with an initial spam probability less than or equal to your Non-spam Threshold at the time they were "Deleted As Spam".
False Positives: The number of messages with an initial spam probability greater than or equal to your Spam Threshold at the time they were flagged as "Not Spam". These are the most dangerous items- it is likely that you will see some while your database is small. The last Date/Time when a false positive was detected is also shown.
Effectiveness: The effectiveness % is calculated by (Emails Identified - Spams Missed - False Positives) / (Emails Identified). As your database matures, I'd like to hear what kind of results you are getting!
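For a quick feel for the Effectiveness formula, here is the same arithmetic in Python. The function name and the sample figures are made up for illustration, and returning 0.0 when no emails have been identified is an assumption.

```python
def effectiveness(emails_identified, spams_missed, false_positives):
    """Effectiveness % = (identified - missed - false positives) / identified."""
    if emails_identified == 0:
        return 0.0  # nothing analyzed yet (assumed behavior)
    correct = emails_identified - spams_missed - false_positives
    return 100.0 * correct / emails_identified

# 600 emails analyzed, 12 spams missed, 3 false positives.
print(effectiveness(600, 12, 3))  # 97.5
```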
You will also see a message in this area of the form giving you an idea if your database is "mature" enough to start thinking about maybe turning on the automatic spam deleting features of JoeEmail.
In addition to effectiveness statistics, you can also see some figures about the size of your database.
Spam Messages: The number of messages you have "Deleted as Spam". These messages have been tokenized and stored in your Spam.mdb file.
Non-spam Messages: The number of messages you have flagged as "Not Spam". These messages have been tokenized and stored in your Spam.mdb file.
Unique Tokens: This is the number of unique tokens that meet the minimal criteria for contributing to the spam probability of an email. This means that a token must have appeared in at least 5 spam messages or at least 3 non-spam messages. If you open up Spam.mdb you will see that tblTokens has many many more tokens, but not with sufficient counts to be used. See How It Works for more details.
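The Unique Tokens criteria amount to a simple filter, sketched below in Python. The `usable_tokens` name and the dictionary of per-token message counts are hypothetical stand-ins for a query against tblTokens.

```python
def usable_tokens(token_counts, min_spam=5, min_good=3):
    """Return the tokens with enough history to contribute to a spam probability.

    token_counts maps token -> (spam message count, non-spam message count).
    """
    return {
        token
        for token, (spam_n, good_n) in token_counts.items()
        if spam_n >= min_spam or good_n >= min_good
    }

counts = {"viagra": (7, 0), "meeting": (0, 4), "hello": (2, 2), "refinance": (4, 0)}
print(sorted(usable_tokens(counts)))  # ['meeting', 'viagra']
```

Here "hello" and "refinance" stay in the table but are not yet counted as Unique Tokens.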
In version 0.9.95+, JoeEmail has an address book. Addresses are stored with first name, last name, and email address information. You can add, edit, and delete address book entries.
It is possible to import an existing address book if you are familiar with Access. Look at the tblAddresses table and do whatever you need to do to get the data in. Only a few fields are actually used (first and last name, and email address). The format of tblAddresses is what you get when you import a csv file exported from Outlook.
You can add, rename, and delete custom mail boxes to store your mail. Currently naming conventions are pretty limited- no spaces, no funky characters, etc. You can Add/Delete/Rename from the File menu or by right-clicking on a mail box directly. Mail is stored in an XML file in the same folder that JoeEmail.exe is run from. So DO NOT run the same .exe (from a network drive) if you have a multi-user situation.