Does anyone have code (or is there any in the image) which, given a string, returns whether or not that string is a syntactically valid email address?
Thanks.
Derek Brans Nerd on a Wire Web design that's anything but square http://www.nerdonawire.com mailto: brans@nerdonawire.com phone: 604.874.6463 toll-free: 1-877-NERD-ON-A-WIRE
Derek Brans wrote:
Does anyone have code (or is there any in the image) which, given a string, returns whether or not that string is a syntactically valid email address?
Interesting question!
It is interesting because Microsoft Outlook has such a feature and it does it WRONG. ;)
Classically, when everyone was logged into the campus mainframe, which cost $10,000,000 and had 2MB of memory, and you wanted to send mail to bob, you would simply put "bob" in the "to:" field and the system would accept it...
Microsoft Outlook will give an error message and absolutely won't let you send the message.
Wetscrape is somewhat better in that it will simply append the domain part of your own e-mail address to the recipient. If I set "to:" to Bob in Netscape 4.08 for Windows 3.11, it would simply add "@starpower.net" and send it without complaint...
There are syntactic rules (unfortunately I don't know where to find them).
Basically what you are looking for is illegal characters and improper forms...
It should be noted that some sites (such as American Racing as of 1999) use UUCP e-mail addresses which involve "bang paths" that look something like:
user!server1!server2!server3@some.SMTP.server
Derek Brans wrote:
Does anyone have code (or is there any in the image) which, given a string, returns whether or not that string is a syntactically valid email address?
I have no code but ...
1. You could ask DNS (that could take up to some minutes) whether the domain really exists (check for an MX or A resource record).
2. If you have the RePlugin installed you could test whether the address has a regular form, though it doesn't have to exist.
3. You could use the messages of String.
4. There are several open source implementations in other languages, such as the class Mail_RFC822 of PEAR (pear.php.net).
Regards Chris Burkert
On Sunday, 15.06.03 at 10:52, Chris Burkert wrote:
Derek Brans wrote:
Does anyone have code (or is there any in the image) which, given a string, returns whether or not that string is a syntactically valid email address?
I have no code but ...
1. You could ask DNS (that could take up to some minutes) whether the domain really exists (check for an MX or A resource record).
2. If you have the RePlugin installed you could test whether the address has a regular form, though it doesn't have to exist.
3. You could use the messages of String.
4. There are several open source implementations in other languages, such as the class Mail_RFC822 of PEAR (pear.php.net).
5. or you could
- install SmaCC, the Smalltalk Compiler Compiler available from SqueakMap
- transform the necessary parts for email address grammar rfc822 from http://www.faqs.org/rfcs/rfc822.html to SmaCC-standards
- and create a native smalltalk email-address-parser.
Markus
Hello,
or you could
- install SmaCC, the Smalltalk Compiler Compiler available from SqueakMap
- transform the necessary parts for email address grammar rfc822 from http://www.faqs.org/rfcs/rfc822.html to SmaCC-standards
- and create a native smalltalk email-address-parser.
Just being picky, RFC 2822 supersedes 822. At least you don't have to worry about UUCP addresses :-) (but you do need a bit more careful treatment of MIME encoded addresses).
-- Yoshiki
"Derek" == Derek Brans brans@nerdonawire.com writes:
Derek> Does anyone have code (or is there any in the image) which, given a string, returns whether or not that string is a syntactically valid email address?
In addition to the other messages in this thread, let me also point out that the following are valid:
*@qz.to (my friend, Eli the bearded uses this one)
fred&barney@stonehenge.com (my example when it comes up - go ahead and test it!)
merlyn@(that's "at")stonehenge(the rock place (that rocks!)).com(dot com!)
Yes, that last one has *nested* parens. Therefore, you cannot do this with a regex, which cannot match nested anythings.
In general, there are no illegal characters, but everything has to appear in a proper context.
Good luck with your mission. :)
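The point about nested parens can be illustrated with a depth counter, which is exactly the state a plain regular expression lacks. The following is only a minimal workspace sketch: `stripComments` is a hypothetical name (not something in the image), and it deliberately ignores quoted strings and backslash escapes, which a real RFC parser must also respect.

```smalltalk
"Workspace sketch. stripComments is a hypothetical name, not part of Squeak.
 Simplification: quoted strings and backslash escapes are not handled."
| stripComments |
stripComments := [:aString | | depth out |
	depth := 0.	"current nesting level of parentheses"
	out := WriteStream on: String new.
	aString do: [:c |
		c = $(
			ifTrue: [depth := depth + 1]
			ifFalse: [(c = $) and: [depth > 0])
				ifTrue: [depth := depth - 1]
				ifFalse: [depth = 0 ifTrue: [out nextPut: c]]]].
	out contents].

stripComments value: 'merlyn@(that''s "at")stonehenge(the rock place (that rocks!)).com(dot com!)'
"answers 'merlyn@stonehenge.com'"
```

Only characters seen at depth zero are copied, so any level of nesting is handled with a single integer of state.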
merlyn@stonehenge.com (Randal L. Schwartz) wrote:
"Derek" == Derek Brans brans@nerdonawire.com writes:
Derek> Does anyone have code (or is there any in the image) which, given a string, returns whether or not that string is a syntactically valid email address?
In addition to the other messages in this thread, let me also point out that the following are valid:
*@qz.to (my friend, Eli the bearded uses this one)
fred&barney@stonehenge.com (my example when it comes up - go ahead and test it!)
merlyn@(that's "at")stonehenge(the rock place (that rocks!)).com(dot com!)
For the record, Squeak's mail parser handles the first and third, but loops on the second. There is an issue with the various character sets: $& is not in the list of valid atom characters, and there is no check in the parser for characters that don't match any valid character set.
In the third case it removes the comments for you.
In general, there are no illegal characters, but everything has to appear in a proper context.
Are you certain that & may be used in an atom? I followed the RFC religiously, and I'm sure I started at <atom> and worked out the list of allowed characters. But maybe I misread it, or maybe a later RFC allows more characters?
Lex
On the subject of bugs,
MailAddressParser addressesIn: 'asdf, asdf' evaluates to {'asdf' . 'asdf'}. But shouldn't it raise an error (since asdf is not a syntactically valid email address)?
Thanks for all comments posted to this thread.
----- Original Message -----
From: "Lex Spoon" lex@cc.gatech.edu
To: "The general-purpose Squeak developers list" squeak-dev@lists.squeakfoundation.org
Sent: Sunday, June 15, 2003 4:21 AM
Subject: [BUG] Re: email syntax validation needed
merlyn@stonehenge.com (Randal L. Schwartz) wrote:
"Derek" == Derek Brans brans@nerdonawire.com writes:
Derek> Does anyone have code (or is there any in the image) which, given a string, returns whether or not that string is a syntactically valid email address?
In addition to the other messages in this thread, let me also point out that the following are valid:
*@qz.to (my friend, Eli the bearded uses this one)
fred&barney@stonehenge.com (my example when it comes up - go ahead and test it!)
merlyn@(that's "at")stonehenge(the rock place (that rocks!)).com(dot com!)
For the record, Squeak's mail parser handles the first and third, but loops on the second. There is an issue with the various character sets: $& is not in the list of valid atom characters, and there is no check in the parser for characters that don't match any valid character set.
In the third case it removes the comments for you.
In general, there are no illegal characters, but everything has to appear in a proper context.
Are you certain that & may be used in an atom? I followed the RFC religiously, and I'm sure I started at <atom> and worked out the list of allowed characters. But maybe I misread it, or maybe a later RFC allows more characters?
Lex
"Lex" == Lex Spoon lex@cc.gatech.edu writes:
fred&barney@stonehenge.com (my example when it comes up - go ahead and test it!)
Lex> Are you certain that & may be used in an atom? I followed the RFC
Lex> religiously, and I'm sure I started at <atom> and worked out the list of
Lex> allowed characters. But maybe I misread it, or maybe a later RFC allows
Lex> more characters?
Following this in RFC2822:
addr-spec = local-part "@" domain
local-part = dot-atom / quoted-string / ...
dot-atom = dot-atom-text
dot-atom-text = 1*atext *("." 1*atext)
atext = ALPHA / DIGIT /   ; Any character except controls,
        "!" / "#" /       ;  SP, and specials.
        "$" / "%" /       ;  Used for atoms
        "&" / "'" /
        "*" / "+" /
        "-" / "/" /
        "=" / "?" /
        "^" / "_" /
        "`" / "{" /
        "|" / "}" /
        "~"
Note the "&" in there.
So "fred&barney" is a valid local-part, consisting of a single dot-atom which is a single dot-atom-text which is a string of one or more atext.
See, nearly everything!
Even ?@stonehenge.com would be a legit address. :)
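The dot-atom rule quoted above can be checked directly, one character at a time. What follows is a minimal workspace sketch, not the code in Squeak's mail parser: `isAtext` and `isDotAtom` are hypothetical names, and only the dot-atom form of local-part is covered (no quoted-string, no comments or folding whitespace).

```smalltalk
"Workspace sketch. isAtext and isDotAtom are hypothetical names.
 Covers only the dot-atom form of local-part, not quoted-string or comments."
| specials isAtext isDotAtom |
specials := '!#$%&''*+-/=?^_`{|}~'.
isAtext := [:c |
	(c asciiValue < 128 and: [c isAlphaNumeric])
		or: [specials includes: c]].
isDotAtom := [:aString | | run ok |
	run := 0.	"length of the current run of atext characters"
	ok := true.
	aString do: [:c |
		c = $.
			ifTrue: [run = 0 ifTrue: [ok := false]. run := 0]
			ifFalse: [(isAtext value: c)
				ifTrue: [run := run + 1]
				ifFalse: [ok := false]]].
	ok and: [run > 0]].

isDotAtom value: 'fred&barney'.	"true"
isDotAtom value: '?'.	"true"
isDotAtom value: 'fred..barney'.	"false, empty run between the dots"
isDotAtom value: 'fred barney'.	"false, SP is not atext"
```

Tracking the run length is what enforces the `1*atext` part of the rule: a leading, trailing or doubled dot leaves an empty run and is rejected.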
Special chars appear inside double-quoted strings:
quoted-string = DQUOTE *(qcontent) DQUOTE
qcontent = qtext / quoted-pair
qtext = (non-white-space controls and the rest of ASCII not including quote chars)
quoted-pair = ("\" text)
text = (any character excluding CR and LF)
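The quoted-string rules can be sketched the same way, with one flag to remember a pending backslash. This is a simplified illustration: `isQuotedString` is a hypothetical name, and the sketch does not reject CR, LF or control characters inside the quotes, which the grammar does.

```smalltalk
"Workspace sketch. isQuotedString is a hypothetical name.
 Simplification: CR, LF and control characters inside the quotes are not rejected."
| isQuotedString |
isQuotedString := [:aString | | escaped ok inner |
	(aString size >= 2 and: [aString first = $" and: [aString last = $"]])
		ifFalse: [false]
		ifTrue: [
			escaped := false.	"true right after an unescaped backslash"
			ok := true.
			inner := aString copyFrom: 2 to: aString size - 1.
			inner do: [:c |
				escaped
					ifTrue: [escaped := false]
					ifFalse: [
						c = $\ ifTrue: [escaped := true].
						c = $" ifTrue: [ok := false]]].
			ok and: [escaped not]]].

isQuotedString value: '"fred barney"'.	"true"
isQuotedString value: '"a\"b"'.	"true, the inner quote is escaped"
isQuotedString value: '"a"b"'.	"false, bare quote inside"
```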
Derek Brans wrote:
Does anyone have code (or is there any in the image) which, given a string, returns whether or not that string is a syntactically valid email address?
Thanks.
Derek,
What did you finally use in your quest on this topic? Here is a piece of code I use. It requires that the email address have a $@ character in it. It requires that the hostname portion following that character be resolvable via DNS (and it must resolve within 5 seconds). And it requires that the username portion of it not have a space (I've found putting spaces in the username to be a very common user error). You can see I can easily expand the list of bad characters in the username portion if I choose to, although I never have:
isValidEmailAddress: emailAddress
	"Answer true if the host part resolves within 5 seconds and the user part contains no space."
	| username |
	emailAddress isNil ifTrue: [^ false].
	(NetNameResolver addressForName: (emailAddress copyAfter: $@) timeout: 5) isNil
		ifTrue: [^ false].
	username := emailAddress copyUpTo: $@.
	(username includesAnyOf: ' ') ifTrue: [^ false].
	^ true
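The method above rejects a missing $@ only indirectly, because `copyAfter:` answers an empty string and the lookup then fails. A cheap local check before touching the network might look like the sketch below. `hasOneAt` is a hypothetical name, and requiring exactly one $@ is an assumption that rejects the (legal but rare) quoted local parts containing $@.

```smalltalk
"Workspace sketch. hasOneAt is a hypothetical name.
 Assumption: exactly one $@ is required, which rejects quoted local parts containing $@."
| hasOneAt |
hasOneAt := [:aString |
	(aString occurrencesOf: $@) = 1
		and: [(aString copyUpTo: $@) notEmpty
		and: [(aString copyAfter: $@) notEmpty]]].

hasOneAt value: 'bob'.	"false, no host part"
hasOneAt value: 'bob@'.	"false, empty host part"
hasOneAt value: 'bob@example.com'.	"true"
```

Running this first avoids waiting up to 5 seconds on a DNS lookup for input that cannot be an address at all.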
Derek,
I had a customer with an email address of their name, followed by '@sbcglobal.net'. NetNameResolver wasn't able to resolve the domain 'sbcglobal.net', so the code I showed you below tagged the email address as a bad address when in fact it was good.
Anybody know how to do MX record lookups from within Squeak rather than DNS lookups? Seems on my system, 'sbcglobal.net' resolves to a good MX record, but fails as a DNS lookup.
Nevin
Nevin Pratt wrote:
Derek,
What did you finally use in your quest on this topic? Here is a piece of code I use. It requires that the email address have a $@ character in it. It requires that the hostname portion following that character be resolvable via DNS (and it must resolve within 5 seconds). And it requires that the username portion of it not have a space (I've found putting spaces in the username to be a very common user error). You can see I can easily expand the list of bad characters in the username portion if I choose to, although I never have:
isValidEmailAddress: emailAddress
	"Answer true if the host part resolves within 5 seconds and the user part contains no space."
	| username |
	emailAddress isNil ifTrue: [^ false].
	(NetNameResolver addressForName: (emailAddress copyAfter: $@) timeout: 5) isNil
		ifTrue: [^ false].
	username := emailAddress copyUpTo: $@.
	(username includesAnyOf: ' ') ifTrue: [^ false].
	^ true
On Monday 21 July 2003 09:19 pm, Nevin Pratt wrote:
Anybody know how to do MX record lookups from within Squeak rather than DNS lookups? Seems on my system, 'sbcglobal.net' resolves to a good MX record, but fails as a DNS lookup.
Sure. Use OSProcess and call nslookup or dig.
Ned Konz wrote:
On Monday 21 July 2003 09:19 pm, Nevin Pratt wrote:
Anybody know how to do MX record lookups from within Squeak rather than DNS lookups? Seems on my system, 'sbcglobal.net' resolves to a good MX record, but fails as a DNS lookup.
Sure. Use OSProcess and call nslookup or dig.
I thought 'nslookup' and 'dig' just did DNS checks, no? I thought the only difference between Squeak's NetNameResolver class and them is that NetNameResolver uses the default DNS servers defined on the host machine, whereas 'dig' (and I think 'nslookup' as well) allows you to specify which machine you want to use for the DNS lookup.
So, off hand, I don't see what 'nslookup' and/or 'dig' buy me for this application.
Anyway, I've expanded my emailAddress validation to the following (and this code handles the 'sbcglobal.net' case, because it prepends 'www.' for one of the tests, and 'www.sbcglobal.net' successfully resolves):
isValidEmailAddress: emailAddress
	"Answer true if the user part contains no space and the host, or the host with 'www.' or 'mail.' prepended, resolves. Each lookup is limited to 5 seconds."
	| username host |
	emailAddress isNil ifTrue: [^ false].
	host := emailAddress copyAfter: $@.
	username := emailAddress copyUpTo: $@.
	(username includesAnyOf: ' ') ifTrue: [^ false].
	(NetNameResolver addressForName: host timeout: 5) notNil ifTrue: [^ true].
	(NetNameResolver addressForName: 'www.' , host timeout: 5) notNil ifTrue: [^ true].
	(NetNameResolver addressForName: 'mail.' , host timeout: 5) notNil ifTrue: [^ true].
	^ false
On Tue, Jul 22, 2003 at 05:24:26AM -0700, Nevin Pratt wrote:
Ned Konz wrote:
On Monday 21 July 2003 09:19 pm, Nevin Pratt wrote:
Anybody know how to do MX record lookups from within Squeak rather than DNS lookups? Seems on my system, 'sbcglobal.net' resolves to a good MX record, but fails as a DNS lookup.
Sure. Use OSProcess and call nslookup or dig.
I thought 'nslookup' and 'dig' just did DNS checks, no? I thought the only difference between Squeak's NetNameResolver class and them is that NetNameResolver uses the default DNS servers defined on the host machine, whereas 'dig' (and I think 'nslookup' as well) allows you to specify which machine you want to use for the DNS lookup.
So, off hand, I don't see what 'nslookup' and/or 'dig' buy me for this application.
It should be nicer to do with host:
ragnar:~$ host -t MX sbcglobal.net
sbcglobal.net mail is handled by 10 vmc-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vmd-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vme-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vmf-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vmg-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vmh-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vmi-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vmm-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vmn-ext.prodigy.net.
sbcglobal.net mail is handled by 10 mailapps1-ext.prodigy.net.
sbcglobal.net mail is handled by 10 mailapps2-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vm4-ext.prodigy.net.
sbcglobal.net mail is handled by 10 vmb-ext.prodigy.net.
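Once text in this format is in hand, picking out the MX hosts is plain string work. The sketch below only parses; it does not run `host` itself. `hostOutput` is sample text copied from the output above, `mxHosts` is a hypothetical name, and the assumption is that every MX line ends with a preference number followed by a host name with a trailing dot. With OSProcess and CommandShell loaded, the text could presumably be captured with something like `(PipeableOSProcess command: 'host -t MX sbcglobal.net') output`, but that call is an untested assumption here.

```smalltalk
"Workspace sketch. hostOutput is sample text in the format shown above; mxHosts is a hypothetical name.
 Assumption: each MX line ends with a preference number and a host name with a trailing dot."
| cr hostOutput mxHosts |
cr := String with: Character cr.
hostOutput := 'sbcglobal.net mail is handled by 10 vmc-ext.prodigy.net.', cr,
	'sbcglobal.net mail is handled by 10 vmd-ext.prodigy.net.'.
mxHosts := OrderedCollection new.
hostOutput linesDo: [:line | | tokens |
	(line findString: ' mail is handled by ') > 0 ifTrue: [
		tokens := line substrings.	"split on whitespace"
		"Second-to-last token is the preference, last token is the host with its trailing dot removed."
		mxHosts add: (tokens at: tokens size - 1) asInteger -> tokens last allButLast]].
mxHosts
```

An empty `mxHosts` would then mean no MX record was reported, in which case falling back to an address lookup of the bare domain is the usual next step.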