I've been trying to parse email messages with more than the simplest of structures and can't seem to get anywhere.
VM: Linux 3.2-4743 #1 [oss audio, xshm] Image: 3.2gamma-4843
Consider an example: Take an email with an attachment, and forward it as an attachment. Diagram:
a Email
  b Body
  c Forwarded Email
    d Body of Forwarded Email
    e Attachment
The problem is that no matter what I do, I can't get to the attachment, part e. Or, for that matter, the body of the forwarded email, part d.
I can see 2 ways of going about this:
1. The body of a MailMessage instantiated from the original email text (MailMessage class>>from:) is a MIMEDocument whose parts message returns an Array with 2 elements. The first element is a MailMessage whose body is a MIMEDocument containing the text labelled as part b above. The second element is also a MailMessage, whose body is a MIMEDocument that includes all of c-e. However, this MIMEDocument's parts message returns an empty Array, so I'm stopped at node c.
2. Again, instantiate a MailMessage from the original email using MailMessage class>>from:, then send MailMessage>>makeMultipart followed by MailMessage>>parts, which returns an OrderedCollection with 2 elements. The first element is a MailMessage; sending makeMultipart and parts to it returns an Array with only one element, a MailMessage whose text is part b. If you send makeMultipart and parts to this MailMessage, you keep getting one-element Arrays with the same text. I guess we can consider this a recursion limit criterion.
Now, take the second element of the two element Array returned by the first level of makeMultipart/parts. This is a MailMessage whose text contains all of c-e. Recurse with makeMultipart/parts again and we get an Array with ONLY one element, a matching MailMessage. Continued recursion never visits d or e.
I would really appreciate it if someone can point out my mistake/misunderstanding here. Otherwise I have to consider the MIME system to be less than useful.
Ken Causey
Ken Causey ken@ineffable.com wrote:
The problem is that no matter what I do, I can't get to the attachment, part e.
This works for me:
a := MailMessage from: someText.
c := MailMessage from: a parts second body content.
e := c parts second.
attachmentData := e body content
This seems fairly direct.
Now, MailMessage isn't the cleanest code in the world. If you want to clean it up and document your new design, that would be great!
Lex
On Sun, 2002-04-28 at 23:57, Lex Spoon wrote:
[snipped quoted message]
OK, that's very interesting as this does work. But this indicates even more clearly that the MailMessage instances returned as a result of MailMessage>>parts are apparently not correctly formed, since they are not functionally equivalent to instantiating an instance manually from the text using MailMessage class>>from:.
Thanks for this workaround.
Ken Causey (nikos)
Ken Causey ken@ineffable.com wrote:
OK, that's very interesting as this does work. But this indicates even more clearly that the MailMessage instances returned as a result of MailMessage>>parts are apparently not correctly formed, since they are not functionally equivalent to instantiating an instance manually from the text using MailMessage class>>from:.
Can you be more explicit, like with some example code? There are two #parts floating around, you know -- I bet this is what you bumped into. MIMEDocument>>parts is doing guesswork, and it isn't surprising that a complex message can trip it up. Now that we have MailMessage>>parts, this method should probably go away. At the least, it should be renamed #guessParts. Before doing so, however, we just need to figure out if any code is still using the old version.
-Lex
I run into much the same behaviour with both parts messages, as I explained in the original bug. I'm not sure of the best way to go about providing example code, but here goes:
a := MailMessage from: 'Return-Path: ken@ineffable.com
Received: from localhost.localdomain (ken@temp.ineffable.com [205.229.226.241]) by mail.premiernet.net (8.12.3/8.12.3/Debian -4) with ESMTP id g3QI6CKT010267 for kentest@premiernet.net; Fri, 26 Apr 2002 13:06:12 -0500
Subject: [Fwd: Attachment]
From: Ken Causey ken@ineffable.com
To: kentest@premiernet.net
Content-Type: multipart/mixed; boundary="=-CaP9ddwRV+U2ialyRvBD"
X-Mailer: Ximian Evolution 1.0.3
Date: 26 Apr 2002 13:06:12 -0500
Message-Id: 1019844372.399.37.camel@temp
Mime-Version: 1.0
X-UIDL: ,57"!O5Y!!%^p"!Zh!!
Status: RO

--=-CaP9ddwRV+U2ialyRvBD
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

forwarded email with attachment

--=-CaP9ddwRV+U2ialyRvBD
Content-Disposition: inline
Content-Description: Forwarded message - Attachment
Content-Type: message/rfc822

Return-Path: ken@ineffable.com
X-Sieve: cmu-sieve 2.0
Received: from mail.premiernet.net [205.229.224.233] by localhost with POP3 (fetchmail-5.9.0) for ken@localhost (single-drop); Fri, 26 Apr 2002 13:04:15 -0500 (CDT)
Received: from localhost.localdomain (ken@temp.ineffable.com [205.229.226.241]) by mail.premiernet.net (8.12.3/8.12.3/Debian -4) with ESMTP id g3QI3EKT010064; Fri, 26 Apr 2002 13:03:14 -0500
Subject: Attachment
From: Ken Causey ken@ineffable.com
To: kentest@premiernet.net
Cc: ken@ineffable.com
Content-Type: multipart/mixed; boundary="=-sT9PYGNIhkbF+ahSUtGM"
X-Mailer: Ximian Evolution 1.0.3
Date: 26 Apr 2002 13:03:14 -0500
Message-Id: 1019844194.399.35.camel@temp
Mime-Version: 1.0
X-Spam-Status: No, hits=0.0 required=5.0 tests= version=2.11
X-UIDL: //a"!?2Z!!JHC!!K0+!!

--=-sT9PYGNIhkbF+ahSUtGM
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

original attachment

--=-sT9PYGNIhkbF+ahSUtGM
Content-Disposition: attachment; filename=test.txt
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; name=test.txt; charset=ANSI_X3.4-1968

Test

--=-sT9PYGNIhkbF+ahSUtGM--

--=-CaP9ddwRV+U2ialyRvBD--
'.
To demonstrate my first method:
a body parts second body parts
returns #(). This method uses MIMEDocument>>parts
To demonstrate my second method:
a makeMultipart parts second makeMultipart parts first
at this point you can keep appending 'makeMultipart parts first' and you continue to get back the same exact result. This method uses MailMessage>>parts.
Ken
Ken Causey ken@ineffable.com wrote:
[snipped example message]
Excellent. Let's go:
To demonstrate my first method:
a body parts second body parts
returns #(). This method uses MIMEDocument>>parts
Okay. I don't know why. I don't especially care about trying to make this work. MailMessage>>parts is better.
To demonstrate my second method:
a makeMultipart parts second makeMultipart parts first
Let's see.
First, the makeMultiparts are distracting. They do nothing if the message is multipart already, and if the message *isn't* multipart, they will force it to be! That is undesired.
But the main issue is that your message is actually structured like this:
a
  b ("forwarded email with attachment")
  c (a single-part "multipart" message)
    d (an embedded message)
      e ("original attachment")
      e (test.txt)
Note that there is an extra layer here. Since it is encoded as a message, you'll have to decode it with "MailMessage from:" before proceeding further. The following works to extract the attachment:
forwardedMessage := MailMessage from: a parts second parts first body content.
forwardedMessage parts second
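Spelled out one layer at a time, the same extraction might look like the sketch below. It uses only selectors that already appear in this thread (from:, parts, body, content, plus the standard first and second), and it assumes `a` is the MailMessage built from the example text. The temporary variable names are illustrative only.

```smalltalk
"A workspace sketch; assumes a := MailMessage from: <the example text>."
| outerParts wrapper embedded forwardedMessage attachment |
outerParts := a parts.            "the two top-level parts, b and c"
wrapper := outerParts second.     "c: the part that carries the forwarded email"
embedded := wrapper parts first.  "d: the embedded message, still only a part"

"The extra layer: d holds the forwarded email as text, so re-parse that text."
forwardedMessage := MailMessage from: embedded body content.

attachment := forwardedMessage parts second.  "test.txt"
attachment body content
```

Naming each intermediate makes the extra re-parsing step visible: the forwarded email only exposes its own parts after its text has been passed back through MailMessage class>>from:.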
at this point you can keep appending 'makeMultipart parts first' and you continue to get back the same exact result. This method uses MailMessage>>parts.
In fact, the makeMultipart is more than distracting -- it is increasing the nesting depth, even as the "parts first" is decreasing it!
Overall, the difficulty is with the difference between a mail message and its textual representation. Okay I'm stopping now before I get a headache. :)
Lex
On Tue, 2002-04-30 at 00:21, Lex Spoon wrote:
Ken Causey ken@ineffable.com wrote:
forwardedMessage := MailMessage from: a parts second parts first body content.
forwardedMessage parts second
Lex,
Thank you very much for this solution. This gets me past my problem. I'm still inclined to think that there is a problem, but I wonder now if it's more of a design problem than an implementation one. Well, for now I need results, and you've helped me quite a bit.
Thank you!
Ken
squeak-dev@lists.squeakfoundation.org