
Telecooperation Technische Universität Darmstadt Copyrighted material; for TUD student use only E-Mail Q&A Telecooperation Group TU Darmstadt.


1 Telecooperation Technische Universität Darmstadt Copyrighted material; for TUD student use only E-Mail Q&A Telecooperation Group TU Darmstadt

2 Prof. Dr. M. Mühlhäuser Telekooperation © 2 Interoperability
No need to implement everything from RFCs 2045-2047
– Way too much work
– Correctly implemented, you would out-standard most common e-mail clients
Your implementation should have this functionality
– 7bit encoding
– Quoted-printable and Base64 encoding with all charsets Java can handle (i.e. every charsetName that does not throw an UnsupportedEncodingException)
– Multipart messages are recognized and decoded correctly
– Robustness: do not choke on unrecognized headers
Programs will be tested with public test cases plus secret ones
– Secret test cases, too, only use the functionality mentioned above
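The charset rule above can be checked directly with the String constructor that takes a charset name. This is only a minimal sketch; the class name CharsetCheck and the choice to rethrow as IllegalArgumentException are assumptions made for illustration, not part of the assignment.

```java
import java.io.UnsupportedEncodingException;

final class CharsetCheck {
    /**
     * Decodes raw bytes with the named charset.
     * Hypothetical helper: unknown charsets are reported as IllegalArgumentException.
     */
    static String decode(byte[] raw, String charsetName) {
        try {
            // Throws UnsupportedEncodingException if Java does not know the charset.
            return new String(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }
}
```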

3 Headers
Multiline headers
– Line continuations start with “folding whitespace”, which may be a space or a tab (\t)
Ignore every header you do not know
– If you want, you can also display additional headers like BCC, but only those mentioned in milestone 3.1 are required
Case sensitivity
– Header names are always case-insensitive, cf. RFC 2822, section 1.2.2: “Characters will be specified […] by a case-insensitive literal value enclosed in quotation marks”
– Header values used in the assignment are usually case-insensitive, e.g. for Content-Transfer-Encoding, Base64 and base64 are both possible
– Exceptions: the multipart boundary and all header values displayed to the user
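A minimal sketch of these header rules, assuming the header block is available as one String. The class name HeaderParser is hypothetical. Two simplifications are deliberate: folded lines are joined with a single space, and a repeated header name overwrites the earlier value.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class HeaderParser {
    /** Parses a header block into a map keyed by lower-cased header name. */
    static Map<String, String> parse(String headerBlock) {
        Map<String, String> headers = new LinkedHashMap<>();
        String currentName = null;
        for (String line : headerBlock.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // an empty line ends the header section
            }
            boolean continuation = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (continuation && currentName != null) {
                // Folded line: append it to the value of the previous header.
                headers.put(currentName, headers.get(currentName) + " " + line.trim());
                continue;
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // robustness: skip lines that are not "Name: value"
            }
            // Header names are case-insensitive, so normalize them to lower case.
            currentName = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            headers.put(currentName, line.substring(colon + 1).trim());
        }
        return headers;
    }
}
```

Unknown headers simply end up in the map and can be ignored by the caller. Values keep their original case, so the boundary and displayed values stay untouched; compare values like the transfer encoding with equalsIgnoreCase.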

4 Date
Look into the documentation of SimpleDateFormat
– No need to parse each item yourself; it even recognizes “GMT” and “UTC” as time zones
– Construct the parser with Locale.US so that it can parse English names like “May”
Output via DateFormat.getDateTimeInstance()
Time zone
– Setting it via SimpleDateFormat or Calendar#setTimeZone is preferred to manual time manipulation
– Reason: DateFormat may be configured to display the time zone
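A sketch of the date handling described above. The pattern string "EEE, d MMM yyyy HH:mm:ss Z" is an assumption that fits common Date headers such as "Tue, 15 May 2007 14:30:00 +0200"; real messages may need additional patterns. The class name MailDate and the rethrow as IllegalArgumentException are likewise illustrative.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class MailDate {
    /** Parses a Date header value; Locale.US makes "Tue" and "May" parseable. */
    static Date parse(String value) {
        SimpleDateFormat parser =
                new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss Z", Locale.US);
        try {
            return parser.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + value, e);
        }
    }

    /** Formats the date for display in the given time zone, without manual arithmetic. */
    static String display(Date date, TimeZone zone) {
        DateFormat out = DateFormat.getDateTimeInstance(DateFormat.MEDIUM, DateFormat.LONG);
        out.setTimeZone(zone); // the formatter shifts the time, not our code
        return out.format(date);
    }
}
```

The exact text returned by display depends on the default locale and the Java version, so it is best not to compare it against a fixed string.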

5 Attachments
Base64-encoded lines are 76 characters wide (the maximum RFC 2045 allows); the only exception is the last line
If numberofchars % 4 != 0, you may just throw an exception and terminate
Do not use javax.mail.internet.MimeUtility or similar additional libraries for decoding
Use the Content-Disposition header to suggest a name for saving
Attachments that are not of type text/… don’t have and don’t need a charset
– Just treat them as a stream of bytes/byte array

6 Base64 Example
Take a group of 4 characters: S W 4 g
Decode according to the RFC
– S = 0x12; W = 0x16; 4 = 0x38; g = 0x20
– Decoding may be done in ranges: A-Z → char - 'A'; a-z → char - 'a' + 26; 0-9 → char - '0' + 26*2; +, / and = must be treated separately
Combine to a 24-bit number, shifting according to index (big endian)
– 0x12 << 18 | 0x16 << 12 | 0x38 << 6 | 0x20 << 0 → 0x496e20
Shift the number back in 8-bit blocks (also big endian)
– Byte 0 = 0x496e20 >> 16 & 0xff = 0x49
– Byte 1 = 0x496e20 >> 8 & 0xff = 0x6e
– Byte 2 = 0x496e20 >> 0 & 0xff = 0x20
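The worked example can be turned into a small decoder along these lines. This is a sketch, not the sample application: the class name Base64Sketch is hypothetical, whitespace is stripped up front, and the code does not check that '=' appears only at the very end of the data.

```java
import java.io.ByteArrayOutputStream;

final class Base64Sketch {
    /** Maps one Base64 character to its 6-bit value ('=' is handled by the caller). */
    static int valueOf(char c) {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 26 * 2;
        if (c == '+') return 62;
        if (c == '/') return 63;
        throw new IllegalArgumentException("Not a Base64 character: " + c);
    }

    /** Decodes Base64 text (line breaks allowed) into raw bytes. */
    static byte[] decode(String encoded) {
        String data = encoded.replaceAll("\\s", ""); // drop line breaks and spaces
        if (data.length() % 4 != 0) {
            throw new IllegalArgumentException("Length is not a multiple of 4");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < data.length(); i += 4) {
            int bits = 0;
            int padding = 0;
            for (int j = 0; j < 4; j++) {
                char c = data.charAt(i + j);
                if (c == '=') {
                    padding++;   // padding contributes six zero bits
                    bits <<= 6;
                } else {
                    bits = (bits << 6) | valueOf(c); // big endian: first char ends up highest
                }
            }
            // Split the 24-bit number into bytes; each '=' removes one trailing byte.
            out.write((bits >> 16) & 0xff);
            if (padding < 2) out.write((bits >> 8) & 0xff);
            if (padding < 1) out.write(bits & 0xff);
        }
        return out.toByteArray();
    }
}
```

Decoding "SW4g" with this sketch yields the bytes 0x49, 0x6e, 0x20 from the example, i.e. the ASCII text "In " with a trailing space.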

7 Decoding
Your own input stream
– Elegant way of decoding Base64 and quoted-printable data (only a suggestion; you can do it differently)
1. Extend java.io.InputStream
2. Take a character array of undecoded data as parameter
3. Override read()
– Decode the character data when
– Return -1 if the end of the data is reached
4. Let the InputStreamReader deal with the nasty problem of decoding charsets
The sample application has only 50 LoC for decoding quoted-printable and 100 LoC for Base64
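One possible shape of such a stream for quoted-printable data is sketched below. It is not the sample application. The class name QuotedPrintableInputStream and the helper decodeToString are assumptions, and the sketch decodes lazily, one byte per read() call, which is only one of several reasonable choices.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;

final class QuotedPrintableInputStream extends InputStream {
    private final char[] encoded;
    private int pos = 0;

    QuotedPrintableInputStream(char[] encoded) {
        this.encoded = encoded;
    }

    @Override
    public int read() throws IOException {
        while (pos < encoded.length) {
            char c = encoded[pos++];
            if (c != '=') {
                return c & 0xff; // literal character, passed through as one byte
            }
            int skip = softBreakLength();
            if (skip > 0) {
                pos += skip; // "=" followed by a line break: soft line break, no output
                continue;
            }
            if (pos + 2 > encoded.length) {
                throw new IOException("Truncated escape sequence");
            }
            int hi = Character.digit(encoded[pos], 16);
            int lo = Character.digit(encoded[pos + 1], 16);
            if (hi < 0 || lo < 0) {
                throw new IOException("Invalid escape sequence");
            }
            pos += 2;
            return (hi << 4) | lo; // "=E4" becomes the byte 0xE4
        }
        return -1; // end of data
    }

    /** Length of the line break at the current position, or 0 if there is none. */
    private int softBreakLength() {
        if (pos < encoded.length && encoded[pos] == '\n') return 1;
        if (pos + 1 < encoded.length && encoded[pos] == '\r' && encoded[pos + 1] == '\n') return 2;
        return 0;
    }

    /** Hypothetical convenience method: the InputStreamReader handles the charset. */
    static String decodeToString(String encoded, String charsetName) {
        try (Reader reader = new InputStreamReader(
                new QuotedPrintableInputStream(encoded.toCharArray()), charsetName)) {
            StringBuilder text = new StringBuilder();
            int ch;
            while ((ch = reader.read()) != -1) {
                text.append((char) ch);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same structure should carry over to Base64: only read() changes, while the charset handling stays with the InputStreamReader.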

8 Regular Expressions
Regular expressions are a nice way of filtering out substrings
A bit like file name patterns (*, ?), but more powerful
– Letters and numbers stand for themselves
– Punctuation characters usually have a special meaning; to match the character itself, escape it with a \, e.g. to match the character [, use \[
  Attention: you need to escape the backslash in Java strings → the expression \[ is written "\\["
– Alternatives: use []
  [abc] matches a or b or c
  [A-Z] matches A or B or … or Z
  Negation: [^abc] matches everything but a or b or c
– Wildcard: . matches any character
– Repetition
  * means “the previous element zero or more times”
  + means “the previous element one or more times”
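The syntax elements above can be tried out quickly with Pattern.matches, which tests whether the whole input matches the expression. The class name RegexBasics and the sample inputs are chosen only for illustration.

```java
import java.util.regex.Pattern;

final class RegexBasics {
    /** True if the whole input matches the regular expression. */
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\[", "["));       // true: escaped bracket, doubled backslash in Java
        System.out.println(matches("[abc]", "b"));     // true: one of a, b, c
        System.out.println(matches("[^abc]", "b"));    // false: negated class
        System.out.println(matches("[A-Z]+", "MIME")); // true: one or more capital letters
        System.out.println(matches("a.c", "abc"));     // true: . is the wildcard
        System.out.println(matches("ab*", "a"));       // true: zero or more b
        System.out.println(matches("ab+", "a"));       // false: at least one b required
    }
}
```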

9 Regular Expressions with Java
Part of java.util.regex
First, compile the pattern to search for:
– Pattern p = Pattern.compile("charset=[^ ]*")
– The compile method has a variant that takes flags; use it for case-insensitivity: Pattern.CASE_INSENSITIVE
Next, make a Matcher for a String out of it
– Matcher m = p.matcher("Content-Type: text/plain; charset=\"us-ascii\"")
Be sure to call the Matcher’s find method
– m.find()
m.group(0) now contains everything that matches
– charset="us-ascii"

10 Grouping
You need the thing after “charset=”
– Solution 1: parse it yourself
– Solution 2: add groups to the expression
Groups are signified by () and counted from 1
– Pattern p = Pattern.compile("charset=([^ ]*)")
After matching, group(1) contains "\"us-ascii\""
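Put together, extracting the charset with a group might look like the sketch below. The class name CharsetExtractor and the quote stripping are additions for illustration; the pattern itself is the one from the slide, compiled with the case-insensitivity flag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CharsetExtractor {
    private static final Pattern CHARSET =
            Pattern.compile("charset=([^ ]*)", Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in a Content-Type header line, or null if there is none. */
    static String charsetOf(String contentType) {
        Matcher m = CHARSET.matcher(contentType);
        if (!m.find()) {
            return null; // without find() the groups are not available
        }
        String value = m.group(1); // group 1 is the part in parentheses
        // Assumption: surrounding double quotes are not part of the charset name.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }
}
```

For the header line from the previous slide, charsetOf returns us-ascii without the quotes, which can then be passed on as a charsetName.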

11 Debugging
Mail clients should be able to connect to the server and fetch the mail
Always helpful: try to connect to the POP server via telnet and issue POP commands manually
– For closer examination, you may unzip the JAR file and have a look at “mailbox.xml”

