Published by Milo Holmes. Modified over 9 years ago.
1
A Comparison of Compression Techniques for XML-based Security Policies in Mobile Computing Environments
Xuebing Qing
Carlisle Adams
2
Agenda
Why Compress?
Criteria for Compression Algorithms
Gzip and Bzip2
wbXML with/without Transcode
ASN.1
Combinations
  –wbXML + Zip
  –ASN.1 + Zip
Recent XML Compression Proposals
Conclusions and Future Directions
3
Why Compress?
XML (XACML) is a good choice for policy representation because it offers high interoperability between domains.
On-device authorization decision rendering and simple policy deployment/updating are also required.
XML is too verbose and heavy for many mobile devices:
  –Limited bandwidth
  –Limited CPU power, RAM
  –Limited battery, flash memory, etc.
4
Evaluating Compression Algorithms
Criterion 1: High Compression Ratio
Criterion 2: Low Processing Overhead
Criterion 3: No Semantic Ambiguity
“Nice to have”: 3rd-Party API Support
We consider the most popular compression algorithms, as well as their combinations:
  –Gzip and Bzip2
  –wbXML
  –ASN.1
  –wbXML with transcode + Gzip or Bzip2
  –ASN.1 + Gzip or Bzip2
None of them introduces semantic ambiguity, and all have good 3rd-party API support.
The ideal algorithm should achieve the highest compression rate while keeping decompression overhead to a minimum.
5
Experimental Setting
Written in Java, tested under JSDK 1.4.2 / Windows 2000 / 866 MHz CPU and 512 MB RAM
Runtime Memory Profiling: Eclipse Hyades Plug-in
Java APIs Used
  –wbXML: kXML 1.1 (Open Source)
  –ASN.1: Pure Java API by OSS Nokalva (Alpha Version – Trial version)
  –Gzip: The gzip implementation in JDK 1.4.2
  –Bzip2: Apache BZip2 Implementation
Test Cases: 9 XACML files (2 KB to 1 MB) created from the XACML (version 1.1) Conformance Test Suite
6
Gzip and Bzip2: Compression Rate
Very good compression rate (especially when size > 70K)
Compression_rate_gzip is better than Compression_rate_bzip2 when size < 70K
Bzip2 performs extremely well when size >= 250K.
Zip algorithms work better with large files, yet they still compress small files (2K) to 1/3 of the original size.
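The size dependence described on this slide can be explored with a minimal sketch. It assumes Python's standard gzip and bz2 modules rather than the Java APIs used in the experiments, and it uses synthetic, repetitive XML rather than the XACML conformance files, so its numbers will not match the slide. The helper names make_policy_like_xml and compression_rates are hypothetical.

```python
import bz2
import gzip


def make_policy_like_xml(rule_count: int) -> bytes:
    """Build a synthetic, repetitive XML document (a stand-in, not real XACML)."""
    rules = "".join(
        f'<Rule RuleId="rule-{i}" Effect="Permit">'
        f"<Description>Synthetic rule number {i}</Description></Rule>"
        for i in range(rule_count)
    )
    return f'<Policy PolicyId="demo">{rules}</Policy>'.encode("utf-8")


def compression_rates(data: bytes) -> dict:
    """Return compressed size divided by original size for gzip and bzip2."""
    return {
        "gzip": len(gzip.compress(data)) / len(data),
        "bzip2": len(bz2.compress(data)) / len(data),
    }


if __name__ == "__main__":
    # Print the rate for documents of increasing size.
    for rule_count in (20, 200, 2000, 10000):
        doc = make_policy_like_xml(rule_count)
        rates = compression_rates(doc)
        print(f"{len(doc):>8} bytes  gzip={rates['gzip']:.3f}  bzip2={rates['bzip2']:.3f}")
```

Running the loop over several sizes shows how each algorithm's rate changes as the input grows; where the two curves cross depends on the data.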
7
Gzip and Bzip2: Processing Overhead - Time
Only decompression time is considered, because XACML is compressed only on the server side, when policies are deployed.
Absolute decompression time alone is not enough for an evaluation; a reference point is needed.
The wbXML-to-XML conversion mainly involves XML tag replacement and is not CPU intensive, so it can be performed on a device. Its conversion time can therefore serve as a reference for a fairly realistic evaluation.
Gzip performs best; Bzip2 is similar to the wbXML conversion.
Because the kXML 1.1 API has significant room for optimization, wbXML conversion may ultimately have a time overhead similar to that of Gzip and hence may be acceptable on a mobile device.
8
Gzip and Bzip2: Memory Overhead – Raw Data
Numbers in brackets are memory increments relative to the previous file. Numbers in red (the negative increments) mean that memory in use decreased as file size increased; this is caused by garbage collection.
The memory overhead of the wbXML-to-XML conversion is used as a reference for the estimate.
Size_memory = Size_memory_in_use + Size_memory_gced. So the memory used by File 8 is not 1,857,623 bytes (memory in use) but 3,087,933 bytes, which includes the memory garbage-collected during the process.
For the analysis, we split memory into two parts: base runtime memory for the decompression API and the program itself, and decompression memory for the representation and computation of data at runtime.
Base memory is estimated by comparing the absolute memory size with that of the wbXML-to-XML conversion.
The memory size increment factor is used to estimate decompression memory.

File   | Size (bytes) | Gzip memory [increment] (bytes) | Bzip2 memory [increment] (bytes) | wbXML with transcode [increment] (bytes)
File 1 | 2,167        | 913,770                         | 18,699,694                       | 1,221,647
File 2 | 4,798        | 922,000 [8,230]                 | 18,707,972 [8,278]               | 1,272,590 [50,943]
File 3 | 9,479        | 938,566 [16,566]                | 18,851,890 [143,918]             | 1,372,148 [99,448]
File 4 | 23,976       | 974,080 [35,514]                | 18,759,269 [-92,621] (2)         | 1,803,052 [430,904]
File 5 | 70,186       | 1,175,590 [201,510]             | 18,957,045 [197,776]             | 1,241,162 [-561,890] (4)
File 6 | 140,071      | 1,450,050 [274,460]             | 19,374,474 [417,429]             | 1,106,431 [-134,730] (4)
File 7 | 278,623      | 1,996,007 [545,957]             | 19,752,229 [377,755]             | 1,131,434 [25,003] (4)
File 8 | 556,395      | 1,857,623 [-138,384] (1)        | 20,802,929 [1,050,700]           | 1,385,234 [253,800] (4)
File 9 | 1,111,939    | 3,482,445 [1,624,822]           | 8,916,388 [-11,886,541] (3)      | 742,690 [-642,544] (4)
9
Gzip and Bzip2: Memory Overhead – Result
The memory size increment factor measures the memory increment caused by the data size increment, that is, memory increment / file size increment.
The bigger the memory size increment factor, the more memory is used for data decompression and the more frequent garbage collection will be. It is a range of possible values rather than one fixed value.
Result: Gzip has a very small footprint when decompressing XACML data; its processing memory overhead is reasonable and acceptable.
However, a zipped XACML file has to be unpacked into XML and then processed. The processing overhead of Gzip is OH_gzip = OH_gzip-decompression + OH_xml-processing
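As a worked example of the definition above, the following sketch computes the memory size increment factor for Gzip between consecutive test files. The file sizes and memory-in-use figures are copied from the raw-data slide; the function name increment_factors is hypothetical, and Python is used only for the arithmetic.

```python
# File sizes and Gzip "memory in use" figures (bytes), copied from the raw-data slide.
FILE_SIZES = [2167, 4798, 9479, 23976, 70186, 140071, 278623, 556395, 1111939]
GZIP_MEMORY = [913770, 922000, 938566, 974080, 1175590, 1450050, 1996007, 1857623, 3482445]


def increment_factors(sizes, memory):
    """Memory increment divided by file size increment, for consecutive files."""
    return [
        (memory[i] - memory[i - 1]) / (sizes[i] - sizes[i - 1])
        for i in range(1, len(sizes))
    ]


if __name__ == "__main__":
    factors = increment_factors(FILE_SIZES, GZIP_MEMORY)
    for file_no, factor in enumerate(factors, start=2):
        # A negative factor means memory in use dropped between the two files.
        note = "  (negative: memory in use dropped)" if factor < 0 else ""
        print(f"File {file_no - 1} -> File {file_no}: {factor:7.3f}{note}")
```

Because memory in use is sampled after garbage collection may have run, the factor varies from one pair of files to the next and is negative for the File 7 to File 8 step, which is why the slide treats it as a range rather than a single value.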
10
wbXML: Overview
Part of the presentation logic in WAP
Uses a token dictionary, where each token (transcode) maps to a predefined string (mainly element tags and attribute tags).
wbXML without transcode: no explicit token dictionary is specified (otherwise, wbXML with transcode).
[Figure: code segments used to generate the transcode in kXML 1.1]
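The token-dictionary idea can be illustrated with a toy sketch. This is not the WBXML wire format and does not use the kXML API: the table entries, byte values, and the encode/decode helpers are all hypothetical, and the code only shows how replacing long, frequent names with one-byte tokens shrinks a document while remaining reversible.

```python
# Hypothetical token table: each frequent XACML name maps to a one-byte code.
# Real WBXML code pages and byte values differ; these are for illustration only.
TOKEN_TABLE = {
    b"PolicySet": b"\x01",
    b"Policy": b"\x02",
    b"Target": b"\x03",
    b"Rule": b"\x04",
    b"Effect": b"\x05",
    b"Description": b"\x06",
}


def encode(xml: bytes) -> bytes:
    """Replace every table name with its token, longest names first."""
    if any(token in xml for token in TOKEN_TABLE.values()):
        raise ValueError("input already contains a reserved token byte")
    for name in sorted(TOKEN_TABLE, key=len, reverse=True):
        xml = xml.replace(name, TOKEN_TABLE[name])
    return xml


def decode(data: bytes) -> bytes:
    """Expand every token back to the name it stands for."""
    for name, token in TOKEN_TABLE.items():
        data = data.replace(token, name)
    return data


if __name__ == "__main__":
    sample = (
        b'<Policy PolicyId="p1"><Rule RuleId="r1" Effect="Permit">'
        b"<Description>demo</Description></Rule></Policy>"
    )
    packed = encode(sample)
    print(len(sample), "->", len(packed), "bytes; round trip ok:", decode(packed) == sample)
```

Adding more entries to the table shortens the output further, which is the lever the deck later proposes for improving wbXML with transcode.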
11
wbXML: Compression Rate
wbXML with transcode reduces size to under 50% of the original, which is much better than wbXML without transcode.
It does not match Gzip's compression rate, particularly when the file size is over 5 KB.
However, an XACML policy in wbXML can be processed directly by a wbXML parser without any decompression overhead.
We only discuss the processing overhead for wbXML with transcode.
12
wbXML: Analysis of Processing Overhead
There is no time or memory overhead for decompression.
However, it is impractical to measure and compare the CPU time and memory used when evaluating an XACML policy in wbXML form and in XML form.
We therefore rely on the following analysis rather than experiments:
  –Footprint_wbxml_obj < Footprint_xml_obj: since a wbXML file is 50% of its original XACML size, it is reasonable to assume that a wbXML object is approximately half the size of its XML counterpart.
  –A smaller runtime representation certainly enables faster processing, but the overhead of transcode-table lookup at runtime must also be considered.
  –We can assume Processing_Time_wbxml <= Processing_Time_xml
  –Evaluating an XACML policy in wbXML is less battery intensive because its in-memory representation is much smaller than its XML counterpart.
  –Result: OH_wbxml = α × OH_xml-processing, where the factor α < 1; this is smaller than OH_gzip = OH_gzip-decompression + OH_xml-processing
13
ASN.1: Schema Based XML Encoding
A schema-based binary encoding spec, X.694 “Mapping W3C XML Schema Definitions into ASN.1”, is under development.
The spec introduces ASN.1, a binary, schema-based language, into the XML world, which is XML-schema based.
With the specification, an XML document can be converted into ASN.1, which is then encoded with one of ASN.1’s binary encoding rules, such as PER, DER, CER, or BER.
Theoretically, ASN.1 with PER, the most compact encoding rule, can achieve the same level of compression as Gzip [4].
However, the Pure Java API by OSS Nokalva only offers a compression rate slightly better than that of wbXML, partly because the API is still in its Alpha stage: several hot fixes were sent during the experiments in this research.
14
ASN.1 Encoding: Compression Rate
Slightly better than wbXML with transcode, but not comparable to Gzip.
The result differs from the one reported for Fast Web Services (FWS) [7]; this might be caused by the difference in the APIs used and/or by the different characteristics of XACML files and the Web services XML files used in FWS.
15
ASN.1 Encoding: Analysis of Processing Overhead
There is no need to convert an ASN.1-encoded policy to XACML when processing, because ASN.1 is a schema language and supports operations similar to those of XML.
As with wbXML, we rely on analysis rather than experiments. The analysis is similar to the one for wbXML.
Result: OH_ASN.1 = β × OH_xml-processing, where the factor β < 1; this is smaller than OH_gzip = OH_gzip-decompression + OH_xml-processing
According to Sun’s experimental results on FWS, β could be as small as 0.1 in a Web services environment (although no such result has been achieved in our experiments).
17
Combine wbXML or ASN.1 with Gzip
Gzip, wbXML and ASN.1 do not perform well enough to satisfy the criteria on their own.
Pure Gzip has more processing overhead than wbXML and ASN.1, while wbXML and ASN.1 do not compress as well as Gzip.
It makes sense to combine them:
  –wbXML with transcode + Gzip
  –ASN.1 + Gzip
  –Other combinations are not as good as the above (wbXML with transcode is better than wbXML without transcode, and Bzip2 consumes much more memory and CPU time than Gzip for decompression).
18
The Combinations: Compression Rate
Much better than pure ASN.1 and wbXML
Even better than pure Gzip
Interestingly, the overall compression rate of wbXML + Gzip for XACML files over 100 KB is better than that of ASN.1 + Gzip.
19
The Combinations: Analysis of Processing Overhead
For wbXML with transcode + Gzip: OH_wbxml_gzip = OH_gzip-decompression + α × OH_xml-processing
For ASN.1 + Gzip: OH_ASN.1_gzip = OH_gzip-decompression + β × OH_xml-processing
Here α < 1 and β < 1 are the processing factors of wbXML and ASN.1 relative to plain XML.
Just for reference:
  –Gzip: OH_gzip = OH_gzip-decompression + OH_xml-processing
  –wbXML: OH_wbxml = α × OH_xml-processing
OH_wbxml_gzip is definitely better than OH_gzip because an XACML file is decompressed only once but processed many times.
Although OH_wbxml_gzip is greater than OH_wbxml, the difference can be ignored, because OH_gzip-decompression is small and decompression only happens the first time the policy is downloaded and when the policy is updated.
Conclusion: wbXML + Gzip is better than ASN.1 + Gzip:
  –Tag names in XACML are long; simple replacement (wbXML) achieves a good compression rate.
  –Replacement (wbXML) creates less overhead than complex encoding (ASN.1).
  –ASN.1 does not achieve the excellent compression rate expected (when publicly available APIs are used).
  –Good open source wbXML APIs are available.
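The "decompressed once, processed many times" argument can be made concrete with a small cost model. This is a sketch under stated assumptions: it extends the per-use formulas on this slide to a total over many evaluations, and every number in it (cost units, the 0.5 factor, the evaluation counts) is hypothetical rather than measured. The function name total_overhead is also hypothetical.

```python
def total_overhead(evaluations, decompress_cost, xml_processing_cost, factor=1.0):
    """One decompression per download plus `evaluations` policy evaluations.

    `factor` scales the XML processing cost (1.0 for plain XML, < 1 for wbXML).
    """
    return decompress_cost + evaluations * factor * xml_processing_cost


if __name__ == "__main__":
    # Hypothetical costs in arbitrary units; not taken from the experiments.
    DECOMPRESS, PROCESS, WBXML_FACTOR = 5.0, 10.0, 0.5
    for n in (1, 10, 100):
        gzip_only = total_overhead(n, DECOMPRESS, PROCESS)
        wbxml_only = total_overhead(n, 0.0, PROCESS, WBXML_FACTOR)
        wbxml_gzip = total_overhead(n, DECOMPRESS, PROCESS, WBXML_FACTOR)
        print(f"n={n:>3}  gzip={gzip_only:7.1f}  wbxml={wbxml_only:7.1f}  wbxml+gzip={wbxml_gzip:7.1f}")
```

In this model the gap between wbXML + Gzip and pure wbXML stays fixed at the one-time decompression cost, while the gap between pure Gzip and wbXML + Gzip grows with the number of evaluations.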
20
Recent XML Compression Proposals (1): XOP/MTOM
XOP: XML-binary Optimized Packaging
  –an XML serialization protocol, which converts certain XML data content (usually base-64 encoded) into binary streams and puts them into a structure that looks like MIME multipart, with an XML document as the root part.
MTOM: Message Transmission Optimization Mechanism
  –a description of how XOP is layered into SOAP HTTP transport (SOAP 1.2) for Web services
More HTTP friendly (it uses MIME multipart); not originally conceived for the wireless world.
More like a communication protocol than a compression algorithm.
There appears to be no public implementation available; it is therefore not known how well it performs with respect to our criteria (compression rate, processing overhead, semantic ambiguity).
21
Recent XML Compression Proposals (2): XMill
A compression algorithm from AT&T, designed specifically for XML
Step 1 (regrouping): separate structure, layout, and data, then distribute data elements into data streams (int, char, string, base64, etc.)
Step 2: use gzip, bzip2, etc., to compress these streams
XMill typically achieves a much better compression rate on XML data than conventional compressors such as gzip and bzip2.
More processing overhead than gzip and bzip2 because of the extra step 1.
Unlike wbXML + Gzip, XMill output must be converted back to XML before the XACML policy can be processed.
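The two steps can be sketched in a heavily simplified form. This is not XMill's actual container scheme or file format: the sketch groups text by element tag only, ignores attributes and tail text, and uses Python's zlib in place of gzip. The function names regroup and compress_streams are hypothetical.

```python
import xml.etree.ElementTree as ET
import zlib
from collections import defaultdict


def regroup(xml_text: str):
    """Step 1: split a document into a structure stream and per-tag data containers."""
    structure = []                  # tag open/close events, no text content
    containers = defaultdict(list)  # tag name -> text values found under that tag

    def walk(element):
        structure.append(f"<{element.tag}>")
        if element.text and element.text.strip():
            containers[element.tag].append(element.text.strip())
        for child in element:
            walk(child)
        structure.append(f"</{element.tag}>")

    walk(ET.fromstring(xml_text))
    return structure, dict(containers)


def compress_streams(structure, containers):
    """Step 2: compress the structure stream and each data container separately."""
    streams = {"structure": "".join(structure)}
    streams.update({f"data:{tag}": "\n".join(values) for tag, values in containers.items()})
    return {name: zlib.compress(text.encode("utf-8")) for name, text in streams.items()}


if __name__ == "__main__":
    doc = (
        "<Policy><Rule><Description>first</Description></Rule>"
        "<Rule><Description>second</Description></Rule></Policy>"
    )
    structure, containers = regroup(doc)
    for name, blob in compress_streams(structure, containers).items():
        print(name, len(blob), "bytes")
```

Grouping similar values together gives the back-end compressor more homogeneous input, which is the intuition behind the regrouping step; the cost is the extra pass over the document and the need to reassemble XML before use.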
22
Conclusions and Future Directions
Suggested criteria for the use of XML-based policies in mobile devices
Reviewed and compared a variety of compression algorithms for XML
Concluded that {wbXML + transcode + Gzip} offers the best combination of compression rate and processing overhead of all algorithms tested
  –This combination is recommended for use with XML-based security policies in mobile computing environments
Directions for further work
  –Keep an eye on ASN.1 (will public implementations match theoretical results?)
  –The compression rate of wbXML with transcode can be improved by adding more transcodes to the table (e.g., built-in function names, data type names, etc.). How much improvement can be gained?
  –Experiments on XMill (perform a more detailed comparison with wbXML to determine the best algorithm for this environment)
23
References
[1] Uche Ogbuji. “Tip: Compress XML files for efficient transmission”, IBM DeveloperWorks, 9 April, 2004
[2] M. Cokus, D. Winkowski. “XML Sizing and Compression Study For Military Wireless Data”, XML 2002 Proceedings by deepX
[3] http://www.wapforum.org/what/technical/PROP-WBXML-19990815.pdf. “WAP Binary XML Content Format Specifications – Version 1.2”
[4] ASN.1 Site - XML. “What ASN.1 Can Offer for XML?”, http://asn1.elibel.tm.fr/xml/ June, 2004
[5] ITU-T X.694. “Information Technology – ASN.1 encoding rules – Mapping W3C XML Schema Definitions Into ASN.1”, Jan, 2004
[6] Nokia. “Nokia Position Paper: W3C Workshop on Binary Interchange of XML Information Item Sets”, Aug, 2003, http://www.w3.org/2003/08/binary-interchange-workshop/02-Nokia-Position-Paper_02.htm
[7] P. Sandoz, et al. Sun Microsystems. “Fast Web Services”, July, 2003, W3C Workshop on Binary Interchange of XML Information Item Sets
[8] http://www.devx.com/xml/article/16754/0/page/1 “Compressing XML”
[9] M. Girardot, N. Sundaresan. “Millau, an encoding format for efficient representation and exchange of XML over the Web”, http://www9.org/w9cdrom/154/154.html
[10] http://www.gnu.org/software/gzip/gzip.html. “gzip - GNU Project - Free Software Foundation(FSF)”
[11] http://gnuwin32.sourceforge.net/packages/bzip2.htm “Bzip2 for Windows”
[12] http://www.kxml.org “kXML with wbXML support”
[13] http://www.oss.com “OSS Nokalva ASN.1/Pure Java Tools - Beta”
[14] http://www.eclipse.org/hyades/ “Hyades – Automated Software Quality Evaluation Framework”
[15] http://sourceforge.net/projects/xmill “XMill - A User Configurable XML Processor”
24
Questions