Implementing Data Compression

 

Download C# sources for this article

Abstract

This article describes how you can implement data compression when using eXpress Persistent Objects (XPO) to optimize data storage within your application.

Let's assume you are writing a straightforward bug tracking application. This application allows your users to enter bug reports (a subject and a detailed description which can hold a significant amount of text). As you can imagine, you can easily persist such a business object via XPO. The business model for this is listed below:

C#
class BugReport: XPObject {
  public string Subject;
  public string DetailedDescription;
}

You now have an object to store your bug information in, but an application of any reasonable complexity may have hundreds if not thousands of bugs. A meaningful bug report needs as much information as possible in its description so that you can replicate and address the bug. Storing hundreds, if not thousands, of DetailedDescription values will therefore take a lot of space.

To help maintain a compact database size, we will use data compression.

Applying Compression

There are many different compression algorithms available to developers. In our example we've chosen the "ZLib" library, shipped by Mike Krueger (mike@icsharpcode.net), to perform the necessary compression and decompression.

This library's methods allow you to work with the system Stream class as a data layer for both encoding and decoding operations. The most obvious moment to perform compression and decompression operations is during object manipulation phases, such as data reading and data saving. XPO simplifies this process via its ValueConverter class and ValueConverterAttribute (for more information review XPO’s online documentation). We have created a new class called StringCompressionConverter which implements compression/decompression of string values in overridden methods, ConvertToStorageType and ConvertFromStorageType respectively. The StorageType property of the StringCompressionConverter class specifies the type of database column which holds the compressed data - byte array in our case.

C#
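using System;
using DevExpress.Xpo;
using DevExpress.Xpo.Metadata;

class BugReport : XPObject {
   public string Subject;
   // Store this field through the converter, so the column holds compressed bytes.
   [ValueConverter(typeof(StringCompressionConverter))]
   public string DetailedDescription;
}

class StringCompressionConverter : ValueConverter {
   // Called when the object is saved: string -> compressed UTF-8 bytes.
   public override object ConvertToStorageType(object value) {
      string text = (string)value;
      if (string.IsNullOrEmpty(text)) {
         // Null and empty strings are both stored as a null column value.
         return null;
      }
      else {
         return CompressionUtils.CompressData(
             System.Text.Encoding.UTF8.GetBytes(text));
      }
   }
   // Called when the object is loaded: compressed bytes -> string.
   public override object ConvertFromStorageType(object value) {
      byte[] decompressedValue = CompressionUtils.DecompressData((byte[])value);
      if (decompressedValue != null) {
         return System.Text.Encoding.UTF8.GetString(decompressedValue);
      }
      // A null column value is read back as an empty string.
      else { return string.Empty; }
   }
   // Type of the database column that holds the compressed data.
   public override Type StorageType {
      get { return typeof(byte[]); }
   }
}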

The implementation for the raw data compression/decompression routines is shown below.

C#
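using System;
using System.Collections;
using System.IO;
using ICSharpCode.SharpZipLib.Zip.Compression;
using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

public class CompressionUtils {
   // Deflates the input and returns only the compressed bytes.
   public static byte[] CompressData(byte[] data) {
      if (data != null) {
         MemoryStream ms = new MemoryStream();
         Deflater deflater = new Deflater();
         DeflaterOutputStream outStream = new DeflaterOutputStream(ms, deflater);

         outStream.Write(data, 0, data.Length);
         // Finish writes any remaining compressed data to the underlying stream.
         outStream.Finish();

         // ToArray copies only the bytes written; GetBuffer would also return
         // the unused tail of the stream's internal buffer.
         return ms.ToArray();
      }
      else {
         return null;
      }
   }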

   private const int BufferSize = 5196;
   private class ReadItem {
      public byte[] Buffer = new byte[BufferSize];
      public int Length = 0;
   }
   public static byte[] DecompressData(byte[] data) {
      byte[] result = null;
      if (data != null) {
         ArrayList dataBlocks = new ArrayList();
         MemoryStream ms = new MemoryStream(data);
         InflaterInputStream inStream = new InflaterInputStream(ms);
         int pos = 0;
         while (true) {
            ReadItem readItem = new ReadItem();
            readItem.Length = inStream.Read(readItem.Buffer, 0, BufferSize);
            if (readItem.Length <= 0) {
               break;
            }
            dataBlocks.Add(readItem);
            pos += readItem.Length;
         }
         result = new byte[pos];
         int resultPosition = 0;
         for (int i = 0; i < dataBlocks.Count; ++i) {
            Array.Copy(((ReadItem)dataBlocks[i]).Buffer, 0, result, resultPosition,
               ((ReadItem)dataBlocks[i]).Length);
            resultPosition += ((ReadItem)dataBlocks[i]).Length;
         }
      }
      return result;
   }
}
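
As a way to check the round trip and see the size saving, here is a minimal, dependency-free sketch. It is not the article's implementation: it swaps in DeflateStream from the .NET System.IO.Compression namespace for the library used above, and the class name DeflateRoundTrip and the sample text are hypothetical. The bytes it produces should not be assumed to be interchangeable with those written by CompressionUtils.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper for illustration; uses the framework's DeflateStream
// instead of the third-party library used in the article.
public static class DeflateRoundTrip {
   public static byte[] Compress(byte[] data) {
      using (MemoryStream ms = new MemoryStream()) {
         // leaveOpen = true so the MemoryStream stays usable after disposal.
         using (DeflateStream deflate =
                new DeflateStream(ms, CompressionMode.Compress, true)) {
            deflate.Write(data, 0, data.Length);
         } // disposing the DeflateStream writes the final compressed block
         return ms.ToArray(); // only the bytes actually written
      }
   }

   public static byte[] Decompress(byte[] data) {
      using (MemoryStream input = new MemoryStream(data))
      using (DeflateStream inflate =
             new DeflateStream(input, CompressionMode.Decompress))
      using (MemoryStream output = new MemoryStream()) {
         inflate.CopyTo(output); // reads until the end of the compressed data
         return output.ToArray();
      }
   }

   public static void Main() {
      // Build a repetitive sample description (made-up text).
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < 200; i++) {
         sb.Append("Steps to reproduce: open the form, click Save, observe the crash. ");
      }
      string description = sb.ToString();

      byte[] raw = Encoding.UTF8.GetBytes(description);
      byte[] packed = Compress(raw);
      string restored = Encoding.UTF8.GetString(Decompress(packed));

      Console.WriteLine("Raw bytes:        " + raw.Length);
      Console.WriteLine("Compressed bytes: " + packed.Length);
      Console.WriteLine("Round trip OK:    " + (restored == description));
   }
}
```

Repetitive text such as stack traces or reproduction steps tends to compress well; short or already-compressed content may not shrink at all, so it is worth measuring with representative bug descriptions.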

Optimizing Access to Compressed Data

The sample above works just fine, but it has a serious flaw: compressing and decompressing requires a significant amount of processing. This is acceptable for one or two records, but imagine what happens if you have to populate a listbox with every subject in the database. In this case every DetailedDescription value is loaded and decompressed, even though only the Subject is needed.

XPO has an elegant solution to optimize the loading of properties with large amounts of data: Delayed (or Demand) loading. A delayed property is fetched only when it is actually asked for. You enable this behavior with the Delayed attribute.

In this example, we simply give our DetailedDescription property the Delayed attribute and associate it with an XPDelayedProperty object. When anyone accesses the public DetailedDescription property (and thus asks for the data), its value is automatically loaded, decompressed and returned to the caller via the getter method.

C#
class BugReport: XPObject {
  public string Subject;
  XPDelayedProperty detailedDescription =
    new XPDelayedProperty();

  [Delayed("detailedDescription"), Persistent]
  [ValueConverter(typeof(StringCompressionConverter))]
  public string DetailedDescription {
    get { return (string)detailedDescription.Value; }
    set { detailedDescription.Value = value; }
  }
}
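
To illustrate the idea behind delayed loading without depending on XPO, here is a small sketch. It is an analogy only: the framework's Lazy<string> stands in for XPDelayedProperty, and LazyBugReport, LazyDemo and the Loads counter are hypothetical names introduced for this example. XPO's actual mechanism fetches the column from the database on first access; here a delegate plays that role.

```csharp
using System;

// Hypothetical stand-in for a persistent object; not an XPO class.
public class LazyBugReport {
   public string Subject;
   // The expensive value is produced only when first requested.
   private readonly Lazy<string> detailedDescription;

   public LazyBugReport(string subject, Func<string> loadDescription) {
      Subject = subject;
      detailedDescription = new Lazy<string>(loadDescription);
   }

   public string DetailedDescription {
      get { return detailedDescription.Value; }
   }
}

public static class LazyDemo {
   // Counts how often the expensive load (fetch + decompress) runs.
   public static int Loads = 0;

   public static string LoadDescription() {
      Loads++;
      return "Long description that would be fetched and decompressed here.";
   }

   public static void Main() {
      LazyBugReport report = new LazyBugReport("Crash on save", LoadDescription);

      // Listing subjects does not touch the description.
      Console.WriteLine(report.Subject);
      Console.WriteLine(Loads); // prints 0

      // First access triggers the load; later accesses reuse the value.
      string first = report.DetailedDescription;
      string second = report.DetailedDescription;
      Console.WriteLine(Loads); // prints 1
   }
}
```

The point of the design is the same as in the XPO version above: code that only reads Subject never pays the cost of loading and decompressing DetailedDescription.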

That is all you need to do to implement data compression.
