Implementing Data Compression
Download C# sources for this article
Abstract
This article describes how you can implement data compression when using eXpress Persistent Objects (XPO) to optimize data storage within your application. Let's assume you are writing a straightforward bug tracking application. This application allows your users to enter bug reports (a subject and a detailed description which can hold a significant amount of text). As you can imagine, you can easily persist such a business object via XPO. The business model for this is listed below:
    // C#
    class BugReport : XPObject {
        public string Subject;
        public string DetailedDescription;
    }

    ' Visual Basic
    Class BugReport
        Inherits XPObject
        Public Subject As String
        Public DetailedDescription As String
    End Class
You now have an object to store your bug information in, but applications of any reasonable complexity may have hundreds, if not thousands, of bugs. A meaningful bug report needs as much information as possible in its description so that you can reproduce and fix the bug, so the database has to hold hundreds, if not thousands, of lengthy DetailedDescription values. To help keep the database compact, we will use data compression.
Applying Compression
Many compression algorithms are available to developers. In our example we've chosen the "ZLib" library, provided by Mike Krueger (mike@icsharpcode.net), to perform compression and decompression. Its methods work with the system Stream class as the data layer for both encoding and decoding operations.
The natural place to compress and decompress is where object data is saved and read. XPO simplifies this via its ValueConverter class and ValueConverterAttribute (for more information, review XPO's online documentation). We have created a new class called StringCompressionConverter, which compresses string values in its overridden ConvertToStorageType method and decompresses them in ConvertFromStorageType. The StorageType property of the StringCompressionConverter class specifies the type of the database column that holds the compressed data: a byte array in our case.
    // C#
    using System;
    using DevExpress.Xpo;
    using DevExpress.Xpo.Metadata;   // ValueConverter

    class BugReport : XPObject {
        public string Subject;
        // Persist this field through the compressing converter.
        [ValueConverter(typeof(StringCompressionConverter))]
        public string DetailedDescription;
    }

    class StringCompressionConverter : ValueConverter {
        // Called on save: string -> compressed UTF-8 bytes (null for null or empty text).
        public override object ConvertToStorageType(object value) {
            string text = (string)value;
            if (text == null || text.Length == 0) {
                return null;
            }
            return CompressionUtils.CompressData(System.Text.Encoding.UTF8.GetBytes(text));
        }

        // Called on load: compressed bytes -> string (empty string for NULL).
        public override object ConvertFromStorageType(object value) {
            byte[] decompressedValue = CompressionUtils.DecompressData((byte[])value);
            if (decompressedValue != null) {
                return System.Text.Encoding.UTF8.GetString(decompressedValue);
            }
            return string.Empty;
        }

        // The database column holds a byte array.
        public override Type StorageType {
            get { return typeof(byte[]); }
        }
    }

    ' Visual Basic
    Imports System
    Imports DevExpress.Xpo
    Imports DevExpress.Xpo.Metadata   ' ValueConverter

    Class BugReport
        Inherits XPObject
        Public Subject As String
        <ValueConverter(GetType(StringCompressionConverter))> _
        Public DetailedDescription As String
    End Class

    Class StringCompressionConverter
        Inherits ValueConverter

        ' Called on save: string -> compressed UTF-8 bytes (Nothing for Nothing or empty text).
        Public Overrides Function ConvertToStorageType(ByVal value As Object) As Object
            Dim text As String = CType(value, String)
            If text Is Nothing OrElse text.Length = 0 Then
                Return Nothing
            End If
            Return CompressionUtils.CompressData(System.Text.Encoding.UTF8.GetBytes(text))
        End Function

        ' Called on load: compressed bytes -> string (empty string for NULL).
        Public Overrides Function ConvertFromStorageType(ByVal value As Object) As Object
            Dim decompressedValue As Byte() = CompressionUtils.DecompressData(CType(value, Byte()))
            If Not decompressedValue Is Nothing Then
                Return System.Text.Encoding.UTF8.GetString(decompressedValue)
            End If
            Return String.Empty
        End Function

        ' The database column holds a byte array.
        Public Overrides ReadOnly Property StorageType() As Type
            Get
                Return GetType(Byte())
            End Get
        End Property
    End Class
The implementation for the raw data compression/decompression routines is shown below.
    // C#
    using System;
    using System.Collections;
    using System.IO;
    // Deflater and the stream classes are assumed to come from SharpZipLib.
    using ICSharpCode.SharpZipLib.Zip.Compression;
    using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

    public class CompressionUtils {
        public static byte[] CompressData(byte[] data) {
            if (data == null) {
                return null;
            }
            MemoryStream ms = new MemoryStream();
            Deflater deflater = new Deflater();
            DeflaterOutputStream outStream = new DeflaterOutputStream(ms, deflater);

            outStream.Write(data, 0, data.Length);
            outStream.Flush();
            outStream.Finish();

            // ToArray returns only the bytes written; GetBuffer would also
            // return the unused tail of the stream's internal buffer.
            return ms.ToArray();
        }

        private const int BufferSize = 5196;

        // One chunk read from the inflater, with the number of valid bytes in it.
        private class ReadItem {
            public byte[] Buffer = new byte[BufferSize];
            public int Length = 0;
        }

        public static byte[] DecompressData(byte[] data) {
            byte[] result = null;
            if (data != null) {
                ArrayList dataBlocks = new ArrayList();
                MemoryStream ms = new MemoryStream(data);
                InflaterInputStream inStream = new InflaterInputStream(ms);
                int pos = 0;
                // Read fixed-size chunks until the stream is exhausted.
                while (true) {
                    ReadItem readItem = new ReadItem();
                    readItem.Length = inStream.Read(readItem.Buffer, 0, BufferSize);
                    if (readItem.Length <= 0) {
                        break;
                    }
                    dataBlocks.Add(readItem);
                    pos += readItem.Length;
                }
                // Concatenate the chunks into one array of exactly pos bytes.
                result = new byte[pos];
                int resultPosition = 0;
                for (int i = 0; i < dataBlocks.Count; ++i) {
                    ReadItem item = (ReadItem)dataBlocks[i];
                    Array.Copy(item.Buffer, 0, result, resultPosition, item.Length);
                    resultPosition += item.Length;
                }
            }
            return result;
        }
    }
    ' Visual Basic
    Imports System
    Imports System.Collections
    Imports System.IO
    ' Deflater and the stream classes are assumed to come from SharpZipLib.
    Imports ICSharpCode.SharpZipLib.Zip.Compression
    Imports ICSharpCode.SharpZipLib.Zip.Compression.Streams

    Public Class CompressionUtils
        Public Shared Function CompressData(ByVal data() As Byte) As Byte()
            If data Is Nothing Then
                Return Nothing
            End If
            Dim ms As MemoryStream = New MemoryStream()
            Dim deflater As Deflater = New Deflater()
            Dim outStream As DeflaterOutputStream = New DeflaterOutputStream(ms, deflater)
            outStream.Write(data, 0, data.Length)
            outStream.Flush()
            outStream.Finish()
            ' ToArray returns only the bytes written; GetBuffer would also
            ' return the unused tail of the stream's internal buffer.
            Return ms.ToArray()
        End Function

        Private Const BufferSize As Integer = 5196

        ' One chunk read from the inflater, with the number of valid bytes in it.
        Private Class ReadItem
            Public Buffer As Byte()
            Public Length As Integer = 0
            Public Sub New()
                ' VB array bounds are inclusive, so this allocates BufferSize bytes.
                Buffer = New Byte(BufferSize - 1) {}
            End Sub
        End Class

        Public Shared Function DecompressData(ByVal data As Byte()) As Byte()
            Dim result As Byte() = Nothing
            If Not data Is Nothing Then
                Dim dataBlocks As ArrayList = New ArrayList()
                Dim ms As MemoryStream = New MemoryStream(data)
                Dim inStream As InflaterInputStream = New InflaterInputStream(ms)
                Dim pos As Integer = 0
                ' Read fixed-size chunks until the stream is exhausted.
                Do While True
                    Dim readItem As ReadItem = New ReadItem()
                    readItem.Length = inStream.Read(readItem.Buffer, 0, BufferSize)
                    If readItem.Length <= 0 Then
                        Exit Do
                    End If
                    dataBlocks.Add(readItem)
                    pos += readItem.Length
                Loop
                ' Upper bound pos - 1 gives exactly pos bytes (an empty array when pos = 0).
                result = New Byte(pos - 1) {}
                Dim resultPosition As Integer = 0
                Dim i As Integer
                For i = 0 To dataBlocks.Count - 1
                    Dim item As ReadItem = CType(dataBlocks(i), ReadItem)
                    Array.Copy(item.Buffer, 0, result, resultPosition, item.Length)
                    resultPosition += item.Length
                Next
            End If
            Return result
        End Function
    End Class
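To check the compress/decompress pair in isolation, a round-trip test is a useful sanity check. The following is only a minimal sketch: it swaps in the framework's built-in System.IO.Compression.DeflateStream for the third-party library used in this article, and the class name RoundTripSketch and the sample text are hypothetical, chosen for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, not part of XPO or of the article's CompressionUtils.
public static class RoundTripSketch
{
    // Compresses with the built-in DeflateStream (a stand-in for the article's Deflater).
    public static byte[] Compress(byte[] data)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (DeflateStream deflate = new DeflateStream(ms, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            } // disposing the DeflateStream writes the final compressed block
            return ms.ToArray(); // exactly the bytes written, no unused buffer tail
        }
    }

    // Reverses Compress by copying the inflated bytes into a new array.
    public static byte[] Decompress(byte[] data)
    {
        using (MemoryStream input = new MemoryStream(data))
        using (DeflateStream inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            inflate.CopyTo(output);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        // A repetitive description, similar in spirit to a long bug report.
        string description = new StringBuilder()
            .Insert(0, "Steps to reproduce: click Save twice. ", 200)
            .ToString();

        byte[] raw = Encoding.UTF8.GetBytes(description);
        byte[] packed = Compress(raw);
        string restored = Encoding.UTF8.GetString(Decompress(packed));

        Console.WriteLine("raw bytes:     " + raw.Length);
        Console.WriteLine("packed bytes:  " + packed.Length);
        Console.WriteLine("round trip ok: " + (restored == description));
    }
}
```

Note that MemoryStream.ToArray copies only the bytes that were written, whereas MemoryStream.GetBuffer exposes the whole internal buffer, including any unused capacity. For data that is about to be stored, ToArray is usually the safer choice.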
Optimizing Access to Compressed Data
The sample above works, but it has a serious flaw: compressing and decompressing requires a significant amount of processing. This is acceptable for one or two records, but imagine what happens if you have to populate a list box with every subject in the database. In that case, we would have to decompress every single DetailedDescription even though we only need the Subject. XPO has an elegant solution for properties that hold large amounts of data: delayed (on-demand) loading. A delayed property is fetched only when it is actually requested, and is implemented with the Delayed attribute. In this example, we simply mark our DetailedDescription property with the Delayed attribute and associate it with an XPDelayedProperty object. When anyone reads the public DetailedDescription property (and thus asks for the data), its value is loaded, automatically decompressed, and returned to the caller via the getter.
    // C#
    class BugReport : XPObject {
        public string Subject;
        // Holds the value and fetches it from the database on first access.
        XPDelayedProperty detailedDescription = new XPDelayedProperty();

        // The Delayed attribute names the XPDelayedProperty field above.
        [Delayed("detailedDescription"), Persistent]
        [ValueConverter(typeof(StringCompressionConverter))]
        public string DetailedDescription {
            get { return (string)detailedDescription.Value; }
            set { detailedDescription.Value = value; }
        }
    }

    ' Visual Basic
    Class BugReport
        Inherits XPObject
        Public Subject As String
        ' Holds the value and fetches it from the database on first access.
        Private detailedDescription_ As XPDelayedProperty = New XPDelayedProperty()

        ' The Delayed attribute names the XPDelayedProperty field above.
        <Delayed("detailedDescription_"), Persistent(), _
         ValueConverter(GetType(StringCompressionConverter))> _
        Public Property DetailedDescription() As String
            Get
                Return CType(detailedDescription_.Value, String)
            End Get
            Set(ByVal Value As String)
                detailedDescription_.Value = Value
            End Set
        End Property
    End Class
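The effect of delayed loading can be illustrated without a database. The sketch below is not XPO code: LazyBugReport, DecodeCount and the loader delegate are hypothetical names, and the framework's Lazy<T> stands in for XPDelayedProperty. It only shows the general idea that listing subjects never triggers the expensive load-and-decompress step, which runs once, on first access to the description.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a persistent object; this is not an XPO class.
public class LazyBugReport
{
    // Counts how many times the expensive load-and-decompress step has run.
    public static int DecodeCount = 0;

    public string Subject;
    private readonly Lazy<string> detailedDescription;

    public LazyBugReport(string subject, Func<string> loadAndDecompress)
    {
        Subject = subject;
        // The delegate runs only on the first read of DetailedDescription.
        detailedDescription = new Lazy<string>(() =>
        {
            DecodeCount++;
            return loadAndDecompress();
        });
    }

    public string DetailedDescription
    {
        get { return detailedDescription.Value; }
    }
}

public static class DelayedLoadingSketch
{
    public static void Main()
    {
        List<LazyBugReport> reports = new List<LazyBugReport>();
        for (int i = 0; i < 3; i++)
        {
            int id = i; // capture a copy for the lambda
            reports.Add(new LazyBugReport("Bug " + id, () => "Long description of bug " + id));
        }

        // Filling a list with subjects touches no description.
        foreach (LazyBugReport report in reports)
        {
            Console.WriteLine(report.Subject);
        }
        Console.WriteLine("decodes after listing subjects: " + LazyBugReport.DecodeCount); // 0

        // Opening one report decodes it once; the second read uses the cached value.
        Console.WriteLine(reports[1].DetailedDescription);
        Console.WriteLine(reports[1].DetailedDescription);
        Console.WriteLine("decodes after opening one report: " + LazyBugReport.DecodeCount); // 1
    }
}
```

In XPO itself, the XPDelayedProperty field and the Delayed attribute play the role that Lazy<T> plays here, with the value converter applied when the value is actually fetched.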
That is all you need to do to implement data compression. To learn more about XPO, please write to us at info@devexpress.com.