This article shows you how to implement the LZW lossless compression algorithm in VBScript. It can also be used in VBA as is or almost as is.
https://en.wikipedia.org/wiki/Lempel...%E2%80%93Welch
The LZW Algorithm
The LZW algorithm is a compression technique that results in no loss of data. It builds a dictionary of codes and values used in the compression on the fly. The dictionary is not stored with the compressed file and is discarded after compression. During decompression, the dictionary is rebuilt from the compressed data.
The LZW algorithm functions by:
- Initialize the dictionary to contain all strings of length one
- Find the longest string in the dictionary that matches the current input
- Output the dictionary code for that matching input
- Append the next character from the input to the matching input string and add it as a new dictionary value with a new code
- Go to step 2
The Code and How to Use It
The code below is an example implementation of the LZW Algorithm in VBScript and is easily portable to VBA. The functions are LZWCompress and LZWUncompress and take the file path as the parameter.
The dictionary is initialized to the full range of 8 bit values and uses 16 bits for each key. The dictionary reinitializes after reaching 65535 keys I did this for ease of implementation even though that also means it's not as compressed as it can be.
My tests on large Access databases have shown an 86% compression level compared to 93% compression using "ultra" level compression with the LZMA algorithm in 7zip.
My implementation of the algorithm is also slow due to the fact that I read the file 1 byte at a time. Again, this was due to ease of implementation. Would be more efficient to read a large amount of the file into memory rather than byte by byte.
Expand|Select|Wrap|Line Numbers
- Option Explicit
- Const ForReading = 1, ForWriting = 2, ForAppending = 8
- Function LZWCompress(strPath)
- Dim oFS, oFRead, oFWrite, oDict, strNext, strCurrent, intMaxCode, i
- Set oDict = CreateObject("Scripting.Dictionary")
- Set oFS = CreateObject("Scripting.FileSystemObject")
- Set oFRead = oFS.OpenTextFile(strPath, ForReading)
- Set oFWrite = oFS.OpenTextFile(strPath & ".lzw", ForWriting, True)
- Set oFS = Nothing
- intMaxCode = 255
- strCurrent = oFRead.Read(1)
- For i = 0 To 255
- oDict.Add Chr(i), i
- Next
- Do Until oFRead.AtEndOfStream
- strNext = oFRead.Read(1)
- If oDict.Exists(strCurrent & strNext) Then
- strCurrent = strCurrent & strNext
- Else
- oFWrite.Write(Chr(CByte(oDict.Item(strCurrent) \ 256)) & Chr(CByte(oDict.Item(strCurrent) Mod 256)))
- intMaxCode = intMaxCode + 1
- oDict.Add strCurrent & strNext, intMaxCode
- strCurrent = strNext
- If intMaxCode = 65535 Then
- intMaxCode = 255
- oDict.RemoveAll
- For i = 0 To 255
- oDict.Add Chr(i), i
- Next
- End If
- End If
- Loop
- oFWrite.Write(Chr(CByte(oDict.Item(strCurrent) \ 256)) & Chr(CByte(oDict.Item(strCurrent) Mod 256)))
- oFRead.Close
- oFWrite.Close
- Set oFRead = Nothing
- Set oFWrite = Nothing
- Set oDict = Nothing
- End Function
- Function LZWUncompress(strPath)
- Dim oFS, oFRead, oFWrite, oDict, intNext, intCurrent, intMaxCode, i, strNext
- Set oDict = CreateObject("Scripting.Dictionary")
- Set oFS = CreateObject("Scripting.FileSystemObject")
- Set oFRead = oFS.OpenTextFile(strPath, ForReading)
- Set oFWrite = oFS.OpenTextFile(strPath & ".unc", ForWriting, True)
- Set oFS = Nothing
- intMaxCode = 255
- strNext = oFRead.Read(2)
- intCurrent = 0
- For i = 1 To Len(strNext)
- intCurrent = intCurrent + 256 ^ (Len(strNext) - i) * Asc(Mid(strNext, i, 1))
- Next
- For i = 0 To 255
- oDict.Add i, Chr(i)
- Next
- Do Until oFRead.AtEndOfStream
- oFWrite.Write(oDict.Item(intCurrent))
- intMaxCode = intMaxCode + 1
- strNext = oFRead.Read(2)
- intNext = 0
- For i = 1 To Len(strNext)
- intNext = intNext + 256 ^ (Len(strNext) - i) * Asc(Mid(strNext, i, 1))
- Next
- If oDict.Exists(intNext) Then
- oDict.Add intMaxCode, oDict.Item(intCurrent) & Left(oDict.Item(intNext), 1)
- Else
- oDict.Add intMaxCode, oDict.Item(intCurrent) & Left(oDict.Item(intCurrent), 1)
- End If
- If intMaxCode = 65535 Then
- intMaxCode = 255
- oDict.RemoveAll
- For i = 0 To 255
- oDict.Add i, Chr(i)
- Next
- End If
- intCurrent = intNext
- Loop
- oFWrite.Write(oDict.Item(intCurrent))
- oFRead.Close
- oFWrite.Close
- Set oFRead = Nothing
- Set oFWrite = Nothing
- Set oDict = Nothing
- End Function