Searching files in a zip archive in memory with PowerShell

Tonight I needed a function that searches the text files in a very large zip archive to find the one containing a specific value. To keep it fast, I wanted to read the files in memory rather than extract them to disk first. While that may be too specific to be useful for most, I thought an example of how to retrieve a file within a zip archive and parse its content might interest a wider audience and, more importantly, serve as a useful reference for myself.

P.S. – I am happy to share the more complex end result if anyone tells me it’s useful to them.

What you will need

While the new Expand-Archive and Compress-Archive cmdlets are handy for basic zip archive creation and extraction, they are not much help when you need to get down to the item level within a zip archive. For that you need the classes in the System.IO.Compression namespace. In Windows PowerShell the System.IO.Compression.FileSystem assembly that provides them is not loaded by default, so you must load it explicitly. For this simple example, these are the key enablers:

ZipFile class, to open the archive and read its entries
https://msdn.microsoft.com/en-us/library/system.io.compression.zipfile(v=vs.110).aspx

GetEntry method, to retrieve an individual file in the archive by name
https://msdn.microsoft.com/en-us/library/system.io.compression.ziparchive.getentry(v=vs.110).aspx?cs-save-lang=1&cs-lang=vb#code-snippet-1

StreamReader class, to read the file into memory and search content
https://msdn.microsoft.com/en-us/library/system.io.streamreader(v=vs.110).aspx

Sample Code

In this example, we will open a zip archive named scripts.zip on the D:\ drive, retrieve a file named ConnectToAzure.txt, and check whether it contains the value Connect-MsolService.

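Load the compression assembly (required in Windows PowerShell) and select your zip archive.

Add-Type -AssemblyName System.IO.Compression.FileSystem
$ZipArchive = "d:\scripts.zip"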

Open the archive for reading. OpenRead returns a ZipArchive object.

$ZipStream = [io.compression.zipfile]::OpenRead($ZipArchive)

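Select the item in the archive. Notice that the name must include the folder path inside the archive. GetEntry matches the name exactly, including case, and returns $null if no entry has that name, so watch for stray spaces.

$ZipItem = $ZipStream.GetEntry('Scripts/ConnectToAzure.txt')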
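If GetEntry comes back empty, listing the names the archive actually contains usually shows why. The following is a minimal sketch and not part of the original walkthrough; the function name Get-ZipEntryName and its -Path parameter are made up for illustration.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: returns the full name of every entry in a zip archive.
function Get-ZipEntryName {
    param([Parameter(Mandatory)][string]$Path)

    # .NET resolves relative paths against its own working directory,
    # so convert to a full file system path first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    $archive = [System.IO.Compression.ZipFile]::OpenRead($fullPath)
    try {
        # FullName is the string that GetEntry() compares against.
        $archive.Entries | ForEach-Object { $_.FullName }
    }
    finally {
        # Release the file handle even if something above fails.
        $archive.Dispose()
    }
}
```

For the archive in this post, the call would be Get-ZipEntryName -Path 'd:\scripts.zip'. Any name it returns can be passed to GetEntry unchanged.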

Open the item from the archive and wrap its stream in a StreamReader.

$ItemReader = New-Object System.IO.StreamReader($ZipItem.Open())

Use the StreamReader to read the item into memory. $DocItemSet holds the contents of the file as a single string.

$DocItemSet = $ItemReader.ReadToEnd()
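ReadToEnd() pulls the whole item into one string, which is fine for small scripts. For very large items, one possible alternative is to read line by line and stop at the first match. This is only a sketch under that assumption: the function name Test-StreamContainsText is hypothetical, and it will miss a value that spans a line break.

```powershell
# Hypothetical helper: returns $true as soon as one line of the stream contains $Value.
function Test-StreamContainsText {
    param(
        [Parameter(Mandatory)][System.IO.Stream]$Stream,
        [Parameter(Mandatory)][string]$Value
    )

    $reader = New-Object System.IO.StreamReader($Stream)
    try {
        # ReadLine() returns $null at the end of the stream.
        while ($null -ne ($line = $reader.ReadLine())) {
            # String.Contains() is a case-sensitive comparison.
            if ($line.Contains($Value)) { return $true }
        }
        return $false
    }
    finally {
        # Disposing the reader also closes the underlying stream.
        $reader.Dispose()
    }
}
```

With the variables from this post, the call would be Test-StreamContainsText -Stream $ZipItem.Open() -Value 'Connect-MsolService'.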

Search the file contents for the desired value. Contains returns True or False, and the match is case-sensitive.

$DocItemSet.Contains("Connect-MsolService")
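The steps above can be combined into a function that checks every text file in an archive rather than a single known item. This is a rough sketch of that idea, not the author's finished function: the name Find-ZipEntryText, its parameters, and the default *.txt filter are all assumptions made for illustration. It also disposes the reader and the archive, which releases the lock on the zip file.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: returns the names of entries whose text contains $Value.
function Find-ZipEntryText {
    param(
        [Parameter(Mandatory)][string]$Path,
        [Parameter(Mandatory)][string]$Value,
        [string]$Filter = '*.txt'   # assumed default; wildcard matched against the entry name
    )

    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $archive = [System.IO.Compression.ZipFile]::OpenRead($fullPath)
    try {
        foreach ($entry in $archive.Entries) {
            # Skip entries that do not match the filter.
            if ($entry.FullName -notlike $Filter) { continue }

            $reader = New-Object System.IO.StreamReader($entry.Open())
            try {
                $text = $reader.ReadToEnd()
            }
            finally {
                $reader.Dispose()
            }

            # Case-sensitive search, same as the single-file example.
            if ($text.Contains($Value)) { $entry.FullName }
        }
    }
    finally {
        $archive.Dispose()
    }
}
```

For the archive in this post, the call would be Find-ZipEntryText -Path 'd:\scripts.zip' -Value 'Connect-MsolService'.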

Happy PowerShelling!
