Friday, February 19, 2010

PowerShell, 7-Zip, Amazon S3 Upload Script with AES-256 Encryption

I was recently tasked with finding a way to store some backup files from our server in a secure and reliable off-site location. After talking with our hosting provider, who wanted around $600 a month for off-site tape rotation, we decided to look at using Amazon Simple Storage Service (Amazon S3) to store the files in the cloud instead. We needed a way to automate the upload process and make sure that the data was encrypted, so I spent a few days working on a PowerShell script (using the excellent PowerGUI Script Editor) that uses 7-Zip to create a .7z archive with AES-256 encryption and then sends it up to Amazon S3 using the Amazon Web Services SDK for .NET. Here is the script:

It seems to work pretty well so far, taking about 5-10 minutes to zip and encrypt 1GB of SQL backups down to 100MB and then upload it to Amazon S3. From there we can use tools like CloudBerry S3 Explorer to browse or download files when needed. The monthly cost to keep data on Amazon S3 is $0.150 per GB, with $0.10 per GB transfer in and $0.150 per GB transfer out. With a one-week backup retention and minimal data-out transfers we expect to pay around $10 to $20 a month and should be able to access the data much more quickly than if we were using off-site tape storage. Cloud computing FTW!
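The pricing arithmetic above can be sketched as a small estimator. This is only an illustration: the function name `Get-S3MonthlyCost` and the sample volumes in the usage comment are hypothetical, and the rates are simply the ones quoted in this post, which may no longer match Amazon's current pricing.

```powershell
# Hypothetical helper: estimates a monthly S3 bill from the per-GB rates
# quoted in this post (storage $0.15, transfer in $0.10, transfer out $0.15).
function Get-S3MonthlyCost {
    param(
        [decimal]$StoredGB      = 0,   # average GB kept in the bucket
        [decimal]$TransferInGB  = 0,   # GB uploaded during the month
        [decimal]$TransferOutGB = 0    # GB downloaded during the month
    )
    $storageRate     = 0.15d   # USD per GB-month
    $transferInRate  = 0.10d   # USD per GB uploaded
    $transferOutRate = 0.15d   # USD per GB downloaded

    return ($StoredGB * $storageRate) +
           ($TransferInGB * $transferInRate) +
           ($TransferOutGB * $transferOutRate)
}

# Usage with made-up volumes (10 GB stored, 3 GB uploaded, nothing downloaded):
# Get-S3MonthlyCost -StoredGB 10 -TransferInGB 3
```

Decimal arithmetic is used so that small currency amounts do not pick up floating-point rounding noise.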
UPDATE 6/14/2010: I just posted the script used to create the Database backup folder.

16 comments:

andy01 said...

Thank you for posting a note on CloudBerry Explorer! I just want to add that CloudBerry Explorer comes with PowerShell command line interface that you can use instead of C# if you like. You can learn more about it here

Andy, CloudBerry Lab team.

Greg Bray said...

Thanks for the comment. I did see that CloudBerry can be used with PowerShell, but the above script is going to be deployed on a production server that will not have CloudBerry installed, so the Amazon SDK seemed like a better fit. I plan on writing another script that can be used to download the data and test the restore process, and I may end up trying the CloudBerry PowerShell cmdlets for that. I definitely find CloudBerry very useful for browsing S3, and you have done a very good job of adding a user interface to what is otherwise an API-only process!

Anonymous said...

It sends the file up, but when I try to download it I get a message after typing the password that the zip appears to be corrupt. The log looks clean, and I created the same directory structure.

archive is unknown format or damaged.

Greg Bray said...

Sorry about that... it looks like it was an issue with how PowerShell handles command line arguments when calling 7-Zip. I tried to get it to accept passwords that had spaces in them, but I guess that didn't work. I reworked the call to 7-Zip on line 082 so it should now pass all of the parameters correctly. Just update that line and make sure that your password is all one word.
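One common way to avoid this kind of quoting problem is to build the 7-Zip arguments as an array and pass them with splatting, so PowerShell hands each switch to 7z.exe as a separate argument. The sketch below is not the line from the original script; the function name, the paths, and the choice of `-mhe=on` (header encryption) are assumptions made for illustration.

```powershell
# Hypothetical helper: builds the 7-Zip argument list as an array so each
# switch is passed to 7z.exe as its own argument.
function Get-SevenZipArguments {
    param(
        [Parameter(Mandatory = $true)][string]$ArchivePath,
        [Parameter(Mandatory = $true)][string]$SourcePath,
        [Parameter(Mandatory = $true)][string]$Password,
        [ValidateRange(0, 9)][int]$CompressionLevel = 9
    )
    # Follow the advice above: reject passwords containing whitespace.
    if ($Password -match '\s') {
        throw 'Password must be a single word (no whitespace).'
    }
    return @(
        'a',                      # add files to an archive
        '-t7z',                   # 7z format, which uses AES-256 when a password is set
        "-mx$CompressionLevel",   # compression level, 9 = ultra
        '-mhe=on',                # assumption: also encrypt archive headers (file names)
        "-p$Password",            # password switch, no space between -p and the value
        $ArchivePath,
        $SourcePath
    )
}

# Usage (install location and paths are assumptions):
# $sevenZip = 'C:\Program Files\7-Zip\7z.exe'
# $zipArgs  = Get-SevenZipArguments -ArchivePath 'D:\Backup\db.7z' -SourcePath 'D:\Backup\db' -Password 'OneWordSecret'
# & $sevenZip @zipArgs
```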

dave said...

Thanks for this Greg, I've been looking at sorting out my webserver backup solution for ages and was stuck on what to do from here.

I signed up for Amazon S3 and have your script running now. I tweaked the timestamp to only use days so I get a 7-day overwrite (I might do another with month/year to take out of the set).

Also another point to note when I first got to running your script on my webserver, I was getting "ERROR: 7Zip terminated with exit code 8".
After some commenting and hacking, I found if I changed the 7zip command line switch from -mx9 to -mx5 (ultra compression to normal) it worked.

I know the better the compression the less $$$ on the amazon spend, but i'll have to live with it until I can beef up the server...


thanks again for such a well considered and well written solution...
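The 7-day overwrite described in this comment can be sketched by naming the S3 object after the day of the week, so each upload replaces the one from a week earlier. The function name, prefix, and extension here are hypothetical and are not how the original script builds its key.

```powershell
# Hypothetical helper: builds an object key that repeats every 7 days,
# so this week's upload overwrites last week's object with the same name.
function Get-WeeklyObjectKey {
    param(
        [string]$Prefix = 'db-backup',
        [datetime]$Date = (Get-Date)
    )
    # DayOfWeek is an enum (Sunday..Saturday); formatting it yields its name.
    return '{0}-{1}.7z' -f $Prefix, $Date.DayOfWeek
}

# Usage: the date of this post, 19 February 2010, was a Friday.
# Get-WeeklyObjectKey -Date ([datetime]'2010-02-19')   # db-backup-Friday.7z
```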

Greg Bray said...

Looks like that exit code is "Not enough memory for operation", so reducing the compression level could make it work better in that case. Glad you could find it useful. We use the S3ObjectKey to set the retention period. If I get a chance I'll post the full script used for creating the local backup folder and setting the retention period.
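For logging, the exit codes documented for the 7-Zip command line can be turned into readable messages. This is a minimal sketch; the function name is hypothetical and the original script may report errors differently.

```powershell
# Hypothetical helper: maps 7-Zip's documented exit codes to messages.
function Get-SevenZipExitMessage {
    param([Parameter(Mandatory = $true)][int]$ExitCode)
    switch ($ExitCode) {
        0       { 'No error' }
        1       { 'Warning (non-fatal errors, e.g. some files were locked)' }
        2       { 'Fatal error' }
        7       { 'Command line error' }
        8       { 'Not enough memory for operation' }
        255     { 'User stopped the process' }
        default { "Unknown exit code $ExitCode" }
    }
}

# Usage after calling 7z.exe ($sevenZip and $zipArgs are assumed to be defined):
# & $sevenZip @zipArgs
# if ($LASTEXITCODE -ne 0) {
#     Write-Error ("7-Zip failed: " + (Get-SevenZipExitMessage -ExitCode $LASTEXITCODE))
# }
```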

joe said...

I was looking for something like this -- thanks for posting it. I don't need encryption at the moment but it's nice to know I can add it when necessary...
Joe

Anonymous said...

Hi Greg, thank you for your post. I'm trying to upload files, and sometimes for large files (80MB or above) I get the following error message. I split your code into two files, amazons3.ps1 calling into S3.ps1 (which does the upload with the args passed to it). Please help. I saw some posts saying to disable keep-alive, but I don't know how to do it in a .ps1 script.

Exception calling "PutObject" with "1" argument(s): "The request was aborted: The request was canceled."
At D:\Ven\S3.ps1:73 char:34
+ $S3Response = $AmazonS3.PutObject <<<< ($S3PutRequest) #NOTE: upload defaults to 20 minute timeout.
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : DotNetMethodException

D:\Ven\S3.ps1 : ERROR: Amazon S3 put requrest failed for production-db-backup-Aug-1-2010.zip.
Script halted.
At D:\ven\amazons3.ps1:45 char:5
+ & <<<< 'D:\Ven\S3.ps1' $file $amazon_filename $S3BucketName
+ CategoryInfo : NotSpecified: (:) [Write-Error], WriteErrorException
+ FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,S3.ps1

ITnetworkguru said...

Thanks so much for creating and putting this out for everyone to use!!! I am modifying this script for my own use and you have saved me hours of work! Wanted to share what I am doing and some problems I have run into. I have an SSL error when the script reaches the S3putrequest, I can't see where the URL is defined in order to know where it is trying to connect. Should the bucket name be the FQDN? Second item I am trying to figure out is how do I modify the script to upload to S3 a 7zip split archive? My archives are 30+ gigs and I need to split them up.

Greg Bray said...

Thanks, glad you found it useful! The bucket name should just be the unique part like MyBucket or MyBackups. The Amazon library will then create the URL from that information (See link on line 88 for more info).

As for your other changes, you would need to set up the #Zip and Encrypt files section to create the multi-part file and then change lines 88-106 to upload more than one file. I don't think that the Amazon library has this built in, so you could put that code into a function and call it once for each file. You could also look at using something like CloudBerry Explorer to upload the files instead (see the first comment here from andy01), as they implement more advanced upload features that could make it a bit faster.
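A rough sketch of the "call it once for each file" idea: 7-Zip's `-v` switch (for example `-v4g`) writes volumes named archive.7z.001, archive.7z.002, and so on, which can be listed in order and handed to an upload function one at a time. `Get-ArchiveVolumes` is a hypothetical helper, and `Send-FileToS3` in the usage comment stands in for whatever function wraps the script's existing upload code.

```powershell
# Hypothetical helper: lists the volumes of a split 7-Zip archive
# (archive.7z.001, archive.7z.002, ...) sorted by name.
function Get-ArchiveVolumes {
    param([Parameter(Mandatory = $true)][string]$ArchivePath)

    $folder = Split-Path -Parent $ArchivePath
    if (-not $folder) { $folder = '.' }          # bare file name: look in the current folder
    $leaf = Split-Path -Leaf $ArchivePath

    Get-ChildItem -Path $folder |
        Where-Object {
            -not $_.PSIsContainer -and
            $_.Name.StartsWith("$leaf.") -and    # same base name as the archive
            $_.Name -match '\.\d{3}$'            # ends in a three-digit volume number
        } |
        Sort-Object -Property Name
}

# Usage (Send-FileToS3 is an assumed wrapper around the upload section):
# foreach ($volume in Get-ArchiveVolumes -ArchivePath 'D:\Backup\db.7z') {
#     Send-FileToS3 -FilePath $volume.FullName -Key $volume.Name
# }
```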

Chris Andrews said...

I've found that you don't need to drastically change the script to allow large uploads - all you need to do is change the PutRequest Timeout value to a larger number than the default of 20 minutes (i.e. 1200000ms). To set the timeout to an hour, update PS7ziptoAmazonS3.ps1:

$S3PutRequest.Key = $S3ObjectKey
$S3PutRequest.Timeout = 3600000 #add this line - this is an hour
$S3PutRequest.FilePath = $strFile
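To avoid hand-typed millisecond values, the timeout can be derived from a duration in minutes. This is a small sketch; `ConvertTo-TimeoutMilliseconds` is a hypothetical helper, and it assumes the Timeout property takes milliseconds as described in this comment.

```powershell
# Hypothetical helper: converts minutes to the millisecond value
# used for the put request's Timeout property.
function ConvertTo-TimeoutMilliseconds {
    param([Parameter(Mandatory = $true)][int]$Minutes)
    return [int]([TimeSpan]::FromMinutes($Minutes).TotalMilliseconds)
}

# Usage ($S3PutRequest is the request object from the script):
# $S3PutRequest.Timeout = ConvertTo-TimeoutMilliseconds -Minutes 60   # 3600000
```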

Robert N. said...

Nice post, also check out bulk S3 uploading via PowerShell:

http://sushihangover.blogspot.com/2012/03/powershell-bulk-uploading-to-new-s3.html

or using PowerShell for SSL public/private key encryption (also for retrieving your EC2 Windows Administrator's password):

http://sushihangover.blogspot.com/2012/04/powershell-rsa-private-key-based.html

Anonymous said...

Anyone want to help me modify this to use multiple files?
Amazon s3 has a 5gb limit per file..
I am splitting backup into 4.5gb files and would like this script to run and upload 1 at a time..

Thanks

kyleseidel said...

Holy crap... what an awesome script. Very well thought out. Thanks for sharing. It made my life easier. I needed to back up an OpenEdge Progress database and upload it to a remote server using SFTP (WinSCP).

Thanks again.
-Kyle

Anonymous said...

Thanks for sharing this script, it is very helpful.

However, there is one critical flaw. The script assumes that if the S3 response to the PutObject is not equal to null, all is well.

That is not the case. In my scenario, I was uploading large files (2GB+), and the PutObject would time out after a couple of minutes. The S3 response was not equal to null in this case, even though an exception had been thrown. I tested this by running "Write-Host ($S3Response.ToString())", and it contained valid XML data.

The only way to reliably make sure your PutObject request succeeded, for the sake of later commands, is to wrap it in a try-catch block.

In my scenario, I was deleting the local file after the PutObject completed, so yes, I learned all of this the hard way.
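A minimal sketch of that try-catch pattern, where the local file is removed only if PutObject returns without throwing. The function name and parameters are hypothetical; `$S3Client` and `$PutRequest` are assumed to be the client and request objects the script already creates.

```powershell
# Hypothetical helper: uploads, and deletes the local file only on success.
# Returns $true if the upload completed, $false if PutObject threw.
function Send-ThenRemoveLocalFile {
    param(
        [Parameter(Mandatory = $true)]$S3Client,   # object exposing PutObject(request)
        $PutRequest,                                # request already filled in by the caller
        [Parameter(Mandatory = $true)][string]$LocalFile
    )
    try {
        # Exceptions thrown by the .NET call (timeouts, aborted requests) land in catch.
        $response = $S3Client.PutObject($PutRequest)
    }
    catch {
        Write-Warning ('Upload failed, keeping {0}: {1}' -f $LocalFile, $_.Exception.Message)
        return $false
    }
    # Reached only when PutObject did not throw.
    Remove-Item -LiteralPath $LocalFile
    return $true
}

# Usage:
# if (-not (Send-ThenRemoveLocalFile -S3Client $AmazonS3 -PutRequest $S3PutRequest -LocalFile $strFile)) {
#     Write-Error 'Backup was not uploaded; local copy retained.'
# }
```

Assigning a fresh `$response` inside the function also avoids reading a stale response object left over from an earlier call.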

Blog.TheG2.Net - Your guide to life in the Internet age.