Monday, June 14, 2010

Powershell Backup All Databases Locally and to Amazon S3

Previously I posted a Powershell script that uses 7-Zip to compress and encrypt a folder and then send it to Amazon S3. That script could have many uses, but in my case it was designed to fulfill the need for off-site storage of compressed database backups. While it did all the heavy lifting of getting the data encrypted, compressed, and uploaded to S3, I left the creation of the database backup folder as an exercise for the user to complete.

I now have a working system that I would like to share in case anyone was unable to create the backup folder using Powershell. There were a few hiccups along the way, such as not being able to use the Start-Transcript cmdlet in a SQL Agent Job and having my MSDB data file triple in size after 3 months of backups, but after working out those issues the script provides a very cost-effective method of creating both local and off-site database backups with 1 week or 1 month retention. Currently I have it running on a server with about 100 databases ranging in size from 7MB to 700MB, with daily full backups and hourly differential backups being sent to Amazon S3 and stored with 1 month retention for about $5 a month. The total raw backup size is 2.2GB per full backup set and 100-400MB for each differential backup set, but the script gets about 10:1 compression, so the daily storage on S3 is about 500MB total for all of the full and differential backups.

You will need to modify the settings in both the psBackupAllDBtoAmazonS3.ps1 and ps7ZiptoAmazonS3.ps1 files to get this to work on your server, but once you do you can sit back and let the scripts do their magic. You can set it to run as a scheduled task, but we chose to run it as a SQL Agent Job so that our server monitoring software would notify us if there were any issues with the script.

Friday, April 16, 2010

Lap Around .NET 4 with Scott Hanselman

Things were slow at the office today, so I used the free time to catch up on some of the recorded sessions from DevDays 2010. Scott Hanselman’s overview of the new features in .NET 4 is one of the best presentations I have seen about .NET 4 so far. How many presentations include 14,000 generations of monkeys trying to write Shakespeare? Here is the synopsis:

In this session, Scott Hanselman gives a deep and broad tour of the .NET 4 release, with a focus on making your development experience easier. See lots of demos (and very few slides) showcasing the key new features in the .NET Framework 4 including MEF, improvements in ASP.NET, threading, multi-core and parallel extensions, additions to the base classes, changes and additions to the CLR and DLR, what's new for the languages (Visual Basic and C#), and of course, what's new in Windows Presentation Foundation and System.Web. Come and see how all these new features and capabilities improve your overall .NET experience!

Enjoy!

Sunday, March 14, 2010

TED Fullscreen Video Multiple Monitor Userscript: Yay!

I like to think that I can hack out JavaScript as well as any web developer, but occasionally I still make rookie mistakes. For instance, I totally forgot to test my last userscript for fixing Slashdot articles while zooming in anything but my default browser (Google Chrome). In theory it should have worked with Greasemonkey, but when I tried testing it today using Firefox it failed miserably. Luckily, with a few code changes I was able to update the script and get it to work with both Firefox and Google Chrome, as well as fix a huge bug when visiting Slashdot’s main page as a non-registered user (another thing I probably should have tested!).

Considering that it was my first attempt at a userscript I am not too upset and will chalk it up to experience. Tonight my plans for the evening fell apart, so I decided to try to fix a big pet peeve of mine: no multiple monitor support for video players on certain websites. Maybe I am the only one who thinks this is a problem, but I hate it when websites have a Flash or Silverlight video player embedded in their page and don’t have an option to enlarge the video or open it in a pop-out window. Most people do fine using the fullscreen mode built into the player, but both Silverlight and Flash fail at fullscreen with multiple monitors since they exit fullscreen as soon as you try to click on something outside the video (for security purposes, apparently). This means you can either stop multi-tasking and only watch the video, or settle for a small video player that is absolutely tiny on any screen wider than 1024 pixels. For instance, here is a video that a friend brought to my attention as viewed on a 1600x900 LCD TV.

[Image: sample screenshot]

That is a lot of white space going to waste, and while the fullscreen mode works, I often watch videos on the LCD TV while browsing the web on a projector connected to the same computer and end up switching in and out of fullscreen mode multiple times. I’ve had this same problem before with Channel9 and fixed it using a bookmarklet, but that requires clicking a link in my favorites every time I load a video, which is less than ideal. Now that Chrome supports userscripts I am working on converting some of my bookmarklets into userscripts, and since I already have a working solution for Channel9 I thought that I would try creating one for TED.com next.

So if you ever watch TED videos on a machine with multiple monitors then today is your lucky day. This userscript will enable multiple monitor fullscreen video for any page that uses the TED.com Flash player. Videos that are embedded from YouTube are currently not supported, but YouTube already lets you increase the width of the player, open it in a pop-out window, or even use the browser’s zoom feature to resize the video, so that shouldn’t be a problem. And this time I did do some basic testing in both Firefox and Chrome, so hopefully it will work as advertised.

Enjoy!

Friday, March 5, 2010

Slashdot Zoom Fix: a Greasemonkey and Chrome compatible userscript

One of my top ten visited websites is probably Http://www.slashdot.org, whose tagline is “News for nerds, stuff that matters”. The site usually posts over 20 stories a day covering technology, math, physics, and computer related topics, which helps keep me informed and entertained. One problem that I often have is that the text is very hard to read when using a high resolution monitor or when using the projector that I have set up on a media center computer at home. Usually this is easy to solve, since one of the 11 buttons on my mouse is mapped to the CTRL key and can be combined with the scroll wheel to zoom in and out on web pages quickly. However, in this case there are boxes on the right-hand side of the screen that also enlarge when the page is zoomed, causing the text column to become narrower and more difficult to read. Here is an article on Slashdot normal and zoomed:

[Image: Slashdot article, normal]
[Image: Slashdot article, zoomed]

The site is designed using CSS, and for some reason the advertisement and “Interviews” box get bigger when the page is zoomed, shrinking the width of the text column significantly. If you are a registered user Slashdot will not display the ad, but will still have a side panel on the right that grows larger when you try to zoom in. I initially created a bookmarklet that removes the side panel: clicking a link in my favorites injects some JavaScript into the webpage to alter the CSS classes. But that still required that I manually click the link every time I visited the website.

I planned on making a Google Chrome Extension, but recently the Chrome team announced that Chrome will now automatically convert Greasemonkey scripts into Chrome Extensions. This means you can write a cross-browser compatible script that can run in Firefox, Chrome, or any other browser that supports Greasemonkey scripts. It only took a few minutes to convert the bookmarklet into a userscript, and I posted it on userscripts.org so that anyone can use it. The script will automatically remove the sidebar so that zooming works much better. You can even change the script to automatically zoom a specified amount; however, I left this off by default since the right amount depends heavily on the width of your screen. Here are the results:

[Image: Slashdot Fix Zoom User Script, normal]
[Image: Slashdot Fix Zoom User Script, zoomed]

Hopefully someone else will find this useful. I have a few other bookmarklets that I use and will probably convert into userscripts soon, as well as a few new ones that I want to work on when I get some free time.

Friday, February 19, 2010

Powershell, 7-Zip, Amazon S3 Upload Script with AES-256 Encryption

I recently was tasked with finding a way to store some backup files from our server in a secure and reliable off-site location. After talking with our hosting provider, who wanted around $600 a month for off-site tape rotation, we decided to look at using Amazon Simple Storage Service (Amazon S3) to store the files in the cloud instead. We needed a way to automate the upload process and make sure that the data was encrypted, so I spent a few days working on a Powershell script (using the excellent PowerGUI Script Editor) that uses 7-Zip to create a .7z archive with AES-256 encryption and then sends it up to Amazon S3 using the Amazon Web Services SDK for .NET. Here is the script:
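As a rough illustration of the compress-and-encrypt step only, here is a minimal sketch in Python rather than the Powershell used in the actual script. It assumes the 7-Zip command line tool is on the PATH as 7z; the function names, paths, and password are made up for the example, and the S3 upload is left out entirely.

```python
import subprocess


def build_7z_command(source_dir, archive_path, password, seven_zip="7z"):
    """Build the 7-Zip argument list for an encrypted .7z archive.

    a        add files to an archive
    -t7z     use the 7z archive format (its encryption is AES-256)
    -mhe=on  also encrypt the archive headers, hiding the file names
    -p...    password used to derive the encryption key
    """
    return [seven_zip, "a", "-t7z", "-mhe=on", "-p" + password,
            archive_path, source_dir]


def compress_and_encrypt(source_dir, archive_path, password):
    """Run 7-Zip and raise CalledProcessError if it exits with an error."""
    command = build_7z_command(source_dir, archive_path, password)
    subprocess.run(command, check=True)
```

Calling compress_and_encrypt with a backup folder, an output file name, and a password would produce the encrypted archive. Keep in mind that a password passed on the command line can be visible to other users in the process list, so a locked-down service account is a good idea.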

It seems to work pretty well so far, taking about 5-10 minutes to zip and encrypt 1GB of SQL backups down to 100MB and then upload it to Amazon S3. From there we can use tools like CloudBerry S3 Explorer to browse or download files when needed. The monthly cost to keep data on Amazon S3 is $0.150 per GB, with $0.10 per GB transfer in and $0.150 per GB transfer out. With 1 week backup retention and minimal data-out transfers we expect to pay around $10 to $20 a month and should be able to access the data much more quickly than if we were using off-site tape storage. Cloud computing FTW!
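The pricing above is easy to turn into a quick estimate. This is a minimal sketch in Python that uses only the three per-GB rates quoted in this post; it ignores per-request charges and any pricing tiers, and the function name and sample numbers are made up for illustration.

```python
# Per-GB rates as quoted in this post (2010 pricing).
STORAGE_PER_GB_MONTH = 0.15
TRANSFER_IN_PER_GB = 0.10
TRANSFER_OUT_PER_GB = 0.15


def estimate_monthly_cost(stored_gb, uploaded_gb, downloaded_gb=0.0):
    """Estimate one month of S3 charges in dollars.

    stored_gb     average amount of data kept on S3 during the month
    uploaded_gb   total data sent to S3 during the month
    downloaded_gb total data pulled back from S3 during the month
    """
    return (stored_gb * STORAGE_PER_GB_MONTH
            + uploaded_gb * TRANSFER_IN_PER_GB
            + downloaded_gb * TRANSFER_OUT_PER_GB)
```

For example, keeping 10GB on S3 while uploading 20GB during the month and downloading nothing comes to 10 x 0.15 + 20 x 0.10 = $3.50.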

UPDATE 6/14/2010: I just posted the script used to create the Database backup folder.

Friday, February 12, 2010

Parallel ForEach file processing in IronPython: Get some TPL love in IPY!

I recently have started playing around with the Task Parallel Library (TPL) that will be shipping in .NET 4, which lets you easily spread a workload across multiple cores using a simple Parallel.ForEach statement. I have a few IronPython scripts that I want to convert to use multiple threads, so I thought I would try using the .NET 3.5 version of the TPL from inside of IronPython. It turns out that the TPL works just fine, but writing thread safe code in IronPython can be tricky. First off, the System.Threading.Interlocked class does not work on IronPython integers because Python integers are immutable. That really sucks, because it means that the only way to change a global integer value in a thread safe manner is to use locks or the System.Threading.Monitor class. One interesting workaround is to use a Python list instead of an integer. Appending a value to a list is an atomic operation, so instead of calling Interlocked.Add you can just append all the values to an empty list and use sum() to return the aggregate value or len() to get the number of items in the list (emulating incrementing a variable). It is a bit of a hack, but sometimes easier than using locks. Also, you have to remember that the print statement is not thread safe, so instead you should use something like System.Text.StringBuilder to buffer your print statements and then print them once you are back in a single threaded code section.
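The list workaround is easy to try out. The sketch below is an assumption-laden stand-in: it uses the standard threading module in regular Python instead of the TPL in IronPython, and the function name and thread counts are made up. The idea is the same, though: every thread appends to a shared list, and the totals are read only after all threads have finished.

```python
import threading


def count_with_list(n_threads=8, per_thread=1000):
    """Emulate a thread safe counter by appending to a shared list."""
    hits = []  # shared between threads; list.append is atomic

    def worker():
        for _ in range(per_thread):
            hits.append(1)  # stands in for Interlocked.Add(ref total, 1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Back on a single thread: len() is the increment count, sum() the total.
    return len(hits), sum(hits)
```

The trade-off is memory: the list grows by one entry per increment, so this only makes sense for modest counts.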

Below is a small sample script that I wrote to test processing a large number of text files using IronPython. It will search through all files in a given directory that match a given pattern (e.g. *.txt) and return the total number of files and lines. It will also search for a given token inside each file and return the total number of matches. This is a task that is very easy to run on multiple cores and is a perfect fit for the Parallel.ForEach method. By default it will search your temporary files folder, which on my workstation had 69 .txt files with a total of 469,286 lines. The single threaded version took about 2.3 seconds to run and the multi-threaded version took between 0.8 and 1.0 seconds. The workload was spread across 8 cores, which gave a better than 2x improvement in speed, even though this example is still primarily bound by the drive IO speed. Still, it shows how using the TPL can greatly increase performance for common tasks.
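For readers without IronPython handy, here is a comparable sketch in standard Python. It is not the IronPython script described above: it swaps the TPL for concurrent.futures.ThreadPoolExecutor, only looks at the top level of the directory, and all function and parameter names are assumptions made for the example. Each file returns its own counts, which are summed on the main thread, so no shared counter is needed.

```python
import fnmatch
import os
from concurrent.futures import ThreadPoolExecutor


def scan_file(path, token):
    """Return (line count, token occurrences) for a single text file."""
    lines = matches = 0
    with open(path, "r", errors="ignore") as handle:
        for line in handle:
            lines += 1
            matches += line.count(token)
    return lines, matches


def scan_directory(directory, pattern="*.txt", token="error", workers=8):
    """Scan matching files in parallel; return (files, lines, matches)."""
    paths = [os.path.join(directory, name)
             for name in os.listdir(directory)
             if fnmatch.fnmatch(name, pattern)]
    paths = [path for path in paths if os.path.isfile(path)]

    # pool.map hands one file to each worker thread as it becomes free.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda path: scan_file(path, token), paths))

    total_lines = sum(lines for lines, _ in results)
    total_matches = sum(matches for _, matches in results)
    return len(paths), total_lines, total_matches
```

Calling scan_directory(tempfile.gettempdir()) would mirror the default described above. Because the work is mostly file IO, threads can help here even in standard Python.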

Saturday, February 6, 2010

TPL and Parallel.ForEach in .Net 3.5 using Reactive Extensions for .NET (Rx)

The next version of the .Net framework and Visual Studio both have some pretty cool features to help programmers work with multiple cores, which is great but doesn’t help the majority of us who are stuck with .Net 3.5 for the foreseeable future. Luckily Erik Meijer and the Cloud Programmability Team have backported the Parallel Extensions Framework (PFX) to .Net 3.5 and Silverlight 3 as part of the Reactive Extensions for .NET (Rx). Rx adds the IObservable<T> and IObserver<T> interfaces, which are the mathematical duals of IEnumerable<T> and IEnumerator<T> and provide tools for doing Reactive Programming. There are many different ways to use Rx, but internally they all use the Task Parallel Library (TPL) as the “special sauce” to automate processing tasks across multiple threads.

Parallel.ForEach is a part of the TPL that can be used to unroll an outer loop and have it run across multiple threads. Take the following example. This is standard single threaded code that loops through a collection of 2 card Texas Hold'em starting hands and evaluates all possible 7 card hands that include those two cards:
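The shape of that nested loop can be sketched in a few lines of Python. This is an illustration only, not the code from this post: the card encoding and function names are made up, and the hand evaluation is replaced with a simple counter.

```python
from itertools import combinations
from math import comb

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]  # 52 cards


def ace_king_hands():
    """All two-card starting hands holding one Ace and one King."""
    aces = [card for card in DECK if card[0] == "A"]
    kings = [card for card in DECK if card[0] == "K"]
    return [(ace, king) for ace in aces for king in kings]


def count_boards(starting_hands, board_size=5):
    """Outer loop over starting hands, inner loop over possible boards."""
    count = 0
    for hand in starting_hands:
        remaining = [card for card in DECK if card not in hand]
        for board in combinations(remaining, board_size):
            count += 1  # a real evaluator would rank hand + board here
    return count


# Each starting hand leaves 50 cards, so the inner loop visits
# comb(50, 5) = 2,118,760 five-card boards per starting hand.
BOARDS_PER_HAND = comb(50, 5)
```

The board_size parameter exists only so the loop can be exercised quickly with smaller boards; the full five-card enumeration is slow in pure Python.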

There are over 2 million hands in the inner loop, which gets run once for each of the possible starting hands in the outer loop. There are 12 offsuit starting hands and 4 suited starting hands with an Ace and a King, which means that the outer loop would run 16 times, however those 16 executions are separate and could easily be run across multiple threads. That is where Parallel.ForEach comes in. Here is the same code, which will automatically be scheduled across multiple threads:

The bulk of the code is the same, with the only changes being that the inner block gets converted into a Lambda Expression (could have also used a delegate) and instead of incrementing lCount in the inner loop we increment a local loop variable and then atomically add it to the global value. We could have used System.Threading.Interlocked.Increment to atomically increment lCount inside the inner loop, but this adds a lot of unneeded locks that slow down all the threads. Keeping a local copy of the values and only locking once at the end provides much better performance.
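The local-counter pattern carries over to other languages. Below is a minimal Python sketch of the same idea, using a thread pool and a lock in place of Parallel.ForEach and Interlocked.Add; the function name and the pair-counting workload are placeholders made up for the example. Each task counts into a local variable and touches the shared total exactly once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def parallel_pair_count(item_groups, workers=4):
    """Count the 2-item combinations in every group, one task per group."""
    total = 0
    lock = threading.Lock()

    def work(items):
        nonlocal total
        local = 0  # private to this task, so no locking in the hot loop
        for _ in combinations(items, 2):
            local += 1
        with lock:  # one synchronized add per task, not per iteration
            total += local

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(work, item_groups))  # wait for every task to finish
    return total
```

Note that in standard Python the global interpreter lock keeps CPU-bound threads from running in parallel, so this shows the accumulation pattern rather than the speedup.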

And with those few small changes we are able to start using multiple threads, which on my local machine with 4 cores and 8 threads ended up decreasing the processing time from 1.55 seconds to 0.67 seconds and more than doubling the number of hands processed per second from 16,420,998 to 37,816,595.

If you want to start using the TPL you can download it here and add a reference to the System.Threading.dll located in the “C:\Program Files (x86)\Microsoft Reactive Extensions\Redist\DesktopV2” folder. And while you are at it you might as well play around with Rx too!

Enjoy!

Blog.TheG2.Net - Your guide to life in the Internet age.