Cleaning and scrubbing the music library
With the arrival of the Squeezebox I’ve taken on a task I should have completed ages ago: cleaning up the mess that is my music collection. There is more than one mess, in fact: tons of MP3 files are sitting around on external drives waiting to be added to iTunes, and multiple copies of the same file have been placed in my initially carefully considered hierarchical directory structure, resulting in many duplicate titles.
Whilst there are various ways of trying to identify duplicate files, I wasn’t in the mood to rely on iTunes’ ability to display duplicate tracks. There are command line utilities for checking and identifying duplicate files, but once again, that isn’t something I’m interested in: quick identification and easy deletion or archival were at the top of my list of priorities. A quick Google search resulted in me downloading the first program I came across on a legitimate link: Araxis Find Duplicate Files. The application appeals because of its very simple user interface and its ability to check each file for its size and checksum, amongst other attributes. The fully-featured application is free to use for a couple of days. After a quick test, I purchased it for USD 15, a very reasonable price.
Find Duplicate Files provides a simple user interface: select the folders or locations to scan, then start the scan and walk away. Various preferences control which file types are considered and what action is taken once duplicate files have been processed.
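For the curious, the general idea of matching files on size and checksum can be sketched in a few lines of Python. This is a rough sketch only, not how Find Duplicate Files actually works: the function names, the choice of SHA-256 and the default .mp3 filter are all my own assumptions.

```python
import hashlib
import os
from collections import defaultdict


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(roots, extensions=(".mp3",)):
    """Return lists of paths whose size and checksum both match.

    `roots` is a list of folders to scan; `extensions` is a tuple of
    lower-case file suffixes to consider (an assumed filter).
    """
    # First pass: bucket files by size, which is cheap to read.
    by_size = defaultdict(list)
    for root in roots:
        for folder, _subdirs, names in os.walk(root):
            for name in names:
                if name.lower().endswith(extensions):
                    path = os.path.join(folder, name)
                    by_size[os.path.getsize(path)].append(path)

    # Second pass: only hash files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_checksum(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Comparing sizes first means that most files never need to be read in full, which matters on a library of several hundred gigabytes. Calling `find_duplicates(["/path/to/music"])` (a placeholder path) returns one list per group of identical files.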

One improvement I would suggest is that the list of found duplicates be populated in real-time. My current music library was roughly 340GB in size, requiring a fair bit of searching to run through. For the entire duration of the scan, the results section of the application stayed empty; it was only filled in once the search operation had concluded. A minor gripe, but a live list would reassure the user that the application is still busy working.
Once the identification of duplicate files has been completed, a list of all attributes is presented, with various colours being used to separate individual groups of duplicates. In the case of the music tracks, cover art is displayed if available.

I performed a few rudimentary checks to ensure the application had indeed found the five or so examples where I knew duplicates to exist. Instead of deleting the duplicates, I decided to archive them to another location just in case. Over 10,000 duplicates turned up, and moving them out of the library freed up over 50GB of disk space. Not bad going, and certainly a great way to rid the iTunes library of additional burden.
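The archive-rather-than-delete approach is easy to reproduce by hand if needed. Below is a minimal Python sketch under my own assumptions: `groups` is a hypothetical list of lists of paths (one list per set of identical files), every path sits underneath `library_root`, and the first file in each group is the one to keep. None of these names come from Find Duplicate Files.

```python
import os
import shutil


def archive_duplicates(groups, library_root, archive_root):
    """Keep the first file of each group; move the others to archive_root.

    The folder layout below library_root is recreated below archive_root
    so an archived file can be traced back to where it came from.
    Returns the list of new locations and the number of bytes moved.
    """
    moved = []
    freed = 0
    for group in groups:
        for path in group[1:]:  # group[0] stays in the library
            relative = os.path.relpath(path, library_root)
            target = os.path.join(archive_root, relative)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            freed += os.path.getsize(path)
            shutil.move(path, target)
            moved.append(target)
    return moved, freed
```

Because nothing is deleted, a wrongly flagged track can simply be moved back from the archive folder.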