Duplicate music file finder code.
Jan 21st, 2008 by jbloom
I’ve been consolidating my music collection and found that there were lots of duplicate files.
Most of the dupes were named something like “Happy Birthday 1.mp3”, with the original “Happy Birthday.mp3” sitting in the same directory. I’m not sure which program added these dupes, but removing 2500 or so of ’em by hand would not be fun.
Without further ado, here’s some Python code that takes care of that problem for you. It only examines the filename, not the date, the bitrate, or the actual file contents. But you could of course extend it to check any of those.
Enjoy.
#!/usr/bin/env python
#-------------------------------------------------------------------------------
# Name:        cleanDuplicateMusicFiles.py
# Purpose:     Loops over a directory structure looking for 'duplicate' music files
#              and moving them to a safe directory for deletion.
#
# Author:      Joshua Bloom
#
# Created:     01/18/2008
#-------------------------------------------------------------------------------
import os

dupList = []
rootDirectory = "/Users/joshbloom/Music/iTunes/iTunes Music"
sequesteredFilesDirectory = "/Users/joshbloom"

def main():
    print("Starting search ...")
    checkDir(rootDirectory)
    print("Found %s dupes" % len(dupList))

def checkDir(path):
    print("Checking path '%s' for duplicates" % os.path.basename(path))
    for item in [os.path.join(path, x) for x in os.listdir(path)]:
        if os.path.isdir(item):
            checkDir(item)
        else:
            checkForDupe(item)

def checkForDupe(fName):
    '''Example: if we find 'Happy Birthday 1.mp3' and 'Happy Birthday.mp3'
    exists in the same directory we consider this a duplicate and
    send it for re-education.
    '''
    fileName = os.path.basename(fName)
    folderList = os.listdir(os.path.dirname(fName))
    if fileName.endswith(" 1.mp3"):
        # Drop the trailing " 1.mp3" (6 characters) to get the original's name.
        originalName = fileName[:-6] + ".mp3"
        # Only a dupe if the original itself is in the same directory.
        if originalName in folderList:
            dupList.append(fName)
            sequesterDup(fName)

def sequesterDup(fName):
    '''Move em to a new folder, if you were confident you could change this
    function to delete the file.
    '''
    try:
        print("Moving file: %s" % fName)
        os.rename(fName, os.path.join(sequesteredFilesDirectory, os.path.basename(fName)))
    except Exception as e:
        print(e)

if __name__ == '__main__':
    main()
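If you want to go beyond filenames and check the actual file contents, here is a rough sketch of one way to do it. It is not part of the original script: the helper names file_digest and same_contents are made up for illustration, and SHA-256 is just one reasonable choice of hash. It only catches byte-identical copies, so two rips of the same song with different tags or bitrates would not match.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Hypothetical helper, not part of the original script.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large music files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """Return True if both files have the same size and the same digest.

    Hypothetical helper, not part of the original script.
    """
    # Cheap size check first, so most non-matches are never hashed.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return file_digest(path_a) == file_digest(path_b)
```

One way to use it would be to call same_contents on the "Happy Birthday 1.mp3" path and the "Happy Birthday.mp3" path inside checkForDupe, and only sequester the file when it returns True.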