Find tracks not in iTunes library

I’ve written a Python script that walks a directory tree and checks whether each file in it is a track in the current user’s iTunes Music Library.xml file.

    import os, re

    startpath = "/Volumes/Media/Music/"
    prefix = "file://localhost"
    library = os.path.expanduser("~/Music/iTunes/iTunes Music Library.xml")

    # Callback for os.path.walk (Python 2 only; Python 3 has os.walk instead).
    # arg collects the file URLs that do not appear in the library.
    def eachpath(arg, path, tracks):
        for track in tracks:
            trackpath = os.path.join(os.path.abspath(path), track)
            if os.path.isfile(trackpath):
                # iTunes stores locations as URLs; only spaces are escaped here.
                grepstr = prefix + trackpath.replace(" ", "%20")
                if grepstr not in data:
                    arg.append(grepstr)

    # Read the whole library file into memory for the substring searches.
    data = open(library).read()
    missing = []
    os.path.walk(startpath, eachpath, missing)

    print missing

It’s not flawless: on my machine it eats up over 11 MB of memory and takes ages to run, but as a proof of concept it works okay. Most of that memory goes on holding the whole iTunes library file, which is already 9 MB on my system. The main loop does a string1 not in string2 test for every file, which is probably not optimal, but it was easy to code for now. I’m still waiting to see how long it takes to do my whole library, but I’m getting bored with waiting. Edit: to reduce the time taken, I used the following code in the final if clause in the function:

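    try:
        # grepstr is used as a regex pattern; file names with unbalanced
        # brackets or parentheses make re.search raise re.error.
        if not re.search(grepstr, data):
            arg.append(grepstr)
    except re.error:
        # Fall back to the slower plain substring test.
        if grepstr not in data:
            arg.append(grepstr)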

The re version is much faster, but it fails in some cases, so the slower in test stays as a fallback. The likely cause is that file names can contain characters such as ( and [ that have a special meaning in a regular expression. There are other issues too: at this stage I have not worried much about the escaped characters that iTunes uses when storing locations. I did, however, come up with a quicker way to list the files than Python’s os.path.walk(). Using the find command is much quicker:

    find /Volumes/Media/Music -type f -not -name .aacgained -not -name '._*' -not -name .DS_Store

That takes between 12 and 36 seconds for my library of 5700+ tracks, which is stored on my NSLU2. If I telnet into the NSLU2 and run the equivalent command:

    find ~media/Music -type f -not -name .aacgained -not -name '._*' -not -name .DS_Store

it takes on average less than one second to complete. So running it locally is more than an order of magnitude faster, even though the network traffic is low. Oh, and find compares very favourably with the Python version, which takes at least one minute to run.
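Both the memory use and the escaping problem come from searching the raw XML text once per file. Here is a rough sketch of an alternative, written for Python 3 rather than the Python 2 of the script above: pull every Location entry out of the library text once, decode it back to a plain path, and keep the results in a set, so each file check is a cheap membership test. The function names library_paths and missing_files are my own, and the pattern assumes each Location key is immediately followed by its string value, which is how the entries in my library file look.

```python
import os
import re
from html import unescape
from urllib.parse import unquote, urlsplit

# Assumed layout of an entry: <key>Location</key><string>file://...</string>
LOCATION_RE = re.compile(r"<key>Location</key>\s*<string>(.*?)</string>")


def library_paths(xml_text):
    """Return the set of filesystem paths named by Location entries."""
    paths = set()
    for raw in LOCATION_RE.findall(xml_text):
        url = unescape(raw)                     # undo XML escapes such as &#38;
        paths.add(unquote(urlsplit(url).path))  # undo %20-style URL escapes
    return paths


def missing_files(startpath, known_paths):
    """Walk startpath and return the files that are not in known_paths."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(startpath):
        for name in filenames:
            full = os.path.join(os.path.abspath(dirpath), name)
            if full not in known_paths:
                missing.append(full)
    return sorted(missing)
```

To use it, read the library file into a string, pass that to library_paths, then call missing_files with the music folder and the resulting set. I have not timed this against the versions above, and it ignores Unicode normalisation differences between file names and library entries, so treat it as a starting point only.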

iTunes Shared Library

Jaq and I share two computers, an iMac G4 and a Dell PC. I also bought a Linksys NSLU2 and a large USB hard drive, so that all of our music and videos can be stored on a server and accessed from either computer (or the Xbox) without having to make sure the iMac is on. (The iMac is the main computer, and the one we fight to get onto.) Of course, the NSLU2 also helped remove clutter from the iMac’s hard drive, not to mention freeing up a heap of space. Anyway, because we have a rather large music collection, it’s pointless and wasteful to keep copies of the music in two places. So I set up an SMB share on the NSLU2, wrote a small AppleScript to mount it on bootup, and pointed iTunes towards this location. This means there is only one copy of all of our music, in one location. There are some drawbacks, however:

  1. If I import music, it doesn’t appear in Jaq’s library by default. Similarly, if she imports, I don’t see it. Every now and then you need to drag the Music folder onto iTunes, and wait for it to update the library. Both of us need to do this, in case both of us have imported music.
  2. If one of us edits a track’s artist, title or album, iTunes for the other user sometimes cannot find the track. If you then re-import, you wind up with two copies of the track, one of which (the one with the rating and playcount) is a ‘dead track’.
  3. Sometimes a re-import causes a track to appear twice in the library. Sometimes it creates a second copy of the file in the directory.
  4. There is a lot of music in our library that one or other of us doesn’t really like that much. For instance, I listen to a lot of Classical music, but don’t like the Red Hot Chili Peppers. Even if you remove a track from your library, it gets re-added when you do a re-import.

Of course, there are some great benefits, too:

  1. Each person gets to have their own rating and playcount for each track. Initially we had a shared iTunes library file (both of us had read/write access to it), which worked well when only one user could run iTunes at a time, but fails dismally once more than one user is running iTunes.
  2. Our iTunes library only takes up half of the space.
  3. I can modify the tags belonging to a track, and the change gets propagated to her library.

I’m fairly confident the benefits outweigh the costs, but I’m still keen to come up with a better solution. Here are some ideas I have had to resolve some of the issues. iTunes stores a copy of its library in an XML file, so it should be a trivial task to scan this and get some information that might be useful. For instance, compare the location field of each track to the directory structure, and work out whether there are tracks that need to be added, or tracks whose path needs to be updated.

This requires a couple of things:

  * A decent XML parser (in some cases just a simple grep will do the trick, for instance seeing if a filename exists in the XML file).
  * An interface (AppleScript) to iTunes, to tell it to add/remove/re-locate files.
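For the parser half, Python’s standard plistlib module can read the library file, since it is an XML property list. The sketch below (Python 3) assumes the usual layout of a top-level Tracks dictionary whose entries each carry a Location URL and a Name. The helper names track_paths and compare are made up for illustration, and the AppleScript half is not covered at all.

```python
import plistlib
from urllib.parse import unquote, urlsplit


def track_paths(library_bytes):
    """Map each track's filesystem path to its name, via the Location field."""
    library = plistlib.loads(library_bytes)
    paths = {}
    for track in library.get("Tracks", {}).values():
        location = track.get("Location")
        # Skip tracks without a local file (no Location, or not a file URL).
        if location and location.startswith("file://"):
            paths[unquote(urlsplit(location).path)] = track.get("Name", "")
    return paths


def compare(library_bytes, files_on_disk):
    """Return (files to add, dead tracks) as two sorted lists of paths."""
    in_library = set(track_paths(library_bytes))
    on_disk = set(files_on_disk)
    to_add = sorted(on_disk - in_library)   # on disk, unknown to iTunes
    dead = sorted(in_library - on_disk)     # known to iTunes, file is gone
    return to_add, dead
```

Here library_bytes would be the contents of iTunes Music Library.xml read in binary mode, and files_on_disk could come from the output of find. The two lists are what an AppleScript would then need to act on: add the first, and re-locate or remove the second.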

iPod Rating

I have been using Jaq’s iPod the last couple of days, mainly because when I am waiting at the Adelaide Railway Station, and at Lynton Station, I cannot get good enough reception on my phone’s radio. So, I copied all of my Classical music onto the iPod, and I’ve been listening away. I’d like to be able to rate tracks on the iPod, but I don’t really want to, for one big reason: I would only be able to rate them at 0-5 stars, not with the 0-100% ratings I use from iTunesRater. I wonder if it’s possible to hack the iPod firmware so that ratings use a finer scale?