In our last post, we gave a brief overview of how podcasts work and defined some terms. Now it is time to get into the dirty details of podcast traffic logs.
The first place to look is our file download logs. The podcast sound files for our modest in-house blogs were downloaded 2,334 times on January 16th. That seems pretty impressive: at that daily rate, it suggests an audience of more than 60,000 people per month. If we group our results by IP address, however, we see that only 1,949 distinct IP addresses downloaded from our site. The number of addresses is smaller than the number of downloads because some IP addresses downloaded files from us more than once. In fact, 311 of our 1,949 IP addresses downloaded more than one copy of the same file. Their 696 downloads are about 30% of our 2,334 total downloads.
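The grouping described above can be sketched in a few lines of Python. This is only an illustration, not the tool we actually used: the sample records, the `summarize` function, and its field names are all made up, and it assumes each log entry has already been reduced to an (IP address, file name) pair. It also assumes that "their downloads" means every download made by an address that fetched some file more than once.

```python
from collections import Counter

# Hypothetical sample records: one (ip_address, file_name) pair per download.
downloads = [
    ("10.0.0.1", "dreams.mp3"),
    ("10.0.0.1", "dreams.mp3"),
    ("10.0.0.1", "psalm23.mp3"),
    ("10.0.0.2", "dreams.mp3"),
    ("10.0.0.3", "psalm23.mp3"),
    ("10.0.0.3", "psalm23.mp3"),
]

def summarize(records):
    """Count total downloads, distinct IPs, and IPs that repeat a file."""
    per_ip = Counter(ip for ip, _ in records)
    per_ip_file = Counter(records)
    # An IP is a "repeater" if it fetched any single file more than once.
    repeaters = {ip for (ip, _), n in per_ip_file.items() if n > 1}
    # Assumption: count every download made by a repeater address.
    repeat_downloads = sum(per_ip[ip] for ip in repeaters)
    return {
        "total": len(records),
        "distinct_ips": len(per_ip),
        "repeat_ips": len(repeaters),
        "repeat_downloads": repeat_downloads,
        "repeat_share": repeat_downloads / len(records),
    }

summary = summarize(downloads)
```

Run against a real day of logs, the same summary would give the figures quoted above: total downloads, distinct addresses, repeat addresses, and the share of downloads that came from them.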
Why were particular users (IP addresses) downloading the same file over and over? For instance, IP address 68.46.125.236 turns out to be Comcast--a major internet provider. (It is easy to look up who is behind an IP address using a reverse DNS lookup site.) The 7 downloads from this address of our reading of the poem "Dreams" by Langston Hughes could have come from a single user hitting the download button 7 times. However, it is more likely that seven of our listeners are using computers served by Comcast. When these users think they are connecting to us, they are actually connecting to a Comcast server that passes their request through a different server and on to us.
Note that Comcast also shows up as IP address 76.27.59.224 and several other addresses. Major Internet Service Providers (ISPs) such as Comcast, Verizon, and Speakeasy feed their traffic through many different IP addresses--as do portals such as Yahoo!, AOL, and MSN.
We also see multiple downloads from IP addresses such as 202.108.23.56, which was responsible for a total of 80 downloads, spread over 21 different files. A reverse DNS lookup on this address tells us only that the source is in China. We can assume that these requests are coming from a Chinese internet provider similar to Comcast--one that has not fully disclosed its ownership of this server. Since our "house" podcasts include English readings of the Bible, English readings of the Koran, and English readings of both classic and contemporary poetry, I'd guess that a lot of Chinese people are listening to them as part of learning English. It is nice to know that we are doing some good for the world! By the way, lest you think that only Chinese people want to learn English, we also had at least 3 downloads from 83.202.100.211, which is Wanadoo, in France.
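If you would rather script the reverse DNS lookups than paste addresses into a web site, here is a minimal sketch using Python's standard `socket` module. The `reverse_dns` and `provider_hint` helpers are our own illustrative names, and the "last two labels" rule for guessing a provider is a rough assumption that breaks for domains such as `example.co.uk`. Many addresses have no reverse DNS record at all, in which case the lookup simply returns nothing.

```python
import socket

def reverse_dns(ip, resolver=socket.gethostbyaddr):
    """Return the host name registered for an IP, or None if there is none.

    The resolver argument exists only so the lookup can be swapped out
    (for example in tests); by default it performs a real DNS query.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        return None

def provider_hint(hostname):
    """Guess the provider from the last two labels of a host name.

    Rough heuristic (an assumption): 'a.b.example.net' -> 'example.net'.
    """
    if not hostname:
        return None
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])

# Usage (needs network access):
#   provider_hint(reverse_dns("68.46.125.236"))
```

A lookup like this tells you who registered the address, not who was sitting at the keyboard, so it supports the kind of guess we made above without confirming it.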
Of our 2,334 downloads, all but 41 were done by "agents." These agents are also called "podcatchers." There are twenty podcatchers listed on PodcatcherMatrix, a nicely organized site that helps you compare their features and functions. The main function of a podcatcher is to watch the RSS feed of a podcast for new episodes. When a new episode appears, the podcatcher grabs it and downloads it to the user's computer. Podcatchers can also provide a license or password authorization (when it is required), allow searching for new podcast feeds, and manage synchronizing the list of episodes on a user's computer with the list of episodes on her or his iPod or other portable player.
iTunes was responsible for more than 50% of the downloads on our site that were handled by agents. Mozilla (the agent name used by Firefox browsers) added another 13%, followed by another 3% from download agents such as Doppler, Juice, and NS Player. Unfortunately, about 34% of our downloads come from what are commonly referred to as "bots." These are programs that scan the Internet looking for useful and interesting information. When they find "content" such as our sound files, they greedily download it and store it away for later review and digestion by their owners. Some of these bot downloads probably are from legitimate search firms or podcast directories. The rest...well, no one really knows where all the data absorbed by bots really goes.
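A breakdown like this comes from sorting the agent name on each log line into a handful of buckets. Here is a rough sketch of how that sorting could work. The substring rules and bucket names below are assumptions for illustration, not the rules we actually applied, and real agent strings vary a lot. One detail worth noting: many bots include "Mozilla" in their agent name, so the bot rule has to be checked before the browser rule.

```python
from collections import Counter

# Hypothetical substring rules, checked in order. Bots come first because
# many of them also include "Mozilla" in their agent string.
RULES = [
    ("bot", ("bot", "crawler", "spider")),
    ("itunes", ("itunes",)),
    ("other podcatcher", ("doppler", "juice", "nsplayer")),
    ("browser", ("mozilla",)),
]

def classify(user_agent):
    """Map a raw agent string to one of our (made-up) bucket names."""
    ua = (user_agent or "").lower()
    if not ua:
        return "no agent"
    for label, needles in RULES:
        if any(needle in ua for needle in needles):
            return label
    return "unknown"

def shares(user_agents):
    """Return each bucket's fraction of all downloads."""
    counts = Counter(classify(ua) for ua in user_agents)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}
```

Anything that lands in "unknown" deserves a look by hand, since a new podcatcher and a new bot look the same to rules this simple.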
(By the way, remember that 64% of the requests for our RSS document came from iTunes? Once we subtract the bots from the file download log, we get 77% of downloads via iTunes. Our RSS log also gets bot requests--about 10% of our requests come from them. Take these out and we get an adjusted iTunes share of RSS requests of about 70%. This is close enough to make us comfortable.)
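The adjustment in that aside is just a change of denominator: divide a share of all requests by the fraction of requests that are not from bots. A small sketch, with a function name of our own invention:

```python
def share_excluding_bots(share_of_all, bot_share):
    """Re-express a share of all requests as a share of non-bot requests."""
    if not 0 <= bot_share < 1:
        raise ValueError("bot_share must be at least 0 and less than 1")
    return share_of_all / (1 - bot_share)

# RSS log figures from the text: 64% iTunes, about 10% bots.
rss_itunes = share_excluding_bots(0.64, 0.10)       # roughly 0.71

# Download log: the text says only "more than 50%" for iTunes, so 0.50
# here is a floor, not the exact figure.
download_itunes = share_excluding_bots(0.50, 0.34)  # roughly 0.76
```

With exactly 50% the download figure comes out near 76%, so the 77% quoted above implies that the raw iTunes share was a little over 50%.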
If we subtract the 803 downloads from bots, we are left with a net of 1,531 downloads to listeners. Of course, some of the multiple downloads may be due to retries (when a download fails and a user tries a second time). I have seen an estimate on Jason VanOrden's blog that retries account for between 5% and 20% of all downloads. I think I'll use the 5% number in our situation, and claim that we had 1,531 x 95% = 1,454 downloads on January 16th.
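For anyone who wants to rerun this estimate with a different retry assumption, here is the arithmetic as a tiny Python sketch. The function name is ours, and the 20% case is included only to show the other end of the quoted range.

```python
def estimated_listener_downloads(total, bot_downloads, retry_rate):
    """Return (downloads net of bots, that figure discounted for retries)."""
    net = total - bot_downloads
    return net, round(net * (1 - retry_rate))

# Figures from the text, with the low end of the retry estimate.
net, low_retry = estimated_listener_downloads(2334, 803, 0.05)

# Same figures with the high end of the retry estimate.
_, high_retry = estimated_listener_downloads(2334, 803, 0.20)
```

With 5% retries this gives the 1,454 quoted above; with 20% it drops to about 1,225, so the choice of retry rate matters more than it might first appear.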
Short of getting users to install some kind of tracking software (possible--Volomedia has a pretty cool one), there is no way to know how many of these downloads were actually listened to. But then, no one knows how many people actually read newspapers or listen to programs that are broadcast on radio or TV, either. Still, we'd like to know a bit more about our listeners, including how they found our podcast and whether or not they have subscribed to it. We'll try to get more data on this by looking at our RSS request log in my next post.



