Ever since 1999, when Napster, the first widely popular P2P file sharing application, appeared on the scene, downloading files over file sharing networks has become a national sport in almost every country in the world. Today, almost everybody knows about torrents, sites like The Pirate Bay are more popular than ever, and BitTorrent accounts for a very large share of worldwide Internet traffic.
But many people don’t know how file sharing works, and some don’t even understand what the “sharing” part is all about. After all, you just download files and don’t publish them yourself, do you?
What is P2P file sharing?
P2P stands for peer-to-peer, meaning that data gets exchanged directly between users. For example, when you download a torrent, or use one of the many other file sharing networks, your client (uTorrent, Vuze etc.) connects to other people who are also downloading the same file (peers) or who have already finished downloading but are still connected (seeds).
So all the data you download comes from other users, and, sure enough, as soon as you also receive some chunks of the file, your own client becomes a distributor, sending parts of the file to others. Your computer has become a participant in the network (called “swarm” in BitTorrent) and shares data with the others.
This is where the “sharing” comes from. Now, before BitTorrent became incredibly popular, other networks were well known in the global community. They all rely on the same principle, however: eliminating or severely reducing the need for central servers with loads of bandwidth by shifting the job of transferring data to individual users.
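The way a downloader turns into a distributor can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not how any real BitTorrent client is implemented: the names `Peer`, `exchange_round`, `simulate` and the piece count are made up for illustration, and real clients add piece verification, rate limiting and smarter piece selection.

```python
import random

NUM_PIECES = 8  # assumption: the file is split into 8 equally sized pieces


class Peer:
    """A swarm participant holding some subset of the file's pieces."""

    def __init__(self, name, pieces=()):
        self.name = name
        self.pieces = set(pieces)

    def is_seed(self):
        # A seed is simply a peer that already has every piece.
        return len(self.pieces) == NUM_PIECES


def exchange_round(swarm, rng):
    """Each incomplete peer fetches one missing piece from any peer that has it."""
    transfers = []
    for peer in swarm:
        if peer.is_seed():
            continue
        missing = sorted(set(range(NUM_PIECES)) - peer.pieces)
        rng.shuffle(missing)
        for piece in missing:
            # Any other participant holding the piece can serve it,
            # including peers that are still downloading themselves.
            sources = [p for p in swarm if p is not peer and piece in p.pieces]
            if sources:
                source = rng.choice(sources)
                peer.pieces.add(piece)
                transfers.append((source.name, peer.name, piece))
                break
    return transfers


def simulate(seed=0):
    """Run rounds until every participant has the complete file."""
    rng = random.Random(seed)
    swarm = [Peer("seed", range(NUM_PIECES)), Peer("alice"), Peer("bob"), Peer("carol")]
    log = []
    while not all(p.is_seed() for p in swarm):
        log.extend(exchange_round(swarm, rng))
    return swarm, log
```

Calling `simulate()` returns the final swarm and a log of `(sender, receiver, piece)` tuples. Inspecting the senders in that log shows which pieces were served by the original seed and which by peers that were still downloading.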
How does your file sharing software know who else on the Internet is also part of the fun? This depends on the network you use, and there are various protocols.
The easiest scenario is a centralized system, based on a server that coordinates the whole data exchange; the now defunct Napster, Direct Connect (with clients such as DC++), OpenNap, eDonkey, and Soulseek (which is still alive and kicking) are notable examples. Here, a central server keeps a database of all users, often along with what files they share, what IP address they can be reached at and so on, and organizes the whole thing.
As a consequence, a number of problems arise. For example, it is really easy to attack the network in this scenario: one only has to get hold of the owner of the server and threaten him with lawsuits, force him into applying filters for what files can be shared, and so on.
BitTorrent also used to rely heavily on such centralized servers, the so-called trackers. However, there is not just one tracker but a rather large number of them, making it difficult to take the network down by attacking a single person, and decentralized mechanisms such as DHT reduce the dependence on trackers even further.
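The database role of such a central server can be sketched in a few lines. This is a minimal illustration, not the design of Napster or any other real network: the class `CentralIndex`, its method names and the example addresses are all hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Toy central server: knows every user, their address and their files."""

    def __init__(self):
        self.addresses = {}            # user name -> "ip:port"
        self.files = defaultdict(set)  # file name -> users sharing it

    def register(self, user, address, shared_files):
        # A client announces itself and the files it offers.
        self.addresses[user] = address
        for name in shared_files:
            self.files[name].add(user)

    def unregister(self, user):
        # A client going offline disappears from every listing.
        self.addresses.pop(user, None)
        for users in self.files.values():
            users.discard(user)

    def search(self, name):
        # The server only answers "who has it"; the transfer itself
        # then happens directly between the two users.
        return sorted((u, self.addresses[u]) for u in self.files.get(name, ()))
```

A client would call `register("alice", "192.0.2.10:6699", ["song.mp3"])` on connecting and `search("song.mp3")` to find sources. Everything depends on this one object being reachable, which is exactly the weakness of the centralized approach.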
To address these problems, numerous other protocols have been invented, all more or less decentralized. Instead of relying on servers, each client keeps its own list of known network participants, and clients exchange these lists with each other. New users get “initiated” into the club by receiving a list of known users as a starting point.
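The exchange of peer lists can be sketched as follows. This is a simplified model with made-up names (`Node`, `bootstrap`, `exchange_with`, `gossip_until_stable`); real protocols limit list sizes, drop dead contacts and talk over the network rather than through shared objects.

```python
class Node:
    """A participant that keeps its own list of known participants."""

    def __init__(self, name):
        self.name = name
        self.known = set()  # names of other participants this node knows

    def bootstrap(self, starter_list):
        # A newcomer is handed a list of known users as a starting point.
        self.known |= set(starter_list) - {self.name}

    def exchange_with(self, other):
        # Both sides merge their lists, so each learns the other's contacts.
        merged = self.known | other.known | {self.name, other.name}
        self.known = merged - {self.name}
        other.known = merged - {other.name}


def gossip_until_stable(nodes):
    """Repeat pairwise exchanges until nobody learns anything new."""
    by_name = {n.name: n for n in nodes}
    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        for node in nodes:
            for peer_name in sorted(node.known):
                peer = by_name[peer_name]
                before = (len(node.known), len(peer.known))
                node.exchange_with(peer)
                if (len(node.known), len(peer.known)) != before:
                    changed = True
    return rounds
```

If four nodes each start out knowing only one other node, a few rounds of exchanges are enough for everyone to know everyone else, and no server was involved at any point.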
Any request gets passed around among users, without the need for a central server. In some protocols, clients deemed particularly strong are given the role of “supernode”, taking on the role of a server, or at least some of its features, to speed things up. One notable example of this practice is FastTrack, the network behind the file sharing software KaZaA; the voice-over-IP program Skype originally relied on a similar supernode design.
Other examples of decentralized networks include DHT (for BitTorrent), Kademlia (eMule), Gnutella (LimeWire) and Gnutella2 (Shareaza).
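One simple way to pass a request around is flooding: each user forwards the query to its neighbors, up to a limited number of hops. The sketch below assumes this scheme purely for illustration. The function `flood_search` and its parameters are hypothetical, and real networks use more refined routing.

```python
from collections import deque


def flood_search(neighbors, shared, start, wanted, ttl=3):
    """Forward a query hop by hop and collect the users that share `wanted`.

    neighbors: dict mapping a user to the users it is connected to
    shared:    dict mapping a user to the set of files it offers
    ttl:       maximum number of hops the query may travel
    """
    seen = {start}                 # users that already received the query
    queue = deque([(start, ttl)])
    hits = []
    while queue:
        node, hops_left = queue.popleft()
        if wanted in shared.get(node, ()):
            hits.append(node)
        if hops_left == 0:
            continue               # hop budget used up, do not forward further
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return hits
```

The hop limit keeps a single query from reaching the entire network, at the price of missing files held by users that are too far away. Supernodes reduce this cost by letting a few well-connected clients answer on behalf of many others.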