Peer-to-peer is “The sharing of computer resources and services by direct exchange between systems” (Ding et al., 2004). The approach was originally developed to improve performance by connecting computers directly, without coordination by a central server (which would have to be very powerful to coordinate a large number of computers). A side effect is a decentralised network in which no computer is more powerful than the others: each node stores its own data, and nodes manage their trust in one another to decide which other nodes to work with.
Peer-to-peer technologies are often associated with piracy because of their early use by the likes of Napster and, later, Kazaa. More recently, however, they have been used by projects as diverse as medical research, the search for extra-terrestrial intelligence, and even distributed currencies.
Through discussion, we decided that a centralised store of the tags would dissuade other companies from using the system, because companies often like to be in control of their own data. A peer-to-peer system ensures that even if someone wanted to take control of the tags, they could not, as the other peers could simply start ignoring that node.
Most peer-to-peer systems use their own binary protocol for communication, although some open, standardised protocols exist, such as Sun's JXTA. JXTA has not really caught on, and complete implementations exist only for C and Java. It is also XML-based and therefore quite verbose, a quality that is undesirable in a high-performance peer-to-peer network.
It is likely that the servers in the etags system will use a custom (but standardised) protocol to communicate. This may run over HTTP, so that implementations could be created entirely in web-based languages such as PHP, although HTTP imposes its own overhead on communication and may need to be avoided for that reason. More details about the protocol will be published on this blog when they are known.
For now, one potential problem with the decentralised tag store we have proposed is that whenever a server tries to find out whether another server “owns” a user’s data, a malicious server could claim ownership of every user and so establish itself as a very important node on the network. This could be avoided by first posting the “looking for the owner” request with a hashed version of the user’s identification; the server claiming ownership must then respond with the same identifying information hashed using a different algorithm. A correct response shows that both servers know the user, which should stop servers from faking ownership.
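The exchange described above can be sketched in Python. This is only an illustration under stated assumptions, not the etags protocol itself: the choice of SHA-256 for the request and SHA-512 for the response, the `Server` class, and all function names are hypothetical, since the post does not specify any of them.

```python
import hashlib
import hmac


def lookup_token(user_id: str) -> str:
    """Hash placed in the "looking for the owner" request (SHA-256 is an assumption)."""
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()


def ownership_proof(user_id: str) -> str:
    """Hash the claimed owner must return, using a different algorithm (SHA-512, assumed)."""
    return hashlib.sha512(user_id.encode("utf-8")).hexdigest()


class Server:
    """Hypothetical etags server that owns the data of some users."""

    def __init__(self, known_users):
        self.known_users = set(known_users)

    def answer_lookup(self, token: str):
        """Return a proof if one of our users matches the token, otherwise None."""
        for user_id in self.known_users:
            if lookup_token(user_id) == token:
                return ownership_proof(user_id)
        return None


def verify_claim(user_id: str, proof) -> bool:
    """The requester, which knows the user id, checks the returned proof."""
    if proof is None:
        return False
    return hmac.compare_digest(proof, ownership_proof(user_id))


# The requester only ever broadcasts the SHA-256 token, never the raw identifier.
user = "alice@example.org"
token = lookup_token(user)

honest = Server({user, "bob@example.org"})
stranger = Server({"carol@example.org"})

honest_ok = verify_claim(user, honest.answer_lookup(token))
stranger_ok = verify_claim(user, stranger.answer_lookup(token))
# A malicious server that does not know the user can only replay what it saw.
replay_ok = verify_claim(user, token)
```

In this sketch only the server that already holds the user's identification can produce the second hash, so a node that blindly claims every user fails verification. The check is only as strong as the identifiers are hard to guess, because anyone who can guess an identifier can compute both hashes.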
Ding, C. H., Nutanong, S., & Buyya, R. (2004, February 10). P2P Networks for Content Sharing. arXiv.org.