
Freenet




Presentation Transcript


  1. Freenet • Freenet Architecture • Goals • Properties • Searching a network • Searching/Routing algorithm • Adaptive behaviour • Differences with other algorithms • Keys • KSK keys, SSK keys and CHK keys • Network Evolution and Clustering • Clustering keys

  2. Freenet http://freenetproject.org • A decentralized system for storing and retrieving files within a distributed network. • Each participant provides some network storage space. • Peers are servents – they both provide storage and request it. • A different philosophy to Gnutella: in Gnutella you do not have write access to other peers' storage. • Freenet is a storage and retrieval facility. • Clients add a file to the network but do not know its actual storage location. • Information is kept private by employing various levels of encryption as the data traverses the network. • Freenet also adapts itself according to usage patterns

  3. Anonymity • The node requesting data does not normally connect directly to the node that has it. • Instead, the data is routed across several intermediaries, none of which know which node requested the data or which one had it. • Encryption of data and relaying of requests make it difficult to determine • who inserted content • who requested that content • where the content was stored • what the content actually is

  4. Architect and Inventor of Freenet • Ian Clarke • Chief Executive Officer of Cematics Ltd • a company he founded to commercialise Freenet technology • Co-founder (and formerly the Chief Technology Officer) of Uprizer Inc. • successful in raising $4 million in A-round venture capital from investors including Intel Capital. • In October 2003, he was selected as one of the top 100 innovators under the age of 35 by the MIT Technology Review magazine • Holds a degree in Artificial Intelligence and Computer Science from Edinburgh University, Scotland – now lives in Texas

  5. Why Freenet? • Designed to provide extensive protection from hostile attack • from both inside and out, by addressing information privacy and survivability issues • Based around the P2P environment, which is inherently unreliable and untrustworthy • assumes that any participant in the network could be malicious, or that a peer could fail without warning • Implements a self-organizing routing mechanism over a decentralized structure • This algorithm dynamically creates a centralized/decentralized network.

  6. Why Freenet? • The network learns • it routes queries more effectively using local, not global, knowledge • It achieves this by using file keys and sub-dividing the key space to partition the location of the stored files across the network • Freenet therefore provides a good example of how the various technologies discussed so far can be used within an innovative system. It addresses: • Centralized/Decentralized • DHT • Security (and Privacy) • Scalability

  7. Populating the Freenet Network • File keys: used to route storage or retrieval requests onto the Freenet network • File keys are constructed either from a user-supplied description or from the file itself (discussed later). • Routing tables: each peer has a routing table • It stores file keys and the location of each key (i.e. which connected peer holds it), e.g. see the next slide and the sketch below.
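A toy illustration of such a table, with purely hypothetical keys and peer names (the code on these slides is Python pseudocode for illustration, not Freenet's actual implementation):

```python
# A toy routing table: file key -> connected peer believed to be closest
# to that key.  Keys and peer names are made up for illustration.
routing_table = {
    0x3A7F: "peer-3",
    0x9C21: "peer-4",
    0xD0E8: "peer-5",
}
```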

  8. Routing a request (example with peers P1–P5) • 1. P1 creates a key, e.g. from a descriptive string. • 2. P1 asks the next node, P2. • 3. P2 (a) checks its local store and (b) checks its routing table to find the peer with the closest key. • 4. P2 asks that next node, and so on (P3, P4, P5). • Routing table (per peer): File Key – Peer ID (P4), File Key – Peer ID (P5), File Key – Peer ID (P3), …

  9. Searching/Requesting • Searching: peers try to route requests intelligently • Peers ask neighbours (like Gnutella) BUT … • Peers do not forward the request to all peers • They find the closest key to the one supplied in their local routing table and pass the request only to that peer – intelligent routing (subdividing the keyspace) • At each hop, keys are compared and the request is passed to the closest matching peer, and so on (a minimal sketch follows) …
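A minimal sketch of the per-hop decision, assuming keys are plain integers and "closeness" is absolute difference (real Freenet keys are hashes, and current versions route on a circular location space):

```python
def closest_peer(routing_table, requested_key):
    """Return the neighbour whose stored key is closest to the requested key."""
    best_key = min(routing_table, key=lambda stored: abs(stored - requested_key))
    return routing_table[best_key]

# With the toy table from the previous slide, a request for key 0x9B00
# would be forwarded to "peer-4" (0x9C21 is the closest stored key).
```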

  10. Example Key Mapping • (X/2) + 1 -> X • 0 -> X/2 • X -> Y • Y -> N • 0 -> X • for key = X/2 + 2

  11. Example request (peers A–F; the file is held at E) • 1. A initiates the request and asks B if it has the file. • 2. B doesn't, so it asks its best-bet peer, F. • 3. F doesn't either, and has no more nodes to ask, so it returns a "request failed" message. • 4. B tries its second choice, D. • 5. D doesn't have it, so it forwards the request to C. • 6. Nor does C, so it forwards the request to B. • 7. B now detects that it has seen this request before, so it returns a "request failed" message. • 8. C forwards the "request failed" back to D. • 9. D now tries its second choice, E. • 10. Success! E has the file and returns it to D, which propagates it back towards A. • 11. The file is sent to B. • 12. B sends the file back to A. (A sketch of this backtracking search follows.)
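A sketch of the behaviour walked through above: a depth-first search over best-bet neighbours, with loop detection and backtracking. The class layout, the integer keys, and the hops-to-live handling are assumptions of this sketch, not Freenet's actual protocol:

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}      # key -> data held locally
        self.routing = {}    # key -> neighbouring Node believed closest to it
        self.seen = set()    # request IDs already handled (loop detection)

    def request(self, key, request_id, hops_to_live=10):
        # Step 7: a node that sees the same request again reports failure.
        if request_id in self.seen or hops_to_live == 0:
            return None
        self.seen.add(request_id)

        if key in self.store:                         # "File is here!" (E)
            return self.store[key]

        # Try neighbours in order of key closeness: best bet first, then
        # backtrack to the second choice on failure (steps 2-9).
        for stored_key, peer in sorted(self.routing.items(),
                                       key=lambda kv: abs(kv[0] - key)):
            data = peer.request(key, request_id, hops_to_live - 1)
            if data is not None:
                return data   # success propagates back along the path (10-12)
        return None           # "request failed"
```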

  12. Updating Routing Tables (pre v0.7) • If a peer forwards the request to a peer that can retrieve the data • then the address of the peer that held the data (or is closer to it) is included in the reply. • The forwarding peer uses this information to update its local routing table to include the peer that has a more direct route to the data. • Then, when a similar request is issued again, the peer can more effectively send the request to a node that is closer to the data (see the snippet below).
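Continuing the sketch from the previous slide, the table update itself is tiny; the function name and fields are assumptions of the sketch:

```python
def note_success(node, key, data_source):
    # After a successful reply, remember which peer turned out to hold
    # (or be closer to) the data, so similar keys are routed there directly.
    node.routing[key] = data_source
```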

  13. Updating Routing Tables • Problem – it is easy to find nodes because node information travels with the messages • susceptible to 'harvesting' (as are Gnutella, DHTs, etc.) • easy to attack peers • Freenet is supposed to support anonymous publishing • Version 0.7 supports Darknet mode • connections are made manually by users • you only share your IP address with people you trust • problem: such networks typically don't scale and become fragmented • problem: only efficient if the network is clustered correctly • the network must follow the small-world model

  14. Jon Kleinberg's explanation • Professor of Computer Science at Cornell University • The possibility of routing efficiently depends on the proportion of connections that have different lengths with respect to the "position" of the nodes. • The proportion of connections of a certain length should be inversely proportional to that length • many short connections, few long connections • remember the Chord DHT? • remember Stanley Milgram?
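Kleinberg's result, stated for a d-dimensional lattice (a LaTeX sketch; Freenet's keyspace corresponds to the one-dimensional case):

```latex
% Probability of a long-range link from node u to node v:
P(u \to v) \;\propto\; d(u,v)^{-r}
% Greedy (closest-first) routing reaches its target in O((\log n)^2) hops
% only when r equals the lattice dimension; for a one-dimensional keyspace
% this means r = 1, i.e. many short connections and few long ones.
```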

  15. How Does Freenet Do This? • Reverse-engineer the node's "position" in terms of the keyspace it inhabits. • The actual connections don't change • i.e. look at the connections and try to reduce the distance of many of them, while keeping a few longer-distance connections. • How? • Swap positions with other nodes (see the sketch below). • Nodes settle into a centralized/decentralized architecture • small world. Video at http://freenetproject.org/22c3vid.html
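A hedged sketch of the position-swapping step. It follows the commonly described Metropolis-style rule (compare the product of link distances before and after a proposed swap); the circular-distance function and the data layout are assumptions of this sketch, not Freenet's exact implementation:

```python
import random

def circ_dist(a, b):
    """Distance between two positions on the circular [0, 1) keyspace."""
    d = abs(a - b)
    return min(d, 1.0 - d)

def should_swap(pos_a, neighbour_positions_a, pos_b, neighbour_positions_b):
    """Decide whether nodes A and B swap positions (connections never change).

    For simplicity the two neighbour lists are assumed not to contain each
    other's position."""
    def link_product(pos, neighbours):
        product = 1.0
        for n in neighbours:
            product *= circ_dist(pos, n)
        return product

    before = (link_product(pos_a, neighbour_positions_a)
              * link_product(pos_b, neighbour_positions_b))
    after = (link_product(pos_b, neighbour_positions_a)
             * link_product(pos_a, neighbour_positions_b))
    # Always accept a swap that shortens links overall; otherwise accept it
    # with probability before/after, so the network can escape local optima.
    return after <= before or random.random() < before / after
```

Repeated over many randomly chosen pairs, moves like this push positions towards a layout where routing by key closeness works, without ever changing who is connected to whom.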

  16. Adaptive behaviour? • The dynamic algorithm used by Freenet to update its knowledge is analogous to the way humans reinforce decisions based on prior experiences. • In his small-world experiment, Milgram noted that 25% of all requests went through the same person (the local shopkeeper). The people in this experiment used their experience of the local inhabitants to forward the letter to the person best placed to help it reach its destination.

  17. Adaptive behaviour? • The local shopkeeper was a good choice because he knew a number of out-of-town people and could therefore help the letter get closer to its destination. • If this experiment were repeated using the same people, then surely word would spread quickly within Omaha that the shopkeeper is a good person to forward the letter to, and subsequently the success rate and efficiency would improve – people in Omaha would learn to route better! • This is what Freenet does -> it adapts routing tables based on prior experiences

  18. Adaptive behaviour? • Freenet supports this with both Opennet and Darknet • Opennet – dynamically discover nodes that are more likely to have keys within a particular range, by logging responses • Darknet – dynamically change the routing table to achieve a Kleinberg distribution • many short distances, few long distances • achieved within the group of trusted peers • Opennets and Darknets can be bridged • so a Darknet is not cut off from the open network

  19. Similarities with Other Techniques? • Gnutella: a user searches the network by broadcasting its request to every node within a given TTL. • Napster, on the other hand, uses a central database that contains the locations of all files on the network. • DHTs: optimize search through the use of a key space and a 'distance metric' – how far is a node ID from a key? • Gnutella, in its basic form, is inefficient, and Napster, also in its simplest form, is simply not scalable and is subject to attack due to the centralization of its file indexing. • However, both matured into using multiple caching servers in order to scale the network • resulting in a centralized/decentralized topology • DHT efficiency relies on peers being equal – a flat topology

  20. But the Freenet Approach … • Caching services (i.e. super-peers or Napster indexes) form the basic building block of the Freenet network • each peer contains a routing table • Keys are used as in DHTs. • The key difference is that Freenet peers do not store definitive locations of files as in Gnutella/Napster. • Rather, they contain file keys that indicate the direction in the key space where the file is likely to be stored. • Routing evolves based on previous requests, unlike DHTs. • But there are many different types of keys …

  21. Keys Three types of keys: • Keyword-Signed Keys (KSK): the simplest of Freenet keys • derived directly from a descriptive string that the user chooses for the file • Signed-Subspace Keys (SSK): are used to create a subspace • to define ownership • to make pointers to a file or a collection of files. • Content-Hash Keys (CHK): used for files that don’t change • obtained by hashing the contents of the data to be stored.

  22. Keyword-Signed Keys (KSK) • Derived from a short file description. • The descriptive string deterministically generates a public/private key pair and a symmetric key – the string is used as the seed, so the same string always creates the same keys. • The hash (of the public key) is used for storage. • The private key digitally signs the file; the symmetric key encrypts it. • The result is the file's KSK.

  23. KSK Keys • The file description is used to generate a public/private key pair and a symmetric encryption key. • The public half of the key pair is stored with the data and is used to verify it. • The symmetric encryption key is used to encrypt the file itself • plausible deniability • The private half of the key pair is used to sign the file • so the data can be verified against the public key • To retrieve the file, someone only needs to know the file description, since the decryption key and the file's index can be derived from it. • Problems: • a flat namespace – collisions between files with the same description • 'key-squatting' – inserting junk under common descriptions. (A conceptual sketch of the derivation follows.)
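A conceptual sketch of the derivation, substituting Ed25519 (via the `cryptography` package) and SHA-256 for Freenet's actual algorithms and key formats, which differ:

```python
import hashlib
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def ksk_from_description(description: str):
    """Deterministically derive KSK material from a descriptive string.

    The same description always yields the same keys, which is exactly why
    KSKs form a flat, collision-prone namespace."""
    seed = hashlib.sha256(b"keypair:" + description.encode()).digest()
    private_key = Ed25519PrivateKey.from_private_bytes(seed)     # signs the file
    public_key_bytes = private_key.public_key().public_bytes(
        serialization.Encoding.Raw, serialization.PublicFormat.Raw)  # stored with the data
    symmetric_key = hashlib.sha256(b"crypt:" + description.encode()).digest()  # encrypts the file
    routing_key = hashlib.sha256(public_key_bytes).digest()      # where the file is stored
    return private_key, public_key_bytes, symmetric_key, routing_key
```

Two users who choose the same description derive exactly the same routing key, hence the collision and key-squatting problems listed above.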

  24. KSK Keys • Key generation: • derived from a descriptive string in a deterministic manner • therefore the same key pair is created for the same description • Change the string and a new key is generated, and therefore a new file is created • Create the same key and the old file gets overwritten • Ownership: • none -> the file is 'owned' only by its descriptive string

  25. Signed-Subspace Keys (SSK) • A randomly generated public/private subspace key pair plus a symmetric key. • The public key is hashed – this hash is used for searching and storage. • The file (hash) is signed with the private key and the file is encrypted with the symmetric key.

  26. SSKeys • Key generation: • derived from the subspace key pair + a symmetric key • unique, because the keys are generated randomly • The public key hash is used for searching • The symmetric key is needed to decrypt the file, but not for searching – storage nodes do not keep it. • Ownership: • creates a read-only file system for all users • only owners of the subspace can overwrite the files within the subspace, i.e. you need the private subspace key to generate the correct signature. • Nodes storing the file need to honour this write access (see the sketch below) • but authenticity can always be determined (i.e. they can't pretend it was you).
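A sketch of what a storing node can check when an SSK insert arrives, again using Ed25519 from the `cryptography` package as a stand-in for Freenet's actual signature scheme:

```python
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def accept_ssk_insert(routing_key, public_key_bytes, signature, encrypted_payload):
    """Honour write access without being able to read the payload.

    The node never sees the symmetric key, so it cannot decrypt the file, but
    it can verify that the public key matches the routing key and that the
    payload was signed by the subspace owner -- so only the owner can overwrite."""
    if hashlib.sha256(public_key_bytes).digest() != routing_key:
        return False
    try:
        Ed25519PublicKey.from_public_bytes(public_key_bytes).verify(
            signature, encrypted_payload)
        return True
    except InvalidSignature:
        return False
```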

  27. Updateable SSKeys • A user-friendly wrapper around SSKeys • Allows a version number to be appended to the key. • A positive version number means your local Freenet node will return the nearest version it has and then go off in the background and try to find closer ones. • A negative version number means your local Freenet node will search for this version plus four newer versions. If it finds only the specified version, it will return that. If it finds any of the others, it will begin a search for another batch of five. • When inserting, your local node will set the version number automatically. (A sketch of these rules follows.)
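A sketch that simply encodes the version rules described on this slide; the `known_versions` set stands in for whatever editions the node can actually find, and details such as where the next batch of five starts are assumptions of this sketch:

```python
def resolve_requested_version(requested, known_versions):
    """Return the version to serve, given the requested version number."""
    if not known_versions:
        return None
    if requested >= 0:
        # Positive: return the nearest version we already have (newer versions
        # are then hunted for in the background -- not modelled here).
        return min(known_versions, key=lambda v: abs(v - requested))
    wanted = -requested
    while True:
        batch = {wanted + offset for offset in range(5)}   # this version + four newer
        newer_found = (batch & known_versions) - {wanted}
        if not newer_found:
            # Only the specified version (at best) was found: return it.
            return wanted if wanted in known_versions else None
        wanted = max(newer_found)                          # start another batch of five
```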

  28. Content-Hash Keys (CHK) • The file to store is hashed (SHA-1 secure hashing) to produce the Content Hash Key (CHK) – a GUID that is a direct reference to the file contents. • The file itself is encrypted with a symmetric key.

  29. Content Hash Keys • Key generation: • derived directly from the contents of the file • a symmetric key is used to encrypt the file • nodes storing the file do not keep this key, so they can 'plausibly deny' knowledge of its contents (see the sketch below) • Ownership: • none -> normally associated with a subspace to define ownership • i.e. an SSK is like a folder containing files accessed via CHKs
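A minimal sketch following this deck's description (current Freenet uses SHA-256 and hashes the encrypted payload; the details here are simplified):

```python
import hashlib
import os

def make_chk(file_bytes):
    """Derive a content-hash key plus the symmetric key used to encrypt the file.

    Storage nodes keep the (encrypted) file and its CHK, never the symmetric key."""
    chk = hashlib.sha1(file_bytes).digest()   # direct reference to the contents
    symmetric_key = os.urandom(32)            # needed later to decrypt the file
    return chk, symmetric_key

chk_a, _ = make_chk(b"the same bytes")
chk_b, _ = make_chk(b"the same bytes")
assert chk_a == chk_b    # identical content always maps to the same key
```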

  30. Analogies for Keys Three types of keys: • Keyword-Signed Keys (KSK): • like filenames on a file system • but analogous to having all files in one directory • Signed-Subspace Keys (SSK): • can contain collections of filenames • analogous to using (multiple-level) directories • Content-Hash Keys (CHK): • like inodes on a file system, i.e. a pointer to the file on disk

  31. Distribution of keys within the Keyspace • Key generation: • ALL keys use hash functions to create the final key value • Hash functions have a good avalanche effect • therefore the input has no correlation with the output • So, two very similar files will create two completely different hash keys (CHKs) – see the demonstration below • Therefore, similar files will be put in completely different parts of the network
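A quick demonstration of the avalanche effect using SHA-256:

```python
import hashlib

# Two inputs that differ by a single character...
print(hashlib.sha256(b"star catalogue, entry 00001").hexdigest())
print(hashlib.sha256(b"star catalogue, entry 00002").hexdigest())
# ...produce digests with no visible relationship, so the two files end up
# in completely different parts of the keyspace (and hence the network).
```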

  32. Properties of key Distribution • Does this random behaviour matter? • No – it helps the distribution of files across the network • Imagine an experiment -> all the data may be quite similar (e.g. people's faces, star characteristics, etc.) • But the Freenet keys will be quasi-random keys created from these files • This ensures an even (random) distribution across ALL peers within the network. • So the concept of node and key 'distance' in Freenet has nothing to do with semantic closeness, or even similarity in terms of the bytes of the file • instead, it is similarity between keys

  33. Freenet • The end. • Demonstrates how some of the technologies can be used in a system, e.g. security and privacy policies/techniques • Shows how centralized-decentralized models can be dynamically created in a self-organizing fashion
