Network Coding for P2P Content Distribution
Primary Investigators: Di Niu, Baochun Li
From a theoretical perspective, this project aims to understand the performance of randomized network coding in P2P content distribution applications. Stochastic modelling, randomized algorithms, and the large deviation theory are among the tools heavily used. We mainly have 3 pieces of contributions, focusing on the optimality, the topological effects and peer churn, respectively.
Randomized P2P Broadcast with Network Coding and its Asymptotic Optimality: In this work, we consider the problem of distributing k blocks from a source to N nodes in a P2P network with both node upload and download capacity constraints. As k scales up, we prove for homogeneous networks that if network coding is allowed, randomly matching senders and receivers in each time slot asymptotically achieves the maximum downloading rate at each node. For heterogeneous networks with network coding allowed, we show that a fair and optimal downloading rate at each node can be approached, if in each time slot, each node randomly allocates its upload bandwidth to its receivers that have available download bandwidth. We also give a performance lower bound of the above randomized coded dissemination when both k and N scale under certain conditions. The proofs involve a large deviation analysis of randomized algorithms on a trellis graph.

Disseminating multiple blocks in a P2P network with both node upload and download capacity constraints. In contrast, most previous work assumes infinite download capacity.
These results demonstrate that with network coding, simple randomized receiver selection and rate allocation are enough to achieve P2P broadcast capacity, laying down a theoretical foundation for mesh-based P2P networks with network coding. The main results have been published in IEEE INFOCOM 2011 (pdf). Some preliminary results on non-coding randomized broadcast and topologies other than complete graphs have also been published in Allerton Conference on Communication, Control, and Computing 2009 (pdf).
Topological Properties Affect the Power of Network Coding in P2P Broadcast: There exists a certain level of ambiguity regarding whether network coding can further improve download per- formance in P2P content distribution systems, as compared to commonly applied heuristics such as rarest first protocols. In this work, we revisit the problem of broadcasting multiple data blocks from a single source in an overlay network using gossip- like protocols. Our new finding reveals that the marginal benefit of network coding critically depends on the dynamics of network topologies. We show that although network coding is optimal as a block selection mechanism, simple non-coding protocols are close to optimal in complete and random graphs, leading to marginal benefits of network coding. However, network coding demonstrates salient benefits in clustered and time-varying topologies, which are common in real-world systems with ISP-locality mechanisms implemented. Through both theoretical analysis and simulation results, we unveil the underlying reasons behind discrepancies in the power of network coding under different scenarios.
These results have been published in IEEE INFOCOM 2010 (pdf).
The Content Availability and Resilience of Network Coding in P2P Networks with Churn: Most current-generation P2P content distribution protocols use fine-granularity blocks to distribute content to all the peers in a decentralized fashion. Such protocols often suffer from a significant degree of imbalance in block distributions, especially when the users are highly dynamic. As certain blocks become rare or even unavailable, content availability and download efficiency are adversely affected. Randomized network coding may improve block diversity and availability in P2P networks, as coded blocks are equally innovative and useful to peers. However, the computational complexity of network coding mandates that, in reality, network coding needs to be performed within segments, each containing a subset of blocks.
- When randomized network coding is performed over different segment sizes, its resilience to the content loss caused by peer churn is different. This work characterizes the resilience of network coding as a dynamic P2P content distribution session proceeds stochastically.
In this work, we quantitatively evaluate how network coding may improve content availability, block diversity and download performance in the presence of churn, as the number of blocks in each segment for coding varies. By using a differential equation approach to approximate a class of density dependent jump Markov processes, we explore the fundamental tradeoff between the resilience gain of network coding to peer dynamics and its inherent coding complexity. We conclude that a small number of blocks in each segment is sufficient to realize the major benefits of network coding, with acceptable coding cost.
These results have been published in IEEE International Workshop on Quality of Service 2007 (pdf) and will appear in IEEE Transactions on Parallel and Distributed Systems (pdf).




