# Nexus of Information and Computation Theories Distributed Computation and Communication Theme February 1 – 12, 2016

### Titles, Abstracts, and Videos

Allison Bishop (Columbia)       (Video)
Interactive Coding for Intercative Proofs

Karen Censor-Hillel (Technion)
Distributed Algorithms as Combinatorial Structures

Abstract: For many distributed computing problems there exists a characterizing combinatorial structure. In other words, the existence of an algorithm for solving the problem on any graph $$G$$ in time $$T(G)$$ implies the existence of the structure in $$G$$ with some parameter related to $$T(G)$$, and vice versa. Such relations go back to classic examples, such as synchronizers and sparse spanners, and continue to emerge in recent studies of gossip algorithms and multicast algorithms in different communication settings. In this talk I will give an overview of both old and recent results that illustrate these connections. I will show how finding distributed algorithms as well as proving lower bounds can be reduced to studying combinatorial graph structures.

No previous knowledge in distributed computing will be assumed.

Topology matters in communication

Abstract: We study communication cost of computing functions when inputs are distributed among k processors, each of which is located at one vertex of a network/graph called a terminal. Every other node of the network also has a processor, with no input. The communication is point-to-point and the cost is the total number of bits exchanged by the protocol, in the worst case, on all edges.

Our results show the effect of topology of the network on the total communication cost. We prove tight bounds for simple functions like Element-Distinctness (ED), which depend on the 1-median of the graph. On the other hand, we show that for a large class of natural functions like Set-Disjointness the communication cost is essentially n times the cost of the optimal Steiner tree connecting the terminals. Further, we show for natural composed functions like ED of XOR and XOR of ED, the naive protocols suggested by their definition is optimal for general networks. Interestingly, the bounds for these functions depend on more involved topological parameters that are a combination of Steiner tree and 1-median costs.

To obtain our results, we use some tools like metric embeddings and linear programming whose use in the context of communication complexity is novel as far as we know.

Based on joint works with Jaikumar Radhakrishnan and Atri Rudra.

Strong Data Processing and the Entropy Power Inequality

Abstract: Proving an impossibility result in information theory typically boils down to quantifying a tension between information measures that naturally emerge in an operational setting and then showing that the extremal points satisfy a single-letterization (tensorization) property. In contrast, studies on ‘strong data processing’ essentially dispense with the operational aspect and directly investigate functionals involving information measures and the properties they enjoy (e.g., tensorization). In this talk, we adopt the latter approach and prove a strengthening of Shannon's Entropy Power Inequality (EPI) that is closely tied to strong data processing for Gaussian channels. The new inequality generalizes Costa's EPI and leads to short converse proofs for Gaussian multi terminal source coding problems where none were previously known.

Klim Efremenko (Tel-Aviv University)    (Video: Part 1, Part 2)
Lower bounds for coding for multiparty interactive communication

Abstract: We study coding schemes for multiparty interactive communication over synchronous networks that suffer from stochastic noise. In the work of Rajagopalan Schuman from 94, it was shown that there exists a coding scheme for any network with slowdown of $$O(\log n)$$. However, until now there was no lower bounds on the rate of coding in networks. In this talk, I will show first lower bound of $$\Omega(\log n/\log \log n)$$ for coding in star network.

Ofer Feinerman (Weizmann Institute)    (Video)
Colony or ant, whose in charge?

Abstract: Cooperative transport of large items is a behavior that is extremely rare outside humans and ants. Indeed, this is a complex behavior that requires non-trivial coordination not only in tugging but also in more cognitive tasks such as navigation and problem solving. I will present several aspects of this behavior while focusing on the origins of this group cognition. Is it a manifestation of the abilities of a single ant or, rather, an emergent consequence of the communication between a large number of individuals?

Péter Gács (Boston University)     (Video: Part 1, Part 2)
Relation between monotonic complexity and algorithmic probability

Abstract: Some versions of Kolmogorov complexity are better suited than others when regarding finite sequences as starting segments of infinite sequences. Levin and Schnorr introduced monoton complexity, while Solomonoff and Levin introduced algorithmic probability, the distribution at the output of a universal monotonic machine reading a random coin-tossing input sequence. Monotonic complexity and the negative logarithm of algorithmic probability are very closely related, but can still be distinguished – as an old result of the author showed, using a game argument. The talk will survey the result and its proof, the significant improvement by Adam Day, and the open questions.

When does a set of large algorithmic probability have simple members?

Abstract: Consider algorithmic probability over finite strings: the output distribution of an optimal prefix machine. The negative logarithm of this quantity is equal to the prefix complexity. If a set has large algorithmic probability it does not necessarily have a simple member, since it could be deliberately assembled to avoid this. However, as Epstein and Levin have shown, such sets are exceptional: they have large information about the halting problem. I will discuss this result and outline its proof.

Ran Gelles (Princeton University)   (Video: Part 1, Part 2)
Coding for interactive communication - a tutorial

Abstract: In this tutorial I will introduce the task of coding in the interactive setting, and survey two of the main methods used in coding schemes: “tree codes” and “rewind-if-error”.

Tree codes - relaxations, constructions and applications

Abstract: In this talk/tutorial I will describe several relaxations of tree codes that can be efficiently constructed (such as local tree codes and potent tree codes), discussing their applications to coding in the interactive setting and describing their efficient constructions. Time permitting, I will discuss stronger types of tree codes, such as edit-distance tree codes.

Communication with Contextual Uncertainty

Abstract: Many natural forms of communication involve two (or more) players interacting based on a shared context, where the context is huge and imperfectly shared. This large context typically helps compress communication even when the sharing is imperfect. How can we explain this phenomenon mathematically?

In this talk, we will argue that communication complexity gives the right lens to view this class of problems, and this class of problems gives rise to new questions about communication complexity. I will describe some of these questions, some partial answers and some confounding open questions.

Based on joint works with Ilan Komargodski, Pravesh Kothari and Madhu Sudan.

George Giakkoupis (INRIA)     (Video)

Abstract: Randomized rumor spreading is a fundamental randomized primitive for broadcasting information in networks, in a simple, efficient, and fault-tolerant manner. It was proposed more than 30 years ago, but it has been intensively studied over the last few years. In the first half of the talk, I will give an overview of selected recent results in the area. A standard assumption in the analysis of randomized rumor spreading has been that nodes  take steps in perfectly synchronized rounds. In the second half of the talk, I will present some new results on the impact of relaxing this strict synchrony assumption.

Amin Aminzadeh Gohari (Sharif University of Technology)    (Video: Part 1, Part 2)
From Gács-Körner’s common information to Distributed Santha-Vazirani Sources

Abstract: This talk relates a result on randomness extraction in information theory to one in computer science. Gács and Körner characterize common randomness extraction from i.i.d. sources. We extend this result to adversarial sources. In doing so, we generalize the result of Santha and Vazirani on deterministic randomness extractors to non-binary distributed sources. A Santha-Vazirani source is a sequence of random bits where the conditional distribution of each bit, given the previous bits, can be partially controlled by an adversary. Since the proof of Gács and Körner using the tensorization property of maximal correlation is very specific to i.i.d. distributions, the key to this generalization is finding a new proof for the Gács and Körner's result that has the flavor of Santha-Vazirani's proof and admits a generalization to adversarial sources.

High Probability Guarantees in Repeated Games: Theory and Applications in Information Theory

Abstract: We introduce a “high probability” framework for repeated games with incomplete information. In our non-equilibrium setting, players aim to guarantee a certain payoff with high probability, rather than in expected value. We provide a high probability counterpart of the classical result of Mertens and Zamir for the zero-sum repeated games. Any payoff that can be guaranteed with high probability can be guaranteed in expectation, but the reverse is not true. Hence, unlike the average payoff case where the payoff guaranteed by each player is the negative of the payoff by the other player, the two guaranteed payoffs would differ in the high probability framework. One motivation for this framework comes from information transmission systems, where it is customary to formulate problems in terms of asymptotically vanishing probability of error. An application of our results to a class of compound arbitrarily varying channels is given.

Stephan Sebastian Holzer (MIT)    (Video)
Lower Bounds for Distributed Graph Problems via Information Theory

Prakash Ishwar (Boston University)    (Video)
On the Ultimate Limit of Two-Terminal Interactive Computing

Abstract: We approach two-terminal interactive computing via distributed source coding in information theory. We characterize the minimum rate (bits per sample) for asymptotically error free computation when there is no limit on the number of messages that can be exchanged. Here, the minimum rate is viewed as a functional of the joint source distribution. The functions to be computed at the two terminals can be different. The functional characterization leads to *exact* analytic closed-form expressions for the minimum rates for computing any Boolean function of independent inputs at only one terminal and both terminals. It also leads to an alternating “concavification” algorithm for approximating the minimum rate for general functions and distributions. We conclude by discussing connections to amortized distributional communication complexity and some open problems. This is based on joint work with Nan Ma at Boston University.

Yael Kalai (Microsoft Research)   (Video)
Interactive Coding for Multi-Party Protocols

Joint work with Abhishek Jain and Allison Bishop.

Polynomial-time interactive coding

Joint work with Zvika Brakerski.

Amos Korman (CNRS/LIAFA)   (Video)
Confidence sharing: an economic strategy for efficient information flows in animal groups

Abstract: Social animals may share information to obtain a more complete and accurate picture of their surroundings. However, physical constraints on communication limit the flow of information between interacting individuals in a way that can cause an accumulation of errors and deteriorated collective behaviors. In this talk, I will theoretically discuss a general model of information sharing within animal groups, and take an algorithmic perspective to identify efficient communication schemes that are, nevertheless, economic in terms of communication, memory and individual internal computation. I will present a simple algorithm in which each agent compresses all information it has gathered into a single parameter that represents its confidence in its behavior. Confidence is communicated between agents by means of active signaling. It turns out that this algorithm competes extremely well with the best possible algorithm that operates without any computational constraints.

The proofs rely on the Cramer-Rao bound and on our definition of a Fisher Channel Capacity. These concepts are used to quantify information flows within the group and to obtain lower bounds on collective performance. The results suggest confidence sharing as a central notion in the context of animal communication.

János Körner (Sapienza University of Rome)    (Video)
Codes in graphs

Abstract: In a suitable reformulation, standard problems in Erdösian extremal combinatorics, especially intersection theorems, are in a surprisingly close connection with zero-error problems in the Shannon theory of information. Combinatorialists are, however, mostly interested in problems where the conjectured solution has a beautiful and simple structure (mostly kernel structures), and thus they seem to ignore the problems closest to information theory.

Our aim is to introduce a wealth of new problems where the objects playing the role of codewords are not appropriately different strings from some finite alphabet. Rather, they are permutations or even Hamilton paths in large complete graphs. We discuss some results in an attempt to understand why and when standard information-theoretic methods (greedy algorithms, random choice) fail to yield a solution.

Paris Koutris (University of Wisconsin)    (Video)
Worst-Case Optimal Algorithms for Parallel Query Processing

Abstract: We study the communication complexity for the problem of computing a conjunctive query on a large database in a massively parallel setting. In contrast to previous work, where upper and lower bounds on the communication were specified for particular structures of data, in this work we focus on worst-case analysis of the communication cost. We also show a surprising connection to the external memory model, which allows us to translate parallel algorithms to external memory algorithms. This technique allows us to recover several recent results on the I/O complexity for computing join queries, and also obtain optimal algorithms for other classes of queries.

Joint work with Paul Beame and Dan Suciu.

Michael Langberg (SUNY at Buffalo)   (Video)
A reductionist view of network information theory.

Abstract: The network information theory literature includes beautiful results describing codes and performance limits for many different networks. While common tools and themes are evident in the proofs of these results, the field is still very far from a unifying theory that not only explains how and why the optimal strategies for well-studied networks differ but also predicts solutions for networks to be studied in the future.

In this talk, I will present one step towards the derivation of such a theory based on the paradigm of reduction. A reduction shows that a problem A can be solved by turning it into a different problem B and then applying a solution for B. While the notion of a reduction is extremely simple, the tool of reduction turns out to be incredibly powerful. The key to its power is the observation that reductive proofs are possible even when solutions to both A and B are unavailable.

The paradigm of reduction can be used to show that a communication problem is “easy” if it can be mapped to other problems for which solutions are well understood. It can be used to show that a problem is “hard” if a notorious open problem can be reduced to it. It can be used to show that proof techniques and results from the study of one problem apply to other problems as well. It can be used to derive provable, quantifiable connections between problems that are intuitively very similar or problems that are apparently unrelated.

In this talk I will give an overview of several reductive connections that have emerged recently in the literature between seemingly disparate information theoretic problems. These include connections between network coding and index coding, connections between secure and standard communication, connections between noisy and noiseless networks, and more.

Yoram Moses (Technion)     (Video: Part 1, Part 2)
Knowledge and Coordinated Action

Abstract: Decisions taken by an agent in a multi-agent system depend on the agent's local information. A new formulation of the connection between knowledge and action in multi-agent systems allows new insights into the design of such systems. This talk will discuss how a theory of what agents know about the world and about knowledge of others can help in the design and analysis of distributed computer systems. Based in part on joint work with Armando Castaneda and Yannai Gonczarowski.

Coordination and Time

Abstract: The first part of this talk will use the results of the first talk to relate knowledge to coordinated action in a multi-agent setting. The second part of the talk will use the observations made in the first part to capture the role of time and timing information in coordinating actions in systems with clocks. While message chains play a dominant role in asynchronous systems, a corresponding notion is established for systems with clocks and time bounds on message delivery. The second part is based on joint work with Ido Ben Zvi.

Chandra Nair (Chinese University of Hong Kong)   (Video: Part 1, Part 2)
Tensorization of information functionals (Part I): Forward and Reverse hypercontractivity using information measures

Abstract: Equivalent characterizations of hypercontractivity using information measures is presented. These characterizations are then used to recover existing results as well as obtain new sharp results.

Tensorization of information functionals (Part II): Applications in establishing optimality of Gaussian distributionsand in a Boolean function Conjecture

Abstract: A method for proving optimality of Gaussian distributions in multiterminal information theory problems is developed. It is then used to establish the capacity region for the vector broadcast channel.

Boaz Patt-Shamir (Tel Aviv University)    (Video)
Randomized proof-labeling schemes

Anup Rao (University of Washington)    (Video: Part 1, Part 2, Part 3)
exponential separation of information and communication, and how to prove lower bounds on disjointness, and the non-negative rank of matrices

Abstract: We discuss how to prove lower bounds on the randomized communication complexity of disjointness, and outline some applications to proving lower bounds on linear programs, boolean circuit depth and data structures. We will also explain why the information cost of a protocol can be much smaller than that the communication complexity of protocols.

Andrei Romashchenko (LIRMM)     (Video)
On Parallels Between Shannon’s and Kolmogorov’s Information Theories (where the parallelism fails and why)

Abstract: Two versions of information theory - the theory of Shannon's entropy and the theory of Kolmgorov complexity - have manifest similarities in the basic definitions as well as in deep technical theorems. The interplay between these two theories often lead to remarkable insights. In the talk we will show different examples of this interplay concerning information inequalities and conditional encoding theorems, and discuss the limits of this parallelism.

Noga Ron-Zewi (IAS-DIMACS)
Towards Optimal Deterministic Coding for Interactive Communication

Abstract: We study efficient, deterministic interactive coding schemes that simulate any interactive protocol both under random and adversarial errors, and can achieve a constant communication rate independent of the protocol length. For channels that flip bits independently with probability $$\epsilon < 1/2$$, our coding scheme achieves a communication rate of $$1 − O(\sqrt{H(\epsilon)})$$ and a failure probability of $$exp(−n/\log n)$$ in length n protocols. Prior to our work, all nontrivial deterministic schemes (either efficient or not) had a rate bounded away from $$1$$. Furthermore, the best failure probability achievable by an efficient deterministic coding scheme with constant rate was only quasi-polynomial, i.e., of the form $$exp( −(\log(n))^{O(1)} )$$ (Braverman, ITCS 2012). For channels in which an adversary controls the noise pattern our coding scheme can tolerate $$\Omega(1/\log n)$$ fraction of errors with rate approaching $$1$$. Once more, all previously known nontrivial deterministic schemes (either efficient or not) in the adversarial setting had a rate bounded away from $$1$$, and no nontrivial efficient deterministic coding schemes were known with any constant rate. Essential to both results is an explicit, efficiently encodable and decodable systematic tree code of length n that has relative distance $$\Omega(1/ \log n)$$ and rate approaching $$1$$, defined over an $$O(\log n)$$-bit alphabet. No nontrivial tree code (either efficient or not) was known to approach rate $$1$$, and no nontrivial distance bound was known for any efficient constant rate tree code. Another central contribution in deriving our coding schemes for random and adversarial errors, is a novel code-concatenation scheme, a notion standard in coding theory which we adapt for the interactive setting.

Leonard Schulman (Caltech)    (Video)
Prospects for tree codes

Alexander Shen (LIRMM, Montpellier)     (Video: Part 1, Part 2)
Different versions of Kolmogorov complexity and a priori probability: a gentle introduction

Abstract: The informal idea – the complexity is the minimal number of bits needed to describe the object – has several different implementations. They are not only technical differences, but all they are there for a reason: we may consider binary strings (both as objects and descriptions) as self-contained objects or as partial information about infinite objects (that fixes some prefix of an infinite sequences). We will try to explain basic results about different versions of complexity and their relation to the notion of the a priori probability.

Alex Sprintson (Texas A&M)    (Video)
Secure and Reliable Codes for Cooperative Data Exchange

Abstract: In many practical settings, a group of clients needs to exchange data over a shared broadcast channel. The goal of cooperative data exchange problem is to find a schedule and an encoding scheme that minimize the total number of transmissions. We focus a wide range of practical settings in which the communication is performed in the presence of unreliable clients as well as in the presence of active and passive adversaries. In such settings, the problem of finding an efficient code is computationally intractable (NP-hard). Accordingly, we present approximation schemes with provable performance guarantees.

We also focus on the design of coding schemes that achieve weak security, i.e., prevent the adversary from being able to obtain information about any individual file in the system. The weak security is a low-overhead light-weight approach for protecting users’ data. In contrast to traditional information-theoretic and cryptographic tools, it does not require an exchange of secure keys and does not reduce the capacity of the network. We conjecture that it is possible to linearly transform a Vandermonde matrix to obtain a weakly secure code over a small field. This conjecture admits a number of reformulations that lead to interesting conjectures in algebraic geometry, abstract algebra and number theory.

Dan Suciu (University of Washington)    (Video)
Communication Cost in Parallel Query Processing

Abstract: We consider the following problem: what is the amount of communication required to compute a query in parallel on p servers, over a large database instance?  We define the Massively Parallel Communication (MPC) model, where the computation proceeds in rounds consisting of local computations followed by a global reshuffling of the data.  Servers have unlimited computational power and are allowed to exchange any data, the only cost parameters are the number of rounds and the maximum amount of communication per server. I will describe tight bounds on the amount of communication for the case of single round algorithms on non-skewed data, and discuss some partial results for multiple rounds and for skewed data.

Joint work with Paul Beame and Paris Koutris

Nikolay Vereshchagin (Moscow State University and Higher School of Economics)
Logical operations and Kolmogorov complexity

Abstract: The conditional complexity $$C(x|y)$$ can be interpreted as the complexity of the task $$x \rightarrow y$$ (“transform $$x$$ to $$y$$”). In this vein on can associate with any expression involving words $$x,y,\ldots$$ and logical connectives AND, OR, $$\rightarrow$$ an algorithmic task and define the Kolmogorov complexity of that task. The complexity of most tasks of this form can be expressed in terms of the complexities of $$x,y,\rightarrow$$ their pairs, triples etc. We will present several interesing examples of tasks with this property and an example of a task that does not have this property.

Cédric Villani (Université de Lyon / Institut Henri Poincaré)    (Video)
Information Theory: A Classical, Partial View

Omri Weinstein (Courant Institute (NYU))   (Video: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6)
Interactive Information Complexity and Applications

Abstract: Communication complexity had a profound impact on nearly every field of theoretical computer science, and is one of the rare methods for proving unconditional lower bounds. Developing new tools in communication complexity is therefore vital for making progress in other computational models, such as circuit complexity, streaming algorithms, property testing, data structures and VLSI chip design.

In this 3-hour mini-course, I will give an introduction to Information Complexity, an interactive analogue of Shannon's information theory, which has recently found many applications in theoretical computer science and in particular for understanding the limitations of parallel computing. Such applications give rise to the fascinating problem of compressing interactive protocols, which will be at the core of this seminar.

I will survey some of the exciting recent progress in the field, and some applications of information complexity to parallel computing and secure computation. No prior knowledge will be assumed in this talk.

Some Information-Theoretic Problems in Theoretical Computer Science

Abstract: In this informal talk, I will present and shortly discuss a few long-standing open problems in theoretical computer science (TCS), including Secret-Sharing, Multi-terminal communication in the “Number-On-Forehead” model, Index-Coding and (time permitting) Private Information Retrieval. All these problems are purely information-theoretic and are very poorly understood. Proving lower bounds (and developing new techniques) for these problems will have far-reaching consequences in TCS.

Marius Zimand (Towson University)    (Video)
Distributed compression- the algorithmic-information-theoretical view

Abstract: Alice and Bob are given two correlated $$n$$-bit strings $$x_1$$ and, respectively, $$x_2$$, which they want to losslessly compress. They can either collaborate by sharing their strings, or work separately. We show that there is no disadvantage in the second scenario, and they can achieve almost optimal compression in the sense of Kolmogorov complexity, provided the decompression algorithm knows the complexity profile of $$x_1$$ and $$x_2$$. Furthermore, compression can be made at any combination of lengths that satisfy some necessary conditions (modulo additive polylog terms). More precisely, there exist probabilistic algorithms $$E_1, E_2$$ and $$D$$ with the following behavior: if $$n_1$$, $$n_2$$ are two integers satisfying $$n_1 + n_2 \geq C(x_1,x_2), n_1 \geq C(x_1 \mid x_2), n_2 \geq C(x_2 \mid x_1)$$, then for $$i \in \{1,2\}$$, $$E_i$$ on input $$x_i$$ and $$n_i$$ outputs a string of length $$n_i + \mathrm{polylog} n$$ such that $$D$$ on input $$E_1(x_1), E_2(x_2)$$ and the complexity profile of $$(x_1, x_2)$$ reconstructs $$(x_1,x_2)$$ with high probability (the complexity profile is $$(C(x_1), C(x_2), C(x_1, x_2))$$, where $$C(x)$$ denotes the plain Kolmogorov complexity of $$x$$). The compression algorithms $$E_1$$ and $$E_2$$ run in polynomial time. The main result is more general, as it deals with the compression of any constant number of correlated strings. It is an analog in the framework of algorithmic information theory of the classic Slepian-Wolf Theorem, a fundamental result in network information theory, in which $$x_1$$ and $$x_2$$ are realizations of two discrete random variables formed by taking $$n$$ independent drawings from a joint distribution. In our result no type of independence is assumed and it still is the case that distributed compression is on a par with centralized compression.