Chinmoy Dutta | publications

When hashing met matching: Efficient spatio-temporal search for ride matches Chinmoy Dutta In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021. [Abs]
Carpooling, or sharing a ride with other passengers, holds immense potential for urban transportation. However, finding ride matches in real-time at urban scale is a difficult combinatorial optimization problem and mostly heuristic approaches are applied. In this work, we introduce a principled approach to ride matching by constructing representations for rides that capture their spatio-temporal aspects, and defining a similarity metric for these representations that expresses matching utility. This lets us mathematically model the ride matching problem as that of near neighbor search (NNS) and devise a novel efficient spatio-temporal search algorithm for it based on the theory of locality sensitive hashing (LSH). The proposed algorithm can find k near-optimal potential matches for every ride from a match pool of n rides in time O(n^(1 + ρ) (k + log n) log k) and space O(n^(1 + ρ) log k) for a small ρ < 1. Our algorithm enjoys several practically useful properties and extension possibilities. Experiments with large real-world datasets show that our algorithm consistently outperforms state-of-the-art heuristic methods, thereby proving its practical applicability.

System and method for managing dynamic transportation networks using simulated future scenarios Chinmoy Dutta US Patent 10 769 558, 2020.
Method and system for identifying users across mobile and desktop devices Chinmoy Dutta, Santosh Kancha, Junjun Li, Wanchen Lu, Milind Mahajan, Sandeep Pandey, Xiaochuan Qin, Ameet Ranadive, Vibhor Rastogi, Shariq Rizvi, Abhishek Shrivastava, Yimin Wu, Lei Zhang and Ke Zhou US Patent 10 861 055, 2020. (Extension of US Patent 10 423 985.)

Edge weighted online windowed matching Itai Ashlagi, Maximilien Burq, Chinmoy Dutta, Patrick Jaillet, Amin Saberi and Chris Sholley In Proceedings of the ACM Conference on Economics and Computation (EC), 2019. [Abs] [PDF] [Proceedings] [arXiv]
Motivated by applications from ride-sharing and kidney exchange, we study the problem of matching agents who arrive at a marketplace over time and leave after d time periods. Agents can only be matched while they are present in the marketplace. Each pair of agents can yield a different match value, and the planner’s goal is to maximize the total value over a finite time horizon. First we study the case in which vertices arrive in an adversarial order. We provide a randomized (1/4)-competitive algorithm building on a result by Feldman et al. (WINE 2009) and Lehmann et al. (Games Econ Behav 2006). We extend the model to the case in which departure times are drawn independently from a distribution with non-decreasing hazard rate, for which we establish a (1/8)-competitive algorithm. When the arrival order is chosen uniformly at random, we show that a batching algorithm, which computes a maximum-weighted matching every (d+1) periods, is 0.279-competitive.
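
The batching idea in the abstract above can be illustrated with a minimal sketch. It assumes one agent arrives per period, so d+1 consecutive arrivals are all present together in the last period of their batch. The function names and the exhaustive matching routine are hypothetical illustrations, not the paper's implementation, and the exhaustive search is only practical for small batches.

```python
def max_weight_matching(agents, weight):
    """Exhaustive maximum-weight matching over a small tuple of agents.

    Assumption: weight(a, b) returns a non-negative match value.
    Returns (total_value, list_of_pairs).
    """
    def solve(remaining):
        if len(remaining) < 2:
            return 0.0, []
        first, rest = remaining[0], remaining[1:]
        best = solve(rest)  # option: leave `first` unmatched
        for i, other in enumerate(rest):
            value, pairs = solve(rest[:i] + rest[i + 1:])
            candidate = value + weight(first, other)
            if candidate > best[0]:
                best = (candidate, [(first, other)] + pairs)
        return best

    return solve(tuple(agents))


def batched_matching(arrivals, d, weight):
    """Match agents in consecutive batches of d + 1 arrivals.

    Assumption: arrivals[t] is the single agent arriving in period t and
    staying for d further periods, so each batch coexists at its last period.
    """
    total, matches = 0.0, []
    for start in range(0, len(arrivals), d + 1):
        batch = arrivals[start:start + d + 1]
        value, pairs = max_weight_matching(batch, weight)
        total += value
        matches.extend(pairs)
    return total, matches
```

For example, `batched_matching([0, 1, 2, 3], 1, lambda a, b: abs(a - b))` splits the arrivals into the batches `[0, 1]` and `[2, 3]` and matches within each batch.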
Assigning rides based on probability of provider acceptance Chinmoy Dutta, Adam Greenhall, Christopher Sholley, Jimmy Young and Jatin Chopra US Patent App 15/859 111, 2019.
Method and system for identifying users across mobile and desktop devices Chinmoy Dutta, Santosh Kancha, Junjun Li, Wanchen Lu, Milind Mahajan, Sandeep Pandey, Xiaochuan Qin, Ameet Ranadive, Vibhor Rastogi, Shariq Rizvi, Abhishek Shrivastava, Yimin Wu, Lei Zhang and Ke Zhou US Patent 10 423 985, 2019.

Online matching on ride-sharing platforms Chinmoy Dutta and Chris Sholley Presented at the Marketplace Innovation Workshop (MIW), 2017.

Unidirectional lookalike campaigns in a messaging platform Chinmoy Dutta, Junjun Li, Vibhor Rastogi, Wanchen Lu, Sandeep Pandey and Utkarsh Srivastava US Patent 9 361 322, 2016.

Coalescing-branching random walks on graphs Chinmoy Dutta, Gopal Pandurangan, Rajmohan Rajaraman and Scott Roche ACM Transactions on Parallel Computing (TOPC), 2015. [Abs] [PDF] [Journal]
We study a distributed randomized information propagation mechanism in networks we call the coalescing-branching random walk (cobra walk, for short). A cobra walk is a generalization of the well-studied “standard” random walk, and is useful in modeling and understanding the Susceptible-Infected-Susceptible (SIS)-type of epidemic processes in networks. It can also be helpful in performing light-weight information dissemination in resource-constrained networks. A cobra walk is parameterized by a branching factor k. The process starts from an arbitrary vertex, which is labeled active for step 1. In each step of a cobra walk, each active vertex chooses k random neighbors to become active for the next step (“branching”). A vertex is active for step t + 1 only if it is chosen by an active vertex in step t (“coalescing”). This results in a stochastic process in the underlying network with properties that are quite different from both the standard random walk (which is equivalent to the cobra walk with branching factor 1) as well as other gossip-based rumor spreading mechanisms. We focus on the cover time of the cobra walk, which is the number of steps for the walk to reach all the vertices, and derive almost-tight bounds for various graph classes. We show an O(log^2 n) high probability bound for the cover time of cobra walks on expanders, if either the expansion factor or the branching factor is sufficiently large; we also obtain an O(log n) high probability bound for the partial cover time, which is the number of steps needed for the walk to reach at least a constant fraction of the vertices. We also show that the cover time of the cobra walk is, with high probability, O(n log n) on any n-vertex tree for k ≥ 2, Õ(n^(1/d)) on a d-dimensional grid for k ≥ 2, and O(log n) on the complete graph.
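
The branching and coalescing steps described above can be simulated in a few lines. The following is a minimal sketch, assuming that each active vertex draws its k neighbors independently and uniformly with replacement, and that the cover time is counted as the number of branching rounds after the start vertex is activated. The function name, the adjacency-list format, and the `max_steps` guard are hypothetical choices made for illustration.

```python
import random


def cobra_cover_time(adj, start, k, rng, max_steps=10**6):
    """Simulate a cobra walk; return the rounds needed to activate every vertex.

    adj: dict mapping each vertex to a non-empty list of its neighbors.
    Assumption: neighbors are sampled with replacement.
    """
    active = {start}
    visited = {start}
    steps = 0
    while len(visited) < len(adj):
        if steps >= max_steps:
            raise RuntimeError("cover not reached within max_steps")
        next_active = set()
        for v in active:
            for _ in range(k):  # branching: k random picks per active vertex
                next_active.add(rng.choice(adj[v]))
        # coalescing: a vertex chosen several times is simply active once
        active = next_active
        visited |= active
        steps += 1
    return steps
```

With k = 1 the active set always contains a single vertex, so the simulation reduces to the standard random walk.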

On the complexity of information spreading in dynamic networks Chinmoy Dutta, Gopal Pandurangan, Rajmohan Rajaraman, Zhifeng Sun and Emanuele Viola In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), 2013. [Abs] [PDF] [Proceedings]
We study how to spread k tokens of information to every node on an n-node dynamic network, the edges of which are changing at each round. This basic gossip problem can be completed in O(n + k) rounds in any static network, and determining its complexity in dynamic networks is central to understanding the algorithmic limits and capabilities of various dynamic network models. Our focus is on token-forwarding algorithms, which do not manipulate tokens in any way other than storing, copying and forwarding them. We first consider the strongly adaptive adversary model where in each round, each node first chooses a token to broadcast to all its neighbors (without knowing who they are), and then an adversary chooses an arbitrary connected communication network for that round with the knowledge of the tokens chosen by each node. We show that Ω(nk/log n + n) rounds are needed for any randomized (centralized or distributed) token-forwarding algorithm to disseminate the k tokens, thus resolving an open problem raised in [KLO10]. The bound applies to a wide class of initial token distributions, including those in which each token is held by exactly one node and well-mixed ones in which each node has each token independently with a constant probability. Our result for the strongly adaptive adversary model motivates us to study the weakly adaptive adversary model where in each round, the adversary is required to lay down the network first, and then each node sends a possibly distinct token to each of its neighbors. We propose a simple randomized distributed algorithm where in each round, along every edge (u, v), a token sampled uniformly at random from the symmetric difference of the sets of tokens held by node u and node v is exchanged. 
We prove that starting from any well-mixed distribution of tokens where each node has each token independently with a constant probability, this algorithm solves the k-gossip problem in O((n + k) log n log k) rounds with high probability over the initial token distribution and the randomness of the protocol. We then show how the above uniform sampling problem can be solved using Õ(log n) bits of communication, making the overall algorithm communication-efficient. We next present a centralized algorithm that solves the gossip problem for every initial distribution in O((n + k) log^2 n) rounds in the offline setting where the entire sequence of communication networks is known to the algorithm in advance. Finally, we present an O(n min{k, √(k log n)})-round centralized offline algorithm in which each node can only broadcast a single token to all of its neighbors in each round.
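
One round of the symmetric-difference exchange described in this abstract can be sketched as follows. This is an illustration under stated assumptions: the round is synchronous (every edge samples from the token sets held at the start of the round), the token sets are compared directly rather than through the Õ(log n)-bit sampling protocol the abstract mentions, and the function name and data layout are hypothetical.

```python
import random


def exchange_round(tokens, edges, rng):
    """Apply one synchronous round of symmetric-difference token exchange.

    tokens: dict mapping node -> set of sortable tokens (modified in place).
    edges:  list of (u, v) pairs forming this round's network.
    """
    # Snapshot so that every edge sees the state at the start of the round.
    snapshot = {node: set(held) for node, held in tokens.items()}
    for u, v in edges:
        difference = snapshot[u] ^ snapshot[v]
        if not difference:
            continue  # both endpoints already hold the same tokens
        token = rng.choice(sorted(difference))
        # Exactly one endpoint lacks the token; adding it to both is harmless.
        tokens[u].add(token)
        tokens[v].add(token)
```

On a single edge whose endpoints hold `{1, 2}` and `{3}`, each round shrinks the symmetric difference by one token, so three rounds suffice for both nodes to hold all three tokens.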
Coalescing-branching random walks on graphs Chinmoy Dutta, Gopal Pandurangan, Rajmohan Rajaraman and Scott Roche In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 2013. [Abs] [PDF] [Proceedings]
We study a distributed randomized information propagation mechanism in networks we call the coalescing-branching random walk (cobra walk, for short). A cobra walk is a generalization of the well-studied "standard" random walk, and is useful in modeling and understanding the Susceptible-Infected-Susceptible (SIS)-type of epidemic processes in networks. It can also be helpful in performing light-weight information dissemination in resource-constrained networks. A cobra walk is parameterized by a branching factor k. The process starts from an arbitrary node, which is labeled active for step 1. (For instance, this could be a node that has a piece of data, rumor, or a virus.) In each step of a cobra walk, each active node chooses k random neighbors to become active for the next step ("branching"). A node is active for step t + 1 only if it is chosen by an active node in step t ("coalescing"). This results in a stochastic process in the underlying network with properties that are quite different from both the standard random walk (which is equivalent to the cobra walk with branching factor 1) as well as other gossip-based rumor spreading mechanisms. We focus on the cover time of the cobra walk, which is the number of steps for the walk to reach all the nodes, and derive almost-tight bounds for various graph classes. Our main technical result is an O(log^2 n) high probability bound for the cover time of cobra walks on expanders, if either the expansion factor or the branching factor is sufficiently large; we also obtain an O(log n) high probability bound for the partial cover time, which is the number of steps needed for the walk to reach at least a constant fraction of the nodes. We show that the cobra walk takes O(n log n) steps on any n-node tree for k ≥ 2, and Õ(n^(1/d)) steps on a d-dimensional grid for k ≥ 2, with high probability.

More on a problem of Zarankiewicz Chinmoy Dutta and Jaikumar Radhakrishnan In Proceedings of the International Symposium on Algorithms and Computation (ISAAC), 2012. [Abs] [Proceedings]
We show tight necessary and sufficient conditions on the sizes of small bipartite graphs whose union is a larger bipartite graph that has no large bipartite independent set. Our main result is a common generalization of two classical results in graph theory: the theorem of Kővári, Sós and Turán on the minimum number of edges in a bipartite graph that has no large independent set, and the theorem of Hansel (also Katona and Szemerédi, Krichevskii) on the sum of the sizes of bipartite graphs that can be used to construct a graph (not necessarily bipartite) that has no large independent set. Our results unify the underlying combinatorial principles developed in the proof of tight lower bounds for depth-two superconcentrators.
Split and join: Strong partitions and universal steiner trees for graphs Costas Busch, Chinmoy Dutta, Jaikumar Radhakrishnan, Rajmohan Rajaraman and S. Srinivasagopalan In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), 2012. [Abs] [Proceedings]
We study the problem of constructing universal Steiner trees for undirected graphs. Given a graph G and a root node r, we seek a single spanning tree T of minimum stretch, where the stretch of T is defined to be the maximum ratio, over all terminal sets X, of the cost of the minimal sub-tree T_X of T that connects X to r to the cost of an optimal Steiner tree connecting X to r in G. Universal Steiner trees (USTs) are important for data aggregation problems where computing the Steiner tree from scratch for every input instance of terminals is costly, as for example in low energy sensor network applications. We provide a polynomial time UST construction for general graphs with 2^O(√(log n))-stretch. We also give a polynomial time polylogarithmic-stretch construction for minor-free graphs. One basic building block of our algorithms is a hierarchy of graph partitions, each of which guarantees small strong diameter for each cluster and bounded neighbourhood intersections for each node. We show close connections between the problems of constructing USTs and building such graph partitions. Our construction of partition hierarchies for general graphs is based on an iterative cluster merging procedure, while the one for minor-free graphs is based on a separator theorem for such graphs and the solution to a cluster aggregation problem that may be of independent interest even for general graphs. To our knowledge, this is the first subpolynomial-stretch (o(n^ε) for any ε > 0) UST construction for general graphs, and the first polylogarithmic-stretch UST construction for minor-free graphs.

Selecting keywords representative of a document Amit Nanavati and Chinmoy Dutta US Patent 7 856 435, 2010.

Lower bounds for noisy wireless networks using sampling algorithm Chinmoy Dutta and Jaikumar Radhakrishnan In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), 2008. [Abs] [Proceedings]
We show a tight lower bound of Ω(N log log N) on the number of transmissions required to compute several functions (including the parity function and the majority function) in a network of N randomly placed sensors, communicating using local transmissions, and operating with power near the connectivity threshold. This result considerably simplifies and strengthens an earlier result of Dutta, Kanoria, Manjunath and Radhakrishnan (SODA 08) that such networks cannot compute the parity function reliably with significantly fewer than N log log N transmissions, thereby showing that the protocol with O(N log log N) transmissions due to Ying, Srikant and Dullerud (WiOpt 06) is optimal. We also observe that all the lower bounds shown by Evans and Pippenger (SIAM J. on Computing, 1999) on the average noisy decision tree complexity for several functions can be derived using our technique simply and in a unified way.
A tight lower bound for parity in noisy communication networks Chinmoy Dutta, Yashodhan Kanoria, D Manjunath and Jaikumar Radhakrishnan In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), 2008. [Abs] [PDF] [Proceedings]
We show a tight lower bound of Ω(N log log N) on the number of transmissions required to compute the parity of N bits (with constant error) in a network of N randomly placed sensors, communicating using local transmissions, and operating with power near the connectivity threshold. This result settles a question left open by Ying, Srikant and Dullerud (WiOpt 06), who showed how the sum of all N bits can be computed using O(N log log N) transmissions. Earlier works on lower bounds for communication networks worked with the full broadcast model without using the fact that the communication in real networks is local, determined by the power of the transmitters. In fact, in full broadcast networks parity can be computed using O(N) transmissions. To obtain our lower bound we employ techniques developed by Goyal, Kindler and Saks (FOCS 05), who showed lower bounds in the full broadcast model by reducing the problem to a model of noisy decision trees. However, in order to capture the limited range of transmissions in real sensor networks, we define and work with a localized version of noisy decision trees. Our lower bound is obtained by exploiting special properties of parity computations in such decision trees.

Tradeoffs in depth-two superconcentrators Chinmoy Dutta and Jaikumar Radhakrishnan In Proceedings of the Symposium on Theoretical Aspects of Computer Science (STACS), 2006. [Abs] [Proceedings]
An N-superconcentrator is a directed graph with N input vertices and N output vertices and some intermediate vertices, such that for k = 1, 2, ..., N, between any set of k input vertices and any set of k output vertices, there are k vertex disjoint paths. In a depth-two N-superconcentrator each edge either connects an input vertex to an intermediate vertex or an intermediate vertex to an output vertex. We consider tradeoffs between the number of edges incident on the input vertices and the number of edges incident on the output vertices in a depth-two N-superconcentrator. For an N-superconcentrator G, let a(G) be the average degree of the input vertices and b(G) be the average degree of the output vertices. Assume that b(G) ≥ a(G). We show that there is a constant k_1 > 0 such that a(G) log(2b(G)/a(G)) log b(G) ≥ k_1 log^2 N.
Ontology-based term disambiguation Amit Nanavati and Chinmoy Dutta US Patent App 10/955 255, 2006.

Enhancing bandwidth utilization in Bluetooth using optimal SAR Ashish Agarwal, Chinmoy Dutta and Dheeraj Sanghi In Proceedings of the International Conference on Computer Communication (ICCC), 2002. [Abs] [Proceedings]
This paper focuses on Segmentation and Reassembly (SAR) policies in Bluetooth. Bluetooth baseband packets are small and have large overheads, which makes the choice of packet size very important. This decision is further complicated by slot reservations for SCO connections and scheduling decisions. The goal of any policy is to provide maximum possible slot utilization and throughput keeping in mind the availability of data at the master and slaves. In this paper, we study two existing SAR policies: Best Fit and Optimum Slot Utilization. We show that each of them may perform poorly in some cases. We propose a new SAR policy, which we call Optimal SAR, and present simulation results to show that it performs better than the other two.