Selected Projects

Delay Reduction in Clouds, Communications, and Complex Systems

In modern computer systems, long-running jobs are divided into small tasks and executed on multiple servers. Empirical observations in practical cloud systems suggest that the task service times are highly random and the delay to complete a job (including the waiting time in the queue and the service time to complete all tasks of the job) is bottlenecked by the slowest task. One approach to tame the long job delay is to replicate tasks over multiple servers such that one of the copies is completed early. Google has reported that task replications can reduce the job delay significantly, e.g., in one study replications reduce the 99.9%-th percentile delay from 1,800 ms to 74 ms. However, in many other systems, replications actually increase the delay, because redundant replications can increase the system load. How to optimize the scheduling decisions (e.g., when to replicate, which servers to replicate on, and which job to serve first) to minimize the job delay is an important and yet largely open problem.

In this project, we will investigate delay-optimal job scheduling and task replications in multi-server systems. We have developed low-complexity scheduling policies that make scheduling decisions based on the delay metric to be optimized and the type of service time distribution. For arbitrary number, sizes, arrival times, and due times (soft deadlines) of the jobs, these scheduling policies are proven to be delay-optimal or near delay-optimal in distribution (i.e., in stochastic ordering) among all non-preemptive and causal policies. These results can characterize (near) delay optimality for arbitrary arrival processes, for both transient- and steady-state systems, and for minimizing several classes of delay metrics, including average delay, maximum delay, jitter probability, and maximum lateness. Novel sample-path ordering and coupling methods are developed to prove these general results. Our simulations suggest that it is possible to achieve 10 times reduction in average delay and 1,000 times reduction in the jitter probability at moderate loads, and significantly greater reductions at heavier loads. We will continue to investigate delay reduction in communication networks, cloud storage systems, and distributed machine learning.

My talk on delay minimization: [slides]

  1. Designing Low-Complexity Heavy-Traffic Delay-Optimal Load Balancing Schemes: Theory to Algorithms
    Xingyu Zhou, Fei Wu, Jian Tan, Yin Sun, Ness B. Shroff, ACM Sigmetrics, 2018. [Technical Report]
  2. Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), 2017.

  3. On Delay-Optimal Scheduling in Queueing Systems with Replications
    Yin Sun, C. Emre Koksal, and Ness B. Shroff, Technical Report, March 2016. (Last revised 2/2/17)

  4. Near Delay-Optimal Scheduling of Batch Jobs in Multi-Server Systems
    Yin Sun, C. Emre Koksal, and Ness B. Shroff, Technical Report, Jan 2017. (Last revised 2/2/17)

  5. Provably Delay Efficient Data Retrieving in Storage Clouds
    Yin Sun, Zizhan Zheng, C. Emre Koksal, Kyu-Han Kim, and Ness B. Shroff
    IEEE INFOCOM, 2015. [Technical Report]

  6. When Queueing Meets Coding: Optimal-Latency Data Retrieving Scheme in Storage Clouds
    Shengbo Chen, Yin Sun, Ulas C. Kozat, Longbo Huang, Prasun Sinha, Guanfeng Liang, Xin Liu, and Ness B. Shroff
    IEEE INFOCOM, 2014.

Age-of-Information: Optimizing the Freshness of Real-Time Data

Real-time data is critical for information-update and analytics systems (e.g., crowdsourcing, ads bidding, social networks, and online trading) and for real-time monitoring and control systems (e.g., sensor networks, smart grid, airplane/vehicular control, robotics, Internet of Things, and cyber-physical systems). The "Age-of-Information", or simply "age", is defined as the time difference between the generation time and usage time of the data. This concept was originally proposed in the 1990s, and has recently seen renewed interest because of its applicability to real-time applications.

Our research on Age-of-Information consists of two directions: In one direction, we have developed optimal transmission strategies to minimize the age of information. One natural policy is the zero-wait policy, i.e., the source submits a new update once the previous update is delivered and the channel becomes free, which achieves the maximum throughput and the minimum delay. Surprisingly, this zero-wait policy does not always minimize the age. We have developed the first optimal policy to minimize the age-of-information, and provided a sufficient and necessary condition to characterize when the zero-wait policy is optimal. In the other direction, we investigated how to best sample a real-time signal to be transmitted on a channel. Here, a sensor takes samples from a Gauss-Markov signal and forwards the samples to an estimator via a channel; the estimator provides the best-guess of the current signal value using causally received samples. This "Real-time Sampling" problem is quite different from the classic Nyquist-Shannon uniform sampling because we do not have access to future samples of the signal. We have developed optimal sampling strategy for minimizing the mean square estimation error. This optimal sampling policy is shown to be a threshold policy and the optimal threshold is determined by the signal, sampler, and channel in a simple form. When the sampling times are independent of the observed signal, this problem reduces to an Age-of-Information optimization problem that has been solved recently. This reveals an interesting connection between Age-of-Information and signal processing.

My talk on Age-of-Information: [slides]

A collection of recent papers on Age-of-Information is available here.


  1. Information Aging through Queues: A Mutual Information Perspective
    Yin Sun and Benjamin Cyr, IEEE SPAWC Conference -- Special Session on the Age of Information, 2018. [Technical Report]

  2. Achieving the Age-Energy Tradeoff with a Finite-Battery Energy Harvesting Source
    Baran Tan Bacinoglu, Yin Sun, Elif Uysal-Biyikoglu, and Volkan Mutlu
    IEEE ISIT, 2018.

  3. A Dynamic Jamming Game for Real-Time Status Updates
    Yuanzhang Xiao, Yin Sun
    IEEE INFOCOM Workshops - the 1st Workshop on the Age of Information (AoI Workshop), 2018.

  4. Age-Optimal Updates of Multiple Information Flows
    Yin Sun, Elif Uysal-Biyikoglu, and Sastry Kompella, IEEE INFOCOM Workshops - the 1st Workshop on the Age of Information (AoI Workshop), 2018. [Technical Report]

  5. The Age of Information in Multihop Networks
    Ahmed M. Bedewy, Yin Sun, and Ness B. Shroff, submitted to IEEE Transactions on Information Theory, 2017.

  6. Minimizing the Age of Information through Queues
    Ahmed M. Bedewy, Yin Sun, and Ness B. Shroff, submitted to IEEE Transactions on Information Theory, 2017.

  7. Update or Wait: How to Keep Your Data Fresh
    Yin Sun, Elif Uysal-Biyikoglu, Roy D. Yates, C. Emre Koksal, and Ness B. Shroff, IEEE Transactions on Information Theory, vol. 63, no. 11, Nov. 2017.

  8. Remote Estimation of the Wiener Process over a Channel with Random Delay
    Yin Sun, Yury Polyanskiy, and Elif Uysal-Biyikoglu, IEEE ISIT, 2017. [Technical report]

  9. Age-Optimal Information Updates in Multihop Networks
    Ahmed M. Bedewy, Yin Sun, and Ness B. Shroff, IEEE ISIT, 2017. [Technical report]

  10. Optimizing Data Freshness, Throughput, and Delay in Multi-Server Information-Update Systems
    Ahmed M. Bedewy, Yin Sun, and Ness B. Shroff, IEEE ISIT, 2016.

  11. Update or Wait: How to Keep Your Data Fresh
    Yin Sun, Elif Uysal-Biyikoglu, Roy D. Yates, C. Emre Koksal, and Ness B. Shroff, IEEE INFOCOM, 2016.

Practical and Efficient Designs for LTE-Advanced and 5G Cellular Networks

We collaborate with companies (like HP and Huawei) to work on efficient cellular designs that fit with the constraints in real systems, such as limited backhaul, pilot cost, and feedback delay. In a recent project, we designed a coordinated power control policy for uplink LTE-Advanced systems, called Checks and Balances (C&B), which checks one user's received SNR and its generated INR to other users, and balances the two. Compared with traditional CoMP solutions, C&B has several advantages: it (i) only needs large-scale channel gain and does not need small-scale channel coefficients, (ii) can be implemented in software without changing the physical-layer hardware, (iii) allows for fully distributed implementation, and (iv) does not need backhaul support or extra pilots. To evaluate C&B, we built a system-level simulation platform which contains tens of base stations and hundreds of users, which is carefully calibrated with Huawei. We found that C&B achieves higher average throughput and cell-edge throughput simultaneously, compared to several widely-used power control policies.

We are currently investigating channel estimation, Angle-of-Arrival (AoA) tracking, and link adaptation for massive MIMO and millimeter wave systems.
  1. How to Mobilize mmWave: A Joint Beam and Channel Tracking Approach
    Jiahui Li, Yin Sun, Limin Xiao, Shidong Zhou, and Ashutosh Sabharwal, IEEE ICASSP 2018.

  2. Super Fast Beam Tracking in Phased Antenna Arrays
    Jiahui Li, Yin Sun, Limin Xiao, Shidong Zhou, and C. Emre Koksal, submitted to IEEE Transactions on Signal Processing, 2017.

  3. Analog Beam Tracking in Linear Antenna Arrays: Convergence, Optimality, and Performance
    Jiahui Li, Yin Sun, Limin Xiao, Shidong Zhou, and C. Emre Koksal, Asilomar Conference, 2017. [Technical Report]

  4. Checks and Balances: A Low-Complexity High-Gain Uplink Power Controller for CoMP
    Fangzhou Chen*, Yin Sun*, Yiping Qin, and C. Emre Koksal, IEEE Globecom, 2016. *Co-primary authors.

Research Sponsored by

I gratefully acknowledge ongoing support from ONR (N00014-17-1-2417) for my research group.