Monday, January 23, 2012

Analyzing Client Interactivity in Streaming Media

This paper, co-authored by researchers from UFMG, studies how users interact with streaming media services. User interactions, such as pausing and jumping forward, have an important impact on the design of realistic synthetic workloads and on the performance of streaming media servers. The results are based on four real workloads: eTeach (educational video), TV/UOL (entertainment video), RADIO/UOL (entertainment audio), and an anonymous online radio referred to as ISP/RADIO. The authors argue that studying large workloads from different domains enables interesting analyses of the similarities and differences in the way users interact with these systems.

Several important results are presented throughout the paper. We summarize them as follows:

Daily and Hourly Load Variations

eTeach has few user requests compared with the other workloads, but delivers large amounts of content on average. The audio workloads also show a high average amount of content delivered, unlike TV/UOL, which is composed mostly of small videos. In eTeach, accesses peak around the middle of the week, while entertainment content shows a more even distribution across days. Within a day, eTeach accesses peak around midday, whereas entertainment content is accessed evenly throughout the day, except in the first hours.

File Access Frequency

The frequency/popularity distribution of files is heavy-tailed for all the workloads studied. However, different distributions fit each workload well. Access frequency follows a single Zipf distribution for TV/UOL and is better described by two Zipf distributions for the other workloads. The authors argue that the two-Zipf pattern may be related to users requesting each file at most once, although this does not explain the pattern completely.
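As a rough illustration of what "follows a Zipf" means here, the sketch below generates expected request counts by popularity rank and recovers the exponent with a least-squares fit in log-log space. The function names, the exponent 0.8, and the file and request counts are assumptions made for this example; they are not values from the paper, and the paper's own fitting procedure may differ.

```python
import math


def zipf_frequencies(n_files, alpha, total_requests):
    """Expected request counts for ranks 1..n_files under a Zipf law.

    The count for rank r is proportional to 1 / r**alpha.
    """
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_files + 1)]
    norm = sum(weights)
    return [total_requests * w / norm for w in weights]


def fit_zipf_exponent(counts):
    """Estimate alpha as minus the slope of log(count) versus log(rank)."""
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two files with non-zero counts")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


# Hypothetical workload: 1000 files, 1,000,000 requests, exponent 0.8.
counts = zipf_frequencies(1000, 0.8, 1_000_000)
estimated_alpha = fit_zipf_exponent(counts)
```

A workload described by two Zipfs would show two roughly straight segments with different slopes on the log-log plot, so one way to explore it would be to run the same fit separately on the head and the tail of the ranking.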

Session Arrival

The best function to fit session inter-arrival times depends on the workload. The authors found that an exponential distribution describes session arrivals well for TV/UOL and RADIO/UOL. For eTeach, however, inter-arrival times are better described by a Weibull or a lognormal distribution, depending on the file size. For the ISP/RADIO workload, a Pareto distribution is the best fit.
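For someone building a synthetic workload generator, these findings translate into choosing a different inter-arrival sampler per workload. The sketch below uses only Python's standard `random` module; every parameter value (means, shapes, scales) is a placeholder chosen for illustration and is not taken from the paper.

```python
import random


def arrival_times(sample_gap, n_sessions, seed=0):
    """Return session arrival timestamps built from sampled inter-arrival gaps."""
    rng = random.Random(seed)
    now = 0.0
    times = []
    for _ in range(n_sessions):
        now += sample_gap(rng)
        times.append(now)
    return times


# Hypothetical samplers, one per distribution family named in the summary.
# All numeric parameters are illustrative assumptions, in seconds.
SAMPLERS = {
    "exponential": lambda rng: rng.expovariate(1.0 / 30.0),   # mean gap 30 s
    "weibull": lambda rng: rng.weibullvariate(30.0, 0.7),     # scale, shape
    "lognormal": lambda rng: rng.lognormvariate(3.0, 1.0),    # mu, sigma
    "pareto": lambda rng: 10.0 * rng.paretovariate(1.5),      # minimum gap 10 s
}

tv_uol_like = arrival_times(SAMPLERS["exponential"], n_sessions=100)
isp_radio_like = arrival_times(SAMPLERS["pareto"], n_sessions=100)
```

Passing the sampler as a function keeps the arrival loop identical across workloads, so swapping the distribution is a one-line change.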

Session Start Positions

Sessions do not always start at the beginning of the file, especially for video content. Nevertheless, start positions are skewed towards the beginning of the file. The authors show that the degree of skew, the workload type, and the file size are correlated.

ON and OFF Times

The distribution that best describes ON and OFF times varies with day, file size, and workload. ON times follow a Pareto distribution (small files) or a Weibull distribution (large files), and OFF times are generally well described by a Weibull distribution.

Session Interactive Requests

Interactive requests are correlated with content type and file size. Large videos, especially educational ones, show highly interactive user behavior. Audio sessions, on the other hand, usually have only one client request. The most popular type of request is pause, but as the file size increases, the number of jump-forward interactions increases as well. Any interaction type is most frequently followed by another interaction of the same type (repetition). Moreover, jump distances are usually under 45 seconds but increase with file size, which has implications for buffering and client prefetching.
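The repetition effect can be checked on a session log by counting transitions between consecutive requests. The sketch below assumes a session is simply a list of request-type strings; the labels `"pause"` and `"jump_forward"` and the sample session are made up for illustration and do not reflect the log format used in the paper.

```python
from collections import Counter


def transition_counts(requests):
    """Count (previous, next) pairs of consecutive request types in a session."""
    return Counter(zip(requests, requests[1:]))


def repetition_rate(requests):
    """Fraction of consecutive request pairs where the type is repeated."""
    pairs = list(zip(requests, requests[1:]))
    if not pairs:
        return 0.0
    repeats = sum(1 for prev, nxt in pairs if prev == nxt)
    return repeats / len(pairs)


# Hypothetical session with six interactive requests.
session = ["pause", "pause", "jump_forward", "jump_forward", "jump_forward", "pause"]
transitions = transition_counts(session)
rate = repetition_rate(session)  # 3 repeated pairs out of 5
```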

Profiles of Client Interactive Behavior

The larger the video, the shorter the portion of the file requested at each interaction. For audio, users frequently request the entire file or stop at an arbitrary position.

Implications for Caching

While segments of popular files are accessed uniformly, less popular files have their accesses skewed towards the beginning of the file. As a consequence, different caching strategies may be more suitable depending on file popularity. In general, considering content popularity is key to achieving good performance, since a large amount of content is accessed only sporadically.
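One way to read this finding is as a popularity-aware policy: cache every segment of popular files, and only a prefix of less popular ones. The sketch below is my own illustration of that idea, not a policy proposed in the paper; the function name, the popularity threshold, and the prefix fraction are all arbitrary assumptions.

```python
def segments_to_cache(n_segments, request_count,
                      popular_threshold=100, prefix_fraction=0.25):
    """Pick which segment indices of a file to keep in the cache.

    Popular files (uniform segment access): cache every segment.
    Less popular files (access skewed to the start): cache only a prefix.
    The threshold and fraction are illustrative, not tuned values.
    """
    if n_segments <= 0:
        return []
    if request_count >= popular_threshold:
        return list(range(n_segments))
    prefix_len = min(n_segments, max(1, int(n_segments * prefix_fraction)))
    return list(range(prefix_len))


popular = segments_to_cache(n_segments=8, request_count=500)  # all 8 segments
unpopular = segments_to_cache(n_segments=8, request_count=3)  # first 2 segments
```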

This is not my favorite type of paper, but I think I need to be able to read papers that are outside my comfort zone. The paper is well written and the ideas are very clear. At some points, I would have liked a more in-depth analysis of the results and their implications.

Link: www.iw3c2.org/WWW2004/docs/1p534.pdf
