Two Papers about Datacenter Networking

[1] Data Center TCP (DCTCP). Mohammad Alizadeh et al. In SIGCOMM ’10, Aug.–Sep. 2010.

[2] NOX: Towards an Operating System for Networks. Natasha Gude et al. ACM SIGCOMM CCR, 38(3):105–110, Jul. 2008.

Both papers are in the area of networking, but they differ greatly: the first paper designs a protocol, whereas the second proposes a system architecture. In addition, the network protocol introduced in paper [1] is designed for a particular environment, in contrast to the network operating system in paper [2], which takes a more general approach. Paper [1] is also much more detailed than paper [2], providing a larger amount of specific evidence.

The authors of paper [1] make an important observation: the existing TCP protocol does not efficiently handle the mix of flows in a data center network, which requires low latency for short flows (foreground traffic) and high utilization for long flows (background traffic). Naive processing of these flows reduces overall performance, since high-bandwidth flows dominate the buffers at the switches, causing the switches to drop packets from latency-sensitive flows. Based on this observation, they propose a variant of TCP, Data Center TCP (DCTCP). Their design is simple: using the Explicit Congestion Notification (ECN) feature that is already part of TCP, switches mark packets once queue occupancy exceeds a threshold K, and senders react to the fraction of marked packets, keeping buffer occupancy low. Through various evaluations, they show that this strategy achieves the same or better throughput than TCP while keeping buffer occupancy 90% lower. By reducing the packet drop rate, DCTCP allows applications to handle background traffic faster without affecting foreground traffic.
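The sender-side reaction described in the paper can be sketched in a few lines: the sender maintains an estimate alpha of the fraction of marked packets and cuts its window in proportion to alpha rather than halving it. The window and gain values below are illustrative defaults, not measurements from the paper.

```python
class DctcpSender:
    """Minimal sketch of DCTCP's sender-side congestion control."""

    def __init__(self, cwnd=10.0, g=1 / 16):
        self.cwnd = cwnd    # congestion window, in packets (illustrative start)
        self.alpha = 0.0    # running estimate of the fraction of marked packets
        self.g = g          # EWMA gain for the alpha update

    def on_ack_window(self, acked, marked):
        """Process one window of ACKs: `marked` of `acked` carried ECN echoes."""
        frac = marked / acked if acked else 0.0
        # alpha <- (1 - g) * alpha + g * F, updated roughly once per RTT
        self.alpha = (1 - self.g) * self.alpha + self.g * frac
        if marked:
            # Cut the window in proportion to the extent of congestion,
            # instead of TCP's unconditional halving.
            self.cwnd = max(1.0, self.cwnd * (1 - self.alpha / 2))
        else:
            self.cwnd += 1.0  # ordinary additive increase
```

Because the cut is proportional to alpha, a lightly congested sender loses only a little window, which is how DCTCP keeps queues short without sacrificing throughput.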

Paper [2] claims that it is currently difficult to manage a network through low-level configuration of individual components, and therefore proposes a centralized network operating system, NOX, that provides a programming interface to observe and control a network. In its architecture, NOX keeps a single network view in a database and runs a separate controller server that manages all the switches. Programmers can treat the network as if it were a single machine while writing programs in a high-level language (Python). Through simple example applications, the authors illustrate NOX's programming model; one particular application, Ethane, takes about ten times more lines of code to implement without NOX than with it.
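To make the programming model concrete, here is a toy controller with a learning-switch application in the NOX style. The names (`register_handler`, `packet_in`, the return values) are hypothetical stand-ins, not the actual NOX API; the point is only that one Python program observes events from all switches against a single network view.

```python
class Controller:
    """Toy centralized controller holding a single network view (not real NOX)."""

    def __init__(self):
        self.mac_to_port = {}   # network view: (switch, host MAC) -> port
        self.handlers = {}

    def register_handler(self, event, fn):
        self.handlers[event] = fn

    def packet_in(self, switch, in_port, src, dst):
        # Dispatch a switch event to the registered application code.
        return self.handlers["packet_in"](self, switch, in_port, src, dst)


def learning_switch(ctrl, switch, in_port, src, dst):
    """Learn where `src` lives, then forward using the global view."""
    ctrl.mac_to_port[(switch, src)] = in_port
    out = ctrl.mac_to_port.get((switch, dst))
    return ("forward", out) if out is not None else ("flood", None)
```

The application never touches per-switch configuration; all state lives in the controller's view, which is the productivity argument the paper makes with Ethane.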

Two Papers about the Underground Economy

[1] Click Trajectories: End-to-End Analysis of the Spam Value Chain. Kirill Levchenko et al. In IEEE S&P ’11, May 2011.

[2] Re: CAPTCHAs – Understanding CAPTCHA-Solving in an Economic Context. Marti Motoyama et al. In USENIX Security ’10, Aug. 2010.

Most analysis papers in software security focus on technical aspects. It is interesting that the analyses in the two reviewed papers focus mainly on the “market” side. Both perform realistic, large-scale analyses of the underground economy, related to spam-based advertising and CAPTCHA-solving services, respectively. Both papers point out that enterprises invest in these services because the revenues they create are larger than their costs. Strictly speaking, paper [1] focuses more purely on business facets than paper [2].

Spam-based advertising consists of multiple technical and business components. The authors of paper [1] claim that although each of the components has been studied deeply, the relationships between them have not yet been well understood. Understanding the characteristics of the “end-to-end” spam value chain can help address spam problems. To that end, their analyses include: (1) identifying the sites advertising popular classes of goods, (2) characterizing the business relationships between spam operators (affiliate programs) and the enterprises who want the service, and (3) examining the relationships between merchant bank affiliations and spam businesses. The data for the analyses are collected through full-message spam feeds, parsing URLs in web pages, running a farm of web crawlers, and (most interestingly) making purchases from affiliate programs. From the results of their study, the authors assert that intervening at the payment tier of the spam value chain is a reasonable way to reduce spam problems.

In comparison, paper [2] describes the current state of CAPTCHA-solving techniques and the business dynamics of the service providers. CAPTCHA solving currently comes in two forms: (1) automated solving by image processing, and (2) cheap human labor that solves CAPTCHAs manually. Although automated software solvers such as Xrumer and reCaptchaOCR have emerged, the authors reveal that most services are based on human labor. Their analyses therefore focus on the economics of the services: the accuracy, speed, and wages of human CAPTCHA solvers. As defensive mechanisms keep improving, solving CAPTCHAs is likely to become less efficient, and thus more costly, over time.

Two Papers about Programming for Datacenters

Papers review date: 11/18/2011.

[1] BOOM Analytics: Exploring Data-Centric, Declarative Programming for the Cloud. Peter Alvaro et al. In EuroSys ’10, Apr. 2010.

[2] Nectar: Automatic Management of Data and Computation in Datacenters. Pradeep Kumar Gunda et al. In OSDI ’10, Oct. 2010.

The two papers ease specific tasks related to data centers. More specifically, they solve different problems – (1) the complexity of writing and analyzing distributed software and (2) inefficient manual management of data and computation – but in general they both improve the productivity of data center computing. The more recent second paper cites the older first paper.

The major claim that paper [1] makes is that a data-centric approach with declarative programming languages is well suited to building distributed software. To support this claim, the authors reimplement the internals of existing data-parallel computing systems, such as MapReduce and HDFS, and build a separate layer for a high-level language, called Overlog, on top of the stack. They point out that with the existing interfaces, programmers suffer from the tedious work of handling concurrent computation and communication. It is interesting that one of their approaches to showing that their system (BOOM) improves productivity is to report the person-hours they spent implementing distributed software using their own work (i.e., BOOM). In the paper, they report that they could significantly reduce the number of lines of code needed to implement the distributed software, with only minor overhead. They also admit that while Overlog gives conciseness, some of its syntax reduces the readability of the code.
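As a rough illustration of the declarative style (not actual Overlog, whose syntax and semantics are richer), a Datalog-like rule such as `path(X,Z) :- path(X,Y), link(Y,Z)` replaces explicit loops and message-handling code with a rule evaluated to a fixpoint. A minimal sketch of such evaluation:

```python
def transitive_closure(links):
    """Fixpoint of the rules:
         path(X, Y) :- link(X, Y).
         path(X, Z) :- path(X, Y), link(Y, Z).
    `links` is a set of (src, dst) pairs; returns all reachable pairs."""
    path = set(links)
    while True:
        # Derive new path facts by joining existing paths with links.
        derived = {(x, z2) for (x, z) in path for (z1, z2) in links if z == z1}
        if derived <= path:      # nothing new: fixpoint reached
            return path
        path |= derived
```

The programmer states *what* facts should hold rather than *how* to compute them, which is the conciseness argument BOOM makes for whole distributed protocols.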

In contrast, the automatic management system (Nectar) for data centers, introduced in paper [2], is motivated by the observation that manually managing data and computation wastes a large amount of system resources. In their initial experiments, the authors found that redundant and obsolete sets of data and computation can be eliminated by adopting a cache server. In their design, they categorize data into “primary” and “derived”, and devise a method to automatically detect program and data dependencies, using the declarative programming language LINQ. By reusing the sub-results of previous computations via caching, their experiments on Nectar demonstrate a great improvement in both space utilization and speed.
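The core caching idea can be sketched as follows: a derived result is keyed by a fingerprint of both the program and its input, so an unchanged (program, data) pair reuses the cached result instead of recomputing it. This is an illustrative sketch, not Nectar's implementation; the real system fingerprints LINQ programs and their dependencies, whereas here we hash Python bytecode and the input's repr.

```python
import hashlib

class ComputationCache:
    """Toy Nectar-style cache for derived computation results."""

    def __init__(self):
        self.store = {}
        self.hits = 0

    def fingerprint(self, fn, data):
        h = hashlib.sha256()
        h.update(fn.__code__.co_code)     # program fingerprint (bytecode)
        h.update(repr(data).encode())     # input-data fingerprint
        return h.hexdigest()

    def run(self, fn, data):
        key = self.fingerprint(fn, data)
        if key in self.store:             # unchanged program + data: reuse
            self.hits += 1
            return self.store[key]
        result = fn(data)                 # otherwise compute and remember
        self.store[key] = result
        return result
```

Changing either the program or the input changes the fingerprint, which is what lets the system safely garbage-collect obsolete derived data as well.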

Two Papers about “Green” Storage

Papers review date: 11/11/2011.

[1] Massive Arrays of Idle Disks For Storage Archives. Dennis Colarelli and Dirk Grunwald. In SC ’02, Nov. 2002.

[2] Hibernator: Helping Disk Arrays Sleep Through the Winter. Qingbo Zhu et al. In SOSP ’05, Oct. 2005.

The two papers introduce new disk management schemes, MAID [1] and Hibernator [2], that increase energy savings while preserving the high performance of disk arrays. Both papers claim that their systems significantly reduce power consumption while keeping performance comparable to that of a RAID system.

Strictly speaking, MAID is not a fully automated system that guarantees the highest performance and energy savings; its behavior varies greatly depending on the configuration. I believe that the evaluations are more valuable than the design of MAID itself, which serves mainly as a tool for the experiments. The authors show in their evaluations how different configurations result in different energy consumption and performance: (1) with data duplication (caching), performance was almost half of that without duplication, and (2) striping improved performance without a large energy loss. They demonstrate that a successful configuration is highly dependent on the type of workload the disk array handles, and thus that intelligent power management of disk arrays can save significant energy with low performance loss.
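The MAID idea itself is simple enough to sketch: data disks spin down after an idle timeout, and a small set of cache disks holds duplicates of recently used blocks so that most reads never wake a data disk. This is a toy model (the timeout, the cache policy, and the return values are all illustrative), not the authors' implementation.

```python
class MaidArray:
    """Toy MAID model: idle data disks spin down; cache disks absorb hot reads."""

    def __init__(self, n_disks, idle_timeout=5):
        self.idle_timeout = idle_timeout
        self.last_access = [0] * n_disks
        self.spinning = [True] * n_disks
        self.cache = {}        # block -> data, standing in for the cache disks
        self.spinups = 0       # spin-ups cost both energy and latency

    def read(self, now, disk, block):
        # Spin down any data disk that has been idle past the timeout.
        for d in range(len(self.spinning)):
            if self.spinning[d] and now - self.last_access[d] > self.idle_timeout:
                self.spinning[d] = False
        if block in self.cache:            # cache hit: no data disk touched
            return self.cache[block]
        if not self.spinning[disk]:        # miss on a sleeping disk: spin up
            self.spinning[disk] = True
            self.spinups += 1
        self.last_access[disk] = now
        data = ("disk", disk, block)       # stand-in for the actual block read
        self.cache[block] = data           # duplicate hot data onto cache disks
        return data
```

The workload dependence the authors observe falls out directly: a workload with poor locality misses the cache, keeps waking disks, and erases the energy savings.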

In contrast, Hibernator is a more advanced, recent system that balances performance and power consumption well. The authors assume the availability of multi-speed disk drives. Given that, the key design concept of Hibernator is a coarse-grained algorithm that determines the size of each tier in a multi-tier data layout, where each tier consists of disks running at the same speed. This disk-selection algorithm is novel in that it requires no extra disks and does not reduce reliability. Another interesting mechanism in Hibernator is boosting disk speed when the performance goal is about to be missed. Their evaluation shows that Hibernator can save up to 65% of energy while maintaining performance comparable to a RAID5 array.
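The speed-boosting mechanism can be sketched as a simple control rule: run disks at the low speed to save energy, but switch to full speed whenever observed response time approaches the performance goal. The function and its `margin` parameter are illustrative stand-ins, not Hibernator's actual algorithm.

```python
def choose_speed(avg_response_ms, goal_ms, speeds, margin=0.9):
    """Pick a disk speed (RPM) given the observed average response time.

    Stay at the lowest speed for energy savings unless the observed latency
    is within `margin` of the performance goal, in which case boost to the
    highest speed to keep the guarantee.
    """
    if avg_response_ms >= margin * goal_ms:
        return max(speeds)   # boost: about to miss the performance goal
    return min(speeds)       # plenty of headroom: save energy
```

The coarse granularity matters: changing speeds (or tier sizes) only occasionally amortizes the cost of the transition, which is why Hibernator's layout decisions are made at a coarse time scale.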

Two Papers about Disk Failures

Papers review date: 11/4/2011.

[1] Failure Trends in a Large Disk Drive Population. Eduardo Pinheiro et al. In FAST ’07, Feb. 2007.

[2] An Analysis of Data Corruption in the Storage Stack. Lakshmi N. Bairavasundaram et al. In FAST ’08, Feb. 2008.

Both papers point out the importance of reliable disk storage and of understanding the characteristics of disk behavior. The approaches of the two analysis papers are also similar in that their experiments are based on large-scale collections of data from real deployments, unlike most prior work. However, they differ in that paper [1] relates disk failures to disk longevity, whereas paper [2] focuses on analyzing data corruption rates.

The authors of the first paper claim that their observations from a very large number of disk drives within Google’s infrastructure show different disk failure patterns from most previously available data, which was based on extrapolation. In their study, they consider a disk drive to have failed if it has been replaced with a new one during a repair procedure, which makes the meaning and the timing of a ‘disk failure’ clear. Their analyses correlate disk failure rates with numerous factors that are supposed to affect disk lifetime: disk temperature, disk utilization, and SMART parameters (scan errors, reallocation/probational counts, and more). According to the data they collected, temperature and utilization are less correlated with disk failure rates than had been expected, and some of the SMART parameters are useful for inferring disk failure patterns in large disk populations, though the parameters cannot always be trusted.
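The kind of correlation the study performs can be illustrated with a small grouping computation: compare the failure rate of drives that previously reported a SMART scan error against those that did not. The numbers in the example are made up for illustration; they are not the paper's data.

```python
def failure_rate_by_flag(drives):
    """Compute the failure rate for drives with and without a SMART flag.

    `drives` is a list of (had_scan_error, failed) boolean pairs;
    returns {flag: failed / total} for each non-empty group.
    """
    groups = {True: [0, 0], False: [0, 0]}   # flag -> [failed, total]
    for had_error, failed in drives:
        groups[had_error][0] += int(failed)
        groups[had_error][1] += 1
    return {flag: failed / total
            for flag, (failed, total) in groups.items() if total}
```

The paper's caveat maps onto this directly: even if the with-error group fails far more often, many failed drives report no SMART error at all, so the parameters alone cannot predict every failure.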

In comparison, the authors of the second paper assert that the firmware in disk drives can fail to detect data corruption, and that the widely used checksum matching cannot be completely trusted. Their study covers three classes of data corruption: checksum mismatches, identity discrepancies, and parity inconsistencies. The results of their experiments show that the checksum mismatch rate is very low overall, and the rates of identity discrepancies and parity inconsistencies are even lower. However, it is still important to understand the characteristics of disk corruption, because data loss can be a critical problem even when it happens rarely. Their analyses of the collected data show that: (1) nearline disks develop more checksum mismatches than enterprise-class disks, (2) checksum mismatches are correlated both within a single disk and across different disks, and finally (3) checksum mismatches have high spatial and temporal locality.