This book constitutes the proceedings of the 21st International Conference on Passive and Active Measurement, PAM 2020, held in Eugene, Oregon, USA, in March 2020.
The 19 full papers presented in this volume were carefully reviewed and selected from 65 submissions. They were organized in topical sections named: active measurement; security; best practices and conformance; domain names; topology and routing; topology - alias resolution; and Web.

### Discovering the IPv6 Network Periphery

Abstract
We consider the problem of discovering the IPv6 network periphery, i.e., the last-hop router connecting endhosts in the IPv6 Internet. Finding the IPv6 periphery using active probing is challenging due to the size of the IPv6 address space, the wide variety of provider addressing and subnetting schemes, and incomplete topology traces. As such, existing topology mapping systems can miss the large footprint of the IPv6 periphery, disadvantaging applications ranging from IPv6 census studies to geolocation and network resilience. We introduce “edgy,” an approach to explicitly discover the IPv6 network periphery, and use it to find >64M IPv6 periphery router addresses and >87M links to these last hops – several orders of magnitude more than in currently available IPv6 topologies. Further, only 0.2% of edgy’s discovered addresses are known to existing IPv6 hitlists.
Erik C. Rye, Robert Beverly

### Improving Coverage of Internet Outage Detection in Sparse Blocks

Abstract
There is a growing interest in carefully observing the reliability of the Internet’s edge. Outage information can inform our understanding of Internet reliability and planning, and it can help guide operations. Active outage detection methods provide results for more than 3M blocks, and passive methods more than 2M, but both are challenged by sparse blocks where few addresses respond or send traffic. We propose a new Full Block Scanning (FBS) algorithm to improve coverage for active scanning by providing reliable results for sparse blocks by gathering more information before making a decision. FBS identifies sparse blocks and takes additional time before making decisions about their outages, thereby addressing previous concerns about false outages while preserving strict limits on probe rates. We show that FBS can improve coverage by correcting 1.2M blocks that would otherwise be too sparse to correctly report, and potentially adding 1.7M additional blocks. FBS can be applied retroactively to existing datasets to improve prior coverage and accuracy.
Guillermo Baltra, John Heidemann
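
The idea of gathering more information before deciding on a sparse block can be illustrated with a minimal sketch. It rests on an assumption that is not stated in the abstract: that a decision is deferred until the accumulated probing rounds have covered every address known to be active in the block. The function name `full_block_decisions` and the input layout are hypothetical and do not come from the FBS implementation.

```python
def full_block_decisions(rounds, active_addresses):
    """Illustrative sketch only (not the authors' FBS code).

    rounds: list of dicts mapping a probed address to True (replied)
            or False (no reply), one dict per probing round.
    active_addresses: addresses in the block known to respond sometimes.

    Assumption: a verdict is issued only once the accumulated rounds have
    probed every known-active address; the block is then 'up' if any probe
    in that window was answered, otherwise 'down'.
    """
    active = set(active_addresses)
    decisions = []
    probed = set()      # addresses probed since the last verdict
    any_reply = False   # whether any probe was answered in this window
    for probes in rounds:
        probed.update(probes)  # dict keys are the probed addresses
        any_reply = any_reply or any(probes.values())
        if active <= probed:   # the whole active set has been covered
            decisions.append("up" if any_reply else "down")
            probed.clear()
            any_reply = False
    return decisions
```

In this sketch a single silent round never yields a "down" verdict for a sparse block, which is one way to trade decision latency for fewer false outages without sending extra probes per round.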

### FlowTrace: A Framework for Active Bandwidth Measurements Using In-band Packet Trains

Abstract
Active measurement tools are important to understand and diagnose performance bottlenecks on the Internet. However, their overhead is a concern because a high number of additional measurement packets can congest the network they try to measure. To address this issue, prior work has proposed in-band approaches that piggyback on application traffic for active measurements. However, prior approaches are hard to deploy because they require either specialized hardware or modifications to the Linux kernel. In this paper, we propose FlowTrace, a readily deployable user-space active measurement framework that leverages application TCP flows to carry out in-band network measurements. Our implementation of pathneck using FlowTrace creates recursive packet trains to locate bandwidth bottlenecks. The experimental evaluation on a testbed shows that FlowTrace is able to locate bandwidth bottlenecks as accurately as pathneck with significantly less overhead.
Adnan Ahmed, Ricky Mok, Zubair Shafiq

### Leopard: Understanding the Threat of Blockchain Domain Name Based Malware

Abstract
Recently, as various approaches for detecting malicious domains and malware have been proposed, malware that connects to its command and control (C&C) server using techniques like domain flux can be identified effectively. Cybercriminals have therefore sought alternative methods and discovered that DNS based on blockchains can be used to connect to C&C servers. Because of the distributed ledger technology, domain names resolved by blockchain DNS, called blockchain domain names (BDNs), are inherently anonymous and censorship-resistant. We analyzed the working mechanism of this new type of malware. To detect malicious BDNs, we propose a prototype system, named Leopard, which analyzes DNS traffic patterns and resource records of BDNs. To the best of our knowledge, we are the first to propose the automatic detection of malicious BDNs. In Leopard, we extracted 17 features from collected traffic and used a random forest model to distinguish malicious BDNs from Alexa top 5000 domains operated by generic and country-code top-level domain registries. In our experiments, we evaluate Leopard on a nine-day real-world dataset. The experimental results show that Leopard can effectively detect malicious BDNs with an AUC of 0.9980 and discover 286 unknown malicious BDNs from the dataset.
Zhangrong Huang, Ji Huang, Tianning Zang

### To Filter or Not to Filter: Measuring the Benefits of Registering in the RPKI Today

Abstract
Securing the Internet’s inter-domain routing system against illicit prefix advertisements by third-party networks remains a great concern for the research, standardization, and operator communities. After many unsuccessful attempts to deploy additional security mechanisms for BGP, we now witness increasing adoption of the RPKI (Resource Public Key Infrastructure). Backed by strong cryptography, the RPKI allows network operators to register their BGP prefixes together with the legitimate Autonomous System (AS) number that may originate them via BGP. Recent research shows an encouraging trend: an increasing number of networks around the globe start to register their prefixes in the RPKI. While encouraging, the actual benefit of registering prefixes in the RPKI eventually depends on whether transit providers in the Internet enforce the RPKI’s content, i.e., configure their routers to validate prefix announcements and filter invalid BGP announcements. In this work, we present a broad empirical study tackling the question: To what degree does registration in the RPKI protect a network from illicit announcements of their prefixes, such as prefix hijacks? To this end, we first present a longitudinal study of filtering behavior of transit providers in the Internet, and second we carry out a detailed study of the visibility of legitimate and illegitimate prefix announcements in the global routing table, contrasting prefixes registered in the RPKI with those not registered. We find that an increasing number of transit and access providers indeed do enforce RPKI filtering, which translates to a direct benefit for the networks using the RPKI in the case of illicit announcements of their address space. Our findings bode well for further RPKI adoption and for increasing routing security in the Internet.
Cecilia Testart, Philipp Richter, Alistair King, Alberto Dainotti, David Clark
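
For readers unfamiliar with what "validating prefix announcements" means, the following sketch classifies one BGP announcement against a set of RPKI records in the spirit of route origin validation (RFC 6811). It is a simplified illustration, not the measurement code used in the paper; the `Roa` type and `validate_origin` function are hypothetical names, and special cases such as AS 0 records are ignored.

```python
from ipaddress import ip_network
from typing import NamedTuple


class Roa(NamedTuple):
    """One validated RPKI record (hypothetical container for this sketch)."""
    prefix: str      # authorized prefix, e.g. "192.0.2.0/24"
    max_length: int  # longest prefix length the origin may announce
    asn: int         # AS number authorized to originate the prefix


def validate_origin(prefix: str, origin_asn: int, roas: list[Roa]) -> str:
    """Return 'valid', 'invalid', or 'not-found' for one announcement."""
    announced = ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        if announced.version != roa_net.version:
            continue  # IPv4 records never cover IPv6 routes and vice versa
        if not announced.subnet_of(roa_net):
            continue  # this record does not cover the announced prefix
        covered = True
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one record but matched by none: invalid.
    return "invalid" if covered else "not-found"
```

A router that enforces filtering would drop routes classified as "invalid" while still accepting "not-found" routes, which is why registration only helps where transit providers actually filter.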

### A First Look at the Misuse and Abuse of the IPv4 Transfer Market

Abstract
Vasileios Giotsas, Ioana Livadariu, Petros Gigis

### Don’t Forget to Lock the Front Door! Inferring the Deployment of Source Address Validation of Inbound Traffic

Abstract
This paper concerns the problem of the absence of ingress filtering at the network edge, one of the main causes of important network security issues. Numerous network operators do not deploy the best current practice, Source Address Validation (SAV), which aims at mitigating these issues. We perform the first Internet-wide active measurement study to enumerate networks not filtering incoming packets by their source address. The measurement method consists of identifying closed and open DNS resolvers that handle requests arriving from outside the network with a source address from the range assigned inside the network under test. The proposed method provides the most complete picture of the inbound SAV deployment state at network providers. We reveal that 32 673 Autonomous Systems (ASes) and 197 641 Border Gateway Protocol (BGP) prefixes are vulnerable to spoofing of inbound traffic. Finally, using the data from the Spoofer project and performing an open resolver scan, we compare the filtering policies in both directions.
Maciej Korczyński, Yevheniya Nosyk, Qasim Lone, Marcin Skwarek, Baptiste Jonglez, Andrzej Duda

### MUST, SHOULD, DON’T CARE: TCP Conformance in the Wild

Abstract
Standards govern the SHOULD and MUST requirements for protocol implementers for interoperability. In the case of TCP, which carries the bulk of the Internet’s traffic, these requirements are defined in RFCs. While it is known that not all optional features are implemented and non-conformance exists, one would assume that TCP implementations at least conform to the minimum set of MUST requirements. In this paper, we use Internet-wide scans to show how Internet hosts and paths conform to these basic requirements. We uncover a non-negligible set of hosts and paths that do not adhere to even basic requirements. For example, we observe hosts that do not correctly handle checksums and cases of middlebox interference for TCP options. We identify hosts that drop packets when the urgent pointer is set or simply crash. Our publicly available results highlight that conformance to even fundamental protocol requirements should not be taken for granted but instead checked regularly.
Mike Kosek, Leo Blöcher, Jan Rüth, Torsten Zimmermann, Oliver Hohlfeld

### Extortion or Expansion? An Investigation into the Costs and Consequences of ICANN’s gTLD Experiments

Abstract
Since October 2013, the Internet Corporation for Assigned Names and Numbers (ICANN) has introduced over 1K new generic top-level domains (gTLDs) with the intention of enhancing innovation, competition, and consumer choice. While there have been several positive outcomes from this expansion, there have also been many unintended consequences. In this paper we focus on one such consequence: the gTLD expansion has provided new opportunities for malicious actors to leverage the trust placed by consumers in trusted brands by way of typosquatting. We describe gTLDtm (The gTLD typosquatting monitor) – an open source framework which conducts longitudinal Internet-scale measurements to identify when popular domains are victims of typosquatting, which parties are responsible for facilitating typosquatting, and the costs associated with preventing typosquatting. Our analysis of the generated data shows that ICANN’s expansion introduces several causes for concern. First, the sheer number of typosquatted domains has increased by several orders of magnitude since the introduction of the new gTLDs. Second, these domains are currently being incentivized and monetarily supported by the online advertiser and tracker ecosystem whose policies they clearly violate. Third, mass registrars are currently seeking to profit from the inability of brands to protect themselves from typosquatting (due to the prohibitively high cost of doing so). Taken as a whole, our work presents tools and analysis to help protect the public and brands from typosquatters.

### Counterfighting Counterfeit: Detecting and Taking down Fraudulent Webshops at a ccTLD

Abstract
Luxury goods such as sneakers and bags are in high demand. Many websites offer them at steep discounts, but in many cases the products are simply cheap counterfeit versions of the originals. Online shoppers, however, may be unaware they are buying a counterfeit product and end up being scammed and having to deal with financial losses, as has been widely reported by various news outlets. This work presents a multiyear effort of The Netherlands’ .nl country-code top-level domain (ccTLD) in detecting and removing counterfeit online shops from the .nl DNS zone. We have developed two detection systems and partnered with registrars and a large credit card issuer, which ultimately led to more than 4,400 counterfeit online shops being taken down.
Thymen Wabeke, Giovane C. M. Moura, Nanneke Franken, Cristian Hesselman

### When Parents and Children Disagree: Diving into DNS Delegation Inconsistency

Abstract
The Domain Name System (DNS) is a hierarchical, decentralized, and distributed database. A key mechanism that enables the DNS to be hierarchical and distributed is delegation [7] of responsibility from parent to child zones—typically managed by different entities. RFC1034 [12] states that authoritative nameserver (NS) records at both parent and child should be “consistent and remain so”, but we find inconsistencies for over 13M second-level domains. We classify the type of inconsistencies we observe, and the behavior of resolvers in the face of such inconsistencies, using RIPE Atlas to probe our experimental domain configured for different scenarios. Our results underline the risk such inconsistencies pose to the availability of misconfigured domains.
Raffaele Sommese, Giovane C. M. Moura, Mattijs Jonker, Roland van Rijswijk-Deij, Alberto Dainotti, K. C. Claffy, Anna Sperotto
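
To make the notion of parent/child inconsistency concrete, the sketch below compares the NS record set published at the parent zone with the one published at the child zone. The category labels and the function name `classify_delegation` are assumptions chosen for illustration; the abstract does not list the classes the authors use.

```python
def _normalize(name: str) -> str:
    """Lowercase a hostname and drop the trailing root dot for comparison."""
    return name.strip().lower().rstrip(".")


def classify_delegation(parent_ns, child_ns) -> str:
    """Compare the NS sets seen at the parent and at the child zone.

    The labels returned here are illustrative, not the paper's taxonomy.
    """
    parent = {_normalize(n) for n in parent_ns}
    child = {_normalize(n) for n in child_ns}
    if parent == child:
        return "consistent"
    if not parent & child:
        return "disjoint"                # no nameserver in common
    if parent < child:
        return "parent-subset-of-child"  # child lists extra nameservers
    if child < parent:
        return "child-subset-of-parent"  # parent lists extra nameservers
    return "partial-overlap"             # some shared, some unique to each
```

For example, `classify_delegation(["ns1.example.net."], ["NS1.example.net", "ns2.example.net"])` returns `"parent-subset-of-child"`, since names are compared case-insensitively and without the trailing dot.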

### A First Comparative Characterization of Multi-cloud Connectivity in Today’s Internet

Abstract
Today’s enterprises are adopting multi-cloud strategies at an unprecedented pace. Here, a multi-cloud strategy specifies end-to-end connectivity between the multiple cloud providers (CPs) that an enterprise relies on to run its business. This adoption is fueled by the rapid build-out of global-scale private backbones by the large CPs, a rich private peering fabric that interconnects them, and the emergence of new third-party private connectivity providers (e.g., DataPipe and HopOne). However, little is known about the performance aspects, routing issues, and topological features associated with currently available multi-cloud connectivity options. To shed light on the tradeoffs between these available connectivity options, we take a cloud-to-cloud perspective and present in this paper the results of a cloud-centric measurement study of a coast-to-coast multi-cloud deployment that a typical modern enterprise located in the US may adopt. We deploy VMs in two regions (i.e., VA and CA) of each one of three large cloud providers (i.e., AWS, Azure, and GCP) and connect them using three different options: (i) transit provider-based best-effort public Internet (BEP), (ii) third-party provider-based private (TPP) connectivity, and (iii) CP-based private (CPP) connectivity. By performing active measurements in this real-world multi-cloud deployment, we provide new insights into variability in the performance of TPP, the stability in performance and topology of CPP, and the absence of transit providers for CPP.
Bahador Yeganeh, Ramakrishnan Durairajan, Reza Rejaie, Walter Willinger

### Unintended Consequences: Effects of Submarine Cable Deployment on Internet Routing

Abstract
We use traceroute and BGP data from globally distributed Internet measurement infrastructures to study the impact of a noteworthy submarine cable launch connecting Africa to South America. We leverage archived data from RIPE Atlas and CAIDA Ark platforms, as well as custom measurements from strategic vantage points, to quantify the differences in end-to-end latency and path lengths before and after deployment of this new South-Atlantic cable. We find that ASes operating in South America significantly benefit from this new cable, with reduced latency to all measured African countries. More surprising is that end-to-end latency to/from some regions of the world, including intra-African paths towards Angola, increased after switching to the cable. We track these unintended consequences to suboptimally circuitous IP paths that traveled from Africa to Europe, possibly North America, and South America before traveling back to Africa over the cable. Although some suboptimalities are expected given the lack of peering among neighboring ASes in the developing world, we found two other causes: (i) problematic intra-domain routing within a single Angolan network, and (ii) suboptimal routing/traffic engineering by its BGP neighbors. After notifying the operating AS of our results, we found that most of these suboptimalities were subsequently resolved. We designed our method to generalize to the study of other cable deployments or outages and share our code to promote reproducibility and extension of our work.
Rodérick Fanou, Bradley Huffaker, Ricky Mok, K. C. Claffy

### Alias Resolution Based on ICMP Rate Limiting

Abstract
Alias resolution techniques (e.g., Midar) associate, mostly through active measurement, a set of IP addresses as belonging to a common router. These techniques rely on distinct router features that can serve as a signature. Their applicability is affected by router support of the features and the robustness of the signature. This paper presents a new alias resolution tool called Limited Ltd. that exploits ICMP rate limiting, a feature that is increasingly supported by modern routers but has not previously been used for alias resolution. It sends ICMP probes toward target interfaces in order to trigger rate limiting, extracting features from the probe reply loss traces. It uses a machine learning classifier to designate pairs of interfaces as aliases. We describe the details of the algorithm used by Limited Ltd. and illustrate its feasibility and accuracy. Limited Ltd. not only is the first tool that can perform alias resolution on IPv6 routers that do not generate monotonically increasing fragmentation IDs (e.g., Juniper routers) but it also complements the state-of-the-art techniques for IPv4 alias resolution. All of our code and the collected dataset are publicly available.
Kevin Vermeulen, Burim Ljuma, Vamsi Addanki, Matthieu Gouel, Olivier Fourmaux, Timur Friedman, Reza Rejaie

### APPLE: Alias Pruning by Path Length Estimation

Abstract
Uncovering the Internet’s router graph is vital to accurate measurement and analysis. In this paper, we present a new technique for resolving router IP aliases that complements existing techniques. Our approach, Alias Pruning by Path Length Estimation (apple), avoids relying on router manufacturer and operating system specific implementations of IP. Instead, it filters potential router aliases seen in traceroute by comparing the reply path length from each address to a distributed set of vantage points.
We evaluated our approach on Internet-wide collections of IPv4 and IPv6 traceroutes. We compared apple’s router alias inferences against router configurations from two R&E networks, finding no false positives. Moreover, apple’s coverage of the potential alias pairs in the ground truth networks rivals the current state-of-the-art in IPv4, and far exceeds existing techniques in IPv6. We also show that apple complements existing alias resolution techniques, increasing the total number of inferred alias pairs by 109.6% in IPv4, and by 1071.5% in IPv6.
Alexander Marder
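
The comparison of reply path lengths can be sketched as follows. This is an illustration under stated assumptions, not apple's implementation: it assumes the path length is estimated from the TTL of a reply by guessing the sender's initial TTL from a set of commonly used defaults, and the `tolerance` and `min_shared` parameters, like the function names, are hypothetical.

```python
# Commonly used initial TTL values (an assumption of this sketch).
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def reply_path_length(observed_ttl: int) -> int:
    """Estimate the hop count of a reply from the TTL seen at the receiver."""
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial - observed_ttl
    raise ValueError(f"TTL out of range: {observed_ttl}")


def plausible_aliases(ttls_a, ttls_b, tolerance=1, min_shared=2) -> bool:
    """Keep a candidate alias pair only if reply path lengths agree.

    ttls_a, ttls_b: dicts mapping a vantage point name to the reply TTL
    observed from each of the two addresses. The pair is pruned if the
    estimated path lengths differ by more than `tolerance` hops at any
    shared vantage point, or if too few vantage points are shared.
    """
    shared = ttls_a.keys() & ttls_b.keys()
    if len(shared) < min_shared:
        return False
    return all(
        abs(reply_path_length(ttls_a[vp]) - reply_path_length(ttls_b[vp]))
        <= tolerance
        for vp in shared
    )
```

The intuition is that replies from two interfaces of the same router should travel paths of nearly the same length to each vantage point, so a large disagreement at any vantage point is evidence against the pair.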

### Web

#### Frontmatter

Abstract
Adult content constitutes a major source of Internet traffic. As with many other platforms, these sites are incentivized to engage users and retain them on the site. This engagement (e.g., through recommendations) shapes the journeys taken through such sites. Using data from a large content delivery network, we explore session journeys within an adult website. We take two perspectives. We first inspect the corpus available on these platforms. Following this, we investigate the session access patterns. We make a number of observations that could be exploited for optimizing delivery, e.g., that users often skip within video streams.
Andreas Grammenos, Aravindh Raman, Timm Böttger, Zafar Gilani, Gareth Tyson

### Some Myths, Some Truths, and Some Hope

Abstract
Header bidding (HB) is a relatively new online advertising technology that allows a content publisher to conduct a client-side (i.e., from within the end-user’s browser), real-time auction for selling ad slots on a web page. We developed a new browser extension for Chrome and Firefox to observe this in-browser auction process from the user’s perspective. We use real end-user measurements from 393,400 HB auctions to (a) quantify the ad revenue from HB auctions, (b) estimate latency overheads when integrating with ad exchanges and discuss their implications for ad revenue, and (c) break down the time spent in soliciting bids from ad exchanges into various factors and highlight areas for improvement. For the users in our study, we find that HB increases ad revenue for web sites by 28% compared to that in real-time bidding as reported in a prior work. We also find that the latency overheads in HB can be easily reduced or eliminated and outline a few solutions, and pitch the HB platform as an opportunity for privacy-preserving advertising.
Waqar Aqeel, Debopam Bhattacherjee, Balakrishnan Chandrasekaran, P. Brighten Godfrey, Gregory Laughlin, Bruce Maggs, Ankit Singla

### Understanding Video Streaming Algorithms in the Wild

Abstract
While video streaming algorithms are a hot research area, with interesting new approaches proposed every few months, little is known about the behavior of the streaming algorithms deployed across large online streaming platforms that account for a substantial fraction of Internet traffic. We thus study adaptive bitrate streaming algorithms in use at 10 such video platforms with diverse target audiences. We collect traces of each video player’s response to controlled variations in network bandwidth, and examine the algorithmic behavior: how risk averse is an algorithm in terms of target buffer; how long does it take to reach a stable state after startup; how reactive is it in attempting to match bandwidth versus operating stably; how efficiently does it use the available network bandwidth; etc. We find that deployed algorithms exhibit a wide spectrum of behaviors across these axes, indicating the lack of a consensus one-size-fits-all solution. We also find evidence that most deployed algorithms are tuned towards stable behavior rather than fast adaptation to bandwidth variations, some are tuned towards a visual perception metric rather than a bitrate-based metric, and many leave a surprisingly large amount of the available bandwidth unused.
Melissa Licciardello, Maximilian Grüner, Ankit Singla

### Exploring the Eastern Frontier: A First Look at Mobile App Tracking in China

Abstract
Many mobile apps are integrated with mobile advertising and tracking services running in the background to collect information for tracking users. Considering China currently tops mobile traffic growth globally, this paper aims to take a first look at China’s mobile tracking patterns from a large 4G network. We observe the dominance of the top popular domestic trackers and the pervasive tracking on mobile apps. We also discover a very well-connected tracking community, where the non-popular trackers form many local communities with each community tracking a particular category of mobile apps. We further conclude that some trackers have a monopoly on specific groups of mobile users and 10% of users upload Personally Identifiable Information (PII) to trackers (with 90% of PII tracking flows local to China). Our results consistently show a distinctive mobile tracking market in China. We hope the results can inform users and stakeholders on the interplay between mobile tracking and potential security and privacy issues.
Zhaohua Wang, Zhenyu Li, Minhui Xue, Gareth Tyson