Examining How the Great Firewall Discovers Hidden Circumvention Servers

Roya Ensafi

Princeton University

David Fifield

UC Berkeley

Philipp Winter

Karlstad & Princeton University

Nick Feamster

Princeton University

Nicholas Weaver

UC Berkeley & ICSI

Vern Paxson

UC Berkeley & ICSI

ABSTRACT

Recently, the operators of the national censorship infrastructure of China
began to employ “active probing” to detect and block the use of privacy tools.
This probing works by passively monitoring the network for suspicious traffic,
then actively probing the corresponding servers, and blocking any that are
determined to run circumvention servers such as Tor.

We draw upon multiple forms of measurements, some

spanning years, to illuminate the nature of this probing. We

identify the diﬀerent types of probing, develop ﬁngerprint-

ing techniques to infer the physical structure of the system,

localize the sensors that trigger probing—showing that they

diﬀer from the “Great Firewall” infrastructure—and assess

probing’s eﬃcacy in blocking diﬀerent versions of Tor. We

conclude with a discussion of the implications for design-

ing circumvention servers that resist such probing mecha-

nisms.

Categories and Subject Descriptors

C.2.0 [General]: Security and protection (e.g., ﬁrewalls);

C.2.3 [Network Operations]: Network monitoring

General Terms

Measurement

Keywords

Active Probing, Deep Packet Inspection, Great Firewall of

China, Censorship Circumvention, Tor

1. INTRODUCTION

Those in charge of the Chinese censorship apparatus spend considerable effort
countering privacy tools. Among their most advanced techniques is what the Tor
community terms “active probing”: passively monitoring the network for
suspicious traffic, actively probing the corresponding servers, and blocking
those determined to run circumvention services such as Tor.

Figure 1: The firewall cannot determine, by mere inspection, whether the
encrypted connection carries a prohibited circumvention protocol. Therefore it
issues its own probes and observes how the server responds.

Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are not
made or distributed for profit or commercial advantage and that copies bear
this notice and the full citation on the first page. Copyrights for components
of this work owned by others than the author(s) must be honored. Abstracting
with credit is permitted. To copy otherwise, or republish, to post on servers
or to redistribute to lists, requires prior specific permission and/or a fee.
Request permissions from Permissions@acm.org.

IMC’15, October 28–30, 2015, Tokyo, Japan.

Copyright is held by the owner/author(s). Publication rights licensed to ACM.

ACM 978-1-4503-3848-6/15/10 ...$15.00.

DOI: http://dx.doi.org/10.1145/2815675.2815690.

The phenomenon of active probing arose presumably in

response to enhanced circumvention systems that better re-

sist traditional forms of blocking. For example, instead of

employing a protocol recognizable by deep packet inspec-

tion (DPI), some of these systems embed their traﬃc inside

TLS streams. Barring any subtle “tells” in the circumven-

tion system’s communication, the censor cannot distinguish

circumventing TLS from any other TLS, and thus cannot

readily block the circumvention without incurring signiﬁ-

cant collateral damage. Active probing enables the censor

to disambiguate the otherwise opaque traﬃc and once again

obtain a measure of control over it.

Figure 1 illustrates the general scheme of active probing.
The censor acts like a user and issues its own connections

to a suspected circumvention server. If the server responds

using a prohibited protocol, then the censor takes a block-

ing action, such as adding its IP address to a blacklist. If

the circumvention server does not incorporate access control

mechanisms or techniques to distinguish the censor’s probes

from normal user connections, the censor can reliably iden-

tify and block it.

The eﬀectiveness of active probing is reﬂected in its diverse

uses. As of September 2015, researchers have documented

its use against Tor [32], SSH [20], and VPN protocols [21,

10], and here we document additional probing targets.

Through this work we aim to better understand the nature

of active probing as conducted today against privacy tools

and censorship circumvention systems. We seek to answer

questions such as: What stimuli cause active probing? How

long does it take until a server gets probed? What types of

probes do we see, and from where do they originate? How
effective is active probing? What does its operation reveal
about ways to thwart it?

We consider only “reactive probing:” probing that is trig-

gered by the observation of some stimulus. Censors could

also conceivably employ “proactive probing” by scanning the

Internet (on a particular port, say) without waiting for a

stimulus, but we did not seek to study that. The only pos-

sible exception is in our examination of the logs of a server

that began to receive active probes without our having in-

stigated them—though the server’s status as a Tor bridge

may help explain that.

We draw upon a number of datasets from several vantage

points, including some extensive longitudinal data, to examine
these questions. Our work makes these contributions:

• We describe measurement infrastructure for studying active probing systems.

• We identify various probe types, including some previously undocumented,
and chart their volume over time since their first appearance in our data in
2013. The vast majority originate from Chinese IP addresses.

• Using network protocol ﬁngerprinting techniques, we

infer the physical structure of the probing system.

• We localize the sensors that trigger active probes and

show they are likely distinct from China’s main cen-

sorship infrastructure, the “Great Firewall” (GFW).

We structure the rest of the paper as follows. Section 2 covers related work,
followed by background in Section 3. Section 4 describes our datasets, and
Section 5 delves into their analysis; Section 6 concludes.

2. RELATED WORK

Academia and civil society have spent signiﬁcant eﬀorts

analyzing and circumventing the GFW, providing us with a

comprehensive understanding of how it blocks IP addresses

and TCP ports [7], DNS requests [1, 24], and HTTP re-

quests [22, 3]; and the nature of its TCP processing [13].

McLachlan and Hopper [19] warned of the possibility of

Tor bridge discovery by Internet scanning in 2009. The

study of practical, in-the-wild “active probing” associated

with Chinese censorship began in late 2011, when Nixon noticed
suspicious entries in his SSH log files [20], including
non-conformant payloads of seemingly random byte strings.

Careful analysis revealed a pattern: these strange probes,

which originated from IP addresses in China, were triggered

by prior genuine SSH logins, by real users, from diﬀerent

Chinese IP addresses. In 2012, Wilde documented a similar

phenomenon, this time targeting the Tor protocol [30].
Motivated by reports that China was blocking Tor bridges only

minutes after their ﬁrst use from within China, he inves-

tigated and observed the GFW performing active probing,

triggered by observing a particular list of TLS cipher suites,

the one oﬀered by Tor clients. The probing took the form of

TLS connections that attempted to establish Tor circuits.

Wilde also observed “garbage” random binary probes like

the ones seen by Nixon for SSH.

Later in 2012, Winter and Lindskog revisited Wilde’s analysis
using a server in Beijing [32]. They attracted probers
over a period of 17 days and analyzed the probers’ IP address
distribution, how blocking was effected, and how long

bridges remained blocked. They conjectured, but did not

establish, that the GFW uses IP address hijacking to obtain

its large pool of source IP addresses; that is, that the prob-

ing apparatus temporarily borrowed IP addresses that were

otherwise allocated.

In 2013, reports suggested that the GFW had begun active

probing against obfs2 [31], an obfuscation transport for Tor
specifically designed to be difficult to detect by DPI. (A

description of obfs2 appears in the next section.) The timing

of these reports corresponds well with our own data.

A year later, Nobori and Shinjo discussed their experience

with running a large VPN cluster for circumvention [21].

They likewise observed a pattern of connections from China

shortly prior to blocking of a server. Other reports indicate

that VPN services receive similar probing [10].

Our work aims to broaden the above perspectives, which

have generally relied upon one-time measurements from sin-

gle vantage points; and to illuminate the nature of active

probing in greater depth, including its range of probing,
response times, and system infrastructure.

3. BACKGROUND: CIRCUMVENTION PROTOCOLS

Active probing is a reaction against the increasing eﬀec-

tiveness of censorship circumvention. In this section we

brieﬂy describe Tor’s place in the world of circumvention,

and the obfuscated protocols (“transports”) that cloak Tor

traﬃc and make it more resistant to censorship. Three of

these protocols—“vanilla” Tor, obfs2, and obfs3—underlie

and motivate our experiments. More than that, though,

these protocols tell a small part of the story of the global

censorship arms race. In their technological advancement,

one can see the correspondingly increasing sophistication of

censors: starting from their ignorance of Tor, moving on to

simple IP address-based blocking, then online detection of

obfuscation, and now active probing.

3.1 Tor

Years ago, Tor found success in evading various types of

censorship such as web site blocks. Censored users found

they could treat the network as a simple proxy service with

many access points (its anonymity properties being of secondary
importance to these users). Despite this success,

however, the unadorned “vanilla” Tor protocol is not partic-

ularly suited to circumvention. Once censors began looking

for it, they found it easy to block. Tor’s biggest weakness

in this respect is its global public list of relays. A censor

can simply download this list and add each IP address to a

blacklist—and censors began to do exactly that.

In response to the blocking of its relays, the operators of

the Tor network began to reserve a portion of new relays as

secret, non-public “bridges.” Unlike ordinary relays, bridges

are not easily enumerable [6]. They are carefully distributed
through rate-limited out-of-band channels such as email and
HTTPS, and only a few at a time. The goal is to make it
possible for anyone to learn a few bridge addresses, while
making it hard for anyone to learn them all. By design,
learning many bridge addresses requires an attacker to control
resources such as an abundance of IP addresses and
email addresses, or the ability to solve CAPTCHAs.

Even using secret bridge relays, Tor remains vulnerable to

detection by deep packet inspection (DPI). Tor uses TLS in

a fairly distinctive way that causes it to stand out from other

TLS-based protocols. Censors can inspect traﬃc looking for

the “tells” that distinguish Tor from other forms of TLS,

and block connections as they arise. After early eﬀorts to

make their use of TLS less conspicuous [28], the developers
of Tor settled on a more sustainable strategy: wrapping the
entire Tor TLS stream in another layer, a “pluggable transport” [29],
that assumes responsibility for protocol-level obfuscation.
This model allows for independent innovation in

circumvention, while leaving the core of Tor free to focus on

its main purpose of anonymity.

3.2 Obfs2

The ﬁrst pluggable transport was obfs2 [25], introduced

in 2012. It was designed as a simple, expedient workaround

for DPI of the kind that was then occurring in Iran [4]. It

provides a lightweight obfuscation layer around Tor’s TLS,

re-encrypting the entire stream with a separate key in a

way that leaves no plaintext or framing information that

can serve as a basis for blocking—the entire communication

looks like a uniformly random byte stream in both direc-

tions. The simple scheme of obfs2 had immediate success.

The protocol has a serious deﬁciency, though: it is possible

to detect it completely passively and with high conﬁdence.

Essentially, obfs2 works by ﬁrst sending a key, then sending

ciphertext encrypted with that key. Therefore a censor can

simply read the ﬁrst few bytes of every TCP connection,

treat them as a key, and speculatively decrypt the ﬁrst few

bytes that follow. If the decryption is meaningful (matching

a TLS handshake, for example), then obfs2 is detected and

the censor can terminate the connection.
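The passive test described above can be sketched in a few lines. The following is a toy model only: it is not the obfs2 wire format, and the marker value, seed length, and hash-based keystream are all assumptions chosen so the example runs with the Python standard library. What it shares with the weakness described here is the structure: the key material travels in the clear at the start of the stream, so an observer can decrypt the next few bytes and check them against a known value.

```python
import hashlib
import os

# Hypothetical values for illustration; not taken from the obfs2 specification.
MARKER = b"\xde\xad\xbe\xef"   # known plaintext expected right after the key
SEED_LEN = 16                  # number of leading bytes treated as the key


def keystream(seed: bytes, n: int) -> bytes:
    """Toy keystream derived from the seed (a stand-in for a real cipher)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_client_hello(payload: bytes) -> bytes:
    """Send the seed in the clear, then marker + payload encrypted under it."""
    seed = os.urandom(SEED_LEN)
    body = MARKER + payload
    return seed + xor(body, keystream(seed, len(body)))


def looks_like_toy_protocol(first_bytes: bytes) -> bool:
    """Passive test: treat the first bytes as a key and speculatively decrypt."""
    if len(first_bytes) < SEED_LEN + len(MARKER):
        return False
    seed = first_bytes[:SEED_LEN]
    ciphertext = first_bytes[SEED_LEN:SEED_LEN + len(MARKER)]
    return xor(ciphertext, keystream(seed, len(MARKER))) == MARKER
```

In this toy layout the detector needs only the first 20 bytes of a connection (16 for the seed, 4 for the marker), and a stream that does not follow the layout passes the check only by chance.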

We turned the weakness of obfs2 to our advantage. Its

easy passive detectability, coupled with its lack of use for

anything but circumvention (and active probing), meant

that we could mine past network logs looking for obfs2 con-

nection attempts. Later we will describe how we used obfs2

probes to seed a list of past prober IP addresses.

3.3 Obfs3

The follow-up protocol obfs3 [26] was designed to remedy
this critical flaw in obfs2. Its key innovation is a Diffie-Hellman
negotiation that determines the keys to be used to
encrypt the rest of the stream. (The key exchange is not
as trivial as it may seem, because it, like the rest of the
protocol, must be indistinguishable from randomness.) This

enhancement in obfs3 deprives the censor of the simple, pas-

sive, reliable distinguisher it had for obfs2. The censor must

either intercede in the key exchange (using a man-in-the-

middle attack to learn the secret encryption keys), or settle

for heuristic detection of random-looking streams. While

either of these options may be problematic to implement,

heuristic detection becomes entirely workable when com-

bined with active probing. An initial, inaccurate test can

identify potential obfs3 connections; then an active probe

conﬁrms or denies the suspicion.
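As an illustration of what an "initial, inaccurate test" for random-looking streams might be, the sketch below computes the byte-level Shannon entropy of the first bytes of a connection. This is an assumption made for illustration: the text does not say which heuristic the censor uses, and the threshold here is arbitrary.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def maybe_random_stream(first_bytes: bytes, threshold: float = 6.5) -> bool:
    """Coarse first-stage filter; the threshold is an illustrative assumption."""
    return shannon_entropy(first_bytes) >= threshold
```

Such a test is cheap but imprecise. Short samples cap the measurable entropy at log2 of the sample length, and compressed or otherwise encrypted traffic also scores high, which is why a follow-up active probe is needed to confirm the suspicion.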

[Figure 2 plot: estimated users (0 to 8,000) over time (Jul 01 to Sep 01) for
vanilla Tor, obfs2, and obfs3.]

Figure 2: The estimated user numbers of the three transport protocols we study
(vanilla Tor, obfs2, and obfs3) in July and August 2015. Obfs3 is the most
popular protocol, followed by vanilla Tor. Obfs2 is superseded and sees
practically no use any more.

3.4 Other Protocols

Though we limited the focus of our active experiments to Tor-related
protocols, in the course of gathering data we incidentally found evidence of
probing for other protocols, unrelated to Tor except that they also have to do
with circumvention. The first of these probes, which we have labeled AppSpot
in this paper, is an HTTPS-based check for domain fronting [8], a
circumvention technique that disguises access to a proxy by making it appear
to be access to an innocuous web page. In all of the examples we found, the
probes checked whether a server is capable of fronting for Google App Engine
at its domain appspot.com. The other probe we discovered we label SoftEther,
because it resembles the client portion of the handshake of SoftEther VPN, the
VPN software underlying the VPN Gate circumvention system [21]. Because we
found these ancillary types of probe activity by accident, we make no claims
to thoroughness in our coverage of them, and suggest that there may be other,
yet unknown types of active probing to discover.

Our study focuses on vanilla Tor, obfs2, and obfs3, these being the commonly
used protocols that remain vulnerable to active probing. There are other,
newer protocols, including spiritual successors ScrambleSuit [33] and obfs4
[27], that have resistance to active probing as an explicit design criterion.
Although they are gaining in popularity, they have not yet eclipsed obfs3. The
key enhancement of these successor protocols is that they require the client,
in its initial message, to prove knowledge of a server-specific secret
(transmitted out of band). Put another way, mere knowledge of an IP address
and port is not enough to confirm the existence of a circumvention server. As
of this writing, obfs3 remains Tor’s most-used transport, having around 8,000
simultaneous users on average, as shown in Figure 2. The obfs2 protocol is
deprecated, no longer offered in the user interface, and its use is on the
wane.
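The idea of requiring the client to prove knowledge of a server-specific secret in its first message can be sketched as follows. This is not the ScrambleSuit or obfs4 handshake; the message layout (a random nonce followed by an HMAC-SHA256 tag) and the function names are assumptions made for illustration, and replay protection is omitted.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
TAG_LEN = 32  # size of an HMAC-SHA256 tag


def client_first_message(secret: bytes) -> bytes:
    """Client proves knowledge of the out-of-band secret over a fresh nonce."""
    nonce = os.urandom(NONCE_LEN)
    tag = hmac.new(secret, nonce, hashlib.sha256).digest()
    return nonce + tag


def server_accepts(secret: bytes, first_message: bytes) -> bool:
    """Server answers only if the tag verifies; otherwise it should stay silent."""
    if len(first_message) != NONCE_LEN + TAG_LEN:
        return False
    nonce, tag = first_message[:NONCE_LEN], first_message[NONCE_LEN:]
    expected = hmac.new(secret, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)
```

A prober that knows only the IP address and port cannot produce a valid tag, so a server built this way gives it nothing protocol-specific to observe.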

4. EXPERIMENTS

We base our results on several experiments, each result-

ing in a dataset that oﬀers a distinct view into the behavior

of active probing. The datasets cover diﬀerent time ranges

(see Table 1) and involve diﬀerent setups. Table 2 sum-

marizes the phenomena that each can illuminate, with each

contributing at least one facet not covered by the others.

We now describe each experiment in detail.

Dataset        Time span
Shadow         Dec 2014 – Feb 2015 (three months)
Sybil          Jan 29, 2015 – Jan 30, 2015 (20 hours)
Log            Jan 2010 – Aug 2015 (five years)
Counterprobe   Apr 22 – Apr 27 (six days)

Table 1: Timeline of our experiments. We created four datasets that span
hours, days, months, and years.

Shadow Sybil Log Counterprobe

Probing rate X

ISN patterns X

TSval patterns X X X

obfs2/3 blocking X X

Tor bootstrapping X

Probing types X

Architecture X X

Topology X

Table 2: Observed phenomena and their visibility in our

datasets.

4.1 Shadow Infrastructure

We built a “shadow infrastructure” of our own Tor clients

and bridges for a controlled experiment of active probing

over time. These clients and bridges were not actually used

by any real users, but rather were dedicated exclusively to

our own experimental purposes. The infrastructure tested

vanilla Tor, obfs2, and obfs3 in equal measure, since active

probing is known to target all three of these protocols.
Figure 3 illustrates this setup.

We had six Tor clients within China: three in China Unicom,
a large country-wide ISP; and three in CERNET, the

Chinese Research and Education Network. (We chose CER-

NET because previous work suggested that censorship of

CERNET might diﬀer from the rest of China [7].) Outside

of China, we ran six Tor bridges in Amazon’s EC2 cloud.

Two of the bridges ran vanilla Tor, two ran obfs2, and two

ran obfs3. We assigned each of our six clients in China to a

unique EC2 machine; the clients never contacted any bridge

other than their own assigned one. Initially, all clients at-

tempted to connect to their assigned bridge every 15 min-

utes. After one month, we changed this to ﬁve minutes after

preliminary analysis showed that ﬁner granularity in timing

might be useful.

We also created a control group consisting of nine bridges

(six in Amazon EC2, three in a US university) and a sin-

gle client outside of China. We never connected to any of

the control bridges from one of the clients within China; by

comparing the traﬃc received by our “active” and control

bridges, we can isolate general background scanning from

active probing by the GFW. The three control bridges not

hosted on EC2 allow us to determine whether the GFW

treats EC2-hosted servers diﬀerently from others. The con-

trol client outside of China connects to all of the bridges.

If our control client could not establish a Tor connection to

one of the “active” bridges, we discarded the measurement

we did from China for that bridge.

We took various steps to prevent our bridges from being discovered by any
means other than active probing. We configured all of them to be private
bridges, which means that they did not advertise themselves publicly, neither
to the public Tor directory, nor to its database of secret bridges. As a
result, no genuine Tor user should attempt to connect to one of our bridges.
The bridges listened on random ports in the ephemeral range, to reduce the
chance of their discovery by blind Internet scanning. Finally, we used another
EC2 machine to proxy the communication between our Tor bridges and the first
public Tor relay in a circuit. This extra proxy hop is to prevent another
potential bridge-discovery attack, wherein a malicious Tor relay makes a list
of all the IP addresses that connect to it (cf. [14, §III.D]).

Figure 3: Experimental setup for the “Shadow” dataset.

4.2 Sybil Infrastructure

To obtain broader insight into the extent of the censor’s active probing
infrastructure, we designed another experiment to attract many active probers
in a short period of time (footnote 1).

We did so by constructing a “Sybil infrastructure,” so

named because it seemingly consisted of hundreds of distinct
Tor servers. We used a virtual private server (VPS) in

France and one in China. We ran a vanilla Tor bridge on the

VPS in France and redirected the port range 30000–30600

to our Tor port using ﬁrewall port redirection. The actual

Tor server ran on a separate port in the ephemeral range.

Then, from our VPS in China, we established Tor con-

nections to every port in the port range in ascending or-

der. This took approximately two hours, because we waited

several seconds in between connection attempts. Since the

GFW blocks by IP:port tuple, not just IP address [32], the

GFW interpreted every single port in the range as a distinct

Tor bridge and probed them separately. This experiment

resulted in 622 active probing connections (and signiﬁcantly

more TCP connections, as we will discuss later) to the VPS

in France.

4.3 Server Log Analysis

This dataset comes from the application logs of a server operated by one of
the authors, some stretching back to January 2010. The server runs various
common network services, including the three we use in the analysis: HTTP,
HTTPS, and SSH. In addition to common networking ports, the server has hosted
a Tor bridge since January 2011, an ordinary vanilla bridge without pluggable
transports. By mining the application logs, we found that the server has been
receiving active probes from China for over 2.5 years.

An important difference in the Log experiment compared to the others is that
it is not the result of artificial activity to induce probing. Our server
received its first probes in January 2013, slightly earlier than the first
public reports of obfs2 probing [31]. On average, it has received dozens of
probes every day since active probing began, though there are long stretches
during which it received few probes.

Footnote 1 (Section 4.2): Our earlier analysis confirmed that simply
establishing an initial TLS handshake with a server suffices to attract a
prober.

This dataset provides an invaluable longitudinal perspec-

tive, though with the signiﬁcant limitation that application

logs do not record as much forensic information as we would

like: they omit source ports and other transport-layer infor-

mation, and usually truncate probe payloads (see below).

Data Types and Ranges.

The HTTP and HTTPS logs go back to January 2010, before the earliest reports
of active probing of any kind. The SSH log dates only to September 2014;
probing of that port was already in effect at the beginning of the log. While
application logs do not contain as much information as would, say, packet
captures, they suffice in many cases to identify the type of probe and the
prober’s IP address. The HTTPS log is even superior to a packet capture in one
respect: for TLS probes it contains the decrypted application data, which
would not be accessible from a packet capture.

Along with the application logs, the system has full packet

captures for ports 23, 80, and 443, between December 2014

and May 2015. The packet captures enable us to perform

more detailed analysis on the network- and transport-layer

features of probes. We opened port 23 (Telnet) specially

during this period, and used it to host a multi-protocol honeypot
server capable of responding to probes of various types

(TLS, obfs2, and obfs3), and recording the protocol layers

inside them. (To avoid inadvertently capturing potentially

sensitive activity, we do not perform any packet capture or

additional logging on the SSH and Tor ports.)

The server from which we gathered the Log data is a working server that
performs a number of functions apart from simply absorbing active probes. In
order to distinguish active probes from operational traffic, we used a form of
conservative snowball sampling. We started by extracting incontrovertible
probes; namely, obfs2 probes and non-protocol-conforming payloads that we
could not otherwise explain, for example random binary garbage written to the
HTTP port. (It is easy to detect obfs2, with negligible false positives,
because of the protocol weakness described in Section 3.2. It requires only
the first 20 bytes sent by the client.) We then made a list of the IP
addresses that had sent those probes, and examined all other traffic they had
sent at any point in time. Although the GFW’s active probing rarely reuses
source IP addresses, we occasionally found a new probe type. When we did, we
added it to our list of known probes and repeated the process. In this way, we
slowly expanded our universe of known active prober IP addresses, all the
while checking manually to make sure we were not sweeping up non-probing
traffic.
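The snowball procedure above can be summarized as a fixed-point loop. The sketch below is a simplified model under stated assumptions: log entries are reduced to a source IP and a payload label produced by some classifier, and the manual check is represented by a hypothetical `confirm` callback. Neither the record layout nor the names come from the paper.

```python
from typing import Callable, Iterable, NamedTuple, Set, Tuple


class LogEntry(NamedTuple):
    src_ip: str
    probe_type: str  # label assigned by a payload classifier (hypothetical)


def snowball(
    entries: Iterable[LogEntry],
    seed_types: Set[str],
    confirm: Callable[[str], bool],
) -> Tuple[Set[str], Set[str]]:
    """Expand known probe types and prober IPs until nothing new is found."""
    entries = list(entries)
    known_types = set(seed_types)
    prober_ips: Set[str] = set()
    while True:
        # Step 1: every IP that sent a known probe type counts as a prober.
        prober_ips |= {e.src_ip for e in entries if e.probe_type in known_types}
        # Step 2: look at everything else those IPs ever sent.
        candidates = {
            e.probe_type
            for e in entries
            if e.src_ip in prober_ips and e.probe_type not in known_types
        }
        # Step 3: keep only candidates that the manual review confirms.
        new_types = {t for t in candidates if confirm(t)}
        if not new_types:
            return known_types, prober_ips
        known_types |= new_types
```

Seeded with only `{"obfs2"}`, the loop can reach probe types that never co-occur with obfs2 on the same address, provided a chain of shared source IPs links them. Traffic that the reviewer rejects (for example ordinary page requests) never widens the set.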

This conservative approach enabled us to ﬁnd a variety of

probing behaviors with few false positives, at the potential

cost of missing some novel probes from IP addresses that did

not also send a recognized probe type. This technique led

to our discovery of the “AppSpot” and “SoftEther” probes,

even though we seeded the process only with Tor-related

probes. Except for a handful of manually excluded hosts
(e.g., systems under our control), we did not consider the
source IP address in deciding whether a log entry indicated
a probe. Specifically, we did not employ IP geolocation to
find probes emanating from China, though Figure 7 shows
that the probes we found did in fact overwhelmingly come

from China.

We found a small amount of non-probing traﬃc (e.g., or-

dinary HTTP requests for actual pages) from IP addresses

that also sent probes at some other time. The shortest

separation between probe and non-probe was three weeks,

and the longest was two years.

Limitations.

Our application logs have several limitations. The HTTP

and HTTPS (Apache) log truncates at the ﬁrst ‘\0’, ‘\n’, or

‘\r\n’ sequence, and omits a leading ‘\n’ (we account for this

possibility when classifying probes). The SSH (OpenSSH)

log truncates at the ﬁrst ‘\0’, ‘\r’, or ‘\n’, and has a hard

limit of 100 bytes.

A signiﬁcant eﬀect of these limitations is that application

logs will not record any Tor probes as such, not even when

they are received by the HTTPS port, which removes the

outer TLS layer. The ﬁrst message that a Tor client sends

after the TLS handshake is a VERSIONS cell, which happens to

start with a ‘\0’ byte, causing the payload to be truncated

to a length of zero in the server log.
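The truncation rules above are easy to model, which makes their effect on different probe types concrete. The two functions below are a sketch of the rules as stated in this section, not code from Apache or OpenSSH, and the function names are made up for the example. The sample VERSIONS cell bytes are illustrative; the only property that matters is the leading zero byte.

```python
def apache_logged(payload: bytes) -> bytes:
    """Model of the HTTP/HTTPS log rule: drop one leading newline, then cut at
    the first NUL, CRLF, or LF."""
    if payload.startswith(b"\n"):
        payload = payload[1:]
    cuts = [
        i
        for i in (payload.find(b"\0"), payload.find(b"\r\n"), payload.find(b"\n"))
        if i != -1
    ]
    return payload[: min(cuts)] if cuts else payload


def openssh_logged(payload: bytes) -> bytes:
    """Model of the SSH log rule: cut at the first NUL, CR, or LF, and keep at
    most 100 bytes."""
    cuts = [
        i
        for i in (payload.find(b"\0"), payload.find(b"\r"), payload.find(b"\n"))
        if i != -1
    ]
    cut = min(cuts) if cuts else len(payload)
    return payload[: min(cut, 100)]


# Illustrative VERSIONS cell: it begins with a zero byte, so nothing is logged.
versions_cell = b"\x00\x00\x07\x00\x02\x00\x03"
print(apache_logged(versions_cell))  # b''
```

Text-like probes such as an HTTP request line survive this truncation largely intact, while binary payloads are logged only up to their first NUL or newline byte.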

4.4 Counterprobing

Active probers seem to share their IP address pools with normal Internet
users. To investigate this, we scanned some probers repeatedly using network
diagnostic tools such as ping, traceroute, and Nmap. We started the first scan
the moment a prober showed up. From then on, we repeated the scan hourly for
24 hours. Interestingly, the very first scan never yielded anything. The
probers were unresponsive to all packets. Later scans, however, painted a
different picture. In many cases, our port scans identified the IP addresses
that were used for active probing just hours before as various versions of
Windows and Linux. We found open ports used for file sharing, FTP, HTTP, RPC,
and many other services typical for home users. This indicates that in many
cases active probers share their address pools with Internet users. We were
able to run a small number of counterprobes from a host within China, and the
results matched those of simultaneous counterprobes from outside the country:
prober IP addresses were initially unresponsive, but occasionally became
responsive as disparate and seemingly ordinary Internet hosts.

With the previous experiments in hand, we then set out

to develop tests to probe the architecture of the passive

monitor that triggers the probes, and the active compo-

nent responsible for sending them. A unidirectional three-

packet sequence—SYN, ACK, and a Tor TLS client hand-

shake packet—sent from a Chinese host suﬃces to trigger

an active probe.

We built several tests and ran them from a Unicom host

and a CERNET host to our EC2 infrastructure. These tests

included a traceroute for the monitor; a fragmentation test

that splits the request across multiple TCP packets to expose
whether the passive detection maintains per-flow state;
a SYN-ACK traceroute that uses TTL-limited packets to
respond to probe requests; and a “milker” that repeatedly
sends triggering requests to a Tor bridge.
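The logic behind the fragmentation test can be shown with a small simulation. It contrasts a monitor that inspects each TCP segment in isolation with one that buffers payload per flow. The fingerprint bytes, class name, and addresses are hypothetical, and segments are assumed to arrive in order; the sketch says nothing about how the real monitor is implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical byte pattern standing in for whatever the monitor matches on.
FINGERPRINT = b"\x16\x03\x01HYPOTHETICAL-HELLO"

Flow = Tuple[str, int, str, int]  # (src IP, src port, dst IP, dst port)


def fragment(payload: bytes, size: int) -> List[bytes]:
    """Split a payload into consecutive segments of at most `size` bytes."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def per_packet_match(segments: List[bytes]) -> bool:
    """Stateless monitor: inspects each segment in isolation."""
    return any(FINGERPRINT in seg for seg in segments)


class ReassemblingMonitor:
    """Stateful monitor: accumulates payload per flow before matching."""

    def __init__(self) -> None:
        self.buffers: Dict[Flow, bytes] = {}

    def observe(self, flow: Flow, segment: bytes) -> bool:
        buf = self.buffers.get(flow, b"") + segment
        self.buffers[flow] = buf
        return FINGERPRINT in buf
```

If a fragmented request still triggers a probe, the detector must be keeping per-flow state, as `ReassemblingMonitor` does. If only unfragmented requests trigger probes, per-packet matching is the simpler explanation.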

Client           Succeeded     Failed        Total
CERNET vanilla     639 (12%)   4,490 (88%)   5,129
Unicom vanilla      90 (2%)    3,864 (98%)   3,954
CERNET obfs2     4,584 (98%)      81 (2%)    4,665
Unicom obfs2     4,153 (89%)     515 (11%)   4,668
CERNET obfs3     5,015 (98%)      95 (2%)    5,110
Unicom obfs3     3,402 (86%)     552 (14%)   3,954

Table 3: Connection success statistics for our clients in China. While vanilla
Tor is mostly unreachable, obfuscation protocols are mostly reachable.
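The percentages in Table 3 follow directly from the success and failure counts. The short sketch below recomputes them; the dictionary layout and function name are assumptions made for the example, and the counts are copied from the table.

```python
# (network, protocol) -> (succeeded, failed), copied from Table 3.
TABLE3 = {
    ("CERNET", "vanilla"): (639, 4490),
    ("Unicom", "vanilla"): (90, 3864),
    ("CERNET", "obfs2"): (4584, 81),
    ("Unicom", "obfs2"): (4153, 515),
    ("CERNET", "obfs3"): (5015, 95),
    ("Unicom", "obfs3"): (3402, 552),
}


def success_rate(succeeded: int, failed: int) -> float:
    """Fraction of connection attempts that completed the TCP handshake."""
    return succeeded / (succeeded + failed)


for (network, protocol), (ok, bad) in TABLE3.items():
    rate = success_rate(ok, bad)
    print(f"{network:6s} {protocol:8s} {ok + bad:5d} attempts, {rate:.0%} succeeded")
```

Keeping the raw counts alongside the rates makes it easy to see that the two Unicom rows with 3,954 attempts had fewer measurements than the other clients.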

5. ANALYSIS

In this section, we study the following questions:

• Is active probing successful at discovering Tor bridges? (§ 5.1)

• What is the delay between a connection attempt to a Tor bridge and
subsequent active probing of the bridge? (§ 5.2)

• Where in the network are the probes coming from? (§ 5.3)

• What types of probers do we see? (§ 5.4)

• Do active probers have “fingerprints” that distinguish them from normal
clients? (§ 5.5)

• What is the underlying architecture of the probing system? (§ 5.6)

5.1 Effectiveness of Active Probing

To determine whether China’s active probing infrastruc-

ture is eﬀectively discovering and blocking Tor bridges, we

analyzed the log ﬁles of our Tor clients inside China to see

which connection attempts were successful in connecting to

their assigned bridge. We consider a connection unblocked

if the TCP handshake succeeded, because previous work

showed that the Chinese censors block Tor bridges by dropping
SYN-ACK segments [32]; if a bridge is blocked, it will
be blocked at the TCP layer. We did this for each of the
protocols supported by our clients and bridges: “vanilla” Tor,
and the obfs2 and obfs3 obfuscation protocols.

Table 3 shows the results of this experiment. Our results

show that obfs2 and obfs3 were almost always reachable for

our clients both in CERNET and Unicom. This result is sur-

prising because the Great Firewall has the ability to probe

for obfs2 and obfs3 (Section 4.3). Vanilla Tor, on the other

hand, is almost completely blocked. Figure 4 illustrates our

attempts to connect to Tor bridges using vanilla Tor clients

over time. We find that although Tor is mostly blocked, Tor
clients succeed in connecting roughly every 25 hours. We
believe that this reflects an implementation artifact of the
GFW (e.g., many commercial firewalls “fail open” when their
access control lists are being updated).

5.2 Delay Between Connection and Probing

We investigate how long it takes the Chinese censors to probe a Tor bridge
after we initiate a connection to it. In 2012, Wilde observed that active
probing occurred in 15-minute intervals [30], suggesting that the censors
maintained a “probing queue” that was processed every 15 minutes. The Sybil
experiment helps us determine whether this is still the case. We calculated
the time between when we established a connection to our Sybil node in France
and when we observed the Chinese censors’ subsequent probing connections.

[Figure 4 plot: blocked and available connection attempts over time (12-17 to
01-21) for the vanilla CERNET and vanilla Unicom clients.]

Figure 4: Successful Tor connections of our vanilla Tor clients in China over
time. Every dot represents one connection attempt.

Figure 5 illustrates the results of this experiment. It shows the time difference in seconds (Y-axis) for every port (X-axis) on our decoy bridge. There are two interesting aspects. First, 56% of all probing connections arrived after less than one second (median 552 ms). We conclude that the Chinese censors have abandoned a 15-minute queue and now operate in real time. Second, there are several curious delay spikes where the Chinese censors took more than one minute to probe our bridge. We removed four outliers that had a probing delay above 800 seconds. Interestingly, all spikes decreased linearly until they reached the default of real-time probing.
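The delay computation described here can be sketched as below: for each port, take the time between our decoy connection and the first probe that follows it, drop outliers, and report the median and the sub-second share. The function names, the input layout, and the sample values are assumptions made for illustration; they are not the paper's data or tooling.

```python
from statistics import median

def probing_delays(decoy_times, probe_times):
    """Map each port to the delay (seconds) between decoy and first probe.

    decoy_times: {port: time of our decoy connection}
    probe_times: {port: [times of probing connections to that port]}
    Ports that were never probed after the decoy are skipped.
    """
    delays = {}
    for port, t_decoy in decoy_times.items():
        later = [t for t in probe_times.get(port, []) if t >= t_decoy]
        if later:
            delays[port] = min(later) - t_decoy
    return delays

def summarize(delays, outlier_cutoff=800.0):
    """Drop delays above the cutoff, then report median and sub-second share."""
    kept = [d for d in delays.values() if d <= outlier_cutoff]
    under_1s = sum(1 for d in kept if d < 1.0) / len(kept)
    return median(kept), under_1s

# Made-up example data, not measurements from the paper.
decoys = {30000: 0.0, 30001: 10.0, 30002: 20.0, 30003: 30.0}
probes = {30000: [0.4], 30001: [10.6, 50.0], 30002: [95.0], 30003: [900.0]}
med, share = summarize(probing_delays(decoys, probes))
```

The 800-second cutoff mirrors the outlier threshold mentioned in the text; everything else is a placeholder.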

Figure 6 shows a superset of the same data, but with the bridge port on the Y-axis and the absolute time on the X-axis. The line to the left consists of our decoy connections (red circles), which were quickly followed by probing connections (black crosses); this is essentially the data shown in Figure 5. What Figure 5 did not show is the line to the right, a secondary “swarm” of probing connections. We captured these connections approximately 12 hours after our initial decoy connections (at which point our bridge was no longer running). The probers tried to connect several times, without success. Presumably, the Chinese censors were reconnecting to all bridges in order to verify whether they were still online.

[Figure 5 plot. X-axis: Port on decoy Tor bridge (30000 to 30500). Y-axis: Probing delay (sec), 0 to 150.]

Figure 5: The time difference between decoy connections through our Sybil node and probing connections for every port on our Tor decoy bridge; the censors probe the ports on this bridge in rapidly ascending order. While the censors probed most ports immediately, some port ranges were only scanned significantly later.

[Figure 6 plot. X-axis: Time (19:00 to 10:00). Y-axis: Destination port (30000 to 30400). Legend: Decoy connection (circles), Probing connection (crosses).]

Figure 6: The arrival rate of probing connections. Decoy connections through our Sybil node are immediately followed by probing connections from the Chinese censors (the red circles and first set of black crosses are nearly on top of one another). After approximately 12 hours, the Chinese censors probed the same port range again.

5.3 Origin of the Probers

In total, we collected 16,083 unique prober IP addresses. There are 158 IP addresses in the Shadow dataset; 1,182 in the Sybil; and 14,912 in the Log. 111 of the addresses appear in two of the three datasets, and just one, 202.108.181.70, appears in all three.

Prober IP addresses are rarely reused. Consider just the Log experiment: it recorded 16,464 probes carrying 15,249 distinct payloads; these probes were sent by 14,163 distinct IP addresses. 95% of addresses appear only once; another 4% appear twice. This lack of IP address reuse precludes simple blacklisting as a counter to active probing: a blacklisted prober IP is unlikely to be seen again. There is one clear outlier among probers, the aforementioned 202.108.181.70, which appears 248 times in the Log dataset. This special IP address has previously been associated with the GFW’s active probing by Winter and Lindskog [32] and Majkowski [16]. We have observed it to send the obfs2, obfs3, and TLS probe types. It lies in AS4837, along with many other probers. No other prober IP address is in its /16 network.
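The address-reuse statistics in this section (the share of addresses seen once, twice, and so on) amount to a histogram of probes per source address. A minimal sketch follows, assuming the probe log has already been reduced to a list of source addresses; the function name and the sample addresses are illustrative only.

```python
from collections import Counter

def reuse_histogram(probe_source_ips):
    """Return {k: share of addresses that sent exactly k probes}."""
    per_ip = Counter(probe_source_ips)      # probes per address
    by_count = Counter(per_ip.values())     # addresses per probe count
    total = len(per_ip)
    return {k: n / total for k, n in sorted(by_count.items())}

# Made-up addresses from documentation ranges, not real prober IPs.
log = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.3", "198.51.100.7"]
hist = reuse_histogram(log)
```

When nearly all of the mass sits at k = 1, as reported for the Log experiment, a blocklist of previously seen prober addresses would rarely match a future probe.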

Various measurements place the probing IP addresses al-

most entirely within China. Figure 7 shows the distribution

of autonomous systems of prober IP addresses. Nearly ev-

ery probe is from a Chinese AS. Table 4 shows the DNS

Start of Authority (SOA) of the prober IP addresses. The

most common SOA is ns1.apnic.net (Asia Paciﬁc Network

Information Centre, the regional Internet registry for the

Asia Paciﬁc region). The next most common SOAs are .cn

domains and Chinese ISPs.

5.4 Probe Types

We categorized the probes we received into probe types. We discovered five distinct probe types across all four of our experiments, plus one “TLS” pseudo-type, as we describe below. Although the probes differ in important ways, they have in common the targeting of circumvention protocols. Section 5.5 describes how seemingly independent probes share low-level features (suggesting that they may originate from shared physical infrastructure), and highlights some of the features that may be useful for distinguishing active probers from genuine clients.

Count        SOA
7,379        ns1.apnic.net
2,013        none
501–1,000    ns.jlccptt.net.cn, ns.zjnbptt.net.cn
201–500      ns.sdjnptt.net.cn, ns3.tpt.net.cn, soa,
             ns.sxtyptt.net.cn, ns.hbwhptt.net.cn,
             ns.bta.net.cn, ns.hazzptt.net.cn,
             hzdns.zjnetcom.com, dns.fz.fj.cn,
             nmdns.hh.nm.cn
101–200      ns.timeson.com.cn, dnssvr1.169ol.com,
             HNGTDNS2.hunan.unicom.com,
             HNGTDNS1.hunan.unicom.com,
             nmc1.ptt.js.cn, NS2.NET.EDU.CN,
             ns.dcb.ln.cn, ns1.jscnc.net,
             ns.yn.cninfo.net, ns1.ah.cnuninet.net,
             localhost.localdomain
1–100        53 others

Table 4: DNS Start of Authority of prober IP addresses.

TLS.

The “TLS” probe type is a TLS client hello, either one whose application payload we did not observe, or one that did not match any more specific probe type. This was the case, for example, in the Log experiment when a TLS probe arrived at a plaintext port: the application log records the beginning of the TLS header but nothing else. “TLS” probes could have been one of our other known types (Tor, SoftEther, or AppSpot) or something else entirely.

Tor.

The “Tor” probe type is a Tor VERSIONS cell received within a TLS connection. The probes we observed used a relatively obsolete “v2 handshake,” superseded since 2011 [28]. This probe type is presumably aimed at the discovery of Tor bridges.

Obfs2.

The “obfs2” probe type is the client part of the handshake of obfs2. Recall that obfs2’s handshake is intended to look like a random bytestream; however, its in-band key transmission makes it easy to identify, a great aid to our retrospective analysis. The first 20 bytes of a connection suffice to identify obfs2 with a negligible error rate.

[Figure 7 plot: “Probes by AS, Log experiment, ports 22, 23, 80, and 443.” X-axis: Number of probes (0 to 8000). Legend: empty (0.8%), short (14.2%), obfs2 (51.7%), obfs3 (27.2%), TLS (1.4%), SoftEther (0.3%), AppSpot (3.9%), HTTP (0.4%). Bars, top to bottom:
AS4809   CHINATELECOM-CORE-WAN-CN2 China Telecom Next Generation Carrier Network, CN
AS17969  CNNIC-KUANCOM-AP Beijing Kuancom Network Technology Co., Ltd., CN
AS24547  CMNET-V4HEBEI-AS-AP Hebei Mobile Communication Company Limited, CN
AS56047  CMNET-HUNAN-AP China Mobile communications corporation, CN
AS17638  CHINATELECOM-TJ-AS-AP ASN for TIANJIN Provincial Net of CT, CN
AS56046  CMNET-JIANGSU-AP China Mobile communications corporation, CN
AS38019  CMNET-V4TIANJIN-AS-AP tianjin Mobile Communication Company Limited, CN
AS17622  CNCGROUP-GZ China Unicom Guangzhou network, CN
AS9394   CTTNET China TieTong Telecommunications Corporation, CN
AS9808   CMNET-GD Guangdong Mobile Communication Co. Ltd., CN
AS17816  CHINA169-GZ China Unicom IP network China169 Guangdong province, CN
AS4808   CHINA169-BJ CNCGROUP IP network China169 Beijing Province Network, CN
AS17676  GIGAINFRA Softbank BB Corp., JP
AS4538   ERX-CERNET-BKB China Education and Research Network Center, CN
AS7497   CSTNET-AS-AP Computer Network Information Center, CN
AS4134   CHINANET-BACKBONE No.31, Jin-rong Street, CN
AS4837   CHINA169-BACKBONE CNCGROUP China169 Backbone, CN]

Figure 7: Three autonomous systems (AS4837, AS4134, and AS7497) account for 95% of probes observed in the Log experiment. We identified probing behavior by content, not by source address. Despite that, essentially all probes turn out to originate from China. All of the main ASes sent a variety of probe types; they do not appear to specialize. The most prolific prober, 202.108.181.70, lies in AS4837.

Obfs3.

The “obfs3” probe type is an obfs3 client handshake message. By design, obfs3 is resistant to passive detection, which poses problems for our retroactive analysis. Though we have many samples of probes whose properties are consistent with obfs3 (that is, they appear random, are between 192 and 8,386 bytes long, and are not obfs2), it is not possible to say with certainty that they are obfs3 probes (and not, for example, some other random-looking protocol). To test our guess that the probes were probably obfs3, for a short period (Feb. 3–19, 2015) we enabled an obfs3 listener on the multi-protocol honeypot running on the server of the Log experiment. The listener would complete the server half of the obfs3 handshake, and then log everything received within the encrypted channel. By participating in the handshake, we found that the probes we would have suspected of being obfs3 were, in fact, obfs3. We have therefore labeled all probes bearing such a signature “obfs3,” even though it is confirmed to be the case only in a small fraction of cases.
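The labeling rule for suspected obfs3 probes (random-looking, between 192 and 8,386 bytes, and not obfs2) can be sketched as a simple heuristic. The text does not say how “appears random” was tested; the byte-entropy check and its 6.0-bit threshold below are assumptions chosen for illustration, as are the function names. The obfs2 decision is taken as an input rather than reimplemented.

```python
import math
from collections import Counter

OBFS3_MIN_LEN = 192     # from the text: shortest consistent payload
OBFS3_MAX_LEN = 8386    # from the text: longest consistent payload

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

def consistent_with_obfs3(payload: bytes, is_obfs2: bool,
                          min_entropy: float = 6.0) -> bool:
    """Heuristic: right length, looks random, and is not obfs2.

    min_entropy is a hypothetical threshold chosen for this sketch.
    """
    if is_obfs2 or not OBFS3_MIN_LEN <= len(payload) <= OBFS3_MAX_LEN:
        return False
    return byte_entropy(payload) >= min_entropy
```

As the section notes, a payload passing such a test is only consistent with obfs3; confirming it requires completing the handshake.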

SoftEther.

The “SoftEther” probe type is an HTTPS POST request: “POST /vpnsvc/connect.cgi HTTP/1.1”. It resembles the first part of the client handshake of SoftEther VPN, the software that powers the VPN Gate circumvention system [21]. We discovered this probe type because it appeared many times in our HTTPS log; and we were able to observe it in detail (including its POST body) when it arrived on port 23 during the time we were running the multi-protocol honeypot listener. We further conjecture that some of the “TLS” probes that arrived at port 80 were in fact SoftEther probes, because they share a TCP timestamp sequence with other SoftEther probes. We first observed SoftEther probes in Aug. 2014. This is five months after the first active probes seen by the creators of VPN Gate, shortly after the release of their software [21].

AppSpot.

The “AppSpot” probe type is an HTTPS GET request with a special Host header:

GET / HTTP/1.1

Host: webncsproxyXX.appspot.com

The ‘XX’ is a number that varies across probes. The Host header reveals that this probe type is likely intended to discover servers that are capable of providing access to Google App Engine through domain fronting [8]. When a typical web server receives a request with an alien Host header such as this, it will respond with its default document, or an error message. But when a Google server receives the request, it will forward the request to the web application running at webncsproxyXX.appspot.com. Circumventors can and do run various proxies on App Engine using precisely this technique. These probes would be useful for eliminating any gaps left in the GFW’s near-total block of Google services. We observed this probe type only on port 443. Though originating from the same pool of IP addresses as the other probe types, this one seems somewhat independent, in its TLS fingerprint, the rate of its TCP timestamp counter (Figure 11c), and the temporal patterns in its activity (Figure 8).

We crawled webncsproxyXX.appspot.com for values of XX between 1 and 100. Some of them were instances of GoAgent (a circumvention tool based on App Engine), while others were instances of a web-based proxy. This probe type exists in a few variations. For a time, the probes switched from requesting / to requesting /twitter.com, which would have caused the webncsproxy application to retrieve the Twitter home page. In July 2015, AppSpot probes began to arrive in pairs separated by a few seconds and originating from different IP addresses. The second probe in a pair had a different set of header fields; it also had a similar Host: webncsproxyXX.appspot.com header, though the value of ‘XX’ was generally different from that of the first probe.

Figure 8 shows the complete probe history of the HTTP and HTTPS ports in the Log dataset. It is apparent that obfs2 and obfs3 are temporally related, and that “short” probes of less than 20 bytes also follow the same temporal pattern. The volume of TLS and SoftEther probes is much less than that of the other probe types. AppSpot appears to follow its own independent pattern. There are conspicuous lulls in probing behavior: between Dec. 2013 and Aug. 2014; between Feb. and Mar. 2015; and after May 2015. We do not know why probing nearly ceased during these periods.

5.5 Fingerprinting Active Probers

We now seek out telltale ﬁngerprints in active probing:

clues that may help distinguish probers from genuine clients,

as well as clues as to how the probing infrastructure is im-

plemented. We proceed layer by layer, starting with the

IP layer and moving up through several application layers.

Our analysis shows that despite there being a large number

of probing IP addresses, there are likely only a small number

of independent processes controlling them. In particular, our

analysis of TCP initial sequence numbers and timestamps

shows that shared state exists between disparate probing IP

addresses.

IP Layer.

In 2014, anonymous researchers inferred the structure of a DNS-poisoning censor node by analyzing side channels in the IP TTL and IP ID [1]. Figure 9 shows the TTL distribution of all SYN segments we received from probers in our Sybil experiment. Five TTL values account for the overwhelming majority of observed TTLs. We did not, however, find any discernible patterns in the distribution of the IP ID field.

TCP Layer.

The TCP header is rich with potentially fingerprintable fields, particularly initial sequence numbers (ISNs), source ports, timestamps, and options. Patterns across TCP connections initiated by different IP addresses indicate that all the traffic originates from only a few processes (two in the Shadow dataset, one in the Sybil dataset, and about a dozen in the Log dataset). We analyzed in detail the TCP headers of SYN segments received in the Sybil experiment and found that they are all very similar.

[Figure 9 plot. X-axis: TTL value (36, 41, 42, 43, 45, 46, 47, 48, 49, 50, 51). Y-axis: Frequency (0 to 1500).]

Figure 9: The TTL values in the SYN segments sent by probers. 99% of all observed TTL values lie between 45 and 51.

TCP options: With respect to TCP options, all SYN segments:

• Employ an MSS of either 1400 (88% of probes) or 1460 (12%).

• Use window scaling of 7.

• Permit selective ACKs.

• Use the TCP timestamp option.

• Use the “no operation” option.

In particular, the OS identification tool p0f3 [35] yielded the following (truncated) signatures. The trailing fields, from the window scale of 7 onward, are identical in all three observed signatures.

0:1460:mss*4,7:mss,sok,ts,nop,ws:df,id+:0 (1% of probes)

0:1460:mss*20,7:mss,sok,ts,nop,ws:df,id+:0 (11% of probes)

0:1400:mss*4,7:mss,sok,ts,nop,ws:df,id+:0 (88% of probes)
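A server operator could compare incoming SYN segments against the option layout reported above. The sketch below assumes the SYN has already been parsed into the three fields shown; the `SynFeatures` type and the function name are hypothetical. A match is at most a weak signal, since ordinary clients may share the same layout.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SynFeatures:
    mss: int            # value of the MSS option
    window_scale: int   # value of the window scale option
    options: tuple      # option kinds in order, using p0f-style names

# Values taken from the signatures and bullet list in the text.
PROBER_OPTION_LAYOUT = ("mss", "sok", "ts", "nop", "ws")
PROBER_MSS_VALUES = {1400, 1460}

def matches_prober_syn(syn: SynFeatures) -> bool:
    """True if a SYN has the option layout seen in the Sybil experiment."""
    return (syn.options == PROBER_OPTION_LAYOUT
            and syn.window_scale == 7
            and syn.mss in PROBER_MSS_VALUES)
```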

Initial sequence numbers: To protect TCP connections from off-path attackers, initial sequence numbers must not be guessable by an attacker. Modern operating systems use strong randomness to select ISNs. As a result, if all probing connections came from independent systems, we would expect no patterns in the distribution of ISNs.

We extracted the 32-bit ISNs of all SYN segments sent by probers. Figure 10 shows the ISN value on the Y-axis versus the time captured on the X-axis. To our surprise, the time series shows a striking, non-random pattern. Instead of uniformly distributed points over time, we see a “zigzag” pattern. ISNs increase until 2^32 and then wrap around to 0. We conclude that the infrastructure derives ISNs from the current time.

The Sybil experiment induces a large amount of active probing over a short period of time, which was necessary for finding this ISN pattern. Our other experiments had too low a sample rate for the pattern to become apparent.
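One way to test for a time-derived ISN, as opposed to a random one, is to check whether consecutive ISNs grow at a constant rate modulo 2^32. The sketch below is an illustration under stated assumptions: the function names, the 5% tolerance, and the 250,000-units-per-second counter in the synthetic data are all hypothetical, and the check only works when samples are dense enough that at most one wraparound occurs between neighbors (which is why a high sample rate matters).

```python
MOD = 2 ** 32

def isn_rates(samples):
    """Per-interval ISN growth rates (ISN units per second), modulo 2^32.

    samples: list of (time_in_seconds, isn) tuples sorted by time.
    Taking the difference modulo 2^32 undoes a single wraparound between
    consecutive samples.
    """
    rates = []
    for (t0, s0), (t1, s1) in zip(samples, samples[1:]):
        if t1 > t0:
            rates.append(((s1 - s0) % MOD) / (t1 - t0))
    return rates

def looks_clock_derived(samples, tolerance=0.05):
    """True if all rates lie within `tolerance` of their median."""
    rates = sorted(isn_rates(samples))
    if len(rates) < 2:
        return False
    mid = rates[len(rates) // 2]
    return mid > 0 and all(abs(r - mid) <= tolerance * mid for r in rates)

# Synthetic data: a hypothetical counter advancing 250,000 units per second.
clocked = [(t, (4_000_000_000 + 250_000 * t) % MOD) for t in range(0, 3600, 60)]
```

Independently chosen random ISNs would produce wildly varying rates and fail the check.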

Source ports: We did not find any apparent patterns in the distribution of source ports. Interestingly, however, the source port distribution covers the entire 16-bit port range, including ports below 1024. This selection of ephemeral ports differs from that of standard operating systems, which typically only use port numbers above 1024.

[Figure 8 plot: “Probes per day by type, Log experiment, ports 80 and 443.” X-axis: Jan 2013 to Sep 2015. Rows: empty, short, obfs2, obfs3, TLS, SoftEther, AppSpot, HTTP. Count scale: 1, 5, 15, 30.]

Figure 8: Probe types and volume of the HTTP and HTTPS ports in the Log experiment. The log file starts in Jan. 2011, but probes only began in 2013. The “HTTP,” “AppSpot,” and “SoftEther” rows are HTTPS requests to port 443; the others (including “TLS”) are probes to port 80. The “short” probes are those that appear random, but are too short (< 20 bytes) for the obfs2 test. We believe that the “short” and “empty” probe types are actually truncated “obfs2” or “obfs3” probes. (Apache’s log file truncates requests at the first ‘\0’ or ‘\n’ byte; a random byte string has about a 15% chance of being truncated in the first 20 bytes.) The “HTTP” row represents not probes, but ordinary requests for web pages on the server. They may be ordinary web users that happened to have an IP address that at another time sent some other type of probe; or they may be firewall operators web browsing from their probing infrastructure. (The requested pages were related to circumvention, which would be of interest to Chinese Internet users and firewall operators alike.) Two “HTTP” data points, at Aug. 2011, are not shown on the graph. They are from an IP address that would later send a “short” probe in May 2013.

[Figure 10 plot. X-axis: Time (09:00 to 13:00). Y-axis: Initial sequence number (0 to 4e+09).]

Figure 10: Initial sequence numbers of SYN segments sent by probers. Despite the probers coming from different IP addresses, a clear linear pattern manifests.

TCP timestamps: We extracted the TSval from the TCP timestamp option [2] in all SYN segments sent by probers in the Shadow, Sybil, and Log experiments. Figure 11 illustrates the result. Though the SYN segments came from many different IP addresses, their timestamps form only a small number of distinct sequences (visible as straight lines in the graphs). We can characterize every line by its slope (i.e., its timestamp clock rate) and intercept (i.e., its system uptime). If all probes came from different physical systems, we would expect a larger number of distinct sequences.
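The slope and intercept of a timestamp sequence can be estimated with an ordinary least-squares fit over (capture time, TSval) pairs. The sketch below is illustrative: the function names, the residual bound, and the 1000 Hz clock in the synthetic data are assumptions, and TSval wraparound is ignored for brevity.

```python
def fit_line(points):
    """Least-squares fit y = slope * x + intercept for (x, y) pairs.

    Needs at least two points with distinct x values.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def share_one_clock(points, max_residual=5.0):
    """True if every TSval lies within max_residual ticks of the fitted line."""
    slope, intercept = fit_line(points)
    return all(abs(y - (slope * x + intercept)) <= max_residual
               for x, y in points)

# Synthetic SYNs from "different" addresses driven by one 1000 Hz clock.
syns = [(t, 3_647_000_000 + 1000 * t) for t in (0.0, 1.5, 7.25, 30.0, 120.0)]
slope, intercept = fit_line(syns)
```

SYNs from many source addresses that fall on one such line are evidence of shared state behind those addresses; SYNs from independent hosts would generally not share a single slope and intercept.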

SSL/TLS Layer.

The Tor protocol is encapsulated within TLS. The TLS protocol has many features that enable fingerprinting. We analyze the TLS “client hello,” the first message sent by a client (or an active prober) after establishing a TCP connection. We used a TLS fingerprinting patch [15] for the passive network fingerprinting tool p0f [35]. We captured a total of 621 client hellos in the Sybil dataset. They all had the same TLS fingerprint: TLSv1.0, with a particular list of 11 cipher suites, support for the TLS session ticket extension, and support for compression.

To better understand how common this fingerprint is, we extracted the offered cipher suites of all Tor clients connecting to a Tor guard relay under our control.² Over a 24-hour period, we observed 236,101 client hellos, out of which only 67 used the cipher suites listed above.

Application Layer—Tor.

After the TLS handshake, a Tor client is supposed to send

a VERSIONS cell [5, §4], in which it declares what versions of

the Tor protocol it supports. After that there is further

interaction before the establishment of a Tor circuit.

² We extracted only the cipher suites and did not capture or store identifying information such as IP addresses.

[Figure 11 plots. (a) Shadow experiment: TCP TSval (1.5e+09 to 2.5e+09) over time (Jan 01 to Feb 01), series Bridge 1 and Bridge 2. (b) Sybil experiment: TCP TSval (3647000000 to 3648500000) over time (09:00 to 13:00). (c) Log experiment: TCP TSval (0 to 4e+09) over time (Jan to Mar), by probe type: obfs2, obfs3, random, empty, short, SoftEther, AppSpot, TLS.]

Figure 11: The TCP TSval value of SYN segments received from active probers.

By inspecting the log files of a private Tor bridge, we found that probers send only the VERSIONS cell, and none of the other cells that would normally follow in order to establish a full Tor connection. After receiving a VERSIONS cell, our bridge, following the protocol specification, replied with a NETINFO cell, after which the probers abruptly closed the connection. The probers’ VERSIONS cell declares support for Tor protocol versions 1 and 2, which were superseded in Oct. 2011 [18], and are now only supported for backward compatibility (the current protocol version is 4). The fact that probers use such an old protocol version suggests to us that the Tor probes were developed in 2011 or earlier, and not materially updated since. We briefly modified a Tor bridge to ignore connections that offered only versions 1 and 2 and found that such a bridge does not get blocked despite being probed. (This is unfortunately not a universally deployable defense against probing because there are still some old clients that support only old protocol versions, and ignoring their handshakes would cut them out of the network.)
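The version check used in the modified bridge can be illustrated with a small parser. This is a sketch, not the patch applied to Tor: it assumes the VERSIONS cell layout of a 2-byte circuit ID, a 1-byte command (7), a 2-byte payload length, and one 2-byte big-endian integer per version, and the function names are hypothetical.

```python
import struct

VERSIONS_COMMAND = 7  # command number of the VERSIONS cell

def parse_versions_cell(cell: bytes):
    """Return the list of link protocol versions in a VERSIONS cell.

    Layout assumed here: 2-byte circuit ID, 1-byte command, 2-byte payload
    length, then one 2-byte big-endian integer per version.
    """
    if len(cell) < 5:
        raise ValueError("cell too short")
    _circ_id, command, length = struct.unpack("!HBH", cell[:5])
    payload = cell[5:5 + length]
    if command != VERSIONS_COMMAND or len(payload) != length or length % 2:
        raise ValueError("not a well-formed VERSIONS cell")
    return list(struct.unpack(f"!{length // 2}H", payload))

def offers_only_v1_v2(cell: bytes) -> bool:
    """True if the cell advertises nothing newer than versions 1 and 2."""
    versions = parse_versions_cell(cell)
    return bool(versions) and set(versions) <= {1, 2}
```

As the paragraph above cautions, rejecting such cells would also lock out genuine clients that support only old protocol versions.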

Application Layer—obfs2.

The active probers’ implementation of obfs2 conforms with

the protocol speciﬁcation: A 16-byte seed, an encrypted

“magic” number, a padding length of 0–8,192 bytes, and

random padding. The amount of random padding matches

the declared padding length, except in a small number of

cases when the TCP stream ended prematurely because of

missing segments. The probers do not send anything in the

encrypted payload layer inside the obfs2 obfuscation layer,

even when communicating with an actual obfs2 server. In

genuine use of Tor with obfs2, there is TLS within the obfs2

layer, but within the active probers’ obfs2 there is no pay-

load at all—the prober simply terminates the TCP connec-

tion. It seems that the mere existence of an obfs2 server is

suﬃcient evidence for the censor.

Obfs2 clients should use fresh randomness for every con-

nection. We were therefore surprised to ﬁnd a few instances

of duplicate obfs2 bytestreams. If two obfs2 streams have

identical payloads, they must have had identical session keys

and identical random padding. Out of 8,479 obfs2 probes re-

ceived in the Log experiment, 56 (0.7%) were part of a pair

having identical payloads. We did not find any payload that
occurred more than twice. In every case, the paired probes

came from diﬀerent IP addresses, and arrived at nearly the

same time, never more than ﬁve seconds apart. The prob-

ability that two independent obfs2 streams are identical is

negligible; therefore the probers must share some state be-

hind the scenes—at least a weak random number generator

if not actually complete process state.

This apparent “state leakage” shows up in another way.

Occasionally, an IP address that sent one half of a pair

of identical obfs2 probes also sent, nearly simultaneously,

a probe of some other type. Here is a sample from the Log

experiment on port 80. Two different IP addresses sent the

same obfs2 payload in the same second. One second later,

one of the two additionally sent a TLS probe:

2014-08-29 15:44:01 60.216.143.31 obfs2 eef890766636...

2014-08-29 15:44:01 14.135.253.56 obfs2 eef890766636...

2014-08-29 15:44:02 14.135.253.56 tls 160301
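Duplicate payloads of this kind can be found by grouping log records by payload and looking for different source addresses within a few seconds of each other. The sketch below uses the three sample lines above as input; the record layout and function name are assumptions, and because the payloads shown in the text are truncated, the comparison here only covers the printed prefixes.

```python
from collections import defaultdict
from datetime import datetime

def duplicate_pairs(records, max_gap_seconds=5):
    """Find payloads sent by different IP addresses at nearly the same time.

    records: iterable of (timestamp_string, ip, probe_type, payload_hex).
    Returns a list of (payload_hex, ip_a, ip_b) tuples.
    """
    by_payload = defaultdict(list)
    for ts, ip, probe_type, payload in records:
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        by_payload[(probe_type, payload)].append((when, ip))
    pairs = []
    for (_, payload), seen in by_payload.items():
        seen.sort()  # by time, then by address
        for (t_a, ip_a), (t_b, ip_b) in zip(seen, seen[1:]):
            if ip_a != ip_b and (t_b - t_a).total_seconds() <= max_gap_seconds:
                pairs.append((payload, ip_a, ip_b))
    return pairs

# The three sample log lines from the text; payloads are the truncated
# prefixes shown there, so equality here only compares those prefixes.
sample = [
    ("2014-08-29 15:44:01", "60.216.143.31", "obfs2", "eef890766636..."),
    ("2014-08-29 15:44:01", "14.135.253.56", "obfs2", "eef890766636..."),
    ("2014-08-29 15:44:02", "14.135.253.56", "tls", "160301"),
]
```

The five-second window matches the largest gap reported for paired obfs2 probes.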

Curiously, the active probers do more work to detect obfs2

than they strictly have to. Although the protocol does not

require it, existing server implementations send their seed,

magic number, and padding immediately upon receiving a

TCP connection, without waiting for the client’s half of the handshake. A stealthier active prober would open a TCP connection and simply listen. It would be able to detect current obfs2 servers without being so conspicuous.

Application Layer—obfs3.

In contrast to obfs2, the active probers’ implementation of obfs3 differs from the specification in an interesting way that does not affect protocol semantics. The specification calls for each side to send 0–8,194 bytes of random padding, but in two chunks, each with a length that is a uniform random number between 0 and 4,097. Instead, the active probers send their padding all at once, in a single chunk of length between 0 and 8,194. It is therefore possible to fingerprint active probers in 50% of cases: if the first padding chunk is longer than 4,097 bytes, then the client is an active prober.
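The padding-length test and the “50% of cases” figure can be worked out directly. The sketch below assumes the prober draws its single chunk length uniformly from 0 to 8,194, which the text implies but does not state explicitly; the constant and function names are illustrative.

```python
from fractions import Fraction

MAX_CHUNK = 4097    # largest padding chunk a conforming client sends
MAX_TOTAL = 8194    # largest single chunk the probers were seen to send

def first_chunk_reveals_prober(first_chunk_len: int) -> bool:
    """A first padding chunk longer than 4,097 bytes cannot be a real client."""
    return first_chunk_len > MAX_CHUNK

# If the prober's chunk length is uniform on 0..8194 (an assumption),
# the share of probes that give themselves away is 4097/8195:
detectable = Fraction(MAX_TOTAL - MAX_CHUNK, MAX_TOTAL + 1)
```

Under that assumption the detectable share is 4,097 out of 8,195 possible lengths, just under one half.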

As with obfs2, we observe some duplicated probe pay-

loads. Out of 4,493 obfs3 probes received, 82 (1.8%) share

their exact payload with another. The elements of a pair

arrive within a few seconds of one another. As with obfs2,

there is no payload within the obfuscation layer.

Application Layer—SoftEther.

The SoftEther probe—an HTTPS POST request with a

particular request body—matches one formerly sent by the

genuine SoftEther VPN client. However, since July 2014,

the genuine client has included a Host header containing the

IP address of the VPN server. The active probers do not set

this header, making them distinguishable.

There is another way to identify SoftEther probes. Although the SoftEther VPN protocol lacks documentation, in our experiments with version 4.15 we found that the client software always sends a GET request before sending its POST. The purpose of the GET request is to determine whether the server is SoftEther VPN and not some other HTTPS server. Because the active probers do not send the preceding GET request, we can distinguish them from legitimate clients.
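The two SoftEther distinguishers described in this section (a missing Host header and a missing preceding GET) can be combined into one check. This sketch assumes the requests of one client have been collected in arrival order as (method, path, headers) tuples with lower-cased header names; that input layout, the function name, and the GET path used in the usage example are hypothetical, since the text does not specify them.

```python
def classify_softether_post(requests):
    """Classify a client whose requests include the SoftEther POST.

    requests: list of (method, path, headers) tuples in arrival order, with
    header names lower-cased. Returns "prober", "client", or "other".
    """
    for index, (method, path, headers) in enumerate(requests):
        if method == "POST" and path == "/vpnsvc/connect.cgi":
            had_get = any(m == "GET" for m, _, _ in requests[:index])
            if "host" not in headers or not had_get:
                return "prober"
            return "client"
    return "other"

# Usage with made-up requests (203.0.113.5 is a documentation address).
probe = [("POST", "/vpnsvc/connect.cgi", {})]
genuine = [
    ("GET", "/", {"host": "203.0.113.5"}),
    ("POST", "/vpnsvc/connect.cgi", {"host": "203.0.113.5"}),
]
```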

The TLS ﬁngerprint of the SoftEther probes diﬀers strik-

ingly from that of the actual SoftEther VPN client software,

which has more and newer ciphersuites, and various exten-

sions.

Application Layer—AppSpot.

The special Host header of the AppSpot probe type is a dead giveaway to its purpose of discovering servers capable of fronting access to Google App Engine. All the probes we saw carry a fairly distinctive and specific User-Agent string, which is probably spoofed, as the rest of the header is inconsistent with its purported version of Chromium. The declared version of the browser was originally released in Apr. 2014, and superseded just two weeks later by a new update. We found a small number of real web requests using this User-Agent, but the great majority were active probers. The first AppSpot probes arrived in Sep. 2014.

Among other header inconsistencies, the probes set the

header Accept-Encoding: identity, which forces the server to

send the response body uncompressed. We used this char-

acteristic to weed out the small number of non-prober re-

quests that happened to use the same User-Agent string—

these requests, using a real Chromium browser, would have

set Accept-Encoding: gzip, and the server would have com-

pressed its response. We can therefore identify active probes

in our server logs because the number of transferred bytes

is greater than it should be.
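Where request headers are available, the two AppSpot characteristics discussed here (the webncsproxyXX.appspot.com Host header and Accept-Encoding: identity) can be checked directly. The sketch below works on parsed headers rather than on transferred byte counts as in our server logs; the function names and the lower-cased header dictionary are assumptions.

```python
import re

APPSPOT_HOST = re.compile(r"^webncsproxy(\d+)\.appspot\.com$", re.IGNORECASE)

def appspot_probe_number(headers):
    """Return XX from Host: webncsproxyXX.appspot.com, or None.

    headers: dict with lower-cased header names.
    """
    match = APPSPOT_HOST.match(headers.get("host", ""))
    return int(match.group(1)) if match else None

def looks_like_appspot_probe(headers) -> bool:
    """Host header matches and the request refuses compression."""
    encoding = headers.get("accept-encoding", "").strip().lower()
    return appspot_probe_number(headers) is not None and encoding == "identity"
```

The User-Agent string is deliberately not checked here, because the text does not give its exact value.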

The TLS signature of AppSpot probes entirely diﬀers from

that of the claimed version of Chromium. The probes almost

certainly reﬂect use of a custom program that merely imi-

tates a web browser.

5.6 Characteristics of the Probing System

We designed our Counterprobe experiment (Section 4.4)

to illuminate multiple features of both the active probing

sensors and its probing network. We ﬁnd clear evidence

that the sensor responsible for triggering probes operates in

a single-sided fashion, meaning that it only considers uni-

directional ﬂows. Our experiments showed that an unac-

knowledged series of a SYN segment, followed by an ACK,

and ﬁnally data (i.e., Tor’s TLS client hello) suﬃces to trig-

ger a probe. The following subsections discuss additional

ﬁndings.

The sensor does not process stateless segments.

Some DPI sensors are stateless, i.e., they process TCP segments in isolation, without considering the TCP connection state. To learn if the active probing sensor is stateless, we set out to attract a probe in two ways: once after establishing a three-way handshake and once, on a different port, without prior handshake. The stateful data triggered a probe and the stateless did not. This matches our understanding of the behavior of the Great Firewall. However, it differs from the Great Cannon [17], which has been used to inject malicious JavaScript into web pages and which acts on naked packets.

The sensor does not seem to robustly reassemble TCP.

Next, we tried to establish if the sensor is reassembling TCP streams. In the first step, we sent the triggering data in a single TCP segment after establishing a TCP connection, which, as expected, attracted an active probe. In the next step, we split the triggering data across packets in 20-byte increments, again after establishing a TCP connection. The fragmented data did not trigger an active probe, which differs from the GFW [13].

This behavior was already observed by Winter and Lindskog [32, §5.2] in 2012. There are, however, reports stating that the active probing sensor used to reassemble TCP streams at some point [31].

Traceroute to the sensors.

We sent response-triggering packet trains with the TTL encoded in the port selection, and also performed a similar traceroute to locate the Great Firewall, from both a Unicom server and a CERNET server. Unicom’s sensor appears to operate on the same link as the GFW, but the CERNET sensor appears one hop closer to our server.

Together, these three tests suggest that the active prober’s sensor is distinct from both the RST-injecting portion of the Great Firewall and the sensor in the Great Cannon.

Inferring the physical infrastructure.

Section 5.5 suggests that there is clearly a substantial

amount of centralization, as probes from a diverse range

of IP addresses share both TCP timestamps and initial se-

quence number patterns. But what is the nature of the IP

addresses from which the probes originate? We envision

three possibilities:

1. A network of distributed proxies that simply forwards

raw packets, and is centrally controlled by the active

probing system.

2. A few centralized packet injection devices that extract

the probed server’s reply via passive monitoring.

3. A few centralized man-in-the-middle devices that selectively intercept traffic, temporarily hijacking end-system IP addresses, in a manner similar to the Great Cannon.

Our solution to distinguish these three possibilities was to

deploy a system that responds to incoming probes with a

series of TTL-limited packets, eﬀectively acting as a tracer-

oute. Our responses included:

• SYN-ACK packets, encoding the hop in the sequence

number.

• UDP packets, encoding the hop in the IP ID ﬁeld.

• UDP packets to the probe’s source.

• SYN-ACK (with the hop encoded in both the port and

sequence number) and UDP packets to the topologi-

cally next IP address.

• SYN-ACK and UDP packets to the topologically next

subnet.

We triggered probes by sending requests from our server in

China, and our responses were sent blindly, only capturing

packet traces for a post-processing analysis.
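The response schedule listed above can be sketched as a plan of TTL-limited packets. This is a simplified illustration under several assumptions: the /24 prefix used for the "next subnet", the base port, the class and function names, and the choice to apply the same hop encodings to every target are all hypothetical. The sketch only builds the plan; actually emitting the packets would require raw sockets or a packet-crafting library.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlannedPacket:
    kind: str               # "SYN-ACK" or "UDP"
    dst: str                # destination IP address
    ttl: int                # IP TTL, equal to the hop being tested
    seq: Optional[int]      # hop encoded in the TCP sequence number
    ip_id: Optional[int]    # hop encoded in the IP ID field
    port: Optional[int]     # hop encoded in the port (assumed scheme)


def next_address(ip: str) -> str:
    """The numerically next IP address."""
    return str(ipaddress.ip_address(ip) + 1)


def next_subnet_address(ip: str, prefix_len: int = 24) -> str:
    """Same host offset in the following subnet (prefix length is assumed)."""
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    offset = int(ipaddress.ip_address(ip)) - int(net.network_address)
    next_base = int(net.network_address) + net.num_addresses
    return str(ipaddress.ip_address(next_base + offset))


def build_plan(probe_src: str, max_hops: int = 30,
               base_port: int = 33000, prefix_len: int = 24) -> list:
    """One SYN-ACK and one UDP packet per hop for each of three targets."""
    targets = [probe_src,
               next_address(probe_src),
               next_subnet_address(probe_src, prefix_len)]
    plan = []
    for hop in range(1, max_hops + 1):
        for dst in targets:
            plan.append(PlannedPacket("SYN-ACK", dst, hop,
                                      seq=hop, ip_id=None, port=base_port + hop))
            plan.append(PlannedPacket("UDP", dst, hop,
                                      seq=None, ip_id=hop, port=None))
    return plan
```

Because the hop is carried in fields that routers quote back in ICMP time exceeded messages, a blind sender can reconstruct each path from the packet trace afterwards.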


The resulting traceroutes argue against the possibility of

packet injection: a packet injecting system is unable to sup-

press the legitimate reply, and we would expect to see ICMP

time exceeded packets corresponding to the answered ACKs.

The only exception would be if the injector’s author main-

tained a careful topology, ensuring that the injector never

replied to an observed packet with too low a TTL to reach

the real destination. Given that other Chinese systems, including the detectors in the Great Cannon and the Great Firewall's RST injector, do not perform such an analysis, we find it unlikely that this system used packet injection.
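The reasoning about injection can be expressed as a check over per-TTL observations. The sketch below is an illustration of the argument, not the authors' analysis code; the record layout and function names are assumptions. An on-path injector that reacts to every packet it sees would produce a reply for TTLs that also expired in transit, whereas a real end host replies only once the TTL is large enough to reach it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HopObservation:
    ttl: int
    icmp_time_exceeded: bool  # a router reported that the packet expired
    endpoint_reply: bool      # a reply appeared to come from the probed address


def early_answers(observations) -> list:
    """TTLs that drew a reply even though the packet expired before arriving."""
    return sorted(o.ttl for o in observations
                  if o.icmp_time_exceeded and o.endpoint_reply)


def path_length(observations) -> int:
    """Largest TTL for which a router still reported expiry (0 if none)."""
    return max((o.ttl for o in observations if o.icmp_time_exceeded), default=0)


def consistent_with_end_host(tcp_obs, udp_obs) -> bool:
    """No early answers, and the TCP and UDP traceroutes agree on the path length."""
    return not early_answers(tcp_obs) and path_length(tcp_obs) == path_length(udp_obs)
```

For example, a trace in which TTLs 1 to 4 expire and only TTLs 5 and above are answered yields no early answers, while a trace in which TTLs 3 and 4 both expire and draw a reply would point to an on-path device near hop 3.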

For the same reason the traceroutes argue against a Great

Cannon-type interceptor: the UDP and TCP traceroutes are

consistent for both the target IP address and the next IP address in sequence. In particular, note that the SYN-ACK is never answered early. To be consistent with the next IP address's topology, the probing devices would again need a deep understanding of the actual network.

For both cases, such a deep understanding of the net-

work’s topology would not signiﬁcantly increase the system’s

stealth: It’s already clear that the probes come from thou-

sands of diﬀerent IP addresses, and probing itself is not,

by its nature, stealthy. Thus we believe, but cannot prove

conclusively, that the system conducts its probes through a

large, distributed proxy network.

6. CONCLUSION

Our work paints a detailed picture of active probing, the

Great Firewall’s newest weapon in the arms race of Inter-

net censorship. Our results show that the system operates

in real time, but regularly suspends for a short amount of

time. It is capable of detecting the servers of at least ﬁve

circumvention protocols and is upgraded regularly. We show

that the system makes use of a vast number of IP addresses, provide evidence that all these IP addresses are centrally controlled, and determine the location of the Great Firewall's sensors.

Future work could develop more circumvention strategies

that can defeat active probing. Fortunately, users behind

the GFW already have a number of working circumvention

tools at their disposal, and other designs are in development. A family of techniques known variously as decoy routing [12], end-to-middle proxying [34, 9], and domain fronting [8] colocate the circumvention system's entry point with important

network infrastructure, so that it cannot be easily blocked

even if its address is known.

The obfuscation protocols ScrambleSuit [33] and its successor obfs4 [27] tread a different path: the server requires clients to prove knowledge of a shared secret before it responds.

This technique is essentially port knocking at the applica-

tion layer. Other proposals would add scanning resistance

at the TCP layer [23] or the Tor protocol layer [11].
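The idea of application-layer port knocking can be illustrated with a toy check based on a keyed hash. This is not the ScrambleSuit or obfs4 handshake; the message layout, the lengths, and the function names are assumptions chosen for brevity, and the sketch omits replay protection and traffic-shape obfuscation that a real protocol would need.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16  # assumed nonce size
TAG_LEN = 32    # SHA-256 output size


def make_knock(secret: bytes) -> bytes:
    """Client side: a random nonce followed by its HMAC under the shared secret."""
    nonce = os.urandom(NONCE_LEN)
    tag = hmac.new(secret, nonce, hashlib.sha256).digest()
    return nonce + tag


def knock_is_valid(secret: bytes, first_bytes: bytes) -> bool:
    """Server side: verify the tag in constant time."""
    if len(first_bytes) != NONCE_LEN + TAG_LEN:
        return False
    nonce, tag = first_bytes[:NONCE_LEN], first_bytes[NONCE_LEN:]
    expected = hmac.new(secret, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


def server_reply(secret: bytes, first_bytes: bytes) -> bytes:
    """Stay silent unless the client proved knowledge of the secret."""
    return b"OK" if knock_is_valid(secret, first_bytes) else b""
```

A prober that lacks the secret receives no bytes at all, so the server gives it nothing protocol-specific to fingerprint.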

Our datasets, code, and auxiliary information are avail-

able online at https://nymity.ch/active-probing/.

Acknowledgments

This material is based upon work supported in part by the

National Science Foundation under grant nos. #1223717,

#1518918, #1540066, and #1518882. This work was also

supported in part by funding from the Open Technology

Fund through the Freedom2Connect Foundation and from

the US Department of State, Bureau of Democracy, Human

Rights and Labor. The opinions in this paper are those

of the authors and do not necessarily reﬂect those of any

funding agency or governmental organization.


7. REFERENCES

[1] Anonymous. Towards a comprehensive picture of the

Great Firewall’s DNS censorship. In FOCI. USENIX,

2014.

[2] David Borman, Bob Braden, Van Jacobson, and Richard Scheffenegger. TCP extensions for high performance. RFC 7323 (Proposed Standard),

September 2014.

[3] Richard Clayton, Steven J. Murdoch, and Robert

N. M. Watson. Ignoring the Great Firewall of China.

In PET. Springer, 2006.

[4] Roger Dingledine. Obfsproxy: the next step in the

censorship arms race. https://blog.torproject.org/

blog/obfsproxy-next-step-censorship-arms-race,

February 2012.

[5] Roger Dingledine and Nick Mathewson. Tor protocol

specification. https://spec.torproject.org/tor-spec.

[6] Roger Dingledine and Nick Mathewson. Design of a

blocking-resistant anonymity system. Technical report,

The Tor Project, 2006.

[7] Roya Ensaﬁ, Philipp Winter, Abdullah Mueen, and

Jedidiah R. Crandall. Analyzing the Great Firewall of

China over space and time. In PETS. De Gruyter

Open, 2015.

[8] David Fiﬁeld, Chang Lan, Rod Hynes, Percy

Wegmann, and Vern Paxson. Blocking-resistant

communication through domain fronting. In PETS.

De Gruyter Open, 2015.

[9] Amir Houmansadr, Giang T. K. Nguyen, Matthew

Caesar, and Nikita Borisov. Cirripede: Circumvention

infrastructure using router redirection with plausible

deniability. In CCS, pages 187–200. ACM, 2011.

[10] Andrew Jacobs. China further tightens grip on the

Internet.

http://www.nytimes.com/2015/01/30/world/asia/china-clamps-down-still-harder-on-internet-access.html, 2015.

[11] George Kadianakis. Bridge client authorization based

on a shared secret.

https://gitweb.torproject.org/torspec.git/tree/

proposals/190-shared-secret-bridge-authorization.txt,

2011.

[12] Josh Karlin, Daniel Ellard, Alden W. Jackson,

Christine E. Jones, Greg Lauer, David P. Mankins,

and W. Timothy Strayer. Decoy routing: Toward

unblockable Internet communication. In FOCI.

USENIX, 2011.

[13] Sheharbano Khattak, Mobin Javed, Philip D.

Anderson, and Vern Paxson. Towards illuminating a

censorship monitor’s model to facilitate evasion. In

FOCI. USENIX, 2013.

[14] Zhen Ling, Xinwen Fu, Wei Yu, Junzhou Luo, and

Ming Yang. Extensive analysis and large-scale

empirical evaluation of Tor bridge discovery. In

INFOCOM. IEEE, 2012.

[15] Marek Majkowski. SSL ﬁngerprinting for p0f.

https://idea.popcount.org/2012-06-17-ssl-fingerprinting-for-p0f/, June 2012.

[16] Marek Majkowski. Fun with the Great Firewall.

https://idea.popcount.org/

2013-07-11-fun-with-the-great-firewall/, July 2013.

[17] Bill Marczak, Nicholas Weaver, Jakub Dalek, Roya

Ensaﬁ, David Fiﬁeld, Sarah McKune, Arn Rey, John

Scott-Railton, and Ron Deibert. An analysis of

China’s “Great Cannon”. In FOCI. USENIX, 2015.

[18] Nick Mathewson. Proposed version-3 link handshake

for Tor.

https://gitweb.torproject.org/torspec.git/tree/proposals/176-revising-handshake.txt, January 2011.

[19] Jon McLachlan and Nicholas Hopper. On the risks of

serving whenever you surf: Vulnerabilities in Tor’s

blocking resistance design. In WPES. ACM, 2009.

[20] Leif Nixon. Some observations on the Great Firewall

of China.

https://www.nsc.liu.se/~nixon/sshprobes.html, 2011.

[21] Daiyuu Nobori and Yasushi Shinjo. VPN gate: A

volunteer-organized public VPN relay system with

blocking resistance for bypassing government

censorship ﬁrewalls. In NSDI. USENIX, 2014.

[22] Jong Chun Park and Jedidiah R. Crandall. Empirical

study of a national-scale distributed intrusion

detection system: Backbone-level ﬁltering of HTML

responses in China. In ICDCS. IEEE, 2010.

[23] Rob Smits, Divam Jain, Sarah Pidcock, Ian Goldberg,

and Urs Hengartner. BridgeSPA: Improving Tor

bridges with single packet authorization. In WPES.

ACM, 2011.

[24] Sparks, Neo, Tank, Smith, and Dozer. The collateral

damage of Internet censorship by DNS injection.

SIGCOMM Computer Communication Review,

42(3):21–27, 2012.

[25] The Tor Project. obfs2 (the twobfuscator).

https://gitweb.torproject.org/pluggable-transports/

obfsproxy.git/tree/doc/obfs2/obfs2-protocol-spec.txt.

[26] The Tor Project. obfs3 (the threebfuscator).

https://gitweb.torproject.org/pluggable-transports/

obfsproxy.git/tree/doc/obfs3/obfs3-protocol-spec.txt.

[27] The Tor Project. obfs4 (the obfourscator).

https://gitweb.torproject.org/pluggable-transports/

obfs4.git/tree/doc/obfs4-spec.txt.

[28] The Tor Project. TLSHistory. https://trac.torproject.

org/projects/tor/wiki/org/projects/Tor/TLSHistory.

[29] The Tor Project. Tor: Pluggable transports. https://

www.torproject.org/docs/pluggable-transports.html.en.

[30] Tim Wilde. Knock knock knockin’ on bridges’ doors.

https://blog.torproject.org/blog/

knock-knock-knockin-bridges-doors, 2012.

[31] Philipp Winter. #8591: GFW actively probes obfs2

bridges. https://bugs.torproject.org/8591, March 2013.

[32] Philipp Winter and Stefan Lindskog. How the Great

Firewall of China is blocking Tor. In FOCI. USENIX,

2012.

[33] Philipp Winter, Tobias Pulls, and Juergen Fuss.

ScrambleSuit: A polymorphic network protocol to

circumvent censorship. In WPES. ACM, 2013.

[34] Eric Wustrow, Scott Wolchok, Ian Goldberg, and

J. Alex Halderman. Telex: Anticensorship in the

network infrastructure. In USENIX Security

Symposium. USENIX, 2011.

[35] Michal Zalewski. p0f v3 (version 3.08b).

http://lcamtuf.coredump.cx/p0f3/, 2014.
