to the others is that it is not the result of artificial activity
to induce probing. Our server received its first probes in
January 2013, slightly earlier than the first public reports
of obfs2 probing [31]. On average, it has received dozens of
probes every day since active probing began, though there
are long stretches during which it received few probes.
This dataset provides an invaluable longitudinal perspec-
tive, though with the significant limitation that application
logs do not record as much forensic information as we would
like: they omit source ports and other transport-layer infor-
mation, and usually truncate probe payloads (see below).
Data Types and Ranges.
The HTTP and HTTPS logs go back to January 2010, before
the earliest reports of active probing of any kind. The
SSH log dates only to September 2014; probing of that port
was already in effect at the beginning of the log. While ap-
plication logs do not contain as much information as would,
say, packet captures, they suffice in many cases to iden-
tify the type of probe and the prober’s IP address. The
HTTPS log is even superior to a packet capture in one respect:
for TLS probes it contains the decrypted application
data, which would not be accessible from a packet capture.
Along with the application logs, the system has full packet
captures for ports 23, 80, and 443, between December 2014
and May 2015. The packet captures enable us to perform
more detailed analysis on the network- and transport-layer
features of probes. We opened port 23 (Telnet) specially
during this period, and used it to host a multi-protocol honeypot
server capable of responding to probes of various types
(TLS, obfs2, and obfs3), and recording the protocol layers
inside them. (To avoid inadvertently capturing potentially
sensitive activity, we do not perform any packet capture or
additional logging on the SSH and Tor ports.)
The server from which we gathered the Log data is a
working server that performs a number of functions apart
from simply absorbing active probes. In order to distinguish
active probes from operational traffic, we used a form
of conservative snowball sampling. We started by extracting
incontrovertible probes; namely, obfs2 probes and
non-protocol-conforming payloads that we could not otherwise
explain, for example random binary garbage written to the
HTTP port. (It is easy to detect obfs2, with negligible
false positives, because of the protocol weakness described
in Section 3.2. It requires only the first 20 bytes sent by the
client.) We then made a list of the IP addresses that had
sent those probes, and examined all other traffic they had
sent at any point in time. Although the GFW’s active
probing rarely reuses source IP addresses, we occasionally
found a new probe type. When we did, we added it to our
list of known probes and repeated the process. In this way,
we slowly expanded our universe of known active prober IP
addresses, all the while checking manually to make sure we
were not sweeping up non-probing traffic.
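The iteration described above can be sketched as a fixed-point loop. This is a minimal illustration, not the code we ran: the function names, the toy classifier, and the `confirm` callback (standing in for our manual check) are all hypothetical, and the addresses come from a documentation range.

```python
from collections import defaultdict

def snowball(log_entries, seed_types, classify, confirm):
    """Expand a set of known probe types and prober IPs until nothing changes.

    log_entries: iterable of (ip, payload) pairs
    seed_types:  probe-type labels treated as incontrovertible
    classify:    payload -> label, or None for unrecognized traffic
    confirm:     (label, payload) -> bool; stands in for manual review
    """
    by_ip = defaultdict(list)
    for ip, payload in log_entries:
        by_ip[ip].append(payload)

    known_types = set(seed_types)
    prober_ips = set()
    changed = True
    while changed:
        changed = False
        # Any IP that sent a known probe type is treated as a prober.
        for ip, payloads in by_ip.items():
            if ip not in prober_ips and any(
                classify(p) in known_types for p in payloads
            ):
                prober_ips.add(ip)
                changed = True
        # Examine everything else those IPs sent, looking for new types.
        for ip in prober_ips:
            for p in by_ip[ip]:
                label = classify(p)
                if (label is not None and label not in known_types
                        and confirm(label, p)):
                    known_types.add(label)
                    changed = True
    return known_types, prober_ips

def toy_classify(payload):
    # Hypothetical classifier: real probes are not labeled by prefix.
    for label in ("obfs2", "appspot", "softether"):
        if payload.startswith(label.encode()):
            return label
    return None

entries = [
    ("192.0.2.1", b"obfs2 probe"),
    ("192.0.2.1", b"appspot probe"),
    ("192.0.2.2", b"appspot probe"),
    ("192.0.2.2", b"GET /index.html"),
    ("192.0.2.3", b"GET /index.html"),
    ("192.0.2.4", b"softether probe"),
]
types, ips = snowball(entries, {"obfs2"}, toy_classify, lambda label, p: True)
```

In this toy run the seed type leads to one new probe type and one additional prober address, while the probe from 192.0.2.4 is missed because that address never sent a recognized type.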
This conservative approach enabled us to find a variety of
probing behaviors with few false positives, at the potential
cost of missing some novel probes from IP addresses that did
not also send a recognized probe type. This technique led
to our discovery of the “AppSpot” and “SoftEther” probes,
even though we seeded the process only with Tor-related
probes. Except for a handful of manually excluded hosts
(e.g., systems under our control), we did not consider the
source IP address in deciding whether a log entry indicated
a probe. Specifically, we did not employ IP geolocation to
find probes emanating from China, though Figure 7 shows
that the probes we found did in fact overwhelmingly come
from China.
We found a small amount of non-probing traffic (e.g., or-
dinary HTTP requests for actual pages) from IP addresses
that also sent probes at some other time. The shortest
separation between probe and non-probe was three weeks,
and the longest was two years.
Limitations.
Our application logs have several limitations. The HTTP
and HTTPS (Apache) log truncates at the first ‘\0’, ‘\n’, or
‘\r\n’ sequence, and omits a leading ‘\n’ (we account for this
possibility when classifying probes). The SSH (OpenSSH)
log truncates at the first ‘\0’, ‘\r’, or ‘\n’, and has a hard
limit of 100 bytes.
A significant effect of these limitations is that application
logs will not record any Tor probes as such, not even when
they are received by the HTTPS port, which removes the
outer TLS layer. The first message that a Tor client sends
after the TLS handshake is a VERSIONS cell, which happens to
start with a ‘\0’ byte, causing the payload to be truncated
to a length of zero in the server log.
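The truncation rules above can be modeled with two small functions. This is a sketch under stated assumptions, not the actual Apache or OpenSSH logging code: the function names are hypothetical, we assume that "omits a leading '\n'" means a single leading newline is dropped before truncation, and the VERSIONS cell bytes are only an illustrative example of a payload that begins with a zero byte.

```python
import re

def apache_logged(payload: bytes) -> bytes:
    """Model of what the HTTP/HTTPS log retains from a payload."""
    # Assumed interpretation: one leading '\n' is dropped, not logged.
    if payload.startswith(b"\n"):
        payload = payload[1:]
    # Truncate at the first '\0', '\n', or '\r\n'.
    return re.split(rb"[\x00\n]|\r\n", payload, maxsplit=1)[0]

def openssh_logged(payload: bytes, limit: int = 100) -> bytes:
    """Model of what the SSH log retains from a payload."""
    # Truncate at the first '\0', '\r', or '\n', then apply the hard limit.
    return re.split(rb"[\x00\r\n]", payload, maxsplit=1)[0][:limit]

# Illustrative bytes for a cell whose first byte is '\0'.
versions_cell = b"\x00\x00\x07\x00\x02\x00\x03"

http_line = apache_logged(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
tor_line = apache_logged(versions_cell)
ssh_line = openssh_logged(b"A" * 300)
```

Under this model `tor_line` is empty, which is why a Tor probe leaves no recognizable payload in the log, while an ordinary HTTP request keeps its request line.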
4.4 Counterprobing
Active probers seem to share their IP address pools with
normal Internet users. To investigate this, we scanned some
probers repeatedly using network diagnostic tools such as
ping, traceroute, and Nmap. We started the first scan the
moment a prober showed up. From then on, we repeated
the scan hourly for 24 hours. Interestingly, the very first
scan never yielded anything. The probers were unrespon-
sive to all packets. Later scans, however, painted a different
picture. In many cases, our port scans identified the IP ad-
dresses that were used for active probing just hours before
as various versions of Windows and Linux. We found open
ports used for file sharing, FTP, HTTP, RPC, and many
other services typical for home users. This indicates that
in many cases active probers share their address pools with
Internet users. We were able to run a small number of counterprobes
from a host within China, and the results matched
those of simultaneous counterprobes from outside the coun-
try: prober IP addresses were initially unresponsive, but
occasionally became responsive as disparate and seemingly
ordinary Internet hosts.
With the previous experiments in hand, we then set out
to develop tests to probe the architecture of the passive
monitor that triggers the probes, and the active compo-
nent responsible for sending them. A unidirectional three-
packet sequence—SYN, ACK, and a Tor TLS client hand-
shake packet—sent from a Chinese host suffices to trigger
an active probe.
We built several tests and ran them from a Unicom host
and a CERNET host to our EC2 infrastructure. These tests
included a traceroute for the monitor; a fragmentation test
that splits the request across multiple TCP packets to expose
whether the passive detection maintains per-flow state;
a SYN-ACK traceroute that uses TTL-limited packets to
respond to probe requests; and a “milker” that repeatedly
sends triggering requests to a Tor bridge.
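To illustrate the splitting step of the fragmentation test, the following sketch divides a request into several segments, each tagged with its byte offset in the stream. It is only an assumption about how such a split could be computed: the function name is hypothetical, the payload is a placeholder, and the step of actually sending each segment in its own TCP packet is omitted.

```python
def split_into_segments(payload: bytes, n: int):
    """Split payload into n contiguous (offset, chunk) pieces of near-equal size."""
    if n < 1:
        raise ValueError("n must be at least 1")
    base, extra = divmod(len(payload), n)
    segments = []
    offset = 0
    for i in range(n):
        # The first `extra` segments carry one additional byte.
        size = base + (1 if i < extra else 0)
        segments.append((offset, payload[offset:offset + size]))
        offset += size
    return segments

request = bytes(range(10))  # placeholder for a triggering request
segments = split_into_segments(request, 3)
reassembled = b"".join(chunk for _, chunk in segments)
```

If a monitor matches only on individual packets, no single segment carries the complete request; a monitor that keeps per-flow state can reassemble the segments and still recognize it.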