The Ultimate China Web Study: How do Foreign Sites Actually Load in China? Part 4

Part 4: Single Site Detailed Breakdown

---

This is part 4 of a 4-part series on studying how foreign sites load in China. For more information please check out:

Part 1: Background

Part 2: Aggregate Performance

Part 3: Single Site Performance

Part 4: Single Site Detailed Breakdown

---

For this analysis, we look more closely at the nuances of “Why doesn’t a site work in China?”, using harvard.edu as the illustration.

On the one hand, Harvard is a recognisable name and there are a number of channels through which Chinese students can learn about this institution. On the other hand, Harvard has a website they use to communicate with the world (for multiple purposes), and it doesn't really work in China. 

Recap

We start by looking back at the summary information to see how the site performed over a week. As much as a few one-off tests may educate or indicate a handful of problems, what’s important is understanding how sites load on a repeated basis.

Time Series Analysis

Graph 1: Harvard.edu Time Series Analysis, Source: chinafy.com

Given the above, it appears the site is fast at ~3-5am, and slow otherwise. 

Loading Time Histogram

Graph 2: Harvard.edu Loading Time Histogram, Source: chinafy.com

% of Page Loaded

Graph 3: Harvard.edu % of Page Loaded, Source: chinafy.com


How does the site look in China? 

At the outset, we compare how the page looked in China after 30 seconds with how it looked in the US once loading was complete. Visually, we can see the following:

Images fail to load

The YouTube video does not load

Some JavaScript is not loaded


The Pages Are Loading Differently Almost Every Time

Running repeated tests allows us to understand averages - one of the difficulties of measuring site performance is that pages load differently almost every time.

The following are three sequential loads of Harvard.edu from China. As you can tell, the waterfall (i.e. the sequence in which files load) looks different in each case - the number of files loaded differs, as does which specific files loaded successfully. We'll dive deeper into the Resource Waterfall in a future article.

Trial 1 Data

In Trial 1 (far left), files simply took an incredibly long time to load (the long green lines) - this is symptomatic of a low-bandwidth connection. The experience harks back to the 1990s, when images would load ever... so... gradually. That would be this particular user’s experience. More on speed below.

Trial 2 Data

In the second case (middle), the yellow bars show that the initial waiting time when connecting to the foreign servers is slow, in addition to the low-bandwidth issue seen in Trial 1.

Trial 3 Data

In the last case (far right), it appears that some of the earliest, initiating resources timed out, leading to a cascading failure in which few or no other resources could be retrieved.

You Don’t Know What You Can’t See

One other difficulty in studying site performance is that page-load reports often don’t show what didn’t load. Said another way, if the site loads 100 resources in the US and 70 resources in China, you don’t know which 30 resources didn’t load unless you’re able to easily reconcile the two lists.
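As a rough illustration of that reconciliation, the sketch below takes the difference between two resource lists. The URLs and the `missing_resources` helper are hypothetical and are not output from the Chinafy tool; a real comparison would use the resource URLs recorded at each test location.

```python
def missing_resources(reference_urls, candidate_urls):
    """Return the URLs present in the reference load but absent from the candidate load."""
    return sorted(set(reference_urls) - set(candidate_urls))


# Hypothetical resource lists, for illustration only.
us_load = [
    "https://example.com/index.html",
    "https://example.com/app.js",
    "https://cdn.example.net/hero.jpg",
    "https://video.example.org/embed.js",
]
beijing_load = [
    "https://example.com/index.html",
    "https://example.com/app.js",
]

missing_in_beijing = missing_resources(us_load, beijing_load)
for url in missing_in_beijing:
    print("not loaded in Beijing:", url)
```

Comparing sets rather than counts is the point: two loads with the same number of resources can still differ in which files they contain.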

Using https://www.chinafy.com/en/tools/resource-test, we’re able to compare which resources were retrieved from the US, Beijing, Shanghai, and Guangzhou, and reconcile them against each other. At a high level, there are a few issues:

Issue 1: Files Load Slowly

Looking at the Trial 1 data, you can see that the images at the top (in green) take 20-40 seconds to load. They aren’t particularly large, but this is emblematic of a distant, slow, or non-existent content delivery network (CDN). Doing a DNS / IP lookup, we can see that Harvard is likely using Google’s CDN. Google’s CDN, while vast and not blocked in China, performs comparably to most foreign CDNs there, which is to say - slowly.

When we look at CDN performance in China (care of Cedexis, now Citrix), we see that US networks perform poorly there. To address this, one needs to change or replace the CDN, or set up a multi-CDN configuration, which makes DevOps and infrastructure a significantly more complex undertaking.

Source: https://www.citrix.com/products/citrix-intelligent-traffic-management/country-reports.html

Issue 2: Files Don’t Load

The other issue plaguing this site is that files aren’t loading. Files typically fail to load for one of three reasons:

They’re explicitly blocked or unavailable

They timed out - that is, the browser attempted to contact the server, and tried to load the file but the signal/response took too long

Indirect: The initiator (i.e. the file that triggers the loading of the file in question) didn’t load, so the subsequent or dependent files could not be loaded either.

In this case, Harvard.edu is affected by all three of these issues. Because the sequence, or evolution, of every page load is somewhat path-dependent, we see wide variance: pages load (almost) fully at some times and not at all at others.
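The three failure modes above can be told apart with a small amount of bookkeeping. The following is a minimal sketch, assuming we know each resource's initiator from a complete reference load and a per-resource status from the China load. The file names, the status labels, and the `classify_failures` helper are all hypothetical.

```python
def classify_failures(expected, observed):
    """Label each resource that failed to load.

    expected: dict mapping resource -> its initiator (None for the root document),
              taken from a complete reference load.
    observed: dict mapping resource -> "ok", "blocked", or "timeout".
              Resources that were never requested are simply absent.
    """
    failures = {}
    for resource, initiator in expected.items():
        status = observed.get(resource)
        if status == "ok":
            continue
        if status in ("blocked", "timeout"):
            failures[resource] = status
        elif initiator is not None and observed.get(initiator) != "ok":
            # Never requested because the file that triggers it did not load.
            failures[resource] = "indirect"
        else:
            failures[resource] = "unknown"
    return failures


# Hypothetical page: page.html pulls in app.js and video.js,
# app.js pulls in widget.js, and widget.js pulls in font.woff.
expected = {
    "page.html": None,
    "app.js": "page.html",
    "widget.js": "app.js",
    "font.woff": "widget.js",
    "video.js": "page.html",
}
observed = {"page.html": "ok", "app.js": "timeout", "video.js": "blocked"}

failures = classify_failures(expected, observed)
for resource, reason in failures.items():
    print(resource, "->", reason)
```

In this made-up example a single timeout on `app.js` takes two further files with it, which is the cascading, path-dependent behaviour described above.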

Issue 3: Too Many Resources, from Too Many Domains

When we look at Trial 3, we see that it takes about 38 seconds simply to load the primary HTML file. (Yes - it's quite small to see, you'll have to trust me on this!)

This is broken down as:

~15 seconds to establish a connection (i.e. the TCP handshake),

~14 seconds to complete the SSL handshake and validate the certificate,

5 seconds of waiting, and

4 seconds to download 18KB of data - that’s an incredibly slow, albeit common, throughput of 4.5KB per second
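The arithmetic behind this breakdown can be checked with a short sketch. The phase durations and file size are the approximate figures quoted above; the variable names are only illustrative.

```python
# Approximate phase durations for the primary HTML file in Trial 3, in seconds.
phases_seconds = {
    "connect": 15.0,   # TCP handshake
    "ssl": 14.0,       # SSL negotiation and certificate validation
    "wait": 5.0,       # waiting for the first byte
    "download": 4.0,   # transferring the file itself
}
html_size_kb = 18.0

total_seconds = sum(phases_seconds.values())
throughput_kb_per_s = html_size_kb / phases_seconds["download"]
# Share of the total spent before any content starts arriving.
setup_share = (total_seconds - phases_seconds["download"]) / total_seconds

print(f"total load time: {total_seconds:.0f} s")
print(f"throughput: {throughput_kb_per_s:.1f} KB/s")
print(f"time spent before download: {setup_share:.0%}")
```

On these figures, roughly nine-tenths of the 38 seconds is connection overhead rather than data transfer, which is why the number of separate connections matters so much.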

That is roughly the time it takes to establish a connection and load one file from one domain. Harvard has third-party resources on 18 separate domains.

Given the difficulties in establishing, let alone maintaining, a stable connection in China, it’s critical that the number of domains is reduced. Unless dynamic information is loaded from these sites, static assets such as JS files, fonts, and others should be aggregated and loaded from a single domain.
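One way to audit this on your own site is to count the distinct third-party hosts in a page's resource list. The sketch below uses Python's standard `urllib.parse`; the URLs, the `example.edu` first-party domain, and the `third_party_domains` helper are hypothetical stand-ins rather than Harvard's actual resources.

```python
from urllib.parse import urlsplit


def third_party_domains(resource_urls, first_party):
    """Return the sorted, distinct hosts that do not belong to the first-party domain."""
    domains = set()
    for url in resource_urls:
        host = urlsplit(url).hostname
        if host is None:
            continue  # skip relative or malformed URLs
        if host == first_party or host.endswith("." + first_party):
            continue  # the site itself or one of its subdomains
        domains.add(host)
    return sorted(domains)


# Hypothetical resource list, for illustration only.
resources = [
    "https://www.example.edu/index.html",
    "https://static.example.edu/site.css",
    "https://fonts.example.net/a.woff2",
    "https://fonts.example.net/b.woff2",
    "https://video.example.org/player.js",
]

external = third_party_domains(resources, "example.edu")
print(len(external), "third-party domains:", external)
```

Each host in that list needs its own connection setup before it can deliver a single byte, so shrinking the list is a direct way to reduce exposure to slow or failed connections.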

Wrapping Up

We’re really just scratching the surface with this analysis, and haven't even touched SEO - there’s far more involved in successfully Chinafying your site. Identifying the problems and applying fixes takes considerable money and time, and the resulting setup is complex to optimise and streamline.

In these uncertain and challenging times, when the world seems to be pulling apart, we're excited to draw the global Internet and the Chinese Internet closer together. Chinafy is incredibly powerful, and yet super simple. In the time it took you to read this article, we could have already processed your site and turned it 'live'.

There are immense opportunities for foreign companies entering China. We think Chinafy is pretty awesome - whether you're in Marketing, an Engineer, or an Agency, we're pretty sure you will too.  
