It is common practice to set up multiple reverse proxies so that your website is not exposed to the public network directly; doing so also adds flexibility in load balancing, deployments, caching, and more. This post explains why an extra caching layer with Nginx can be necessary, and gives a general guide to setting up an Nginx server in front of the origin website.

Why Isn’t a CDN Sufficient?

[Image: CDN edge nodes fetching and caching static content from the origin website]

A CDN (Content Delivery Network) works as a caching layer for the origin website: it caches static content on its edge nodes so that users can fetch resources much faster than by accessing the origin directly. At the same time, it reduces traffic to the origin and limits public access, since the origin is hidden behind the CDN network. In effect, the CDN works like a giant caching layer.

However, as shown in the picture above, the CDN still needs to fetch static content from the origin the first time it is requested. If the origin is very far from the edge nodes, e.g. the origin is in AWS S3 in the US while the edge nodes are in China, fetching resources overseas can be a disaster (not to mention that overseas S3 is blocked in mainland China). One possible solution is to set up overseas CDN edge nodes and an express route between the mainland and overseas nodes, but that is really expensive. Secondly, suppose a new file that has not been cached yet is requested by many users at the same time: the edge nodes will all forward the requests to the origin website, and that abrupt traffic spike can hit the origin server badly.

Nginx Can Help

Nginx is best known as a reverse proxy, and it ships with a large number of modules that have helped it conquer the web server market. Its caching module is one of them: Nginx can be used as a powerful caching layer in front of the origin to eliminate duplicated HTTP requests to the origin.

Within Azure data centers, the data transfer rate between two standard VMs can reach hundreds of megabytes per second (the more cores, the higher the bandwidth). It does not cost much to set up two proxies in two data centers, one in the US and the other in mainland China, and the network latency between them becomes negligible, especially for large file downloads such as videos, mobile apps, and firmware.

How to Set Up Nginx

These steps were performed on CentOS 7.3, on an Azure A2 VM with 2 cores and 3.5 GiB of memory. The steps on other systems may vary; please refer to the official documentation for details.

0. Install and Launch Nginx

[xxx@azurevm ~]$ sudo yum install -y nginx
[sudo] password for xxx:
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* epel: mirror.vinahost.vn
Resolving Dependencies
--> Running transaction check
...
Complete!

Nginx is now installed, but it is not yet enabled in systemd.

# Check its status
[xxx@azurevm ~]$ systemctl status nginx
● nginx.service - The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; disabled; vendor preset: disabled)
Active: inactive (dead)
# Enable Nginx as auto launch and maintained by systemd
[xxx@azurevm ~]$ sudo systemctl enable nginx
Created symlink from /etc/systemd/system/multi-user.target.wants/nginx.service to /usr/lib/systemd/system/nginx.service.
[xxx@azurevm ~]$
# Launch Nginx and check its status
[xxx@azurevm ~]$ sudo systemctl start nginx
[xxx@azurevm ~]$ sudo systemctl status nginx
● nginx.service - The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2017-08-27 10:56:39 UTC; 6s ago
Process: 1829 ExecStart=/usr/sbin/nginx (code=exited, status=0/SUCCESS)
Process: 1826 ExecStartPre=/usr/sbin/nginx -t (code=exited, status=0/SUCCESS)
Process: 1824 ExecStartPre=/usr/bin/rm -f /run/nginx.pid (code=exited, status=0/SUCCESS)
Main PID: 1831 (nginx)
CGroup: /system.slice/nginx.service
├─1831 nginx: master process /usr/sbin/nginx
├─1832 nginx: worker process
└─1833 nginx: worker process
Aug 27 10:56:39 n0 systemd[1]: Starting The nginx HTTP and reverse proxy server...
Aug 27 10:56:39 n0 nginx[1826]: nginx: the configuration file /etc/nginx/nginx.conf... ok
Aug 27 10:56:39 n0 nginx[1826]: nginx: configuration file /etc/nginx/nginx.conf tes...ful
Aug 27 10:56:39 n0 systemd[1]: Started The nginx HTTP and reverse proxy server.
Hint: Some lines were ellipsized, use -l to show in full.

1. Set Up Nginx

As indicated above, the configuration file can be found in /etc/nginx/nginx.conf.

[Image: the first section of /etc/nginx/nginx.conf]
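The screenshot is not reproduced here, so below is a rough sketch of what that first section typically looks like on CentOS (the values are illustrative, not the exact ones from the screenshot):

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

events {
    worker_connections 1024;
}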

The first section defines the running user, worker processes, log location, and PID file location. Pay attention to worker_processes and worker_connections; you may need to tune them according to your traffic volume. As the tuning guide from DigitalOcean, How To Optimize Nginx Configuration, puts it:

It is common practice to run 1 worker process per core. Anything above this won’t hurt your system, but it will leave idle processes usually just lying about.

The worker_connections directive tells our worker processes how many people can simultaneously be served by Nginx. Make sure the allowed number of open file descriptors on CentOS is set high enough; a guide can be found here: Increasing ulimit on CentOS.
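As a hedged illustration (the numbers below are placeholders, not recommendations), the worker count, file descriptor limit, and connection limit could be tuned together like this:

worker_processes auto;          # roughly one worker per CPU core
worker_rlimit_nofile 65535;     # raise the open-file limit for worker processes

events {
    worker_connections 4096;    # must stay below the file descriptor limit
}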

Next comes the http directive.

[Image: the http directive in /etc/nginx/nginx.conf]
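For reference, a typical http block on CentOS looks roughly like the following (directives and values are illustrative):

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    keepalive_timeout   65;

    include /etc/nginx/conf.d/*.conf;
}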

It specifies the log format, the location of the access log, and a few other settings. Pay attention to keepalive_timeout: for a high-traffic service it is better to set a lower timeout so that idle connections are released to serve new requests sooner.

The server directive is the section where the origin website and the cache rules are configured.

[Image: the default server directive]
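The default server block shipped with the CentOS package looks roughly like this (paths and details may differ depending on the package version):

server {
    listen       80 default_server;
    server_name  _;
    root         /usr/share/nginx/html;

    location / {
    }

    error_page 404 /404.html;
    location = /404.html {
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
    }
}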

In this example, Nginx listens on port 80 and serves the HTML files located in /usr/share/nginx/html to all traffic. For 40x and 50x responses it serves dedicated error pages.

Here is another example of setting up a reverse proxy.

[Image: a reverse proxy configuration with caching]
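Since the screenshot is not shown here, the sketch below illustrates the same idea; the hostnames, zone name, and sizes are placeholders to replace with your own values:

# Origin website behind this proxy
upstream origin {
    server origin.example.com:80;
}

# Cache location on disk, shared memory zone, size limit, and inactivity timeout
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static_cache:10m
                 max_size=10g inactive=60m use_temp_path=off;

server {
    listen       80;
    server_name  files.example.com;   # your custom domain

    location / {
        proxy_pass  http://origin;
        proxy_cache static_cache;
    }
}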

The upstream block points to the origin website. proxy_cache_path defines the cache location and how long cached entries are kept. server_name is your custom domain. The location here is /; you can change it to other values as described in this documentation: A Guide to Caching with NGINX and NGINX Plus.

For example, if you do not want to cache PHP, ASP, and other dynamic files, you can match them with a location block like this.

location ~ ^/.+\.(php|aspx|asp|jsp|do|dwr|cgi|fcgi|action|ashx|axd|json) {
...
}

Within the location directive you need to set up the cache rule, caching time, cached status codes, and so on. Pay attention to proxy_set_header: if you are using a CDN, you should send the same Host header as your custom domain, otherwise the origin website will not recognize requests coming from your Nginx reverse proxy. Other details about this configuration can be found here: Module ngx_http_proxy_module.
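Putting those pieces together, a cache-enabled location block might look like the sketch below; the domain name and cache times are assumptions for illustration only:

location / {
    proxy_pass        http://origin;
    proxy_cache       static_cache;
    proxy_cache_valid 200 301 302 24h;   # cache successful responses for a day
    proxy_cache_valid any 1m;            # keep other status codes only briefly
    # Behind a CDN, send the Host header of your custom domain so the
    # origin website recognizes requests coming from this proxy:
    proxy_set_header  Host files.example.com;
}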

Once the configuration is updated, don't forget to reload Nginx to apply it.

sudo systemctl reload nginx

It is highly recommended to use reload instead of restart: Nginx performs a graceful reload by launching new worker processes, waiting until the current connections are all completed, and then retiring the old workers.
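A simple safeguard is to validate the configuration before reloading, for example:

# Test the configuration first, then reload only if the test passes
sudo nginx -t && sudo systemctl reload nginx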

Conclusion

Nginx is far more than just a web server; it gives you a performant web service while consuming very few resources. In this post, Nginx is used as a caching layer for the origin website in a CDN architecture: it shields the origin from abrupt traffic spikes and speeds up access by chaining multiple reverse proxies. The examples in this post only cover some of the main functionality of the Nginx proxy module; for more details please check the official documentation. Leave me a message if you have any questions about the guide.