Using Nginx as a Caching Layer for the Origin
It’s a common practice to set up multiple reverse proxies so that your website is not exposed directly to the public network; this adds flexibility in load balancing, deployments, caching, and more. This post explains why an extra caching layer built with Nginx can be necessary, and gives a general guide to setting up an Nginx server in front of the origin website.
A CDN (Content Delivery Network) works as a caching layer for the origin website: it caches static content on its edge nodes so that users can fetch resources much faster than by accessing the origin directly. At the same time, it reduces the traffic on the origin and limits public access, since the origin is hidden behind the CDN network. In effect, a CDN works like a giant caching layer.
However, as illustrated above, the CDN still has to fetch static content from the origin the first time it is requested. If the origin is very far from the edge nodes, e.g. the origin in AWS S3 in the US and the edge nodes in China, fetching resources overseas can be a disaster (not to mention that overseas S3 is blocked in mainland China). One possible solution is to set up overseas CDN edge nodes with an express route between the mainland and overseas nodes, but this is really expensive. Secondly, suppose a new file that has not been cached yet is requested by many users at the same time: the CDN edge nodes will all forward the request to the origin website, and such an abrupt traffic spike may hit the origin server badly.
Nginx is famous for reverse proxying, and it ships with a large number of modules that helped it conquer the web server market. Nginx caching is one of them: Nginx can be used as a powerful caching layer in front of the origin to reduce duplicate HTTP requests hitting it.
Within Azure data centers, the data transfer rate between two standard VMs can hit hundreds of megabytes per second (the more cores, the higher the bandwidth). It won’t cost much to set up two proxies in two data centers, one in the US and the other in mainland China; the network latency becomes negligible, especially for large file downloads such as videos, mobile apps, and firmware.
The following steps were performed on CentOS 7.3, on an Azure A2 instance (2 cores, 3.5 GiB memory). The steps on other systems may vary; please refer to the official documentation for details.
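On CentOS 7, Nginx is available from the EPEL repository; a minimal sketch of the installation (assuming yum and EPEL are usable on your system):

```shell
# Add the EPEL repository, which provides the nginx package on CentOS 7
sudo yum install -y epel-release
# Install nginx itself
sudo yum install -y nginx
# Verify the installed version
nginx -v
```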
Nginx is now installed, but it is neither started nor enabled to start on boot.
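Assuming systemd (as on CentOS 7), starting the service and enabling it on boot looks like this:

```shell
# Start nginx now and enable it to start automatically on boot
sudo systemctl start nginx
sudo systemctl enable nginx
# Check that the service is running
systemctl status nginx
```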
As indicated above, the configuration file can be found in the default location; on CentOS this is typically /etc/nginx/nginx.conf.
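For reference, the top of the stock configuration file looks roughly like this (the values shown are the usual package defaults, included here for illustration):

```nginx
user  nginx;                          # the user worker processes run as
worker_processes  1;                  # usually one per CPU core
error_log  /var/log/nginx/error.log;  # log location
pid        /run/nginx.pid;            # process id file location

events {
    worker_connections  1024;         # max simultaneous connections per worker
}
```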
The first section defines the running user, the worker processes, the log location, and the process id file location. Pay attention to worker_processes and worker_connections; you may need to tune them according to your traffic volume. From the DigitalOcean tuning guide How To Optimize Nginx Configuration:
It is common practice to run 1 worker process per core. Anything above this won’t hurt your system, but it will leave idle processes usually just lying about.
The worker_connections directive tells our worker processes how many people can simultaneously be served by Nginx. Please make sure to set a high enough allowed file descriptor count on CentOS; a guide can be found here: Increasing ulimit on CentOS.
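A quick sanity check of the limit can be done from the shell; the limits.conf lines below are illustrative values, not a recommendation:

```shell
# Show the current per-process open-file limit for this shell
ulimit -n

# To raise it persistently for the nginx user, append lines like these
# (illustrative values) to /etc/security/limits.conf as root:
#   nginx soft nofile 65535
#   nginx hard nofile 65535
```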
The http section follows. It specifies the log format, the location of the access log, and some other settings. Pay attention to keepalive_timeout: for a high-traffic service it is better to set a lower timeout value, so that idle connections are released sooner for new requests.
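As a sketch, the relevant part of the http block might look like this; the log_format string is the usual package default, and the keepalive_timeout value is an illustrative lower setting:

```nginx
http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    keepalive_timeout  30;   # lower than the 65s default, frees idle connections sooner

    # server blocks follow here
}
```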
The server directive is the section where the origin website and the cache rules are configured.
In this example, Nginx listens on port 80 and serves the HTML files located in /usr/share/nginx/html to all traffic. For 50x responses it redirects the client to dedicated notification pages.
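The default server block shipped with the package is roughly the following (trimmed for illustration):

```nginx
server {
    listen       80 default_server;
    server_name  _;                          # match any host name
    root         /usr/share/nginx/html;      # static content served to all traffic

    location / {
    }

    error_page 500 502 503 504 /50x.html;    # send 50x errors to a notification page
    location = /50x.html {
    }
}
```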
Here is another example of setting up a reverse proxy.
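A minimal sketch of such a configuration; origin.example.com, cdn.example.com, and the cache sizes and times are placeholder values:

```nginx
upstream origin {
    server origin.example.com:443;           # the origin website
}

# Cache location on disk, shared-memory zone name/size, and eviction settings
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static_cache:10m
                 max_size=10g inactive=60m;

server {
    listen 80;
    server_name cdn.example.com;             # your custom domain

    location / {
        proxy_cache static_cache;
        proxy_cache_valid 200 302 10m;       # cache successful responses for 10 minutes
        proxy_cache_valid 404      1m;       # cache misses only briefly
        proxy_set_header Host cdn.example.com;
        proxy_pass https://origin;
    }
}
```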
The upstream block points at the origin website, and proxy_cache_path sets the cache location and cache validity parameters. server_name is your custom domain. The cache rule currently applies to location /; you can change it to another value according to this documentation: A Guide to Caching with NGINX and NGINX Plus.
For example, if you do not want to cache asp and some other files, you can use a location like the following.
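For instance, a location along these lines would send .asp requests straight to the origin without caching them (the upstream name and domain are placeholders):

```nginx
# Requests for .asp files go straight to the origin, bypassing the cache
location ~* \.asp$ {
    proxy_cache_bypass 1;                    # never serve these from the cache
    proxy_no_cache     1;                    # never store the responses either
    proxy_set_header   Host cdn.example.com;
    proxy_pass         https://origin;
}
```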
Inside the location directive you need to set up the cache rule: cache durations, which status codes to cache, and so on. Pay attention to proxy_set_header: if you are using a CDN, you should send the same Host header as your custom domain, otherwise the origin website won’t recognize requests from your Nginx reverse proxy. Other details about this configuration can be found here: Module ngx_http_proxy_module
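Concretely, the two lines worth calling out inside the location block might be the following (with cdn.example.com standing in for your custom domain; the extra header is optional but handy while testing):

```nginx
proxy_set_header Host cdn.example.com;             # must match the domain the origin expects
add_header X-Cache-Status $upstream_cache_status;  # shows HIT/MISS/EXPIRED when testing the cache
```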
Once the configuration is updated, don’t forget to reload Nginx so the changes take effect.
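On CentOS 7 either of these performs a graceful reload; validating the configuration first is a good habit:

```shell
# Validate the configuration before reloading
sudo nginx -t
# Ask the running master process to reload its configuration
sudo nginx -s reload
# Or, equivalently, via systemd
sudo systemctl reload nginx
```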
It’s highly suggested to use reload instead of restart, since Nginx then performs a graceful restart: it launches new worker processes, waits until current connections are all completed, and then retires the old workers.
Nginx is far more than just a web server: it gives you a performant web service with very little resource consumption. In this post, Nginx is used as a caching layer for the origin website in a CDN architecture; it helps shield the origin from abrupt traffic spikes and accelerates access by chaining multiple reverse proxies. The examples in this post only cover the main functionality of the Nginx proxy module; for more details please check its official documentation. Leave me a message if you have any questions about this guide.