The evolution of image server architecture for large-scale websites

On mainstream websites, images are an essential page element; large sites in particular almost all face technical problems around storing and serving "massive image resources." As image server architectures are extended, teams often go through many twists and turns and even painful lessons (especially when a lack of early planning makes later architectural compatibility and extension difficult).

This article walks through the real development history of a vertical portal site and shares it for everyone's reference.

Websites built on the Windows platform are often described by much of the industry as "conservative," even a little backward. A large part of the reason lies in the short-sightedness and insularity of some technical staff within the Microsoft technology camp (which is, of course, mainly a problem of people). Owing to a chronic lack of open community support, many of them can only work "behind closed doors," which easily leads to narrow and limited thinking. Take the image server as an example: without upfront capacity planning and scalable design, then as traffic grows and image files accumulate, shortcomings in performance, fault tolerance / disaster recovery, and scalability will bring many problems to later development, operations, and maintenance work, and in serious cases can even affect the normal operation of the website and the company's Internet business (this is no exaggeration).

Many companies choose the Windows (.NET) platform to build their websites and image servers largely because of the founding team's technical background: the early engineers may have been more familiar with .NET, or the team's leadership believed that Windows/.NET's ease of use, "rapid" development model, and personnel costs better suited a start-up team, so Windows was the natural choice. Once the business grows to a certain scale, it becomes difficult to migrate the overall architecture to another open-source platform. That said, for building large-scale Internet systems I would still recommend preferring open-source architectures, because of the many mature cases and the support of the open-source ecosystem (there will be plenty of pitfalls; the question is whether you step into them first yourself, or step in after someone else has already fixed them), which avoids reinventing the wheel and paying high licensing fees. For applications that are hard to migrate, I personally recommend a mashup architecture of Linux, Mono, Jexus, MySQL, Memcached, Redis, and so on; it can equally support Internet applications with high concurrency and large data volumes.
Image server architecture in the single-machine era (centralized)

In the start-up period, due to time constraints and the limited skills of the developers, the usual approach is to create an upload subdirectory directly under the website's file directory to hold the image files users upload. If subdivision by business is needed, different subdirectories can be created under upload to distinguish them, for example: upload\QA, upload\Face.

What is stored in the database table is a relative path such as "upload/qa/test.jpg".

Users access the files as follows:

How the program uploads and writes the files:

Programmer A configures the physical directory D:\Web\yourdomain\upload in web.config, then writes the file via a stream;

Programmer B obtains the physical directory from the relative path via Server.MapPath or similar, and likewise writes the file via a stream.
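The two approaches can be contrasted in a small sketch (Python is used here only for neutrality; `map_path` is a rough stand-in for Server.MapPath, and all paths are illustrative):

```python
import os

PHYSICAL_ROOT = "/var/www/yourdomain"   # what programmer A puts in web.config
WEB_ROOT = PHYSICAL_ROOT                # site root that "MapPath" resolves against

def map_path(relative_url: str) -> str:
    """Rough analogue of Server.MapPath: turn a site-relative URL into a
    physical path under the site root (programmer B's approach)."""
    return os.path.join(WEB_ROOT, relative_url.strip("/").replace("/", os.sep))

# Programmer A: concatenates the configured physical directory directly.
path_a = os.path.join(PHYSICAL_ROOT, "upload", "qa", "test.jpg")
# Programmer B: resolves the same file from its stored relative path.
path_b = map_path("/upload/qa/test.jpg")
assert path_a == path_b   # both end up writing to the same physical file
```

Both styles work while everything lives on one machine, which is exactly why the inconsistency goes unnoticed until the site needs to scale.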

Advantages: the simplest possible implementation. Without any complex technology, users can successfully upload files to the specified directory, and saving the record to the database and accessing the file afterwards are also very easy.

Disadvantages: upload handling is chaotic, which is seriously detrimental to extending the site.

With this most primitive architecture, we face the following problems:

As more and more files are uploaded, it is hard to expand once the partition (e.g., the D drive) runs out of capacity; the only option is to stop the service, replace the storage with a larger-capacity device, and then import the old data.
When deploying a new version (which requires backing up beforehand) and during routine backups of the website files, the upload directory has to be handled at the same time. And if, with growing traffic, a load-balanced cluster of multiple web servers is deployed later, real-time file synchronization between the cluster nodes becomes a real challenge.

Image server architecture in the cluster era (real-time synchronization)

Create a virtual directory named upload under the website. Thanks to the flexibility of virtual directories, this can to some extent replace the physical directory while remaining compatible with the original image upload and access code. The users' access method remains:

Advantages: configuration is more flexible, and it is compatible with the old version's upload and access paths.

Because a virtual directory can point to any directory under any local drive letter, it can also be used to access external storage and thus expand single-machine capacity.

Disadvantages: when deploying a cluster of multiple web servers, the files (in the virtual directory) need to be synchronized in real time between the web servers (cluster nodes); because of efficiency and timeliness constraints on synchronization, it is hard to guarantee that the files on every node are fully consistent at any given moment.

The basic architecture is shown below:

As the diagram shows, the web server tier of this architecture already has some "scalability and high availability"; the main problems and bottlenecks are concentrated in file synchronization between the servers.

The architecture above can only perform "incremental synchronization" between the web servers; it does not synchronize "delete" or "update" operations on files.

The original idea was to handle this at the application level: while a user's upload is being written on the web1 server, the upload interface on the other web servers would be called synchronously as well, but that is clearly not worth the effort. So we chose rsync-type software to do scheduled file synchronization, which avoids the cost of "reinventing the wheel" and also reduces risk.

Synchronization generally follows two classic models, the push-pull model: "pull" means polling the other machines for updates, while "push" means actively pushing one's own changes out to the other machines. A more advanced event-notification mechanism can also be added to trigger such actions.
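One polling pass of the "pull" model can be sketched as below. This is only an illustration of the idea (rsync does the same job far more efficiently); note that, like the incremental synchronization described above, it copies new and newer files but never propagates deletes:

```python
import os
import shutil

def pull_sync(src: str, dst: str) -> list:
    """One polling pass of a 'pull'-model sync: copy any file that is new
    or newer in the remote node's upload directory (src) into the local
    one (dst). Returns the relative paths that were copied."""
    copied = []
    for root, _dirs, files in os.walk(src):
        for name in files:
            s = os.path.join(root, name)
            rel = os.path.relpath(s, src)
            d = os.path.join(dst, rel)
            # Copy only if missing locally or the source copy is newer.
            if not os.path.exists(d) or os.path.getmtime(s) > os.path.getmtime(d):
                os.makedirs(os.path.dirname(d), exist_ok=True)
                shutil.copy2(s, d)   # copy2 preserves timestamps
                copied.append(rel)
    return copied
```

Running such a pass on a timer on every node is essentially what scheduled rsync jobs automate, including the weakness that the nodes are inconsistent between passes.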

In scenarios with highly concurrent writes, synchronization runs into efficiency and timeliness problems, and synchronizing large numbers of files also consumes considerable system resources and bandwidth (even more noticeably across networks).
Improved cluster-era image server architecture (shared storage)

Continuing with the virtual directory, shared storage is achieved via a UNC (network) path: the upload virtual directory points to the UNC path.

User access mode 1:

User access mode 2 (a separate domain name can be configured):

On the server the UNC path points to, an independent domain name can be configured together with a lightweight web server, yielding an independent image server.

Advantages: read and write operations go through the UNC (network) path, avoiding the synchronization problems between multiple servers. It is relatively flexible and also supports capacity expansion/extension. It supports configuring an independent image server and domain name while remaining fully compatible with the old access rules.

Disadvantages: the UNC configuration is somewhat cumbersome and causes some performance loss (in reads, writes, and security checks). It can also become a "single point of failure," and if the storage tier has no RAID or more advanced disaster-recovery measures, data can be lost.

The basic architecture is shown below:

In the early days, many open-source sites on Linux-based architectures used NFS if they did not want to synchronize images. Practice has shown, however, that NFS has certain efficiency problems under highly concurrent reads and writes and mass storage, so it is not the best choice, and most Internet companies do not use NFS for this kind of application. It can of course also be done with Windows' built-in DFS; the drawbacks are that "configuration is complicated, efficiency is unknown, and there is a lack of documented real-world cases at scale." In addition, some companies use FTP or Samba.

In all the architectures mentioned above, upload/download operations pass through a web server (even though shared storage allows configuring a separate domain name and site to serve the images, uploads still have to be handled by the web application on the web server), which undoubtedly puts enormous pressure on that web server. It is therefore recommended to use an independent image server with an independent domain name to serve users' image uploads and access.
Benefits of an independent image server / independent domain name

Serving images consumes a lot of server resources (it involves operating-system context switches and disk I/O); after separation, the web/app servers can concentrate on dynamic processing.
Independent storage makes expansion, disaster recovery, and data migration more convenient.
It works around the browser's per-domain concurrency limit and the resulting performance loss (when images share the site's domain name).
Requests to the same domain always carry cookie information, which also causes performance loss; a separate image domain avoids this.
It makes it easy to load-balance image requests, convenient to apply various caching strategies (HTTP headers, proxy caches, etc.), and easier to migrate to a CDN later.


We can use Lighttpd, Nginx, or another lightweight web server to build the independent image server.
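As one possibility, a minimal Nginx server block for such an independent image server might look like the following sketch (the domain name and root path are assumptions for illustration):

```nginx
server {
    listen       80;
    server_name  img.yourdomain.com;    # assumed independent image domain
    root         /data/images;          # assumed storage root

    # Serve image files with long browser-cache lifetimes;
    # skipping the access log saves disk I/O on hot static paths.
    location ~* \.(jpg|jpeg|png|gif)$ {
        expires     30d;
        access_log  off;
    }
}
```

Because this domain serves only static images, requests carry no site cookies and the configuration stays tiny, which is exactly the point of the separation.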
The current image server architecture (distributed file system + CDN)

Before building the current image server architecture, the web servers can be set aside entirely and an independent image server/domain name configured directly. But the following questions then arise:

What to do with the old image data? Can the old image path access rules remain compatible?
Uploads require a separate write interface (published as an API service) on the independent image server; how is its security ensured?
Likewise, with multiple independent image servers, should a scalable shared-storage solution be used, or a real-time synchronization mechanism?

It was not until application-level (as opposed to system-level) DFS implementations (e.g., FastDFS, HDFS, MogileFS, MooseFS, TFS) became popular that the problem was simplified: they provide redundancy, support automatic synchronization and linear scaling, offer client APIs in mainstream languages for upload/download/delete operations, and some support file indexing or provide web-based access.

Taking into account each DFS's client API language support (C# support was required), documentation and case studies, and community support, we chose FastDFS for deployment.
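A note on addressing: FastDFS upload APIs return a file id of the form "group1/M00/00/00/xxxx.jpg" (group name plus storage path), and with the storage nodes fronted by a web server that id can be appended directly to the image domain. A tiny sketch of that mapping (the domain name here is an assumption):

```python
def fastdfs_url(file_id: str, domain: str = "img.yourdomain.com") -> str:
    """Build the public access URL for a FastDFS file id.
    The id returned by the upload API already encodes group and path,
    so it only needs to be joined onto the image domain."""
    return "http://%s/%s" % (domain, file_id.lstrip("/"))
```

Storing the file id (rather than a physical path) in the business tables keeps the records valid no matter how the storage cluster is rebalanced.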

The only problem: it might not be compatible with the old access rules. Even if the old images were imported into FastDFS in one go, the old image access paths are scattered across the database tables of the various business systems, so updating them all would be very difficult; compatibility with the old access rules must therefore be preserved. Upgrading an infrastructure is often harder than building a new one, precisely because it must also stay compatible with earlier versions. (Changing an aircraft's engines in mid-flight is far harder than building a new aircraft.)
The solution is as follows:

First, close the old upload entry point (to prevent continued use from causing data inconsistency). Migrate the old image data to an independent image server (the "Old Image Server" in the figure below) using the rsync tool. On the front end (a layer-7 proxy such as HAProxy or Nginx), use ACLs (access control rules) to match requests whose URLs fit the old image rules (by regex) and forward them to the designated list of web servers, on which an image-serving site with its own caching policy has been configured. This separates and caches old image serving, stays compatible with the old access rules, improves access efficiency for the old images, and avoids the problems caused by real-time synchronization.
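In Nginx terms, the routing rule described above could be sketched like this (the upstream address and domain name are hypothetical; a HAProxy ACL would express the same match):

```nginx
# Old-style image URLs (the 'upload/...' relative paths still stored in the
# business tables) are matched by prefix and forwarded to the dedicated
# Old Image Server list; all other image requests take the new path.
upstream old_image_servers {
    server 10.0.0.11:80;    # assumed address of the Old Image Server
}

server {
    listen       80;
    server_name  img.yourdomain.com;

    location ^~ /upload/ {
        proxy_pass  http://old_image_servers;
        expires     30d;    # cache policy for the legacy images
    }
}
```

Because the match is done at the proxy layer, none of the old paths stored in the databases need to change.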

The overall architecture is shown below:

Although the independent image server cluster architecture based on FastDFS was already quite mature, because of the "North-South interconnection" problem between China's carriers and IDC bandwidth costs (images consume a great deal of traffic), we ultimately adopted a commercial CDN. This is actually very easy to do, and the principle is quite simple; I will only introduce it briefly here:

CNAME the img domain name to the domain name provided by the CDN vendor. When a user requests an image, the CDN vendor's intelligent DNS resolution returns the address of the nearest service node (other, more complex strategies may of course be involved, such as load or health status). The user's request reaches that node, which runs a Squid/Varnish-style proxy-cache service: if this is the first request for the path, the image is fetched from the origin (source station) and returned to the client's browser; if it already exists in the cache, it is served to the client's browser directly from the cache, completing the request/response.
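The edge-node behaviour just described boils down to a cache-aside loop, sketched minimally below (class and parameter names are purely illustrative):

```python
class EdgeCache:
    """Minimal sketch of a CDN edge node: serve from the local cache when
    possible, otherwise pull the resource from the origin (source station),
    cache it, and return it."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: path -> bytes
        self._cache = {}
        self.origin_hits = 0              # how often we had to go to origin

    def get(self, path: str) -> bytes:
        if path not in self._cache:       # first request for this path
            self._cache[path] = self._fetch(path)
            self.origin_hits += 1
        return self._cache[path]
```

Every repeat request for a hot image is absorbed at the edge, which is why the origin's bandwidth bill shrinks so dramatically once the CDN is in front.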

Because we use a commercial CDN service, we did not consider building a Squid/Varnish proxy-cache layer at the front ourselves.

The entire cluster architecture above can be scaled easily and can meet the needs of image services on typical large vertical-sector websites (sites on the scale of Taobao are, of course, another matter). Testing showed that a single Nginx server providing image access (quad-core Xeon E5 CPU, 16 GB RAM, SSD) can withstand thousands of concurrent requests for small static files (around 10 KB compressed) without pressure. Of course, since images are much larger than plain-text static pages, the concurrency an image server can sustain is often limited by disk I/O capacity and the bandwidth the IDC provides. Nginx itself handles concurrency very well with very low resource consumption, especially for static resources, so there is little to worry about there. Depending on actual traffic demand, much further optimization is possible by adjusting Nginx parameters, tuning the Linux kernel, and adding tiered caching strategies; capacity can also be extended by adding or upgrading servers, and most directly by buying more advanced storage devices and more bandwidth.
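A few of the Nginx parameters commonly adjusted for static image serving, as a hedged illustration (the values are placeholders that must be tuned to the actual hardware and traffic):

```nginx
worker_processes  auto;                  # one worker per CPU core

events {
    worker_connections  10240;           # per-worker connection ceiling
}

http {
    sendfile           on;               # zero-copy sending of static image files
    tcp_nopush         on;               # send headers and file start together
    keepalive_timeout  30;
    open_file_cache    max=10000 inactive=60s;   # cache metadata of hot files
}
```

These settings mostly trade memory and file descriptors for fewer syscalls per request, which is where a static file server spends its time.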

It is worth mentioning that in today's "cloud computing" boom, websites in a period of rapid growth are advised to use "cloud storage" solutions, which not only solve storage, expansion, and disaster-recovery problems but also provide good CDN acceleration. Most importantly, the price is not expensive.

To summarize, extending an image server architecture revolves roughly around these questions:

Capacity planning and expansion.
Data synchronization, redundancy, and disaster recovery.
Hardware cost and reliability (ordinary mechanical disks, SSDs, or more high-end storage devices and solutions).
File-system choice: depending on file characteristics (such as file size and read/write ratio), use ext3/4 or an open-source (distributed) file system such as NFS/GFS/TFS.
Accelerating image access: a commercial CDN, or a self-built proxy cache / static web cache architecture.
Compatibility of old image paths and access rules, application-level scalability, and upload/access performance and security.