Analysis of a general-purpose image hosting service architecture (million-level origin requests / day)

Time: 2020-3-13

Hulk image hosting is an image service that supports most of 360's business lines. It provides a range of image processing functions, such as cropping, compression, filters, pHash computation, face recognition, format conversion, and GIF first-frame extraction. The business lines it supports include search, image search, news, information feeds, and advertising. Every day, CDN back-to-origin requests to the image hosting back end amount to 15+ billion PV.

The business logic of the image host is relatively simple: in the abstract, it comes down to upload and download. This article describes the architecture of these two modules and the services a picture passes through when it is uploaded and downloaded.

1. Upload module (DaVinci)

[DaVinci architecture diagram]

The logic of uploading business pictures to the image host is roughly as follows:

  1. SDK upload
    Pictures are POSTed to the DaVinci upload interface through the SDK (or through an HTTP request that the business constructs itself). The interface domain name resolves to a VIP, which load-balances the requests to the back-end nginx (port 80).

  2. Nginx (port 80) forwarding
    Through its upstream configuration, nginx (80) load-balances the requests across the back-end services (nginx 8360).

  3. Queuing for upload processing
    The back-end service puts the upload task into a queue, where it waits for the asynchronous Gearman service to consume the queue and schedule the task. The task ID is returned in the upload response so that the processing result can be queried later.

  4. Image initial processing and storage
    A Gearman worker processes the image asynchronously (compression, initial cropping, face recognition, etc.) and stores the image and the resulting metadata in Cassandra. The processing result for each task ID is stored in Redis, where users can query it.

  5. Get upload results
    The image upload and processing result is obtained using the task ID returned in step 3.

PS: the image host also supports synchronous upload and callback notification, which feed the upload result back to the business side.
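
The asynchronous upload flow above (enqueue, return a task ID, process in a worker, query the result) can be sketched roughly as follows. This is only an illustration under stated assumptions, not DaVinci's actual code: the in-memory queue and dictionaries stand in for Gearman, Cassandra, and Redis, and the names submit_upload, worker, and get_result are hypothetical.

```python
import queue
import threading
import uuid

# Hypothetical in-memory stand-ins for the real components.
task_queue = queue.Queue()   # plays the role of the Gearman job queue
image_store = {}             # plays the role of Cassandra (image + metadata)
result_store = {}            # plays the role of Redis (task ID -> result)


def submit_upload(image_bytes: bytes) -> str:
    """Enqueue an upload and return a task ID immediately."""
    task_id = uuid.uuid4().hex
    result_store[task_id] = {"status": "pending"}
    task_queue.put((task_id, image_bytes))
    return task_id


def worker() -> None:
    """Consume queued uploads, 'process' them, and record the result."""
    while True:
        item = task_queue.get()
        if item is None:              # sentinel: stop the worker
            task_queue.task_done()
            break
        task_id, image_bytes = item
        # Placeholder for compression, cropping, face recognition, etc.
        meta = {"size": len(image_bytes)}
        image_store[task_id] = {"data": image_bytes, "meta": meta}
        result_store[task_id] = {"status": "done", "meta": meta}
        task_queue.task_done()


def get_result(task_id: str) -> dict:
    """Look up the processing result by task ID."""
    return result_store.get(task_id, {"status": "unknown"})


if __name__ == "__main__":
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    tid = submit_upload(b"fake image bytes")
    task_queue.join()                 # wait until the worker has finished
    print(get_result(tid))
    task_queue.put(None)              # ask the worker to exit
    t.join()
```

The caller gets the task ID back before any processing happens, which is what lets the upload request return quickly while the heavier image work runs in the background.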

2. Download module (Picasso)

[Picasso Architecture]

The user requests a picture from the image host through its URL. The general process is as follows:

  1. Picture URL request
    The request for the image URL is directed to a CDN node according to the CNAME configuration of the image host's domain name. If that CDN node has the picture cached, the data is returned directly.

  2. CDN back-to-origin
    If the CDN cache misses, the request goes back to the origin, that is, the image host back end (nginx, port 80).

  3. Image host back-end cache (Varnish)
    To reduce the computing load on the image host back end, a back-to-origin request does not go straight to the storage cluster to read and process the picture. It first passes through Varnish, the front-end cache service. On a cache hit, Varnish responds with the picture data directly.

  4. Varnish cache miss
    If the Varnish cache misses, the request is forwarded to nginx on port 8360 and then to PHP FastCGI, which reads the image and responds.

  5. Picture processing
    After the image data has been read in the previous step, it is processed in the filter module of nginx (8360). The processing rules are specified in the parameters of the image URL, for example: crop width and height, filter, black and white, face cropping, GIF first-frame extraction, etc. This filter module mainly uses the open-source GraphicsMagick for image processing and is statically compiled into nginx.

  6. Responding with the processed data
    After processing by the nginx (8360) module, the image that matches the URL rules is returned as the response and cached on the CDN node.
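
The layered lookup described in the steps above (CDN, then Varnish, then the origin, which reads and processes the picture) can be sketched as a read-through cache. This is a simplified illustration, not Picasso's implementation: the dictionaries stand in for the real cache layers and storage cluster, the URL parameter names w and h are hypothetical (the actual URL rule format is not described here), string formatting stands in for real image processing, and it is assumed that Varnish stores the processed response on its way back.

```python
from urllib.parse import urlparse, parse_qs

cdn_cache = {}                       # stand-in for a CDN edge node
varnish_cache = {}                   # stand-in for the Varnish layer
storage = {"/cat.jpg": "raw-cat"}    # stand-in for the storage cluster
origin_hits = []                     # records which URLs reached the origin


def process_at_origin(url: str) -> str:
    """Read the original and apply the rules given in the URL parameters."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    raw = storage[parsed.path]
    # Hypothetical parameter names 'w' and 'h' for the crop size.
    width = params.get("w", ["orig"])[0]
    height = params.get("h", ["orig"])[0]
    origin_hits.append(url)
    # Placeholder for real image processing (e.g. via GraphicsMagick).
    return f"{raw}@{width}x{height}"


def fetch(url: str) -> str:
    """Serve from the CDN, then Varnish, then the origin, filling caches on the way back."""
    if url in cdn_cache:
        return cdn_cache[url]
    if url in varnish_cache:
        image = varnish_cache[url]
    else:
        image = process_at_origin(url)
        varnish_cache[url] = image   # assumption: Varnish caches the processed result
    cdn_cache[url] = image
    return image
```

Because the full URL, including its processing parameters, is the cache key in this sketch, each distinct combination of rules is processed at most once at the origin and then served from a cache layer.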

The above is the processing logic of the image host's upload and download modules.