1、 What is a web proxy
A web proxy server is an intermediary entity on the network. The proxy sits between the client and the server, acting as a "middleman" that relays HTTP messages back and forth between the endpoints. A proxy dedicated to a single client is called a private proxy; a proxy shared by many clients is called a public proxy.
2、 The difference between a web proxy and a gateway
Proxy: connects two or more applications that use the same protocol.
Gateway: connects two or more endpoints using different protocols.
3、 Why use a proxy
1. Child filter
A school, for example, may want to give students unobstructed access to educational websites while using a proxy to block access to adult content.
2. Document access control
A proxy server can enforce a unified access control policy across a large number of web servers and web resources. For example:
a) Client 1 is allowed unrestricted access to the news pages of server A
b) Client 2 is allowed unrestricted access to the Internet
c) Client 3 must enter an account password before it is allowed to access server B
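The three example rules above can be sketched as a small policy table. This is only an illustration: the client and server identifiers, the `/news` path prefix, and the default-deny behaviour for unmatched requests are assumptions, not part of the original text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    client: str              # hypothetical client identifier
    server: Optional[str]    # None means "any server"
    path_prefix: str         # "" means "any path"
    requires_password: bool


# Hypothetical policy table mirroring examples a) to c).
POLICY = [
    Rule("client1", "server-a", "/news", False),
    Rule("client2", None, "", False),
    Rule("client3", "server-b", "", True),
]


def check_access(client, server, path, authenticated=False):
    """Return 'allow', 'auth-required', or 'deny' for one request."""
    for rule in POLICY:
        if rule.client != client:
            continue
        if rule.server is not None and rule.server != server:
            continue
        if not path.startswith(rule.path_prefix):
            continue
        if rule.requires_password and not authenticated:
            return "auth-required"
        return "allow"
    # Assumption: anything not explicitly permitted is denied.
    return "deny"
```

Because the policy lives in one place on the proxy, changing a rule does not require touching any of the web servers behind it.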
3. Security firewall
A firewall proxy improves security. Sitting at a single secure point in the network, the proxy server restricts which application-layer protocols may flow into or out of an organization. It can also provide hooks for inspecting traffic in detail, as used by virus-eliminating web and e-mail proxies.
4. Web cache
A caching proxy keeps local copies of popular documents and serves them on demand, reducing slow and costly network traffic.
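A minimal sketch of the caching idea follows. The class name, the `fetch_from_origin` callback, and the `origin_fetches` counter are hypothetical; a real cache would also handle freshness and expiry, which are omitted here.

```python
from typing import Callable, Dict


class CachingProxy:
    """Keeps a local copy of each document the first time it is requested."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin   # hypothetical origin-fetch callback
        self._cache: Dict[str, bytes] = {}
        self.origin_fetches = 0           # counts trips to the origin server

    def get(self, url: str) -> bytes:
        if url not in self._cache:
            # Cache miss: go to the origin once and keep a local copy.
            self._cache[url] = self._fetch(url)
            self.origin_fetches += 1
        # Cache hit (or freshly stored copy): serve locally.
        return self._cache[url]


if __name__ == "__main__":
    def fake_origin(url: str) -> bytes:
        # Stand-in for a real network fetch.
        return ("body of " + url).encode()

    proxy = CachingProxy(fake_origin)
    proxy.get("http://example.com/a")
    proxy.get("http://example.com/a")
    print(proxy.origin_fetches)
```

The second request for the same URL is answered from the local copy, so only one fetch reaches the origin.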
5. Reverse proxy
Proxies can pretend to be web servers. These proxies, called reverse proxies (or surrogates), receive real requests addressed to the web server. Unlike a web server, however, they can initiate communication with other servers to locate the requested content on demand.
6. Content router
A proxy server can act as a "content router", directing requests to particular web servers based on Internet traffic conditions and content type. It can also implement various service-level offerings. For example, if users or content providers have paid for higher performance, the content router can forward their requests to a nearby replica cache.
7. Transcoder
A proxy server can modify the body format of content before delivering it to the client. This transparent translation between data representations is called transcoding. For example, a GIF image can be converted to a JPEG image in transit, or an image can be compressed. A document can also be translated into another language on the way through (for example, from English to Chinese).
8. Anonymizer
An anonymizing proxy actively removes identifying characteristics (such as the client IP address, the From header, the Referer header, cookies, and URI session IDs) from HTTP messages to improve privacy and anonymity.
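The header and URI scrubbing described above might look like the sketch below. The session parameter names (`sessionid`, `sid`) are assumptions chosen for illustration, and hiding the client IP address is not shown, since that follows from the proxy making the onward connection itself.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Headers the text lists as identifying: From, Referer, Cookie.
IDENTIFYING_HEADERS = {"from", "referer", "cookie"}

# Hypothetical query parameter names that carry a session ID.
SESSION_PARAMS = {"sessionid", "sid"}


def strip_session_id(uri: str) -> str:
    """Remove session-ID query parameters from a request URI."""
    parts = urlsplit(uri)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def anonymize(uri: str, headers: dict):
    """Return (clean_uri, clean_headers) with identifying data removed."""
    clean_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }
    return strip_session_id(uri), clean_headers
```

Header names are compared case-insensitively because HTTP header field names are not case-sensitive.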
4、 Where proxy servers are deployed
1. Egress proxy
The proxy is placed at the exit point of the local network to control traffic between the local network and the Internet.
2. Access (ingress) proxy
Proxies are often placed at ISP (Internet service provider) access points to handle the aggregate requests from customers. ISPs use caching proxies to store copies of popular documents, which improves users' download speed and reduces Internet bandwidth consumption.
3. Reverse proxy
Proxies are often deployed at the edge of the network, in front of web servers, as surrogates (reverse proxies). They can field all requests sent to the web server and ask the web server for resources only when necessary.
4. Network exchange proxy
With sufficient processing power, proxies can be placed at Internet peering exchange points between networks, where they relieve congestion at Internet junctions through caching and monitor traffic flows.
5、 Proxy hierarchies
Proxies can be chained together into a proxy hierarchy, in which a child proxy forwards messages to a parent proxy or to the origin server. Typical uses of a proxy hierarchy include:
1) Load balancing
A child proxy may choose a parent proxy based on the parents' current workload, in order to spread the load.
2) Geographic proximity routing
A child proxy may choose a parent proxy responsible for the geographic region of the origin server.
3) Protocol / type routing
A child proxy may forward messages to different parent proxies and origin servers based on the URI.
4) Subscription-based routing
If publishers have paid extra for high-performance service, their URIs may be forwarded to large caches or compression engines to improve performance.
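Two of the routing decisions above, load balancing and URI-based routing, can be sketched as follows. The `Parent` type, its `load` metric, and the prefix routing table are all hypothetical; the text does not say how a child proxy measures load or matches URIs.

```python
from dataclasses import dataclass
from typing import List, Tuple
from urllib.parse import urlsplit


@dataclass
class Parent:
    name: str
    load: float  # hypothetical metric: 0.0 is idle, 1.0 is saturated


def pick_least_loaded(parents: List[Parent]) -> Parent:
    """Load balancing: choose the parent with the lowest current load."""
    return min(parents, key=lambda parent: parent.load)


def route_by_uri(uri: str, table: List[Tuple[str, str]], default: str) -> str:
    """Type routing: choose a next hop from the path prefix of the URI."""
    path = urlsplit(uri).path
    for prefix, next_hop in table:
        if path.startswith(prefix):
            return next_hop
    return default
```

For example, a table entry `("/images/", "image-cache")` would send image requests to a dedicated parent while everything else goes to the default.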
6、 How proxies get traffic
Clients normally communicate directly with web servers. There are four common ways to steer client traffic to a proxy:
1) Modify the client
If the client is configured to use a proxy server, it deliberately sends HTTP requests directly to the proxy instead of to the origin server.
2) Modify the network
The network infrastructure can intercept web traffic and divert it into a proxy by several technical means, without the client's knowledge or participation.
3) Modify the DNS namespace
A reverse proxy placed in front of the web servers can take over the web server's name and IP address directly, so that all requests go to the reverse proxy instead of the web server.
4) Modify the web server
Some web servers can be configured to send an HTTP redirection (response code 305) to the client, redirecting the client's requests to a proxy. After receiving the redirect, the client communicates with the proxy server.
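The first approach, modifying the client, can be illustrated with Python's standard `urllib.request` module. The proxy address below is a placeholder, not a real server, and no request is actually sent by this sketch.

```python
import urllib.request

# Hypothetical proxy address, used for illustration only.
PROXY_URL = "http://proxy.example.com:8080"


def build_proxied_opener(proxy_url: str = PROXY_URL):
    """Build an opener that sends plain-HTTP requests via the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


# Usage (not executed here, because the proxy above does not exist):
#   opener = build_proxied_opener()
#   response = opener.open("http://www.example.com/")
```

With this configuration the client knowingly connects to the proxy and lets the proxy contact the origin server on its behalf.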
7、 Tracing messages
1. The Via header
The Via header field lists information about each intermediate node (proxy or gateway) on the message's path. Each time a message passes through a node, that node must add itself to the end of the Via list.
Via: 1.1 proxy-62.irenes-isp.net, 1.0 cache.joes-hardware.com
The Via header field is used to track message forwarding, diagnose message loops, and identify the protocol capabilities of all senders along the request and response chain.
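The two behaviours just described, appending to the Via list and spotting a loop, can be sketched as below. The function names are hypothetical, and splitting on commas is a simplification that would misread a waypoint whose comment contains a comma.

```python
def add_via(headers: dict, protocol_version: str, node_name: str) -> dict:
    """Return a copy of headers with this node appended to the Via list."""
    waypoint = f"{protocol_version} {node_name}"
    updated = dict(headers)
    existing = updated.get("Via")
    updated["Via"] = f"{existing}, {waypoint}" if existing else waypoint
    return updated


def is_loop(headers: dict, node_name: str) -> bool:
    """True if node_name already appears in Via, which suggests a loop."""
    via = headers.get("Via", "")
    for waypoint in via.split(","):
        fields = waypoint.split()
        # fields[0] is the protocol version, fields[1] the node name.
        if len(fields) >= 2 and fields[1].lower() == node_name.lower():
            return True
    return False
```

A proxy would call `is_loop` with its own name before forwarding: finding itself in the list means the message has already passed through it.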
2. Via syntax
The Via header field contains a comma-separated list of waypoints, each representing one proxy server or gateway hop. Each waypoint has up to four components: an optional protocol name (HTTP by default), a required protocol version, a required node name, and an optional descriptive comment.
Via: 1.1 proxy-62.irenes-isp.net, 1.0 cache.joes-hardware.com
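The four-component syntax above can be made concrete with a small parser. This is a simplified sketch, not a full implementation of the HTTP grammar: it assumes comments are wrapped in a single pair of parentheses and contain no commas.

```python
import re
from typing import List, NamedTuple, Optional


class Waypoint(NamedTuple):
    protocol: str            # defaults to "HTTP" when omitted
    version: str
    node: str
    comment: Optional[str]


# [protocol-name "/"] protocol-version  node-name  ["(" comment ")"]
WAYPOINT_RE = re.compile(
    r"^(?:(?P<protocol>[^/\s]+)/)?"
    r"(?P<version>\S+)\s+"
    r"(?P<node>\S+)"
    r"(?:\s+\((?P<comment>.*)\))?$"
)


def parse_via(value: str) -> List[Waypoint]:
    """Split a Via header value into its waypoints."""
    waypoints = []
    for raw in value.split(","):
        match = WAYPOINT_RE.match(raw.strip())
        if match is None:
            raise ValueError(f"malformed Via waypoint: {raw!r}")
        waypoints.append(
            Waypoint(
                match["protocol"] or "HTTP",
                match["version"],
                match["node"],
                match["comment"],
            )
        )
    return waypoints
```

Applied to the example header, this yields two waypoints, both with the default protocol name HTTP and no comment.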
3. Via request and response paths
Both request and response messages pass through proxies, so both carry a Via header. Requests and responses normally travel over the same TCP connections, so a response travels back along the same path as its request, and the Via list in the response is usually the reverse of the one in the request.
4. Via and gateways
Some proxies provide gateway functionality for servers that do not speak HTTP. The Via header records these protocol conversions in the first component of the waypoint, the protocol name.
5. The TRACE method
A proxy server can modify messages as it forwards them: it can add, modify, or delete headers, and it can convert body parts into different formats. With the HTTP/1.1 TRACE method, you can follow a request message through a chain of proxies, observing which proxies it passes through and how each proxy modifies it.
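The effect of TRACE can be illustrated with a small simulation in which no network is involved. The proxy names are taken from the Via example earlier; the `Host` value and the function names are hypothetical. The final server echoes the request it actually received as the response body, so the client can compare it with what it sent.

```python
def forward_through_proxies(headers: dict, proxies: list) -> dict:
    """Simulate a proxy chain: each proxy appends itself to the Via list."""
    current = dict(headers)
    for version, name in proxies:
        waypoint = f"{version} {name}"
        if "Via" in current:
            current["Via"] = current["Via"] + ", " + waypoint
        else:
            current["Via"] = waypoint
    return current


def trace_response_body(request_line: str, received_headers: dict) -> str:
    """Simulate the final server echoing the request it received."""
    lines = [request_line]
    lines += [f"{name}: {value}" for name, value in received_headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    sent = {"Host": "www.joes-hardware.com"}  # hypothetical host
    chain = [("1.1", "proxy-62.irenes-isp.net"), ("1.0", "cache.joes-hardware.com")]
    received = forward_through_proxies(sent, chain)
    print(trace_response_body("TRACE /index.html HTTP/1.1", received))
```

The echoed message contains a Via header that the client never sent, which is exactly the modification TRACE is meant to reveal.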