目录

WSGI 理解(1)

WSGI 是个什么东西?

实际的生产环境中,Python应用程序是放在服务器的http server(比如Apache、Nginx等)上的。现在的问题是http server(之后以服务器代称)怎么把接收到的请求传递给Python应用程序?这就是WSGI做的事情。

WSGI(Web Server Gateway Interface)即Web服务器网关接口,解耦了服务器(Apache、Nginx等)Python应用程序,是Python开发者只需要关注Python应用程序的开发。

Web Server:即HTTP Server,接收用户的请求并返回响应信息;分为以下两部分:

  1. 服务器,如Apache、Nginx等
  2. Python应用程序,负责处理业务逻辑

https://github.com/affectalways/affectalways.github.io/blob/master/images/wsgi/wsgi_1/wsgi_1_framework.png?raw=true

https://github.com/affectalways/affectalways.github.io/blob/master/images/wsgi/wsgi_1/wsgi_1_wsgi.png?raw=true

HTTP Server 实现

服务器每接收到一个请求就调用一次Python Application服务器作用如下

  • 接收HTTP请求
  • 提供environ和回调函数start_response,并传给callable object
  • 调用callable object

以下是PEP-3333提供的示例

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
import os, sys

enc, esc = sys.getfilesystemencoding(), 'surrogateescape'

def unicode_to_wsgi(u):
    # Convert an environment variable to a WSGI "bytes-as-unicode" string
    return u.encode(enc, esc).decode('iso-8859-1')

def wsgi_to_bytes(s):
    return s.encode('iso-8859-1')


def run_with_cgi(application):
	"""
		application: 是Python Application中的callable object
	"""
    # 构造environ变量,dict类型,里面的内容是一次HTTP请求的环境变量
    environ = {k: unicode_to_wsgi(v) for k,v in os.environ.items()}
    environ['wsgi.input']        = sys.stdin.buffer
    environ['wsgi.errors']       = sys.stderr
    environ['wsgi.version']      = (1, 0)
    environ['wsgi.multithread']  = False
    environ['wsgi.multiprocess'] = True
    environ['wsgi.run_once']     = True

    if environ.get('HTTPS', 'off') in ('on', '1'):
        environ['wsgi.url_scheme'] = 'https'
    else:
        environ['wsgi.url_scheme'] = 'http'

    headers_set = []
    headers_sent = []

    # 把响应信息写到终端
    def write(data):
        out = sys.stdout.buffer

        if not headers_set:
             raise AssertionError("write() before start_response()")

        elif not headers_sent:
             # Before the first output, send the stored headers
             status, response_headers = headers_sent[:] = headers_set
             out.write(wsgi_to_bytes('Status: %s\r\n' % status))
             for header in response_headers:
                 out.write(wsgi_to_bytes('%s: %s\r\n' % header))
             out.write(wsgi_to_bytes('\r\n'))

        out.write(data)
        out.flush()

    # 定义start_response回调函数
    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Re-raise original exception if headers sent
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None     # avoid dangling circular ref
        elif headers_set:
            raise AssertionError("Headers already set!")

        headers_set[:] = [status, response_headers]

        # Note: error checking on the headers should happen here,
        # *after* the headers are set.  That way, if an error
        # occurs, start_response can only be re-called with
        # exc_info set.

        return write

    result = application(environ, start_response)
    try:
        # 处理application返回的结果(可迭代)
        for data in result:
            if data:    # don't send headers until body appears
                write(data)
        if not headers_sent:
            write('')   # send headers now if body was empty
    finally:
        if hasattr(result, 'close'):
            result.close()

中间件Middleware

Middlerware是位于Http ServerPython Application之间的功能组件。

对于Http Server而言,Middlerware就是应用程序;对于Python Application而言,Middlerware就是Http ServerMiddlewareHttp ServerPython Application是透明的,把从Http Server接收到的请求进行处理并向后传递,一直传递给Python Application,最后把Python Application的处理结果返回给Http Server。如下图:

https://github.com/affectalways/affectalways.github.io/blob/master/images/wsgi/wsgi_1/wsgiframeworkmiddleware.png?raw=true

Middlerware组件可执行以下功能:

  • 根据 url 把用户请求调度到不同的 Python Application 中。
  • 负载均衡,转发用户请求
  • 预处理 XSL 等相关数据
  • 限制请求速率,设置白名单

PS:WSGI 的 middleware 体现了 unix 的哲学之一:do one thing and do it well。

本例实现了一个关于异常处理的 middleware(摘自):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
from sys import exc_info
from traceback import format_tb

class ExceptionMiddleware(object):
    """The middleware we use."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        """Call the application can catch exceptions."""
        appiter = None
        # just call the application and send the output back
        # unchanged but catch exceptions
        try:
            appiter = self.app(environ, start_response)
            for item in appiter:
                yield item
        # if an exception occours we get the exception information
        # and prepare a traceback we can render
        except:
            e_type, e_value, tb = exc_info()
            traceback = ['Traceback (most recent call last):']
            traceback += format_tb(tb)
            traceback.append('%s: %s' % (e_type.__name__, e_value))
            # we might have not a stated response by now. try
            # to start one with the status code 500 or ignore an
            # raised exception if the application already started one.
            try:
                start_response('500 INTERNAL SERVER ERROR', [
                               ('Content-Type', 'text/plain')])
            except:
                pass
            yield '\n'.join(traceback)

        # wsgi applications might have a close function. If it exists
        # it *must* be called.
        if hasattr(appiter, 'close'):
            appiter.close()

Python Application

Python Application端必须定义一个 callable objectcallable object 可以是以下三者之一:

  • function/method
  • class
  • instance with a __call__ method

callable object必须满足以下两个条件:

  • 接收两个参数:environ(字典,WSGI的环境信息)、start_response(响应请求的函数, 返回HTTP status、headers给server)
  • 返回一个可迭代的值(iterable)

重点内容:

  1. environstart_responsehttp server提供并实现
  2. environ变量是包含环境变量的字典
  3. Python Application内部在返回前调用start_response
  4. start_response也是一个callable,接收两个必要的参数,statusresponse_headers

callable object代码实现

1.function/method
1
2
3
4
def application(environ, start_response):
	# 调用服务器程序提供的 start_response,填入两个参数
	start_response('200 OK', [('Content-Type', 'text/json')])
	return []
2.class
1
2
3
4
5
6
7
8
class ApplicationClass(object):
	def __init__(self, environ, start_response):
		self.environ = environ
		self.start_response = start_response
	
	def __iter__(self):
		self.start_response('200 OK', [('Content-Type', 'text/json')])
		yield "随便"

使用方式

1
2
for result in ApplicationClass(environ, start_response):
    do_somthing(result)
3.instance with a call method
1
2
3
4
5
6
7
class ApplicationClass(object):
	def __init__(self):
		pass
		
	def __call__(self, environ, start_response):
		start_response('200 OK', [('Content-Type', 'text/json')])
		yield "anything"

使用方式

1
2
3
app = ApplicationClass()
for result in app(environ, start_response):
	do_something(result)

参考链接

PEP-3333

巨佬