CGI gzip compression module

Simple CGI output module. Uses gzip compression on the output stream if the client accepts it.
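
The acceptance check can be made stricter than a substring match. The module itself tests `'gzip' in os.getenv('HTTP_ACCEPT_ENCODING', '')`, which would also match a client sending `gzip;q=0` to refuse gzip. The sketch below is a hypothetical helper, `accepts_gzip`, that is not part of compressout; it only illustrates how the quality values could be taken into account.

```python
def accepts_gzip(accept_encoding):
    """Return True if an Accept-Encoding header value permits gzip.

    Hypothetical helper, not part of compressout.
    """
    gzip_q = None   # q-value given explicitly for gzip, if any
    star_q = None   # q-value given for the "*" wildcard, if any
    for item in accept_encoding.split(','):
        parts = [p.strip() for p in item.split(';')]
        coding = parts[0].lower()
        q = 1.0     # a coding without a q parameter counts as q=1
        for param in parts[1:]:
            name, _, value = param.partition('=')
            if name.strip().lower() == 'q':
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0     # treat a malformed q-value as a refusal
        if coding in ('gzip', 'x-gzip'):
            gzip_q = q
        elif coding == '*':
            star_q = q
    # An explicit gzip entry takes precedence over the wildcard.
    if gzip_q is not None:
        return gzip_q > 0
    if star_q is not None:
        return star_q > 0
    return False
```

In `init`, the result of `accepts_gzip(os.getenv('HTTP_ACCEPT_ENCODING', ''))` could be assigned to `use_gzip` in place of the substring test.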

Lines 279

#!/usr/bin/python
# 2020-09-17    Somehow broke gzip compression
#               Can't get subprocess.Popen to work anymore
# 2022-**-**    Increased max load average from 3.5 to 6
# 2022-05-21    and to 8
r'''
compressout
===========
Simple CGI output compression module.
Not all webservers support compressing the output stream of CGI scripts.
This script determines whether or not a client accepts gzip compression and
compresses the output stream of the script.
NOTICE: The `cgitb` module will write to stdout if the script crashes,
so you should use a browser that does not accept gzip when you are
testing your scripts.
NOTICE: This module uses three global variables: `use_gzip`, `body` and
`debug_cookie`.
It's supposed to be used like an object, but rather than using a class,
this is imported "pre-created", so to speak.
NOTICE: There are two constants:
  `max_load_avg_1min`:
    This is the maximum allowed one minute load average.
    If the one minute load average exceeds this value, compressout will
    return a 503 response to the client and abort the process.
  `http503_body`:
    A plain text error message to send if the server is overloaded.
You should modify these constants to fit your own needs.
Example / TL;DR
===============
import compressout
compressout.init()
compressout.write_head('Content-Type: text/plain\r\n')
compressout.write_head('Foo: test\r\nBar: test\r\n')
# Blank line required for terminating the HEAD section
compressout.write_head('\r\n')
compressout.write_body('Hello world!\n')
compressout.write_body('Bye.\n')
compressout.done()
Functions
=========
init(write_headers=True)
------------------------
    Initialize the module.  This function will detect if the client
    supports gzip.
    If `write_headers` is True, the function writes a
    'Vary: Accept-Encoding' header and (if gzip is used) a
    'Content-Encoding: gzip' header.
write_head(s) and write_h(s)
----------------------------

    This function is used to print all HTTP headers **and the blank line
    separating the head from the body**.
    Writes `s` to standard output; it will never go through gzip.
write_body(s) and write_b(s)
----------------------------
    Write part of body.
    NOTICE: You need to have printed the blank line after the headers
    with the `write_h` (or `write_head`) function.

    If gzip is supported by the client
    ----------------------------------

        `s` will be appended to a local buffer which the `done` function
        will compress and print.

    If gzip is not supported
    ------------------------

        `s` will go straight to stdout. The `done` function won't do
        anything.
done()
------
    Done writing output.
    This function will invoke gzip.
Dos and don'ts
==============
    * Try to call `init` and `done` at convenient locations like on the
      "outside" of a main function, i.e. don't repeat yourself by calling
      these two functions everywhere in your code.
    * Never call `write_head` after any call to `write_body`.
    * Always call `done` when you're done.
    * Use only compressout to write output, otherwise you'll have a mess.
    * NOTICE: The `cgitb` module will write to stdout if the script
      crashes, so you should use a browser that does not accept gzip
      when you are testing your scripts, e.g. lwp-request.
      `GET http://example.com/ | less` is excellent for debugging.
'''
import os
import sys
import subprocess
### GLOBALS ###
use_gzip = None     # Whether or not to compress the body
body = ''           # The body is stored here if it is to be compressed
debug_cookie = None # Set by `init`: whether the debug cookie is on
### END GLOBALS ###
### CONSTANTS -- Configure for your own needs ###
# If the load average of the last one minute exceeds the hard coded value,
# this script will return a 503 response and abort the process.
max_load_avg_1min = 8       # 3.5
http503_body = '''
Service temporarily unavailable!
Wait at least two minutes before trying again.
Re-attempting prematurely may result in banning your IP address.
-- END --
'''
#               #############################################
if sys.version_info[0] > 2:
    def output(s):
        # Python 3: write bytes to the underlying binary stream.
        if isinstance(s, str):
            sys.stdout.buffer.write(s.encode('utf-8'))
        elif isinstance(s, bytes):
            sys.stdout.buffer.write(s)
        else:
            raise TypeError("Unsupported datatype")
    flush = sys.stdout.buffer.flush
else:
    output = sys.stdout.write
    flush = sys.stdout.flush
def overload_test(too_late=False):
    '''
    Abort the process if the one minute load average exceeds
    `max_load_avg_1min`.
    A 503 response is written first, unless `too_late` is true,
    i.e. unless other output may already have been written.
    '''
    if os.getloadavg()[0] > max_load_avg_1min:
        if not too_late:
            output('Status: 503\n')
            output('Content-Type: text/plain\n')
            output('Retry-After: 90\n')
            # The leading newline of `http503_body` is the blank line
            # that ends the header section.
            output(http503_body)
            flush()
        os.abort()

def init(write_headers=True):
    '''
    Initialize the module.  This function will detect if the client
    supports gzip.  This will also set the global variable `debug_cookie`.
    If `write_headers`, write a 'Vary' and (if used)
    'Content-Encoding' header.
    '''

    global use_gzip
    global body
    global debug_cookie

    # This is the only place where sending a 503 message will work.
    # write_h:
    #   - Message body may need to be compressed.
    #   - Possibility of conflicting Status headers.
    # write_b:
    #   - Message body may need to be compressed.
    #   - Message body may be application/xhtml+xml
    # done:
    #   - Message body needs to be compressed if `use_gzip`.
    #   - Body has already been written if not `use_gzip`.
    overload_test(too_late=False)
    use_gzip = 'gzip' in os.getenv('HTTP_ACCEPT_ENCODING', '')
    body = ''
    # Workaround (see the 2020-09-17 note at the top): gzip is forced off,
    # overriding the detection above.  Remove this line to re-enable it.
    use_gzip = False
    if write_headers:
        output('Vary: Accept-Encoding\n')
        if use_gzip:
            output('Content-Encoding: gzip\n')
    debug_cookie = 'debug=on' in os.getenv('HTTP_COOKIE', '')
    if 'debug=on' in os.getenv('QUERY_STRING', ''):
        output('Set-Cookie: debug=on\n')
        debug_cookie = True
    if 'debug=off' in os.getenv('QUERY_STRING', ''):
        output('Set-Cookie: debug=off\n')
        debug_cookie = False
def write_head(s):
    write_h(s)
def write_h(s):
    '''
    Write part of header.
    Writes `s` to standard output; it will never go through gzip.
    '''
    overload_test(too_late=True)
    output(s)
def write_body(s):
    write_b(s)
def write_b(s):
    '''
    Write part of body.

    gzip is supported by the client
    -------------------------------

        `s` will be appended to a local buffer
        which `done` will compress and print.

    gzip is not supported
    ---------------------

        `s` will go straight to stdout.
    '''

    global body
    overload_test(too_late=True)

    if use_gzip:
        body += s
    else:
        output(s)
def done():
    '''
    Done writing output.
    This function will invoke gzip.
    '''

    overload_test(too_late=True)
    if use_gzip:
        gzip = subprocess.Popen(
            ['gzip'],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
        )
        # Work on a local copy.  Assigning to `body` inside this function
        # without `global body` makes it a local name, and reading it
        # then raises UnboundLocalError.
        data = body
        if sys.version_info[0] > 2:
            data = data.encode('utf-8')
            sys.stderr.write('Body encoded\n')
        sys.stderr.write('Just before communicate\n')
        gzip_stdout = gzip.communicate(data)[0]
        sys.stderr.write('Just after communicate\n')
        #gzip_stdout = things[0]
        #sys.stderr.write('After extracting data\n')
        #sys.stderr.write(gzip_stderr)
        output(gzip_stdout)
        sys.stderr.write('done() complete\n')
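
The 2020-09-17 note at the top of the listing reports trouble with `subprocess.Popen`. One possible alternative, which is not what compressout does, is to compress in-process with the standard library's `gzip` module instead of spawning the `gzip` program. The following is a minimal sketch under that assumption; `gzip_bytes` is a hypothetical helper name and it expects the body as already encoded bytes.

```python
import gzip
import io

def gzip_bytes(data, level=6):
    """Return `data` (bytes) compressed as a gzip stream.

    Hypothetical helper, not part of compressout.
    """
    buf = io.BytesIO()
    # GzipFile writes the gzip header, the deflate stream and the
    # trailer into `buf`; leaving the `with` block finalizes the stream.
    with gzip.GzipFile(fileobj=buf, mode='wb', compresslevel=level) as gz:
        gz.write(data)
    return buf.getvalue()
```

On Python 3, `done` could then call `output(gzip_bytes(body.encode('utf-8')))` in place of the `Popen` and `communicate` steps.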