CGI gzip compression module

Simple CGI output module. Uses gzip compression on the output stream if the client accepts it.
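As a rough illustration of what "if the client accepts it" involves, the sketch below checks an `Accept-Encoding` value for gzip while honouring `q` values. The helper name `client_accepts_gzip` is hypothetical and not part of this module, which uses a plain substring test on `HTTP_ACCEPT_ENCODING`. Treating `*` as a fallback is an assumption based on the usual HTTP content-negotiation rules.

```python
import os


def client_accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding value permits gzip (q > 0).

    Hypothetical helper, shown for illustration only.
    """
    wildcard = None
    for item in accept_encoding.split(','):
        parts = [p.strip() for p in item.split(';')]
        coding = parts[0].lower()
        if coding not in ('gzip', 'x-gzip', '*'):
            continue
        q = 1.0  # A coding without an explicit q value has weight 1
        for param in parts[1:]:
            name, _, value = param.partition('=')
            if name.strip().lower() == 'q':
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # Treat a malformed weight as "not acceptable"
        if coding == '*':
            wildcard = q > 0  # Only used if gzip is never listed explicitly
        else:
            return q > 0
    return bool(wildcard)


if __name__ == '__main__':
    # In a CGI script the header value arrives through the environment.
    print(client_accepts_gzip(os.getenv('HTTP_ACCEPT_ENCODING', '')))
```

Unlike a substring test, this rejects a client that sends `gzip;q=0`, which explicitly refuses gzip.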


#!/usr/bin/python
import os
import sys
import subprocess
# 2020-09-17    Somehow broke gzip compression
#               Can't get subprocess.Popen to work anymore
# 2022-**-**    Increased max load average from 3.5 to 6
# 2022-05-21    and to 8
# 2022-08-08    to 12
'''
compressout
===========
Simple CGI output compression module.
Not all webservers support compressing the output stream of CGI scripts.
This script determines whether or not a client accepts gzip compression and
compresses the output stream of the script accordingly.

NOTICE: The `cgitb` module will write to stdout if the script crashes,
so you should use a browser that does not accept gzip when you are
testing your scripts.

NOTICE: This module uses global variables: `use_gzip` and `body`, plus
`debug_cookie`, which `init` sets.
It's supposed to be used like an object, but rather than using a class,
it is imported "pre-created", so to speak.

NOTICE: There are two constants:
  `max_load_avg_1min`:
    This is the maximum allowed one-minute load average.
    If the one-minute load average exceeds this value, compressout will
    return a 503 response to the client and abort the process.
  `http503_body`:
    A plain text error message to send if the server is overloaded.
You should modify these constants to fit your own needs.

Example / TL;DR
===============
import compressout
compressout.init()
compressout.write_head('Content-Type: text/plain\r\n')
compressout.write_head('Foo: test\r\nBar: test\r\n')
# Blank line required for terminating the HEAD section
compressout.write_head('\r\n')
compressout.write_body('Hello world!\n')
compressout.write_body('Bye.\n')
compressout.done()

Functions
=========
init(write_headers=True)
------------------------
    Initialize the module.  This function will detect if the client
    supports gzip.
    If `write_headers` is True, the function writes a
    'Vary: Accept-Encoding' header and (if gzip is used) a
    'Content-Encoding: gzip' header.

write_head(s) and write_h(s)
----------------------------
    This function is used to print all HTTP headers **and the blank line
    separating the head from the body**.
    Writes `s` to standard output; it will never go through gzip.

write_body(s) and write_b(s)
----------------------------
    Write part of body.
    NOTICE: You need to have printed the blank line after the headers
    with the `write_h` (or `write_head`) function.

    If gzip is supported by the client
    ----------------------------------

        `s` will be appended to a local buffer which the `done` function
        will compress and print.

    If gzip is not supported
    ------------------------

        `s` will go straight to stdout. The `done` function won't do
        anything.

done()
------
    Done writing output.
    This function will invoke gzip.

Dos and don'ts
==============
    * Try to call `init` and `done` at convenient locations like on the
      "outside" of a main function, i.e. don't repeat yourself by calling
      these two functions everywhere in your code.
    * Never call `write_head` after any call to `write_body`.
    * Always call `done` when you're done.
    * Use only compressout to write output, otherwise you'll have a mess.
    * NOTICE: Because `cgitb` writes to stdout if the script crashes,
      test your scripts with a client that does not accept gzip,
      e.g. lwp-request.
      `GET http://example.com/ | less` is excellent for debugging.
'''
### GLOBALS ###
use_gzip = None     # Whether or not to compress the body
body = ''           # The body is stored here if it is to be compressed
debug_cookie = None # Set by init() from the `debug` cookie or query string
### END GLOBALS ###
### CONSTANTS -- Configure for your own needs ###
# If the one-minute load average exceeds this hard-coded value,
# this script will return a 503 response and abort the process.
max_load_avg_1min = 12      # Originally 3.5
# NOTE: the leading newline doubles as the blank line that ends the headers
# of the 503 response written by overload_test().
http503_body = '''
Service temporarily unavailable!
Wait at least two minutes before trying again.
Re-attempting prematurely may result in banning your IP address.
-- END --
'''
#               #############################################
# `output` writes to stdout; `flush` flushes it.
# On Python 3, text is encoded as UTF-8 and written to the binary stream so
# that compressed (bytes) output can share the same code path.
if sys.version_info[0] > 2:
    def output(s):
        if isinstance(s, str):
            sys.stdout.buffer.write(s.encode('utf-8'))
        elif isinstance(s, bytes):
            sys.stdout.buffer.write(s)
        else:
            raise TypeError("Unsupported datatype")
    flush = sys.stdout.buffer.flush
else:
    output = sys.stdout.write
    flush = sys.stdout.flush
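def overload_test(too_late=False):
    '''
    Abort the process if the one-minute load average exceeds
    `max_load_avg_1min`.
    If `too_late` is false, a 503 response is written first.  Otherwise
    output has already started, so the process exits without writing
    anything more.
    '''
    if os.getloadavg()[0] > max_load_avg_1min:
        if not too_late:
            output('Status: 503\n')
            output('Content-Type: text/plain\n')
            output('Retry-After: 90\n')
            # `http503_body` starts with a newline, which terminates the headers.
            output(http503_body)
            flush()
        #os.abort()
        sys.exit(1)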
def init(write_headers=True):
    '''
    Initialize the module.  This function will detect if the client
    supports gzip.  This will also set the global variable `debug_cookie`.
    If `write_headers`, write a 'Vary' and (if used)
    'Content-Encoding' header.
    '''

    global use_gzip
    global body
    global debug_cookie

    # This is the only place where sending a 503 message will work.
    # write_h:
    #   - Message body may need to be compressed.
    #   - Possibility of conflicting Status headers.
    # write_b:
    #   - Message body may need to be compressed.
    #   - Message body may be application/xhtml+xml
    # done:
    #   - Message body needs to be compressed if `use_gzip`.
    #   - Body has already been written if not `use_gzip`.
    overload_test(too_late=False)
    use_gzip = 'gzip' in os.getenv('HTTP_ACCEPT_ENCODING', '')
    body = ''
    # Overrides the detection above: gzip output is forced off for now,
    # presumably because of the breakage noted in the 2020-09-17 entry.
    use_gzip = False
    if write_headers:
        output('Vary: Accept-Encoding\n')
        if use_gzip:
            output('Content-Encoding: gzip\n')
    # Debug mode is remembered in a cookie and toggled via the query string.
    debug_cookie = 'debug=on' in os.getenv('HTTP_COOKIE', '')
    if 'debug=on' in os.getenv('QUERY_STRING', ''):
        output('Set-Cookie: debug=on\n')
        debug_cookie = True
    if 'debug=off' in os.getenv('QUERY_STRING', ''):
        output('Set-Cookie: debug=off\n')
        debug_cookie = False
def write_head(s):
    write_h(s)
def write_h(s):
    '''
    Write part of header.
    Writes `s` straight to standard output; headers never go through gzip.
    '''
    overload_test(too_late=True)
    output(s)
def write_body(s):
    write_b(s)
def write_b(s):
    '''
    Write part of body.

    gzip is supported by the client
    -------------------------------

        `s` will be appended to a local buffer
        which `done` will compress and print.

    gzip is not supported
    ---------------------

        `s` will go straight to stdout.
    '''

    global body
    overload_test(too_late=True)

    if use_gzip:
        body += s
    else:
        output(s)
def done():
    '''
    Done writing output.
    This function will invoke gzip.
    '''

    overload_test(too_late=True)
    if use_gzip:
        gzip = subprocess.Popen(
            ['gzip'],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
        )
        # Use a separate local name.  Assigning to `body` inside this function
        # would make it a local variable, and reading it would then raise
        # UnboundLocalError.
        data = body
        if sys.version_info[0] > 2:
            data = data.encode('utf-8')
            sys.stderr.write('Body encoded\n')
        sys.stderr.write('Just before communicate\n')
        gzip_stdout = gzip.communicate(data)[0]
        sys.stderr.write('Just after communicate\n')
        #gzip_stdout = things[0]
        #sys.stderr.write('After extracting data\n')
        #sys.stderr.write(gzip_stderr)
        output(gzip_stdout)
        sys.stderr.write('done() complete\n')
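
The `done` function above pipes the buffered body through an external `gzip` process. As a possible alternative (not what this module does), the body could be compressed in-process with the standard library's `gzip` module, which avoids `subprocess.Popen` altogether. The helper name `gzip_bytes` and its `level` parameter are hypothetical, and the sketch assumes Python 3.

```python
import gzip


def gzip_bytes(text, level=6):
    """Return `text` gzip-compressed; str input is encoded as UTF-8 first.

    Hypothetical helper for illustration, assuming Python 3.
    """
    data = text.encode('utf-8') if isinstance(text, str) else text
    return gzip.compress(data, compresslevel=level)


if __name__ == '__main__':
    payload = gzip_bytes('Hello world!\nBye.\n')
    # A gzip stream always starts with the magic bytes 0x1f 0x8b.
    print(payload[:2] == b'\x1f\x8b')
    # Round-trip to confirm the body survives compression unchanged.
    print(gzip.decompress(payload) == b'Hello world!\nBye.\n')
```

Under these assumptions, `done` could pass the result of such a helper to `output` in place of `gzip_stdout`.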