| |
[Aug. 20th, 2008|04:30 pm] |
When designing the FLVio RESTful HTTP API I ended up choosing XHTML as the data representation format. My natural instinct was to use XML and invent my own schema, but the book RESTful Web Services convinced me otherwise.
Today, while explaining to a customer how a plain web browser can help debug the API, I said,
"It is no coincidence that we use XHTML to represent data as it is not only a well-understood XML format but also makes life much easier when debugging."
That has proven true so far: any browser becomes a debugging tool for the API. However, until browsers support all the HTTP verbs (or XHTML5 / Web Forms 2.0), you'll need an add-on like Poster for Firefox to test commands like PUT and DELETE. |
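An XHTML representation is also easy for a program to consume, not just a browser. Here is a minimal Python sketch that pulls fields out of an XHTML document using only the standard library. The markup, the class names and the parse_video function are all hypothetical, invented for this example; they are not FLVio's actual representation.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical representation of one video resource (NOT the real FLVio format).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Video 42</title></head>
  <body>
    <dl class="video">
      <dt>title</dt><dd class="title">Holiday clip</dd>
      <dt>status</dt><dd class="status">encoded</dd>
    </dl>
  </body>
</html>"""


def parse_video(document):
    """Collect the text of every <dd>, keyed by its class attribute."""
    root = ET.fromstring(document)
    fields = {}
    for dd in root.iter("{%s}dd" % XHTML_NS):
        fields[dd.get("class")] = dd.text
    return fields
```

Because the same document renders in a browser, what the client parses and what a human sees while debugging are one and the same.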
|
|
| Zoner - DNS management UI |
[Jul. 31st, 2008|11:25 pm] |
A couple of years ago, while learning TurboGears, I wrote a web application to simplify management of DNS zone files. Fast forward to today and I finally found a few minutes to clean it up a bit and make a release.
It is called Zoner and differs from many DNS management interfaces in that it works directly with live zone files. The zone files remain the master copy of domain details and can still be edited manually without affecting Zoner, as opposed to storing the domain structure in a database and generating zone files when needed (or reconfiguring BIND to read directly from SQL). It also stores an audit trail for all changes (made through Zoner) and zones can be rolled back to any previous version.
Zoner might also be a useful reference app for anyone learning TurboGears 1.0. It is relatively simple, uses SQLAlchemy and Kid with Paginate and Form widgets. |
|
|
| FLVio video web service now encodes H.264 video |
[Jul. 4th, 2008|08:24 pm] |
We have just pushed live a new version of the FLVio video web service that gives clients the option to encode (Flash-compatible) H.264 video.
Many people probably already know that Adobe added support for H.264 video (in an mp4 container) to Flash Player late last year. This was welcome news to many people as H.264 is an open standard and provides much higher quality video (at lower bandwidths) than standard "FLV" video.
The only gotcha is that end users need to have a recent version of Flash Player installed (Flash Player 9 Update 3, aka version 9.0.115.0, or newer) to play back H.264 video.
However, many popular Flash media players can be configured to attempt to play back H.264 video within the browser and automatically fall back to the FLV alternative if the version of Flash Player is too old. FLVio has been designed to support this by providing the option to encode both FLV and H.264 videos automatically for the client, providing easy access to the best of both worlds: high quality video playback and backwards compatibility. |
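The fallback decision itself is just a version comparison. Real media players make this check client-side inside Flash; the Python sketch below only illustrates the logic, and the function names and the "h264"/"flv" labels are made up for the example. The minimum version is the one quoted above.

```python
# Flash Player 9 Update 3, the first release with H.264 support.
MIN_H264_VERSION = (9, 0, 115, 0)


def parse_version(text):
    """Turn a dotted version such as '9.0.115.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def choose_format(flash_version, available=("h264", "flv")):
    """Pick H.264 when the player is new enough, otherwise fall back to FLV."""
    new_enough = parse_version(flash_version) >= MIN_H264_VERSION
    if "h264" in available and new_enough:
        return "h264"
    return "flv"
```

Comparing tuples of integers avoids the classic mistake of comparing version strings, where "9.0.47.0" would sort after "9.0.115.0".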
|
|
| CherryPy and byte range requests? Too easy. |
[Jul. 3rd, 2008|07:58 pm] |
One of my web applications is a CherryPy server that serves large files. I wanted to enable HTTP 1.1 byte range requests, so I expected to have to get my hands dirty modifying my app to look for the right headers and do the dirty work.
Not so! I was already taking advantage of CherryPy's built-in helper function serveFile (cherrypy.lib.cptools.serveFile in CP 2) to efficiently serve static files back to the client. Glancing at the code for serveFile revealed that HTTP 1.1 byte ranges were already supported. So why were range requests being ignored by my app?
The answer was simply that I had to tell CherryPy to enable HTTP 1.1 features. A quick change to the application config file to add:
server.protocol_version = "HTTP/1.1"
and a restart, and success!
$ telnet media.serve.flvio.com 80
Trying 82.118.75.220...
Connected to media.serve.flvio.com.
Escape character is '^]'.
GET /media/mediakit/thumb/moovoob/2.jpg HTTP/1.1
host:media.serve.flvio.com
Range: bytes=10-20

HTTP/1.1 206 Partial Content
Date: Thu, 03 Jul 2008 10:10:15 GMT
Server: CherryPy/2.3.0
Accept-Ranges: bytes
Content-Length: 11
Content-Range: bytes 10-20/12438
Content-Type: image/jpeg
Last-Modified: Wed, 02 Jul 2008 05:47:28 GMT

51.57.1^]
telnet> cl
Connection closed. |
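When checking a 206 response from a script rather than by eye, the interesting part is the Content-Range header. The following is a small sketch, using only the standard library, that parses a header value like the one in the session above and confirms that the reported length matches. The function names are my own, not part of CherryPy.

```python
import re

_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value):
    """Return (first, last, total) from a Content-Range header value.

    total is None when the server reports the full size as '*'.
    """
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError("unsupported Content-Range: %r" % value)
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


def range_length(first, last):
    """Byte ranges are inclusive at both ends."""
    return last - first + 1
```

For the response above, bytes 10-20 of a 12438-byte file gives a length of 11, which agrees with the Content-Length header.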
|
|
| FLVio - Video Web Service |
[May. 28th, 2008|01:43 pm] |
I have spent most of this year, so far, designing and building a video web service, which has been branded as FLVio. We have just announced the launch of FLVio with our first live customer, one of a few who helped us with beta testing.
The idea behind FLVio is to solve all the problems behind adding video content (especially UGC) to a web site. Every second web site that launches nowadays seems to be some kind of social network, and many of them want all the bells & whistles that the big guys have, including user-generated video content. FLVio helps small (and large) businesses integrate video content without the pain and upfront expense, by solving these key problems:
- storage
- encoding
- delivery
Videos are relatively large, so you need reliable storage, and plenty of it. Simple as that.
Videos (especially UGC) can be uploaded in any of a huge variety of video formats and codecs, all of which need to be re-encoded into a format that is playable within the browser and optimised for efficient web delivery. FLVio encodes almost all non-proprietary formats into Flash-compatible video (FLV and H.264). This also solves the other problem with re-encoding: CPU resources. The last thing you want is your web application server grinding away to re-encode user-uploaded videos into FLV. Offloading that workload to FLVio leaves your server resources available for rendering web applications, as they should be.
FLVio delivers video via progressive HTTP download, the favoured method these days for serving Flash-based video. Videos are served directly from the FLVio web servers to the web browser, avoiding the need to scale up your own web farm to handle the multitude of long-lived requests that media delivery typically requires, not to mention the unknown bandwidth costs that media delivery can add. FLVio has partnered with a Content Delivery Network (CDN) provider so that we can also accelerate media delivery for the best possible user experience.
FLVio integrates with a web application by means of a RESTful API. All interaction with FLVio is behind the scenes, at the API level, so web applications keep full control over the user experience, including upload forms and video playback. The fact that video management and delivery have been "outsourced" is transparent to users of the web application. I won't go into detail about the API here; for more details you can read a brief technical overview here. For the curious, the whole service was built with Python and is running on a farm of Solaris servers.
We've got a simple demonstration of a FLVio-based application where you can upload a video and see the results of the re-encoding process.
Any questions or comments, feel free to contact FLVio or myself directly. |
|
|
| gcc pre-defined macros |
[May. 21st, 2008|12:15 pm] |
gcc defines some macros based on the platform, architecture, etc that it is running on. I always forget the gcc arguments that make it display all these macro definitions, so here's a reminder for myself.
gcc -E -dM foo.c
foo.c can be anything, even an empty file (gcc only pre-processes the file).
Here's an ultra-simple mini script that takes care of the temp file creation.
gcc_macros.sh:
#!/bin/sh
tmpfile=/var/tmp/foo.c
touch $tmpfile
gcc -E -dM $tmpfile
rm $tmpfile
If I run this script on my Mac I get a large list of macro definitions, e.g.:
$ ./gcc_macros.sh
#define __DBL_MIN_EXP__ (-1021)
#define __FLT_MIN__ 1.17549435e-38F
#define __CHAR_BIT__ 8
#define __WCHAR_MAX__ 2147483647
#define __DBL_DENORM_MIN__ 4.9406564584124654e-324
#define __FLT_EVAL_METHOD__ 0
#define __DBL_MIN_10_EXP__ (-307)
#define __FINITE_MATH_ONLY__ 0
#define __SHRT_MAX__ 32767
#define __LDBL_MAX__ 1.18973149535723176502e+4932L
#define __APPLE_CC__ 5465
#define __UINTMAX_TYPE__ long long unsigned int
#define __SCHAR_MAX__ 127
#define __USER_LABEL_PREFIX__ _
#define __STDC_HOSTED__ 1
#define __DBL_DIG__ 15
#define __FLT_EPSILON__ 1.19209290e-7F
#define __LDBL_MIN__ 3.36210314311209350626e-4932L
#define __strong
#define __APPLE__ 1
#define __DECIMAL_DIG__ 21
#define __LDBL_HAS_QUIET_NAN__ 1
#define __DYNAMIC__ 1
#define __GNUC__ 4
#define __MMX__ 1
and so on. |
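If you would rather have the macros in a form you can query, the same output can be loaded into a Python dictionary. This is a sketch under a couple of assumptions: gcc is on the PATH, and it accepts /dev/null as input when told the language with "-x c" (which also removes the need for a temp file). The function names are mine.

```python
import subprocess


def parse_defines(text):
    """Parse '#define NAME VALUE' lines into a {name: value} dict.

    Macros defined with no value (e.g. '#define __strong') map to ''.
    """
    macros = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) >= 2 and parts[0] == "#define":
            macros[parts[1]] = parts[2] if len(parts) == 3 else ""
    return macros


def gcc_macros(compiler="gcc"):
    """Ask the compiler for its predefined macros (needs gcc installed)."""
    result = subprocess.run(
        [compiler, "-E", "-dM", "-x", "c", "/dev/null"],
        check=True, capture_output=True, text=True,
    )
    return parse_defines(result.stdout)
```

With that in place, something like gcc_macros().get("__GNUC__") answers the question directly instead of grepping through the full list.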
|
|
| Packaging a Twisted application |
[Dec. 23rd, 2007|08:38 pm] |
At work I've created a number of Twisted applications for handling various internal services. Unlike my TurboGears applications, which I package as eggs to install using easy_install (provided by setuptools), I have no nice way to deploy my Twisted apps.
Until now.
Twisted provides a nice plugin system that allows an application to plug itself into the "twistd" command-line application starter. When properly packaged a Twisted application can be automatically plugged into the Twisted world at installation time and started by using twistd.
The only trouble is that there is no documentation for how to package a Twisted application so it can be deployed in this way.
Here I try to provide some documentation by showing an example of what is required to package a simple Twisted application. In fact, I will take the Twisted finger tutorial and write what I consider to be Step 12: "How to package the finger service as an installable Twisted application plugin for twistd" (aka "The missing step").
Step 12: How to package the finger service as an installable Twisted application plugin for twistd
Create a directory structure like this:
finger
finger/__init__.py
finger/finger.py
MANIFEST.in
setup.py
twisted
twisted/plugins
twisted/plugins/finger_plugin.py
finger/finger.py is the finger application from http://twistedmatrix.com/projects/core/documentation/howto/tutorial/index.html packaged as finger.
twisted/plugins is a directory structure containing the finger_plugin.py file that will be described below. Note that there must be no __init__.py files within twisted and twisted/plugins.
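If you want to lay out this skeleton quickly, or check an existing project for a stray __init__.py, a throwaway script will do. The following is only a convenience sketch using the standard library; the helper names are hypothetical and the files it creates are empty placeholders to be filled in with the contents described in this post.

```python
from pathlib import Path

# The files from the directory listing above.
LAYOUT = [
    "finger/__init__.py",
    "finger/finger.py",
    "MANIFEST.in",
    "setup.py",
    "twisted/plugins/finger_plugin.py",
]


def create_skeleton(root):
    """Create empty placeholder files for the package layout under root."""
    root = Path(root)
    for relative in LAYOUT:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return root


def stray_init_files(root):
    """Return any __init__.py found in twisted/ or twisted/plugins/."""
    root = Path(root)
    candidates = [
        root / "twisted" / "__init__.py",
        root / "twisted" / "plugins" / "__init__.py",
    ]
    return [path for path in candidates if path.exists()]
```

An empty list from stray_init_files() means the twisted and twisted/plugins directories satisfy the "no __init__.py" rule.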
finger_plugin.py provides a class implementing the IServiceMaker and IPlugin interfaces. Basically, this is the plugin point that defines the services the application will provide and any command-line options that it supports.
# ==== twisted/plugins/finger_plugin.py ====
# - Zope modules -
from zope.interface import implements
# - Twisted modules -
from twisted.python import usage
from twisted.application.service import IServiceMaker
from twisted.plugin import IPlugin
# - Finger modules -
from finger import finger
class Options(usage.Options):
    synopsis = "[options]"
    longdesc = "Make a finger server."
    optParameters = [
        ['file', 'f', '/etc/users'],
        ['templates', 't', '/usr/share/finger/templates'],
        ['ircnick', 'n', 'fingerbot'],
        ['ircserver', None, 'irc.freenode.net'],
        ['pbport', 'p', 8889],
    ]
    optFlags = [['ssl', 's']]

class MyServiceMaker(object):
    implements(IServiceMaker, IPlugin)
    tapname = "finger"
    description = "Finger server."
    options = Options

    def makeService(self, config):
        return finger.makeService(config)

serviceMaker = MyServiceMaker()
setup.py is the standard distutils setup.py file. Take note of the "packages" and "package_data" arguments to setup(). Also note the refresh_plugin_cache() function which is called after setup() completes. This forces a refresh of the Twisted plugins cache (twisted/plugins/dropin.cache).
# ==== setup.py ====
'''setup.py for finger.
This is an extension of the Twisted finger tutorial demonstrating how
to package the Twisted application as an installable Python package and
twistd plugin (consider it "Step 12" if you like).
Uses distutils.core.setup() to make this package installable as
a Twisted Application Plugin.
After installation the application should be manageable as a twistd
command.
For example, to start it in the foreground enter:
$ twistd -n finger
To view the options for finger enter:
$ twistd finger --help
'''
__author__ = 'Chris Miles'
import sys

try:
    import twisted
except ImportError:
    raise SystemExit("twisted not found. Make sure you "
                     "have installed the Twisted core package.")

from distutils.core import setup

def refresh_plugin_cache():
    from twisted.plugin import IPlugin, getPlugins
    list(getPlugins(IPlugin))

if __name__ == '__main__':
    if sys.version_info[:2] >= (2, 4):
        extraMeta = dict(
            classifiers=[
                "Development Status :: 4 - Beta",
                "Environment :: No Input/Output (Daemon)",
                "Programming Language :: Python",
            ])
    else:
        extraMeta = {}

    setup(
        name="finger",
        version='0.1',
        description="Finger server.",
        author=__author__,
        author_email="you@email.address",
        url="http://twistedmatrix.com/projects/core/documentation/howto/tutorial/index.html",
        packages=[
            "finger",
            "twisted.plugins",
        ],
        package_data={
            'twisted': ['plugins/finger_plugin.py'],
        },
        **extraMeta)

    refresh_plugin_cache()
MANIFEST.in contains one line. The graft command tells distutils to include everything under the twisted directory when it builds a source distribution, so twisted/plugins/finger_plugin.py is shipped with the package.
graft twisted
With all that in place you can install the package the usual way,
$ python setup.py install
Then you should be able to run twistd to see and control the application. See the twistd options and installed Twisted applications with:
$ twistd --help
Usage: twistd [options]
...
Commands:
athena-widget Create a service which starts a NevowSite with a single
page with a single widget.
ftp An FTP server.
telnet A simple, telnet-based remote debugging service.
socks A SOCKSv4 proxy service.
manhole-old An interactive remote debugger service.
portforward A simple port-forwarder.
web A general-purpose web server which can serve from a
filesystem or application resource.
inetd An inetd(8) replacement.
vencoderd Locayta Media Farm vencoderd video encoding server.
news A news server.
words A modern words server
toc An AIM TOC service.
finger Finger server.
dns A domain name server.
mail An email service
manhole An interactive remote debugger service accessible via
telnet and ssh and providing syntax coloring and basic
line editing functionality.
conch A Conch SSH service.
View the options specific to the finger server:
$ twistd finger --help
Usage: twistd [options] finger [options]
Options:
-s, --ssl
-f, --file= [default: /etc/users]
-t, --templates= [default: /usr/share/finger/templates]
-n, --ircnick= [default: fingerbot]
--ircserver= [default: irc.freenode.net]
-p, --pbport= [default: 8889]
--version
--help Display this help and exit.
Make a finger server.
Start the finger server (in the foreground) with:
$ sudo twistd -n finger --file=users
2007/12/23 22:12 +1100 [-] Log opened.
2007/12/23 22:12 +1100 [-] twistd 2.5.0 (/Library/Frameworks/Python.framework/
Versions/2.5/Resources/Python.app/Contents/MacOS/Python 2.5.0) starting up
2007/12/23 22:12 +1100 [-] reactor class: <class 'twisted.internet.selectreactor.SelectReactor'>
2007/12/23 22:12 +1100 [-] finger.finger.FingerFactoryFromService starting on 79
2007/12/23 22:12 +1100 [-] Starting factory <finger.finger.FingerFactoryFromService instance at 0x1d0a4e0>
2007/12/23 22:12 +1100 [-] twisted.web.server.Site starting on 8000
2007/12/23 22:12 +1100 [-] Starting factory <twisted.web.server.Site instance at 0x1d0a558>
2007/12/23 22:12 +1100 [-] twisted.spread.pb.PBServerFactory starting on 8889
2007/12/23 22:12 +1100 [-] Starting factory <twisted.spread.pb.PBServerFactory instance at 0x1d0a670>
2007/12/23 22:12 +1100 [-] Starting factory <finger.finger.IRCClientFactoryFromService instance at 0x1d0a5f8>
twistd provides many useful options, such as daemonizing the application, specifying the logfile and pidfile locations, etc.
Unfortunately Twisted and setuptools don't play nicely together, so I'm not able to package my Twisted app as an egg, take advantage of the setuptools package dependency resolution system, or install it using easy_install.
References: http://twistedmatrix.com/projects/core/documentation/howto/plugin.html
http://twistedmatrix.com/projects/core/documentation/howto/tap.html
http://twistedmatrix.com/projects/core/documentation/howto/tutorial/index.html |
|
|
| Eddie 0.36 Released. |
[Dec. 8th, 2007|12:51 am] |
Eddie is a system monitoring agent, written entirely in Python, that I've been working on for many more years than I can remember. I finally got a chance to make a new release. You can get it here http://eddie-tool.net/
This version has been a long time coming, but has been well tested over that time. This version features many enhancements and bugfixes, some of them listed below. A special thanks to Zac Stevens and Mark Taylor for their contributions.
- Added support for Spread messaging as an alternative to Elvin.
- Implemented a DiskStatistics data collector for Linux.
- More command-line options and support for running as daemon.
- Added a "log" action. Use it to append to a log file, log via syslog, or print on the eddie tty.
- Variables can be set in directives, which can then be used in rule evaluation. For example, if the directive has "maxcpu=30", then the rule can address this as "rule='pcpu > _maxcpu'".
- HTTP checks support cookie persistence.
- Added "DBI" directive, for database query checking.
- Added Solaris SMF method/manifest files to contrib.
- Many more enhancements and bugfixes - see http://dev.eddie-tool.net/trac/browser/eddie/trunk/doc/CHANGES.txt
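To make the directive-variable feature above a little more concrete, here is a toy Python sketch of the idea: variables set in a directive become available to the rule expression with a leading underscore. This is purely an illustration written for this post and not Eddie's actual implementation; the function name and argument names are invented.

```python
def evaluate_rule(rule, data, directive_vars):
    """Evaluate a rule string against collected data plus directive variables.

    Directive variables are exposed to the rule with a '_' prefix, so a
    directive setting maxcpu=30 can be referenced as _maxcpu.
    """
    namespace = dict(data)
    for name, value in directive_vars.items():
        namespace["_" + name] = value
    # Builtins are withheld so the rule can only see the supplied names.
    return bool(eval(rule, {"__builtins__": {}}, namespace))
```

For example, evaluate_rule("pcpu > _maxcpu", {"pcpu": 45.0}, {"maxcpu": 30}) evaluates the rule from the example above for a process using 45% CPU.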
|
|
|
| Excellent! |
[Sep. 21st, 2007|09:55 pm] |
$ ssh root@10.0.1.14
root@10.0.1.14's password:
Last login: Fri Sep 21 21:53:30 2007 from 10.0.1.2
# uname -a
Darwin CM iPhone 9.0.0d1 Darwin Kernel Version 9.0.0d1: Fri Jun 22 00:38:56 PDT 2007; root:xnu-933.0.1.178.obj~1/RELEASE_ARM_S5L8900XRB iPhone1,1 Darwin
# python
Python 2.5.1 (r251:54863, Jul 27 2007, 12:05:57)
[GCC 4.0.1 LLVM (Apple Computer, Inc. build 2.0)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import os
>>> os.uname()
('Darwin', 'CM iPhone', '9.0.0d1', 'Darwin Kernel Version 9.0.0d1: Fri Jun 22 00:38:56 PDT 2007; root:xnu-933.0.1.178.obj~1/RELEASE_ARM_S5L8900XRB', 'iPhone1,1')
>>>
|
|
|
| PyCon UK 2007 Thumbs Up |
[Sep. 11th, 2007|04:31 pm] |
I spent the weekend in Birmingham at the very first ever PyCon UK 2007 conference. Everyone agreed it was an outstanding success - I went to EuroPython a few months ago and I must admit that PyCon UK had the edge on it for fun and value.
Like I did at EuroPython, I gave a lightning talk on PSI, although this time I was better prepared with real slides, instead of using vim as a presentation tool and attempting to give a real-time demo (which ran me out of time too quickly).
I have even made the slides available, for anyone who may be curious. |
|
|
| PSI 0.2a1 released |
[Sep. 6th, 2007|12:26 am] |
Today I finally released the first alpha version of PSI - the Python System Information package. Just ahead of this weekend's PyCon UK, where you'll find me.
PSI is a C extension that gives Python direct access to run-time system information by querying the relevant system calls. This version provides information about run-time process details. A Python program can take a snapshot of a process or all currently active processes on a system and inspect process details to its heart's content. PSI provides a consistent interface across all supported architectures, so programs written for one should (mostly) work on others. Where a particular architecture cannot supply the requested information that others can it will raise an appropriate exception.
This release supports 3 popular architectures: Solaris, Mac OS X and Linux. Hopefully more are on the way if I can round up volunteers.
If you want to have a play just: download it; svn checkout the source; or easy_install psi.
Here are some examples of it in action:
>>> import psi
>>> a = psi.arch.arch_type()
>>> a
<psi.arch.ArchMacOSX object type='Darwin'>
>>> isinstance(a, psi.arch.ArchMacOSX)
True
>>> isinstance(a, psi.arch.ArchDarwin)
True
>>> a.sysname
'Darwin'
>>> a.nodename
'laptop'
>>> a.release
'8.9.1'
>>> a.version
'Darwin Kernel Version 8.9.1: Thu Feb 22 20:55:00 PST
2007; root:xnu-792.18.15~1/RELEASE_I386'
>>> a.machine
'i386'
>>> psi.loadavg()
(0.705078125, 0.73046875, 0.7626953125)
>>> import os
>>> mypid = os.getpid()
>>> mypid
13903
>>> p = psi.process.Process(mypid)
>>> p.command
'Python'
>>> p.command_path
'/Library/Frameworks/Python.framework/Versions/2.5/
Resources/Python.app/Contents/MacOS/Python'
>>> p.user
'chris'
>>> p.start_datetime
datetime.datetime(2007, 9, 1, 10, 58, 51)
>>> p.parent
<psi.process.Process object pid=13860>
>>> p.parent.command
'bash'
>>> "%0.1f MB" % (p.resident_size/1024.0/1024.0)
'9.7 MB'
>>> "%0.1f MB" % (p.virtual_size/1024.0/1024.0)
'43.5 MB'
>>> ps = psi.process.ProcessTable()
>>> ps.count
115
>>> ps.pids
(0, 1, 27, 31, 39, 40, 41, 42, 43, 44, 45, 46, 47, 49, 50,
51, 56, 59, 63, 66, 67, 69, 71, 72, 89, 117, 122, 134,
136, 149, 155, 156, 159, 162, 172, 175, 176, 177, 179,
180, 182, 183, 190, 194, 214, 229, 238, 242, 245, 246, 248,
251, 256, 257, 264, 265, 267, 268, 270, 271, 272, 273, 274,
286, 392, 401, 402, 403, 1135, 1258, 1442, 1589, 1703,
1704, 1705, 1706, 1707, 1708, 1709, 1710, 1712, 1713,
1714, 1715, 1716, 1717, 1718, 1719, 1720, 1721, 2575,
2577, 2578, 2616, 2631, 2632, 9118, 9903, 10159, 10990,
12444, 12596, 13122, 13582, 13840, 13904, 13973, 13974,
13976, 14404, 14579, 14580, 14587, 14627, 14719)
>>> p = ps.processes[114]
>>> p.command
'TextMate' |
|
|
| Private PYPI |
[Sep. 3rd, 2007|05:01 pm] |
Recently at work we streamlined the way we internally manage & deploy our Python packages & applications. Taking advantage of setuptools, we release all our packages as eggs and host our own "Private PYPI" (as we call it) as a central repository for our private packages. With a simple setuptools configuration tweak, our developers & sysadmins can install & deploy internal Python packages & applications using good old easy_install. easy_install will first look for packages in our Private PYPI repository and then fall back to the public PYPI (aka Cheeseshop) if necessary.
Our Private PYPI is a simple TurboGears application that I threw together in literally 5 minutes. It exposes a directory of packages (eggs and tarballs) as downloadable links on a web page, which is all easy_install needs to find and retrieve them.
Configuring easy_install to look for packages in the private PYPI before the public PYPI is simply a matter of creating a ~/.pydistutils.cfg file containing a find_links option pointing at the PYPI URL, in an [easy_install] section. For example:
[easy_install]
find_links = http://internal.server/pypi/
Gotta love simple yet powerful package management. |
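The index page really is as simple as it sounds: a page of links to package files. The sketch below is not the TurboGears application itself, just a plain Python function (with hypothetical names) that renders such a page from a directory of eggs and tarballs. Hook it up to whatever web framework you like.

```python
import html
import os
from urllib.parse import quote

# File types treated as packages in this sketch (an assumption).
PACKAGE_SUFFIXES = (".egg", ".tar.gz", ".zip")


def render_index(directory):
    """Return an HTML page with one download link per package file."""
    names = sorted(
        name for name in os.listdir(directory)
        if name.endswith(PACKAGE_SUFFIXES)
    )
    links = "\n".join(
        '<a href="%s">%s</a><br/>' % (quote(name), html.escape(name))
        for name in names
    )
    return "<html><body>\n%s\n</body></html>" % links
```

Serve the returned page at the find_links URL, with the package files alongside it, and easy_install can follow the links.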
|
|
| Zipped python eggs are evil |
[Aug. 3rd, 2007|08:48 am] |
I was recently trying to deploy a TurboGears app as a non-privileged user, configured with no home directory (home directory was just "/"). The app failed to start with a bunch of import errors, even though it worked fine when run as my user. The reason for import failure ended up being the method that is used to support importing zipped eggs. It appears that when a zipped egg is imported, it is actually unzipped to a directory in $HOME/.python-eggs/ where the package is then referenced.
So, if a user does not have write access to its $HOME directory then the temporary unzip will fail and so will the import. Very disappointing.
This whole zipped egg thing feels too much like a hack. At the very least shouldn't it attempt to unzip to the system tmp directory so it can still import the package and continue?
Anyway, the lesson to learn is always install eggs unzipped (which I was starting to do anyway, as I often need to examine the insides of an installed package when debugging and having to unzip the eggs first is a bit of a pain). |
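If you are stuck with zipped eggs, one workaround is the PYTHON_EGG_CACHE environment variable, which setuptools' pkg_resources consults to decide where to extract them. The sketch below points it at a fresh temporary directory when it is not already set. The helper name is mine, and it has to run before anything imports from a zipped egg, so treat it as a starting point rather than a drop-in fix.

```python
import os
import tempfile


def ensure_egg_cache(environ=os.environ):
    """Point PYTHON_EGG_CACHE at a writable directory if it is not set.

    Returns the directory that will be used for extracted eggs.
    """
    if "PYTHON_EGG_CACHE" not in environ:
        environ["PYTHON_EGG_CACHE"] = tempfile.mkdtemp(prefix="python-eggs-")
    return environ["PYTHON_EGG_CACHE"]
```

Call ensure_egg_cache() at the very top of the application's start script, or simply export the variable in the service's environment.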
|
|
| Variable requests for Apache Bench (ab) |
[Jul. 27th, 2007|12:08 pm] |
I've been using ab (ApacheBench - comes with Apache httpd) lately to do some performance benchmarking of our internal web services at work. It is nice & simple to use, but unfortunately it is limited to only requesting the same URL over and over. For some services, such as a search engine that normally receives different query parameters with every request, this does not really represent reality.
I have created a patch for ab that gives it a new option (-R). This allows you to specify a file and ab will append lines from the file to the base URL for every request, in the order they are read from the file. If ab reaches the end of the file before the test is finished it will return to the first line and repeat them all.
An example explains this better.
Out of the box you may use ab to benchmark the speed of your site's search:
$ ab -n 5000 http://www.something/search?q=ipod
This will cause ab to send 5000 requests to the specified URL. Handy, but it is testing the same query over & over, which is not what the site would see in practice.
Instead, you could use the -R patch, by first creating a file (let's call it requests.txt) containing something like:
ipod
apple+iphone
apple+ipod
dvd+player
and running ab with:
$ ab -n 5000 -R requests.txt http://www.something/search?q=
As ab constructs a query it will fetch the next line from requests.txt and append it to the base URL and that becomes the query to use for that request. In this example it would query the URLs:
http://www.something/search?q=ipod
http://www.something/search?q=apple+iphone
http://www.something/search?q=apple+ipod
http://www.something/search?q=dvd+player
http://www.something/search?q=ipod
http://www.something/search?q=apple+iphone
http://www.something/search?q=apple+ipod
http://www.something/search?q=dvd+player
and so on.
This is much more useful, at least for the types of benchmarks I want to do.
You can find the ab patch here. |
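The wrap-around behaviour is easy to mimic outside of ab, which can be handy for working out exactly which URLs a given run will hit. This is a Python sketch of the same idea, not the patch itself (the patch is C, against the ab source); the function name is made up.

```python
from itertools import cycle, islice


def request_urls(base_url, suffixes, count):
    """Return the URLs for `count` requests, cycling through the suffixes.

    Each suffix is appended to base_url in file order; when the list is
    exhausted it starts again from the first entry.
    """
    return [base_url + suffix for suffix in islice(cycle(suffixes), count)]
```

With the four-line requests.txt above, the fifth request goes back to q=ipod.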
|
|
| mod_proxy_balancer gets a thumbs up |
[Jul. 20th, 2007|11:09 am] |
At work we run a bunch of web applications (mostly TurboGears, CherryPy & Twisted apps) and host them behind Apache, using mod_proxy (and sometimes mod_rewrite) to present a clean URL to the outside world, but allowing each of the apps to run on their own private ports behind the scenes. Different people manage different web apps.
In front of our web farms we use hardware load balancers to handle request arbitration, which provides nice protection from servers or Apache instances going down.
The biggest problem I've had with this configuration until now is that when we need to perform maintenance on a particular web application, bringing that application down causes Apache to return an unhelpful message like "Service unavailable" to the client, as its attempt to reverse proxy the connection to the internal service fails.
For a long while I've wanted mod_proxy to be smarter, where I could tell it "hey, if the normal service you are forwarding to is not available, forward to this one instead". And "this one" would simply be the same service running on a different peer server.
Well, that is exactly what mod_proxy_balancer in Apache 2.2 allows you to do. It goes beyond that and can provide weighted load balancing of internal services, but it also allows you to define "hot spares" which are only used if the normal service(s) are unavailable. This is what I'm using, with a config like:
# Reverse Proxy /myapp to an internal web service, with fail-over to a hot standby
<Proxy balancer://myappcluster>
BalancerMember http://127.0.0.1:7825
# the hot standby on server2
BalancerMember http://10.0.0.2:7825 status=+H
</Proxy>
<Location /myapp>
ProxyPass balancer://myappcluster
ProxyPassReverse http://127.0.0.1:7825
ProxyPassReverse http://10.0.0.2:7825
</Location>
This config tells Apache to proxy requests for /myapp to a web service on localhost at http://127.0.0.1:7825
If that service becomes unavailable (i.e. you take it down for maintenance) then it will automatically send requests to http://10.0.0.2:7825 instead. The "status=+H" defines that member as a Hot Standby. When the default service is back on-line mod_proxy_balancer will pick that up within about 60 seconds or so and revert to forwarding all requests to it.
The ProxyPassReverse directives are unrelated to the proxy balancing smarts, but are usually required if you want to handle redirects/etc properly.
You can also get real load balancing if you define some BalancerMember entries that aren't hot standbys. mod_proxy_balancer will balance requests across them and hot standby members won't be used until all normal members become unavailable. You can control the weighting of members and the balancing method too, if you like. See the ProxyPass and mod_proxy_balancer docs. |
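The selection rule described above (normal members first, hot standbys only when no normal member is left) can be modelled in a few lines. This Python sketch is a simplification for illustration only: the class and function are hypothetical, and the random choice stands in for Apache's real balancing methods, which are more involved.

```python
import random
from dataclasses import dataclass


@dataclass
class Member:
    url: str
    hot_standby: bool = False
    available: bool = True


def pick_member(members, rng=random):
    """Choose a backend: normal members first, hot standbys as a last resort.

    Returns None when nothing is available (where Apache would answer
    with an error to the client).
    """
    normal = [m for m in members if m.available and not m.hot_standby]
    if normal:
        return rng.choice(normal)
    standbys = [m for m in members if m.available and m.hot_standby]
    if standbys:
        return rng.choice(standbys)
    return None
```

Marking the local member unavailable in this model has the same effect as taking the app down for maintenance: requests move to the standby until the primary returns.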
|
|
| EuroPython 2007 photos |
[Jul. 17th, 2007|03:33 pm] |
Another week, another EuroPython. Good fun all round. Cheers to Google for paying for everyone's beer on Monday night :-)
Here are my photos. |
|
|
| In Vilnius for EuroPython |
[Jul. 8th, 2007|11:08 am] |
Here I am in Vilnius, Lithuania, for another EuroPython conference. The city is very nice, from the small amount I've seen so far, although it hasn't stopped raining, so sightseeing isn't easy.
I am impressed by their offering of free wifi. The hotel (where the conference is also located) offers free wifi throughout, and I've just sat down at a coffee shop in a big shopping centre and was surprised to find another free wifi signal. Given the low costs of wifi infrastructure and broadband, more cities should encourage free wifi. I can't really see London doing it though... (nothing is free, or even cheap, in London).
Anyway, with any luck I'll walk around the "old town" today, which dates back to the 13th century and try and see more of the culture than shopping centres and wifi hotspots.
The conference starts tomorrow, so not much time for seeing sights after that. No doubt that will be when the rain stops and the sun comes out. |
|
|
| EuroPython 2007 booked |
[Jun. 6th, 2007|09:09 pm] |
I'm all booked in for EuroPython 2007 now. If you're going to be there, drop me a comment so I know to look out for you. Perhaps we can meet up and try out some of the Lithuanian beers. |
|
|
| Introspecting Python objects within gdb |
[May. 15th, 2007|10:49 pm] |
I had to debug a Python C extension recently. Using gdb, it was easier than I thought to walk through the source and introspect Python objects. Here's how to do it.
The first step is to make sure you've got a Python build that contains debugging symbols. Build Python manually using "make OPT=-g".
The nice Python guys have even supplied some handy gdb macros. Grab the Misc/gdbinit file from the Python source tree and make it your ~/.gdbinit file.
$ cd Python-2.5/Misc
$ cp gdbinit ~/.gdbinit
Now let's play with gdb. Fire it up and point it at the interpreter.
$ gdb
(gdb) file /opt/python-2.4.4-debug/bin/python
Reading symbols for shared libraries .... done
Reading symbols from /opt/python-2.4.4-debug/bin/python...done.
A very useful feature with gdb is the ability to set breakpoints on files that haven't been loaded yet, such as shared libraries. Let's set one in the source of a module I've been playing with. The shared library won't be loaded until Python processes the import statement, but gdb will still let us set it.
(gdb) b processtable.c:654
No source file named processtable.c.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (processtable.c:654) pending.
Now let's fire up the unit tests, to get something happening. You can see the pending breakpoint is automatically resolved when the relevant library is loaded.
(gdb) run setup.py test
Starting program: /opt/python-2.4.4-debug/bin/python setup.py test
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
running test
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Breakpoint 1 at 0x627338: file processtable.c, line 654.
Pending breakpoint 1 - "processtable.c:654" resolved
test_args (tests.process_test.ProcessCommandTest) ... ok
test_command (tests.process_test.ProcessCommandTest) ... ok
test_command_path (tests.process_test.ProcessCommandTest) ... ok
test_env (tests.process_test.ProcessCommandTest) ... ok
test_nice (tests.process_test.ProcessPriorityTest) ... ok
test_priority (tests.process_test.ProcessPriorityTest) ... ok
test_resident_size (tests.process_test.ProcessSizeTest) ... ok
test_virtual_size (tests.process_test.ProcessSizeTest) ... ok
test_flags (tests.process_test.ProcessTimeTest) ... ok
test_parent_pid (tests.process_test.ProcessTimeTest) ... ok
test_status (tests.process_test.ProcessTimeTest) ... ok
test_terminal (tests.process_test.ProcessTimeTest) ... ok
test_threads (tests.process_test.ProcessTimeTest) ... ok
test_current_gid (tests.process_test.ProcessUserTest) ... ok
test_current_group (tests.process_test.ProcessUserTest) ... ok
test_current_uid (tests.process_test.ProcessUserTest) ... ok
test_current_user (tests.process_test.ProcessUserTest) ... ok
test_real_gid (tests.process_test.ProcessUserTest) ... ok
test_real_group (tests.process_test.ProcessUserTest) ... ok
test_real_uid (tests.process_test.ProcessUserTest) ... ok
test_real_user (tests.process_test.ProcessUserTest) ... ok
test_bad_arg (tests.process_test.SimplestProcessTest) ... ok
test_pid (tests.process_test.SimplestProcessTest) ... ok
test_type (tests.process_test.SimplestProcessTest) ... ok
test_args (tests.processtable_test.ProcessTableProcessTests) ...
Breakpoint 1, ProcessTable_init (self=0x4410e0, args=0x405030, kwds=0x0) at processtable.c:654
654         if (PyList_Insert(self->processes, 0, (PyObject*)proc_obj)) {
Python ran some tests until it hit our breakpoint, inside the C extension module. We can view the source, of course.
(gdb) list
649
650
651         /* Add processes to list in reverse order, which ends up ordering
652          * them by ascending PID value.
653          */
654         if (PyList_Insert(self->processes, 0, (PyObject*)proc_obj)) {
655             return -1; /* failure */
656         }
657         Py_DECREF(proc_obj);
658     }
We are inside the __init__ function of a class. So there's the usual Python self object. In C extension modules, self is a pointer to a struct representing the internal attributes of the class. Let's take a look at self->processes.
(gdb) p self
$1 = (ProcessTableObject *) 0x4410e0
(gdb) p self->processes
$2 = (PyObject *) 0x4b5940
In this case, self is a pointer to our custom class. self->processes is a pointer to a PyObject, which could be any Python object type. The .gdbinit we borrowed from the Python source defines a very useful macro for inspecting the target of PyObject pointers.
(gdb) pyo self->processes
object : []
type : list
refcount: 1
address : 0x4b5940
$3 = void
Cool, so self->processes is a list type, and its current value is an empty list. Our breakpoint is located within a loop, so let's iterate around and get an object added to this list.
(gdb) cont
Continuing.

Breakpoint 1, ProcessTable_init (self=0x4410e0, args=0x405030, kwds=0x0) at processtable.c:654
654         if (PyList_Insert(self->processes, 0, (PyObject*)proc_obj)) {
(gdb) pyo self->processes
object : [<psi.process.Process object pid=16543>]
type : list
refcount: 1
address : 0x4b5940
$4 = void
Cool, the list now contains an object. Let's add another by looping again.
(gdb) cont
Continuing.

Breakpoint 1, ProcessTable_init (self=0x4410e0, args=0x405030, kwds=0x0) at processtable.c:654
654         if (PyList_Insert(self->processes, 0, (PyObject*)proc_obj)) {
(gdb) pyo self->processes
object : [<psi.process.Process object pid=16536>, <psi.process.Process object pid=16543>]
type : list
refcount: 1
address : 0x4b5940
$5 = void
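Notice that PID 16536 landed in front of 16543. That is the effect of the source comment we listed earlier: PyList_Insert(list, 0, item) is the C counterpart of list.insert(0, item), so inserting at index 0 while walking the PIDs from highest to lowest leaves the list in ascending order. Here is a rough Python sketch of just that idea. The function name and the extra PIDs (120 and 1) are made up for illustration, and it assumes the process scan yields PIDs in descending order, as the comment in processtable.c suggests.

```python
def build_process_list(pids_descending):
    """Mimic the insert-at-front loop in ProcessTable_init (hypothetical helper)."""
    processes = []
    for pid in pids_descending:
        # Same effect as PyList_Insert(self->processes, 0, proc_obj) in C.
        processes.insert(0, pid)
    return processes


# Assumed scan order: highest PID first.
table = build_process_list([16543, 16536, 120, 1])
print(table)  # [1, 120, 16536, 16543]
```

Appending and then reversing would give the same result; inserting at the front just avoids a second pass.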
So, self->processes is a list and currently contains 2 objects. Is it possible to fetch an element from the list and examine it? Sure is. We need to call the Python C functions that know how to deal with Python objects. gdb will allow us to do this.
(gdb) pyo PyObject_GetItem(self->processes,Py_BuildValue("i",0))
object : <psi.process.Process object pid=16536>
type : psi.process.Process
refcount: 3
address : 0x4dbf28
$6 = void
PyObject_GetItem(obj, y) is the C equivalent of obj[y] or obj.__getitem__(y). The "y" must also be a Python object; you cannot just give it a C int. So we use Py_BuildValue() to build a Python integer object. The above is the equivalent of self.processes[0]. (Note that you cannot have any spaces within the argument given to pyo, as arguments to gdb macros are split on white space and pyo will only use the first one ($arg0).)
So, how do we examine the Process object itself? We can easily look at an attribute of the object, which might be handy. Let's look at the "command" attribute of the Process object.
(gdb) pyo PyObject_GetAttr(PyObject_GetItem(self->processes,Py_BuildValue("i",0)),Py_BuildValue("s","command"))
object : 'gdb-i386-apple-d'
type : str
refcount: 3
address : 0x640bb0
$7 = void
And the same for the other object in the list.
(gdb) pyo PyObject_GetAttr(PyObject_GetItem(self->processes,Py_BuildValue("i",1)),Py_BuildValue("s","command"))
object : 'python'
type : str
refcount: 3
address : 0x63cf60
$8 = void
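If you want to convince yourself that these C API calls really are the same thing as obj[y] and getattr(obj, name), you can poke at them from Python itself, without gdb. The sketch below is not part of the debugging session: it assumes a CPython recent enough to ship ctypes (newer than the 2.4 build used above), and the Process class is a made-up stand-in for psi.process.Process. Since Python ints and strings are already objects, there is no need for Py_BuildValue() here.

```python
import ctypes

# ctypes.pythonapi exposes the C API of the running CPython interpreter.
api = ctypes.pythonapi

# Both functions take two PyObject* arguments and return a new reference.
api.PyObject_GetItem.argtypes = [ctypes.py_object, ctypes.py_object]
api.PyObject_GetItem.restype = ctypes.py_object
api.PyObject_GetAttr.argtypes = [ctypes.py_object, ctypes.py_object]
api.PyObject_GetAttr.restype = ctypes.py_object


class Process(object):
    """Hypothetical stand-in for psi.process.Process."""

    def __init__(self, pid, command):
        self.pid = pid
        self.command = command


processes = [Process(16536, "gdb"), Process(16543, "python")]

# C: PyObject_GetItem(processes, <int 0>)  ==  Python: processes[0]
first = api.PyObject_GetItem(processes, 0)
assert first is processes[0]

# C: PyObject_GetAttr(first, <str "command">)  ==  Python: first.command
command = api.PyObject_GetAttr(first, "command")
assert command == first.command
```

The nested form used in the gdb session maps across the same way: api.PyObject_GetAttr(api.PyObject_GetItem(processes, 1), "command") is just processes[1].command.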
Cool, so even though we are deep within a C extension module, we can still introspect our objects with relative ease. |
|
|
| Me == TurboGears Contributor |
[May. 3rd, 2007|01:33 pm] |
TurboGears 1.0.2 has just been released, and yours truly has been listed as a contributor in the CHANGELOG. Admittedly the patches I submitted were only a couple of minor enhancements to the paginate functionality, but it is nice to help out on the project, and 1.0.2 saves me from running my own patched branch of TG. |
|
|