Tuesday, January 25, 2011

How Can We Create Blackbox Logs for Nginx?

There's an article out there, Profiling LAMP Applications with Apache's Blackbox Logs, that describes how to create a log that records a lot of detailed information missing in the common and combined log formats. This information is supposed to help you resolve performance issues. As the author notes "While the common log-file format (and the combined format) are great for hit tracking, they aren't suitable for getting hardcore performance data."

The article describes a "blackbox" log format, like a blackbox flight recorder on an aircraft, that gathers information used to profile server performance, missing from the hit tracking log formats: Keep alive status, remote port, child processes, bytes sent, etc.

LogFormat "%a/%S %X %t \"%r\" %s/%>s %{pid}P/%{tid}P %T/%D %I/%O/%B" blackbox

I'm trying to recreate as much of the format for Nginx, and would like help filling in the blanks. Here's what Nginx blackbox format would look like, the unmapped Apache directives have question marks after their names.

access_log blackbox '$remote_addr/$remote_port X? [$time_local] "$request"'
                    's?/$status $pid/0 T?/D? I?/$bytes_sent/$body_bytes_sent'

Here's a table of the variables I've been able to map from the Nginx documentation.

%a = $remote_addr - The IP address of the remote client.
%S = $remote_port - The port of the remote client.
%X = ? - Keep alive status.
%t = $time_local - The start time of the request.
%r = $request - The first line of request containing method verb, path and protocol.
%s = ? - Status before any redirections.
%>s = $status - Status after any redirections.
%{pid}P = $pid - The process id.
%{tid}P = N/A - The thread id, which is non-applicable to Nignx.
%T = ? - The time in seconds to handle the request.
%D = $request_time - The time in milliseconds to handle the request.
%I = ? - The count of bytes received including headers. 
%O = $bytes_sent - The count of bytes sent including headers.
%B = $body_bytes_sent - The count of bytes sent excluding headers, but with a 0 for none instead of '-'.

Looking for help filling in the missing variables, or confirmation that the missing variables are in fact, unavailable in Nginx.

  • Nginx is not like Apache -- the power of Nginx is in (1) server/location matching and URI rewriting, (2) proxying/failover requests (to HTTP/FastCGI upstreams) and (3) caching of the result. So you may face real problems somewhere in these areas, when you need to measure how your backend service responds and which of the requests are cached. I doubt there can be any universal "blackbox" format for Nginx, since it has a lot of variables and moreover they are dynamic (i.e. you can write the value of cookie "Registered" and corresponding cache status to the log in order to see whether registered users get cached content or not)

    Nevertheless, it is very useful to extend short "combined" log format.

    General

    Some hints on the general variables:

    $uri - URI after rewrites
    $args - Query arguments
    $realpath_root - Value of "root" for this request
    $server_name - the name of server which is processing the request
    $connection - the number of connection
    

    Proxying

    Some useful variables when proxying. These variables may contain serveral values: separated with , when upstreams where asked in turn and separated with : when X-Accel-Redirect has been issued:

    $upstream_addr - the upstream IPs
    $upstream_response_time - the upstream processing times
    $upstream_status - the upstream statuses
    

    Caching

    One variable useful to log caching status:

    $upstream_cache_status = MISS | EXPIRED | UPDATING (stale answer used) | STALE (stale answer used) | HIT
    
    Alan Gutierrez : Why not? There's a universal log format for hit counting, "combined." Why can't there be one for performance? Then you could have people write scripts to report on the default performance log format, which could be extended or supplemented as your application needs dictate. Convention versus configuration. Chris Josephes' article was a nice contribution to Apache. It was nice that someone thought about the data you'll want to have when *unanticipated* performance or security issues arise.
    From Alaz

0 comments:

Post a Comment