
I've installed AWStats 7.0 (the latest version in the Amazon Linux repository) to try to get additional information about bandwidth usage. I'm having trouble getting AWStats to parse my logs - I suspect it's because I can't get the LogFormat right.

I've tried many variations and I just can't get it working.

Here's my Nginx log format

log_format  main  '$remote_addr - $remote_user [$time_local] "$host" "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '"$http_user_agent" "$http_x_forwarded_for" "$request_time" '
                  '"$upstream_cache_status" "$sent_http_content_encoding" ';

Here's a log entry

1.1.1.1 - - [12/Mar/2017:07:23:53 +1300] "www.example.com" "GET /url/ HTTP/1.1" 200 7455 "https://www.google.ru/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36" "46.71.136.54" "0.000" "HIT" "gzip"

Here's my AWStats configuration file. Anything not here is standard and inherited from the main configuration file

# Path to your nginx vhost log file
LogFile="/var/log/nginx/pts.access.log"

# Domain of your vhost
SiteDomain="example.com"

# Directory where to store the awstats data
DirData="/var/lib/awstats/pts/"

# Other alias, basically other domain/subdomain that's the same as the domain above
HostAliases="www.example.com"

LogFormat = "%host %logname %time1 %virtualname %methodurl %code %bytesd %refererquot %uaquot %otherquot %otherquot %otherquot %otherquot"

Here's the awstats output

[root]# /usr/share/awstats/tools/awstats_updateall.pl now -awstatsprog=/usr/share/awstats/wwwroot/cgi-bin/awstats.pl
Running '"/usr/share/awstats/wwwroot/cgi-bin/awstats.pl" -update -config=example.com -configdir="/etc/awstats"' to update config example.com
Create/Update database for config "/etc/awstats/awstats.example.com.conf" by AWStats version 7.0 (build 1.971)
From data in log file "/var/log/nginx/pts.access.log"...
Phase 1 : First bypass old records, searching new record...
Searching new records from beginning of log file...
Jumped lines in file: 0
Parsed lines in file: 323
 Found 323 dropped records,
 Found 0 comments,
 Found 0 blank records,
 Found 0 corrupted records,
 Found 0 old records,
 Found 0 new qualified records.

Can anyone spot what's not right? I can't find any additional information or awstats logs that would give further information.

Tim

2 Answers


One possible issue is here:

log_format  main  '$remote_addr - $remote_user [$time_local]...

Corresponding configuration in AWStats:

LogFormat = "%host %logname %time1

And your log file contains:

1.1.1.1 - - [12/Mar/2017:07:23:53 +1300]

%logname matches only a single string, namely the username provided in HTTP authentication. Your log file contains two dashes: the first is the literal dash from your nginx log format, and the second stands for an empty username.

AWStats therefore tries to interpret the second dash as a timestamp, which causes it to reject the record.

You either need to account for the dash in the AWStats log format string, or remove the dash from the nginx log format.
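As a rough illustration of the mismatch, here is a minimal Python sketch that splits the sample log entry into fields and lines them up against the LogFormat tokens from the question. The tokenizer (`FIELD_RE`) is a simplification made up for this example and is not how AWStats actually parses a log; it only shows that the field count and the token count differ.

```python
import re

# Simplified tokenizer (an assumption for illustration, not AWStats's parser):
# a [bracketed] run, a "quoted" run, or a bare word.
FIELD_RE = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')

# The log entry from the question.
LOG_LINE = (
    '1.1.1.1 - - [12/Mar/2017:07:23:53 +1300] "www.example.com" '
    '"GET /url/ HTTP/1.1" 200 7455 "https://www.google.ru/" '
    '"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36" '
    '"46.71.136.54" "0.000" "HIT" "gzip"'
)

# The LogFormat from the question.
LOG_FORMAT = (
    "%host %logname %time1 %virtualname %methodurl %code %bytesd "
    "%refererquot %uaquot %otherquot %otherquot %otherquot %otherquot"
)

fields = FIELD_RE.findall(LOG_LINE)
tokens = LOG_FORMAT.split()
pairs = list(zip(tokens, fields))

print(len(tokens), "tokens vs", len(fields), "fields")
for token, field in pairs[:4]:
    # Third row pairs %time1 with the second dash instead of the timestamp.
    print(f"{token:<14}{field}")
```

With this simplified split, the line has one more field than the format has tokens, and from the third position onward every token is paired with the field before the one it was meant for.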

As a side note, you don't need to quote the last parameters ($request_time, $upstream_cache_status, $sent_http_content_encoding) in the nginx log format, since they cannot contain spaces.

You can also use %extraX in the AWStats configuration if you want to build reports from those fields.

Tero Kilkanen
  • Thanks for the thoughts Tero. Unfortunately adding the dash to the LogFormat makes no difference. I had it in there originally, removed it for testing. I also tried copying a line of the logs into a temp file, removing the dashes (eg "207.46.13.91 [13/Mar/2017:07:12:23 +1300]" etc), and changing LogFormat to "%host %time1" etc. AWStats told me "1 blank record one dropped record". Other ideas welcome and appreciated :) – Tim Mar 12 '17 at 18:17
  • OK. My next approach would be to take standard nginx log file format, add one element at a time and see what exact element is what makes AWStats fail. Or I would read AWStats code to see how it parses the log and figure out the reason there. – Tero Kilkanen Mar 12 '17 at 19:01
  • Good approach, thanks. I'll do that over the next week or so and update when I have more information. – Tim Mar 12 '17 at 19:12

I finally worked it out, after about 6 hours of effort. The key problem was that my AWStats site config was incorrect, but I don't think my Nginx log format or my AWStats format string were right either.

Here's my working Nginx log format. This is the standard Nginx combined log format, which maps to AWStats LogFormat=1, plus three extra fields I wanted in my logs:

# /etc/nginx/nginx.conf
log_format combined_custom '$remote_addr - $remote_user [$time_local] '
                '"$request" $status $body_bytes_sent '
                '"$http_referer" "$http_user_agent" $host $request_time $upstream_cache_status';

Of course I had to have my server use this configuration. This is in my server block.

# /etc/nginx/sites-enabled/example.com.conf
access_log  /var/log/nginx/access.log combined_custom;

Here's my AWStats site config file. This extends the /etc/awstats/awstats.conf.local file with site specific values.

Note that one problem was that I had the SiteDomain wrong: I had omitted the "www" at the start of my domain. I did this because I thought "HostAliases" would let me add the www subdomain as an alias, but that's not what it's for. Its actual purpose is:

This parameter [HostAliases] is used to analyze referer field in log file and to help AWStats to know if a referer URL is a local URL of same site or an URL of another site.
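Because the fix here was a SiteDomain that matches the hostname actually being served, a quick tally of the $host field can show which hostnames appear in a log. This is only a sketch, assuming the combined_custom format shown above (where $host is the tenth field); the sample lines and the `FIELD_RE` tokenizer are invented for illustration, and it does not reproduce whatever filtering AWStats itself applies.

```python
import re
from collections import Counter

# Simplified tokenizer (an assumption, not AWStats's parser):
# a [bracketed] run, a "quoted" run, or a bare word.
FIELD_RE = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')

# In the combined_custom format, $host is the tenth field (index 9).
HOST_INDEX = 9

# Made-up sample lines in the combined_custom format.
SAMPLE_LINES = [
    '1.1.1.1 - - [12/Mar/2017:07:23:53 +1300] "GET /url/ HTTP/1.1" 200 7455 '
    '"https://www.google.ru/" "Mozilla/5.0" www.example.com 0.000 HIT',
    '2.2.2.2 - - [12/Mar/2017:07:24:10 +1300] "GET / HTTP/1.1" 200 512 '
    '"-" "curl/7.50" www.example.com 0.001 MISS',
    '3.3.3.3 - - [12/Mar/2017:07:25:00 +1300] "GET / HTTP/1.1" 301 0 '
    '"-" "curl/7.50" example.com 0.000 -',
]

host_counts = Counter()
for line in SAMPLE_LINES:
    fields = FIELD_RE.findall(line)
    if len(fields) > HOST_INDEX:
        host_counts[fields[HOST_INDEX]] += 1

# Print hostnames from most to least frequent.
for host, count in host_counts.most_common():
    print(f"{count:>5}  {host}")
```

To run it against a real log, replace `SAMPLE_LINES` with the lines read from the access log file.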

# /etc/awstats/awstats.example.com.conf
# Path to your nginx vhost log file
LogFile="/var/log/nginx/access.log"

# Domain of your vhost
SiteDomain="www.example.com"

# Directory where to store the awstats data
DirData="/var/lib/awstats/example/"

# Other alias, basically other domain/subdomain that's the same as the domain above
HostAliases="localhost"

# Performance optimisation
DNSLookup=0

# This works with the Nginx combined log format
# LogFormat=1

# This is the equivalent of LogFormat=1
# LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot"

# This adds my custom fields
LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot %virtualname %other %other"
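To spell out which Nginx variable each AWStats token lines up with in this working setup, here is a small Python sketch that pairs them positionally. The `FIELD_RE` splitter is a simplification for illustration only, not AWStats's real parsing logic.

```python
import re

# Simplified splitter (an assumption for illustration):
# a [bracketed] run, a "quoted" run, or a bare word.
FIELD_RE = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')

# The combined_custom log_format from nginx.conf, joined into one string.
NGINX_FORMAT = (
    '$remote_addr - $remote_user [$time_local] '
    '"$request" $status $body_bytes_sent '
    '"$http_referer" "$http_user_agent" $host $request_time $upstream_cache_status'
)

# The LogFormat from the AWStats site config.
AWSTATS_FORMAT = (
    "%host %other %logname %time1 %methodurl %code %bytesd "
    "%refererquot %uaquot %virtualname %other %other"
)

nginx_fields = FIELD_RE.findall(NGINX_FORMAT)
awstats_tokens = AWSTATS_FORMAT.split()
pairs = list(zip(nginx_fields, awstats_tokens))

# Both sides should have the same number of entries.
print(len(nginx_fields), "nginx fields,", len(awstats_tokens), "AWStats tokens")
for nginx_field, token in pairs:
    print(f"{nginx_field:<26}{token}")
```

Here the literal dash is consumed by the first %other, $host lands on %virtualname, and the last two fields ($request_time and $upstream_cache_status) are absorbed by the trailing %other tokens.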

I haven't gone any further in getting AWStats working, but once I do I'll update this post with anything I find that's tricky.

Thanks to @Tero Kilkanen for the methodology to work this out, i.e. start with the combined format and work forwards.

Tim