Completing the module AWSTATS

awstats

(Arnaud) #1

Good morning,
this topic is related to the module that spethdl made for us: https://wiki.nethserver.org/doku.php?id=awstats

Unfortunately, only the virtualhosts created by the server-manager have been taken into consideration for the stats.
Besides these vhosts, other interesting URLs could use stats, e.g. the nextcloud and dokuwiki modules that run on a subdomain, the forwarded URLs of the ReverseProxy, and the main domain www.domain.tld.

Starting from the topic “Nethserver-awstats needs testers”, I ran some tests.
I understood that 2 things are necessary: a separate httpd-log file and a conf file for awstats.

The separate log file can be created fairly easily by adding:
#redirect access log for awstats statistics
CustomLog /var/log/httpd/access.subdomain.domain.tld.log combined
after <VirtualHost *:443> in the corresponding conf file of the subdomain in /etc/httpd/conf.d.
This works fine for nextcloud, dokuwiki and mattermost as well as the forwarded URLs of ReverseProxy.

For www.domain.tld there is no <VirtualHost *:443> but adding the CustomLog parameters at the beginning of the conf file works.

Restart httpd after the modifications.

Of course this should be solved by using templates-custom but for tests, writing directly into the conf files is enough.

The conf files for awstats are in /etc/awstats. Copying an existing file and changing the name of the log file as well as the domain name and alias is enough.
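
The copy-and-modify step can be sketched in a few lines of Python. This is only an illustration, not part of the module: the helper name derive_awstats_conf is hypothetical, and it assumes the three settings to change are the AWStats directives LogFile, SiteDomain and HostAliases, with the alias written in the REGEX[...] form quoted later in this thread.

```python
import re

def derive_awstats_conf(template_text, log_file, site_domain):
    """Return a copy of an AWStats config text with LogFile, SiteDomain
    and HostAliases rewritten for another (sub)domain.

    Hypothetical helper for illustration; it only edits text and does not
    touch /etc/awstats itself.
    """
    # Alias in the same REGEX[...] form used elsewhere in this thread.
    alias = "REGEX[^" + re.escape(site_domain) + "$]"
    replacements = {
        "LogFile": log_file,
        "SiteDomain": site_domain,
        "HostAliases": alias,
    }
    out = []
    for line in template_text.splitlines():
        # Match "Directive=..." at the start of a line; commented lines
        # (starting with '#') do not match and are left untouched.
        m = re.match(r"\s*(\w+)\s*=", line)
        if m and m.group(1) in replacements:
            line = '{}="{}"'.format(m.group(1), replacements[m.group(1)])
        out.append(line)
    return "\n".join(out) + "\n"
```

The result would be saved as a new file in /etc/awstats next to the existing ones; all other directives are copied unchanged.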

To display the stats: starting from the server-manager => webstats (the “other” subdomains aren’t listed!), open an existing virtualhost. Then replace virtualhost.domain.tld.vhost in the URL with other-subdomain.domain.tld and you have it.
This works e.g. for dokuwiki, nextcloud and mattermost.

But www.domain.tld as well as forwarded-url.domain.tld are a problem: stats are displayed, but these are not the stats from the separate log files!!
I deleted the separate log files and the values are still present.
The date and time cannot be updated, and the stats look like they may come from the general log file!!

Where are these stats coming from? Where are the corresponding parameters for awstats?

Please let me know.

Bye
Arnaud


(Stéphane de Labrusse) #2

Sure, I created awstats to work with the virtualhost module; we could imagine something to make it work with other modules…at least my modules that work with a virtualhost


(Arnaud) #3

Hello,
I’m a little bit further. In theory it works: adding the CustomLog to the httpd conf file creates the separate log files. These log files are read by awstats, which creates its data file (in /var/lib/awstats). This can be displayed by changing the URL in the browser.

But…

The log files for the main domain www.domain.tld look different from the log files of the virtualhosts (created by the server-manager) => awstats reads the logs but doesn’t recognize anything in them. I think the problem comes from the missing URL => the “SiteDomain” of the awstats conf file is not recognized and the log is not taken into consideration for the stats.

Example:
For a virtualhost:
WWW.XXX.YYY.ZZZ - - [22/Oct/2018:20:09:19 +0200] "GET /themes/elegant/icon/icons_sprite.png HTTP/1.1" 200 4928 "https://subdomain.domain.tld/_data/combined/10dohcr.css" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
WWW.XXX.YYY.ZZZ - - [22/Oct/2018:20:09:19 +0200] "GET /plugins/language_switch/flag_sprite.jpg HTTP/1.1" 200 120557 "https://subdomain.domain.tld/_data/combined/10dohcr.css" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"

From the same client machine for www.domain.tld:
WWW:XXX:YYY:ZZZ - - [22/Oct/2018:20:16:48 +0200] "GET / HTTP/1.1" 301 - "-" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"
::1 - - [22/Oct/2018:20:16:54 +0200] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/5.4.16 (internal dummy connection)"
::1 - - [22/Oct/2018:20:16:55 +0200] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/5.4.16 (internal dummy connection)"
::1 - - [22/Oct/2018:20:17:00 +0200] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/5.4.16 (internal dummy connection)"
And this connection is not counted by awstats.

But robots seem to produce:
WWW.XXX.YYY.ZZZ - - [22/Oct/2018:04:38:58 +0200] "GET /index.php/folders/of/my/website/Newsbox-News01&login HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://YYYYYY.com/)"
Robots have been counted by awstats.
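
To compare the three kinds of lines above more systematically, a small parser for the “combined” format can help. The sketch below is only an aid for reading the logs, not a description of how awstats decides what to count; the function name classify and its labels are made up for illustration. It relies on one known fact: entries from ::1 whose user agent contains “internal dummy connection” are requests Apache sends to itself, not visits from a client.

```python
import re

# One line of Apache's "combined" log format:
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def classify(line):
    """Label a combined-format log line (hypothetical labels)."""
    m = COMBINED.match(line)
    if m is None:
        return "unparsed"
    if "internal dummy connection" in m.group("agent"):
        return "apache-internal"   # Apache talking to itself
    if m.group("status").startswith("3"):
        return "redirect"          # e.g. the 301 answers shown above
    return "request"
```

Run over the samples above, this puts the ::1 lines in the “apache-internal” group, while both the browser request for / and the robot request are labelled “redirect” because both were answered with status 301.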

Can somebody confirm?
Why are the logs of the virtualhosts different? I can’t see any log-related parameters in the httpd conf file.

Bye
Arnaud


(Arnaud) #4

Hello,
I have to report that I haven’t moved forward :anguished:
I’m still looking for what makes the logs of www so empty of information.
In /etc/httpd/conf/httpd.conf I found the following about the log format:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
and I couldn’t find any other log parameters in /etc/httpd.
I tried copying this line into /etc/httpd/conf.d/www.conf, but the logs stay unchanged.

In fact, I don’t understand how and where the URL www.domain.tld is linked to the folder /var/www/html.
And the words “internal dummy connection” in the logs make me think that there may be an internal redirection, and that this redirection breaks the logs… Maybe… I don’t know. I haven’t found any redirection.

I had a short look at how to add entries for additional subdomains to the server-manager => for me this is … a challenge (thinking positively). I had hoped that adding some custom templates would do the job…:roll_eyes:

Bye
Arnaud


(Stéphane de Labrusse) #5

sorry @Arnaud I have been a bit busy lately, I need to go back through the history of this thread but I wonder if something is not already done for you

if you look in /etc/awstats you will find all the configurations for your server, one is created by the rpm…this is what it is for me: /etc/awstats/awstats.systemname.domain.com.conf

it covers everything that is served by your server, check it at https://prometheus/awstats/awstats.pl?config=systemname.domain.com

we could add a link to display it nicely in the server manager…would that satisfy you?


(Arnaud) #6

Hi @stephdl,
as far as I have tested and understood it, the issue for www.domain.tld doesn’t come from awstats itself but from the logs. As you can see above, there is no target URL in them => awstats doesn’t take them into consideration => 0 visitors!
The logs of the VirtualHosts contain target URLs, so awstats can count and make stats.
I can imagine 2 solutions:

  • the best one: we manage to make www.domain.tld generate logs with the target URL
  • the poor man’s trick: awstats only counts the number of lines in the logs (thanks to the separate log file, we know that the logs only come from www.domain.tld). But we would lose precision in the stats: we could no longer tell which pages are visited frequently (because the detailed URL is missing)

It seems that the stats for dokuwiki, mattermost and nextcloud (each with its own subdomain) are OK => we only need to create the conf file for awstats. A link to display them would be IMHO sufficient.

SOGo behind a redirection (reverse proxy) with its own subdomain: in my case the stats are 50% correct. The Linux client (Evolution) produces good logs and stats, but Davdroid on the smartphone is not taken into consideration. I haven’t analysed this further.

So the main target remains the stats of www.domain.tld. Tomorrow I can try to create a VirtualHost “www” and put the website into it, but I vaguely remember that this caused some issues with dokuwiki and nextcloud.

Arnaud


(Stéphane de Labrusse) #7

this is the systemName.domainName.com awstats configuration of my server created by the rpm installation, it seems to work for me

:-?

of course if you want separate stats, then you either need separate logs or have to use the grep function to retrieve the domain/URL you need

HostAliases="REGEX[^.*SystemName\.domain.com\.fr$]"


(Stéphane de Labrusse) #8

this would also require modifying a file which is not templated (/etc/httpd/conf/httpd.conf), and I do not want a module to change the default configuration


(Stéphane de Labrusse) #9

pushed a commit to display the default virtualhost in the tab URL page


(Arnaud) #10

Hello Stéphane,
many thanks for the new version of awstats.
I had a look at it and it is more or less what I had already tried, so I’m not sure we understand each other and are talking about the same thing.
Let me try to be clearer:

  • awstats “works” now with the conf file “awstatshost.domain.tld.conf” in /etc/awstats.
  • the logs used are the general access logs of Apache, i.e. everything except the logs belonging to the VirtualHosts
  • there is now a “button” in the server-manager to select these “general stats”
  • stats are displayed
    But:
    IMHO the stats are not correct (like in real life…)!
    When I connect to www.domain.tld from my mobile phone or my client PC, the stats do not increase. => the stats don’t take connections to www.domain.tld into consideration.

Separating the logs for www.domain.tld reduces the “noise” in the logs (no reverse proxy, dokuwiki, mattermost, nextcloud etc…) and then I see that only the robots produce stats.
Looking at the logs, I see that the entries produced by robots differ from the entries produced by a client machine:

  • for robots, the target URL is present and I can see the robots crawling from one page to the next across the website (more or less the same logs as those produced for VirtualHosts).
  • for a client machine, I can only see a connection, without knowing the source or the URL that was reached.

And I think that this is the problem: with logs like this:
::1 - - [05/Nov/2018:20:47:59 +0100] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/5.4.16 (internal dummy connection)"
awstats doesn’t make stats.

Do you see what I mean?

How did you add the additional link in the server-manager for the stats of host.domain.tld? Could you please point me to the conf files?

Arnaud


(Stéphane de Labrusse) #11

conf files are in /etc/awstats/awstats.*.conf, cp a file, modify it accordingly and launch the cronjob manually: /usr/libexec/nethserver/awstatsCronJobs
I know that the statistics are not accurate because of the single log file, which is why I separated the log files, but this was possible for me because the virtualhost is templated; this is not the case for the default virtualhost.

You could make your own awstats configurations, and we could imagine displaying them automatically in the server manager if we respect a file naming standard, something like the above (awstats.*.conf).

IIRC the options to modify are

SiteDomain="www.domain.com"
HostAliases="REGEX[^www\.domain\.com$]"

(Arnaud) #12

Hello Stéphane,

I already know it and it works.

What bothers me more is that the stats are wrong because the visits to www.domain.tld cannot be counted!

Separating the logs for www is easy: as for the VirtualHosts, a “CustomLog” parameter has to be added to /etc/httpd/conf.d/www.conf

Yes, VirtualHosts are templated, but in the template (= in the corresponding file /etc/httpd/conf.d/Virtualhost.conf) I couldn’t find any change to the log format, only to the file where the logs are stored (see above).
So it doesn’t explain why the log format of the VirtualHosts differs from the log format of the default virtualhost. It only explains why the logs are stored in a separate file/folder.

It would be a very fine solution!

In theory yes, but it doesn’t work because the logs of www.domain.tld can’t be analyzed by awstats (missing IP and target URL).
This is the issue that I have been trying to describe in my last few posts.

Arnaud


(Stéphane de Labrusse) #13

https://wiki.nethserver.org/doku.php?id=awstats#manual_configuration

with the new version, all awstats configurations will be displayed in the statistics tabs if you respect the naming convention

awstats.Your.Configuration.name.conf
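
As a rough illustration of this convention (not the module’s actual code), the configuration name is what sits between “awstats.” and “.conf”, and it is the same value passed as config= in the awstats.pl URL quoted earlier in the thread. The helper name config_names below is hypothetical.

```python
import re

# awstats.<configuration name>.conf, as described above
CONF_NAME = re.compile(r"^awstats\.(?P<config>.+)\.conf$")

def config_names(filenames):
    """Return the configuration names found in a list of file names.

    Hypothetical helper: files that do not follow the
    awstats.<name>.conf pattern are simply ignored.
    """
    names = []
    for name in filenames:
        m = CONF_NAME.match(name)
        if m:
            names.append(m.group("config"))
    return sorted(names)
```

Passing it the output of os.listdir("/etc/awstats") would list the candidates; a file such as awstats.www.domain.tld.conf gives the name www.domain.tld.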


(Stéphane de Labrusse) #14

teaser: maillog statistics with awstats

https://awstats.sourceforge.io/awstats.mail.html


(Stéphane de Labrusse) #15

Released in my repository, now you can track email statistics


(Stéphane de Labrusse) #16

Found something that may be useful

/usr/share/awstats/tools/awstats_buildstaticpages.pl -config=subdomain.domain.com -dir=/tmp -buildpdf=/usr/bin/htmldoc

we could make PDF reports for the sysadmin or your customers (thanks to @davidep for the idea)


(Stéphane de Labrusse) #17

released


(Arnaud) #18

Hi Stéphane,
well done! Very good job!
Everything is working perfectly (except www on my side :hushed:). I can’t test “email” because this NS only does “web”.
It seems that I’m the only one with the “bad” logs for www => I will do a fresh install on a test machine and compare the behavior of the “production” machine with the results from the test machine.

In any case, the result of this improvement is far better than I expected!

Bye
Arnaud


(Stéphane de Labrusse) #19

please test it, normally even without the email stack installed you should be able to export the email of root to another account. I suppose that you monitor your server or check its email