DansGuardian

DansGuardian is excellent at filtering pages from the Internet as it examines both the URL and the content of the page, and it has many options to allow you to fine tune the process. To run DansGuardian, you will first need a proxy in place. DansGuardian will work with many proxy servers, such as tinyproxy, but Squid is the recommended one.

Note: DansGuardian is no longer maintained. An actively developed fork is e2guardianAUR.

From the project home page:

DansGuardian is an award winning Open Source web content filter which currently runs on Linux, FreeBSD, OpenBSD, NetBSD, Mac OS X, HP-UX, and Solaris. It filters the actual content of pages based on many methods including phrase matching, PICS filtering and URL filtering. It does not purely filter based on a banned list of sites like lesser totally commercial filters.
DansGuardian is designed to be completely flexible and allows you to tailor the filtering to your exact needs. It can be as draconian or as unobstructive as you want. The default settings are geared towards what a primary school might want but DansGuardian puts you in control of what you want to block.
DansGuardian is a true web content filter.

The original author of this article runs Squid and DansGuardian content filters at several schools in the UK, successfully blocking inappropriate content.

Installation

Install the dansguardianAUR package.

After #Configuration, start/enable the dansguardian.service systemd service.

Configuration

Configuration files

Configuration files, blocking templates, blacklists, etc are stored in /etc/dansguardian.

DansGuardian has the concept of groups, which are groups of users or machines that have certain blocks applied to them. The following sets up one group for everyone---the idea being that anyone who wants unfiltered access could just hit Squid directly. (This is not ideal.)

Edit /etc/dansguardian/dansguardian.conf. The defaults are suitable for most users, and the file is well-commented. Be sure to check the options under Network Settings for filterip, filterport, proxyip and proxyport. You may also want to examine and . The filter mode is especially important if you are running this setup on older hardware.

Next edit the options for the first, and only, group in this setup. Edit the file , which is well-commented. Pay attention to the .

DansGuardian examines the content on a page and adds up the naughty words based on a weighting scheme, with worse words getting more points. If this total exceeds the , the page will be blocked. 50 is a good limit for young children, whereas 160 is good for young adults. In a corporate environment, you would want this set around 200.

Blacklists

Adding websites to the block lists are done in the same directory, in the almost self-explanatory configuration files. DansGuardian provides a powerful regular expression URL and content filter, as well as ordinary URL blocklists. Some of the important files are:

  • bannedsitelist - The domains for your banned sites, e.g. "human-horse-love.com"
  • bannedurllist - If you just want to block part of a website, e.g. "bbc.co.uk/games"
  • exception* - This is where your domain/URL exceptions go. Sites in this list will not be checked and allowed straight through.
  • bannedregexpurllist - Very powerful. This is where you can put regular expressions to block certain URLs, e.g. "(.*q=.*xxx.*)" to stop searching for the word "xxx".

Whenever you add or remove a URL from the list, you must tell DansGuardian you have done so. This can be done with:

# dansguardian -r

Which will force DansGuardian to reload its configuration. Doing it this way, rather than restarting the daemon, will mean that for the most part your users will not notice reloading. (I have noticed that straight after the reload, users may have trouble accessing web pages for about 5-10 seconds - if this is a problem you can always run a cronjob at 12am to run dansguardian -r).

You can download a blacklist collection from URLBlacklist, but be sure to read the FAQ first, as you will, paradoxically, want to unsort the collection to enable DansGuardian to start faster. Optionally, you can find blacklists at Squidblacklist.org, Once you have installed a blacklist under /etc/dansguardian you can add them to your DansGuardian configuration by opening the appropriate configuration file and adding:

.Include</etc/dansguardian/blacklists/ads/domains>
.Include</etc/dansguardian/blacklists/drugs/domains>
.Include</etc/dansguardian/blacklists/porn/dg-porn
etc...

To the bottom of the file. Take a look around the blacklist collection to see what is available.

"Access Denied" template

If you wish to change the page that gets displayed to users when a website is blocked, you need to edit the file:

/usr/share/dansguardian/languages/LANGUAGE/template.html.

Web frontend

If you would like a web-based frontend to manage DansGuardian, you could use Webmin with the DansGuardian Webmin Module.

gollark: Historically technological advances have at least eventually replaced lost jobs (not that I think jobs created/lost is a good way to judge innovations) but I suppose you could argue that AI is different somehow. It definitely would be if AI stuff started being able to make more AI stuff, but you would probably run into bigger issues than high unemployment then.
gollark: It also seems unlikely that we would suddenly jump from the current situation where a bit of stuff is automated and quite a lot isn't to everyone being immediately unemployed, so you can notice and do stuff about it in the interval. Restructure the economy for post-material-scarcity or whatever. No idea how that would *work* but oh well.
gollark: If you can make robots/AI/whatever do any work you want easily, I'm sure you could make a few to produce food and whatever without problems.
gollark: Also, congratulations on successfully (so far) navigating the horrors of the UK university system.
gollark: Our culture has such a bizarre obsession with hard work.
This article is issued from Archlinux. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.