clamav-users March 2008 archive

Re: [Clamav-users] A small survey about limits (Oversized.Zip and friends)

From: aCaB <acabng_at_nospam>
Date: Thu Mar 13 2008 - 16:54:58 GMT
To: ClamAV users ML <clamav-users@lists.clamav.net>


aCaB wrote:
> Hi list.
> I'm in the process of redesigning the logic of limits in ClamAV.
> The rewrite (scheduled for the upcoming 0.93) is aimed at solving, once
> for all, the annoyances related to config options like
> (clamd.conf-style): ArchiveMaxFileSize, ArchiveMaxRecursion,
> ArchiveMaxFiles and so on...

A follow up on this...

The code rework is complete and ready to be tested in 0.93-rc1. Although our regression results are very satisfactory (increased detection rate and slightly improved performance), we urge as many users as possible to test it and report back so that we can move on to 0.93-final.
Please grab it from
http://downloads.sourceforge.net/clamav/clamav-0.93rc1.tar.gz and send back your comments.

Now about the changes in the limits...
The new mechanism is considerably simpler.
For every request to scan a file passed to libclamav from its "clients" (e.g. clamscan, clamd, or 3rd party tools) we determine if the file is a container or not.
A container is a file which, upon processing, may generate more input files. The most typical example of a container is an archive, like a zip or a rar file, but many other file types can be containers too: a mailbox, an office document, an executable, etc. Every time libclamav processes an input file which is a container, more input files are generated and scanned. These newly generated "child" files can in turn be containers (e.g. a zip file containing an office document), so the process repeats and child containers are processed and scanned recursively.

It is clear that this process cannot be indefinitely long and, to avoid DoS conditions, an arbitrary end needs to be enforced. That's where limits kick in.

The new limits are based on a distinction between the main input file, i.e. the original file for which a scan request to libclamav is generated, and the "child" input files, which are generated by processing the main input file and its children.

To make it clear, let's say we have "archive.zip", which contains 2 files: "readme.txt" and "setup.exe". Here "readme.txt" is a plain text file and "setup.exe" is a RAR self-extracting archive which in turn contains "myapp.exe", a plain executable, and "license.txt", again a plain text file. When a scan request for "archive.zip" is generated (e.g. via clamscan archive.zip) we have:
- the main input file: archive.zip
- a child container: setup.exe
- some non-container children: readme.txt, myapp.exe, license.txt

The recursion levels during the processing are:
- Level 0: archive.zip
- Level 1: readme.txt and setup.exe
- Level 2: myapp.exe and license.txt
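
To illustrate the example above, here is a small Python sketch that models the file tree and derives the recursion levels. It is only a toy model: the (name, children) tuples and the walk() and levels() helpers are made up for this illustration and have nothing to do with the actual libclamav code.

```python
# Toy model of the example: each file is a (name, [children]) tuple.
# A file with children stands in for a container.
ARCHIVE = ("archive.zip", [
    ("readme.txt", []),
    ("setup.exe", [
        ("myapp.exe", []),
        ("license.txt", []),
    ]),
])


def walk(node, level=0):
    """Yield (level, name, is_container) depth-first, parents before children."""
    name, children = node
    yield level, name, bool(children)
    for child in children:
        yield from walk(child, level + 1)


def levels(node):
    """Group file names by recursion level (level 0 is the main input file)."""
    by_level = {}
    for level, name, _ in walk(node):
        by_level.setdefault(level, []).append(name)
    return by_level


if __name__ == "__main__":
    for level, names in sorted(levels(ARCHIVE).items()):
        print(f"Level {level}: {', '.join(names)}")
```

Walking the tree depth-first visits archive.zip, readme.txt, setup.exe, myapp.exe and license.txt, in that order.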

Now to the limits:
- MaxFileSize (--max-filesize in clamscan)
This applies to all input files: the main input file and any child input files, containers and non-containers alike. This option defines a file size; input files exceeding it will either be skipped entirely or scanned only up to this value, depending on the file type.

- MaxScanSize (--max-scansize in clamscan)
This option defines the overall amount of data to be scanned for a given main input file. Back to the example above, the scanning process goes like: archive.zip, readme.txt, setup.exe, myapp.exe and, finally, license.txt. Before scanning the main input file a special counter is set to zero; then, at each of these steps, the size of the current input file is added to the counter (it basically indicates how much data has been scanned so far). If the counter exceeds the value of MaxScanSize, one of these actions is performed (depending on the type of data being scanned): the current input file is skipped, the current input file is only scanned partially (up to the remaining size), or the processing is halted.

- MaxFiles (--max-files in clamscan)
This limits the number of files to be scanned for each main (container) input file. If the value is reached, processing is halted. In the example above, setting this limit to 3 would result in only "archive.zip", "readme.txt" and "setup.exe" being processed, leaving out "myapp.exe" and "license.txt".

- MaxRecursion (--max-recursion in clamscan)
This sets the maximum allowed recursion level to apply when processing containers. Level 0 is the main input file. Note: this has NOTHING to do with _directory_ recursion limits in clamscan and clamd.
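
The way the four limits interact can be sketched with a small Python simulation. This is a rough model, not the libclamav implementation. The file sizes are invented, and the policy is simplified: a file larger than MaxFileSize is always skipped, and exceeding MaxScanSize always halts processing, whereas the real behaviour depends on the file type as described above. The Limits class and the scan() function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    max_filesize: int   # bytes, per input file
    max_scansize: int   # bytes, per main input file
    max_files: int      # files, per main input file
    max_recursion: int  # deepest level allowed (main input file is level 0)


# (name, size_in_bytes, [children]); the sizes are made up for this sketch.
ARCHIVE = ("archive.zip", 900, [
    ("readme.txt", 100, []),
    ("setup.exe", 700, [
        ("myapp.exe", 1500, []),
        ("license.txt", 50, []),
    ]),
])


class Halt(Exception):
    """Raised to stop processing of the whole main input file."""


def scan(main, limits):
    """Return the names of the files this simplified model would scan."""
    scanned = []
    scanned_bytes = 0  # the "special counter", reset for each main input file

    def visit(node, level):
        nonlocal scanned_bytes
        name, size, children = node
        if level > limits.max_recursion:
            return                      # too deep: do not descend
        if len(scanned) >= limits.max_files:
            raise Halt                  # MaxFiles reached: halt
        if size > limits.max_filesize:
            return                      # simplification: always skip
        if scanned_bytes + size > limits.max_scansize:
            raise Halt                  # simplification: always halt
        scanned_bytes += size
        scanned.append(name)
        for child in children:
            visit(child, level + 1)

    try:
        visit(main, 0)
    except Halt:
        pass
    return scanned


if __name__ == "__main__":
    print(scan(ARCHIVE, Limits(10**6, 10**6, 3, 16)))
```

With MaxFiles set to 3 and the other limits set generously, the model scans only archive.zip, readme.txt and setup.exe, which matches the MaxFiles example above.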

Please note that the following config options are no longer supported:
- MailMaxRecursion - mail files are now treated as containers just like
zip files.
- ArchiveMaxFileSize, ArchiveMaxRecursion, ArchiveMaxFiles - archives
are treated as containers.
- ArchiveMaxCompressionRatio - the concept of Oversized.* has been
dropped completely.
- ArchiveBlockMax - the concept of Oversized.* has been dropped completely.

Default values:
- MaxFileSize - 25MB
Today, probably no malicious file is larger than 5-10MB; however, the overhead introduced by certain container files can heavily affect the final size. The default value should be suitable for most environments. Do not tweak this option by more than 20-25%.

- MaxScanSize - 100MB
Should be suitable for most environments. Paranoid desktop users with lots of archives may raise this value by up to 100% at the expense of a slight performance hit. On busy mail gateways this could be decreased by at most 30% to lower the load; however, some malicious mails could pass through unnoticed.

- MaxFiles - 10000
This option is really a fallback against DoS attempts, which is why it exists even though the default value is rather large. If you really want to touch it, be aware that archives containing a lot of small files (especially graphic files and small images) are not uncommon at all.

- MaxRecursion - 16
This is a very important option to prevent stack smashing in libclamav. The default value is safe for all the OSes we've tested. The only reason to increase it is if you experience malware within double-bounced emails passing through. If you really need to do this, you must understand the implications in terms of stack space. Alternatively, increase this value by only 1 at a time and extensively test the stability before increasing it again.
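
For reference, the arithmetic behind the suggested adjustments to the two size limits works out as follows. The sketch assumes that 1MB means 1024*1024 bytes and takes the upper end (25%) of the "20-25%" guidance for MaxFileSize. The helper names are invented for this example.

```python
MB = 1024 * 1024  # assumption: "MB" means 2**20 bytes

# Default values from the list above.
DEFAULTS = {
    "MaxFileSize": 25 * MB,
    "MaxScanSize": 100 * MB,
    "MaxFiles": 10000,
    "MaxRecursion": 16,
}

# (max decrease %, max increase %) relative to the default, as suggested above.
SUGGESTED_TWEAK = {
    "MaxFileSize": (25, 25),
    "MaxScanSize": (30, 100),
}


def suggested_range(option):
    """Return the (lowest, highest) suggested value in bytes for a size option."""
    default = DEFAULTS[option]
    down, up = SUGGESTED_TWEAK[option]
    # Integer arithmetic avoids floating point rounding surprises.
    return default * (100 - down) // 100, default * (100 + up) // 100


def within_suggestion(option, value):
    """True if value lies inside the suggested range for option."""
    low, high = suggested_range(option)
    return low <= value <= high


if __name__ == "__main__":
    for option in SUGGESTED_TWEAK:
        low, high = suggested_range(option)
        print(f"{option}: {low / MB:g}MB to {high / MB:g}MB")
```

Under these assumptions, MaxFileSize stays between 18.75MB and 31.25MB, and MaxScanSize between 70MB and 200MB.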
