ClamAV Antivirus scanner for file uploads for Python applications

08 / Mar / 2024 by giridhar.bhimalli

Introduction

ClamAV is an open-source antivirus engine for detecting trojans, viruses, malware, and other malicious threats. Initially developed for email scanning on Unix-based systems such as Linux, it has evolved into a comprehensive antivirus solution for a variety of platforms, including Windows and macOS. Known for its reliability, ease of use, and frequent updates, ClamAV has become a popular choice for both individual users and organizations seeking effective protection against cyber threats.

clamd is a portable Python module that uses the ClamAV antivirus engine on Windows, Linux, macOS, and other platforms. It requires a running instance of the clamd daemon. The steps below show how to install ClamAV and how to use it from a Python application.

Technical implementation

Step 1: Open the terminal and install ClamAV locally. On macOS, use Homebrew: brew install clamav

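Step 2: On macOS, go to the Homebrew configuration directory with cd /opt/homebrew/etc/clamav/ (the prefix on Apple silicon; Intel Macs typically use /usr/local instead), which contains two sample files: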

  1. clamd.conf.sample
  2. freshclam.conf.sample

Step 3: Run “cp freshclam.conf.sample freshclam.conf” to copy the sample, then open the new file and comment out the Example line (Example -> #Example).

Step 4: In the terminal, update the ClamAV virus database using “freshclam -v”.

Step 5: Run “cp clamd.conf.sample clamd.conf” to copy the sample, then open the new file and apply the changes below:

  1. Comment out Example -> #Example
  2. Uncomment TCPSocket
  3. Uncomment TCPAddr
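
The three edits in Step 5 can also be applied programmatically. The following is a minimal sketch, not part of ClamAV itself: the function name enable_tcp_listener is hypothetical, and it assumes the sample file contains a bare Example line plus commented directives of the form #TCPSocket 3310 and #TCPAddr localhost.

```python
import re


def enable_tcp_listener(conf_text: str) -> str:
    """Apply the Step 5 edits to the text of a clamd.conf file (sketch)."""
    edited = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped == "Example":
            # The bare "Example" line must be commented out before use.
            line = "#Example"
        elif re.fullmatch(r"#(TCPSocket|TCPAddr)\s+\S+", stripped):
            # Drop the leading "#" to activate the directive.
            # Assumption: directives look like "#TCPSocket 3310".
            line = stripped[1:]
        edited.append(line)
    return "\n".join(edited) + "\n"
```

To use it, read clamd.conf, pass its text through the function, and write the result back. Editing the file by hand as described above works just as well.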

Step 6: ClamAV is now installed and configured. To run it locally, start the daemon with clamd --foreground. This runs the ClamAV service on

  • HOST = localhost 
  • PORT = 3310
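
Before wiring the daemon into an application, it can help to confirm that it is actually listening. The sketch below uses only the standard library and clamd's PING command, which the daemon answers with PONG. The function name clamd_is_up and the default timeout are assumptions made for illustration.

```python
import socket


def clamd_is_up(host: str = "127.0.0.1", port: int = 3310,
                timeout: float = 5.0) -> bool:
    """Return True if a clamd daemon answers PING with PONG on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # The "z" prefix asks for NUL-terminated command and reply.
            sock.sendall(b"zPING\0")
            reply = sock.recv(16)
    except OSError:
        # Connection refused, timeout, and similar network failures.
        return False
    return reply.rstrip(b"\0") == b"PONG"
```

Once the clamd Python package from Step 7 is installed, its ClamdNetworkSocket.ping() method should serve the same purpose.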

Step 7: In your application's environment, install the clamd Python package using the command below.

pip install clamd==1.0.2

Step 8: Once installed, import the package in your app as shown below to scan the files in a specific path.

import os

import clamd


def scan_file_using_file_path():
    # Connect to the clamd daemon started in Step 6.
    cd = clamd.ClamdNetworkSocket(host='127.0.0.1', port=3310, timeout=100)
    cwd = os.getcwd()
    for file in os.listdir("files"):
        # clamd opens the file itself, so pass an absolute path it can read.
        path = os.path.join(cwd, "files", file)
        res = cd.scan(file=path)
        print(res)


scan_file_using_file_path()

Results:
{'/Users/PycharmProjects/ClamAVAntivirus/files/test_xls.xlsx': ('OK', None)}
{'/Users//PycharmProjects/ClamAVAntivirus/files/users.txt': ('OK', None)}
{'/Users/PycharmProjects/ClamAVAntivirus/files/testfile.txt': ('FOUND', 'Win.Test.EICAR_HDB-1')}
{'/Users/PycharmProjects/ClamAVAntivirus/files/test_pdf.pdf': ('OK', None)}
{'/Users/PycharmProjects/ClamAVAntivirus/files/image.jpg': ('OK', None)}

Process finished with exit code 0
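
Each result printed above is a dictionary that maps a file path to a (status, signature) tuple. As a rough sketch, the flagged files could be collected like this. The helper name infected_files is hypothetical, and the fallback label "unknown" is an assumption for the case where no signature name is returned.

```python
from typing import Dict, List, Optional, Tuple

# Shape of one result as printed above: {path: (status, signature)}.
ScanResult = Dict[str, Tuple[str, Optional[str]]]


def infected_files(results: List[ScanResult]) -> Dict[str, str]:
    """Map each flagged path to the signature name that clamd reported."""
    flagged = {}
    for result in results:
        for path, (status, signature) in result.items():
            if status == "FOUND":
                flagged[path] = signature or "unknown"
    return flagged
```

Collecting the return values of cd.scan() in a list and passing that list to the helper gives one dictionary that contains only the files needing attention.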

After scanning through the files, it will report if any malicious item is found. Below is an example to scan a file if the input is in the form of a stream.

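from io import BytesIO

# antivirus_scanner and the exception classes used here are defined in the
# "Class to connect to ClamAV service" section below.


def scan_file_using_byte_stream():
    with open("files/testfile.txt", "rb") as fh:
        # Load the file into memory so it can be sent to clamd as a stream.
        buf = BytesIO(fh.read())
        try:
            res = antivirus_scanner.scan_stream(stream=buf)
        except (AntivirusScannerException, MaliciousContentException) as e:
            print(e)
        else:
            if res == "OK":
                print("File scanned successfully, no potential malware found")
            else:
                print("Something went wrong!!!")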

scan_file_using_byte_stream()

Results:

***** Malicious content found in file scanning, reason: Win.Test.EICAR_HDB-1 *****

Process finished with exit code 0

Class to connect to the ClamAV service

from io import BytesIO
from typing import Optional, Tuple
import clamd


class AntivirusScannerException(Exception):
    pass


class MaliciousContentException(AntivirusScannerException):
    def __init__(self, reason: str):
        super().__init__(f"Malicious content found in file scanning, reason: {reason}")


class InvalidScanResultStatus(AntivirusScannerException):
    def __init__(self, status: str):
        super().__init__(f"Undefined status returned from antivirus scanner: {status}")


class ClamAvScanner:
    _CONNECTION_TIMEOUT = 60
    _DEFAULT_PORT = 3310
    _MALICIOUS_STATUSES = frozenset(("ERROR", "FOUND"))

    def __init__(self, hostname: str, port: Optional[int]):
        self._hostname = hostname
        self._port = port or self._DEFAULT_PORT

    def scan_stream(self, stream: BytesIO):
        connection = self._init_connection()
        try:
            output = connection.instream(stream)
        except (clamd.BufferTooLongError, clamd.ConnectionError) as exc:
            raise AntivirusScannerException(
                "Unable to scan stream due to internal issues"
            ) from exc

        return self._output(output["stream"])

    def _init_connection(self) -> clamd.ClamdNetworkSocket:
        return clamd.ClamdNetworkSocket(
            host=self._hostname,
            port=self._port,
            timeout=self._CONNECTION_TIMEOUT,
        )

    def _output(self, output: Tuple[str, Optional[str]]) -> str:
        # clamd returns (status, reason); reason is None for clean files.
        status, reason = output
        if status in self._MALICIOUS_STATUSES:
            raise MaliciousContentException(reason)
        if status != "OK":
            raise InvalidScanResultStatus(status)
        return "OK"


antivirus_scanner = ClamAvScanner(hostname="127.0.0.1", port=3310)
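
For context on what connection.instream(stream) sends over the wire: as far as the clamd documentation describes it, the INSTREAM command is followed by chunks that are each prefixed with a 4-byte big-endian length, and a zero-length chunk ends the stream. The sketch below only builds those frames so the format can be inspected. The function name instream_frames and the 1024-byte chunk size are assumptions for illustration, not the package's internals.

```python
import struct
from io import BytesIO
from typing import BinaryIO, Iterator


def instream_frames(stream: BinaryIO, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield the byte frames of a clamd INSTREAM request for the stream."""
    # NUL-terminated command that opens the stream.
    yield b"zINSTREAM\0"
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Each chunk: 4-byte unsigned length in network byte order, then data.
        yield struct.pack("!L", len(chunk)) + chunk
    # A zero-length chunk tells clamd the stream is complete.
    yield struct.pack("!L", 0)
```

If the total size exceeds the daemon's configured stream limit, clamd rejects the stream, which appears to be the case the BufferTooLongError handler in scan_stream covers.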

To determine which virus ClamAV detected in a file, check the scan result: for a FOUND status it includes the name of the matching signature (for example, Win.Test.EICAR_HDB-1 in the results above) alongside the affected file. You can then search for that signature name online to learn more about the threat's characteristics and potential impact.

Links:

  • https://pypi.org/project/clamd/

Some best practices:

  • Filter uploads so that only the required file formats are allowed (e.g., .pdf, .jpeg, .xlsx).
  • Scan files for potential malware before storing them.
  • Limit the maximum file size.
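
The first and last practices can be enforced before a file ever reaches the scanner. The following is a minimal sketch under stated assumptions: the name precheck_upload, the extension allowlist, and the 10 MiB limit are illustrative choices and should be adapted to the application.

```python
from pathlib import Path

# Assumed allowlist and size limit; adjust both to your requirements.
ALLOWED_EXTENSIONS = frozenset({".pdf", ".jpeg", ".xlsx"})
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def precheck_upload(filename: str, data: bytes) -> None:
    """Raise ValueError if the upload fails the extension or size check."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"File type not allowed: {extension or '(none)'}")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError(f"File exceeds the {MAX_UPLOAD_BYTES}-byte limit")
```

An extension check alone is easy to bypass by renaming a file, so it complements the malware scan and does not replace it.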

Conclusion

In conclusion, ClamAV continues to be a trusted antivirus solution, providing essential protection against a wide range of malicious threats across multiple platforms. With its open-source nature, frequent updates, and robust detection capabilities, ClamAV remains a valuable tool in the fight against malware. Whether you’re an individual user or a large organization, integrating ClamAV into your cybersecurity strategy can help safeguard your systems and data from the ever-evolving landscape of cyber threats.
