Many malware families use domain generation algorithms (DGAs) to produce large numbers of pseudo-random domain names through which they connect to malicious command and control (C&C) servers. These domain names are used to evade domain-based security detection and mitigation controls such as firewall rules. One technique for detecting DGA domains is to reverse engineer malware samples to recover the DGA and its seed, and then use them to generate the list of domains. Malware activity can then be mitigated by preregistering and sinkholing these domains, or by applying mitigation controls such as publishing the domain names on security device blacklists. This process is time-consuming and can be easily circumvented by attackers and malware authors, so static detection and prevention methods are not efficient. Other approaches use statistical analysis to identify DGA domains over a time window; however, many of these techniques need contextual information that is not easily or feasibly obtained. Our goal was to detect DGA domains on a per-domain basis using the domain name only, with no additional information. This paper presents a DGA classifier that leverages long short-term memory (LSTM) networks to detect DGA domains without the need for contextual information or manually created features. A performance evaluation of our LSTM-based DGA classifier against various publicly available datasets is also provided.
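To make the two ideas in the abstract concrete, the following minimal Python sketch shows (a) how a seed and a date can deterministically yield a list of pseudo-random domain names, and (b) how a bare domain name can be turned into a sequence of integer character IDs, the kind of feature-free input a character-level LSTM would typically consume. It is an illustration only: `toy_dga`, `encode_domain`, the SHA-256 construction, the alphabet, and the maximum length of 32 are all assumptions made for this example and are not taken from any real malware family or from the classifier described in the paper.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".com") -> list[str]:
    """Hypothetical DGA: derive `count` domain names from a seed and a date."""
    domains = []
    for i in range(count):
        # The same seed, date, and index always produce the same digest.
        material = f"{seed}|{day.isoformat()}|{i}".encode("ascii")
        digest = hashlib.sha256(material).digest()
        # Map the first 12 digest bytes onto lowercase letters.
        label = "".join(chr(ord("a") + b % 26) for b in digest[:12])
        domains.append(label + tld)
    return domains


# Assumed character vocabulary; ID 0 is reserved for padding.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-."
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode_domain(domain: str, max_len: int = 32) -> list[int]:
    """Encode a domain name as a fixed-length list of integer character IDs.

    Characters outside ALPHABET are dropped; the result is truncated to
    `max_len` and right-padded with zeros.
    """
    ids = [CHAR_TO_ID[ch] for ch in domain.lower() if ch in CHAR_TO_ID][:max_len]
    return ids + [0] * (max_len - len(ids))


if __name__ == "__main__":
    for name in toy_dga("example-seed", date(2020, 12, 8)):
        print(name, encode_domain(name)[:8])
```

Because the output depends only on the seed and the date, anyone who recovers both can enumerate the same domains in advance, which is what makes preregistering and sinkholing possible. The integer sequences, by contrast, need nothing beyond the domain string itself.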

Authors: Haleh Shahzad, Abdul Rahman Satt, Janahan Skandaraniyam

Published in: World Congress on Internet Security (WorldCIS-2020)

  • Date of Conference: 8-10 December 2020
  • DOI: 10.20533/WorldCIS.2020.0005
  • ISBN: 978-1-913572-24-2
  • Conference Location: London, UK