This study aims to determine whether privacy-preserving data mining methods can be effectively applied to data mining for a social networking service (SNS). Applying privacy-preserving data mining to the personal information collected by an SNS makes it possible to provide secure personalized services to its users. We consider privacy-preserving data mining based on the anonymization approach, in which all input information is anonymized while data mining is performed. We examine whether this approach can be applied to data that can only be partially anonymized, such as SNS data, and how many users can still be identified after anonymization. In our experiments, we anonymized some attributes of a dataset and then counted how often the values of the anonymized attributes could be identified from the values of the non-anonymized attributes. We define the security level of an anonymization approach as the ratio of anonymized data to all data. The results show that the anonymization approach can be applied to data that can be partially anonymized, such as SNS data. They also show that removing a small number of attributes as identifiers should be considered, because their values can still be narrowed down even when they are anonymized as quasi-identifiers.
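The counting experiment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy records, and the attribute names are hypothetical. It also assumes one possible reading of the abstract, in which a record counts as "identified" when its combination of non-anonymized values is unique in the dataset, and the security level is the fraction of records that are not identified in this way.

```python
from collections import Counter


def count_identifiable(records, visible_attrs):
    """Count records whose combination of visible (non-anonymized)
    attribute values is unique in the dataset.

    Assumption: "identified" means the visible values single out one record.
    """
    keys = [tuple(r[a] for a in visible_attrs) for r in records]
    freq = Counter(keys)
    return sum(1 for k in keys if freq[k] == 1)


def security_level(records, visible_attrs):
    """Fraction of records that cannot be singled out (assumed definition)."""
    if not records:
        return 1.0
    return 1 - count_identifiable(records, visible_attrs) / len(records)


# Hypothetical toy data: "hobby" is treated as the anonymized attribute,
# while "age_band" and "region" remain visible.
records = [
    {"age_band": "20s", "region": "north", "hobby": "music"},
    {"age_band": "20s", "region": "north", "hobby": "chess"},
    {"age_band": "30s", "region": "south", "hobby": "music"},
    {"age_band": "30s", "region": "north", "hobby": "golf"},
]
visible = ["age_band", "region"]

identifiable = count_identifiable(records, visible)
level = security_level(records, visible)
print(identifiable, level)
```

In this toy dataset the two records in the "30s" band each have a unique combination of visible values, so leaving both attributes visible singles them out, whereas leaving only `age_band` visible singles out none. This mirrors the abstract's observation that a small number of attributes can be enough to narrow records down.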

Authors: Ayahiko Niimi, Takahiro Arakawa

Published in: World Congress on Internet Security (WorldCIS-2020)

  • Date of Conference: 8-10 December 2020
  • DOI: 10.20533/WorldCIS.2020.0006
  • ISBN: 978-1-913572-24-2
  • Conference Location: London, UK