Installing and Using Find_SSNs on Linux and Solaris

This article is intended for faculty and staff who may have certain types of personal information (PI) stored on a computer running the Linux or Solaris operating system.

Overview:



Find_SSNs is a Python program written at Virginia Tech that searches a computer's files for Social Security numbers and credit card numbers. It requires Python 2.4 or later to run. By default, Find_SSNs searches the following file types: doc, docx, xlsx, xls, rtf, zip, text files (e.g. html, xml, txt), and Open Office 2 documents. It can additionally search PDF files when the pdftotext binary, which is part of the poppler package, is installed. We provide two versions of Find_SSNs: one that searches PDFs and one that does not (in case you can't install the poppler package). The instructions below include the steps needed to install the poppler package.
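The two prerequisites described above (a Python interpreter at 2.4 or later, and optionally the pdftotext binary) can be checked before downloading anything. The following is a minimal sketch and is not part of Find_SSNs; the function names find_on_path and check_prerequisites are hypothetical and chosen only for this illustration.

```python
import os
import sys


def find_on_path(name):
    """Return the full path of an executable found on PATH, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def check_prerequisites(min_version=(2, 4)):
    """Report whether the interpreter and PATH meet the stated requirements."""
    return {
        # This article states that Find_SSNs needs Python 2.4 or later.
        "python_ok": sys.version_info[:2] >= min_version,
        # pdftotext (from poppler) is only needed for PDF scanning.
        "pdftotext_path": find_on_path("pdftotext"),
    }


if __name__ == "__main__":
    status = check_prerequisites()
    print("Python version OK: %s" % status["python_ok"])
    print("pdftotext: %s" % (status["pdftotext_path"] or "not found (PDF scanning needs it)"))
```

If pdftotext is reported as not found, either install the poppler package as described under Installation or use the version of Find_SSNs that does not search PDFs.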

The Find_SSNs software webpage at Virginia Polytechnic Institute is located here: http://security.vt.edu/resources_and_information/find_ssns.html

The full Find_SSNs documentation at Virginia Polytechnic Institute is located here: http://security.vt.edu/Find_SSNs/find_ssns_referance_manual.html

 
Installation:



Linux:

Note: While these install steps should work on any modern Linux distribution, we have only verified them on RHEL5, RHEL6, and Ubuntu 11.04.


The requirements to run Find_SSNs are:

Note: Default installs of RHEL and Ubuntu include Python.


On RHEL5, install poppler-utils with this command:

yum install poppler-utils

On Ubuntu, install poppler-utils with this command:

apt-get install poppler-utils

Note: If you can't install poppler-utils to scan PDF files, you can download a copy of Find_SSNs with PDF searching turned off: http://www.hawaii.edu/its/docs/find_ssns_nopdf.tar

Solaris 10:

Requirements:


Scanning:



Scanning your filesystem(s) for files that contain SSNs or credit card numbers works the same way on all Unix/Linux systems.

Note: Find_SSNs uses a few innovative methods to reduce false positives (if you're interested, see its webpage: http://security.vt.edu/resources_and_information/find_ssns.html), but it *will* still report some false positives when it scans your computer.
         We've found that the best way to reduce the number of false positives is to scan only locations on the server that could hold PII, for example /home or /fileshare.
         The Find_SSNs packages include, in a directory named "default_false_positives", the false positives that Find_SSNs reports on a full scan of a default install of RHEL5 and Solaris 10.
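The advice in the note above (scan only the locations that could hold PII) can be scripted so that one command is prepared per directory. The following is a minimal sketch, assuming the same options used for the whole-computer scan in this article (-p, -o, -t csv, -a) behave the same way for a subdirectory; build_scan_commands is a hypothetical helper, not part of Find_SSNs, and it only builds the command lines without running them.

```python
import os


def build_scan_commands(scan_dirs, output_dir, script="Find_SSNs.pyw"):
    """Return one Find_SSNs argument list per existing directory in scan_dirs."""
    commands = []
    for directory in scan_dirs:
        if not os.path.isdir(directory):
            continue  # skip locations that do not exist on this host
        commands.append(
            ["python", script, "-p", directory, "-o", output_dir, "-t", "csv", "-a"]
        )
    return commands


if __name__ == "__main__":
    # /home and /fileshare are the example locations named in the note above.
    for argv in build_scan_commands(["/home", "/fileshare"], "/root/find_ssns/"):
        print(" ".join(argv))
```

This article does not say whether successive runs overwrite earlier output in the same output directory, so check that before reusing one output directory for several scans.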

The steps that are required for Find_SSNs to successfully run:


Note: For the full documentation on Find_SSNs, please refer to the Find_SSNs official documentation, located here: http://security.vt.edu/Find_SSNs/find_ssns_referance_manual.html.

A basic usage scenario for Find_SSNs:

To scan your whole computer for SSNs and credit card numbers, use this command:

python Find_SSNs.pyw -p / -o /root/find_ssns/ -t csv -a

After you have reviewed the two output files, securely delete them from the computer.
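One way to approach secure deletion is to overwrite a file's contents before removing it. The following is a best-effort sketch only, and overwrite_and_delete is a hypothetical helper, not something Find_SSNs provides. It assumes Python 2.6 or later and a filesystem that overwrites data in place; on SSDs and on journaling or copy-on-write filesystems, old copies of the data may survive, so a dedicated tool such as GNU shred or your organization's approved procedure may be a better fit.

```python
import os
import tempfile


def overwrite_and_delete(path, passes=1):
    """Best effort: overwrite the file with random bytes, then remove it."""
    length = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = length
            while remaining > 0:
                chunk = min(remaining, 1024 * 1024)
                handle.write(os.urandom(chunk))
                remaining -= chunk
            # Push the overwritten bytes out to the storage device.
            handle.flush()
            os.fsync(handle.fileno())
    os.remove(path)


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than on real scan output.
    fd, demo_path = tempfile.mkstemp(suffix=".csv")
    os.write(fd, b"example,output\n")
    os.close(fd)
    overwrite_and_delete(demo_path)
    print("removed: %s" % (not os.path.exists(demo_path)))
```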

Article ID: 1323
Created: Sun, 11 Sep 2011 4:14pm
Modified: Fri, 26 Oct 2012 10:45am