Do you know what files you or your company are making visible on the Internet?

Every so often there is a news report of a company accidentally making private or sensitive information available on the Internet. One of our local banks had a spreadsheet with names and bank account numbers visible for about 48 hours on their website last year…

This is one example of information leakage.

For instance, you wouldn’t want a spreadsheet showing employees’ salaries and proposed increases to be available, would you?

Likewise, you would not want to find out that plans for a new product line can be found with a simple Google search.

You should periodically review what you make publicly available. There are two basic ways to do this – manual review and search.

  • Manual review – The manual review process simply requires that you take the time to go through the contents of every directory that is made visible on the Internet. This is most practical for simple websites. It can become very time-consuming if there are hundreds of files and directories that need to be reviewed. Keep in mind that this can also be scripted.
  • Search – A second way is to use Google to search for file types and phrases that would be indicative of leakage. You should check your website for word processing or spreadsheet documents that should not be there. If you use specialized software for payroll, you could check for the visibility of those documents. Just about everything you do that you don’t want the world to know about ought to be checked.
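Since the manual review can be scripted, here is a minimal sketch of what such a script might look like in Python. It walks a directory tree and lists files whose extension or name looks sensitive. The extension list, the keyword list, and the function name find_suspect_files are my own assumptions for illustration; adjust them to whatever counts as sensitive in your environment.

```python
import os

# Assumed defaults for illustration only; change these to suit your site.
SUSPECT_EXTENSIONS = {".xls", ".xlsx", ".doc", ".docx", ".pdf", ".csv"}
SUSPECT_KEYWORDS = ("payroll", "salary")


def find_suspect_files(web_root, extensions=SUSPECT_EXTENSIONS,
                       keywords=SUSPECT_KEYWORDS):
    """Return sorted paths (relative to web_root) that deserve a closer look."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(web_root):
        for name in filenames:
            lower = name.lower()
            extension = os.path.splitext(lower)[1]
            # Flag the file if its extension or its name looks sensitive.
            if extension in extensions or any(k in lower for k in keywords):
                full_path = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full_path, web_root))
    return sorted(hits)
```

You would point find_suspect_files at the directory your web server publishes and read through what it returns. It only looks at file names, not file contents, so treat the result as a starting list for review rather than a complete audit.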

Here are some examples of using Google to search your site, assuming it is named www.xyz.com…

  • Search for all PDF files on your site
    • site:xyz.com filetype:pdf
  • Search for all files containing the word “payroll”
    • site:xyz.com payroll
  • Search for all Excel spreadsheet files visible on your site
    • site:xyz.com filetype:xls
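If you have many file types or phrases to check, it may help to generate the queries rather than type each one. The following is a small sketch in Python that builds the same query strings shown above and turns them into search links you can open in a browser. The function names build_site_query and search_url are hypothetical, and the script does not run the searches itself.

```python
from urllib.parse import urlencode


def build_site_query(domain, filetype=None, phrase=None):
    """Build a query string such as 'site:xyz.com filetype:pdf'."""
    parts = [f"site:{domain}"]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if phrase:
        parts.append(phrase)
    return " ".join(parts)


def search_url(query):
    """Turn a query string into a Google search link for manual review."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Print one link per file type worth checking (xyz.com is the example site).
for file_type in ("pdf", "xls", "doc"):
    print(search_url(build_site_query("xyz.com", filetype=file_type)))
```

Opening each printed link in your browser gives the same results as typing the query by hand.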

You should also familiarize yourself with the Advanced Search dialog from the main Google search page.

Notice that these two ways of reviewing for information leakage are complementary. The manual review is performed on the actual server and source of the files, whereas the search is done from the Internet perspective. Both should be done.

Give that a try and let me know if you find anything that is a surprise!

- Dan
