What You Don’t Know About Metadata Can Hurt You

WDC Journal Edition: Spring 2007
By: Jeffrey S. Krause

Most lawyers would not knowingly disclose confidential information about their clients to other parties, yet many law firms unknowingly do this every time they send a Microsoft Word or Corel WordPerfect document via email. Most computer files contain “metadata” and in today’s world of document collaboration, email, and remote access, every lawyer should know something about it.

Many people tell me they have heard of metadata but are not really sure of what it is or whether they need to worry about it. Metadata is “data about data” and, yes, you need to worry about it. Examples of metadata include the file date and author information that Microsoft Windows stores and the undo history that tells WordPerfect what to reverse when you press Ctrl-Z. As you might guess, metadata is intended to be helpful. However, this beneficial aspect comes with a potential price – the trail that metadata creates. Most documents contain some metadata with information about the person who created the file, the firm where it was created, the computer it was saved to, and the printer it was printed to. While informative, most of this information is not all that damaging. In the worst case, however, this trail is capable of making a fairly comprehensive document history available to prying eyes.

Ironically, the more you know about your word processor, especially Microsoft Word, the more dangerous metadata becomes. Among the more damaging things that metadata tracks are hidden text, document revisions, previous authors, comments, and template information. In many law firms, it is common practice to open a document, revise it, and save it with a different name for a different client. Many of the revisions made during this process are tracked by metadata. If you use the track changes feature of Word, these prior revisions and comments are tracked with metadata and stay there until the changes are finalized. Consider the damage that could be caused if opposing counsel were to see changes made to a settlement offer after review by the client. As you can imagine, metadata has the potential to be a very big problem.

If you would like to view metadata from one of your Word documents, simply open it and select the File menu followed by Properties. The General tab contains information about when and where the document was saved while the Summary tab displays author and title information. The Statistics tab provides information on how long the document is and how much total time was spent on editing. That piece of metadata could be embarrassing. While there might be a perfectly good explanation, try justifying to a client the three-hour bill they just received for a document with a total editing time of 34 minutes! And, if the information contained in the File Properties wasn’t bad enough, other metadata is much worse. For example, neglecting to finalize a document with track changes turned on makes those changes easily accessible.
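For readers who want to see the same properties outside of Word, here is a minimal sketch in Python. It rests on assumptions that go beyond this article: it only handles the newer .docx format introduced with Word 2007 (a zip archive that keeps these properties in the parts docProps/core.xml and docProps/app.xml), not the older binary .doc format, and the function name read_docx_properties and the field labels are my own illustrative choices, not anything Word itself provides.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the two property parts of a .docx package.
CORE_NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
APP_NS = {
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

# Illustrative labels (hypothetical) mapped to the property elements.
CORE_FIELDS = {
    "title": "dc:title",
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}
APP_FIELDS = {
    "company": "ep:Company",
    "template": "ep:Template",
    "total_editing_minutes": "ep:TotalTime",  # stored in minutes
}


def _read_fields(archive, part_name, fields, namespaces):
    """Return the requested property values from one XML part, if present."""
    if part_name not in archive.namelist():
        return {}
    root = ET.fromstring(archive.read(part_name))
    found = {}
    for label, tag in fields.items():
        element = root.find(tag, namespaces)
        if element is not None and element.text:
            found[label] = element.text
    return found


def read_docx_properties(source):
    """Collect file-property metadata from a .docx path or file-like object."""
    with zipfile.ZipFile(source) as archive:
        props = _read_fields(archive, "docProps/core.xml", CORE_FIELDS, CORE_NS)
        props.update(_read_fields(archive, "docProps/app.xml", APP_FIELDS, APP_NS))
    return props
```

Calling read_docx_properties on a document (for example a hypothetical file named settlement_offer.docx) returns a dictionary of whatever properties the file contains, including the total editing time in minutes that the Statistics tab displays. Properties the file does not carry are simply left out.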

While it is true that metadata is a bigger problem in Microsoft Word, Corel WordPerfect documents are not immune. Almost every file has some metadata and much of the same file properties data contained in a Word document is present in WordPerfect documents. In addition, improperly using the Undo/Redo History feature in WordPerfect could make all of your changes visible if you save that history with the document. Other applications, such as Excel and PowerPoint, also write some metadata to the files they create. Even Adobe Acrobat, which I highly recommend as a way to reduce unwanted metadata, includes a small amount in the file properties. In other words, metadata is everywhere and you need to take steps to lessen your exposure to it.

If you email Word or WordPerfect files to clients or other attorneys, the first question to ask is why. The only time that you should be sending a Word file to a client or anyone else is if you need the other party to collaborate on the document and they need the raw file to do so. There is no other reason to send them the actual file. Sending the client a PDF file produced with Adobe Acrobat is a much better alternative. The PDF will carry over very little of the Word document’s metadata, and nearly every computer has a version of Adobe Acrobat Reader installed, so your client will almost certainly be able to read it. Furthermore, the PDF cannot easily be modified by the other party – which in most cases is a good thing.

Microsoft recognizes that metadata is an issue and released the Remove Hidden Data add-in for the XP and 2003 versions of Office. The add-in was available as a free download at Microsoft’s website. It will not remove every bit of metadata but it is better than nothing. Word 2007 includes the Inspect Document feature, which can review your documents for hidden text and other metadata and remove it if you choose. While it is a step in the right direction, this feature does not tell you what specific bits of metadata it is removing – only that it is removing them. Therefore, if you absolutely must send Word files via email, you should consider a metadata cleaning application. There are a number of these applications, including Metadata Assistant from Payne Consulting (www.payneconsulting.com). A metadata cleaner will automatically prompt you to analyze an outgoing document for metadata and offer to clean it, if necessary. An added benefit is the ability to view someone else’s metadata when documents are sent to you.
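To give a rough idea of what analyzing an outgoing document involves, the sketch below counts tracked changes, comments, and hidden text in a file. It is not how Metadata Assistant or any other commercial cleaner works, and it makes assumptions that go beyond this article: it only reads the .docx format introduced with Word 2007, where tracked insertions and deletions appear as w:ins and w:del elements, hidden text is marked with w:vanish, and comments live in word/comments.xml. The function name scan_docx_for_hidden_content and the report keys are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace, in the {uri}tag form ElementTree expects.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def scan_docx_for_hidden_content(source):
    """Count revision marks, comments, and hidden text in a .docx file.

    This only reports what it finds; it does not remove anything.
    """
    report = {
        "insertions": 0,
        "deletions": 0,
        "comments": 0,
        "hidden_text_runs": 0,
        "revision_authors": set(),
    }
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()

        if "word/document.xml" in names:
            body = ET.fromstring(archive.read("word/document.xml"))
            # Tracked insertions and deletions, plus who made them.
            for key, tag in (("insertions", "ins"), ("deletions", "del")):
                for element in body.iter(W + tag):
                    report[key] += 1
                    author = element.get(W + "author")
                    if author:
                        report["revision_authors"].add(author)
            # Hidden text: a vanish flag that is not explicitly switched off.
            for element in body.iter(W + "vanish"):
                if element.get(W + "val") not in ("false", "0"):
                    report["hidden_text_runs"] += 1

        if "word/comments.xml" in names:
            comments = ET.fromstring(archive.read("word/comments.xml"))
            report["comments"] = sum(1 for _ in comments.iter(W + "comment"))

    return report
```

A report with any non-zero count is a signal to finalize the changes, delete the comments, or convert the document to PDF before it leaves the office. The counts are approximate, since the same markup can also flag formatting-only revisions.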

Document collaboration is a fact of life, so data regarding changes and revisions is a necessary evil and has to be tracked somewhere. Therefore, you need to be concerned about metadata for the foreseeable future. Fortunately, because most people do not know about metadata, you have time to take proactive steps. For starters, everyone in your office needs to understand what metadata is and why it is dangerous. This may take the form of a memo or a short presentation but, as with many technology issues, don’t expect your users to gain an understanding of metadata without at least some training. Creating documents in a manner that includes as little metadata as possible is a good place to begin training. Converting documents to Adobe Acrobat prior to sending them via email will also decrease the amount of metadata leaving your office. Ultimately, even after taking these steps, you should install a metadata cleaner. The beneficial aspects of metadata mean that it is not going away any time soon and, eventually, more and more people are going to find out about it. Before that happens, you should do everything you can to minimize the threat metadata poses.