If you’ve emailed a Microsoft Word (or Corel WordPerfect, for that matter) document to anyone, you may have unwittingly sent confidential information to a friend, a colleague, or even a competitor. You see, when you create and edit a document in these programs, the software creates bits and pieces of information and hides them within your document.
Anyone who chooses to reveal these bits and pieces, known as metadata, can discover who created, opened, read, or printed the document, what information was added or deleted, where the document was stored, and how long the work took on any particular date.
What is metadata?
Metadata, as defined in Beware the Dangers of Metadata, is “simply described as ‘data about data’. Think of it as a hidden level of extra information that is automatically created and embedded in a computer file.”
Some metadata is easily viewed (steps shown below). Other metadata is hidden and can be revealed by accident or with a binary file editor, and both are quite possible in any office.
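To give a sense of what a binary file editor exposes, here is a minimal Python sketch that pulls readable text runs out of a file's raw bytes. The function name `extract_strings` is hypothetical, and the sketch assumes that leftover text sits in the file as plain ASCII or UTF-16LE, which is an assumption about legacy binary files rather than a full parser for the Word format.

```python
from __future__ import annotations

import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII and UTF-16LE text runs found in raw bytes.

    Illustrative only: it scans for readable runs the way a hex editor's
    text pane would show them, and knows nothing about document structure.
    """
    # Runs of at least min_len printable one-byte characters.
    ascii_runs = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    # Runs of printable characters each followed by a zero byte (UTF-16LE).
    utf16_runs = re.findall(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len, data)
    found = [run.decode("ascii") for run in ascii_runs]
    found += [run.decode("utf-16-le") for run in utf16_runs]
    return found
```

Reading a file in binary mode and passing its bytes to this function would list any names, paths, or deleted phrases that happen to survive in readable form.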
Microsoft indicates that the following metadata is stored in Word, Excel, and PowerPoint files:
- User name and computer name
- Comments and tracked changes
- Hidden text, worksheets, data columns, and data rows
- Embedded objects such as Excel worksheets, drawing objects, and pictures
- PivotTable® cache
- Speaker notes
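For files saved in the Word 2007 .docx format, which is a ZIP package of XML parts, a few of the items in this list can be counted with a short script. The sketch below is an illustration under assumptions: the function name `scan_docx_for_hidden_content` is hypothetical, it looks only at the main document part and the comments part, and it counts every hidden-text marker without checking whether the marker has been switched off.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the parts of a .docx package.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def scan_docx_for_hidden_content(source):
    """Count tracked changes, hidden-text markers, and comments in a .docx.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        body = ET.fromstring(package.read("word/document.xml"))
        counts = {
            # w:ins and w:del wrap tracked insertions and deletions.
            "tracked_insertions": sum(1 for _ in body.iter(W + "ins")),
            "tracked_deletions": sum(1 for _ in body.iter(W + "del")),
            # w:vanish marks text formatted as hidden.
            "hidden_text_runs": sum(1 for _ in body.iter(W + "vanish")),
            "comments": 0,
        }
        # Comments live in their own part, which exists only when needed.
        if "word/comments.xml" in names:
            comments = ET.fromstring(package.read("word/comments.xml"))
            counts["comments"] = sum(1 for _ in comments.iter(W + "comment"))
    return counts
```

Any nonzero count is a cue to review the document before it leaves the office.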
Why does it matter to me?
All the information indicated above is great for productivity and is an important part of a technical communicator’s life. In fact, we embrace the ability to collaborate! Document management systems rely extensively on metadata, allowing users to find a relevant document based on who edited it, how it was distributed, keywords, and subject or matter information.
Metadata makes life easy, right? Well, MOSTLY.
As I was researching this article, I found multiple references to blunders made by individuals, governments, and even the United Nations, in which bank account numbers, assassin names, unattributed original authors, smoking guns in memos, and more were revealed. An article in the Washington Post has some good examples. I was particularly intrigued by the story of Tony Blair providing Colin Powell a document that had large portions plagiarized, grammar mistakes and all!
I’ve spoken with colleagues who had several months’ worth of documentation seized because a team member who was involved in litigation had opened a file once upon a time. The team lost hours of work and had some tense times making their deadlines.
What can your document’s metadata reveal?
Your document can reveal quite a bit about your work. When I was working on a presentation about metadata, I went fishing in my archives for an older document that would reveal sloppy document management. I opened a file that was used in a collaborative project when I was working on my Master’s degree eight years ago. I believe the original document was created in Word 2000, but I can’t be sure from a cursory review.
With a simple menu selection, I revealed the following information (Figure 1) about a document I created earlier this year.
Figure 1: Metadata information in Word 2007 (top) and Word 2003 (bottom).
So, what’s interesting about this? I created this document on February 27, 2008, but my metadata says it was created on August 1, 2007. Although this was a brand new document, I had started from an older file that already had my styles set. Instead of reflecting revision one, the metadata showed that this was the third time I had revised the document. Though I had actually worked on the document for about 10 minutes, I apparently had it open for 50 minutes at the time of the screen capture. The title of the document was even wrong!
For me, the scariest thing I found was on the Summary tab. It says that the company that created the document was Company X. I haven’t a clue about that company. To my knowledge/recollection, I’ve never worked for or collaborated with anyone in that company. When I did a Google search, I couldn’t find anything that seemed to fit Company X, nor did any representative with that company have any connection whatsoever to this document. But there it is…. Somehow this document descended from a document (from a document from a document) that was created by a classmate who probably worked for Company X EIGHT years ago!
Not only would this information be embarrassing if a client saw it, but it could also expose me to intellectual property issues if somebody chose to be litigious. Fortunately, a forensic review of the document’s metadata would reveal the truth. But it could be expensive.
It pays to be aware of what your document says about you and to make sure it reveals what you want it to reveal.
How do I reveal my document’s metadata?
It’s easy: with one click, you can reveal your document’s properties.
- Word 2003 or earlier: select File > Properties
- Word 2007: select Office Button > Prepare > Properties
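The same properties can also be read without opening Word when the file is in the Word 2007 .docx format. The following Python sketch is a rough illustration, not a complete reader: the function name `read_docx_properties` is hypothetical, and it assumes the properties sit in the package's `docProps/core.xml` and `docProps/app.xml` parts, where fields such as creator, revision, created, Company, and TotalTime are normally stored.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_docx_properties(source):
    """Return a flat dict of property name -> text from a .docx package.

    `source` may be a file path or a binary file-like object.
    """
    props = {}
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        for part in ("docProps/core.xml", "docProps/app.xml"):
            if part not in names:
                continue  # A package is not required to carry both parts.
            root = ET.fromstring(package.read(part))
            for child in root:
                # Tags look like "{namespace}creator"; keep the local name.
                local_name = child.tag.rsplit("}", 1)[-1]
                if child.text and child.text.strip():
                    props[local_name] = child.text.strip()
    return props
```

Printing the returned dictionary for a document gives a quick way to spot a stale creation date, an inflated revision count, or an unexpected company name.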
How can you protect your document’s metadata?
Many ways are available for ensuring that your personal or company data stays with you:
- Turn off Fast Save. This feature speeds up saving a document by saving only changes made to a document. However, text that you delete from a document may still remain.
- Remove personal information from a document when you save it.
  [Table: In Word 2002 and 2003 | In Word 2007]
- Turn off the Track Changes tool.
- Use third-party software to remove the information.
- Use a clean template/document each time.
- Save the document as an .rtf, .txt, or .pdf file.
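As an illustration of what a removal tool does, the Python sketch below copies a .docx package and blanks a handful of identifying properties along the way. Everything named here is an assumption for the sake of the example: the function `strip_docx_properties`, the choice of fields in `SENSITIVE_FIELDS`, and the `ep` prefix given to the extended-properties namespace. It has not been checked against Word itself, and it leaves comments, tracked changes, and hidden text untouched, so treat it as a sketch rather than a replacement for Word's own tools.

```python
import zipfile
import xml.etree.ElementTree as ET

# Register the usual prefixes so the rewritten XML keeps readable names.
for _prefix, _uri in {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
    "vt": "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}.items():
    ET.register_namespace(_prefix, _uri)

PROPERTY_PARTS = {"docProps/core.xml", "docProps/app.xml"}
# Hypothetical selection of fields that identify a person or organization.
SENSITIVE_FIELDS = frozenset(
    {"creator", "lastModifiedBy", "title", "Company", "Manager"}
)


def strip_docx_properties(source, destination, fields=SENSITIVE_FIELDS):
    """Copy a .docx package, blanking the named properties along the way."""
    with zipfile.ZipFile(source) as src, \
            zipfile.ZipFile(destination, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename in PROPERTY_PARTS:
                root = ET.fromstring(data)
                for child in root:
                    if child.tag.rsplit("}", 1)[-1] in fields:
                        child.text = ""  # Keep the element, drop its value.
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            # Every other part is copied through unchanged.
            dst.writestr(item.filename, data)
```

Writing to a separate destination keeps the original file intact, so the cleaned copy can be inspected before it is sent to anyone.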