The Grammarian’s dispute over the word Data

If you’ve taken part in discussions about data, you might have noticed the dispute over how the word itself should be used. The dispute seems to boil down to how one copes with foreign words adopted into English. In this case, data is a plural word in Latin (the singular form is datum).

“The data were easy to gather.” “That datum was easy to locate.”

In English, however, data usually follows the rules for a mass noun, so it may be paired with a singular verb or a plural verb depending on whether the data being referenced can be counted. To check, substitute another noun that is clearly uncountable or countable, depending on your use case (e.g., “information” or “facts”):

“The data was easy to gather.” -> “The information was easy to gather.”

“We put all the data in a single folder.” -> “We put all the facts in a single folder.”

So, on the whole, the English-speaking world has voted for the English use of the word data. For those of us who find this irritating, well… we get to suck it up.

Usage:

In Latin, data is the plural of datum and, historically and in specialized scientific fields, it is also treated as a plural in English, taking a plural verb, as in the data were collected and classified. In modern nonscientific use, however, it is generally not treated as a plural. Instead, it is treated as a mass noun, similar to a word like information, which takes a singular verb. Sentences such as data was collected over a number of years are now widely accepted in standard English.

(Simpson, J. A., and E. S. C. Weiner. “Data.” Def. Usage. Oxford Dictionaries. N.p., n.d. Web. <http://www.oxforddictionaries.com/us/definition/american_english/data>.)

 

History:

Mid 17th century (as a term in philosophy): from Latin, plural of datum. Originally recorded as a term in philosophy referring to ‘things assumed to be facts’, it is the Latin plural of datum ‘a piece of information’, literally ‘something given’. Although plural, data is often treated in English English as a singular meaning ‘information’, although Americans and Australians use ‘the data are…’. See also dice. In the Middle Ages letters could be headed with the Latin formula data (epistola)…‘(letter) given or delivered…’ at a certain day or place. From this comes date (Middle English) in the time sense. The date you eat is also Middle English but comes from Greek daktulos ‘finger’, because of the finger-like shape of the plant’s leaves.

(Simpson, J. A., and E. S. C. Weiner. “Data.” Def. Origin. Oxford Dictionaries. N.p., n.d. Web. <http://www.oxforddictionaries.com/us/definition/american_english/data>.)

The word Data is troubling on another front…

It is ironic that the word data is so vague because it always refers to something very specific.

First of all, the word data is useless without context. When someone suggests that you “get your data together,” this can mean very different things to people coming from different perspectives:

To a classic IT support person, “get your data together” typically refers to all of the information related to your account on a local computer. This includes the more obvious files, such as images, Microsoft Office files, and PDFs, but it also includes the less visible, account-related files, like the template and custom dictionary files in Microsoft Office™ or the bookmark files from each browser.

To a researcher, “get your data together” is likely to indicate only the numeric or experiment-related files from their work: the collection of information that is the object of their analyses. The rest of the materials they actively created are likely to be referred to as simply their “files.” The less visible or automatically created files associated with their account do not enter into this version of “get your data together.”

If we change the phrase slightly, to the more obvious but still ambiguous “get your data backed up,” only the context of “backup” gives us a clue that there might be more to the statement than just one’s research files. But my point remains: be sure to provide context when using the word data!