Open Government Data: The Book

By Joshua Tauberer. Second Edition: 2014.

The Later Memorandums

The Digital Government Strategy

At the start of the Obama Administration in 2009 there was a major effort to push open data as a means of improving transparency and participation in government, and it came from the very top. That effort was initially touted primarily as the Administration’s signature transparency achievement. There has been an unfortunate shift since the 2009 directive.

Transparency is no longer a top-line goal in the Administration’s open data programs. The leadership on transparency found in the early years has been replaced by deliberate obfuscation (from the Foreign Intelligence Surveillance Court’s secret case law to the lies to Congress about domestic surveillance1) and intimidation of journalists2 and whistle-blowers.

The 2012 Digital Government Strategy marked the shift in policy away from the principles of the Memorandum toward entrepreneurship. The economic recession of the preceding few years had shifted the national debate toward economic growth and job creation, and the Digital Government Strategy was certainly a part of that trend. In the document, data was framed as a tool for innovation (rather than transparency), with participation reframed within the world of mobile communications, and collaboration lost entirely. The Digital Government Strategy also elevated the profile of APIs, perhaps at the expense of bulk data.

The 2013 Executive Order and Open Data Memorandum

With an executive order3 and a new memorandum, the Memorandum on Open Data Policy—Managing Information as an Asset, in 2013, the focus on entrepreneurship remained at the forefront. Weather data and GPS signals were the examples of choice in the Open Data Memorandum. “Transparency” is mentioned only in references to the 2009 policies, but not as current policy goals.

The memorandum rightly returned focus to open data, as opposed to APIs. The Open Data Memorandum presented the most detailed definition to date of “open data” by the federal government. It included many of the principles mentioned in this book in its own definition of open data (including online, primary, timely, accessible, analyzable, non-discriminatory, non-proprietary, and with public review). In describing several of the principles it reused language verbatim from the original 8 Principles (which is great), including suggesting that data be “available to the widest range of users for the widest range of purposes” and the use of “multiple formats”. Its definition also states that open data has a presumption of openness, and elsewhere the Memorandum addressed public input and interagency coordination.4

The Memorandum also stated that information collection should be done in a way that supports information dissemination: “[A]gencies must design new information collection and creation efforts so that the information collected or created supports downstream interoperability between information systems and dissemination of information to the public.” This brought attention to a very real stumbling block for open data: even if an agency wants to make data available, it can be incredibly costly to turn the information as collected into a format suitable for dissemination. Building redaction, slicing, and exporting into the data collection process itself would reduce the cost of public access later on. The memorandum also asked agencies to create data catalogs that include datasets “that can be made publicly available but have not yet been released.”

But the memorandum was exceedingly unclear about what it intended to require of agencies with regard to licensing. It required the use of “open licenses,” even though most federal data is not subject to copyright protections and thus cannot be licensed. And while it required data to be made available with “no restrictions on copying, publishing, distributing, transmitting, adapting, or otherwise using the information,” it also suggested that agencies could require attribution. And in the year since the Memorandum was posted, agencies have continued to impose capricious requirements, including attribution, just as before.

Open licensing would be a step forward for government data produced by government contractors, which is subject to copyright protections, but “open licensing” still does not mean no restrictions. Typical open licenses in the private sector grant limited privileges to copy in exchange for restrictions. The most common restriction is the GPL’s virality clause, which permits copying only if any associated works are also licensed under the same GPL terms. This is what “open licensing” can mean, and I hope it is not what the White House intended.

And imagine if government agencies began to rely more on contractors to handle policy decision-making processes. Would the core materials of government become copyrighted? While there are some safeguards in place against contractors performing “inherently governmental” or core functions of government5, to my knowledge this distinction has not yet been tested as it relates to the public availability of government records. Open licensing is not sufficient for government data: it must be license-free.

For more background on licensing, see No Discrimination and License-Free.

We may be at the start of a significant policy shift in which users of government data become afraid to deviate from what the government data owner deems an acceptable use of the data. The fear would come from an implicit threat of a copyright infringement lawsuit, or of a civil or criminal lawsuit arising from a violation of a terms of service agreement.

And so I worry about how confused and short-sighted policies like the Memorandum create precedent for other jurisdictions. In 2014, Washington, DC’s mayor issued a Transparency, Open Government and Open Data Directive that drew from the White House’s memorandum. As in the White House memorandum, the DC directive asks agencies to use an “open license,” contrary to the principles of open government data. It also explicitly states that there will be “no restrictions” on use while simultaneously describing restrictions including a requirement to agree to an “indemnification” clause6, to attribute the District for the data, and to explain modifications to the data.

There is a strong American tradition, or at least a core American value, that the government does not get in the way of the dissemination of ideas and that the government does not put words in people’s mouths. We don’t always live up to that ideal, but we strive for it. Restrictions on what we can do with information about the government (e.g. attribution and explanation requirements), waivers of rights, and commitments to indemnify are all anathema to accountability, transparency, and respect for the public.

  1. Charles C. W. Cooke. June 11, 2013. Clapper’s Lie.

  2. RT. 2013. US Justice Department acknowledges wide-ranging surveillance of AP.

  3. Executive Order: Making Open and Machine Readable the New Default for Government Information

  4. While the definition of open data is quite strong, the definition is used just once in the whole memorandum. The memorandum does not mandate that government data be open data under its definition, at least as far as I could see. The only use of the open data definition is in its request for agencies to create roles for staff to ensure data released to the public are open. That is, staff should promote open data, but open data itself is not required. Although the definition itself is not used much in the memorandum, there are independent provisions that repeat some of the same principles: agencies must use “machine-readable and open formats,” existing standards, and metadata.

  5. Report of the Acquisition Advisory Panel to the Office of Federal Procurement Policy and the United States Congress. January 2007. Chapter 6: Appropriate Role of Contractors Supporting Government, page 393.

  6. Interestingly, federal government employees cannot agree to indemnification clauses. Any data or API restricted by an indemnification clause is therefore off-limits for use by (other parts of) the federal government.