Here are my free-form, unedited notes from the LBJ School panel at the event:
Research Class from LBJ School: Finding the Needle in the Haystack
Barack Obama’s commitment to transparency through technology. (Quotes Barack.)
“Information is only transparent if citizens can find what they are looking for.”
Roadmap to transparency.
LBB site – early adopter of ‘transparency’.
Hundreds of documents, can take a lot of time to find what you’re looking for. How clear is the taxonomy?
LBJ project IA suggestions. Suggested improvements include a cleaner interface and different, better classifications.
Suggestions are designed to bridge gap between government and citizens, to give people a way to understand what govt is doing and how it affects their life.
Who’s looking and what are they looking for?
Surveyed and were surprised to learn how few would use the data for advocacy and to form public opinion. The format of the data is just the first step. Doesn’t matter how well it’s formatted if it’s not supporting advocacy and transparency.
“Access to data is seldom the problem; locating the needle in the haystack and then understanding what you have found are the real challenges.” – from one of the respondents.
Rethinking what journalism is in this age. Journalists are the ones who will make use of the data most readily.
If you provide data that only the most sophisticated can use, you perpetuate the digital divide.
Important to make the data visual and consumable. Have legislators explain to their constituents what this means to them.
Can’t expect govt agencies that make the information available to have the same agenda as those who are consuming it – make it available with a clear API, essentially.
User interface is important, but you have to look at the supply chain for information. How do you do discovery to find the data you need? Once you have discovery, what is the interface? Big economic and time issue to get the data out.
If you can describe the database properties, you can come out with an FOIA response.
Discoverable properties, database properties, analysis properties, trust. We’re not interested in it if we can’t trust it.
Data.gov. Are they looking at centralizing all the data in one location? Data.gov leads best practices and open data standards.
Thought about the idea of putting it in one place, but was a little beyond the scope of the research project. It’s hard to talk about aggregating many different types of data for many different divisions into one place. E.g. if you tried to create a website for data from all agencies of a state, you would have “a lot of apples and oranges and mangoes.”
What is the structure of information delivery organizations in the future? ProPublica’s award-winning project around recovery data. Rise of data management journalism organizations that act as wholesalers.
“We’re inviting all of you right now to help us.” Project extends through May.
Comment: Identifying what are the most useful information sources, going with the highest value target is very important. The hard part is understanding what the information is at the start.
Do you think that we’re starting at step 2, then? Is this a chicken/egg question?
GIGO problem at the interpretive level has no ready answer.
Carolyn: concern about the privacy of individuals in all the government data that is held about them. In the 20th century, Justice Brandeis observed that having records on paper resulted in practical obscurity. With digital data we have a greater consideration of potential privacy issues.
What you do produce can be curated.
Crowdsourcing with geodata – time and data. A guy on a bicycle who maps East London every weekend. Once you have the geocodes there’s an explosion.
In some sense it depends how many eyes you can put on it. Guardian UK: MP expenses scandal. PDF docs were scanned in, not text-recognized, but put online as PDFs. Crowdsourced views and gave people a way to identify and annotate where something didn’t look right, and this was how they dug out the information. Depended on having a big critical mass of people to crowdsource both effectively and quickly.
From the audience: “Politics in general is a form of crowdsourcing.”
Wikis and collective ideas for crowdsourcing. Do you think it would be useful to test this idea out on a local level, get input, a collective idea? What is the data people are interested in looking for, creating a dialogue. How would this impact the work you are doing?
You could not pick a better idea than what was talked about – measuring the success of education. People care about kids, and people are not happy.
Are there unintended consequences? Benchmarking utility data could be useful information. But that could also be a component of “top ten homes to rob.” If we don’t look for these unintended consequences and adjust for them, there’s the potential for some really bad things to happen. Looking at how to manage unintended consequences would be a great thing.
How and where do you focus your energy in the marketplace? Outcome-driven innovation. Center for Social Innovation in Austin predictive. Build an ontology or framework around an issue you’re trying to solve. We could study all the jobs on local and govt level and leverage crowdsourcing to get a report card on work and identify unmet needs.
Our project is in some ways a guerrilla movement to circumvent the govt and offer ways to get started.