Data Stewards and the Conceptuality of Open Data

I attended an Open Data Week event about Data Stewards. I had heard about Open Data Week through the Pratt School of Information Google group, and since this was an event in Manhattan at a decent time, 6pm, on my free day, I answered the Eventbrite RSVP. I arrived on 21st St in the Flatiron District and my iPhone’s mail app began failing: it displayed the subject line of the RSVP but not the email, despite refreshing and full reception. Ironic for attending an information science event. However, about four other people were immediately chatty and introduced themselves when we realized we were all trying to go to the ninth floor in the wrong building. If not for them and their awareness of two locations associated with the event or its sponsors, I probably would have been lost.

The correct building was a stone’s throw down the block. I didn’t try to ask these fellow attendees numerous questions, but I did ask if they worked at the same place, as they clearly knew each other. They said yes and no, equivocally: probably yes at some point, no now, and yes of course to having similar interests in data. We all arrived at the correct floor in the correct building, I believe a WeWork space. My interpretation was that most of those present were some form of software engineer, some with nonprofits, and that they showed up partly out of self-interest in access to data for their projects, and in what the so-called ‘open data’ landscape looks like and is aspiring towards.

There were many free sandwiches and beers, which gave the event a specialized feel. I had sandwiches and a La Croix. In the ten minutes or so of chatting before the panel started I talked with a guy standing near me, slightly older, who said he had a software project he was in the process of onboarding to a friend’s private sector company. I asked a few elementary questions informed by my first two months at Pratt, like whether it was a database, whether it was in a cloud, and whether it used SQL. He said it was NoSQL. I’ve found even so far that there’s a consistency of ideas or themes once you start discussing projects in the data community.

Additionally, a survey was handed out at the beginning. From its wording it was intended to be filled out at the panel’s conclusion, though no one called for them and I didn’t see a bin to return them. I still have mine. It’s not specific to this Data Stewards event but to Open Data Week in general. One of its most telling response options, I thought, was the most advanced option for “What is your level of data expertise?”, which reads: “I am a data expert with no fears, who is happiest when given a messy dataset to wrangle.” In addition to the rapport of the group, this suggests to me that the event, or the week in general, is consciously for advanced information engineers.

The overall slant of the panel and attendees, I gathered, was about prying data from the private sector, and that those attending had projects which could use it. However, as it went on, many comments were made about the private sector as an efficient beast that is ready to sell and even compete with its data. Everyone there wanted ‘data collaboratives’ (private, nonprofit, government) to become more systematic and sustainable. They wanted more ‘piloting’ and prototyping and predicted a ‘reimagining of statistics in the 21st century.’ However, there were striking differences between the three sectors discussed, several of which were openly acknowledged.

The private sector had the most need to reflect on its biases, as its interests could change and such a company would typically also have a desire to ‘get its name out there.’ Sometimes it’s even tricky for them to get involved in a data agreement if it promises more of a long-term than a short-term profit. Cuebiq, a three-year-old startup for location intelligence from consumers, had a representative present named Brennan Lake, who spoke about its Data for Good program. The program uses opt-in smartphone app data to supplement natural disaster response, and he mentioned in particular a focus on giving the data directly to natural disaster professionals who can use it appropriately.

However, it was also acknowledged across the board that access to data can sometimes come before genuine solutions or use protocols. Rules and a contract repository were mentioned as desired. Estonia, by contrast, already has legislation for data sharing, and Denmark, from which a statistician was present, pulls its census results from admin data, employing two people. Nick Eng from LinkedIn also noted that using information they already have takes about two analysts, compared to an external project. Brennan from Cuebiq also spoke about ‘figuring out the ask’ as being a difficult part. Privacy as a topic requiring upfront attention and cost was also highlighted, in particular by Nick from LinkedIn. In these upfront negotiations Lake mentioned a ‘privacy by design paradigm,’ and Eng emphasized the cost of producing a sharing agreement that is ‘as hard as possible to abuse,’ while noting that this was the only way they were willing to enter sharing agreements.

I can think of several connections to design, and to identity and concept politics, from Foundations course readings. Talja & Hartel, in their look at user-centered data, favor a turn more to the audience or user in an effort to reflect more realistic demographics and situational contexts, and not just investigate how researchers are using a system and whether their ‘needs’ are met. This is similar to a turn to individual researchers, or so-called Stewards for private companies, in reflecting on the information they formulate and seek, and on their culture. It did feel like a tech culture to me at the event, although the most straightforward panelist, I thought, was Adrienne Schmoeker from the Mayor’s Office of Data Analytics, a new office employing about eight people. The Mayor’s Office has the advantage of being an ‘enterprising organization,’ she said, always mindful of serving the city’s 8.6M people. Nonprofits by contrast, more like government than private in this respect, can be much less efficient in contract production and may be just trying to keep the lights on in their offices. A private company, rather, may have more of a sense of ‘giving back’ for using city services and, frequently, census data.

It seems like in an imaginable future more companies and even individuals may seek data. Schmoeker from the mayor’s office anticipated eventually having an open help desk for data, but right now they address matters like STEM (Science, Technology, Engineering, Math) funding for schools, free lunches for kids, ambulance speeds, and tenant abuse. However, as she said earlier, “there’s no ideal dataset,” and just a live stream without history doesn’t highlight much that is useful. Another panelist echoed that if it’s less private, it’s more futile. This seems to invoke a more conceptual turn in use evaluation, in other words not just “task oriented” (Talja & Hartel, 2007) but turning to users with what to me seems like situational awareness and occasional cynicism.

Similarly, I can relate information needs or a burgeoning ‘outlook’ methodology to design needs and the idea of an axis that actually dishes out preference on multiple traits while representing one, as Costanza-Chock says in her piece on design justice. There are, it seems to me, mechanized intersectionalities, like looking more dryly at how people use a system or what biases are implicit in their needs (looking at private companies or individual researchers), versus conscious intersectionalities, on which Costanza-Chock mounts the identity of Black Feminism, like looking at how users have conceptualized or contextualized their information (needs). Some of this may include parsing hidden intersections.

To me it seems like there is an interest both in designers delineating information, in the “supply chain” (Sayers, 2018), as it were, and in allowing researchers and groups to self-pool data and identity that is increasingly, one would hope, less intersected by an axis that addresses that need only in a shadow. Costanza-Chock even references some particular community centers as sites of oppression and resistance.

Given the axes already in place, I agree that it depends on a turn from looking at systems to looking at biases in groups, and from that, changes in design to deconstruct shadow interests. It was clear that even this Open Data Week event existed in a particular culture. I think we are at an excess of intersections with everyone on the web, and there is a need, in myself at least, to locate earlier in timelines and parse interests that are disadvantageously melded. In my experience this has to do with looking and working before and after points of apparent significance. Data professionals are already looking for granularity of information, which Nick Eng from LinkedIn mentioned in preference to surveys. A move toward reflection and granularity in interpreting users (or researchers) is to me what seems most important, as there may be as much to deconstruct there as in a ‘system.’ A heightening of design theory may logically follow. One of the panelists also mentioned the MIT Media Lab, which encourages “anti-disciplinary research” and already tracks mobility data to gauge housing inequality in and around Boston. It was clear and refreshing, at any rate, that all attending seemed to be geared toward outside-the-box thinking, at least as perceived by me.

References

  • Talja, Sanna & Jenna Hartel. (2007). “Revisiting the user-centered turn in information science research: an intellectual history perspective,” Information Research 12(4).
  • Costanza-Chock, Sasha. (2018). “Design Justice: Towards an Intersectional Feminist Framework for Design Theory and Practice.” Proceedings of the Design Research Society 2018.
  • Sayers, Jentry. (2018). “Before You Make a Thing: Some Tips for Approaching Technology and Society.”
