QUESTION
How accurate is 311 data in reflecting citywide issues?
0:47:16
·
3 min
Martha Norrick discusses the complexities and limitations affecting the accuracy of 311 data in New York City.
- Martha Norrick highlights the importance of understanding the data generation process for 311, mentioning that reporting rates vary among individuals.
- She explains reasons for uneven reporting, including distrust in government, reporting fatigue, and differing priorities in different neighborhoods.
- Norrick doesn't see these as issues with the data itself but rather with how the data is generated.
- She gives an example of cross-referencing 311 data with Department of Health rat indexing inspections to validate complaints, showing the potential for 'ground truthing' 311 data.
- Throughout the discussion, she emphasizes the value of 311 data, despite its limitations, for citywide analysis and decision-making.
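The ground-truthing comparison summarized above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Office of Data Analytics' actual method: the function name `reporting_ratio`, the input shapes, and the sample numbers are all hypothetical. It assumes one record per 311 rat complaint and one record per address inspected, and computes how many complaints a neighborhood generates per address where inspectors actually found rats.

```python
from collections import Counter


def reporting_ratio(complaints, inspections):
    """Compare 311 complaint volume with inspection findings (hypothetical sketch).

    complaints:  list of neighborhood names, one entry per 311 rat complaint.
    inspections: list of (neighborhood, rats_found) pairs, one entry per
                 address inspected, whether or not anyone complained there.
    Returns {neighborhood: complaints per address where rats were found}.
    Neighborhoods with no positive inspections are skipped to avoid
    dividing by zero.
    """
    complaint_counts = Counter(complaints)
    positive_counts = Counter(hood for hood, found in inspections if found)
    return {
        hood: complaint_counts.get(hood, 0) / positives
        for hood, positives in positive_counts.items()
    }


# Made-up data: both neighborhoods have rats at 10 of 100 inspected
# addresses, but residents of "A" file six times as many complaints.
complaints = ["A"] * 30 + ["B"] * 5
inspections = (
    [("A", True)] * 10 + [("A", False)] * 90
    + [("B", True)] * 10 + [("B", False)] * 90
)

ratios = reporting_ratio(complaints, inspections)
for hood, ratio in sorted(ratios.items()):
    print(hood, ratio)  # A has 3.0 complaints per positive address, B has 0.5
```

In this made-up case the inspections find the same rat prevalence in both neighborhoods, so the gap between the two ratios reflects differences in who reports rather than differences in where the rats are.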
Erik Bottcher
0:47:16
Do you think that 311 data is an accurate snapshot of what's actually going on in the city?
0:47:24
Do you think, you know, as folks who work in data professionally... Mhmm.
0:47:29
Do you think the data is really reflective of what's going on there, or are there issues with it, like, you know, people's fatigue about reporting to 311 in some neighborhoods?
Martha Norrick
0:47:49
Yeah.
0:47:49
That's a great question.
0:47:51
I think, you know, we actually work with the 311 dataset ourselves all the time.
0:47:57
The Mayor's Office of Data Analytics.
0:47:58
The Office of Data Analytics has a data science team within it as well.
0:48:01
So we're both publishing this data and using this data for analyses around New York City.
0:48:07
I think it is important to always remember how that data comes to be, which is that people call or text or use the 311 mobile app to report something that they see.
0:48:19
And not everybody does that at equal rates.
0:48:22
Reasons could include, sort of, you know, discomfort with the idea of interacting with government generally, or distrust of government. It could be, as you're right, like, general reporting fatigue.
0:48:32
Like, how can I actually call, in New York City, about every rat I see? Like, that's a lot of calls to 311.
0:48:39
Or just generally, you know, I think there's a lot of reasons why people do or do not call 311.
0:48:45
And I think some of them, I wouldn't necessarily describe that as an issue with the data so much as an issue with the data generation process.
0:48:55
Right?
0:48:56
Like, ideally, we all trust 311, or we, you know, trust government to deliver on, you know, resolving a complaint.
0:49:07
And, you know, see a number of times.
0:49:10
So I do think it's important when you're using the 311 data set to keep that in mind.
0:49:14
I don't think that it's... you know, there are also situations where people are, like, very enthusiastic users of 311, and there are lots of complaints about a particular thing that, you know, in another neighborhood or with another person, they wouldn't call 311 about as frequently.
0:49:32
So, yeah.
0:49:34
I think it is a very, very valuable data set.
0:49:37
And, you know, there are some cool things that you can do with the 311 data if you have, sort of, you know, another dataset that has another view of that same issue.
0:49:46
So the Department of Health, for example, does rat indexing inspections, where in a particular neighborhood they are going to every single house regardless of whether or not there was a complaint about a rat there.
0:50:00
So we can actually look to see, you know, who's complaining in that neighborhood about rats versus where are those health inspectors actually finding rats?
0:50:08
When they're just going to every single address.
0:50:11
Mhmm.
0:50:11
So opportunities for, sort of, ground-truthing 311, they're rare, but when they happen, they're really exciting because you can really kind of see, like, how, you know, the data generation process influences, sort of, what you see when you look at and analyze the data.
0:50:30
This is, like, my favorite topic in the whole world, so I can nerd out about it forever, but