When I began thinking about this, I had to ask, “What isn’t data in literary studies?” Everything is data in some sense; what counts depends on the position of the analyst and the nature of the project. So I want to narrow the question by situating it: what is data, to whom, and for what? In this talk, “data” is whatever can serve as input for computer analysis by someone working with texts, using the kind of natural-language machine learning I have worked with to isolate significant word clusters: topic modeling.