Living with Machines: Revolutionising the way historical sources are analysed
Data scientists and humanities scholars are uniting for a bold new project that seeks to dispel the myth of 'the lone scholar' and provide new insights into the human impact of the Industrial Revolution.
Living with Machines is a major new five-year interdisciplinary research project led by the British Library, the Alan Turing Institute, and four partner universities that will use data science and artificial intelligence to analyse the human impact of the Industrial Revolution.
“One side of the project is about the research and sifting through very large quantities of data in the form of nineteenth-century newspapers and census data using the kind of artificial intelligence (AI) and data science capabilities that the Alan Turing Institute have become known for,” says Dr Ruth Ahnert, principal investigator on the project, who is based at Queen Mary, University of London.
“The other side - which is equally important - is the development of a research paradigm.
“We are suggesting a mode of collaboration that will close the gap between the computational sciences and the arts and humanities, extending what the digital humanities have achieved. We are hoping to create a space between the two cultures; a space that is filled with shared understanding, practices and norms of publication.
“The aim is to create a data-driven approach to our human past and a human approach to data science.”
Living with Machines is set to be one of the biggest and most ambitious digital humanities and science research initiatives ever to launch in the UK - funded by UK Research and Innovation’s Strategic Priorities Fund and administered by the Arts and Humanities Research Council.
Dr Ahnert hopes that it will devise new methods of research that will revolutionise the way historical sources are analysed, and provide vital insight into the debates and discussions taking place in response to today’s digital industrial revolution.
“If we want to make use of the massive troves of digitised sources available to us now, we have to get away from this idea of the 'lone scholar' working away in silence in the library and get to a point where we all feel comfortable working as a team, sharing results,” she says.
The project partners are the Alan Turing Institute, the British Library, the University of Cambridge, the University of East Anglia, the University of Exeter, and Queen Mary University of London.
“There have in the past been many examples of humanities scholars getting to the point where they had particular questions that needed computer science input, and so they would then hire someone to bring their vision to life, but it was not a true collaboration,” says Dr Ahnert.
“And it's happened the other way around as well. I've read papers where computer scientists use humanities data but don't really get to grips with the complexity and contextual nuance.
“We want to get beyond this hierarchy.”
Dr Ahnert hopes that Living with Machines will progress as a fully reflexive process that involves everyone at every stage and creates a good model for how people collaborate in the future, one in which the research framework is as important an output as the computational code and the historical findings.
“We will be looking at data for a large number of people over a wide area for a long time and one of the challenges we have is understanding and adjusting for the bias in the sources, and the bias created by the process of digitisation,” says Dr Ahnert.
“We have over-representation of certain newspapers, for example. Can we quantify those biases? And how do we fill those gaps? And how can we offset the political bias that newspapers have? Most importantly, we will communicate those biases to the researchers, giving them control.”
These kinds of challenges present a knotty problem for the research engineers on Living with Machines, but finding new ways to work with unstructured data will give them the opportunity to create new code and digital methods for the community.
“They will have to find ways of ingesting it, of reading it and creating new methods to let the content speak for itself, posing new humanities questions; they are very excited by having this large dataset, the opportunity to turn big data into big knowledge,” says Dr Ahnert.
Language is one of the hardest things to parse and presents challenges that are at the forefront of data science.
“They are interested in developing new methods to explore how actual historical questions can be broken down into logical, computational processes,” says Dr Ahnert.
But pulling all this together will require a fundamental rethink in the way that the project team, as academics, work, and so they have turned to the methods pioneered by high-tech firms in Silicon Valley.
“We started with a project charter that lays out our principles of openness and sharing, what our expectations are,” says Dr Ahnert. “And because this isn't a normal project we are using Agile Project Management software.
“We will have fortnightly meetings where we will decide what everyone is doing and assign tasks.
“It will require huge behavioural changes. We are being a bit more like industry with a lot more transparency about what everyone is doing.
“No one will go dark because they are struggling. We will have a shared workspace and an expectation that we will work there, sit together and discuss things as we go along.
“We want to really make sure that we are actually collaborating and not just working alongside each other with a common goal. That's the way to come up with some really exciting results for everyone.”