We've heard how artists and creators are making sense of, and testing the edges of, artificial intelligence. In this session, we look at how different institutions are using the same technologies to augment access to their archives and activities. We will hear how experimentation with emerging technologies requires the ability to collaborate and to be comfortable with the uncertainties that experimentation brings. How do machines see and hear archival collections? What new possibilities emerge from the archive as a result?
Speakers
Dr Mia Ridge (British Library), Dr Keir Winesmith (NFSA), Simon Loffler (via Zoom) (ACMI), moderated by Jeff Williams (ACMI)
Watch the video with graphic notations by Jessamy Gee
Transcript
This transcript was machine-generated and published for search and accessibility purposes. It may contain errors.
Good afternoon and hello. I'm Jeff Williams, ACMI's head of technology. And this is the machine looking back at us. We're not going to be talking directly about games, or playing them, but we do have a lot of fun playing with tools, and we're going to talk about a lot of that today. Over the next hour, we'll be hearing about three organizations and how they're using AI and machine learning to engage with their activities. First, we'll hear from Simon Loffler, a creative technologist at ACMI. Simon will be discussing the tools used to help expand ACMI's collection search and what ACMI can learn from this new data. Keir Winesmith, NFSA's chief digital officer, will discuss a system they built to help address some of the common issues that arise when using large language models and how that project is informing what they will do next. And Mia Ridge, the British Library's digital curator for Western Heritage Collections, will discuss AI tools that can enrich GLAM records and, overall, making machine learning work for GLAMs. Please welcome Simon Loffler. He is on Zoom here. Thank you.

Hey. Thank you. Firstly, I'd like to acknowledge the Kaurna people as the traditional owners and engineers of the unceded lands that I'm speaking from today in so-called Adelaide. Always was, always will be, Aboriginal land. I couldn't decide whether to use an optimistic or pessimistic cover for this talk, so here are both. So this cover image is a machine looking at a human in a museum. This image is also a machine looking at a human in a museum. I hope you relate to either one of them.

The first thing I thought I'd talk about, because machine learning uses GPUs, is the GPU rich, which is a new flavour of wealth inequality. So who are the GPU rich? They're organizations or corporations with many more GPUs than human employees. Are we talking about any GPUs? Well, no, because all GPUs help display information quickly, but only a few are good for fast machine learning, such as NVIDIA GPUs. Will it ever change? Hopefully, yes. Over time, more frameworks will appear that will help create and fine-tune large language models. A recent example is Apple's MLX framework. So ACMI are GPU poor. We have two NVIDIA GPUs at ACMI, and both of them are used in interactive exhibitions in our gallery. So we have zero for machine learning. But don't be too sad for us. We have CPUs, even some in broom closets, such as our legacy virtual machines. These ones have 64 CPUs and 64 gig of RAM, which is pretty reasonable. And so after migrating internal software to the cloud, we could use these VMs for machine learning, slowly.

So what's our machine learning strategy? We use cloud GPUs for experimentation with large language models, large image generation models, and other processor-intensive experiments. We then deploy prototypes to our Azure Kubernetes cloud CPUs. So XOS, which is our museum operating system, can automate background tasks for us when we're using low-RAM models. So these are like 1 gig models, such as Whisper. Then for anything that needs from 1 gig to 32 gig, we generally use our developer laptops, so one-day tasks at a time, because it slows down the video card and stuff like that. And then for anything that uses beyond 32 gig, we use our VMs, and so that's for tasks which might generally take like 100 days to complete. And we open-source everything so others can learn and build on our work. So what have our machines done?
Firstly, we installed OpenAI Whisper into XOS and transcribed audible speech in 4,000 videos in our collection. Next, we created our Hugging Face client, so we could use BLIP-2 to write image captions every 100 frames of those 4,000 videos. And finally, we added our OpenAI embeddings client, so we could encode our 47,000 work records into numbers, or vectors, for our Chroma DB vector database and our similarity comparer. And all of these projects we've written about online at labs.acmi.net.au, and all of the code is up at github.com/ACMIlabs. On the final slide, there are more links to these.

So here's a little demo of the audio transcription video search, searching for the term solar. I get a community event: "So just by ringing around, I've got 60-odd people to get involved in the solar programs." So that allows visitors to click exactly where their audio speech term is in the videos. But not all of our videos have audio. So then we added the video image frame captions to our search index as well. And this allows us to return the videos where the image includes the search term, and also, if the term appears in the text in the image, we can return that frame as well. "Involved in a solar program. It's interesting because obviously every..."

And next, we built an explorer for people who don't know what they want to search for. So you are shown a selection of random works from our collection, and then, using the OpenAI embeddings, it returns similar vectors to the work that you select. So this allows visitors to follow rabbit holes through our collection, from student video projects to music videos to TV shows to video games. And for this experiment, we turned the distances between the vectors that were returned into percentages, to make it a bit easier to understand. So if you can read those little pink ones, when we click on one of the works, it will return all of the similar works, and in pink there, it shows you the percentage matches for all the others. So you can see the metadata from our records and then keep burrowing down the rabbit hole.

So that's what the humans heard and saw in our collection. Now let's look at it from the machine's perspective. So here's a list of the 20 most common words that the machines saw in our audio transcriptions, and it also shows in brackets how many times they appeared. And so next, we asked GPT-4 for its observations on the top 100 words. It noticed things like personal pronoun usage, conversational words, temporal references, actions, and movements. And we'll just keep jumping through these relatively quickly; there will be a link to the slides at the end. I also got it to summarize what it heard in the ACMI collection using its own voice: "It's clear that the ACMI video collection is rich in personal narratives, social and cultural discussions and covers a broad spectrum of topics, including personal experiences, cultural identity, history, and perhaps educational content. The conversational tone indicates a strong focus on engaging and relatable content, potentially aimed at a wide audience. The specific references to Australia and Australian also underscore a focus on national identity, culture, and possibly the exploration of Australia's place in a global context."
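For anyone who wants to see roughly how the transcribe-and-caption pipeline Simon describes above might look in code, here is a minimal sketch. It is not ACMI's actual XOS implementation: the file path and model choices are illustrative assumptions, and it simply runs OpenAI Whisper over a video's audio and BLIP-2 (via Hugging Face Transformers) over every 100th frame.

```python
# Minimal sketch (not ACMI's XOS code): transcribe a video with Whisper and
# caption every 100th frame with BLIP-2. Paths and model sizes are illustrative.
import cv2                      # pip install opencv-python
import whisper                  # pip install openai-whisper
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

VIDEO_PATH = "collection_video.mp4"  # hypothetical collection file

# 1. Speech to text: Whisper returns timestamped segments we can index for search.
asr_model = whisper.load_model("base")
transcript = asr_model.transcribe(VIDEO_PATH)
segments = [
    {"start": s["start"], "end": s["end"], "text": s["text"].strip()}
    for s in transcript["segments"]
]

# 2. Frame captions: caption every 100th frame so videos without audio are searchable too.
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
caption_model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

captions = []
video = cv2.VideoCapture(VIDEO_PATH)
frame_index = 0
while True:
    ok, frame = video.read()
    if not ok:
        break
    if frame_index % 100 == 0:
        image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        inputs = processor(images=image, return_tensors="pt")
        output_ids = caption_model.generate(**inputs, max_new_tokens=30)
        caption = processor.batch_decode(output_ids, skip_special_tokens=True)[0].strip()
        captions.append({"frame": frame_index, "caption": caption})
    frame_index += 1
video.release()

# Both the timestamped transcript segments and the frame captions can then be
# pushed into a search index alongside the work's catalogue record.
print(segments[:2], captions[:2])
```

On CPU-only machines like the VMs Simon mentions, the captioning step is likely the slow part, which is why it suits long-running background tasks; the work-record embeddings he describes would presumably be generated in a similar batch fashion and stored in the vector database behind the similarity explorer.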
So next, we have a look at what the machines saw from the VMA model that categorized our videos. This tries to determine which actions are present from a list of, I think, about 400 different categories, so it's a little bit general. It's also worth noting that VMA version 1 has quite low confidence scores in what it's categorized, so yeah, take these results with a pinch of salt.

We look again now at the 20 most common words that the machines saw from our video captioning. It's worth noting the gender imbalance here, and that's looking pretty similar to what it saw in the audio transcriptions as well. So once again, let's ask it for a summary of the captions: "A video collection rich in human-centered stories, diverse settings, and a broad spectrum of activities and themes. The emphasis on color and appearance, along with the variety of objects and settings, points to a visually engaging collection. The presence of historical and geographical references, combined with descriptions of both natural and urban environments, suggests that the collection encompasses a wide range of subjects and narratives, potentially appealing to a broad audience with varied interests."

So it's probably a good time to look at what it didn't see. It's worth considering that when we add a layer of machine listening and watching to our technology stack, we're also adding layers that might further strengthen biases in our collection. So a question for our organization is: when was the gender imbalance introduced? Was it when the filming occurred? Was it when the films were donated to ACMI? Was it when the films were selected for digitization? When the images were selected for the training data? Or when the machine recognized the image? All of the above. And what about all the other biases? Interestingly, YOLOv8, which is what we used for the video object recognition, doesn't see gender; it just sees people. So here's another list, of the 20 most common objects that it saw in our video collection. And once again, we asked GPT-4 to give us a summary of that: "The predominance of human figures, along with a rich diversity of animals, objects and activities, indicates a collection that spans a wide array of themes, including daily life, nature, technology and recreation. The data reflects the collection's potential to engage a broad audience by covering topics that are universally relatable, culturally significant or of specific interest to niche audiences."

Next, we'll have a look at what the machine saw in our works embeddings. So for this image here, we used OpenAI's clustering example code, which applies k-means clustering to try and categorize the 47,000 works into four colored clusters, and then uses t-SNE, which is t-distributed stochastic neighbor embedding, to reduce the dimensionality of our data from 1,536 dimensions down to two dimensions, so we can plot it. It kind of does its best to preserve the local structure of the data, which makes it kind of useful to display here. So in the purple cluster, you can see the top 20 words from the titles of works in that cluster. The green cluster here, and it's interesting to note that first word, or the acronym there, T-E-S-E; I wonder if the collections team can tell us what that is. Red cluster and blue cluster. And so once again, I asked GPT-4 to give us a summary of those: "The clustering effectively groups the data set into distinct thematic categories: promotional or cultural content (purple cluster), educational or documentary-style content (green cluster), narrative commercial films (red cluster), and Australian-specific content (blue cluster). Each cluster represents a different facet of the collection, highlighting the diversity of materials from educational and documentary films to narrative cinema and culturally specific archives."
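As a rough illustration of the clustering-and-projection step just described (in the spirit of OpenAI's clustering example, but not ACMI's actual code), the sketch below clusters the 1,536-dimensional embeddings with k-means and projects them to two dimensions with t-SNE for plotting. The input file name is a hypothetical placeholder for an array with one embedding per work record.

```python
# Sketch of the embeddings clustering + t-SNE plot described above.
# Not ACMI's actual code; "work_embeddings.npy" is a hypothetical file of
# shape (n_works, 1536) holding one OpenAI embedding per collection work.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

embeddings = np.load("work_embeddings.npy")

# Group the works into four clusters in the full 1,536-dimensional space.
labels = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(embeddings)

# Reduce to two dimensions for plotting; t-SNE tries to preserve local structure.
coords = TSNE(n_components=2, random_state=42, perplexity=30).fit_transform(embeddings)

plt.figure(figsize=(8, 8))
plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=3)
plt.title("Collection works: k-means clusters projected with t-SNE")
plt.savefig("work_clusters.png", dpi=200)
```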
So what are we gonna look at next? We would really love to open source a data set, which will give us the ability to evaluate open source image data sets and models, as well as fine-tune captioning models to reduce the biases in future captioning projects. We'd also like to look at doing some more vector embeddings, but this time with the actual pixels in images, and then do the same for the image pixels every hundred frames of videos. So that will enable, yes, the evaluation of bias. The vector representations will also let us upload images and compare them against images in our database, and so then we can do that by color and style, which would be quite fun. And we can let visitors do that as well, which I think would be really fun for videos. Thank you. Here are a bunch of links, to our labs posts, our open source code, and the website where you can play with the video search and the embeddings. And yeah, the link to the slides is at the top. Thank you.

Thank you. I'm a little sleepy. You're a little sleepy? You're sleeping in your comfy chairs in your dark room. Can you shake your fingers with me just for a sec? I'm gonna chew up some of my 15 minutes. If you have the ability and if you're comfortable with it, just stand up and point at those big lights and stare at them for a second. Imagine that's the sun. Imagine that's the sun in the morning; you wake up. There's a word for the sun on your back. I don't know what that word is, but it's a beautiful word. Have that word in your mind. Little stretch left. Little stretch right. All right, sit down. You're chewing up my time. Sorry, I'm just, I needed that.

All right. I'm gonna talk about the machines looking back at us. I'm doing a conversational archive at the NFSA, but actually, I realized on the way here, what it really should have been is a talk called people don't care who holds their stories, they just want access to them. And that's the role of the machines in our organization. So I work at the National Film and Sound Archive, the NFSA, not the N... safe for work. But we do have some of that content too, because we collect everything. And by collecting everything, what I mean is we collect all the things and we collect them at scale. We have millions of things, and we don't just collect the audiovisual material, we collect the things that are around the audiovisual material. And that might be a DVD, it might be a soundtrack, it might be notes, it might be a sketchbook, it might be a film canister. And it almost always turns into bits and bytes. Primarily we serve the creative industry. So we serve filmmakers, we serve podcast makers, storytellers, cultural organizations, film and TV. We serve people with a question. We serve academics, we serve students, and we serve the general public. And probably, linearly in that order, sorry.

And we have like a sh*t ton of these. Oh, I've been told to do this by my wife, trigger warning: I talk fast, I swear, and I try and make you laugh. So we have a sh*t ton of these, and we run them through these and these, so that we can put them all on these and these. So in the background are the spinning disks and in the foreground is the LTO tape, and that's a really high density bit. Eight and a half petabytes is one copy of the NFSA collection as digitized.
And obviously we have three copies, and one of them's off the shelf, sorry, out of a computer and on a shelf and kept cool so it can't ever be lost. And at the core of our organization is this sort of provocation. So we wanna tell the national story by collecting, preserving, and sharing audiovisual media, which is actually the cultural experiences of our time. It's like the things that we do every day. It's the bit that's inside everyone's phone that's in front of them at all times, including mine, as my child texts me. They don't even have a phone.

So what are the machines looking at, and who decides? So we started to think about where do we put the machines at the NFSA, and I have a little reputation of being the technologist who says you shouldn't do it. At NDF five years ago, I pitched our sector that we shouldn't do VR. At a conference four years ago, I pitched our sector that we shouldn't do NFTs. And I'm gonna pitch you now that it's different and we should actually do AI. But I also have pitched some of you in this room to say we shouldn't do AI, but it's different now. Anyway, so we started thinking: if the organization exists to collect and to preserve and to share, like, if that's why there is an NFSA, what role do the machines have, and how do we interrogate that in a really kind of critical way? How do we proactively look at that? And so we've made the decision, having run a whole bunch of experiments internally, a bunch of pilots, some of which worked and some of which didn't, that the machines don't belong in the middle. The machines belong at the edges. They belong between the collection items and our understanding of them, not in our understanding of them. And they belong in between our understanding of them and our publics. So they are in the webbing and not at the core. Our cultural workers, our cultural memory, is at the core. That's not the place for the machines.

And why is that? Well, we ran a whole bunch of experiments. We wanted to start with things that we thought it would be really good at. So white faces it's pretty good at, although that's not Kylie. And text it's really good at, right? So one of our curators was really interested in kind of the history of labor disputes and was looking for equal pay, and actually found that First Nations Australians were pushing for equal pay before, and weren't part of the kind of story we told about the push for equal pay in Australia. And we only found it almost by accident, because we're interrogating the characters that are in the videos, that are in the collection, that we don't know what is in, because our collection is too big. It's 30 plus linear years of content. And if someone says, I wanna know all of the references to equal pay in Australia's film and television history, you can't do that. It's not a doable thing. But the machines can help when you put them in the right lines. And so we started thinking about this conceptually. So what's the concept? What's the organizing principle for this work, after we ran our five or six pilots? And the concept, the kind of organizing principle for this work, was this idea of a conversational archive: that you could engage in a dialogue with the archive. So this is not the data set. The machines can engage with the data set, and you're engaging with the archive, to echo the discussion earlier. And the reason we thought it's a conversational archive is, well, 65,000 years of conversation being used to progress cultural knowledge in the place that we work.
In our case, it's on Ngambri land. But also really, really young emerging technologies, like lists. So a thousand years ago, we started doing lists. About 500 years ago, we put the lists in books, in columns. And then we're like, oh, you know what? We can make punch cards of lists and columns. Yeah? And then we can take those and make them ones and zeros. So now we're going from 500 years to 50 years: 50 years ago, 56 years ago, the first computer was put inside a cultural institution in New York. And you know what it did? It was a computer to look for columns and lists. What we didn't change is the column headings and the words we put in. So we're saying to the public: the way that you can engage with our collections, the way that you can engage with your cultural heritage, is you have to know the words we put, in the right character order, in the column, in the list that we inherited a thousand years ago from some French dudes who were making lists. And we haven't changed the columns and we haven't changed the words that we put in those lists. We've just put them on machines. So we think the way beyond that, if we were to go forward, I guess, and go backwards, is to get to a conversational archive, where you can actually engage with the archive in dialogue rather than having to know the string order of characters that we happen to have put in the record that might have your grandmother or your story in it.

So we made a thing, we called it Ask Ava, because we wanted to put it kind of in between the people we serve and our staff, and we wanted to kind of generate a conversational archive. And so I asked it some stuff. And we built it; some of the people who built it are even in the room. I asked it to write us an essay on Australian film history and, like, it had a crack, which is actually pretty good. It's a pretty good essay. But the thing that's a bit weird about that is that's not a film; those ideas don't exist. All of that's bullsh*t. You can actually punch those ideas into our search, because you can search by idea in the NFSA collection search, and some of them don't exist. Some of them are different things, and that's just luck. And almost all of that content is made up. So, okay, that didn't work. That's what happens when you put the machine in the middle. We belong in the middle; the cultural stories belong in the middle. Machines belong on the edges. So we told it that it's not allowed to do that. When you ask it, can you write my high school essay for me? It's like, no, no, no, you can watch some sh*t and write your own high school essay.

So then we thought, okay, well, let's push it towards something that's more doable. I'm a fan of Ben Mendelsohn. I thought his career took off when he acted in a really fantastic film, Animal Kingdom. Turns out, when you ask him, his career actually took off with the film The Year My Voice Broke. And we asked him because we transcribed all of the oral histories, all of the interviews with Australian actors, film producers and makers, and asked the machine to only tell us things in the collection, and these ideas do exist. He did say those things. They're real. And that's the sort of collection that we wanna engage with. That's the conversation that we wanna have: something that's real, with the people who know those stories; they're in those stories. So, like, if you're a nerd, I'm just gonna leave that for a second, take a photo of that. That's how we built it.
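For the nerds, here is a very rough sketch of the grounding pattern Keir describes: retrieve real passages from the collection's catalogue and transcripts first, then only let the model answer from what was retrieved, with citations. It is an illustrative toy, not the actual Ask Ava implementation; the sample passages, `retrieve()` and `generate()` are stand-ins for the NFSA's knowledge graph, search and locally hosted models.

```python
# Toy sketch of a retrieval-grounded "conversational archive" pattern.
# Illustrative only: the passages, retrieve() and generate() stand in for the
# NFSA's real knowledge graph, vector search and locally hosted model.
from dataclasses import dataclass

@dataclass
class Passage:
    source_id: str   # catalogue record or oral-history transcript identifier
    text: str

passages = [
    Passage("oral-history-001", "Ben Mendelsohn discusses his early career and The Year My Voice Broke."),
    Passage("title-0042", "Animal Kingdom (2010), feature film; cast includes Ben Mendelsohn."),
]

def retrieve(question: str, corpus: list[Passage], top_k: int = 3) -> list[Passage]:
    """Naive keyword-overlap retrieval; a real system would use embeddings or a graph."""
    terms = set(question.lower().split())
    scored = sorted(corpus, key=lambda p: len(terms & set(p.text.lower().split())), reverse=True)
    return scored[:top_k]

def build_prompt(question: str, context: list[Passage]) -> str:
    sources = "\n".join(f"[{p.source_id}] {p.text}" for p in context)
    return (
        "Answer using ONLY the sourced passages below, and cite the source ids.\n"
        "If the answer is not in the passages, say you don't know.\n\n"
        f"Passages:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )

def generate(prompt: str) -> str:
    # Placeholder: plug in whichever (ideally locally hosted) model you run.
    return "[model response goes here]"

question = "When did Ben Mendelsohn's career take off?"
print(generate(build_prompt(question, retrieve(question, passages))))
```

The point of the pattern is the one Keir makes in prose: the model sits at the edge, assembling and citing what is actually in the archive, rather than inventing content of its own.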
Ask Ava's in the middle, right? The knowledge graph we own, and the machines sit around it. And it's real, like it actually worked. Like, that's actually the knowledge graph that we've drawn from to make the story. There's The Year My Voice Broke. It has a relationship to Ben Mendelsohn. But then we found weird things, like wonderful weird things. Like, Ben Mendelsohn has a relationship with Steven Spielberg, which is a "caught the eye of" relationship, which is not something anyone would ever put in the catalog. There's no option. There's acted in, directed, features, interviewed by. There's not a "caught the eye of" record. So by asking the machines to look at how these things are related, it's kind of beautiful. And then you can reinforce that, and you can say, Ben Mendelsohn says he's in that film, and we know from our catalog data he's in that film. We can probably trust that he's in that film. And if you ask if he's in that film, we can get you to him being interviewed about that film, and that's a conversational archive. And when you want to answer a question, get to the raw, get to the real, get to the human, rather than the machine's sort of kind of aggregated bullsh*t version. Also, what's really funny, and I didn't see this the first time, someone else pointed this out to me: it thought there was a film called My Voice, and that film broke, as in was successful. It didn't realize the film was called The Year My Voice Broke. So we're on a journey, right? We're on a journey. So it's good without being great.

And we've talked a lot about the climate crisis; others brought it up today too. So how big is the data model for 35 years of linear content, and all of the entities and all of the transcripts and all of the relationships? Like, how many of these do you need to store so we can have a conversation with our audiovisual heritage? That's a kind of open question that we haven't answered yet.

The other thing that I'm kind of fascinated by, and I noticed everyone's been using Whisper, is that the American machines don't speak Australian. So it probably thinks a Chiko Roll is a chicken roll, and as we all know, it's not a roll and it has no chicken in it. Davo, actually, they do all right on Davo. They get Davo. They do not get servo. That's a service station for those from out of town, or a petrol station. They never get Queanbeyan, because it's derived from First Nations language. And when I showed this in Canberra, someone just yelled out from the back of the audience, a colleague of mine: it's Queangers. So the animation didn't work; it should say Queangers. Queangers, it's Queangers. And so it'll never get Queangers. That's Australian slang; only we can teach it that. It won't get Queanbeyan, that's based on local Ngunnawal language; it won't get that. So we are actually, at the moment, in real time, we had a milestone recently, I should probably take a photo of that too: we are developing our systems that speak Australian. We're training them with Australian content. We're training them ourselves, and we're running them on our servers that are run on our green electricity, and they're not being built on American technology, trained with money we pay to Americans that they can then license back to us for us to do our work. And that's critical to our sector, that we do this work in this way.

So we thought, what are the organizing principles for machine learning and AI at the NFSA? It's not like this really big binder here; this is the budget statement.
It's not a massive thing of words. The director, Patrick McIntyre, has asked for something that could kind of fit on a page, and something you can remember, because it's something that the whole organization needs to feel comfortable with, not just the technologists. When Katrina put up everyone and their thoughts about AI from different parts of the business, I do love that she just left out education. That's great, she is an educator, I think it's brilliant. What's really important, if you're the nerd whose job is to respond to the leader who said, oh, I've read about that thing, we should probably do that thing: you should work with that leader to come up with principles before you do any of the things. Because if you don't do that, you will do the stupid thing. You will do the generative thing. And actually, originally, when we worked on these slides, one of my colleagues said, oh, I could generate a really funny slide using Midjourney to go here. And I was like, no, we're on the side of the creators. Like, we're not the bad guys here. We don't wanna be stealing people's IP to generate new IP that we then put out into the world. We care about intellectual property. We're, like, invested in copyright, because it's a way to pay creators to make. So we take our own f*cking photos.

So what are our principles? We don't know yet, we're working on them. So we're starting from somewhere. We're starting from maintain trust. One of the reasons the NFSA can exist is because so many people donate material to it, because they trust us to keep it forever, whatever forever means. We wanna build transparently, and we wanna build efficiently. So we don't wanna just do things for the sake of doing things. We wanna do things that actually make our work better, to make the collection more discoverable by the Australian public and more reusable by producers, makers, storytellers, so that the past can actually play a role in creating the present, so the future is less sh*t. And we wanna create public value, because we exist to create public value. It can't be done in a vacuum. It's gotta be done for and with the Australian public. And it needs to be done in a way where, if you are an IP creator, we are the place that you put it to keep it forever. There are many instances where people have made a kind of regulatory e-deposit donation, or, like, given their content to the national collection, or they've chosen to do it, and a couple of weeks later have accidentally deleted their movie. And we've got the only copy on earth. Then we send it back to them and they can release it. So we're a place of trust. We cannot be using the machines in a way that steals that trust away from the people with whom we work and on whose behalf we maintain the National Audio Visual Collection.

So all the words underneath these three principles haven't been written yet, or we've kind of got a draft, but we wanna make sure that it's actually crowdsourced from within the organisation, so that we all feel it and we can lean on it. So when you're in that room writing code or wiring up a model, you know whether or not it fits these principles and you know the right way to do that work. Because unlike with NFTs and unlike with VR, we actually do have to do this work. It will be done to us unless we choose to do it. To help us get better at it, we're hosting a thing called AI4LAM, the AI for Libraries, Archives and Museums international conference, in October in Canberra. Three seconds left.
And I really encourage you to join us then. Thank you.

Hey. Okay, just while I'm waiting for the slides, I can tell you that what you're going to get is some dodgy weaving metaphors and a confusing British-Australian accent that AI definitely struggles with. Okay, I can see them on the thing, but they're not... oh, there we go. Awesome. And also Arial, because that's the brand of the British Library. And I haven't included a slide on the British Library, but it's the National Library of the UK. Our collection is somewhere between 170 million and 200 million items. That scale puts pressure on almost everything that we do. Our collections come from around the world. They cover many, many languages, many formats. So a lot of the work that we do means that we're not as agile as some organizations, but we also try and experiment where we can. So we'll talk about some of the experiments that we've done.

So fundamentally, my argument is that AI can help enrich your collections, but it does need supervision. I've also been a technologist who spent most of my time saying, we don't need VR, we don't need NFTs. We need to be thoughtful about how we deploy technology. We don't just do it to chase trends. We don't have the resources to do that. And also we need to think about what's unique and special about what we offer. But I think that every gallery, library, archive, museum, community organization is different. So the art is in weaving AI into your existing strategies, thinking about your existing audiences, the problems that you're trying to solve, the role that you play in people's lives. And I think Keir's really brought that up. And I love the idea of being... your role is to help people answer questions. People come to the library because they have questions. They want to create something. They want to research something. They want to understand something. And that's what we're here to help people do.

OK, so we've had a decent amount of cynicism about AI. And I just really want to go deep into that and just say, it's marketing speak for machine learning. Machine learning is a conspiracy by statisticians to make their work sound sexy and to increase their pay by about 100 times. I'm in no way devaluing the work that they do. It's amazing, incredible work. I can't believe how far things have come in the past few years. But ultimately, the architectures that we have, as amazing as they are, are very limited. And we've heard about some of those limitations today, in terms of the biases that are embedded, the carbon footprint that's created, the water usage of all these data centers, the issues of buying into the Silicon Valley venture capitalist values. But for all of that, it is amazing. They've scraped the best of the internet, also the absolute worst of the internet. And we've heard about that already today: the issues with who you are and how you're represented in a machine learning dataset. Whether it's images or text or audio, it will really depend on how society views you as a whole. There's no escaping that lens of society when AI looks at you, or when machine learning looks at you and your life. And it's really just fancy predictive text. It's the stuff that you do on your phone. It doesn't know anything about the world. It's definitely not alive. It doesn't have feelings. And I also love this image, because I asked one of the image generators to do an image of special collections and rare books, and this is what it gave me. The vibe is right, but you can't actually read it.
And that's kind of where AI is at the moment, I think. But for all that, it's still worth the hype. So I want to think about how we can use machine learning and AI in ways where we're getting the most value out of it and not buying into too many of the problems.

So I wanted to talk about a project that we did at the British Library with the Alan Turing Institute, which is the UK's data science and AI institute. We had five and a half years or so, a really big budget, a team of up to 43 people over the lifetime of the project, about 20 people at any one time. And some of the things that we managed to do with the collections are, I think, worth thinking about. So any of you who do historical research might know that census records are really hard to look through. Unless someone has an unusual name or a stable address, tracing people through census records can be really hard. We wanted to understand the impact of the mechanization of the power loom on loom workers, on manual loom workers, who weren't called hand loom workers until power looms existed. So we wanted to trace people through a time of really tumultuous change, when they might be moving rapidly between locations, and also for married women, who can be lost in census records when they change their name if they get married. So some of the research developed algorithms to link people between census years, which is the kind of tool that could be used by family historians, by anyone doing research trying to look for people between slightly matched, fuzzily matched records.

We had a lot of computational linguists on the project. So they were using methods to find instances where journalists writing in newspapers had assigned agency to machines, so where they'd been given a human-like agency. And I think that's a really important question, because as we heard this morning, people use AI, or they used mechanization in the 19th century, to delegate responsibility. You blame the machine for changing your working conditions. You don't have to, as a boss, take on the impact that you're having on your workers. So we were looking for that kind of shift in when machines were actually given agency or blame for the changes that they were precipitating in society. We also looked for diachronic shifts, so how words changed in meaning over time. And in one project, we used computational linguistics with crowdsourcing to look at terms like car and trolley, to understand how those words went from being sort of unpowered, or things moved by humans or by animals, to being things moved by steam power or later petrol and other power.

We also tried to contextualize our massive, I think it's 60 million, editions of newspapers that we have. We digitized about 12% of them, looking for historical information, or historical power data, that in this case came from press directories, which were kind of like media-buying guides for newspapers, to understand the biases in the overall newspaper collection and in what we digitized. So we digitized with a commercial partner who was digitizing mostly for family history research. They're looking for name-rich newspapers, the kind of newspaper where you'll find an ancestor's name if it was mentioned. So that can inadvertently create a bias in the types of voices that are represented in the digitized corpus of newspapers. So if you're looking at what the newspapers said, you have to understand which newspapers are available to these computational methods.
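As a toy illustration of the census record-linkage idea Mia mentions above, and not the project's actual algorithm, fuzzy matching can score candidate pairs of records from different census years on name, birthplace and expected age. The fields, weights and example records below are assumptions for illustration only.

```python
# Toy fuzzy record linkage across two census years (illustrative only; not the
# British Library / Turing project's actual method). Fields and weights are assumed.
from difflib import SequenceMatcher

def text_similarity(a: str, b: str) -> float:
    """Rough string similarity between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def link_score(person_1851: dict, person_1861: dict) -> float:
    """Score how likely two records, ten census years apart, describe the same person."""
    name = text_similarity(person_1851["name"], person_1861["name"])
    birthplace = text_similarity(person_1851["birthplace"], person_1861["birthplace"])
    expected_age = person_1851["age"] + 10          # ten years between censuses
    age = max(0.0, 1.0 - abs(expected_age - person_1861["age"]) / 10)
    # A real linker would also handle surname changes on marriage, occupations, households, etc.
    return 0.5 * name + 0.3 * birthplace + 0.2 * age

record_a = {"name": "Martha Ackroyd", "birthplace": "Halifax, Yorkshire", "age": 24}
record_b = {"name": "Martha Akroyd",  "birthplace": "Halifax",            "age": 35}
print(f"link score: {link_score(record_a, record_b):.2f}")  # a high score flags a candidate match
```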
So we were trying to understand bias at scale using these methods. And we also did things like machine learning for working with bad, poorly transcribed text, because we've been digitizing for 20 or 30 years. So some of the early digitization is not brilliant, and it's also really expensive to go back and redo that. We are trying to redo it, but we're working with a lot of bad data at the moment. And we also used computer vision methods to read Ordnance Survey maps. But also, those methods have been developed further by biologists and other people who are annotating images to find patterns in images.

So I think it's really clear that we can use these machine learning tools and these AI tools to enrich our records. We can transcribe text. We can recognize layouts, so we can try and tell newspaper advertisements from newspaper articles and understand the different roles of those kinds of content, and not just have a kind of bucket-of-words approach. We can detect objects in images, link entities to images, and then we can use these tools to detect people. We've heard a lot about some of these methods already across the day. Things like clustering images by similarity, or words by similarity, which means that we can then search in ways that mean we're not dependent on the keywords that some cataloger has typed in, which really expands how accessible we are to people. You don't need to think like a gallery or a museum or an archive to access that content. You can think like a normal person.

But it's a really long way from ready to wear. That's like the biggest stretch in the metaphors that I'm using today. Things are almost ready. It's definitely worth experimenting, but you can't build one workflow and have it work for the next couple of years. You're going to have to be adapting and tweaking and being really flexible and agile in how you try and use these technologies. So we've heard quite a lot about what AI can't do, what machine learning can't do. It's not intelligent. It's not conscious. It's definitely not a magic wand. It will struggle with some of your messy data, although it is getting better all the time. And over the lifetime of the five and a half year project that I mentioned, ChatGPT launched in the last six months of the project. And a lot of the things that we'd worked really, really hard to do over the earlier years of the project, suddenly anyone with any computer could do them, maybe not at the same scale. But it was a moment of, OK, things do change really, really fast. So it is getting better at coping with messy data, but it's still not brilliant. But I would encourage you to just use those magic words to get funding or to do an experiment. Just lean into the moment: if there's any money in the cultural sector, get it while you can.

So I just wanted to take a step back and think about some of the human cost of maybe introducing these technologies. I've worked a lot with crowdsourcing. I think it's an incredible framework for people engaging with collections without having to have the guts to get a reading room card to come into your venue during your opening hours. We know that people really value this. They learn a lot. They find pleasure in talking to others who are interested in the same thing. We know that people really enjoy those kinds of puzzle-like tasks; transcribing handwriting is an enjoyable puzzle.
But now that AI and machine learning can do a lot of these tasks, we really need to think about how we build those kinds of opportunities for people to engage with collections. What are we losing? And how do we deal with the ethics of asking people to do work that we know machine learning could do a reasonable job of doing? So maybe we think about it more as a form of hobby. You can knit a jumper. It will take you hours. You'll spend a lot of money on wool. You might have to rip it out and do it again a few times. Or you could go and buy a jumper. But still, crafts, knitting, gardening, there's a lot of things that we do that aren't efficient. We do them for the joy of the task. We do them for the joy of learning. So how do we fit that into the kinds of values that we're embedding in our projects?

And just a bit more cynicism. But I think this is really important to think about if you have special collections or collections that are in some way unique. It might just be that the values, or the country that you come from, aren't the kinds of things that Silicon Valley machine learning systems will have been trained on. But this idea that AI is really good at the average: it's not great at the creative, at the lateral, at the unusual. It might look creative, but it's kind of a bit samey after a while. If any of you mark essays, you might start to find that you're seeing the same kinds of stylistic tics appearing in text. So AI is great at the average. Let it do the average. It's not great at the unusual. So how do we think about finding the unusual and getting that in front of people instead?

I think one of the big challenges, and this is really boring and down in the weeds of workflows and metadata: you can do experiments with creating enriched data through crowdsourcing, through machine learning. But then how do you get it back into your core catalog? Or how do you update your discovery systems so that they can actually include that fantastic metadata? And then how do you deal with the fact that only the digitized items, or the items that you have the right to process in some way, are going to have that additional data, where everything else that's not digitized won't? So there's a real unevenness, a lumpiness, in the discoverability of your content. That might not matter if you are mostly aiming to have people see things on the screen at the same time. But if you want them to come into your reading rooms or into a viewing station, how do they find the things that aren't digitized and still have not-as-great metadata? How do they know that they're not seeing those in their search results?

So my final slide is really just thinking about how you weave AI into your existing strategies. And this is a fantastic artwork by a Melbourne artist, so go check out their work. But really, I think it's about thinking about how you're going to use AI to support your staff and your volunteers. How are you going to talk to people about how it's going to change their work? And I think it's really important to start from the fact that you're not going to replace work or workers. You're going to enhance work. You have to have those principles. Work out your principles as an organization. How does it tie into your values? But also think hard about where it's better to have some data than no data. So maybe some crappy data is fine. And think about the fact that it's easier to review than it is to create. But also, don't automate too much, because AI in no way should be left to make consequential decisions.
So I'll stop there so we can have some questions in the panel. But thank you for listening.

Thank you all for speaking. We've got a Slido link that we're going to get up on the screen. So if you have any questions, please go there, and I will relay them. I want to start with something that I've heard you say before in the past. Thank you. Yeah. Yes. And that is that the AI we have now isn't what we need, but what we have. However, we're at an important time where we need to work with AI and understand how to use it, so we can make decisions that will impact how our organizations and GLAMs use AI as it improves. To do this, organizations need to be OK with uncertainty and experimentation. And so, just to open up a conversation here with the panel: what advice do you give organizations that might not have the resources or expertise to experiment? And how do they learn from AI during this time?

I think go make friends with people who do have the resources. I'm really interested in how we... someone said that expertise has fled GLAMs. We can't afford to pay the wages that someone could get elsewhere, in London or in Leeds. We can't pay a big team either, but we can collaborate with others to learn. So we collaborated with the Alan Turing Institute. We do have people who can do small-scale experiments in-house, but I think we learn more by sharing what we've done at events like this and others, and then building on what we've done. So open-source code, reusing models if we can, talking to people with collections like yours and seeing what they're doing. So being scrappy about it, in some ways, I think, is just a virtue. And a lot of us have worked in the cultural sector for a long time. We've been through the peaks and troughs of funding. So we're kind of good at being scrappy. And it's a really, really constructive, friendly, collaborative sector. So I think go and talk to people. See what you can do together. That's probably where I'd start.

Tickets to AI4LAM in Canberra in October will be under $200, so you could go to that. There are workshops on the two days before. That was awful, apologies. The conversational archive metaphor is useful for us. It's a kind of organizing principle. But the Ask Ava pilot that I shared outcomes of was four people, one day a week, for three months. And they had other jobs, so they were doing other work. So from a scale-of-investment point of view, that's really low. And we knew that we wanted to not just read about the work and not just talk about the work. We wanted to learn, do, and talk together as a community, and then stop and design the way we were going to attack it, and then invest and go to enterprise. So we haven't gone to enterprise yet. We're in the process of doing that now. But we didn't want to start with enterprise uneducated. We wanted to meet peers and discuss with people. And we wanted to try some things out that were contextually relevant to our collection and our audience, and then set out on a kind of fundable path. And I think that model works. That's an extensible model for other organizations.

Simon, do you have any thoughts on that? Yeah, I mean, I love the model of making friends with people who are really curious and really into it and passionate. Because those friends are usually ones that attract other people with some other skills. So yeah, I used to start up hackerspaces, and those were really great places just to meet people who all had passion. Maybe each of them individually didn't know how to do things.
But together, they could do amazing things. Thank you.

One of the questions we had from the audience is: how do you balance the opportunities that AI tools are providing with what we heard about the ecological impacts and the problematic training of these tools? Keir, I think you were trying to solve some of those problematic training issues with what you guys are doing. Do you want to talk about that?

Yeah, I think that's a kind of core question. So one of the things we found early on was that deferring to kind of American models and American metaphors didn't work for Australian faces. I kind of quipped internally that Anh Do is probably on TV in Australia more than anyone else. And yet, because he's not a white American face, the face recognition models that we got out of the American companies would never find him. And so they don't suit our context. They don't suit our collections. So make kind of intentional choices about which tools you use and how to invest. There is no uncompromised way to do this work. You need to kind of choose your compromises. And there's a term for that which one of my colleagues, Grant, came up with: when is good enough better than nothing? And I thought it was a really kind of useful metaphor. Sometimes good enough is better than nothing. And sometimes actually nothing is better than the work you're going to do, and you need to make that choice. When I was in Europe for a series of meetings and workshops over multiple weeks, I was in a session with some people who I thought were the best at this. And I said, well, we've been saying internally, when is good enough? And it was a chorus. It was four people in the audience who all went, better than nothing. And so it turns out that they were saying that internally, too. And they'd made some decisions that good enough was in the relationships and the connections, but good enough wasn't good enough when it was describing something in the collection. So the machine made descriptions, but they were too biased, so they never used them. But the machine made connections that they weren't seeing, that were really valuable and the public loved, and so they did use them. So you need to find your good enough, and you need to acknowledge that sometimes you won't do a thing, because it's not the right thing to do.

With the "when is good enough better than nothing", I think one other thing is to not try one thing and stop and be like, oh, that's what this technology does. Instead, there are so many tools being built so quickly that if you're not happy with it, keep trying, do more research. So I guess don't use that "something is better than nothing" as an excuse to be lazy.

I think there is a real tension between decarbonizing, minimizing water usage, and exploring AI. Of the two strategies that we're writing at the moment at the British Library, one is our green strategy and the other is our AI strategy. And we're trying to do it together and mindfully. And we're having conversations about AI strategy, and we're also doing carbon literacy training. And it means that you do have to be mindful, but also put it into perspective. So we're also decarbonizing our buildings, thinking about where electricity comes from. There's lots of different ways that you can think about it across the organization. And each stupid image that you create or each query that you run on ChatGPT has a carbon footprint. So does everything you run in Google, every email that you send.
So I think training local models where you can, adapting other people's models, thinking about what the environmentally expensive operations are and minimizing those, and reusing what you can. I guess it's like everything else: reduce, reuse, recycle, if you can. But they are very definitely in tension. And I think, just on the point of what's good enough, it changes all the time. You can't write things off, because they will be better in about a month's time. And it means that if you're used to thinking monolithically about your architecture, you really have to think about what you can swap in and out. But that also means that as you try different things, you can move your systems away from the most environmentally expensive part. So it might be going back to different kinds of storage, and cloud storage or whatever, so that you are consciously planning how to minimize the footprint as you do everything.

Can I just say something? I only learned this recently, so it's going to sound like a revelation in my voice, but it may be something you already know. Some of the origins of conservation were paintings put underground during wars and coming out looking great. And so we've built our conservation models in many cases around European caves, so temperature and humidity. Salt mines. Salt mines. Salt mines are a great one. Salt mines are a great place to keep your art. We've ported that over to, in our case, 150 years of filmmaking in Australia, or 120 years of filmmaking in Australia, including the first ever feature film on Earth, which was made in Australia. And most of that's lost because it wasn't kept. It wasn't treated like culture. It now is. But maybe one degree less cool is the same carbon footprint as everything we've done in terms of generative AI experimentation, maybe even a quarter of a degree. So I think it's a whole-of-business approach we need to take to thinking about the impact that we're making on the climate that is in crisis.

Question: Keir, can you explain how the NFSA subverts the SaaS model of AI services by running everything on its own servers and not training data sets in the process? I guess this is the speaking Australian. So I don't even know if I should say this. So a thing happened earlier today. Eric said something that really stuck with me, that was just a kind of brilliant encapsulation of some of the things that are wrong. And we have this kind of transparency approach to the work we're doing around AI. So we've got an open thread where we just post things. Curators are in it. Conservators are in it. Preservation staff are in it. Technologists are in it. Project managers are in it. The finance people are in it. So that when you hear something about this work, you do it in public. All of the prototypes we did happened in public, et cetera, et cetera. So I just posted to some of the more interested people on that thread the kind of brain bubble that I had. And just then, when I sat down, my phone had blown up, because they had built it. So they'd built it between 11 and 2.30. And the reason they could build it is because we've taken a we'll-own-the-infrastructure approach. I won't go into what the product is, but they built a product in five hours, because we invested intentionally in knowing our own work, hiring our own people, building it on our premises, owning our models, and not paying fees for every transaction to American technology companies, running in their clouds that they get to charge back at us. So it's not just a cost thing and not just a moral thing.
It's also an efficiency and responsiveness thing. You can build something in a day if you've got the right infrastructure that allows experimentation to happen internally. And so that's just like a, oh, f*ck. Yay, I suppose.

Mia, this question was written in a kind of Australian-centric way, but I think it can apply to basically anyone whose data is not American, because these models are so American, right? And it's clear, you talked about that. But regulation can prevent the training, right? And by doing that, if your country is really strict on that, you could prevent that country's voice from being heard there.

So in the UK, despite Brexit, we still have the EU regulations around the right to access being the right to mine. So if you have a license to access content or you've bought a copy, then you can text and data mine it. I think it gets really tricky, because I think you and the American Library Association as well have come out and said that people should be able to mine content. But the second that you start to generate new content, potentially affecting creators and destroying their ability to earn an income and to create art or to create anything, then you have to be really, really careful about it. But also, it depends on your legal status. We're a legal deposit library, so we have the right to access copyright material that is sent to us. We can't do things with it off premises, but we can do things on premises. So I think it's about understanding your copyright regime, how you respect creators, but also how you allow for that experimentation.

Well, we're running out of time. There are wonderful questions on Slido, so I would encourage all of you that have questions to find Mia and Keir and myself throughout the rest of the conference, and Simon as well. Simon's contact information, he posted that earlier. We can get that to you as well. So please do not hesitate, and let's continue that conversation.