irResponsible AI

🤯 Harms in the Algorithm's Afterlife: how to address them | irResponsible AI EP1S01

June 03, 2024 | Upol Ehsan, Shea Brown | Season 1, Episode 1

Got questions or comments or topics you want us to cover? Text us!

In this episode of irResponsible AI, Upol & Shea bring the heat to three topics--
🚨 Algorithmic Imprints: harms from zombie algorithms with an example of the LAION dataset
🚨 The FTC vs. Rite Aid Scandal and how it could have been avoided
🚨 NIST's Trustworthy AI Institute and the future of AI regulation

You’ll also learn:
🔥 why AI is a tricky design material and how it impacts Generative AI and LLMs
🔥 how AI has a "developer savior" complex and how to solve it

What can you do?
🎯 You have no idea how much it will annoy the wrong people if this series goes viral. So help the algorithm do the work for you! 

🎙️Who are your hosts and why should you even bother to listen? 
Upol Ehsan makes AI systems explainable and responsible so that people who aren’t at the table don’t end up on the menu. He is currently at Georgia Tech and had past lives at {Google, IBM, Microsoft} Research. His work pioneered the field of Human-centered Explainable AI. 

Shea Brown is an astrophysicist turned AI auditor, working to ensure companies protect ordinary people from the dangers of AI. He’s the Founder and CEO of BABL AI, an AI auditing firm.

All opinions expressed here are strictly the hosts’ personal opinions and do not represent their employers' perspectives. 

Follow us for more Responsible AI:
Upol: https://twitter.com/UpolEhsan 
Shea: https://www.linkedin.com/in/shea-brown-26050465/ 

#ResponsibleAI #ExplainableAI #podcasts #aiethics 

Chapters:
00:00 - What is this series about?
01:34 - Personal Updates from Upol & Shea
04:35 - Algorithmic Imprint: How dead algorithms can still hurt people
06:47 - A recent example of the Imprint: LAION Dataset Scandal
11:09 - How can we create imprint-aware algorithm design guidelines?
11:53 - FTC vs Rite Aid Scandal: Biased Facial Recognition
15:48 - Hilarious mistakes: Chatbot selling a car for $1 
18:14 - How could Rite Aid have prevented this scandal?
21:28 - What's the NIST Trustworthy AI Institute?
25:03 - Shea's wish list for the NIST working group?
27:57 - How AI is different as a design material
30:08 - AI has a developer savior complex
32:29 - You can move fast and break things that you can't fix
32:40 - Audience Requests and Announcements

Upol Ehsan (00:04.561)
Hello, hello, hello. Welcome to irResponsible AI, a series where you will find out how not to end up in the New York Times headlines for all the wrong reasons. What better way to learn how to do things than by knowing what not to do? In this series, we give it to you straight, because that's what we believe you deserve. You will hear us laugh, banter, and even debate. What you see is what you get.

My name is Upol, and I make AI systems responsible and explainable so that people who are not at the table do not end up on the menu. I work at Georgia Tech and am affiliated with many institutes, including the Data & Society Research Institute. What I say here are purely my own opinions and not those of the institutes I'm affiliated with. I'm also joined by my friend.

Shea Brown (BABL AI) (00:52.522)
I'm Shea Brown. I'm an astrophysicist turned AI auditor, and I'm working to ensure that companies are doing their best to protect ordinary people from the dangers of AI. I'm also the founder and CEO of BABL AI, an AI and algorithmic auditing firm. But like Upol, I'm just here representing myself.

Upol Ehsan (01:13.129)
One request before we begin, please do not hit a like or subscribe just yet. Take a moment to listen to this piece, and if and only if you find value, consider pressing that button. You have no idea how much it will annoy the wrong people if this series goes mainstream or viral. So help the algorithm do the work for you. With that said, we'll get started. So how's it going, Shea?

How have you been?

Shea Brown (BABL AI) (01:44.57)
Things are good, aside from the weather; there's a big storm here in the Midwest. Things are going well. Yeah, this is an exciting time. I'll share a tiny bit of personal news. I'm a professor of astrophysics, as you know, though I don't know if anybody else knows, but I'm also the CEO of this company, and I've just, well, I gave my notice, basically just two weeks ago.

Upol Ehsan (02:08.945)
Wow.

Shea Brown (BABL AI) (02:11.434)
I gave my notice that this semester is the last semester. And so responsible AI and algorithmic auditing is going to be my full-time job starting in June. So that's the news for me. How about you? Well, it's been almost 12 years. It's been a long time. I've been teaching here for almost 12 years. And yeah, so it's a big...

Upol Ehsan (02:26.149)
How long has it been? Like at the university?

Shea Brown (BABL AI) (02:41.558)
transition. I started BABL in 2018, and as a lot of people who have this sort of side hustle gig know, it was taking increasingly more time, and now we have a fairly big team and there's just no way that I can get away with having two jobs, basically. But it's exciting, it's good news. Yeah, no, I have a permanent job. But I think that this is urgent.

Upol Ehsan (03:01.561)
But you're tenured, right?

Upol Ehsan (03:06.229)
Wow.

Shea Brown (BABL AI) (03:10.41)
You know, the work that we're doing, and I mean, everything we're gonna be talking about on this podcast is really urgently needed. And I think I feel this sense of urgency that if I don't go all in now, I'm gonna regret it later that I have this opportunity now to sort of steer a little bit in my little way the course that...

that we take as a society in terms of mitigating these risks. But in any case, that's a bit heavy, but I wanted to just sort of let you know that live here. But how about you? How are things going with you?

Upol Ehsan (03:45.841)
Things are great. I feel like since the last time we spoke, I have aged like 40 years, because AI moves that fast in this world. And, you know, things are great. I actually want to share something personal: I'm also on the tenure-track job market. So that has been, and I'm sure, as someone who has gone through that process, you know, the applications continue to get more involved. I'm very grateful that

somehow I'm on the other side of the applications and have a few interviews lined up. So that's been quite interesting. And to your point, I think the work that we end up doing and the kind of responsible AI landscape is so volatile and vibrant and rich, kind of everything everywhere all at the same time, in both good and bad ways. So I'm really looking forward to digging in. So what do you think we should start with today?

Shea Brown (BABL AI) (04:39.912)
Yeah.

Well, let's start with a little bit of your research because I feel like I know that you sort of pioneered this idea of the algorithmic imprint. And that's something that's totally fascinating to me. And I wanna know kind of, are there any examples out there?

of sort of cases, things that are happening now where you think that this notion of an algorithmic imprint is really apropos, like relevant. So maybe you can kind of maybe explain briefly what it is and then if you have any examples that would be, I'd be interested to hear this.

Upol Ehsan (05:18.929)
No, thank you. I think it's funny you say that, because recently I've had a few organizations reach out, because we are starting to wake up to the fact that algorithms don't live forever. They die, or they get decommissioned, or the company runs out of money. So the question becomes: what happens when algorithms die, right? And this is an area of algorithmic impact assessments that I don't think...

anyone has ever looked at, especially when algorithms cause harm during the algorithm's lifetime. How do we deal with them? And the general assumption, right, is that, okay, if we stop using it, that's it. The harms are gone. And what my work on the algorithmic imprint shows is no, that is not the case. In fact, harms in the algorithm's afterlife can be more pernicious and harder to detect.

than those during it, because no one is looking at it at that point, right? So basically, algorithms have afterlives and imprints exist. And sometimes the imprints are harder to detect once things are dead. But then the question becomes, what do we do, right? When you do algorithmic auditing, like, I don't know of any current mechanism that accounts for things in the afterlife. The document literally stops

when the algorithm is decommissioned, if that, right? So there has actually been one example, I don't know if you've seen it: the LAION dataset. It's a massive open dataset that powers a lot of the AI infrastructure. And unfortunately, they have found child sexual abuse material in it years after the dataset was released. So to the creators' credit,

they have basically said, we're going to take it down, we're going to do another audit and make sure that things are as safe as possible before we put it back up. And therein lies the question, right? Just because you have taken it down, does that mean the harms are gone? What about all the systems that this dataset has actually trained? What about all the other things that live on, right? Even in its death.

Upol Ehsan (07:48.225)
So that's one quick example of the imprint. I'm curious to hear your thoughts on it. Like if you have any reflections on it from an auditing perspective as well.

Shea Brown (BABL AI) (07:57.586)
Yeah, I think it's a fascinating example, and it's really relevant given that this is such a common occurrence, that there's this sort of foundational dataset that has a lot of junk in it. I mean, we can draw parallels to large language models now, like Common Crawl and all of the junk that gets swept up in that. And then even if you fix that foundational dataset, you have...

algorithms which have been trained on it. And this is relevant, and I think we'll come back to the FTC later, but there's this idea of disgorgement of algorithms which were trained on datasets that either were inappropriate or that they shouldn't have had in the first place. And I think that starting to think about the mechanism for how you fix this might involve something along the lines of this sort of

disgorgement of algorithms. You have to go back and say, what was trained on this, to the extent possible, and how do we either get rid of those algorithms or extract that dangerous material? But it's a tough problem, especially with open source. These things just live on and they get recycled and recycled. How do we hunt that down? Data provenance is not something that

machine learning engineers have necessarily had at the very forefront of their minds when they're developing these things. I mean, what's the solution here? Do you have any proposals for how this gets managed?

Upol Ehsan (09:40.305)
I think, I don't know if I have a solution, but I think we have to start with acknowledging the fact that these things have afterlives. I think that's the basic thing, even now. Like, imagine this, right? Nowadays, thankfully, due to the brilliant work of people like Timnit Gebru and Meg Mitchell and others, datasheets for datasets are a thing, model cards are a thing, right? And then system cards. So these are becoming a prevalent part

of the colloquial language around algorithmic deployments. I would love to get to a point where the algorithm's afterlife is just a given. And I think that's a starting point. I don't know how this story ends, but I know where it begins: we have to start acknowledging that there are afterlives. We have to have plans for assessments in the afterlife.

I don't know the answers to questions such as: how are harms addressed in the afterlife? Who do we hold accountable in the afterlife? How do we even mitigate it? Who provides the resources to mitigate this? Because the company can come back and say, we no longer deploy it, so we don't own it anymore, right? And it's not even, quote unquote, in existence. But there is also a very interesting aspect: all we have talked about right now are reactive things,

things that we are reacting to. So the algorithm is dead; how do we do X and Y? What I'm actually currently working on is how we might develop imprint-aware guidelines, algorithm design guidelines. So could we empower developers and data scientists with imprint-aware algorithm design principles such that...

when they do deploy it, the imprint is as minimal as possible or is as, quote unquote, biodegradable as possible, right? The metaphor of biodegradability, on the algorithmic side, is almost like imprint degradability. So that's what I've been working on lately: how do we make this more proactive? But this actually reminds me of the other topic

Upol Ehsan (12:02.445)
that you wanted to raise, which you briefly alluded to, which is the FTC versus Rite Aid, the recent news that happened. Do you want to share a little bit about what this is about? And then maybe we can take a deeper dive into it.

Shea Brown (BABL AI) (12:07.976)
out.

Shea Brown (BABL AI) (12:15.45)
Yeah, so I mean, as probably a lot of people who are interested in watching this would know, the FTC has been really proactive about coming up with guidelines and enforcement actions as it relates to AI and automated decision systems, which is something that

I mean, we haven't been taking very much action legally in terms of coming up with new laws. And so regulators like the FTC are filling in that gap. And some would say they're just doing their job, because they're supposed to enforce current laws. But this is a case where the FTC basically settled with Rite Aid under, I think, Section 5 of the FTC Act, or something like this. But

basically, Rite Aid was using facial recognition technology in their stores in an automated way that would flag people who were on a kind of suspect list, and it would lead to action: they'd notify employees in the stores, and those employees would take various kinds of actions, some of which would include in-person surveillance, you know, following that person in the store, making sure they're not stealing things.

Upol Ehsan (13:04.898)
Holy crap.

Shea Brown (BABL AI) (13:28.53)
And as we know, these kinds of technologies are sort of rife with problems. And I mean, not necessarily inherently, but most systems that have been tested tend to perform less well on people of color or people with darker skin, and less well with women. And I think what the FTC said is basically: you, Rite Aid,

have not taken any steps to mitigate that risk, and that has led to harm to people in your stores. And so they got banned from using facial recognition in their stores for five years. Also, when they are allowed to use this again, they have to erase the data that they got. That was another instance of disgorgement, in this case of data.

And when they do eventually start using it again, there's a whole number of requirements, including levels of transparency and things like risk assessments, you know, the usual stuff, usual to us, but not usual to ordinary corporate America. And these risk assessments are not the kind of risk assessments that a typical enterprise risk management person would do. It's more outwardly focused,

more akin to a human rights impact assessment. And so this is exciting. This is really one of the first, I mean, according to FTC, it's the first action taken directly against automated decision systems like this, and with the idea of fairness in mind. And so it's very exciting. And I think that the exciting thing is they're not waiting around for some larger legislation to happen.

Upol Ehsan (15:10.769)
Mmm.

Shea Brown (BABL AI) (15:23.878)
They are saying, we have a duty, and they're going to exercise all of their consumer protection duties, including this idea of fairness, which is something that they haven't really used. They used it once, maybe last year, in an automotive case: they took action against a used car dealership where it was a fairness issue, something that normally the EEOC might handle. So anyway, it's exciting. They're really taking action now and they're not waiting around for regulations.

Upol Ehsan (15:39.973)
Mm-hmm.

Upol Ehsan (15:52.165)
I want to come back to this, but the used car dealership just reminded me of the other news, of how someone basically prompt-hacked a chatbot and got it to sell them a Chevy SUV for one buck. If the viewers know what I'm talking about, they can drop a link in the comments; it was hilarious. Someone said, like, you know, no backsies, you know,

Shea Brown (BABL AI) (16:08.867)
Oh yes.

Upol Ehsan (16:17.817)
this is fine, you know, no backsies, this is final and firm, right? And then the chatbot is like, yeah, this is final, fine. And they say something along the lines of, you know, one buck, this is my final and best offer. And the chatbot was like, yes. Talk about irresponsible use of technology. And I think one of the things that I always want to highlight is that irresponsible use of technology does not always have to have malice

or intentional malice at the beginning of it. People might have very well-intentioned leanings before they deploy a piece of technology. I don't know if Rite Aid intentionally decided to discriminate against women and especially people of color; I'm pretty sure they did not. But irresponsibility happens not just when you are not mindful of these kinds of cases that can go wrong, but also when you deploy technology

thinking that, oh, once we are done, we are done, deployment is done, instead of having a very iterative mindset where we do these checkpoints, where we do these assessments throughout the lifetime of the technology, not just at the deployment stage or pre-deployment, but throughout, so that we get a temperature check: how well is it going? Right? And your point made me think: where would the imprint of this technology be? I'm so glad that they asked for the data to be deleted.

Shea Brown (BABL AI) (17:36.386)
Yep.

Upol Ehsan (17:46.709)
Because it wasn't clear initially whether they would use this data to train something else, right, something that is not facial recognition. And you're right, this is a first step. I think the law and the legal framework is very much behind all the other things that we need to do. But the other part that I was curious to hear about from you is: imagine BABL was hired, right, to help

any of these companies. What would BABL do that might have prevented, if anything, these kinds of outcomes? And it doesn't have to be BABL; it could be any company. Because in my head, I'm thinking: what can we learn from these irresponsible use cases so that we can be responsible moving forward?

Shea Brown (BABL AI) (18:28.031)
Yeah.

Shea Brown (BABL AI) (18:36.37)
Yeah, well, that's a good question. And, well, I can't say anything more, but it's actually very relevant for some of the work we're doing currently. At BABL, as part of our ethos, ethical risk assessment is at the core of everything we do. And we drill it into everybody in the company

that you never make any kind of technical decision or do anything technical, period, until you've really dug deep into the socio-technical risks that are involved. So for a company like Rite Aid, had they asked us, we would have done an ethical risk assessment ahead of time. Who are the stakeholders?

Who's interacting with this algorithm? Who's getting that message? The employee, what kind of training do those employees have? What understanding do they have of the limitations of the AI? And...

think about the technical aspects of that system that would potentially give rise to the risk of discrimination. So we know and we've known for a long time that facial analysis systems in general, whether it be detection or recognition, have these intrinsic problems and so test for that, figure out what thresholds are going to be problematic. If there is bias that you can't overcome, is it still worth it to deploy it?

And what kind of safeguards do you need to mitigate that risk of bias, whether it be through training or some other kind of mitigation? And I think had they gone through that, had Rite Aid gone through that process, and I can't speak to what they actually did or didn't do, because I didn't read the full documentation, and I'm not sure if it's even public what they gave the FTC, but had they done that, I believe the FTC would have

Shea Brown (BABL AI) (20:36.218)
seen that they had made a good-faith effort to mitigate, because it's a real issue: people stealing things is a big problem in retail, a lot of theft happens, and so it's not unreasonable to think of ways to mitigate that. But had they thought about all of the risks ahead of time,

they could have demonstrated, and not just demonstrated for the purposes of showing a regulator something, but actually thought carefully about, how stakeholders could be impacted, in particular marginalized communities. That would have gone a long way, in my opinion, and my guess is that the FTC would have actually said, okay, they really considered things, let's just have them fix a few things and not have this five-year ban. And I mean, it was a really pretty severe punishment

given, and it probably would have been a lot less had they actually considered these things.

Upol Ehsan (21:31.041)
Yeah. The other day when we were chatting, you mentioned something about NIST and the US AI Safety Institute and all of that. If you want to talk a little bit about that, I'd be very curious to learn a bit more.

Shea Brown (BABL AI) (21:42.718)
Yeah, I think so. You've probably seen the executive order that Biden signed a while back. It was a big deal. I mean, it still is a big deal. And it essentially had a lot of requirements for federal agencies to start implementing AI to begin with. Like a big part of the order is like, get the federal agencies to start using AI.

to save money and to provide efficiencies and just to upskill people. But then another big aspect, of course, was the risk. And when they say AI, what they really mean is generative AI and large language models. That's really what they meant. Unfortunately, it's sort of overtaken the vocabulary. So

NIST was tasked with coming up with sort of a measurement science for these large language models, because right now, as you know, we're really shooting in the dark trying to figure out how to test these things. How do you ensure that you minimize risk? You test for biases, you test for security vulnerabilities, there are jailbreak issues that come up. There's so much; these are such complex models

Upol Ehsan (22:51.302)
Mmm.

Shea Brown (BABL AI) (23:04.278)
that we're struggling to figure out how do we come up with benchmarks. And there are plenty of benchmarks out there, but not all of them are relevant for every use case. And so NIST is supposed to be coming up with some sort of guidelines for how you do this measurement and including things like red teaming, which is a hot topic, this sort of adversarial testing.

And so this consortium, this AI Safety Institute, was part of the executive order: a US one, and there's an equivalent one in the UK that has been formed. And part of that institute, which is run by NIST, is that there's going to be a consortium of industry, nonprofit, and academic people who are coming in to help, because that's where all the expertise is, basically. And so...

Upol Ehsan (23:49.829)
Mm-hmm.

Shea Brown (BABL AI) (23:54.998)
That has now been formed, and I think it's being finalized as we speak. You know, BABL is part of it now, and we've signed, I forget what it's called, but it's some fancy name, a memorandum of some sort, you know, saying that we're in, yeah, something along those lines. I think it wasn't quite so simple; you know, if it's federal, it's got to have a lot of jargon. But the whole point is that there will be working groups trying to figure out how we do these measurements.

Upol Ehsan (24:08.473)
memorandum of understanding? Okay, okay.

Upol Ehsan (24:26.437)
And I was chatting with one of my favorite people at NIST, Reva Schwartz, and had a brilliant chat with her. And I feel like many people don't realize the essential purpose that NIST serves in the AI ecosystem. Or at least they're starting to realize the enormous kind of purpose that it's serving, right? Not only is it

becoming kind of like the platform on which many of us can start having cross-disciplinary conversations, right, but also advocacy in terms of making sure things are done in the right way. I was wondering what your thoughts are: having been in these kinds of working groups and consortiums, what do you think could be an actionable

thing that we can expect, something tangible, something with enough meat in it to bite into, rather than reports on reports on reports? Something that we can expect maybe in the next six to 12 months? This is crystal ball gazing, but I'm curious to hear maybe your best wishes.

Shea Brown (BABL AI) (25:43.142)
Yeah, well if I had my wish list, so there's what I expect might happen and then there's sort of my wish list. I'll go with my wish list. My wish list is that they are going to abandon this notion, maybe not totally abandon it, but put aside or lower the importance of this sort of very general testing of these large language models. Because...

There was a working group that's still working now talking about pre-deployment testing and things. It's a NIST working group and has thousands of members on a Slack channel and everybody's talking. It's great. There's really great conversation. But I feel like it's an uphill battle to start trying to test a large language model for any kind of use. And what we need to do is shift to this idea of...

testing for very specific use cases. And think about it more like, even though I don't want to anthropomorphize these models, it's more akin to a certification. If I'm going to have a person who's going to work as a quant, a quantitative person working for an investment firm, or if I'm going to have a banker, or if I want to have a doctor or lawyer, you know, you...

You don't test them over a huge swath of knowledge, you kind of certify them in a very narrow specific set of skills for a particular job. And I think we need to say, all right, if I'm going to have a large language model giving legal advice to lawyers who are going to interact with it in this way, then these are the ways in which I want to certify the system, the behaviors I want to make sure the system has. And I think that's a much more manageable.

way of tackling this and probably much more productive. And so my hope is that we start shifting to this really, because then you can do a risk assessment. What are the risks associated with it? Where are the ways it could fail? It just becomes much more manageable. That's my hope is that we get away from this idea of really general red teaming for just random stuff. That is important, but let's talk about certifying these systems in a really narrow specific function.

Upol Ehsan (28:01.401)
And I also want to build on that idea, because there is this argument I often make about AI as a design material. What do I mean by this? A stable design material is something whose behavior you can predict as you design with it. For example, if you are using marble or wood for carving, it's a stable material: as you chisel it, you can predict its behavior exactly. However,

if you've ever done pottery with clay, clay is a very unstable material. In fact, when you do pottery, the really masterful potters try to get a feel for how this particular piece of clay is, right? So AI, in my view, has that stochasticity. The easiest example is that you enter the same prompt into ChatGPT twice and it's not guaranteed to give you the same output twice, right? So there is this inbuilt stochasticity. Old school machine learning took care of that by fixing the seed.

We would always fix the seed. New school large language models, I don't know how to best do that in a very useful way. You could constrain the output, but again there is this built-in stochasticity. That means that we can no longer have this expectation that somehow we will proactively think of all the use cases, all the bad things that can happen and kind of exhaust it. This does not mean that you don't try your darn best to get that done as exhaustively as possible.

That is not what I'm saying. What I'm saying is that the expectation of exhausting everything is unrealistic. So what do we do? First of all, be as exhaustive as possible. Second of all, make testing part of the deployment process. Have regular testing, just like we do regular car maintenance. If you buy a car, it's not Toyota's job to think of all the ways you can goof up the car. But there is a mechanism

of going in every X months or every X kilometers or miles and getting it serviced. We don't have that service model in our technology today. You know how there's the white savior complex that we often talk about? I think there's a developer savior complex as well, right? The developer savior complex goes something like this: I, as the developer,

Upol Ehsan (30:25.389)
will be able if I'm just good enough will be able to think of all these societal issues and can do everything that I want and it's it does a disservice to developers. It is I'm not you know going against developers I feel like it puts too much burden on them. It is expecting them to be a sociologist, an anthropologist, an ethicist, god knows whatist right. But you know yes they can gain knowledge but the solution here is to actually have

Shea Brown (BABL AI) (30:40.599)
Yeah.

Upol Ehsan (30:53.449)
multiple stakeholders in the room, an anthropologist, a sociologist, a psychologist, a historian, an ethicist, and to make sure these stakeholders are first-class citizens. They are not less than the developers just because, quote unquote, they are non-technical, which is another word I hate, by the way, because everyone has technique. Their insights are at the same level of importance and value

as the coding insights or the programmatic insights that a developer has. And I'm grateful that places like NIST are trying to cater to that. And just like democracy, I feel like it's a messy process. I also get very frustrated when people say, oh, it's so messy. It doesn't move as fast. Well, the world is messy. The world is squishy. The world is very much amorphous. So sometimes we just have to be patient.

Shea Brown (BABL AI) (31:48.637)
Yeah.

Upol Ehsan (31:48.685)
So that's something that I've been thinking about. And your comment about testing really reminded me of the need for multi-stakeholder, equitable contributions. So I think we're, yeah, go ahead, sorry.

Shea Brown (BABL AI) (32:02.002)
Yeah, well, I think the exciting news is that this is going to be required by the EU AI Act. And that's probably for another podcast, but ongoing monitoring and multi-stakeholder interactions and engagements are going to be required. So, yeah.

Upol Ehsan (32:12.421)
Yes.

Shea Brown (BABL AI) (32:23.042)
We're all going to have to adjust, and "move fast and break things" doesn't jibe with this sort of notion of patience and looking at multiple perspectives.

Upol Ehsan (32:33.517)
and you can move fast and break things that you can't fix. And that's the problem. So, no, this has been such an interesting chat, Shea. I wanted to conclude with a few things for the audience, actually. We would love to have a section in the series where we answer your questions. So if you have a question, please leave it in the comments. If you have something to publicize, let's say you have a workshop coming up that you want to share with the rest of the crew.

Shea Brown (BABL AI) (32:36.947)
Mm-hmm. Yeah.

Upol Ehsan (33:02.333)
Think of all the ways we use email listservs. This is the section where we get to showcase something that you want showcased. We cannot promise that we'll be able to cover everything, but if something is relevant to our audience and to the series, we would love that. We would especially love it from people who are not core AI folks, especially from civil society and nonprofits. So please share your things so that we can support them in

whatever ways we can. Let us know what you liked. In fact, if there's something that stuck out during this show or this episode, please let us know. We would love to know what it is and why it is and if there's something else you want us to discuss, please let us know in the comments. Thank you so much!

Shea Brown (BABL AI) (33:48.29)
Thank you.

