irResponsible AI

🎯 Outsider's Guide to AI Risk Management Frameworks: NIST Generative AI | irResponsible AI EP5S01

June 04, 2024 | Season 1, Episode 5
Upol Ehsan, Shea Brown

Got questions or comments or topics you want us to cover? Text us!

In this episode we discuss AI Risk Management Frameworks (RMFs) focusing on NIST's Generative AI profile:
✅ Demystify misunderstandings about AI RMFs: what they are for, what they are not for
✅ Unpack challenges of evaluating AI frameworks 
✅ Explain why the inert knowledge in frameworks needs to be activated through processes and user-centered design to bridge the gap between theory and practice.

What can you do?
🎯 Two simple things: like and subscribe. You have no idea how much it will annoy the wrong people if this series gains traction.  

🎙️Who are your hosts and why should you even bother to listen? 
Upol Ehsan makes AI systems explainable and responsible so that people who aren’t at the table don’t end up on the menu. He is currently at Georgia Tech and had past lives at {Google, IBM, Microsoft} Research. His work pioneered the field of Human-centered Explainable AI. 

Shea Brown is an astrophysicist turned AI auditor, working to ensure companies protect ordinary people from the dangers of AI. He’s the Founder and CEO of BABL AI, an AI auditing firm.

All opinions expressed here are strictly the hosts’ personal opinions and do not represent their employers' perspectives. 

Follow us for more Responsible AI and the occasional sh*tposting:
Upol: https://twitter.com/UpolEhsan 
Shea: https://www.linkedin.com/in/shea-brown-26050465/ 

CHAPTERS:
00:00 - What will we discuss in this episode?
01:22 - What are AI Risk Management Frameworks?
03:03 - Understanding NIST's Generative AI Profile
04:00 - What's the difference between NIST's AI RMF vs GenAI Profile?
08:38 - What are other equivalent AI RMFs? 
10:00 - How do we engage with AI Risk Management Frameworks?
14:28 - Evaluating the Effectiveness of Frameworks
17:20 - Challenges of Framework Evaluation
21:05 - Evaluation Metrics are NOT always quantitative
22:32 - Frameworks are inert: they need to be activated
24:40 - The Gap of Implementing a Framework in Practice
26:45 - User-centered Design solutions to address the gap
28:36 - Consensus-based framework creation is a chaotic process
30:40 - Tip for small businesses to amplify profile in RAI
31:30 - Takeaways 


#ResponsibleAI #ExplainableAI #podcasts #aiethics

Support the Show.

What can you do?
🎯 You have no idea how much it will annoy the wrong people if this series goes viral. So help the algorithm do the work for you!

Follow us for more Responsible AI:
Upol: https://twitter.com/UpolEhsan
Shea: https://www.linkedin.com/in/shea-brown-26050465/


Upol Ehsan (00:01.256)
In this episode of Irresponsible AI, we're going to discuss the following. We'll begin by talking about something that's all the rage these days, AI risk management frameworks. There's so much confusion around them that no one knows what to do with them and more importantly, what not to do with them. So we will use a very specific example. We'll take a deeper dive with NIST's recent generative AI profile of their risk management framework or RMF to answer some of the frequently asked questions, demystify some misunderstandings,

and articulate not just what you can do with them, but more importantly, what you shouldn't do with them. If you're a new listener, we're glad you're here. If you're returning, thank you for coming back. This is Irresponsible AI, a series where you find out how not to end up on the headlines of the New York Times for all the wrong reasons. My name is Upol and I make AI systems explainable and responsible so that people who are not at the table do not end up on the menu.

Views expressed here are entirely mine and have nothing to do with any of the institutions I'm affiliated with, like Georgia Tech and Data & Society. I'm also joined by my friend and co-host,

Shea Brown (BABL AI) (01:06.594)
I'm Shea, an astrophysicist turned AI auditor, working to ensure companies are doing their best to protect ordinary people from the dangers of AI. I'm also the founder and CEO of Babl AI, an algorithmic auditing firm, but like Upol, I'm just here representing myself.

Upol Ehsan (01:23.592)
So Shea, let's start with the basics. What are AI risk management frameworks? If you could give us a quick overview of that.

Shea Brown (BABL AI) (01:31.696)
So, yeah, risk management frameworks are exactly what they sound like. You know, organizations, whether they be governments or private companies, are exposed to a lot of risks, and so they need some way to structure managing those risks. And a bunch of them have come out. I think we'll talk about them soon, but the basic idea is it's a structured way to figure out where those risks are, how do you detect them.

How do you sort of assess how significant they are? How are you going to manage those risks? And it could be anything from like reputational, financial, process risks, compliance risks. So these frameworks are out there and they are very ubiquitous and they're used by organizations of all types to manage their risk internally. And yeah, so one big one, and as you mentioned is the NIST.

Upol Ehsan (02:20.488)
Mm-hmm.

Shea Brown (BABL AI) (02:28.689)
AI risk management framework. And of course we're talking about AI, and AI has got everybody worried. And so NIST came out with a risk management framework, which was all the rage. And they came out again with this new thing quite recently, which was this GenAI risk management framework. And now we're trying to sort through it, and we're getting all sorts of questions: what is this? How is it different than what came before? How do we use it? And so I think...

the mode that we should be working in is frequently asked questions. Let's just sort of go through and try to demystify what's actually happened with these things.

Upol Ehsan (03:03.08)
Yeah, so let's start with the first frequently asked question we often get: what are the highlights? What are the key highlights of this GenAI profile that they had done?

Shea Brown (BABL AI) (03:15.218)
Yeah, so what is it? Basically, GenAI is, well, we know what it is. It's generative AI. Everybody's using it, large language models, that sort of thing. And this is a very specific set of recommendations for how to manage the risk of generative AI. And they have these things called actions,

which are organized and grouped in certain ways under different functions in an organization. And those actions are meant to be sort of recommendations: like, I'm going to measure risk in this way, or I'm going to test the system in this way, or have some sort of accountability. So at a high level, it's just sort of organizing actions underneath functions, which are going to mitigate risk. But one of the big questions that we get all the time is,

Why is that different? We heard last year that there was this AI risk management framework that NIST put out. Now what is this and how is that different?
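
To make the structure Shea describes here concrete: a minimal, hypothetical Python sketch of suggested actions grouped under the four AI RMF functions (Govern, Map, Measure, Manage). The IDs and action wording below are illustrative placeholders, not text from the NIST profile.

```python
# Illustrative sketch only: suggested actions grouped under the four NIST AI RMF
# functions. The IDs and action text are placeholders, not quotes from the profile.

genai_profile = {
    "Govern": [
        {"id": "GV-EX-1", "action": "Assign accountability for generative AI use cases"},
    ],
    "Map": [
        {"id": "MP-EX-1", "action": "Document intended uses and known failure modes (e.g., confabulation)"},
    ],
    "Measure": [
        {"id": "MS-EX-1", "action": "Test outputs for harmful or fabricated content before release"},
    ],
    "Manage": [
        {"id": "MG-EX-1", "action": "Define incident response steps for misuse or model failures"},
    ],
}

def actions_for(function: str) -> list[str]:
    """Return the placeholder action descriptions filed under one RMF function."""
    return [entry["action"] for entry in genai_profile.get(function, [])]

for function in genai_profile:
    print(f"{function}: {actions_for(function)}")
```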

Upol Ehsan (04:12.264)
That's a good question. I think people also first need to understand that oftentimes they'll hear the acronym NIST AI RMF that stands for like NIST's AI Risk Management Framework. And actually recently during a client meeting, I heard one of the people saying NIST has created a new framework. And I think that's a misunderstanding. The generative AI profile, that's why literally on the title page of it,

it says a Generative AI Profile. So it's a profile of the risk management framework. So the way people need to think about this: the AI RMF is the broader infrastructure, the broader framework. It is not particularly focused on any specific kind of AI. The Generative AI profile kind of builds on this general framework, the AI RMF, and does a profile.

It still carries the same map, manage, govern, measure kind of functions that are there in the NIST AI framework. So for people who are trying to use the generative AI profile, I think it's really important for them to first understand that it is a derivative, a building on, or an extension of the main one. It's an instance of it. Like, you know, in philosophy, Plato had the forms and the instances, right? So the AI RMF could be the form,

and the Generative AI profile is the instance. So that's the first distinction. The second distinction, in my view, is that there are certain risks covered here, like confabulations or hallucinations, IP, et cetera, that are much more in focus in the Generative AI profile than they were in the original AI RMF. So that's another aspect: there's a level of scope and specificity. The Generative AI one, by definition and by design, is more specific.

The third one, and I think this is more of a subtle distinction, Shea, I would be curious to hear your thoughts on this as well, is how these two things were created. The AI RMF had, I think, a much broader and much more consensus-based process of creation, so the original granddaddy of it, versus the generative AI profile, which seems like it was made by a group of industry, civil society, and academia and research,

Upol Ehsan (06:33.672)
where they got together and this got done quicker than the first one and also had a very different kind of a makeup. I was curious if you could shed a little bit of light on like how the construction of it might have been also a bit different.

Shea Brown (BABL AI) (06:46.963)
Yeah, I think the first one, the broader framework, was a bigger deal in the sense that they had a little bit of time to build consensus and they had multiple rounds of requests for comments. And that took a long time, and it finally got put out, and it is a big deal and everybody talked about it, right? Because it was high level. This one,

Upol Ehsan (06:53.928)
Mm-hmm. Mm-hmm.

Shea Brown (BABL AI) (07:16.595)
I think, was pushed by the urgency around ChatGPT, large language models getting released into the wild. People started to panic, people meaning policy makers started to panic, and they figure, you know, they look to NIST and say, NIST, give us some guidance here. And so they essentially assembled, quite quickly, a very large working group, it's still pretty large, and they put them all in one place, basically on Slack,

Upol Ehsan (07:42.088)
Mm-hmm. Mm-hmm.

Shea Brown (BABL AI) (07:43.058)
and said, let's talk this through and figure out what our guidelines are. And so that was chaotic, and I was in that working group, contributed very little, mostly because there were just so many voices in there. And they came up with these guidelines and they took the structure, as you said, of the different functions, govern, map, measure and manage, and then said, how does that look different? How do we understand those in terms of generative AI?

Upol Ehsan (07:46.728)
Mm-hmm. Mm-hmm.

Upol Ehsan (08:04.392)
Mm-hmm.

Shea Brown (BABL AI) (08:12.242)
And that was, I think, the urgency. I think that's simply it. However, we're still in the process; they're accepting comments for this profile that they just released. And so there is a chance for people who are listening now, depending on when you're listening, there's a chance to submit comments. It is something that you could comment on. And so, yeah.

Upol Ehsan (08:12.328)
Hmm.

Upol Ehsan (08:37.608)
question. So this is actually coming from that. So we have been talking about NIST. I'm curious, do you think there are any other equivalent frameworks out there other than NIST? Because I don't want to make this a NIST-centered thing. AI risk management frameworks are their own beast; there are many kinds of places going for them. Curious to hear your thoughts on other equivalent frameworks.

Shea Brown (BABL AI) (09:00.404)
Yeah, good question. So there are other equivalent broad frameworks for risk management, and in particular ones that focus on AI. ISO has a number of frameworks, different certifications and standards around AI. ISO 42001, for instance, is what's called an AI management system standard. And they also have a specific document for risk management of AI. And it's different than NIST. There's a lot of overlap.

But ISO is very rigid and kind of structured. Like, if you want to get ISO certification, you have to do everything in there. And there are other general risk management frameworks: COSO, for example, has an enterprise risk management framework, but it doesn't have a focus on AI. But in terms of generative AI, this particular profile is sort of one of a kind at the moment, because there hasn't really been that kind of attention on generative AI

in particular. One thing I wanted to ask you, I'm really curious about. I think a lot of people ask us, how do we engage? I have literally had clients who've been like, hey, I see that you're in this consortium or you're in this working group. We want to engage with NIST. We'd like to work through this. How are you supposed to engage? There are two questions: one specifically about NIST or a particular framework, and then, a lot more generally, how do you engage with these frameworks, period?

So do you have any thoughts on like what's the general approach here?

Upol Ehsan (10:29.416)
Mm -hmm.

Upol Ehsan (10:33.448)
I think this has been a humbling journey for me as an academic, as a researcher, on how to engage with policy. So I'll share my perspectives not as an expert, but as someone who is still figuring this out. The broader thing is: it's not easy to engage. It is not intuitive, at least from an outsider's perspective. And I am not an insider, right? So this notion of public comments was new to me. I didn't understand what it actually meant. Like, you know, what is a public comment?

And it turns out that there is a norm and a nomenclature to how you send a public comment. So if people want to send a public comment, my best recommendation would be, as a starting point, to work with someone who has done it before. And you'll learn the ropes of how to structure it, because again, a public comment is literally that: it will be public, with your name, your institution's name. One good point

that recently Janet Haven, who is the executive director of Data & Society, shared with us during a keynote at my workshop, the Human-Centered Explainable AI workshop at ACM CHI, was that researchers, especially independent researchers, have a very interesting, powerful way of engaging with these frameworks, because they can make a public comment that is just them. They don't have to worry about Shea Brown representing Babl.

They don't have to worry about Janet Haven representing Data & Society, or X and Y. You could make a comment that is individual, and that gives you power in the sense that you could say something very upfront and directly that may not be possible if you are also giving the comment on behalf of an organization. So that's a very interesting nuance there. So that is on the public commenting side. The other one: credit to NIST here, I will say.

And a lot of people give them a lot of crap for this; I am not one of them. I really do think, to your point, Shea, that the fact that they got this generative AI profile done in a timely manner matters. People often don't understand how long these things take if you do them in the traditional way they have been done. So I also was part of that Slack working group and barely contributed, to be honest with you,

Upol Ehsan (12:58.28)
because there were so many voices. But credit to them that this is out now, and it's the first version, not the final version. So part of the engagement is also that we have a chance right now to kind of morph it and contour it with our inputs. The other notion of engagement is actually using it. So that's the other part. You could use it in two different ways.

I have seen my clients often want to use it in a checkbox-ticking way, which I don't really appreciate. Oftentimes it's not necessarily for compliance, because this is not a compliance thing, but more of a "did we do this?" kind of way. And that really doesn't engage with the spirit of the requirement or the spirit of the framework. And I think that's a massive distinction; as an organization, we need to think about that when we're trying to engage with the framework.

What is the spirit behind it? Do we just want to write a line on our website saying that, hey, we applied the NIST, you know, whatever framework, right? Or the ISO. Or are we really trying to live up to the spirit of it? So the second part of my engagement comment is that when we are engaging with it, or when we are using it, are we using it to just do kind of an ethics-washing or compliance-washing or compliance-theater kind of thing? Or are we really trying to engage

with the spirit behind it. So, but throwing back to you, another question that we always get is how do we evaluate the effectiveness of a framework that is kind of this broad?

Shea Brown (BABL AI) (14:27.608)
Yeah.

Shea Brown (BABL AI) (14:41.782)
Yeah, this is really difficult. This is one of those questions that I think is at the forefront or should be at the forefront of research in AI governance and responsible AI, because we don't really have a lot of experience really assessing what kind of controls, because these are all kind of organizational controls at the end of the day. These frameworks are recommending controls that organizations put into place to mitigate risk. And

There is not a ton of research yet, because of how new all of this is, at what the effective things are. And we, we meaning Babl, did a research project that was funded by the Tech Ethics Lab, called The Current State of AI Governance. And this was a year and a half ago now. And we went around and talked to organizations, figuring out: what are you doing around responsible AI? How are you measuring things? How are you assessing how effective it is? And that was the big gap that we saw,

and I don't think things have changed since then, people don't really measure the effectiveness. If I require a risk assessment or if I require this kind of testing, how effective is that at mitigating the risk? And so the short answer here is that there aren't really a lot of metrics and it's something that requires a lot of work. But it's an opportunity, I think, for organizations who are interested in, to your point,

following the spirit of these standards, to think hard about: what am I going to implement out of these things? What resonates with my organization? Because none of them are mandatory at the moment. But think about what will resonate in my organization, try to implement it in good faith, and then think about how do I measure the effectiveness of that? That's something the tech industry is good at in other areas, but here it's difficult to do. And I think...

Yeah, I don't know what your thoughts are on that. I'd like to hear your reflections.

Upol Ehsan (16:41.608)
Evaluation is a hard point. And one thing I hope the audience can take away from this is: just because any organization or anyone puts out a risk management framework, unless they have given us a study or something that validates it, we do not know how well it works. And I want to make a caveat here. What do we mean by a study? Framework evaluation is very different than, let's say, usability testing,

and one should not conflate one with the other. What gives me the authority to say this? Because I've made frameworks. I have made frameworks that evaluate the socio-technical gap in explainable AI. And a framework is a very difficult thing to evaluate. But you can do this when you do case studies with it. And once you have enough case studies of enough variety, you can reasonably make an argument. But what is a framework? Let's take a step back.

A framework is something that helps you think. That is what it is. A framework is nothing but an analytic tool. It is not supposed to be a solution. It is something that helps you think through a problem in a systematic, hopefully a systematic way. So how do you evaluate? How well does this thing help you think through a problem? Well, you give it a problem. And once you give it a problem, you can run through enough case studies of enough variety to then start seeing, okay, is this going well or not?

So again, no matter who comes up, even if it's me coming up with a framework, we should always before using it ask, has there been any kind of evaluation of its effectiveness? We do not necessarily validate a framework. That term is meaningless. This is not a positivistic investigation where you just validate something or not. That's not true. And again,

This is another critique that I often hear about frameworks: it's too broad. So I want to ask, what's the alternative? Being too specific and then not really being applicable?

Shea Brown (BABL AI) (18:46.133)
Yeah.

Shea Brown (BABL AI) (18:52.343)
Yeah, well, I'm glad you stepped back and talked about frameworks, because, you know, a lot of what you see when people publish, especially around responsible AI, they say, well, here's a framework. There are conceptual frameworks, right? A conceptual framework is: I have a bunch of concepts like fairness, transparency, explainability, privacy, and I'm going to have some framework which will relate those concepts to each other.

Upol Ehsan (19:09.416)
Mm-hmm.

Shea Brown (BABL AI) (19:20.536)
Either some hierarchy or something so that I can make better sense of those concepts. And then there are typically frameworks like this NIST one, which is a bunch of guidelines for things to do. Now, those are prescriptive, just normative things, like you should do this or that. But there's a type of framework in between, which is a procedural framework, which is supposed to tell you how...

Upol Ehsan (19:35.688)
Mm.

Upol Ehsan (19:46.728)
Yes.

Shea Brown (BABL AI) (19:50.167)
you would operationalize those concepts. And at the end, you know, it will turn into actions that you actually have to do. But then how do you procedurally connect all of those together in a way that's systematic and able to be applied to a lot of things? That's, in my mind anyway, the missing piece, and a lot of the research that we do at Babl focuses on that. Like, how do you connect those concepts

Upol Ehsan (20:14.856)
Mm-hmm.

Shea Brown (BABL AI) (20:18.072)
and those sort of individual actions and commitments you have, in a way that's going to be repeatable, that you can measure and validate, in the sense that you look at what the goal is, which is to minimize risk. Well, what is that? That's reducing the frequency of something happening, or the probability that it happens, which is related to the frequency, or the impact when it does happen.

And so any control you put in place, any framework that's meant to put a control in place, had better reduce either of those two: either the impact or the frequency. And that's, in my mind, the way you come up with a metric. It's not easy, but I think that's the thing. So go ahead.
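
To ground the metric Shea sketches here: a minimal illustration, assuming risk is scored as likelihood times impact on whatever scale an organization agrees on. The scenario, the control, and all numbers below are made-up placeholders, not real estimates.

```python
# Minimal sketch of the metric discussed above: expected risk as likelihood x impact,
# and a control's effectiveness as the reduction in that product.
# The scenario and numbers are illustrative placeholders only.

def expected_risk(likelihood: float, impact: float) -> float:
    """Expected risk score: probability the harm occurs times its impact (agreed scale)."""
    return likelihood * impact

# Hypothetical scenario: confabulated answers in a customer-facing chatbot.
before = expected_risk(likelihood=0.30, impact=8.0)  # no control in place
after = expected_risk(likelihood=0.05, impact=8.0)   # after adding human review of outputs

reduction = (before - after) / before
print(f"Risk before control: {before:.2f}")
print(f"Risk after control:  {after:.2f}")
print(f"Relative reduction:  {reduction:.0%}")
```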

Upol Ehsan (20:52.168)
Mm-hmm.

Upol Ehsan (20:58.568)
Yes.

And I wanted to build on that real quick, because the other thing, speaking of metrics, is I feel like we need to be very careful: not everything can be quantitatively measured. Measurement doesn't always mean numbers. You can qualitatively measure things. In fact, many things in life are measured qualitatively. How much does one like someone? Right? You don't give a Likert scale to your partner and say, hey, on a scale of one to 10... maybe you do, but that would be weird. But

there are a lot of meaningful things in this world that are measured qualitatively. So one thing that I hope the audience watches out for is this number worship that I'm seeing these days, where the numbers don't necessarily mean anything about the construct they are actually trying to study. Right? And it's almost as if the deity of data science, so to speak, has given us this fetishizing

of quantification. And I think when we develop metrics, we have to be very careful: are the metrics truly measuring what we're trying to measure, versus are we metricizing for the sake of metricizing, which people then often gamify and optimize? Right? So this is the problem. The classic problem with leaderboards is that leaderboards are great when they start, but then humans are really cool and really smart, and they'll figure out a way to optimize them.

Shea Brown (BABL AI) (22:18.168)
Yeah.

Shea Brown (BABL AI) (22:30.01)
Yeah. So let me add, I want to throw it back to you real quick, because there was something you said during the pre-show about how some of these standards are kind of inert, and something really struck me; I liked the way you framed this. So I don't know if you can, off the top of your head, remember that conversation, but to me it felt like a very powerful concept around

Upol Ehsan (22:44.904)
Mmm.

Shea Brown (BABL AI) (22:57.369)
sort of this artifact being this inert thing, and we need to somehow do something to it to make it useful.

Upol Ehsan (23:04.136)
Yeah, I'll try my best to remember. So, anything like a framework, a data sheet, a model card, whatever it is, these are inert artifacts, and by themselves they don't do much. We need to activate them, right? And activating is actually that middle layer that you talked about, Shea, a little bit ago, like what Babl is doing in terms of creating processes, right? So,

you activate something by creating a process, and the inert part is the framework. So a lot of the time people get very dissatisfied with frameworks. They're like, well, I can't do anything with it. Well, I'm sorry, the framework wasn't meant for you to do something with by itself. You have to activate it. You have to give it the use case. You have to do the processes. There is action that comes from the user side that I think is really important. So that's one.

And that actually brings me to the other point, which is that there is a gap, and Shea, you can correct me if I'm wrong. Right now I see a massive gap between the frameworks put out by anyone, regardless of agency, versus how they're actually used in practice. Right? So like me using

an ISO or a NIST framework or an OECD AI principle, I don't care, insert your favorite framework here, versus using them in practice. There seems to be a massive gap between the two. I'm curious if you could touch on that, because I know at Babl you guys are doing that.

Shea Brown (BABL AI) (24:45.081)
Yeah, we have to deal with that gap a lot. And I think the problem is, and actually credit to NIST, because NIST is better than most organizations at trying to bridge that gap ahead of time. Not perfect by any means, but they do put effort into it. But the problem is a lot of, and this goes with standards a lot, like the really rigid standards, like product safety and things like this.

Upol Ehsan (24:57.8)
Mm -hmm.

Shea Brown (BABL AI) (25:14.043)
but especially for these risk management frameworks, is that they're kind of divorced, in some sense, from the people who actually do them or implement them or have to deal with them in organizations. Now, there are often industry voices in these things, where they are going to articulate kind of what their company's take on these things is, and sometimes that gets encoded. Sometimes their company has actually done those things; a lot of times they're kind of aspirational. And I think...

There's a big educational gap between the standard, how you're supposed to understand it, and then actually putting it into practice, because the words don't always translate. Like, what do you mean by "analyze the risk" for this brand new thing which just came out, generative AI? It has all of these features; I'm going to use it in a product of mine. There's a ton of work that needs to get done to connect those things. And I think that's the gap where, you know, companies like ours and yours

try to play a role there. I mean, that's good for us, good for business, but I think it would be better if we tried to fill those gaps during the process, because I think some of the gaps exist in the standards themselves. It's not necessarily that the standards are hitting all the right marks and it's just a problem of people trying to implement them; I think there are also problems with the standards themselves. And so it's a bit of a back and forth that has to happen.

Upol Ehsan (26:13.064)
Mm-hmm.

Upol Ehsan (26:37.08)
Yes. And to build on that point, I think now we are slowly switching to the solutions part of this episode: what can we do about it, right? So one thing that I feel like we could do is, when we make something, right, and we have learned this in UX research, is that the more iterative small chunks

you can make, the better the approximation of product-market fit you can get at the end. And one thing that I wish a lot of the people who are coming up with these frameworks would adopt is a user-centered design approach. Who is the user? Who will end up using this? Can we think from their perspective?

Are they a big business? Are they a small business? Do they have the capacity to run a red-teaming exercise? If they don't, what does that look like, right? I work with businesses of all sizes and I've started realizing that some of the frameworks, and some of the suggestions in these frameworks, are sometimes not even feasible, right? And that creates a problem. So if we can take a bit of a user-centered design approach, to your point, having a little bit of...

that could bridge that gap, or not bridge, maybe address the gap between how certain frameworks are built versus how they're put into action, and then minimize the gaps as much as possible. I'm curious to hear your thoughts on something that I am personally very interested in. You know, you've been part of the AI Safety Institute working groups. A lot of viewers would like to know, what does it look like working on the inside?

How is the sausage actually made?

Shea Brown (BABL AI) (28:37.147)
Yeah, so.

It's a little bit chaotic. So there are a lot of voices. I think in these working groups, the basic idea is that it's crowdsourcing knowledge. You're trying to crowdsource from the people who are presumably at the front lines of trying to deal with this new issue. And so it's a lot of...

Upol Ehsan (28:43.144)
Mm-hmm.

Shea Brown (BABL AI) (29:09.341)
The way NIST structures it is that they're trying to structure the conversation around key things that need to get done. Like, so what's the scope of the things that we need to cover? Weigh in on, is this the right scope? Are these the things we need to cover? They're going to incorporate that, that's going to get refined. And then once it's like, we know what we want to tackle, let's break out into working groups and task forces. And then, okay, now same thing.

Roughly, you know: brainstorm, brainstorm, brainstorm, refine, refine, refine, critique, critique, filter out. Then, you know, eventually, once it's done, it will be submitted for comments, like this GenAI profile was. Now the funny thing is, the things we're working on now are very similar to that profile. It's the same sort of topic, but now we're going to go deeper.

Upol Ehsan (29:55.464)
Mm-hmm.

Shea Brown (BABL AI) (30:08.156)
And the idea of this is to, you know, get the measurement science. NIST ultimately is about science and measuring things, and getting some sense of how do we measure anything from, like, vacuum cleaner suction to, you know, AI. And it's a lot about making sure your voice is heard and all voices are heard, and crowdsourcing knowledge. And it's...

Upol Ehsan (30:25.288)
Mm-hmm.

Shea Brown (BABL AI) (30:37.053)
It's a lot. And I think, I mean, here's a tip I'll just throw in for those of you out there who have a small business around AI or consulting: one of the best ways to get your name out there is actually to participate in these kinds of groups, or comment, like we mentioned before. When there's an opportunity to comment on a law or a standard, write a comment, make it sound really smart, make it be smart, right?

Upol Ehsan (30:37.224)
Hmm.

Shea Brown (BABL AI) (31:06.588)
take your time, really put it out there and make it public because it is public and that's a great way to get your voice out there and actually lend some credibility to what you're doing. You're not just making a product, you're actually trying to contribute to our knowledge on how to operate or govern those products properly.

Upol Ehsan (31:29.64)
Yeah, so as we are nearing the end of the episode, I think what I'll try to do is maybe some takeaways. I'll start with one; Shea, if you have any, build on it, please. It appears to me that, first of all, not only are these frameworks inert and in need of activation, there's also a need for a middle layer where companies, like many companies, can come in to institute processes

that activate some of the guidelines in on-the-ground settings. But I think a more important thing is we need more education and outreach and evangelizing, right? Because it has been a very humbling journey for me as well on how to even understand these frameworks. What do they even mean? What do these bullet points mean? So there has to be more education, more outreach. We cannot expect NIST to do everything; by God, they're understaffed.

If anyone in the administration is listening to this, please fund them, give them more staffing, because you have literally given them a task that I don't think their current headcount can sustainably manage. And these people are working their butts off trying to get this out, and they're doing a phenomenal job at it. But I think there can also be more education and outreach efforts that are more grassroots, right? Just like the one that we're doing right now. This episode is fundamentally an outreach and

like an educational kind of episode where we're trying to address some of these gaps. If there are other people who are also kind of more socialized to these frameworks and they understand how they work, maybe they should also write a blog post or create a post of like, this is how you do it. Anything else? Did I miss anything else?

Shea Brown (BABL AI) (33:15.71)
Well, I think even prior to some of those, I would say that one of the takeaways is that if you are working with generative AI in business, in a product, or in government, and you're looking for some guidance, this is one of the few places where there has been crowdsourced consensus, expert guidance, I guess,

on how you should be managing that risk. And so there is a place now, there's a place, a document you can go to that will have at the very least good ideas for how you might approach that. And so I think it's important for people to actually go and read these things, right? They're there for a reason, even if they are inert in the beginning, try to activate them. And if you only activate a few of the recommendations, that's still a huge win, I think, for NIST and for all of us.

Upol Ehsan (34:07.176)
Absolutely. With that in mind, I'm going to close the episode. Thank you so much for listening. Based on the recent analytics data, it turns out only 32% of you are subscribed. If we can get that number bumped up to at least 50%, that will make a massive difference to this channel. So if you haven't, click that subscribe button and share this episode with your friends. Thank you so much for listening.

Shea Brown (BABL AI) (34:32.351)
Thank you.

