Lesson 5 – Deep Learning for Coders (2020)

Welcome to lesson five. We'll be talking about ethics for data science, and this corresponds to chapter 3 of the book. I've also just taught a six-week version of this course, I'm currently teaching an eight-week version, and we will release some combination or subset of that as a fastai and USF ethics for data science class, coming in July, if you want more detail. I am Rachel Thomas. I am the founding director of the Center for Applied Data Ethics at the University of San Francisco and also co-founder of fastai, together with Jeremy Howard. My background: I have a PhD in math, worked as a data scientist and software engineer in the tech industry, and have been working at USF and on fastai for the past four years now.

So ethics issues are in the news. These articles are, I think, all from this fall, showing up at this intersection of how technology is impacting our world in increasingly powerful ways, many of which raise real concerns. I want to start by talking about three cases that I hope everyone working in technology knows about and is on the lookout for. Even if you only watch five minutes of this video, these are the three cases I want you to see.

One is feedback loops. Feedback loops can occur whenever your model is controlling the next round of data you get, so the data that's returned quickly becomes flawed by the software itself. This can show up in many places. One example is recommendation systems. Recommendation systems are essentially about predicting what content the user will like, but they're also determining what content the user is even exposed to, and helping determine what has a chance of becoming popular. YouTube has gotten a lot of attention for heavily recommending many very damaging conspiracy theories. Its recommendations have also effectively strung together playlists for paedophiles out of what were innocent home movies, just because those clips happen to show young girls in bathing suits or in their pajamas. So there are some really, really concerning results here, and this is not something that anybody intended; we'll talk about this more later. I think particularly for many of us coming from a science background, we are used to thinking that we simply observe the data, but whenever you're building products that interact with the real world, you're also controlling what the data looks like.

The second case study I want everyone to know about comes from software that's used to determine poor people's health benefits. It's used in over half of the 50 states, and The Verge did an investigation into what happened when it was rolled out in Arkansas. There was a bug in the software implementation that incorrectly cut coverage for people with cerebral palsy or diabetes, including Tammy Dobbs, who's pictured here and was interviewed in the article. These are people that really needed this health care, and it was erroneously cut due to this bug. They couldn't get any sort of explanation, and there was no appeals or recourse process in place. Eventually this all came out through a lengthy court case, but it caused a lot of suffering in the meantime. So it's really important to implement systems with a way to identify and address mistakes, and to do that quickly, in a way that hopefully minimizes damage, because we all know software can have bugs. Our code can behave in unexpected ways, and we need to be prepared for that. I wrote more about this idea in a post two years ago, "What HBR gets wrong about algorithms and bias".

And then the third case study that everyone should know about: this is Latanya Sweeney, who's director of the Data Privacy Lab at Harvard and has a PhD in computer science. She noticed several years ago that when you googled her name, you would get ads saying "Latanya Sweeney, Arrested?", implying that she has a criminal record. She's the only Latanya Sweeney and has never been arrested. She paid $50 to the background check company and confirmed that she has never been arrested. She tried googling some other names and noticed, for example, that Kristen Lindquist got much more neutral ads that just said "We found Kristen Lindquist", even though Kristen Lindquist has been arrested three times. Being a computer scientist, Dr Sweeney studied this very systematically: she looked at over 2,000 names and found that this pattern held, with African American names disproportionately getting ads suggesting that the person had a criminal record, regardless of whether they did, while traditionally European American or white names got more neutral ads.

This problem of bias in advertising shows up a lot. Advertising is the profit model for most of the major tech platforms, and it continues to pop up in high-impact ways. Just last year there was research showing how Facebook's ad system discriminates even when the person placing the ad is not trying to do so. For instance, the same housing ad, with the exact same text, is served to two very different audiences if you change the photo between a white family and a black family. This is something that can really impact people when they're looking for housing or applying for jobs, and it is a definite area of concern.

Now I want to step back and ask why this matters. A very extreme example is that data collection has played a pivotal role in several genocides, including the Holocaust. This is a photo of Adolf Hitler meeting with the CEO of IBM at the time; I think the photo was taken in 1937, and IBM continued to partner with the Nazis long past when many other companies broke their ties. They produced machines that were used in concentration camps to code whether people were Jewish and how they were executed. This is also different from now, where you might sell somebody a computer and never hear from them again: these machines required a lot of maintenance and an ongoing relationship with the vendor to keep them repaired. A Swiss judge ruled: 'It does not seem unreasonable to deduce that IBM's technical assistance facilitated the task of the Nazis in the commission of their crimes against humanity, acts also involving accountancy and classification by IBM machines and utilized in the concentration camps themselves.' I'm told that they haven't gotten around to apologizing yet. Oh, I guess they've been busy. That's terrible too, yeah. Okay.

So this is a very sobering example, but I think it's important to keep in mind what can go wrong, and how technology can be used for very, very terrible harm. And this raises questions that we all need to grapple with: How would you feel if you discovered that you had been part of a system that ended up hurting society? Would you even know? Would you be open to finding out how things you had built may have been harmful? And how can you help make sure this doesn't happen? I think these are questions that we all need to grapple with. It's also important to think about unintended consequences, and how your tech could be used or misused, whether that's by harassers, by authoritarian governments, or for propaganda or disinformation. And then, on a more concrete level, you could even end up in jail. There was a Volkswagen engineer who got prison time for his role in the diesel cheating case. If you remember, this is where Volkswagen was cheating on emissions tests, and one of the programmers was a part of that. That person was just following orders from his boss, but that is not a good excuse for doing something unethical. So it's something to be aware of.

So ethics is the discipline dealing with what's good and bad. It's a set of moral principles. It's not a set of answers; it's learning what sort of questions to ask, and how to weigh these decisions. I'll say more about ethical foundations and different ethical philosophies later in this lesson, but first I'm going to start with some use cases. Ethics is not the same as religion, laws, social norms, or feelings, although it has overlap with all of these things. It's not a fixed set of rules; it's well-founded standards of right and wrong. Clearly, not everybody agrees on the ethical action in every case, but that doesn't mean that anything goes, or that all actions are considered equally ethical. There are many things that are widely agreed upon, and there are philosophical underpinnings for making these decisions. Ethics is also the ongoing study and development of our ethical standards: a never-ending process of learning to practice our ethical wisdom.

I'm going to refer several times to a few articles from the Markkula Center for Applied Ethics at Santa Clara University, in particular the work of Shannon Vallor, Brian Green, and Irina Raicu, who are fantastic. They have a lot of resources, some of which I'll circle back to later in this talk. I spent years of my life studying ethics; it was my major at university, and I spent much time on the question of what ethics is. I think my takeaway from that is that studying the philosophy of ethics was not particularly helpful in learning about ethics. Yes, and I will try to keep this very applied and very practical, and also very tech-industry-specific: what do you need in terms of applied ethics? The Markkula material is great; somehow they take stuff that I thought was super dry and turn it into useful checklists and things.

I did want to note something really neat: Casey Fiesler, a professor at the University of Colorado whom I really admire, created a crowd-sourced spreadsheet of tech ethics syllabi. This was maybe two years ago, and over 200 syllabi were entered into the spreadsheet. She then did a meta-analysis of them, looking at all sorts of aspects of the syllabi, what's being taught and how it's being taught, and published a paper on it, 'What Do We Teach When We Teach Tech Ethics?'. A few interesting things about it: it shows there are a lot of ongoing discussions and a lack of agreement on how best to teach tech ethics. Should it be a standalone course, or worked into every course in the curriculum? Who should teach it: a computer scientist, a philosopher, or a sociologist? She analyzed, for each syllabus, the course home and the instructor home, and you can see that the instructors came from a range of disciplines: computer science, information science, philosophy, science and technology studies, engineering, law, math, business. What topics to cover? There is a huge range of topics that can be covered, including law and policy, privacy and surveillance, inequality, justice and human rights, environmental impact, AI and robots, professional ethics, work and labor, cybersecurity; the list goes on and on. This is clearly more than can be covered even in a full semester-length course, and certainly not in a single lecture. What learning outcomes? This is an area with a little more agreement: the number one skill that courses were trying to teach was critique, followed by spotting issues and making arguments. So a lot of this is learning to spot what the issues are, and to critically evaluate a piece of technology or a design proposal to see what could go wrong and what the risks could be.

All right. We're going to go through a few core topics, and as I suggested, this is going to be an extreme subset of what could be covered; I was trying to pick things that we think are very important and high-impact. One is recourse and accountability. I already shared the example earlier of the system determining poor people's healthcare benefits having a bug, and something that was terrible about this was that nobody took responsibility, even once the bug was found. The creator of the algorithm was interviewed and asked: should people be able to get an explanation for why their benefits have been cut? And he gave this very callous answer of, 'Yeah, they probably should, but I should probably dust under my bed; like, who's gonna do that?' He then ended up blaming the policymakers for how they had rolled out the algorithm, the policymakers could blame the software engineers who implemented it, and so there was a lot of passing the buck. danah boyd has said that it has always been a challenge to assign responsibility within a bureaucracy, that bureaucracy is often used to evade responsibility, and that today's algorithmic systems are extending bureaucracy.

A couple of questions and comments about cultural context: one person notes that there didn't seem to be any mention of cultural contexts for ethics in those syllabi, and somebody else asks whether this is culturally dependent, and how do you deal with that? It is culturally dependent, and I will mention this briefly later on. I'm going to share three different ethical philosophies that are from the West, and we'll talk just briefly, in one slide, about other perspectives. For instance, right now there are a number of indigenous data sovereignty movements, and I know the Maori data sovereignty movement has been particularly active.

Different cultures do have different views on ethics, and I think that cultural context is incredibly important. We will not get into it tonight, but there's also a growing field studying algorithmic colonialism: what are the issues when technologies built in one particular country and culture are implemented halfway across the world, in a very different cultural context, often with little to no input from the people living in that culture? I do want to say, though, that there are things that are widely, although not universally, agreed on. For instance, the Universal Declaration of Human Rights, despite the name, is not universally accepted, but many, many different countries have accepted it as a human rights framework and as a set of fundamental rights. So there are principles that are often held cross-culturally, although it's rare for anything to be truly universal.

Returning to this topic of accountability and recourse: something to keep in mind is that data contains errors. There was a gang database used in California, tracking supposed gang members, and an auditor found that there were 42 babies under the age of 1 who had been entered into this database. Something concerning about the database is that it's basically never updated: people are added, but they're not removed, so once you're in there, you're in there. And 28 of those babies were marked as having admitted to being gang members. Keep in mind that this is just a really obvious example of error; how many other totally wrong entries are there?

Another example of data containing errors involves the three credit bureaus in the United States. The FTC's large-scale study of credit reports found that 26% of people had at least one mistake in their files, and 5% had errors that could be devastating. This is the headline of an article written by a public radio reporter who went to rent an apartment, and the landlord called him back afterwards and said that his background check showed firearms convictions. This person did not have any firearms convictions, and in most cases the landlord would probably not even tell you that's why you weren't getting the apartment. This guy looked into it (I should note that he was white, which I'm sure helped him get the benefit of the doubt), found the error, and made dozens of calls, but could not get it fixed until he told them that he was a reporter and was going to be writing about it, which is something most of us would not be able to do. Even once he had pinpointed the error and had talked to the county clerk in the place he used to live, it was still a very difficult process to get it updated, and this can have a huge impact on people's lives.

There's also the issue of technology being used in ways that the creators may not have intended. For instance, facial recognition is pretty much entirely being developed on adults, yet the NYPD is putting photos of children as young as 11 into databases. We know the error rates are higher for children; this is not how the technology was developed. So this is a serious concern, and there are a number of misuses. The Center on Privacy and Technology at Georgetown Law, which is fantastic (you should definitely be following them), did a report, 'Garbage In, Garbage Out', looking at how police were using facial recognition in practice, and they found some really concerning examples. In one case, the NYPD had a photo of a suspect that wasn't returning any matches, and they said, well, this person kind of looks like Woody Harrelson, so they googled the actor Woody Harrelson, put his face into the facial recognition system, and used that to generate leads. This is clearly not the correct use at all, but it's a way the technology is being used, and there's a total lack of accountability here. There has also been a study of cases, across all 50 states, of police officers abusing confidential databases to look up ex-romantic partners or activists. Here it's not necessarily error in the data, although that can be present

as well, but rather a matter of how the data can be misused by its users.

All right, the next topic is feedback loops and metrics. I talked a bit about feedback loops at the beginning, as one of the three key cases. This is a topic I wrote a blog post about this fall, "The problem with metrics is a big problem for AI", and then, together with David Uminsky, who is director of the Data Institute, expanded into a paper, "Reliance on metrics is a fundamental challenge for AI", which was accepted to the "Ethics and Data Science" conference. Overemphasizing metrics can lead to a number of problems, including manipulation, gaming, a myopic focus on short-term goals (because it's easier to track short-term quantities), and unexpected negative consequences. And much of AI and machine learning centers on optimizing a metric. This is the strength of machine learning, that it has gotten really, really good at optimizing metrics, but I think it is also inherently a weakness or a limitation.

I'm going to give a few examples, and this can happen not just in machine learning but in analog settings as well. This one is from a study of what happened when England's public health system implemented many more numerical targets in the early 2000s; the study was called "What's measured is what matters". One of the targets was around reducing ER wait times, which seems like a good goal. However, this led to cancelling scheduled operations to draft extra staff into the ER (if it felt like there were too many people in the ER, hospitals would just start cancelling operations so they could get more doctors), requiring patients to wait in queues of ambulances (because time spent waiting in an ambulance didn't count towards your ER wait time), and turning stretchers into beds by putting them in hallways. There were also big discrepancies between the numbers reported by hospitals and by patients: if you ask the hospital how long, on average, people are waiting, you get a very different answer than when you ask the patients how long they had to wait.

Another example is essay grading software. This essay grading software is being used in, I believe, 22 states in the United States now (or is it 20?), and it tends to focus on metrics like sentence length, vocabulary, spelling, and subject-verb agreement, because these are the things we know how to measure with a computer. But it can't evaluate things like creativity or novelty. However, gibberish essays with lots of sophisticated words score well, and there are even examples of people creating computer programs to generate these gibberish, sophisticated-sounding essays, which are then graded by the other computer program and rated highly. There's also bias in this: essays by African-American students received lower grades from the computer than from expert human graders, and essays by students from mainland China received higher scores from the computer than from expert human graders. The authors of the study thought these results suggest the students may be using chunks of pre-memorized text that score well. These are just two examples; I have a bunch more in the blog post, and even more in the paper, of ways that metrics can invite manipulation and gaming whenever they're given a lot of emphasis. This is Goodhart's Law, which a lot of people talk about: the idea that the more you rely on a metric, the less reliable it becomes.

So, returning to the example of feedback loops and recommendation systems: Guillaume Chaslot is a former Google/YouTube engineer (YouTube is owned by Google), and he wrote a really great post. He's done a ton to raise awareness about this issue and founded the nonprofit AlgoTransparency, which tries to externally monitor YouTube's recommendations, and he's partnered with the Guardian and the Wall Street Journal to do investigations. He wrote a post about how, in the earlier days, the recommendation system was designed to maximize watch time. This is something else that's often going on with metrics: any metric is just a proxy for what you truly care about. Here, the team at Google was saying, well, if people are watching more YouTube, that signals to us that they're happier.

However, this also ends up incentivizing content that tells you the rest of the media is lying, because believing that everybody else is lying will encourage you to spend more time on that particular platform. Guillaume wrote a great post about this mechanism, and it's not just YouTube: I think any recommendation system could be susceptible to this, and there has been a lot of talk about issues with recommendation systems across platforms. It is something to be mindful of, and something that the creators of these systems did not anticipate.

Last year Guillaume gathered this data: here the x-axis is the number of YouTube channels recommending a video, and the y-axis is the log of the views, and we see this extreme outlier, which was Russia Today's take on the Mueller report. This is something that Guillaume observed, and it was then picked up by the Washington Post. It strongly suggests that Russia Today has perhaps gamed the recommendation algorithm, which is not surprising, and it's something that I think many content creators are conscious of, experimenting to see what gets more heavily recommended and thus more views.

It's also important to note that our online environments are designed to be addictive. What we click on is often used as a proxy for what we enjoy or what we like, but that's not necessarily representative of our best selves, or our higher selves; it's what we're clicking on in this highly addictive environment, which often appeals to some of our lower instincts. Zeynep Tufekci uses the analogy of a cafeteria that keeps shoving salty, sugary, fatty foods in our faces and then learns that, hey, people really like salty, sugary, fatty foods! Which I think most of us do, in a very primal way, but our higher self says, oh, I don't want to be eating junk food all the time. Online, we often don't have great mechanisms to say, I really want to read more long-form articles that took months to research and will take a long time to digest; while we may want to do that, our online environments are not always conducive to it.

Yes? Sylvain made a comment about the false sense of security argument, which is very relevant to masks and things. Do you have anything to say about this false sense of security argument? Can you say more? There's a common claim at the moment that people shouldn't wear masks, because they might get a false sense of security. Does that make sense to you, from an ethical point of view, to be telling people that?

No, I don't think that's a good argument at all. In general, as many other people including Jeremy have pointed out, there are so many actions we take to make our lives safer, whether that's wearing seatbelts, wearing helmets when biking, or practicing safe sex: all sorts of things where we really want to maximize our safety. And Zeynep Tufekci had a great thread on this today: it's not that there can never be any effect where people get a false sense of security, but it is something that you would really want to gather data on and build a strong case around, not just assume it's going to happen. In most cases people can think of, even if there is a small second-order effect like that, doing something that increases safety tends to have a much larger impact on actually increasing safety. Do you have anything to add to that?

As I mentioned before, a lot of our incentives are focused on short-term metrics; long-term outcomes are much harder to measure, and often involve complex relationships. And the fundamental business model of most of the tech companies is built around manipulating people's behavior and monopolizing their time. I don't think advertising is inherently bad, but I think it can be negative when taken to an extreme. There's a great essay by James Grimmelmann, 'The Platform is the Message', and he points out: 'These platforms are structurally at war with themselves.' 'The same characteristics that make outrageous & offensive content unacceptable are what make it go viral in the first place.'

So there's this real tension, in which the things that make content really offensive or unacceptable to us are often also what fuels its popularity and gets it promoted. This is an interesting essay, because he does a really in-depth dive on the 'Tide Pod Challenge', which was a meme around eating Tide Pods (which are poisonous; do not eat them). It's a great look at meme culture. He argues there's probably no example of someone talking about the Tide Pod Challenge that isn't at least partially ironic, which is common in memes: whatever you're saying, there are layers of irony, and different groups interpret them differently. And even when you try to counteract a meme, you're still promoting it: with the Tide Pod Challenge, a lot of celebrities were telling people 'Don't eat Tide Pods', but that was also perpetuating the popularity of the meme. This is an essay I would recommend; I think it's pretty insightful. We'll get to disinformation shortly, but the major tech platforms often incentivize and promote disinformation. This is unintentional, but it is somewhat built into their design and architecture, their recommendation systems, and ultimately their business models.

Also on the topic of metrics, I want to bring up the idea of blitzscaling. The premise is that if a company grows big enough and fast enough, profits will eventually follow; it prioritizes speed over efficiency and risks potentially disastrous defeat. Tim O'Reilly wrote a really great article last year about many of the problems with this approach, which I would say is incredibly widespread and is the fundamental model underlying a lot of venture capital. With it, investors end up anointing winners, as opposed to market forces; it tends to lend itself to creating monopolies and duopolies; it can be bad for founders; and people end up spreading themselves too thin. So there are a number of significant downsides. Why am I bringing this up in an ethics lesson, when we were talking about metrics? Because hockey-stick growth requires automation and a reliance on metrics. Also, prioritizing speed above all else doesn't leave time to reflect on ethics; you do often have to pause to think about ethics. And following this model, when you do have a problem, it's often going to show up at a huge scale if you've scaled very quickly. So I think this is something to at least be aware of.

One person asks: is there a dichotomy between AI ethics, which seems like a very First World problem, and wars, poverty, and environmental exploitation, which seem like a different level of problem? And there's an answer here from somebody else, which maybe you can comment on: 'AI ethics is very important also for other parts of the world, particularly in areas with high cell phone usage. For example, many countries in Africa have high cell penetration; people get their news from Facebook, WhatsApp, and YouTube, and though it's useful, it's been the source of many problems.' Do you have any comments on that?

Yeah. So AI ethics, as I noted earlier (I'm using the phrase data ethics here), is very broad and refers to a lot of things. If people are talking about whether computers can achieve sentience in the future, and what the ethics around that are, that is not my focus at all. I'm very much focused on how people are being harmed now and what the most immediate harms are; that is our mission with the Center for Applied Data Ethics at the University of San Francisco. So in that sense, I don't think that data ethics has to be a First World or futuristic issue. It's what's happening now, and as the person said, one example I'll get to later is the genocide in Myanmar, in which the Muslim minority, the Rohingya, are experiencing genocide. The UN has ruled that Facebook played a determining role in that, which is really intense and terrible. So I think that's an example of technology leading to very real harm now.

There's also WhatsApp, which is owned by Facebook. There have been issues with people spreading disinformation and rumors on it, and it has led to dozens of lynchings in India: people spread false rumors that there's a kidnapper coming around these small, remote villages, and then a visitor or stranger shows up and gets killed. WhatsApp also played a very significant, and bad, role in the election of Bolsonaro in Brazil and the election of Duterte in the Philippines. So I think technology is having a very immediate impact on people, and those are the types of ethical questions I'm really interested in, and that I hope you are interested in as well. Do you have anything else to say about that? And I will talk about disinformation; I realize those questions were somewhat disinformation-focused. I'm going to talk about bias first, I think, and then disinformation.

Yes? A question: 'When we talk about ethics, how much of this is intentional unethical behavior? I see a lot of the examples as more incompetent behavior, or bad modeling, where the products or models are rushed without sufficient testing or thought around bias, but not necessarily mal-intent.'

Yeah, I agree with that; I think that most of this is unintentional. We'll get into some cases, but I think in many cases the profit incentives are misaligned, and when people are earning a lot of money, it is very hard for them to consider actions that would reduce their profits, even if those actions would prevent harm and be more ethical. At what point does valuing profit over how people are being harmed become intentional? That's a question to debate. But I don't think people are setting out to say 'I want to cause a genocide' or 'I want to help an authoritarian leader get elected'; most people are not starting with that. I think sometimes it's a carelessness and a thoughtlessness, but I do think we are responsible for that, and we're responsible for being more careful and more thoughtful in how we approach things.

All right, so: bias. Bias is an issue that has gotten a lot of attention, which is great, but I want to go a little more in-depth, because sometimes discussions of bias stay a bit superficial. There was a great paper by Harini Suresh and John Guttag last year that came up with a taxonomy of different types of bias and their different sources in the machine learning pipeline. It was really helpful, because different sources have different causes, and they also require different approaches for addressing them. Harini wrote a blog post version of the paper as well, which I love when researchers do; I hope more of you, if you're writing an academic paper, also write the blog post version. I'm just going to go through a few of these types.

One is representation bias. I would imagine many of you have heard of Joy Buolamwini's work, which has rightly received a lot of publicity. In 'Gender Shades', she and Timnit Gebru investigated commercial computer vision products from Microsoft, IBM, and Face++, and then Joy Buolamwini and Deb Raji did a follow-up study that looked at Amazon, Kairos, and several other companies. The typical result they found basically everywhere was that these products performed significantly worse on dark-skinned women: worse on people with darker skin compared to lighter skin, worse on women than on men, and, at the intersection of those, very high error rates for dark-skinned women. One example: IBM's product was 99.7% accurate on light-skinned men and only 65% accurate on dark-skinned women. And again, this is a commercial computer vision product that was released.

Question? It's a question from the TWIML study group: 'In the Volkswagen example, in many cases it's management that drives and rewards unethical behavior. What can an individual engineer do in a case like this, especially in a place like Silicon Valley where people move companies so often?'

Yeah, I think that's a great point, and that is an example where I would have much rather seen higher-ranking people doing jail time, because I think they were driving it. It's also good to remember (and I know many people in the world don't have this option) that for many of us working in tech, particularly in Silicon Valley, we tend to have a lot of options, and often more options than we realize.

I frequently talk to people who feel trapped in their jobs, even though they're software engineers in Silicon Valley and so many companies are hiring. So I think it's important to use that leverage. I think a lot of the employee organizing movements are very promising and can be useful, but also really try to vet the ethics of the company you're joining, and be willing to walk away if you're able to do so. That's a great question.

For this example of representation bias, the way to address it is to build a more representative dataset. It's very important to keep consent in mind if you're using pictures of people, but Joy Buolamwini and Timnit Gebru did this as part of 'Gender Shades'. However, the fact that this was a problem not just for one company but for basically every company they looked at was due to an underlying problem: in machine learning, benchmark datasets spur on a lot of research, yet several years ago all the popular facial datasets were primarily of light-skinned men. For instance, in IJB-A, a popular face dataset from several years ago, only 4% of the images were of dark-skinned women.

Yes? Question: 'I've been worried about COVID-19 contact tracing and the erosion of privacy: location tracking, private surveillance companies, etc. What can we do to protect our digital rights post-COVID? Can we look to any examples in history of what to expect?'

That is a huge question, and something I have been thinking about as well. I'm going to put that off until later. In the course I teach, I have an entire unit on privacy and surveillance, which I don't have in tonight's lecture, but I can share some materials, although I am already rethinking how I'm going to teach privacy and surveillance in the age of COVID-19 compared to two months ago, when I taught it the first time. It is something I think about a lot, and I will talk about it later if we have time, or on the forums if we don't. That's a great and very important question. I will say, though I have not had time to look into them yet, that I know there are groups working on more privacy-protecting approaches to tracking, and there are also groups putting out guidance on what safeguards need to be in place to do tracking responsibly, if we are going to do it at all. Yes?

I've been looking at that too. It does seem like this is a solvable problem with technology; not all of these problems are, but you can certainly store tracking history on somebody's cell phone, and then you could have something where you declare when you've been infected, and at that point you could tell people who may have been exposed, by sharing the location in a privacy-preserving way. I think some people are trying to work on that; I'm not sure it's actually technically a hard problem. So sometimes there are ways to provide the minimum level of the application whilst keeping privacy.

Yeah, and then I think it's very important to have things like a clear expiration date. Looking back at 9/11 in the United States, it ushered in all these laws that we're now stuck with, which have really eroded privacy. For anything we do around COVID-19, we should be very clear: we are doing this just for COVID-19, there's a time limit, it expires, and it's for this clear purpose. There are also issues (as I mentioned earlier about data containing errors) that have already come up in some other countries taking more surveillance-focused approaches: what about when the data is wrong, and people get quarantined for no reason and don't even know why? So we should be mindful of those. But we'll talk more about this later on.

Back to bias. We were talking about benchmarks: when a benchmark that's widely used has bias, that bias gets replicated at scale, and we're seeing this with ImageNet as well, which is probably the most widely studied computer vision dataset out there. Two-thirds of the ImageNet images are from the West.

This pie chart shows that 45% of the images in ImageNet are from the United States, 7% from Great Britain, 6% from Italy, 3% from Canada, 3% from Australia, and so on; we're covering a lot of this pie without having gotten outside the West. This shows up in concrete ways in classifiers trained on ImageNet. For example, one of the categories is 'bridegroom', a man getting married, and there are a lot of cultural components to that, so these classifiers have much higher error rates on bridegrooms from the Middle East or from the Global South. There are people now working to diversify these datasets, but it is quite dangerous that they can be, and have been, widely built on at scale before these biases were recognized.

Another key study is of the COMPAS recidivism algorithm, which is used in determining who has to pay bail (in the U.S. a very large number of people are in jail without ever having had a trial, just because they're too poor to afford bail), as well as in sentencing and parole decisions. ProPublica did a famous investigation in 2016, which I imagine many of you have heard of, in which they found that the false positive rate for black defendants was nearly twice as high as for white defendants: black defendants who did not go on to reoffend were nearly twice as likely to be wrongly labeled high-risk. A study from Dartmouth found that the software is no more accurate than Amazon Mechanical Turk workers, so random people on the internet. The software is also a proprietary black box using over 130 inputs, and it's no more accurate than a linear classifier on three variables. Yet it's still in use in many states; Wisconsin is one place where it was challenged, and the Wisconsin Supreme Court upheld its use. If you're interested in the topic of how you define fairness, there is a lot of intricacy here; I don't know anybody working on this who thinks that what COMPAS is doing is right, but its makers are using a different definition of fairness. Arvind Narayanan has a fantastic tutorial, '21 fairness definitions and their politics', that I highly recommend.

Going back to the taxonomy of types of bias, this is an example of historical bias. Historical bias is a fundamental, structural issue with the first step of the data generation process, and it can exist even given perfect sampling and feature selection. With the image classifier, we could go gather a more representative set of images and that would help address the problem; that is not the case here. Gathering more data on the U.S. criminal justice system will all be biased, because that bias is baked into our history and our current state. So this is good to recognize. One thing that can be done to at least mitigate this is to really talk to domain experts and to the people impacted. A really positive example of this is a tutorial at the Fairness, Accountability, and Transparency conference that Kristian Lum, who was the lead statistician for the Human Rights Data Analysis Group and is now a professor at UPenn, organized together with a former public defender, Elizabeth Bender, a staff attorney for New York's Legal Aid Society, and Terrence Wilkerson, an innocent man who was arrested and could not afford bail. Elizabeth and Terrence were able to provide a lot of insight into how the criminal justice system works in practice, which is often very different from the cleaner, more logical abstractions that computer scientists deal with. It's really important to understand those intricacies of how a system is going to be implemented and used in messy, complicated real-world settings.

Question? 'Aren't AI biases transferred from real-life biases? For instance, isn't people being treated differently an everyday phenomenon too?' That's correct, yes. This is often coming from real-world biases, and I'll come to this in a moment, but algorithmic systems can amplify those biases, so they can make them even worse; still, they are often being learned from existing data. I asked it because I often see this raised as a kind of reason not to worry about AI, that it's not really an AI problem.

Well, I'm going to get to that in a moment, I think in two slides, so hold on to that question. I just want to talk about one other type of bias first: measurement bias. This was an interesting paper by Sendhil Mullainathan and Ziad Obermeyer, where they looked at historic electronic health record data to try to determine which factors are most predictive of stroke, the idea being that this could be useful for, say, prioritizing patients in the ER. They found that the number one most predictive factor was prior stroke, which makes sense. Second was cardiovascular disease, which also seems reasonable. The third most predictive factor was accidental injury, followed by having a benign breast lump, a colonoscopy, or sinusitis. I'm not a medical doctor, but I can tell something weird is going on with factors three through six here. Why would these things be predictive of stroke? Does anyone want to guess why this might be? Any guesses you want to read? Okay, the answers coming in: they test for it any time someone has a stroke? Confirmation bias? Overfitting? Because they happen to be in the hospital already? Biased data? EHR records these events? Because the data was taken before certain advances in medical science? These are all good guesses; not quite what I was looking for, but good thinking. That's such a nice way of saying no.

What the researchers say is that this is really about which patients utilize healthcare a lot and which don't; they call it high versus low utilization of healthcare. A lot of factors go into this: who has health insurance and who can afford their co-pays, cultural factors, and racial and gender bias in how people are treated. People who utilize healthcare a lot will go to a doctor when they have sinusitis, and they will also go in when they're having a stroke; people who don't utilize healthcare much will probably not go in for either. So, as the authors write, we haven't measured stroke, which is a region of the brain being denied new blood and oxygen; what we've measured is who had symptoms, went to the doctor, received tests, and then got a diagnosis of stroke. That seems like it might be a reasonable proxy for who had a stroke, but a proxy is never exactly what you wanted, and in many cases that gap ends up being significant. This is just one form that measurement bias can take, but it's something to really be on the lookout for, because it can be quite subtle.

Now, returning to the point that was brought up earlier: aren't people biased? Yes, we are, and there have been dozens and dozens, if not hundreds, of studies on this. I'm just going to quote a few, all of which are linked in this Sendhil Mullainathan New York Times article if you want to find the studies; this all comes from peer-reviewed research. When doctors were shown identical files, they were much less likely to recommend a helpful cardiac procedure to black patients compared to white patients: same file, just changing the race of the patient. When bargaining for a used car, black people were offered initial prices $700 higher and received fewer concessions. Responding to apartment rental ads on Craigslist with a black name elicited fewer replies than with a white name. An all-white jury was 16 points more likely to convict a black defendant than a white one, but when a jury had just one black member, it convicted both at the same rate. I share these to show that no matter what type of data you're working on, whether that's medical data, sales data, housing data, or criminal justice data, it's very likely that there's bias in it.

There's a question? No, I was going to say that I find that last one really interesting, this idea that a single black member of the jury has some kind of anchoring impact. I'm sure you're going to talk about diversity later, but I just want to keep in mind that maybe even a tiny bit of diversity reminds people that there's a range of different types of people and perspectives. That's a great point, yeah.

So the question that was asked earlier is: why does algorithmic bias matter? I have just shown you that humans are really biased too, so why are we talking about algorithmic bias? People have brought this up: what's the fuss about it?

I think algorithmic bias is very significant and worth talking about, and I'm going to share four reasons why.

One is that machine learning can amplify bias, so it's not just encoding existing biases but in some cases making them worse. There have been a few studies on this; one I like is from Maria De-Arteaga of CMU, where they took, I think, people's job descriptions from LinkedIn, and what they found is that imbalances ended up being compounded. In the group of surgeons, only 14% were women; however, when trying to predict the job title from the summary, women were only 11% of the true positives. So the imbalance got worse: there was an asymmetry in which the algorithm learned that, for women, it's safer not to guess 'surgeon'.

Another reason algorithmic bias is a concern is that algorithms are used very differently than human decision-makers in practice. People sometimes talk about them as though they are plug-and-play, interchangeable with humans: the human has this bias, the algorithm has that bias, so why don't we just substitute it in? However, the whole system around them ends up being different in practice. One aspect of this is that people are more likely to assume algorithms are objective or error-free, even when they're given the option of a human override. Even if you just say, hey, I'm giving the judge this recommendation and they don't have to follow it, many people are going to take a recommendation coming from a computer as objective. In some cases there may also be pressure from a boss not to disagree with the computer too often; nobody's going to get fired for going with the computer's recommendation. Algorithms are more likely to be implemented with no appeals process in place, which we saw earlier when we were talking about recourse. Algorithms are often used at scale, so they can replicate an identical bias at scale. And algorithmic systems are cheap. All of these are interconnected: in many cases, I think algorithmic systems are being implemented not because they produce better outcomes for everyone, but because they're a cheaper way to do things at scale. Offering a recourse process is more expensive; being on the lookout for errors is more expensive. So these are cost-cutting measures, and Cathy O'Neil talks about many of these themes in her book 'Weapons of Math Destruction', under the idea that 'the privileged are processed by people; the poor are processed by algorithms.'

There's a question? Two questions. 'This seems like an intensely deep topic, needing specialized expertise to avoid getting it wrong. If you were building an ML product, would you approach an academic institution for consultation on this? Do you see the data/product/development triad becoming a quartet, involving an ethics or data privacy expert?'

Yes. I think interdisciplinary work is very important. I would definitely focus on trying to find domain experts in whatever your particular domain is, people who understand the intricacies of that domain. With an academic, it depends: you do want to make sure you get someone who is applied enough to understand how things are happening in industry. But yes, I think involving more people, and people from more fields, is a good approach on the whole.

'Someone invents and publishes a better ML technique, like attention or transformers. Next, a graduate student demonstrates using it to improve facial recognition by 5%. Then a small start-up publishes an app that does better facial recognition. Then a government uses the app to study downtown walking patterns and endangered species, and after these successes, for court-ordered monitoring. Then a repressive government takes the method to identify ethnicities, and you get a genocide. No one has made a huge ethical error at any incremental step, yet the result is horrific. I have no doubt that Amazon will soon serve up a personally customized price for each item that maximizes their profits. How can such ethical creep be addressed, where the effect is remote from many small causes?'

Yeah, that's a great summary of how these things can happen somewhat incrementally. I'll talk about some tools towards the end of this lesson that hopefully can help us. Some of it is that I think we do need to get better at trying to think a few more steps ahead than we have been. In particular, we've seen examples of this.

of how do I identify protesters in a crowd, even when they had scarves, or sunglasses, or hats on You know, and when the, the, researchers on that were questioned, they were like, ‘Oh it never even occurred to us that bad guys would use this, you know, we just thought it would be for finding bad people’ And so I do think, kind of, everyone should be building their ability to think a few more steps ahead, and part of this is like it’s great to do this in teams, preferably in diverse teams, can help with that, that process I mean, on this question of computer vision there has been, you know, just in the last few months, is it Joe Redmon, creator of YOLO, who has said that he’s no longer working on computer vision just because he thinks the, the misuses so far outweigh the the positives, and Timnit Gebru said she’s, she’s considering that as well So, I think, there are, there are times where you have to consider And then, I think, also really actively thinking about how to, what safeguards do we need to put in place to, kind of, address the, the misuses that are happening Yes? I just wanted to say somebody really liked the Cathy O’Neil quote: ‘Privileged are processed by people; the poor processed by algorithms’ and they’re looking forward to learning more, reading more from Kathy O’Neal Is there a book that you would recommend? Yes Yeah And in… And Kathy O’Neal also writes in the… And Kathy O’Neill’s a fellow, fellow math PhD, but she also has written a number of good articles And it The book, kind of, goes through a number of those case studies of how algorithms are being used in different places So, kind of in In summary of ‘Humans are biased, why do, why are we making a fuss about algorithmic bias?’ So, one is we saw earlier Machine learning can create feedback loops So it’s, you know, it’s not just, kind of, observing what’s happening in the world, but it’s also determining outcomes, and it’s, kind of, determining what future data is Machine learning can amplify bias Algorithms and humans are used very differently in practice, and then I’ll also say technology is power, and with that comes responsibility, and I think for, for all of us to, to have access to deep learning, we’re still in a, kind of, very fortunate and small percentage of the world, that is able to use this technology right now, and I hope, I hope we will all use it responsibly, and really take our power seriously And I just, I just noticed the time, and I think we’re about to start next section on, on analyzing or, kind of, steps, steps we can take, so this would be a good, a good, place to take a break So let’s meet back in seven minutes, at 7:45 All right, let’s start back up, and actually I was at a slightly different place than I thought, but just a few questions that, that, you can ask about projects you’re working on, and I, I hope you will ask about projects you’re working on The first is, should we, ‘should we even be doing this?’, and considering that maybe there’s some work that we shouldn’t do There’s a paper ‘When the Implication Is Not to Design (Technology)’ As engineers we often tend to respond to problems with, you know, ‘What can I make or build to address this?’, but sometimes the answer is to not make or build anything One example of research that I think has a huge amount of downside, and really no upside I see was, kind of, to identify the ethnicity, particularly for people of ethnic minorities And so there was work done identifying the Chinese Uyghurs, which is the Muslim minority in Western China, 
which has since seen over a million people placed in internment camps I think this is a very, very harmful line of research There have also been at least two attempts at building a classifier to try to identify someone's sexuality, which is probably just picking up on stylistic differences, but this could also be quite dangerous, as in many countries it's illegal to be gay Yes So this is a question from me, which I don't know the answer to As that title says, a Stanford scientist says he built the 'gaydar' using 'the lamest' AI possible to prove a point And my understanding is that point was to say something like, 'Hey, you could use fastai Lesson 1, and after an hour or two you could build this thing, anybody can do it' How do you feel about this idea that there's a role for demonstrating what's readily available with the technology we have? Yeah, that's something I appreciate, and I'll talk about this a little bit later: OpenAI with GPT-2,

I think, was trying to raise a debate around dual use: what is responsible release of dual-use technology, and what's a responsible way to raise awareness of what is possible In the cases of researchers who have done this on the sexuality question, it hasn't seemed to me that they've put adequate thought into how they're conducting that work and who they're collaborating with, to ensure it's something that helps address the problem But I think you're right that there is probably some place for letting people know what is widely available now It reminds me a bit of pen testing in infosec, where it's considered that there's an ethical way you can go about pointing out that it's trivially easy to break into a company's system Yes, I would agree with that, that there is an ethical way, but I think that's something where we as a community still have more work to do in even determining what that is Other questions to consider are what bias is in the data Something I should highlight is that people often ask me how they can debias their data or ensure it's bias-free, and that's not possible All data contains bias, and the most important thing is to understand how your dataset was created and what its limitations are, so that you're not blindsided by that bias, but you're never going to fully remove it Some of the most promising approaches in this area are work like Timnit Gebru's 'Datasheets for Datasets', which goes through a bunch of questions about how your dataset was created, for what purposes, how it's being maintained, and what the risks are, to really be aware of the context of your data Can the code and data be audited? Particularly in the United States, we have a lot of issues when private companies create software that really impacts people through the criminal justice system or hiring, and when these are proprietary black boxes that are protected in court, that creates a lot of questions about what our rights are Looking at error rates for different subgroups is really important, and that's what's so powerful about Joy Buolamwini's work: if she had just looked at light-skinned versus dark-skinned and men versus women, she wouldn't have identified just how poorly the algorithms were doing on dark-skinned women What is the accuracy of a simple rule-based alternative? This is something I think Jeremy talked about last week: it's just good machine learning practice to have a baseline But particularly in cases like the COMPAS recidivism algorithm, where a 130-variable black box is not doing much better than a linear classifier on three variables, that raises the question of why we are using this And then, what processes are in place to handle appeals or mistakes? Because there will be errors in the data, there may be bugs in the implementation, and we need to have a process for recourse Yes? Can you explain this one for me? Sorry, I'm asking my own questions, nobody voted them up at all What's the thinking behind this idea of a simpler model, is it that, all other things being the same, you should pick the simpler one?
Is that what this baseline's for? And if so, what's the thinking behind that? Well, with the COMPAS recidivism algorithm, some of this for me is linked to the proprietary black-box nature, so you're right, maybe it would be different if we had a way to introspect it and knew what our rights were around appealing something But I would say, yeah, why use the more complex thing if the simpler one works the same? And then, how diverse is the team that built it? I'll talk more about team diversity later in this lesson Okay, it was 'Jeremy' at the start, but I'm not the teacher, so it actually is, 'Jeremy, do you think transfer learning makes this tougher, auditing the data that led to the initial model?' I assume they mean 'Jeremy, please ask Rachel' No, they were asking you That's a good question Again, I would say I think it's important to have information on both datasets: what the initial dataset was and what the dataset you used for fine-tuning was Do you have thoughts on that?
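To make two of those checks concrete, looking at error rates by subgroup and comparing against a simple baseline, here is a minimal sketch in Python with synthetic data and made-up feature names; it illustrates the practice only, and is not the COMPAS analysis or Buolamwini's audit:

```python
# Minimal sketch (synthetic data, hypothetical feature names): compare a
# complex model against a simple baseline, and report error rates per
# subgroup instead of a single aggregate accuracy.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "f1": rng.normal(size=n),
    "f2": rng.normal(size=n),
    "group": rng.choice(["A", "B"], size=n),   # stand-in for a subgroup attribute
})
df["outcome"] = (df.f1 + 0.5 * df.f2 + rng.normal(scale=1.0, size=n) > 0).astype(int)

X, y, g = df[["f1", "f2"]], df["outcome"], df["group"]
X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, g, test_size=0.3, random_state=42)

models = {
    "simple baseline": LogisticRegression().fit(X_tr, y_tr),        # inspectable
    "complex model": GradientBoostingClassifier().fit(X_tr, y_tr),  # black-box stand-in
}
for name, model in models.items():
    wrong = model.predict(X_te) != y_te.values
    per_group = pd.Series(wrong, index=g_te.values).groupby(level=0).mean()
    # a single aggregate accuracy can hide large gaps between groups
    print(name, per_group.round(3).to_dict())
```

The point is just that one overall accuracy number can hide exactly the subgroup gaps Buolamwini's work surfaced, and that if the complex model isn't meaningfully better than the simple, inspectable one, it's worth asking why you're using it.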

What she said And then I'll say that while bias and fairness, as well as accountability and transparency, are important, they aren't everything There's a great paper, 'A Mulching Proposal' by Os Keyes et al, which describes a system for turning the elderly into high-nutrient slurry, something that is clearly unethical, but they propose a way to do it that is fair, accountable, and transparent and meets all these qualifications That shows some of the limitations of this framework, and it's also a good technique for inspecting whatever framework you are using: try to find something that's clearly unethical but that can still meet the standards you've put forth That technique, I really like it, it's my favorite technique from philosophy It's this idea that you say, okay, given this premise, here's what it implies, and then you try to find an implied result which is intuitively, clearly wrong It's the number one philosophical thinking tool I got out of university, and sometimes we can have a lot of fun with it, like this time Thank you All right, so the next big case study and topic I want to discuss is disinformation In 2016, in Houston, a group called Heart of Texas posted about a protest outside an Islamic center, and they told people to come armed Another Facebook group posted about a counter-protest to show up supporting freedom of religion and inclusivity There were a lot of people present, more on the side supporting freedom of religion A reporter for the Houston Chronicle noticed something odd, though: he was not able to get in touch with the organizers for either side It came out many months later that both sides had been organized by Russian trolls So this is something where the people protesting were genuine Americans protesting for their beliefs, but they were doing it in a way that had been framed completely disingenuously by Russian operatives So when thinking about disinformation, it is not just what people often think of as so-called fake news, inspecting a single post and asking, is this true or false?
But really, disinformation is often about orchestrated campaigns of manipulation It often involves seeds of truth, the best propaganda always involves at least kernels of truth It also involves misleading context, and it can sweep up very sincere people A report came out this fall, an investigation from Stanford's Internet Observatory, where Renee DiResta and Alex Stamos work, on Russia's most recently identified disinformation campaign It was operating in six different countries in Africa It often purported to be local news sources, it was multi-platform, they were encouraging people to join their WhatsApp and Telegram groups, and they were hiring local people as reporters A lot of the content was not necessarily disinformation: it was stuff on culture and sports and local weather, though there was also a lot of very pro-Russia coverage But it covered a range of topics, so this is a very sophisticated phase of disinformation, and in many cases it was hiring locals as reporters to work for these sites I should say, I've just given two examples involving Russia, but Russia certainly does not have a monopoly on disinformation, there are plenty of people involved in producing it On a topical issue, there's been a lot of disinformation around coronavirus and Covid-19 On a personal level, if you're looking for advice on spotting disinformation, or to share with loved ones about this, Mike Caulfield is a great person to follow

He tweets @holden, and he has even started an 'infodemic' blog specifically about Covid-19 He talks about his approach: people have been trained in school for 12 years that here's a text, read it, use your critical thinking skills to figure out what you think about it Professional fact checkers do the opposite, they get to a page, immediately get off of it, and look for higher-quality sources to see if they can find confirmation Caulfield also really promotes the idea that a lot of the critical thinking techniques that have been taught take a long time, and we're not going to spend 30 minutes evaluating each tweet we see in our Twitter stream It's better to give people an approach they can do in 30 seconds It's not going to be fail-proof if you're only spending 30 seconds, but it's better to check than to have something that takes 30 minutes that you're just not going to do at all So I wanted to put this out there as a resource, there's a whole set of lessons at lessons.checkplease.cc, and he's a professor In the data ethics course I'm teaching right now, I made the first half of my first lesson specifically about coronavirus disinformation, and I've made that available on YouTube, I've already shared it, and I'll add a link on the forums if you want a lot more detail on disinformation than this short bit here But going back to what disinformation is: it's important to think of it as an ecosystem, not just a single post or a single news story that's misleading or has false elements in it, but this broader ecosystem Claire Wardle of First Draft News, who is a leading expert on this and does a lot of training for journalists on how to report responsibly, talks about the 'trumpet of amplification' This is where rumors or memes can start on 4chan and 8chan, then move to closed messaging groups such as WhatsApp, Telegram, and Facebook Messenger, from there to conspiracy communities on Reddit or YouTube, then to more mainstream social media, and then get picked up by the professional media and by politicians This can make it very hard to address, because it is multi-platform, and in many cases campaigns may be utilizing the differing rules or loopholes between the different platforms And we are certainly seeing more and more examples where it doesn't have to go through all these steps but can jump forward And online discussion is very significant because it helps us form our opinions This is tough, because I think most of us think of ourselves as pretty independent-minded, but discussion really does influence us: we evolved as social beings, to be influenced by people in our in-group and in opposition to people in our out-group, and so online discussion impacts us People discuss all sorts of things online Here's a Reddit discussion about whether the US should cut defense spending, and you have comments: 'you're wrong, and the defense budget is a good example of how badly the U.S.
spends money on the military', and someone else says 'yeah, but that's already happening, here's a huge increase in the military budget, the Pentagon budget is already increasing', and 'I didn't mean to sound like stop paying for the military, I'm not saying that we cannot pay the bills, but I think it would make sense to cut defense spending' Does anyone want to guess what subreddit this is from? unpopularopinion, news, changemyview, netneutrality These are good guesses, but they're wrong This is all from the subreddit r/SubSimulatorGPT2, so these comments are all written by GPT-2 And this is in good fun, it was clearly labeled on the subreddit that it's coming from GPT-2 GPT-2 is a language model from OpenAI that was in a trajectory of research that many, many groups were on, and it was released about a year ago Should I read the unicorn story, Jeremy? Okay So many of you have probably seen this: it was cherry-picked, but it's still very, very impressive A human-written prompt was given to the language model: 'In a shocking finding, scientists discovered a herd of unicorns living in a remote, previously unexplored valley in the Andes Mountains Even more surprising to the researchers was the fact that the unicorns spoke perfect English'
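As an aside, reproducing this kind of demo is now only a few lines of code with publicly released checkpoints; a rough sketch using the Hugging Face transformers library (with the small public 'gpt2' checkpoint and illustrative sampling settings, not whatever setup OpenAI used for this cherry-picked sample) might look like this:

```python
# Sketch: sample a continuation from a small public GPT-2 checkpoint.
# The sampling settings here are illustrative, not the ones behind the
# curated unicorn sample.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = ("In a shocking finding, scientists discovered a herd of unicorns "
          "living in a remote, previously unexplored valley in the Andes Mountains.")
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

output = model.generate(
    input_ids,
    max_length=200,                       # prompt plus continuation, in tokens
    do_sample=True,                       # sample rather than greedy decoding
    top_k=40,                             # restrict to the 40 most likely next tokens
    pad_token_id=tokenizer.eos_token_id,  # silence the padding warning
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Output from the small checkpoint is noticeably rougher than curated samples like the one below, which is part of why OpenAI's staged release of the larger models prompted the dual-use debate mentioned earlier.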

And then the next part is all generated by the language model So this is a deep learning model that produced this: "Dr Jorge Pérez found what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow Pérez and the others then ventured further into the valley By the time we reached the top of one peak, the water looked blue, with some crystals on top Pérez and his friends were astonished to see the unicorn herd These creatures could be seen from the air without having to move too much to see them They were so close, they could touch their horns While examining these bizarre creatures, the scientists discovered that the creatures also spoke some fairly regular English Pérez stated, we can see, for example, that they have a common language, something like a dialect or dialectic" And so I think this is really compelling prose to have been generated by a computer in this form We've also seen advances in computers generating pictures, specifically with GANs Katie Jones was listed on LinkedIn as a Russia and Eurasia fellow, she was connected to several people from mainstream Washington think tanks, and The Associated Press discovered that she is not a real person: this photo was generated by a GAN And I think it's scary when we start thinking about how compelling the generated text is and combine that with pictures, these photos are all from thispersondoesnotexist.com, generated by GANs There's a very real and imminent risk that online discussion will be swamped with fake, manipulative agents, to an even greater extent than it already has been, and this can be used to influence public opinion So, going back in time to 2017, the FCC was considering repealing net neutrality, and they opened up for comments to see how Americans feel about net neutrality This is a sample of the many comments that were opposed to net neutrality, that wanted to repeal it, and I'll just read a few clips: 'Americans, as opposed to Washington bureaucrats, deserve to enjoy the services they desire' 'Individual citizens, as opposed to Washington bureaucrats, should be able to select whichever services they desire' 'People like me, as opposed to so-called experts, should be free to buy whatever products they choose' These have been helpfully color-coded so you can see a pattern: this was a bit of a Mad Libs, where you had a few choices (in green) for the first noun, then 'as opposed to' or 'rather than', and then in orange either 'Washington bureaucrats', 'so-called experts', 'the FCC', and so on This analysis was done by Jeff Kao, who's now a computational journalist at ProPublica doing great work, and he did this analysis discovering this campaign, in which these comments were designed to look unique but had been created through some mail-merge style of putting together Mad Libs
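One crude way to surface that kind of mail-merge templating is simply to collapse the interchangeable 'slots' and count how many comments share the same skeleton; the comments and slot vocabularies below are made up for illustration, and this is not Jeff Kao's actual methodology:

```python
# Sketch: surface mail-merge style comments by collapsing the template "slots"
# and counting how many comments share the same skeleton. The comments and
# slot vocabularies are made up for illustration.
import re
from collections import Counter

comments = [
    "Americans as opposed to Washington bureaucrats deserve to use the services they desire.",
    "People like me rather than so-called experts deserve to use the services they desire.",
    "Individual citizens as opposed to the FCC deserve to use the services they desire.",
]

slots = {
    "SUBJECT": r"americans|individual citizens|people like me",
    "CONTRAST": r"as opposed to|rather than",
    "AUTHORITY": r"washington bureaucrats|so-called experts|the fcc",
}

def skeleton(text: str) -> str:
    t = text.lower()
    for placeholder, pattern in slots.items():
        t = re.sub(pattern, placeholder, t)      # collapse each slot to a placeholder
    return re.sub(r"[^A-Za-z ]", "", t).strip()  # ignore punctuation differences

counts = Counter(skeleton(c) for c in comments)
for skel, n in counts.most_common():
    print(n, skel)   # all three example comments collapse to the same skeleton
```

Of course, a comment generated by a modern language model would sail straight past a check like this, which is exactly the point Rachel makes next.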
So this was great work by Jeff What he found was that while the FCC received over 22 million comments, less than 4% of them were truly unique This is not all malicious activity, there are many cases where you get a template to contact your legislator about something, but in the example shown previously, these were designed to look like they were unique when they weren't More than 99% of the truly unique comments wanted to keep net neutrality; that was not the case if you looked at the full 22 million comments However, this was in 2017, which may not sound that long ago, but in the field of natural language processing we've had an entire revolution since then, there's just been so much progress made And this would be, I think, virtually impossible to catch today if someone was using a sophisticated language model to generate the comments So Jess asks a question, which I'm going to treat as a two-part question What happens when there's so much AI trolling that most of what gets scraped from the web is AI-generated text?

And then the second part: what happens when you use that to generate more AI-generated text? For the first part, yes, this is a real risk, or not 'risk' but a challenge we're facing, that real humans can get drowned out when so much of the text is AI trolling We're already seeing this, and (in the interest of time, I can talk about disinformation for hours and I had to cut a lot of stuff out) many people have talked about how the new form of censorship is about drowning people out So it's not necessarily forbidding someone from saying something, but just totally drowning them out with a massive quantity of text and information and comments, and AI can really facilitate that, and I do not have a good solution to it In terms of AI learning from AI-generated text, I think you're going to get systems that are potentially less and less relevant to humans, and they may have harmful effects if they're being used to create software that interacts with or impacts humans, so that's a concern One of the things I find fascinating about this is that we could get to a point where 99.99% of tweets and fastai forum posts, and whatever, are auto-generated, particularly in more political-type places where a lot of it's pretty low-content, pretty basic The thing is, if it was actually good, you wouldn't even know So what if I told you that 75% of the people you're talking to on the forum right now are actually bots? How can you tell which ones they are? How would you prove whether I'm right or wrong? Yeah, I think this is a real issue on Twitter, particularly with people you don't know, wondering, is this an actual person or a bot? It's a common question people wonder about, and it can be hard to tell But I think it has a lot of significance for how human governance works There's something about humans being in society and having norms and rules and mechanisms that this can really undermine and make difficult So, when GPT-2 came out, Jeremy Howard, co-founder of fastai, was quoted in the Verge article on it: 'I've been trying to warn people about this for a while We have the technology to totally fill Twitter, email, and the web up with reasonable-sounding, context-appropriate prose, which would drown out all other speech and be impossible to filter' So, one step towards addressing this is the need for digital signatures Oren Etzioni, the head of the Allen Institute for AI, wrote about this in HBR: 'Recent developments in AI point to an age where forgery of documents, pictures, audio recordings, videos, and online identities will occur with unprecedented ease AI is poised to make high-fidelity forgery inexpensive and automated, leading to potentially disastrous consequences for democracy, security, and society' He proposes digital signatures as a means of authentication And I will say that one of the additional risks of all this forgery and fakery is that it also undermines people speaking the truth Zeynep Tufekci, who does a lot of research on protests around the world and on different social movements, has said that she's often approached by whistleblowers and dissidents who in many cases will risk their lives to try to publicize a wrongdoing or human rights violation, only to have bad actors say, 'oh, that picture was photoshopped, that was faked', and it's now this big issue for whistleblowers and dissidents of how
they can verify what they are saying, and that need for verification Someone you should definitely be following on this topic is Renee DiResta, and she wrote a great article with Mike Godwin last year framing that we really need to think of disinformation as a cybersecurity problem
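Coming back briefly to Etzioni's digital-signature suggestion: the signing and verifying mechanics already exist in standard libraries, and the genuinely hard part, which this sketch does not touch, is the surrounding infrastructure for issuing, distributing, and trusting keys A minimal illustration with the Python cryptography package and an Ed25519 key pair:

```python
# Minimal sketch of content signing: the publisher signs bytes with a private
# key, and anyone holding the matching public key can check that the content
# was not altered. Key distribution and identity are the unsolved, hard parts.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()   # held by the publisher
public_key = private_key.public_key()        # shared with readers or platforms

article = b"Statement as originally published."
signature = private_key.sign(article)

# Later, a reader (or platform) verifies the copy they received:
received = article                           # try article + b" edited" to see it fail
try:
    public_key.verify(signature, received)   # raises InvalidSignature on tampering
    print("signature checks out")
except InvalidSignature:
    print("content was altered or signature is invalid")
```

None of this answers who signs what, or how readers come to trust a given public key, which is really what the proposal is about.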

Disinformation is these kinds of coordinated campaigns of manipulation by bad actors, and there's some important work happening at Stanford on this as well Alright, questions on disinformation? Okay, so our next topic is ethical foundations In the fastai approach we always like to ground everything in real-world case studies before we get to the theory underpinning it, and I'm not going to go too deep on this at all There's a fun article, 'What would an Avenger do?', and hat tip to Casey Fiesler for suggesting it It goes through three common ethical philosophies: utilitarianism, aiming to maximize good, with Iron Man as the example; deontological ethics, adhering to what is right, with Captain America as the example; and virtue ethics, with Thor living by a code of honor So I thought that was a nice reading Rachel: Yes? Question: Where do you stand on the argument that social media companies are just neutral platforms and that problematic content is the entire responsibility of the users, in the same way that phone companies aren't held responsible when phones are used for scams, or car companies aren't held responsible when vehicles are used for, say, terrorist attacks? Rachel: So, I do not think that the platforms are neutral, because they make a number of design decisions and enforcement decisions, around even what their Terms of Service are and how those are enforced And keeping in mind that harassment can drive many people off of platforms, many of those decisions are not a case of 'everybody gets free speech when there's no enforcement', it's just changing who is silenced I do think there are a lot of really difficult questions raised about this, because I also think that the platforms are not publishers, but they are in an intermediate area where they perform many of the functions that publishers used to perform A newspaper curates which articles are in it, which is not what platforms are doing, but they are getting closer and closer to that Something I come back to is that it is an uncomfortable amount of power for private companies to have, and so it does raise a lot of difficult decisions, but I do not believe that they are neutral So, for this part, I mentioned the Markkula Center earlier, definitely check out their site, Ethics in Technology Practice, they have a lot of useful resources, and I'm going to go through these relatively quickly, just as examples They give some deontological questions that technologists could ask Deontological ethics is where you have various rights or duties that you might want to respect, and this can include principles like privacy or autonomy How might the dignity and autonomy of each stakeholder be impacted by this project? What considerations of trust and of justice are relevant? Does this project involve any conflicting moral duties to others? In some cases there will be a conflict between the different rights or duties you're considering So this is just an example, and they have more in the reading, of the types of questions you could ask when evaluating whether a project is ethical Consequentialist questions: who will be directly affected, and who will be indirectly affected?
Will the effects in aggregate create more good than harm, and what types of good and harm? (Consequentialism includes utilitarianism as well as the common good) Are you thinking about all the relevant types of harm and benefits, including psychological, political, environmental, moral, cognitive, emotional, institutional, and cultural? Also look at long-term benefits and harms, and then who experiences them: is this something where the risks of harm are going to fall disproportionately on the least powerful?

Who is going to accrue the benefits? Have you considered dual use? These are, again, questions you could use when trying to evaluate a project, and the recommendation of the Markkula Center is that this is a great activity to do as a team and as a group Yes? I was going to say, I can't overstate how useful this tool is You might think, 'Oh, it's just a list of questions', but to me this is the big-gun tool for how you handle this Somebody is helping you think about the right set of questions, and then you go through them with a diverse group of people and discuss them, and that's gold So go back and reread these, don't just skip over them Take them to work, use them next time you're talking about a project They're a really great set of questions, a great tool in your toolbox And go to the original reading, which has even more detail and more elaboration on the questions Then they give a summary of five potential ethical lenses The rights approach: which option best respects the rights of all who have a stake? The justice approach: which option treats people equally or proportionally? These two are both deontological The utilitarian approach: which option will produce the most good and do the least harm? The common good approach: which option best serves the community as a whole, not just some members? Three and four are both consequentialist And then the virtue approach: which option leads me to act as the sort of person I want to be? That can involve particular virtues: do you value trustworthiness, or truth, or courage?
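Jeremy is about to compare these questions to tools like fastai or pandas; one hedged, literal-minded way to act on that is to keep the five lenses as a small checklist in the project repo and walk through it at reviews The arrangement below is purely illustrative, not anything the Markkula Center publishes as code:

```python
# Illustrative only: the five Markkula lenses as a checklist a team could
# keep in the repo and record answers against at each project review.
LENSES = {
    "rights":      "Which option best respects the rights of all who have a stake?",
    "justice":     "Which option treats people equally or proportionally?",
    "utilitarian": "Which option will produce the most good and do the least harm?",
    "common_good": "Which option best serves the community as a whole, not just some members?",
    "virtue":      "Which option leads me to act as the sort of person I want to be?",
}

def ethics_review(project: str, answers: dict) -> None:
    """Print each lens and the team's recorded answer, or flag it as missing."""
    print(f"Ethics review: {project}")
    for lens, question in LENSES.items():
        print(f"- [{lens}] {question}")
        print(f"    answer: {answers.get(lens, 'NOT DISCUSSED YET')}")

ethics_review("recommender-v2", {"rights": "Opt-out and data deletion supported"})
```

The value is obviously in the team discussion, not the printout; the file just makes the questions harder to skip.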
A great activity, if this is something that you're studying or talking about at work with your teammates, is that the Markkula Center has a number of case studies you can talk through, and they will even ask you to evaluate them through these five lenses and see how that impacts your take on what the right thing to do is It's kind of weird for a programmer or data scientist, in some ways, to think of these as tools like fastai or pandas or whatever, but they absolutely are These are like software tools for your brain, to help you go through a program that might help you debug your thinking Great, thank you And then, as someone brought up earlier, that was a very Western-centric intro to ethical philosophy: there are other ethical lenses in other cultures I've been doing some reading, particularly on the Maori worldview I don't feel confident enough in my understanding to represent it, but it is very good to be mindful that there are other ethical lenses out there, and I do very much think that the ethical lens of the people being impacted by a technology is what matters This is a particular issue when we have so many multinational corporations There's an interesting project going on in New Zealand now, where the New Zealand government is considering its AI approach and is, at least ostensibly, wanting to include the Maori view on that So that's a little bit of theory, but now I want to talk about some practices you can implement in the workplace Again, this is from the Markkula Center: this is their ethics toolkit, which I particularly like I'm not going to go through all of the tools, I'm just going to tell you a few of my favorites Tool 1 is Ethical Risk Sweeping, and this I think is similar to the idea of pen testing (that Jeremy mentioned earlier from security), but having regularly scheduled ethical risk sweeps While finding no vulnerabilities is generally good news, that doesn't mean it was a wasted effort, and you keep doing it, keep looking for ethical risk Assume that you missed some risks in the initial project development Also, you have to set up the incentives properly, where you're rewarding team members for spotting new ethical risks All right, I've got some comments here 'So my comment here is about the learning rate finder, and I'm not going to bother with the exact mathematical definition (partly because I'm a terrible mathematician and partly because it doesn't matter), but if you just remember' oh, sorry, that's actually not me, I am just

reading something from a language model of me that Patty Hendrix has trained So that was me greeting the language model of me, not the real me Thank you So that was Tool 1, and I would say another example of this is red teaming: having a team within your org that's trying to find your vulnerabilities Tool 3, another one I really like, is Expanding the Ethical Circle Whose interests, desires, skills, experiences, and values have we simply assumed rather than actually consulted? Who are all the stakeholders who will be directly affected, and have we actually asked them what their interests are? Who might use this product that we didn't expect to use it, or for purposes we didn't initially intend? A great implementation of this comes from the University of Washington's Tech Policy Lab, which did a project called Diverse Voices It's neat: they have both an academic paper on it and a lengthy guide on how you would implement it The idea is how to organize expert panels around new technology, and they did a few examples One was on augmented reality, where they held expert panels with people with disabilities, people who are formerly or currently incarcerated, and women, to get their input and make sure it was included They did a second one on an autonomous vehicle strategy document and organized expert panels with youth, with people who don't drive cars, and with extremely low-income people So I think this is a great guide if you're unsure of how to even go about setting something like this up, to expand your circle, include more people, and get perspectives that may be underrepresented among your employees I just want to let you know that this resource is out there Tool 6 is Think About the Terrible People This can be hard, because we're often thinking positively, or thinking about people like ourselves who don't have terrible intentions But really think: who might want to abuse, steal, misinterpret, hack, destroy, or weaponize what we build? Who will use it with alarming stupidity or irrationality? What rewards, incentives, or openings has our design inadvertently created for those people? Remembering back to the section on metrics: how are people going to try to game or manipulate this, and how can we then remove those rewards or incentives? So this is an important step to take And then Tool 7 is Closing the Loop: Ethical Feedback and Iteration, remembering that this is never a finished task, identifying feedback channels that will give you reliable data, integrating this process with quality management and user support, and developing formal procedures and chains of responsibility for ethical iteration This tool reminded me of a blog post by Alex Feerst that I really like Alex Feerst was previously the Chief Legal Officer at Medium, and I guess this was a year ago, he interviewed something like 15 or 20 people who have worked in trust and safety Trust and safety includes content moderation, although it's not solely content moderation One of the ideas that came up that I really liked came from people
who have worked in trust and safety for years at big-name companies One of them said, 'The separation of product people and trust people worries me, because in a world where product managers and engineers and visionaries cared about the stuff, it would be baked into how things get built If things stay this way (that product and engineering are Mozart and everyone else is Alfred the butler), the big stuff is not going to change' I think at least two people in those interviews talk about this idea of needing to better integrate trust and safety, who are often on the front lines of seeing abuse and misuse of a technology product, more closely with product and engineering, so that it can be more directly incorporated and you can have a tighter feedback loop about what's going wrong and how that can be designed against Okay, so those were, well, I linked to a few blog posts and research pieces I thought were relevant, but these were inspired by the Markkula Center's tools for tech ethics, and hopefully those

are practices you could think about potentially implementing at your company Next I want to get into diversity, which I know came up earlier Only 12% of machine learning researchers are women, which is a very dire statistic, and there's also an extreme lack of racial diversity, age diversity, and diversity on other factors, and this is significant A positive example of what diversity can help with comes from a post by Tracy Chou, who was an early engineer at Quora and later at Pinterest (I think she was one of the first five employees at Quora) She wrote, 'The first feature I built when I worked at Quora was the block button I was eager to work on the feature because I personally felt antagonized and abused on the site', and she goes on to say that if she hadn't been there, they might not have added the block button as soon as they did So that's a direct example of how having a diverse team can help My key advice for anyone wanting to increase diversity is to start at the opposite end of the pipeline from where people usually talk about, namely the workplace I wrote a blog post five years ago, 'If you think women in tech is just a pipeline problem, you haven't been paying attention', and it was the most popular thing I had ever written until Jeremy and I wrote the Covid-19 post last month, so now it's the second most popular thing I've written, and I link to a ton of research in there A key statistic to understand is that 41% of women working in tech end up leaving the field, compared to 17% of men So recruiting more girls into coding or tech is not going to address this problem if they keep leaving at very high rates I just had a little peek at the YouTube chat, and I see people are asking questions there I just want to remind people that Rachel and I do not look at that If you want to ask questions, you should use the forum thread, and if you see questions that you like, please vote them up, such as this one: 'How about an ethical issue bounty program, just like the bug bounty programs that some companies have?' I think that's a neat idea, rewarding people for finding ethical issues So, the reason that women are more likely to leave tech, and this was found in a meta-analysis of over 200 books, white papers, and articles, is that women leave the tech industry because they're treated unfairly, underpaid, less likely to be fast-tracked than their male colleagues, and unable to advance Too often, diversity efforts end up just focusing on white women, which is wrong Interviews with 60 women of color who work in STEM research found that 100 percent had experienced discrimination, and the particular stereotypes they faced varied by race, so it's very important to make women of color the top priority in diversity efforts A study found that men's voices are perceived as more persuasive, fact-based, and logical than women's voices, even when reading identical scripts Researchers found that women receive more vague feedback and personality criticism in performance evaluations, whereas men are more likely to receive actionable advice tied to concrete business outcomes When women receive mentorship, it's often advice on how they should change and gain more self-knowledge When men receive mentorship, it's public endorsement of their authority Only one of these has been
statistically linked to getting promoted: the public endorsement of authority All these studies are linked in another post I wrote called 'The Real Reason Women Quit Tech (and how to address it)' Is that a question, Jeremy? So if you're interested, in these two blog posts I link to a ton of relevant research on this, and I think the workplace is the place to start in addressing these things Another issue is that tech interviews are terrible for everyone So now, working one step back from the people who are already in your workplace, let's think about the interview process I wrote a post on how to make tech interviews a little less awful and went through a ton of research, and I will say that the interview problem, I think, is a hard one: it's very time-consuming and hard to interview people well But of the two most interesting pieces of research I came across, one was from

Triplebyte, which is a recruiting company (itself a Y Combinator company) that does a first-round technical interview for candidates, who then go on to interview at Y Combinator companies They have a very interesting dataset, because they've given everybody the same technical interview and can then see which companies people got offers from when they were interviewing at many of the same companies The number one finding from their research is that the types of programmers each company looks for often have little to do with what the company needs or does; rather, they reflect company culture and the backgrounds of the founders They even gave the advice that if you're job hunting, try to look for companies where the founders have a similar background to you, and while that makes sense, it's going to be much easier for certain people to do than others, particularly given the gender and racial disparities in VC funding, and that makes a big difference Yes Actually, I would say that was the most common advice I heard from VCs when I became a founder in the Bay Area: when recruiting, focus on getting people from your network, and people who are as like-minded and similar as possible That was by far the most common advice I heard Yeah, this is one of my controversial opinions: I get why people hire from their network, and I think that long term we all, and particularly white people, need to be developing more diverse networks That's like a ten-year project, not something you can do right when you're hiring, but it means really developing a diverse network of friends and trusted acquaintances over time But yeah, thank you for that perspective, Jeremy The other study I found really interesting was one where they gave people resumes: one resume had more academic qualifications and one had more practical experience, and then they switched the genders, giving one a male name and one a female name Basically, people were more likely to hire the man, and then they would use a post-hoc justification: 'Oh, well, I chose him because he had more academic experience', or 'I chose him because he had more practical experience' It's very human to use post-hoc justifications, but it's a real risk that definitely shows up in hiring 'Ultimately, AI, or any other technology, is developed or implemented by companies for financial advantage, i.e. more profit Maybe the best way to incentivize ethical behavior is to tie financial or reputational risk to good behavior, in some ways similar to how companies are now investing in cybersecurity because they don't want to be the next Equifax Can grassroots campaigns help encourage better ethical behavior with regards to the use of AI?' Oh, that's a good question Yeah, I think there are a lot of analogies with cybersecurity, and I know that for a long time people had trouble making the case to their bosses for why they should be investing in cybersecurity, particularly because cybersecurity is something where, when it's working well, you don't notice it, so that can be hard to build the case for So I think that there is a place for
grassroots campaigns, and I'm going to talk about policy in a bit It can be hard in some of these cases where there are not necessarily meaningful alternatives, so I do think monopolies can make that harder But yeah, a good question All right, so the next step, actually on this slide, is the need for policy I'm going to start with a case study of what's the one thing that gets companies to take action As I mentioned earlier, an investigator for the UN found that Facebook played a determining role in the Rohingya genocide I think the best article I've read on this was by Timothy McLaughlin, who did a super in-depth dive on Facebook's role in Myanmar People warned Facebook executives in 2013, and in 2014, and in 2015, about how the

platform was being used to spread hate speech and to incite violence One person in 2015 even told Facebook executives that Facebook could play the same role in Myanmar that radio broadcasts played during the Rwandan genocide, and radio broadcasts played a terrible and pivotal role in the Rwandan genocide Somebody close to this said that's not 20/20 hindsight: the scale of the problem was significant and it was already apparent Despite this, in 2015 I believe Facebook only had four contractors who even spoke Burmese, the language of Myanmar Question? That's an interesting one: how do you think about our opportunity to correct biases in artificial systems, versus the behaviors we see in humans? For example, a sentencing algorithm can be monitored and adjusted, versus a specific biased judge who remains in their role for a long time Well, theoretically, though, I feel a bit hesitant about the idea that it will be easier to correct bias in algorithms, because you still need people making the decisions to prioritize that: it requires an overhaul of the system's priorities, I think It also starts with the premise that there are people who can't be fired or disciplined, which, I guess, maybe for some judges is true, but that maybe suggests that judges shouldn't be lifetime appointments Yeah, and even then I think you need the change of heart of the people advocating for the new system, which would be necessary in either case That's the critical piece: getting the people who want to overhaul the values of a system So, returning to this issue of the Rohingya genocide, and this is a continuing issue: this is something that's just really stunning to me, that there were so many warnings, that so many people tried to raise an alarm on this, and that so little action was taken Even then, and this was probably two years ago, Zuckerberg finally said that Facebook would add dozens of Burmese-language content reviewers, but this was after the genocide was already happening So, in contrast to how Facebook failed to respond in any significant way in Myanmar: Germany passed a much stricter law about hate speech, NetzDG, with potential penalties of up to 50 million euros, and Facebook hired 1,200 people in under a year because they were so worried about that penalty I'm not saying this is a law we want to replicate here, I'm just illustrating the difference between being told that you're playing a determining role in a genocide, versus facing a significant financial penalty We have seen what the one thing that makes Facebook take action is, and I think that is really significant in remembering the power of a credible threat of a significant fine, and it has to be a lot more than just a cost of doing business So, I really believe that we need both policy and ethical behavior within industry I think that policy is the appropriate tool for addressing negative externalities, misaligned economic incentives, race-to-the-bottom situations, and enforcing accountability However, the ethical behavior of individuals, of the data scientists and software engineers working in industry, is very much necessary as well
Because the law is not always going to keep up, and it's not going to cover all the edge cases, we really need the people in industry to be making ethical decisions as well So I believe both are significant and important Something to note here is that there are many, many examples of AI ethics issues, and I haven't talked about all of these, but there was Amazon's facial recognition: the

ACLU did a study finding that it incorrectly matched 28 members of Congress to criminal mugshots, and this disproportionately included Congresspeople of color There was also a terrible story (the article about it was good, but the story is terrible) of a city using an IBM dashboard for predictive policing, where a city official said, oh, whenever you have machine learning it's always 99% accurate, which is false, and quite concerning In 2016, ProPublica discovered that you could place a housing ad on Facebook and say 'I don't want Latino or Black people to see this', or 'I don't want wheelchair users to see this housing ad', which seems like a violation of the Fair Housing Act So there was this article, and Facebook said we're so sorry, and then over a year later it was still going on, and ProPublica went back and wrote another article about it There's also the issue of dozens of companies placing job ads on Facebook and saying 'we only want young people to see this' And there's Amazon creating the recruiting tool that penalized resumes that had the word 'women's' in them Something to note about these examples, and many of the examples we've talked about today, is that many of them are about human rights and civil rights There's a good article by Dominique Harrison of the Aspen Institute on this, and I agree with Anil Dash's framing: he wrote that there is no technology industry anymore, tech is being used in every industry So I think in particular we need to consider human rights and civil rights, such as housing, education, employment, criminal justice, voting, and medical care, and think about what rights we want to safeguard, and I do think policy is the appropriate way to do that It's very easy to be discouraged about regulation, but I think sometimes we overlook the positives, or the cases where it has worked well Something I really liked about 'Datasheets for Datasets' by Timnit Gebru et al is that they go through three case studies of how standardization and regulation came to different industries: the electronics industry, around circuits and resistors, where it's about standardizing what the specs are and what you write down about them; the pharmaceutical industry; and car safety None of these are perfect, but the case studies were very illuminating In particular, I got very interested in the car safety one, and there's also a great episode of 99% Invisible, a design podcast, about it Some things I learned: early cars had sharp metal knobs on the dashboard that could lodge in people's skulls in a crash Non-collapsible steering columns would frequently impale drivers, and even after the collapsible steering column was invented, it wasn't actually implemented because there was no economic incentive to do so, yet the collapsible steering column has saved more lives than anything other than the seatbelt when it comes to car safety There was also a widespread belief that cars were dangerous because of the people driving them, and it took consumer safety advocates decades just to change the culture of discussion around this, to start gathering and tracking the data, and to put more of an onus on car companies around safety GM hired a private detective to trail Ralph Nader and try to dig up dirt on him So this was really a battle that we kind
of take for granted now, and it shows how much it can take to move the needle there A more recent issue is that it wasn't until, I believe, 2011 that crash test dummies were required to start representing average female anatomy; previously, crash test dummies were just modeled on men, and in a crash with the same impact, women were 40% more likely to be injured than men, because that's who the cars were being designed for So I thought all this was very interesting, and it can be helpful to remember some of the successes we've had Another area that's very relevant is environmental protections, and looking back, Maciej Ceglowski has a great article on this

But just remember that in the US we used to have rivers that would catch on fire, and London had terrible smog, and these are things that would not have been possible to solve as an individual: we really needed coordinated regulation All right, then, on a closing note: I think a lot of the problems I've touched on tonight are really huge and difficult problems, and they're often very complicated I go into more detail on this in the course, so please check out the course once it's released I always try to offer some steps towards solutions, but I realize they're not always as satisfying as I would like, not a 'this is going to solve it', and that's because these are really, really difficult problems Julia Angwin, a former journalist at ProPublica and now the editor-in-chief of The Markup, gave a really great interview on privacy last year that I liked and found very encouraging She said, 'I strongly believe that in order to solve a problem, you have to diagnose it, and that we're still in the diagnosis phase of this If you think about the turn of the century and industrialization, we had, I don't know, 30 years of child labor, unlimited work hours, terrible working conditions, and it took a lot of journalist muckraking and advocacy to diagnose the problem, and have some understanding of what it was, and then the activism to get laws changed I see my role as trying to make as clear as possible what the downsides are, and diagnosing them really accurately so that they can be solvable That's hard work, and lots more people need to be doing it' I found that really encouraging: I do think we should be working towards solutions, but at this point, even better diagnosing and understanding the complex problems we're facing is valuable work A couple of people are very keen to see your full course on ethics Is that something they might be able to attend or buy? It will be released for free at some point this summer There was a paid, in-person version offered at the Data Institute as a certificate, similar to how this course was supposed to be offered in person; the data ethics one was in person and took place in January and February I'm currently teaching a version for the Masters in Data Science students at USF, and I will be releasing the free online version later, sometime before July Thank you, I will see you next time