Manipulating the YouTube Algorithm – (Part 1/3) Smarter Every Day 213

– A couple of months ago I made a Twitter thread about some weird activity I saw online, and after I posted that thread, tons of engineers from many different tech companies reached out to me privately to tell me their stories. My interest in all this started one day when I was scrolling on YouTube, and the algorithm served up a pretty weird video for me to watch. You know how the algorithm works, right? It looks at your past activity and tries to figure out what you could watch in the future that would keep you on the platform the longest. It optimizes watch time. The algorithm suggested I watch this video. Now, I’m not a super political guy, but I know those are important topics, and it had 138,000 views and it was only one day old, so to me this looked like it was a real news story. So when I clicked on the video it got weird fast. Strange music starts playing, and a robot voice comes on and clearly starts reading me a script. – [Voiceover] After Trump sends note to Ginsburg, he breaks silence on plan for Supreme Court. Democrats didn’t think Donald would dare. Get ready, America. Ruth Bader Ginsburg hasn’t been seen in public in for 57 days. – There were red flags all over the place. The robot voice was reading typos. The name of the channel itself was some generic fake news site. There’s no way this is legitimate news. But the problem is it had engagement levels that were off the chart. 94% like-to-dislike ratio. Look at all these comments. How can this be happening on a video of such low quality?
I started diggin’ a little bit deeper, so I searched YouTube for the exact same title to see what came up, and whoa, look at this. All of these videos are exactly the same title, exactly the same script, but they’re all just a little bit different. If you play the videos you get different graphics scrolling across the scene. You might get a different robot voice. – [Voiceover] Breaking News Mencos, after Trump sends note to Ginsburg he breaks silence on plan for Supreme Court– (orchestral alert) – [Voiceover] After Trump sends message to RBG he breaks silence on big plans for Supreme Court. – [Voiceover] After Trump sends note to Ginsburg he breaks silence on plan for Supreme Court. (loud alert) – [Voiceover] After Trump sends note to Ginsburg he breaks silence on plan for Supreme Court. – The content was essentially the same, it was just arranged in a different way. Different photos, different B-roll, different title screens. I’m a YouTuber and I spend some time thinking about content ID systems and things like that, and it’s clear to me that these manipulations of the video and even the audio are attempts to get around YouTube’s automated recognition systems. Let me explain how this works. YouTube engineers look at the individual pixels on a video and then they use the values of these pixels to perform some type of mathematical function, which gives you a number called a hash. You then compare hashes to other videos that have been uploaded and try to figure out if the same content has been uploaded by someone else in the past. The challenging task here is to make a system that’s fast enough to find the exact copies of specific frames of video across the entire YouTube library while at the same time making it smart enough to not be tricked. How would you do that with math? Instead of sampling every pixel, what if they sampled specific spots on every video and measured the color values at those specific locations?
They could then compare those spots with every other video uploaded to YouTube. But think about what happens if a sneaky person resizes those images. The colors would change at those locations. The same thing would happen if you rotate the image or flip it or even apply a filter of some kind. Now I don’t know how YouTube samples these pixels or the audio from the video or what mathematical functions they use, but I know that’s like a company secret because that’s how they defend the platform. If the bad guys were to figure out how these detection algorithms work then they could get around them and they could beat the defenses. If you look at these crazy videos like a software engineer you can start to see some really interesting details. For example, why would there be a globe spinning in this image? Well, if you think about it, that’s going to change the hash. What if the YouTube engineers figured the globe trick out and then they shut that technique down? Well, in this video the globe is mirrored. That’s going to change the hash in a different way. This specific instance is a counter-counter-countermeasure, which is fascinating. This one is probably my favorite. Why on earth would they put virtual snow falling over the top of a video of the royal family?
If you think about it, this changes the math in a kind of random way, and therefore, YouTube’s ability to detect it. This is like a new form of camouflage that’s using math instead of colors on this new battlefield where daily fights take place by opposing forces of software engineers. But instead of fighting over hills or pieces of land, the winner of these individual skirmishes gets a few moments of your time, which in today’s attention economy is super valuable. Because of the original video I found, I assumed most of this activity was on the right. But I found these videos just two clicks away from a mainstream channel on the left. (intense music) – [Voiceover] Speaker of the House, Nancy Pelosi just keeps racking up wins over Donald Trump. (intense alert) – [Voiceover] Speaker of the House, Nancy Pelosi, just keeps racking up wins over Donald Trump. – It’s the exact same stuff just trying to manipulate things in the other direction. If you look at this channel specifically, it was started over 10 years ago by what appears to be an actual human. It uploaded a bunch of gaming content. And then at this moment right here it started uploading videos about politics. At this point it’s clear to me this is not just low quality content. This is a coordinated attack against the YouTube algorithm, complete with countermeasures. This is a serious, well-funded activity done by people meant to do harm. If you’re a teen or 20-something you probably think these old people are getting duped into voting for someone that doesn’t make sense. And if you’ve got a few more miles on the tires, you might be looking at the younger generation and thinking, man, how are all these manipulative people able to whip them up into a frenzy so easily? So is the internet getting worse because we’re getting worse and this is just a reflection of us, or is there actually someone playing with the dials and pitting us against each other?
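As a rough illustration of the hashing idea described earlier (sample a few pixel locations, turn the values into a hash, compare hashes), here is a minimal Python sketch. It is not YouTube's method, which the video itself says is a company secret. The synthetic frame, the sample points, and every function name below are hypothetical, chosen only for illustration. The sketch also includes a coarser "average hash", a commonly described perceptual-hashing technique swapped in here to show what tolerance to small edits can look like; nothing in the video says YouTube uses it.

```python
import hashlib
import random

# Hypothetical (x, y) sample locations; any real system's choices are unknown.
SAMPLE_POINTS = [(5, 7), (20, 40), (33, 12), (50, 60), (61, 3)]


def make_frame(width=64, height=64):
    """Build a synthetic grayscale frame: dark on the left, bright on the right."""
    return [[(x * 255) // (width - 1) for x in range(width)] for _ in range(height)]


def sampled_hash(frame, points=SAMPLE_POINTS):
    """Exact-match hash built from pixel values at a few fixed locations."""
    values = bytes(frame[y][x] for x, y in points)
    return hashlib.sha256(values).hexdigest()


def mirror(frame):
    """Flip the frame left to right, like the mirrored globe in the video."""
    return [row[::-1] for row in frame]


def add_snow(frame, flakes=40, seed=0):
    """Overlay a few white pixels at random positions (toy 'virtual snow')."""
    rng = random.Random(seed)
    snowy = [row[:] for row in frame]
    for _ in range(flakes):
        x = rng.randrange(len(snowy[0]))
        y = rng.randrange(len(snowy))
        snowy[y][x] = 255
    return snowy


def average_hash(frame, grid=8):
    """Coarse hash: average each block of pixels, then threshold on the mean."""
    height, width = len(frame), len(frame[0])
    block_h, block_w = height // grid, width // grid
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            block = [frame[y][x]
                     for y in range(gy * block_h, (gy + 1) * block_h)
                     for x in range(gx * block_w, (gx + 1) * block_w)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return [1 if cell > mean else 0 for cell in cells]


def hamming(a, b):
    """Count the positions where two bit lists differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(a, b))


if __name__ == "__main__":
    original = make_frame()
    flipped = mirror(original)
    snowy = add_snow(original)
    print("sampled hash survives mirroring:",
          sampled_hash(original) == sampled_hash(flipped))
    print("coarse-hash bits changed by snow:     ",
          hamming(average_hash(original), average_hash(snowy)))
    print("coarse-hash bits changed by mirroring:",
          hamming(average_hash(original), average_hash(flipped)))
```

On this toy frame, mirroring changes the sampled hash and flips most bits of the coarse hash, while a light dusting of snow flips few or none. That is one way to picture the back-and-forth described above: each manipulation is aimed at whatever the current comparison is sensitive to, and each more tolerant comparison invites a different manipulation.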
Today’s social discourse takes place on the public forum of the internet. And front and center in that forum are three primary platforms: YouTube, Twitter, and Facebook. This video is the first in a three-part series on what exactly these external forces are doing to manipulate our social media platforms and how they’re doing it. Now a key to a good lie is to convince you that there is no deception. When I started trying to research this stuff, there’s all kinds of information on the internet, but it’s very difficult to cut through all the falsehood. So my approach is pretty simple. I’m literally going to get on airplanes and fly to the engineers who are trying to beat this stuff and have a straight-up conversation with ’em. We’re gonna get to Twitter and Facebook later, but for the purposes of this video let’s look at specific active attempts to manipulate the YouTube algorithm. Okay, it’s time to move past the speculation stage to the actual engineering data stage. So I’m here at this building in California. The person I’m gonna talk to does not wanna be on camera, so we’re gonna respect that, but I’m gonna go find out exactly what went down with these specific videos and report back. I was just gonna stand in front of the sign and tell you what happened, but I have to think about how I’m gonna say this. This is complicated. I don’t want to attribute any words to YouTube here, so let’s just assume these are my words, but there seems to be two types of attack against the YouTube algorithm. Number one, there seems to be a financial motivation. People are trying to create videos to extract ad revenue from YouTube. And so this is legitimate content. There’s nothing outside of the terms of service here, except maybe the fake engagement policies that YouTube has. But for the most part, it’s legitimate content uploaded and meant to extract money from YouTube. Number two are the ideological attacks. These are attacks meant to sway public opinion and make people think certain things and
perhaps even make people fight with each other. To understand this better I rented an office in San Francisco to interview Renee DiResta. She’s an expert in malicious propaganda online. Okay, this is Renee DiResta. She is super smart on, I guess you’d say, coordinated inauthentic behavior, incentivized content on social media and how to beat it, right? – Yep, I look at how different types of actors manipulate the social ecosystem across platforms. – Awesome, I loved your stuff on Rogan and Sam Harris podcasts. – Thank you. – You’re just great. If you’re paying attention, you’ve seen what she does. I wanna talk about what’s going on on YouTube. – Okay. – Recently I found this really weird video. It’s clearly manufactured content. And from what I can tell there’s two reasons that there’s manufactured content. Number one is it’s financially motivated. And the second thing is ideological. – Yes. – Right? Is that correct? Is there a third component I’m not seeing? – No, that’s correct. And there’s actually a lot of, there’s actually some overlap there. Because if you’re producing really partisan content, particularly sensational stuff, you’re able to capture engagement and get people paying attention. Because particularly right now in a highly partisan, polarized country, people are looking for that stuff. And they’re not necessarily paying a lot of attention to who the source is. So if you make something that looks interesting you’ll be able to theoretically attract views, keep people there, and then you can both monetize and do something divisive. You’re gonna use fake accounts to social it with end groups and then you’re gonna try to get a critical mass of real people to come and amplify it. – In order to get these videos in front of human eyeballs you have to first trick a robot algorithm type thing, and the way you do that is with artificial engagement. Artificial engagement is done with fake logins or compromised accounts. They sell them like wine on the black market. A new one’s gonna set
you back about a quarter. A 2014 is gonna set you back about seven bucks. Renee showed me some footage of what she calls a click farm. They use these devices to try to artificially inflate engagement online. You can easily find these places online that will sell these services to you straight up. – You know, a lot of us, we use the number of views, number of likes, number of followers as like a heuristic for quality, and so there are hundreds and hundreds and hundreds of these businesses that just offer you things like views. So these are people who are just selling, selling likes. Funny enough, based on the number of likes on the ad for YouTube likes I would bet that they’re gaming their Instagram (mumbles), too. (laughing) The internet is fake. – I’ve been doing this wrong. (laughing) I’ve been trying to make quality content this whole time.

The strategy seems to be pretty simple. You make a bunch of videos on one particular topic. You put ’em online and then the metadata points to each other, right? At this point the artificial compromised accounts are used to give them artificial lift, and at some point, one of these videos will creep up above the noise in the algorithm and it will start to get shown to actual humans. It’s really easy to get mad at YouTube at this point, right? Look at all this stuff that’s happening on the platform. But let’s step back, think about it. If you were a software engineer, how would you use math and algorithms to detect this activity? I would argue that this is very, very clever, and it’s very hard to detect this in an automated way. If you look at the engagement on these videos, the majority of these comments are actual humans discussing the videos. These are real people engaging with this content. From an engineering perspective, this is extremely difficult to detect, especially at any kind of meaningful scale. So we understand the fake engagement piece, right? But think about the content creation itself. For a video that I’m proud of, for example, it will take dozens of hours for me to make this thing, right? In this one particular case we’re talking about, we saw dozens of videos uploaded in the same day. So clearly computers are involved. But how are they doing it?
Believe it or not, there’s already an entire industry built around this technology. It’s a great way to get these small stories out. Whatever website you go to to get your news, you’ve probably seen these things, automatically generated videos. Several companies offer these types of services. One is called Wochit. If you go to the Wochit website, they boast over 1.5 million videos were created on behalf of their customers last month. These are videos created for businesses you recognize. On their website they show how the system works. You type in the topic you wanna make a video about. They ingest millions of pieces of licensed content from different sources. You slap in a script and you have a video within minutes. This is a very expensive business-to-business service. But for these businesses trying to make it in the attention economy, it’s totally worth it. Now let’s think about YouTube. The Wochit News YouTube channel has uploaded over 3,000 videos in the last two months. Most of these use actual voice actors reading a script. This is an incredible amount of content. Stop and think about what that could mean for the future of YouTube. It works like this. You have all of this content like B-roll, photos, audio, things of that nature. And it goes into this machine and out pops these videos. Which is cool if you’re a newsroom and you’re using a service like Wochit to try to create content for legit users online. But the problem is, this is just technology. Think about what would happen if this was developed by people of ill intent. If you’re clever you can change the content that you’re putting into the machine and the machine can start creating videos, each video with its own special flair so it can get around the countermeasures built into the YouTube system, and you simply upload all these to different YouTube channels that you’ve created with fake email accounts and there you go. You’re suddenly flooding YouTube with automatically created content that has the incentives of making you
money or changing the way people think. It’s probably mostly financially motivated. – I think it’s mostly financially motivated. I think Facebook has said the same thing about propaganda. A lot of the stuff on their site, coordinated inauthentic activity, mostly economically motivated. Even during 2016, now the notion of fake news is so tied to Russia, but fake news wasn’t actually about Russia. If you remember, back in 2016 during the campaign it was about people just creating these hyper-partisan sites that were literally fake news, demonstrably fake, Pope endorses Donald Trump and this kind of stuff. And it was just pushing people to the sites to try to make money on the ads. And so that’s what I think a lot of the challenge here is. The really strong actors, the nation-states that are trying to do this kind of stuff, will spend the years to build up the audiences over time, and then you have these more fly-by-night operations where a blog spins up overnight, they game their way into distribution, and then they just make money on the ads. – Someone that’s doing this subversive activity, if they’re doing it we’re not gonna know. – Right, for a long time, for a long time. Unless they make a mistake. – So it’s happening right now. – Yeah. – It’s happening right now, probably in the sidebar of this video people are watching right now, we just may not know it’s inauthentic behavior. – I think it’s really hard to find this stuff. It gets better and better. That’s the other thing. I think people assume it’s obviously fake or obviously, obviously incorrect English or obviously sensationalist memes or something like that. No, they actually just started repurposing their content from our own real, authentic, hyper-partisan pages. – Can we just stop and be disturbed that this is the kind of content that’s getting real eyeballs? And I know what you might be thinking, well, it’s clearly different. I could tell the difference in that. But think about who I am, right?
I’m an engineer who understands countermeasures. I think through strategy. I spend hundreds of hours a month thinking about the YouTube algorithm.
I tailor my thumbnails, I understand how titles work. All of this to say I still got tricked. And so it’s a cat and mouse game. Well, basically you have offensive content, then you have a countermeasure, and then you have a counter-countermeasure, and then the YouTube engineers have to develop a counter-counter-countermeasure, and so this just continuously ratchets up and I don’t see a way to win, you’re not gonna win this. – No, you’re never gonna win. There’s no winning. It’s managing. – All you have to do is increase the cost for the adversary to influence society. Regardless of whether this material is made in some far-off land to make a quick buck, or if it’s from a malicious nation trying to influence a foreign election, it’s all taking advantage of this flaw in your heart, the desire to fight with your neighbor. These people literally make us hate each other and then we turn around and give them our money. If your first inclination is to be mad at YouTube right now and some kind of outrage, then you don’t get it. Like you don’t see what’s happening here. I know these engineers. They’re using all the math at their disposal to try to fix this as desperately as they can, but until our hearts change towards political grace these people are gonna keep taking advantage of us. I don’t care what kind of laws we make to try to get around this, they’re going to make us fight and we’re gonna sit there and do it and then close our eyes and give them our money. We’ve got to be smarter than this. – I think a lot of the countering has to be done in the real world, right? It’s the, I think you had said this in a Twitter thread of yours I’d read where you talked about the need to actively practice– – Love thy neighbor kind of stuff. – Right, right. I think that that’s, unfortunately, it is a, they’re preying on human biases. This is the thing, we have a brand new information ecosystem, right?
We have democratized creation of content. We have no more gatekeepers. Anyone can say what they want, do what they want, maximized expression, algorithms to help you find people, but ultimately human nature, like the people have not changed. So it’s this fascinating new information environment ecosystem but with a very old set of biases and ways of being that are just kind of part of the human experience, and I don’t think we’re necessarily adept at recognizing what social media has done to us as individuals and as members of society. And that I think is one of the key challenges, where no amount of regulating of algorithms or catching of bad guys changes that kind of fundamental truth. Trying to use real community to kind of return people to that human connection is the thing that we’re missing right now, and that’s because it’s much harder to do that, to create the kind of active unity that you’re talking about, ’cause like who’s in charge of doing that? Normally that would have been like your churches, your neighborhood, your community. I think, I don’t know what that looks like ported online where everybody’s spending their time. – I’m not trying to scare you by showing you all this stuff. Obviously there’s a lot of hardcore engineering thought that goes into everything I just showed you, but it’s happening. There are bad actors trying to manipulate people online for financial gain and I don’t think it’s YouTube’s fault. When somebody wants to do bad things to you, they’re gonna do whatever they can to get your data and exploit it. When I realized this series was gonna help educate people I reached out to a company that takes online internet security seriously, and that is ExpressVPN. They were all in. They decided to sponsor this video series and help make it happen. If you’re not using a VPN you seriously need to consider it. Your internet connection right now is broadcasting your IP address, which is the way people track you online. Check this out, when I go to this website,
it tests everything that my internet connection is leaking. You can see my IP address, which I’ve masked here ’cause I don’t want you to see it, but you can also see where I’m at. Your internet connection is doing this right now. Here’s how ExpressVPN works. It’s a virtual private network. You just turn it on with one click. My internet traffic is now encrypted, and going through one of ExpressVPN’s servers located throughout the world so people can no longer figure out where I’m located. I protect myself online with ExpressVPN, and if you want that same protection you can get that by going to expressvpn.com/smarter, you can download ExpressVPN today. If you get 12 months you get an additional three months for free. That brings the cost down to less than seven bucks a month. If you’re super smart and you understand the technical stuff behind all this, go do a DNS leak test. It’s gonna pass. It’s the fastest one I’ve ever tested. TechRadar said this is the number one VPN. There’s a 30-day money back guarantee, which I’m confident you’re not gonna use because I feel comforted knowing that my personal information is protected on the internet. Expressvpn.com/smarter, you can protect your internet connection right now. Next up, Twitter and Facebook. They were awesome. They let me set up cameras in the building. They also let me talk to some of the engineers in charge of building these countermeasures. It is a fascinating discussion. I hope you’ll join me as we continue this algorithm manipulation series. I think this is super important stuff.

If there’s someone that you think could benefit from knowing that this is how the internet actually works, please pass this video along to ’em. Also, consider subscribing if you feel like this earned it. This is a ton of work and I hope it brings value to your life. That’s it. I’m Destin, you’re getting smarter every day. Have a good one.