👎 This week was something I could have done without

A summary of the paper that lost Timnit Gebru her job | Facebook are forced to be transparent, for once

Greetings all, I do hope this instalment of Horrific/Terrific is as enjoyable as a satisfying bowel movement.

This week's rating: something I could have done without 👎 (that's only one better than HORRIFIC, oh no!). Why's that?

  • We discover the content of the AI research paper that Google are wilfully suppressing
  • Even the far right can't give up Facebook, the strongest of all the cults
  • The mayor of London seems to champion facial recognition used by law enforcement

💬 In a shocking turn of events, WhatsApp doesn't like Apple's new privacy labels

On Tuesday, Apple made it so that anything going into the app store needs to be clearly labelled with what kind of data the app collects about its users. Let's ignore for a second how basic and non-descriptive their 'streamlined' labels are (e.g. 'user content', what the hell is that?), and focus on the key, hilarious thing: WhatsApp are calling Apple out for this decision, on the grounds that it is anti-competitive.

What WhatsApp are saying (with reality italicised):

  • Apple's native apps, such as iMessage, do not seem to have these labels, therefore this is unfair. Moaning about anti-competitive behaviour is pretty rich coming from Facebook. People should be free to NEVER download any third-party apps, if they so choose (an iPhone containing only native apps is probably the most private kind of phone you can have).
  • The labels only look bad and do not help users understand what we do to protect data. Yes, it does 'look bad' if you collect as much data as you do on a messaging service. Yes, you're right... that is bad: you are doing a bad job of this.
  • Our advertisers could see weaker ad performance from iOS users. I.e. 'We will lose money.'

Sorry, WhatsApp, you can't hide behind your bare minimum end-to-end encryption thing forever, so please get a grip. For your interest, here's what WhatsApp are putting on their label.


🎓 Here's a breakdown of the paper that Timnit Gebru was fired over

Yes... fired. She was fired. If you disagree, that's fine. Someone has to be wrong.

In case you have no idea what I'm talking about, Timnit Gebru was a top AI ethics researcher at Google, and there was disagreement over a research paper that outlined potential risks of using large data sets, specifically with language models. Google, cementing their role as a propaganda machine, insisted the paper should not be released, because it would reflect badly on them.

What's in the paper then?

I'm going to super-summarise it for you (you're going to sound SO smart at your next social gathering):

โ˜๏ธ FIRST RISK: the environment. Do not underestimate the amount of computing power it takes to train large AI models. Especially when you're working with something with a lot of parameters. Look here...

  • BERT (built by Google themselves, it's what the search engine uses) has 110m parameters. To train it ONCE (just once) creates 1,438 lbs of CO2 (or, a plane flying from New York to San Francisco and back)
  • Transformer, with 210m parameters, outputs over 600k lbs of CO2, if you train it once.
  • GPT-2, by OpenAI (the one that is easy to confuse with a human sometimes) has... 1.5 billion parameters 😵. There's no data on how much CO2 this creates yet. Perhaps because numbers don't go that high...
  • By the way, GPT-3 has 175 billion parameters. Hopefully you get my point.
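If you want to feel the scale of those numbers, here's a purely illustrative back-of-envelope sketch in Python. It uses only the figures quoted above, and reuses the article's own equivalence (BERT's 1,438 lbs ≈ one New York–San Francisco round trip) as a very rough unit; nothing here is an official measurement.

```python
# Back-of-envelope comparison using only the CO2 figures quoted above.
# The text equates BERT's single training run (1,438 lbs of CO2) with
# one round-trip NY-SF flight, so we reuse that as a rough unit.

co2_lbs = {
    "BERT (110m params)": 1_438,
    "Transformer (210m params)": 600_000,  # "over 600k lbs", per the text
}

LBS_PER_ROUND_TRIP = 1_438  # one NY-SF round trip, per the BERT comparison

for model, lbs in co2_lbs.items():
    flights = lbs / LBS_PER_ROUND_TRIP
    print(f"{model}: {lbs:,} lbs of CO2 ≈ {flights:,.0f} round trips")
```

By that crude yardstick, one Transformer training run is roughly four hundred transcontinental round trips, and that's before anyone retrains it.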

โœŒ๏ธ SECOND RISK: large, crappy data sets. An absolute classic of a problem. The more data you have, the better you can train your model. A good way to get a lot of text data is to scrape it from the web. Yes the web, that place where people gather to be sexist, racist, and generally hateful. This language finds its way into training data, and the machine, of course, regards this data with the same cold indifference as, say, an article about knitting. You also only get data from the richest people on Earth, because not everyone can afford to be as online as you or I might be...

🤟 THIRD RISK: misdirected research effort. Large language models are very good at manipulating language, but not at understanding it. So let's put more research into building AI that can understand language, right? WRONG. Big Tech firms have enough money and resources to manipulate language accurately enough for their purposes, so more work goes into building that kind of model.

🖖 FOURTH RISK: the machines will be used against us, the humans. Just like the Matrix, but actually believable. The language spat out by one of these large language models can fool humans quite easily. Imagine generating misinformation at the scale and speed of a machine. Just think of how EVIL you could be. Also, if you're Facebook, your language model may mistranslate 'good morning' as 'attack them'. Oh, non-English languages, why are you so hard?

IMPORTANT NOTE: I did not get my hands on the paper myself, I merely summarised what the good people of MIT Technology Review already summarised.


🥗 A side-salad to all of this week's meaty learnings

The whole AI ethics research paper thing was pretty heavy, so here are just some other (smaller) reasons why this week should kindly eat a buffet of dicks.

Sadiq Khan says: "I want the police service to evolve and find new ways of detecting and fighting crime". And if shoddy facial recognition is the way to do that, then bring it on!

France have fined Amazon and Google for doing what they always do, which is to surveil their users with aggressive tracking technologies. Google have to pay €100m, and Amazon €35m, which I'm sure will put a MASSIVE dent in their businesses, definitely 🙄.

The far right are using Facebook to discuss how bad Parler is, and it's quite funny. Some people even think being in a Facebook group that rallies people to join Parler is the same as being on Parler. All this demonstrates to me is just how hard it is to 'leave' Facebook.

Finally, a thought directly from my head: I was pleased to finally see some talk about the environmental implications of computing power (in Gebru's paper). I want to see more discussion about this in general (does it exist already? Where can I find it?). E.g. can we stop celebrating 'the cloud' as a great place to store stuff? Isn't the cloud just a bunch of ice cap-melting server farms?? Consider your own local storage solutions, readers.


Sorry, I couldn't be bothered to add gifs to illustrate my points. Instead I just added more emojis, hope that's cool...