What is it like to be an algorithm

What is it like to be an algorithm? What does it mean to understand something?
We will never know if an AI is conscious, any more than I can be certain that you are conscious. But we can try to put ourselves in its shoes and see the world through its eyes.


There’s a trend in machine learning to amass a lot of data and draw conclusions without really “understanding” it. Critics have claimed that this kind of AI, like GPT, may seem to produce impressive results but does not really understand the world. And to a large extent, I agree, even though it still has merits (see our podcast episode on this ^^).

This work lets you see what a machine learning algorithm does. You’ll see text designed to carry as few connotations as possible. If you can draw meaning from a succession of symbols without any kind of reference to the real world, so could an AI. If we all converge to the same kind of semantics, whatever it may be, then that semantics is in some sense universal, and an algorithm could also access it.

Let’s explore this question by turning it into a collaborative interpretation exercise!

………………..

 

………………..

..ᚨᛃ……………

..ᚨᛃᛟ………….ᚨᛊᛟ.

..ᚨᛃᛟᛟ…..ᚨᛊᛟᛟ……..

..ᚨᛃᛟᛟᛟ.ᚨᛊᛟᛟᛟ…………

..ᚨᛃᛟᛟ…..ᚨᛊᛟᛟ……..

..ᚨᛃᛟ………….ᚨᛊᛟ.

..ᚨᛃ……………

………………..

 

.ᚨᛃᛟᛟ…….ᚢ…….ᚨᛊᛟᛟ.

….ᚨᛃᛟᛟ….ᚢ…….ᚨᛊᛟᛟ.

……ᚨᛃᛟᛟ.ᚢᛃᛟ…….ᚨᛊᛟᛟ.

..ᚨᛃᛟᛟ…..ᚢᛃᛟ…….ᚨᛊᛟᛟ.

………ᚢᛃᛟ…….ᚨᛊᛟᛟ.

………ᚢᛃᛟ….ᚨᛊᛟᛟ….

………ᚢᛃᛊᛟᛟ.ᚨᛊᛟᛟ……..

………ᚢᛃᛊᛟᛟ……ᚨᛊᛟᛟ..

………ᚢᛃᛊᛟᛟ………

………ᚢᛃᛊᛟᛟ………

………ᚢᛃᛊᛟ………

………ᚢᛃᛊ………

………ᚢᛃᛊ………

Making a self-aware twitter AI with GPT2

The story so far

It was more than a year ago that I had my playing-with-GPT2 phase, resulting in a short story co-written with the AI and this little blog http://yo252yo-a.tumblr.com/ which I kinda stopped updating after a while.

But I was bound to come back to it some day! It all started when I decided to open a twitter account for my podcast. I very naturally made a little script to schedule my tweets from a Google Spreadsheet queue (^^), obviously. I also went back through the archive of my facebook/tumblr/whatever posts to see what could fit this new account, since I’ve posted so many enlightening things over the years xD
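For the curious, the gist of that scheduling idea looks roughly like this. This is a hypothetical Python sketch using gspread and tweepy, not my actual script; the sheet name and all credentials are placeholders:

```python
# Hypothetical sketch of the tweet queue: read the oldest row from a Google
# Spreadsheet and post it to twitter. Sheet name, layout and credentials are
# placeholders, not my actual setup.
import gspread
import tweepy

gc = gspread.service_account(filename="service_account.json")
queue = gc.open("podcast_tweet_queue").sheet1  # one queued tweet per row, column A

auth = tweepy.OAuth1UserHandler("API_KEY", "API_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

rows = queue.get_all_values()
if rows:
    api.update_status(rows[0][0])  # post the oldest queued tweet
    queue.delete_rows(1)           # and pop it from the queue
```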

Once this was in place, it was like my twitter account was managed by a nice little bot (who was simply posting things from a queue, but still). As its parent, I obviously wanted to see it grow: how cool would it be if it could learn and evolve by itself? Could it ever be self-aware (lol)? After all, it already had access to twitter, and it had a collection of my tweets to learn from.

Setup

So I dusted off my colab repository of GPT2, since GPT3, despite all the hype, remains pretty inaccessible. Most notably, I had to make it work with an old version of tensorflow (the recent versions broke it), and I also made it read and write directly to Google Spreadsheet /o/ In the end, I only had to run the code in the colab to fetch the data, train on it, and post the output directly in the queue to be tweeted. Pretty sweet setup.
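Roughly, the colab chains three steps: pull the corpus from the spreadsheet, fine-tune, and push candidates back into the queue. A minimal sketch, assuming Max Woolf’s gpt_2_simple wrapper (more on it in the older post below) and gspread; worksheet and file names are hypothetical:

```python
# Sketch of the colab pipeline, assuming gpt_2_simple (which needs an old
# tensorflow 1.x) and a gspread-accessible spreadsheet; worksheet and file
# names are hypothetical.
import gpt_2_simple as gpt2
import gspread

gc = gspread.service_account(filename="service_account.json")
sheet = gc.open("podcast_tweet_queue")

# 1. Fetch the training data from the spreadsheet into a plain txt file.
tweets = [row[0] for row in sheet.worksheet("archive").get_all_values()]
with open("tweets.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(tweets))

# 2. Fine-tune the small pre-trained GPT2 model on it.
gpt2.download_gpt2(model_name="124M")
sess = gpt2.start_tf_sess()
gpt2.finetune(sess, "tweets.txt", model_name="124M", steps=1500)

# 3. Generate candidates and push them back into the queue for manual review.
candidates = gpt2.generate(sess, temperature=0.9, nsamples=20,
                           truncate="\n", return_as_list=True)
for text in candidates:
    sheet.worksheet("queue").append_row([text + " #shitmygpt2says"])
```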

The problem is that GPT2 produces mostly crap. And I didn’t know what temperature or training set would be ideal for my purposes. It was time to experiment!

Validation

I ran several training sets at several temperatures. For each combination, I personally annotated 200 results. I don’t think the results will be super significant, but it’s better than nothing.

The success criterion was: is this tweetable? (i.e. relatively grammatically correct, at least a bit interesting/surprising, and of course different from the training set). The good samples will be posted on our twitter with the hashtag #shitmygpt2says.

Training sets

The basic training set was the queue of all our tweets for the podcast twitter account, including the archive of all my past tumblr/facebook posts that I sanitized for the occasion (a lot of work xD).

But as in my previous attempts, I thought it was a bit sad to limit myself to things I produced when I had the perfect chance to merge my brain with the people I admire. Furthermore, I kinda wanted to make my twitter AI standalone and able to “learn” as time passes, even though GPT really isn’t the best framework for that ^^

I ended up making a twitter list of people I admire, and used their recent tweets in my dataset. The idea was to make my model aware of “recent events”, recent words, etc…
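Pulling the list’s recent tweets into the dataset is a small job; a hypothetical sketch with tweepy (the list owner and slug are placeholders, and my actual fetching code may differ):

```python
# Hypothetical sketch of pulling recent tweets from a twitter list with
# tweepy; the list owner and slug are placeholders.
import tweepy

auth = tweepy.OAuth1UserHandler("API_KEY", "API_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

statuses = api.list_timeline(owner_screen_name="some_account", slug="people-i-admire",
                             count=200, include_rts=False, tweet_mode="extended")
with open("recent_tweets.txt", "a", encoding="utf-8") as f:
    for status in statuses:
        f.write(status.full_text.replace("\n", " ") + "\n")
```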

Yet, I wanted to keep a feeling that the writing style was distinctly mine. This is accounted for in the success criterion, and the core question of this experiment was “how should I mix the training set to keep awareness of the recent world but still control the style of the output?”.

Sequential vs merging

In my previous attempts, I mostly used a “merging” approach, feeding everything to the learning phase at once. The alternative is to feed the two corpora in succession during the learning phase.

From what I observed, it seems that GPT2 absorbs the style of whatever it was fed last, even if it is for very few training epochs. For instance, when I fed it corpus A for 1.5k epochs and then corpus B for 100 epochs, it produced results that looked like corpus B, even though it exhibited some signs of having learned A every now and then (pretty rarely though, that’s why I kept so many epochs in the first phase of training).
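Concretely, with gpt_2_simple this two-phase training is just two fine-tuning calls resuming from the same checkpoint; a sketch with hypothetical corpus file names (the step counts are the ones from my experiments):

```python
# Sketch of the sequential training, assuming gpt_2_simple; corpus file names
# are hypothetical.
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()

# Phase 1: train on corpus A (the background corpus) for 1.5k steps.
gpt2.finetune(sess, "corpus_a.txt", model_name="124M", steps=1500,
              run_name="seq", restore_from="fresh")

# Phase 2: train on corpus B (the style I want) for only 100 more steps,
# resuming from the phase 1 checkpoint.
sess = gpt2.reset_session(sess)
gpt2.finetune(sess, "corpus_b.txt", model_name="124M", steps=100,
              run_name="seq", restore_from="latest", overwrite=True)
```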

I kinda think of it with a cooking metaphor: I first marinate the model in corpus A and then lightly sear it with corpus B.

Here are the experimental results that loosely validate this:

We notice here, btw, that the merging strategy is pretty poor, because consistency of the training set is pretty important with GPT2. The first three lines did not exhibit a strong difference, making me believe that 1k epochs is enough for GPT2 to “forget” the initial corpus, which is how I ended up with the 1.5k/100 mix that gave me the best outcomes.

Best parameters

Here are the overall results of my experiments. GPT2 produces around 93% crap, which makes sanitizing a pretty tough job ^^ It appears that this could drop to 80% or below by using the “marinade/searing” technique correctly and keeping the training set uniform.

As is widely known, temperatures below 0.8 are pretty bad, but I find myself often pushing above 1, though that seemed to do pretty poorly with my best data sets. I’ll keep using different temperatures as they produce different types of results that I enjoy in their own way. But I’ll probably stop using text corpora as a base (past writing, night vale scripts, etc…) because they don’t seem to bring anything to the table (and could even be detrimental; better to stick to tweets).
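For reference, the temperature is just a parameter of the generation call, so sweeping it over a fine-tuned checkpoint is cheap; a sketch assuming the gpt_2_simple setup above:

```python
# Sketch of a small temperature sweep, assuming a checkpoint fine-tuned with
# gpt_2_simple as above already sits in ./checkpoint/run1.
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess)  # loads the latest fine-tuned checkpoint

for temperature in (0.8, 1.0, 1.2):
    print(f"--- temperature {temperature} ---")
    for sample in gpt2.generate(sess, temperature=temperature, nsamples=5,
                                truncate="\n", return_as_list=True):
        print(sample)
```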

So we’re pretty far from a self-aware AI that learns from its mistakes, but since I’ll keep retraining it on recent tweets, and since my own tweets now include the proposals it made that I kept, I hope that as time passes it’ll still learn to be a bit better (it already started adding the #shitmygpt2says hashtag to posts by itself).

In the future, I’ll run this every now and then in its best configuration, and keep posting on twitter with the hashtag #shitmygpt2says. Stay tuned if you’re interested!

Playing around with GPT2

So lately I’ve been spending a fair amount of time toying with GPT2, which made headlines for producing text so believable that it was considered dangerous (the version initially released to the public was the toned-down one).

ML and Reddit

I started by getting hooked on this GPT2-generated subreddit:

https://www.reddit.com/r/SubSimulatorGPT2/

Which I highly recommend everyone read daily as an exercise in critical thinking and in challenging the natural human bias to trust everything you see. I especially enjoy the tag trained on r/totallynotrobots, which is basically robots pretending to be humans pretending to be robots pretending to be humans.

It wasn’t long before I tried it for myself. I’ve long wanted to download all my social media posts and train some kind of ML on it, and GPT2 seemed like the state of the art.

Torch RNN

Somehow I started by messing around with Torch RNN, which was the previous state of the art I guess, made accessible through this tutorial which gave us such gems as a PBS Idea Channel episode, a genius Buzzfeed skit, or the relatively famous short film Sunspring.

Both Torch RNN and GPT2 are pretty similar in the way they are used (though under the hood Torch RNN is built on Torch/Lua while GPT2 runs on tensorflow). They both expect as input a txt file of example lines, and GPT2 additionally comes with a pre-trained model that already kinda knows english.

But training took ages on my computer (like a whole night for a couple of iterations) because, despite being fairly powerful, its GPU isn’t supported by the ML training optimizations (sad). I had little hope that anything more sophisticated would be possible on my machine.

Enter colab

Fortunately, people are sometimes really great: not only did Max Woolf make a wrapper that makes GPT2 easy to use, he also made a Colaboratory notebook that makes it dead simple to run and, most importantly, computationally sustainable, since it runs on a Google Compute Engine VM with some sort of free quota. It has a very nice Google Drive integration that makes it easy to save trained models or upload new training data. With this, you can train a model in less than 1h, making it really easy to play with.
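In practice the Drive round-trip with gpt_2_simple is only a few calls; a sketch (the corpus file name is a placeholder):

```python
# Sketch of the colab + Google Drive round-trip with gpt_2_simple; the corpus
# file name is a placeholder.
import gpt_2_simple as gpt2

gpt2.mount_gdrive()                           # mount Drive inside the colab VM
gpt2.copy_file_from_gdrive("blog_posts.txt")  # pull the training corpus

gpt2.download_gpt2(model_name="124M")
sess = gpt2.start_tf_sess()
gpt2.finetune(sess, "blog_posts.txt", model_name="124M", steps=1000)

gpt2.copy_checkpoint_to_gdrive(run_name="run1")  # save the trained model back to Drive
```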

Getting data

First of all, it’s been extremely easy to download all my data from social networks (here I’m talking about Google, Facebook, Tumblr, Discord and WordPress). Everything has a dump archive function now (courtesy of EU law I believe?), so that definitely made my life easier. A bit of python scripting to transform the json or xml into txt and we were good to go.
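The conversion itself is nothing fancy; a sketch for a hypothetical JSON export (the field names vary per network, so treat them as placeholders):

```python
# Sketch of the dump-to-txt conversion for a hypothetical JSON export;
# field names differ for every social network.
import json

with open("tumblr_export.json", encoding="utf-8") as f:
    posts = json.load(f)

lines = []
for post in posts:
    text = post.get("body", "").strip()
    if text:
        # one training example per line, which is what the models expect
        lines.append(" ".join(text.split()))

with open("corpus.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(lines))
```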

The outcome

I first started the training on the posts of this blog. The outcome was pretty convincing. It felt pretty weird and special to see these lines that I could have written but actually didn’t. It really seemed like another version of me, which of course tickled my philosophy bone.

Obviously the result wasn’t perfect. It often spouts out nonsensical stuff, but I very much enjoyed weeding out the absurd or malformed propositions to keep something sensible by conventional human standards (let’s say I got around 1 satisfying proposal per 5 results on average).

This way, I had the program write a short story for this blog. I gave it the prompt you see in bold and chose among the completions it proposed. I did not add any text myself. As you can see, it’s a bit weird. In particular it doesn’t really lead anywhere; I think GPT2 isn’t very teleological. That definitely was a challenge for a short story ^^ But I like to think that the style is pretty convincing.

And the overall exercise is far from absurd. It reminded me of the écriture automatique productions of the surrealists. It’s still an easier read than Naked Lunch. Really gets you thinking about the self, art and authorship, doesn’t it? Who wrote this story in the end? What if I hadn’t done any editing? What does it mean for copyright?

Literary corpora

Prompted by these questions, I trained several models on works of art that I thought would produce interesting outputs. I put all my favorite results on

http://yo252yo-a.tumblr.com

In particular, I trained a model on the Hitchhiker’s Guide to the Galaxy (which produced a lot of “bits of story” and dialogs that were not really usable as standalone excerpts), on Welcome to Night Vale scripts (which were pretty convincing, especially when you prompt the model with a phrase from the show like “And now, a look at the community calendar!”), and on all of Homestuck (which was pretty challenging to get anything good out of).

Once I had all these pretty ok results, I immediately proceeded to try merging my brain (or at least this model copy of it) with the brains of these authors I admire (or at least their model copies). The result was a mess until I had the great idea to feed the input corpora not in parallel all at once but in sequence (i.e. do 1000 rounds of training on the authors’ corpus, and then 1000 rounds on mine). The results were pretty nice.

This taught me the single most important fact about playing with GPT2: it’s all about your training data. The parameters (# training rounds, “temperature”) can’t really save you if your input data isn’t the best it can be. You want it as clean and uniform as possible. Which is really the core point of the next section.

Social media corpora

I trained GPT2 models on my conversations and emails, but they were all utter failures. The fact that I often use several languages certainly doesn’t help, but the trouble I had with the Homestuck corpus makes me believe that GPT2 is simply not great with dialogs and conversations.

I even tried to sanitize my input further, prefixing my lines of dialogs by “-” and whoever I was talking to by “>”, with the hope of starting a conversation with the GPT2 model, but I couldn’t get anything out of it. Maybe if I went over the corpus manually and kept only the meaningful messages, I’d get something different, but this sounds daunting.
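The formatting itself was trivial; a sketch of the prefixing, assuming the messages are already extracted as (author, text) pairs:

```python
# Sketch of the conversation formatting, assuming messages are already
# extracted as (author, text) pairs; "me" marks my own lines with "-",
# everyone else gets ">".
def format_conversation(messages, my_name="me"):
    lines = []
    for author, text in messages:
        prefix = "-" if author == my_name else ">"
        lines.append(f"{prefix} {' '.join(text.split())}")
    return "\n".join(lines)

# e.g. format_conversation([("me", "hi"), ("friend", "hello!")])
# -> "- hi\n> hello!"
```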

Needless to say, merging this with my blog post corpus was also pretty bad, so in the end I stuck to my blog corpus.

By the way, I also tried to train a model on a list of J.K. Rowling’s retcon tweets to get crispy new intel about the Harry Potter canon variations, but I couldn’t get it to produce anything new.

Conclusions

  • GPT2 on colab is extremely easy.
  • Your training corpus is everything, really.
  • GPT2 does great with literary types of text but sucks a bit at conversations/informal speech.

Next steps

As intoxicating as it is to watch a ghost of myself produce believable texts, I’m not sure where it leads ^^ My ultimate goal would be to produce some sort of system I can interact with and teach dynamically so it gets better (i.e. conversational and dynamic retraining), but that seems pretty rare in the world of generative ML models. I might have to dig deeper into Tensorflow, but I can’t really do that with my current machine, so I’m kinda stuck.

I have a couple of pointers for conversational ML (still no dynamic/online/interactive/reinforcement learning though, which limits the interest), but I expect them to be less good than GPT2. I haven’t had time to try them yet (they probably require more power than I have). The dream would be to combine that with GPT2, I guess, and figure out a way to dynamically retrain the model on itself.

In any case, it feels really nice to see some progress in my Caprica dream.