Meme Text Generation with a Deep Convolutional Network in Keras

The goal of this post is to describe end-to-end how to build a deep conv net for text generation, but in greater depth than some of the existing articles I've read. This will be a practical guide, and while I suggest many best practices, I am not an expert in deep learning theory nor have I read every single relevant research paper. I'll cover takeaways about data cleaning, training, model design, and prediction algorithms.

The raw dataset we'll draw from is ~100M public meme captions by users of the Imgflip Meme Generator. To speed up training and reduce the complexity of the model, we only use the 48 most popular memes and exactly 20,000 captions per meme, totaling 960,000 captions as training data. However, since we are building a generational model, there will be one training example for each character in the caption, totaling ~45,000,000 training examples.

Character-level generation rather than word-level was chosen here because memes tend to use spelling and grammar… uh… creatively. Also, character-level deep learning is a superset of word-level deep learning and can therefore achieve higher accuracy if you have enough data and your model design is sufficient to learn all the complexity. If you try the finished model below, you'll also see that char-level can be more fun!

Below is what the training data looks like if the first meme caption is "make all the memes". I'm omitting the code for reading from the database and performing the initial cleaning because it's very standard and could be done in multiple ways. (The template ID digits shown here are illustrative.)

```python
training_data = [
    ["000000061533  0  ", "m"],
    ["000000061533  0  m", "a"],
    ["000000061533  0  ma", "k"],
    # ... one row per character of the caption ...
    ["000000061533  0  make all the memes", "|"],
]

# we'll need our feature text and labels as separate arrays later
texts = [row[0] for row in training_data]
labels = [row[1] for row in training_data]
```

Like most things in machine learning, this is just a classification problem. We are classifying the text strings on the left into one of ~70 different buckets, where the buckets are characters. Unpacking the format:

- The first 12 characters are the meme template ID. This allows the model to differentiate between the 48 distinct memes we're feeding it. The string is left padded with zeros so all IDs are the same length.
- The 0 or 1 is the index of the current text box being predicted; generally 0 is the top box and 1 is the bottom box, although many memes are more complex. The two spaces are just extra spacing to ensure the model can tell the box index apart from the template ID and meme text. Note: it is critical that our convolution kernel width (seen later in this post) is no wider than the 4 spaces plus the index character, aka ≤ 5 (see the sketch after this list).
- After that is the text of the meme so far, with | used as the end-of-text-box character.
- Finally, the last character by itself (the 2nd array item) is the next character in the sequence.
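To make that kernel-width note concrete, here is a minimal sketch of the constraint, assuming a standard Keras Conv1D layer like the one discussed later in the post (the filter count here is illustrative, not taken from the original):

```python
from tensorflow.keras import layers

# A window of width 5 can span at most the box index plus its 4
# surrounding spaces, so no single convolution window ever mixes
# template-ID characters directly with meme-text characters.
conv = layers.Conv1D(filters=1024, kernel_size=5, activation='relu')
```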
Several cleaning techniques were used on the data before training:

- Trim leading and trailing whitespace and replace repeated whitespace (\s) with a single space character.
- Apply a minimum string length of 10 characters so we don't generate boring one-word or one-letter memes.
- Apply a maximum string length of 82 characters so we don't generate super long memes and because the model will train faster. 82 is arbitrary; it just made the overall training strings about 100 characters.
- Convert everything to lowercase to reduce the number of characters the model must learn, and because many memes are just all caps anyway.
- Skip meme captions with non-ascii characters to reduce the complexity the model has to learn.
- Skip meme captions containing the pipe character | since it's our special end-of-text-box character.

This means that both our feature text and labels will come from a set of only ~70 characters, depending on which ascii characters the training data happens to include.
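As a rough illustration (not the post's actual preprocessing code, which was omitted above), those cleaning rules might look something like this; the function name and `raw_captions` variable are mine:

```python
import re

def clean_caption(text):
    # Collapse repeated whitespace and trim leading/trailing whitespace.
    text = re.sub(r'\s+', ' ', text).strip()
    # Lowercase to shrink the character set the model must learn.
    text = text.lower()
    # Skip captions that are too short, too long, contain non-ascii
    # characters, or contain our special end-of-text-box character.
    if not (10 <= len(text) <= 82):
        return None
    if not text.isascii():
        return None
    if '|' in text:
        return None
    return text

# raw_captions: assumed name for the uncleaned caption strings
captions = [c for c in (clean_caption(t) for t in raw_captions) if c is not None]
```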
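From here, a plausible next step is mapping those ~70 characters to integer indices so the classification targets can be trained on. This is a sketch under the assumption of standard Keras utilities, not the post's verbatim code:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical

# Character-level tokenizer; no filtering or lowercasing since the
# captions are already cleaned. Fit on labels too so end-of-box "|"
# is guaranteed to be in the vocabulary.
tokenizer = Tokenizer(filters='', lower=False, char_level=True)
tokenizer.fit_on_texts(texts + labels)
num_classes = len(tokenizer.word_index) + 1  # ~70 characters, plus padding index 0

sequences = tokenizer.texts_to_sequences(texts)        # feature text -> integer sequences
label_ids = [tokenizer.word_index[c] for c in labels]  # next-char labels -> integer ids
y = to_categorical(label_ids, num_classes=num_classes) # one-hot classification targets
```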