Computer Generated Novel

Novel

Creating a computer-generated novel was, well, a novel idea to me. As with the poetry assignment, I created a Python script in Google Colaboratory to generate a novel from the Gutenberg version of the Sherlock Holmes series by Sir Arthur Conan Doyle. I picked Sherlock Holmes because it is normally considered a serious classic in the literary world, and I simply wanted to take Doyle’s text and turn it into gibberish.

The reason I chose to revisit this project is that I struggled so hard to get the original sample I provided on GitHub. If you use the original version of the code, Shelorck Homels V1, it will take hours for the notebook to process the text and give you a sample. I had to consult Dr. Whalen because I had no idea what I had done wrong, only that my computer hated me for making it run a code cell for three hours without rest. When I managed to get a sample that was within my word limit, I sort of gave up. However, Dr. Whalen and I met a few more times and figured out a better workaround, which is Shelorck Homels V2. I never made any samples for this version of the code because I had technically completed the project. I plan to rectify that by generating a 50,000-word novel using the second version of the code. I have included my GitHub README file here, with all of its links, so anyone can view everything I mentioned and used for this project.

Shelork-Homels-NaMoGenMo-Novel / README.md

For this project, I wanted to take the Gutenberg version of Sherlock Holmes and run it through a Python script that completely randomizes and rewrites the text. I created a script in Google Colaboratory that uses ngrams to randomize the word assortment. There are two versions: Shelorck Homels V1 and Shelorck Homels V2. I used Markdown to convert the generated text to HTML, styled it with CSS, and used WeasyPrint to render the final randomized document as a PDF. I called it “Shelork Homels” because the name is close enough to the real thing for a reader to figure out what this project is. Basically, I wanted it to generate a 50,000-word novel that, at a glance, looked legible but really was not. The example I provided in this repository has a total of 52,484 words. It is 122 pages long and has its own title page with page numbers.
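To make the ngram idea concrete, here is a minimal sketch, not the repository code. It splits a short sample string into overlapping four-character ngrams and counts words the way a 50,000-word target could be checked. The sample string and the helper names `char_ngrams` and `word_count` are assumptions made for illustration.

```python
def char_ngrams(text, n=4):
    """Return every overlapping run of n characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


# Stand-in for the full Gutenberg text (an assumption for this sketch).
sample = "Sherlock Holmes"

grams = char_ngrams(sample)
print(grams[:3])           # ['Sher', 'herl', 'erlo']
print(word_count(sample))  # 2
```

Shuffling and re-chaining pieces like these is what produces text that looks legible at a glance but is not.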

All of the code I used came from various examples provided to my Creative Coding class at the University of Mary Washington. Our professor, Zach Whalen, was a great help in providing examples and helping us expand upon any ideas we had. He also provided me with the code for the second version of this project, which can be found in this repository. The code that he provided for us to use can be found in this Colab Notebook and his PDF Workflow. I merely used his examples and experience to make this project possible.

Shelork-Homels-NaMoGenMo-Novel / Sherlork-Homels-Code.py

# Import Sherlock Holmes text file from Gutenberg.

text = open("/content/Sherlock-Holmes.txt", "r").read()

# Set up ngrams.

import random

def chapter():

  ngrams = []

  for b in range(len(text) - 4):
    ngrams.append(text[b:b+4])

  random.shuffle(ngrams)
  seed = random.choice(ngrams)

  new_text = seed

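  # Each of the 100,000 passes walks the entire ngram list, which is why
  # this version runs for hours on the full text.
  for i in range(100000):
    for n in ngrams:
      if (n[:3] == new_text[-3:]):
        new_text += n[-1]
        # Removing from the list while iterating over it can skip entries.
        ngrams.remove(n)
        #break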

  return(new_text)
  
  
# let's install weasyprint!
!pip install weasyprint==52.5

# also we need markdown
import markdown

# import some specific parts of weasyprint
from weasyprint import HTML, CSS
from weasyprint.fonts import FontConfiguration


import random

novel = """
# Shelorck Homels
### by Lyndsey Clark
"""

for c in range(1):
  # Optional chapter heading, left disabled:
  # novel += f'''
  # ## Chapter {c}
  # '''
  novel += chapter().capitalize()

#print(novel)


# convert the markdown formatted novel string into html
html = markdown.markdown(novel)


# prepare WeasyPrint
font_config = FontConfiguration()
rendered_html = HTML(string=html)

css = CSS(string='''
@import url('https://fonts.googleapis.com/css2?family=Festive&display=swap');
@import url('https://fonts.googleapis.com/css2?family=Merriweather:wght@300&display=swap');
body {
font-family: 'Merriweather', serif;
}
hr {
  break-after: recto; 
}
h1 {
  font-size: 50pt;
  text-align:center;
  margin-top: 3in;
  font-family: 'Festive',cursive;
}
h2{
  break-before: recto;
  margin-top: 3in;
  font-family: 'Festive',cursive;
}
h3 {
  font-size: 20pt;
  text-align:center;
  break-after: recto;
}
/* set the basic page geometry and start the incrementer */
@page {
  font-family: 'Merriweather', serif;
  margin: 1in;
  size: letter;
  counter-increment: page;
  @bottom-center {
    content: "[" counter(page)"]";
    text-align:center;
    font-style: italic;
    color: #666666;
  }
}
/* blank the footer on the first page */
@page:first{
  @bottom-left {content: ""}
  @bottom-right {content: ""}
  @bottom-center {content: ""}
}
''', font_config=font_config)


# Finally, this creates a PDF called "sample.pdf" with all the above settings
rendered_html.write_pdf('/content/sample.pdf', stylesheets=[css],font_config=font_config)

I have provided a permalink to the original sample below, if anyone is interested in taking a look at it. This is the version of the novel that took over three hours to generate, because the code I was using was more than a little broken. It was also the only sample that existed before I undertook this project.

Shelork Homels

To update this assignment, I decided to do a complete overhaul of my original idea. Instead of generating one novel, I decided to make four. Doyle wrote four Sherlock Holmes novels: A Study in Scarlet, The Sign of the Four, The Hound of the Baskervilles, and The Valley of Fear. Using the second version of the code, which Dr. Whalen helped me develop, I was able to generate four different novels. Instead of taking three hours each, as my original code did for the first sample, this version took only about a minute to generate a PDF that was over two thousand pages long! I did this four times, coming up with creative gibberish names for the books.
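One plausible reason for the speedup, sketched under assumptions rather than measured on the real notebooks: the first version walks a list of every ngram to find the next character, while the second looks up the current stem in a dictionary. The functions below count how much work each approach does on a tiny sample. The names `list_scan_generate` and `dict_generate` are hypothetical, the sample sentence is made up, and the list version is simplified to stop at the first match.

```python
import random


def list_scan_generate(text, length, stem_len=3, rng=None):
    """Simplified V1-style generator: scan a list of ngrams per character."""
    rng = rng or random.Random(0)
    size = stem_len + 1
    ngrams = [text[i:i + size] for i in range(len(text) - size + 1)]
    rng.shuffle(ngrams)
    new_text = ngrams.pop()
    inspected = 0  # how many list entries were compared
    for _ in range(length):
        for idx, gram in enumerate(ngrams):
            inspected += 1
            if gram[:stem_len] == new_text[-stem_len:]:
                new_text += gram[-1]
                del ngrams[idx]  # each ngram is used at most once
                break
        else:
            break  # nothing continues the text, so stop early
    return new_text, inspected


def dict_generate(text, length, stem_len=3, rng=None):
    """V2-style generator: one dictionary lookup per character."""
    rng = rng or random.Random(0)
    table = {}
    for i in range(len(text) - stem_len):
        table.setdefault(text[i:i + stem_len], []).append(text[i + stem_len])
    new_text = rng.choice(list(table))
    lookups = 0  # how many dictionary lookups were made
    for _ in range(length):
        lookups += 1
        twigs = table.get(new_text[-stem_len:])
        if not twigs:
            break  # this stem has no unused continuation left
        new_text += twigs.pop(rng.randrange(len(twigs)))
    return new_text, lookups


# Made-up sample text; the real input would be the Gutenberg file.
sample = "the quick brown fox jumps over the lazy dog and the quiet cat " * 20

_, scanned = list_scan_generate(sample, 200)
_, looked_up = dict_generate(sample, 200)
print("list entries inspected:", scanned)
print("dictionary lookups:", looked_up)
```

On a full-length book the list holds hundreds of thousands of ngrams, so the gap between the two counts grows with the size of the text.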

Shelork-Homels-NaMoGenMo-Novel / Shelork-Homels-Code-V2.py

text = open("/content/Sherlock-Holmes.txt", "r").read()

import random

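def altchapter():
  # Map each d-character stem to the list of characters ("twigs") that follow it.
  ngrams = {}
  d = 4

  for i in range(len(text)-d):
    stem = text[i:i+d]
    twig = text[i+d]

    if (stem not in ngrams):
      ngrams[stem] = [twig]
    else:
      ngrams[stem].append(twig)

  # Start from a random stem and grow the text one character at a time.
  seed = random.choice(list(ngrams.keys()))
  new_text = seed
  #print(seed)
  for i in range(10000):
    root = new_text[-d:]
    # A dictionary lookup finds the candidates for the next character directly.
    if(root in ngrams and len(ngrams[root]) > 0):
      pick = random.randrange(len(ngrams[root]))
      new_text += ngrams[root][pick]
      # Use each twig only once.
      ngrams[root].pop(pick)

  return(new_text)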
  
# let's install weasyprint!
!pip install weasyprint==52.5

# also we need markdown
import markdown

# import some specific parts of weasyprint
from weasyprint import HTML, CSS
from weasyprint.fonts import FontConfiguration

import random

novel = """
# Shelorck Homels
### by Lyndsey Clark
"""

for c in range(12):
  # Optional chapter heading, left disabled:
  # novel += f'''
  # ## Chapter {c}
  # '''
  novel += altchapter()

#print(novel)

# convert the markdown formatted novel string into html
html = markdown.markdown(novel)

# prepare WeasyPrint
font_config = FontConfiguration()
rendered_html = HTML(string=html)

css = CSS(string='''
@import url('https://fonts.googleapis.com/css2?family=Festive&display=swap');
@import url('https://fonts.googleapis.com/css2?family=Merriweather:wght@300&display=swap');
body {
font-family: 'Merriweather', serif;
}
hr {
  break-after: recto; 
}
h1 {
  font-size: 50pt;
  text-align:center;
  margin-top: 3in;
  font-family: 'Festive',cursive;
}
h2{
  break-before: recto;
  margin-top: 3in;
  font-family: 'Festive',cursive;
}
h3 {
  font-size: 20pt;
  text-align:center;
  break-after: recto;
}
/* set the basic page geometry and start the incrementer */
@page {
  font-family: 'Merriweather', serif;
  margin: 1in;
  size: letter;
  counter-increment: page;
  @bottom-center {
    content: "[" counter(page)"]";
    text-align:center;
    font-style: italic;
    color: #666666;
  }
}
/* blank the footer on the first page */
@page:first{
  @bottom-left {content: ""}
  @bottom-right {content: ""}
  @bottom-center {content: ""}
}
''', font_config=font_config)

# Finally, this creates a PDF called "sample.pdf" with all the above settings
rendered_html.write_pdf('/content/sample.pdf', stylesheets=[css],font_config=font_config)

Eehs Atrdnuve oft Shelork Homels | The Adventures of Sherlock Holmes
Blobified Novel | Real Novel
Let Scaud nt Saidry | A Study in Scarlet
Our Sfheo ign Fhet | The Sign of the Four
Vke Hadunes orlh Btsoilfes | The Hound of the Baskervilles
Orf Vehla le Ftaey | The Valley of Fear

Project Reflection

I decided to revisit this project because I thought it would be fun to do things right for a change. The code I was originally using was absolutely busted, and the fact that I had to wait over three hours for it to create a novel generated a lot of stress for me. I could not figure out what was wrong! So, I have to give Dr. Whalen a big shout-out on this one! Thank you for helping me fix my code so it would actually function! Even though I gave up on the novel after submitting it, I was inspired to come back and generate an entire Shelork Homels series! I had a lot of fun messing with the titles. Plus, this version of the project is well over 50,000 words, blasting that goal straight out of the water! I had a lot of fun working with the code knowing it would not take forever or crash at any moment! Had there been more Sherlock Holmes novels, I probably would have generated spoofs of them too! Overall, I really enjoyed updating this project! I think the inclusion of the entire series really fleshed out the idea I had!
