
sadface-bot: A Markov chain bot

Markov bots make for amusing text generators. Their output usually doesn’t make much sense; when it does, it’s pure chance.

sadface draws its vocabulary and concepts from a flat text file, where each line is considered a sentence. The bot chains words together to create sentences, which it passes to the IRC channel it is in.
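As a rough illustration of the chaining idea (a minimal sketch, not sadface’s actual code), the snippet below builds a first-order word chain from a list of lines and walks it to produce a sentence. The names build_chain, generate and END are hypothetical, and a first-order chain is an assumption; the real bot may track longer word sequences.

```python
import random
from collections import defaultdict

END = None  # hypothetical sentinel marking the end of a sentence


def build_chain(lines):
    """Map each word to the list of words seen directly after it."""
    chain = defaultdict(list)
    starts = []  # words that open a sentence
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip blank lines
        starts.append(words[0])
        # Pair every word with its successor; the last word pairs with END.
        for current, following in zip(words, words[1:] + [END]):
            chain[current].append(following)
    return chain, starts


def generate(chain, starts, rng=random, max_words=30):
    """Walk the chain from a random opener until END or max_words."""
    word = rng.choice(starts)
    sentence = [word]
    while len(sentence) < max_words:
        word = rng.choice(chain[word])
        if word is END:
            break
        sentence.append(word)
    return " ".join(sentence)
```

Fed the lines of a brain file, generate() returns one reply at a time. Because successors are stored with repeats, common word pairs are picked more often than rare ones.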

Right now, sadface only supports one channel, but you can have multiple instances of sadface running with different configuration files. The configuration file is specified at runtime as an argument: python sadface-configgable.py config-file.ini

Included in sadface-bot.zip are sadface-configgable.py and default.ini. If you want to change default.ini, I encourage you to copy default.ini and change the variables, so you can have an untouched default.ini.
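For readers curious how a bot might pick up settings from such a file, here is a hedged sketch using Python’s standard INI parser. The section name sadface and the keys server, channel and brain_file are assumptions made up for illustration; check default.ini for the real names.

```python
try:
    from configparser import ConfigParser  # Python 3
except ImportError:
    from ConfigParser import ConfigParser  # Python 2.7


def load_settings(path):
    """Read one bot's settings from an INI file (hypothetical layout)."""
    parser = ConfigParser()
    # read() returns the list of files it managed to parse.
    if not parser.read(path):
        raise IOError("could not read config file: %s" % path)
    return {
        "server": parser.get("sadface", "server"),
        "channel": parser.get("sadface", "channel"),
        "brain_file": parser.get("sadface", "brain_file"),
    }
```

Calling load_settings on a copy of default.ini (with the assumed keys present) would hand back a plain dictionary, which keeps each running instance tied to its own file.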

You can start sadface with a blank brain_file.txt, but its replies won’t make much sense at all until it’s heard a lot of conversation. I recommend putting several books into the file. Project Gutenberg is a good place to start. Separate sentences by newlines. Replies look best if there are no quotes or tabs in brain_file.txt. You can specify different brain files with your config.ini.
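One way to turn a book into brain-file lines might look like the sketch below. It strips double quotes and tabs, undoes the hard line wraps, and splits naively after sentence-ending punctuation. The function name to_brain_lines is hypothetical, and the split is deliberately crude: abbreviations such as “Mr.” will be cut in the wrong place.

```python
import re


def to_brain_lines(text):
    """Turn raw book text into a list of one-sentence lines."""
    # Drop straight and curly double quotes; turn tabs into spaces.
    text = re.sub(u'["\u201c\u201d]', "", text).replace("\t", " ")
    # Collapse line wraps and runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Naive split: break at a space that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?]) ", text)
    return [s for s in sentences if s]
```

Writing "\n".join(to_brain_lines(book_text)) to brain_file.txt would give one sentence per line, which is the layout described above.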

Interact

To play with sadface, /join #sadface on irc.foonetic.net, or supply your own IRC server, channel and brain_file.txt.

Download

sadface depends on:

  • Python 2.7.3, available in repositories or at Python.org
  • python-twisted, available in the Ubuntu, Debian and openSUSE repositories or from source at the Twisted downloads page. Installers are available for Windows and Mac.

Download sadface-bot.zip

Credits

sadface is heavily based on Eric Florenzano’s MomBot, which uses the twisted network stack to handle IRC.

sadface uses configuration methods written by hhokanson for AnonBot, an IRC channel anonymizer.

Grabbing Amazon books for reformatting, pleasure

Creative Commons euphoria

I was referred to a book called Storyteller Uprising earlier today. The book’s about page intrigued me: it’s about how people’s increasing distrust of the legacy media is leading them to create new media.

The book’s website helpfully tells me how to get it:

  • Buy the paper hardcopy through Amazon
  • Buy the paper hardcopy from the university the author works for
  • Buy the paper hardcopy from a web store that I was previously unaware of
  • Acquire the digital Kindle version from Amazon

At the bottom of the list, it says that the book is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs license. The author is distributing it for free, in other words. To quote the license:

You are free: to Share — to copy, distribute and transmit the work
Under the following conditions:
Attribution — You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
Noncommercial — You may not use this work for commercial purposes.
No Derivative Works — You may not alter, transform, or build upon this work.

Since the author is distributing it under those terms, I should be able to find it for free somewhere online, download it, read it and if I like it, buy it.

Hey, it worked for Cory Doctorow and Charles Stross (http://www.antipope.org/). If I like Storyteller Uprising, I might buy a hard copy of it.

Enter Digital Rights Management

I can’t find the full text of Storyteller Uprising online, though. Having Creative-Commons-licensed material be unavailable feels wrong somehow, like having to pay for access to federal court records.

It’s free on the Kindle today, so I go to Amazon’s store and open it in the Amazon Cloud Reader, an in-browser app that supposedly caches your downloads for offline reading. I want to pull it out of my browser’s cache, reformat it to fit on my rooted Nook Color and read it there at my leisure.

Fun fact: Cloud Reader transmits the book’s contents as a series of encrypted strings inside JSONP containers. Here’s a sample of what’s in gz_frag1.jsonp:

loadFragment1({"fragmentData":"encrypted_text","fragmentMetadata":{"compression":1,"encryption":1,"id":1},"imageData":null});

The book is split into 76 fragments, each one with several tens of kilobytes of encrypted data taking the place of encrypted_text in the above code. There’s also a gz_fragmap.jsonp and a gz_skeleton0.jsonp, which appear to provide structure and decoding information to the Cloud Reader. The former helpfully mentions at the end that the book is in the mobi7 format.
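To look at the structure of one of these files, the JSONP wrapper can be peeled off and the rest read as ordinary JSON. The sketch below does only that; it does not touch the encryption. The function name parse_jsonp is hypothetical, and the regular expression assumes the simple callback(...); shape shown in the sample above.

```python
import json
import re


def parse_jsonp(payload):
    """Split 'callback({...});' into the callback name and decoded JSON."""
    match = re.match(
        r"^\s*([A-Za-z_$][\w$]*)\((.*)\);?\s*$", payload, re.DOTALL
    )
    if match is None:
        raise ValueError("not a JSONP payload")
    callback, body = match.groups()
    return callback, json.loads(body)
```

Run against the sample line, this would hand back the callback name loadFragment1 and a dictionary whose fragmentMetadata entry carries the compression, encryption and id flags.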

Discouraged, I consider loading the web app on my Nook. The web app is clunky, though, and doesn’t fit on my Nook, even though my Nook is running Ice Cream Sandwich.

(Side lesson: Cloud Reader is providing me with the plaintext and ciphertext of the book, the decryption engine and the decryption key. In theory, I have everything I need to not only decrypt the book but to create an automated Cloud Reader download tool. The only things lacking are knowledge of JavaScript, time, effort, and a willingness to violate the Kindle ToS. For more on this topic, read this Doctorow speech.)

The web approach has failed me. What to try next?

Using alternate platforms

I have a Nook Color running a development build of CyanogenMod 9, a custom build of Ice Cream Sandwich, the latest version of Android.

Ice Cream Sandwich is designed to keep users out of certain types of files. These forbidden files include core operating system files and password databases, but they also include things like apps and app-secured files. Amazon’s Kindle application does not take advantage of those security protections.

Instead, Storyteller Uprising was stored on the unprotected SD card, in the folder Android/data/com.amazon.kindle. I opened it with the free, community-developed e-book program Calibre, which lets me transcode it as I will.

What’s the upside of this? Free book, now free of DRM shackles.

Success.

Legal Note

Hanson Hosein, the author of Storyteller Uprising, has released it under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Transcoding the work in question is allowed under section 3 of the CC-BY-NC-ND license applied by the author, specifically “The above rights may be exercised in all media and formats whether now known or hereafter devised. The above rights include the right to make such modifications as are technically necessary to exercise the rights in other media and formats…”