The Library Basement
Reading under ground

Readings for May 2014

Published: 2014-06-14 20:58:00
Category: books Tags: readings Brandon Sanderson Michael Lewis

I am OK with having a back-log of periodicals.

Words of Radiance by Brandon Sanderson

Brandon Sanderson's new novel Words of Radiance is a book of feats. First of all, just look at it, if you get the chance. Take in its girth. The hardback is large. So large that it defies binding. Yet somehow the good people at Tor found a way to make almost 1100 giant pages stick together in one book. And they even had to cheat a bit, removing the headers from the pages and slamming the text far north into the traditional margins.

The second feat is that of storytelling. In adding a second volume to The Stormlight Archive, Sanderson is spinning quite a yarn. A huge story with many characters and plot lines is starting to converge. And Sanderson does a decent job making the reader care about just about everyone on the many pages of the book. At times I think the restrained scope and style of Le Guin is optimal, but I also like me a good, long fantasy novel. So recommended, but remember to read The Way of Kings first if you have not already.

Moneyball by Michael Lewis

This is probably the most popular baseball book of the past two decades and somehow I had not read it yet. But I had the opportunity to borrow it from my father and dove right in.

I really enjoyed this outsider's look inside baseball. In following Billy Beane and the Oakland A's, Lewis does much to help explain the weird economics of Major League Baseball. Now a decade on from the book, it is interesting to look back at the players featured in it, as well as at the A's themselves. After a bit of a downturn, the club under Beane is back on top, and still with a very low payroll.

While I enjoy that low-payroll teams can be successful, I have been disturbed by another recent trend in the bigs: an owner can still make a profitable enterprise out of a non-competitive team. If that can be fixed, baseball will be all the stronger. Recommended.

Periodicals

  • Harper's June 2014 - "The Second Doctor Service" by Daniel Mason was a very compelling short story. I was engaged throughout, and left thinking about it for days.

Fun with LXXM-Corpus

Published: 2014-06-13 20:56:00
Category: language Tags: nltk Greek LXX

Once I have a text available for natural language processing, there are a few basic tasks I like to perform to kick the tires. First, I like to run the collocations method of NLTK, which gives common word pairs from the text. For the LXXM, here are the results:

  • ἐν τῇ
  • ἐν τῷ
  • ὁ θεὸς
  • τῆς γῆς
  • καὶ εἶπεν
  • λέγει κύριος
  • ἀνὰ μέσον
  • τὴν γῆν
  • τοῦ θεοῦ
  • ὁ θεός
  • τάδε λέγει
  • πρός με
  • πάντα τὰ
  • ὁ βασιλεὺς
  • οὐ μὴ
  • οὐκ ἔστιν
  • τῇ ἡμέρᾳ
  • οἱ υἱοὶ
  • τῷ κυρίῳ
  • τοῦ βασιλέως

If you disregard the stop words, you can get a decent idea of the fundamental thematic content of the text.
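
As a rough illustration of what a collocation finder does, here is a minimal sketch using only the Python standard library. It simply counts adjacent word pairs. NLTK's collocations method scores pairs rather than just counting them, so treat this as a simplified stand-in and not NLTK's implementation. The helper name top_bigrams and the sample text are made up for the example.

```python
from collections import Counter

def top_bigrams(words, n=5):
    """Return the n most frequent adjacent word pairs (hypothetical helper)."""
    pairs = zip(words, words[1:])
    return [pair for pair, _count in Counter(pairs).most_common(n)]

# A tiny made-up sample, not taken from the corpus files.
sample = "καὶ εἶπεν ὁ θεὸς καὶ εἶπεν κύριος ὁ θεὸς".split()
print(top_bigrams(sample, n=2))
```

On a real corpus you would pass in the full word list and probably filter out rare pairs first.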

Now for the silliness, using the n-gram random text generator:

ἐν ἀρχῇ ὁδοῦ πόλεως ἐπ' ὀνόμασιν φυλῶν τοῦ Ισραηλ παρώξυναν οὐκ ἐμνήσθησαν διαθήκης ἀδελφῶν καὶ ἐξαποστελῶ πῦρ ἐπὶ Μωαβ ἐν τῷ ἐξαγαγεῖν σε τὸν ἱματισμόν
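
For the curious, the idea behind an n-gram generator can be sketched in a few lines of plain Python. This is a bigram version written for illustration, not the NLTK generator used above: it records which words follow which, then walks those links at random. The function names are hypothetical, and the sample (the opening of Genesis) merely stands in for the corpus.

```python
import random
from collections import defaultdict

def build_bigram_model(words):
    """Map each word to the list of words observed directly after it."""
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model

def generate(model, start, length=10, seed=None):
    """Walk the model from `start`, picking each next word at random."""
    rng = random.Random(seed)
    output = [start]
    while len(output) < length:
        followers = model.get(output[-1])
        if not followers:  # dead end: no word ever followed this one
            break
        output.append(rng.choice(followers))
    return " ".join(output)

# A short sample standing in for the corpus words.
sample = "ἐν ἀρχῇ ἐποίησεν ὁ θεὸς τὸν οὐρανὸν καὶ τὴν γῆν".split()
model = build_bigram_model(sample)
print(generate(model, sample[0], length=5, seed=1))
```

With so small a sample every word has only one follower, so the output just replays the text. The silliness only appears once the model is built from a whole corpus.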

A categorized, tagged Septuagint corpus

Published: 2014-06-09 20:27:00
Category: κτλ Tags: nltk LXX technology

Last year I created a version of the SBLGNT for use as a categorized, tagged corpus for natural language processing. Now I have done the same with a Septuagint text. I am calling it LXXMorph-Corpus. The source for text and tags is my Unicode conversion of the CATSS LXXMorph text. There is at least one category for each file.

The text is arranged with one book per file. Certain books in the source LXXMorph text are split where there is significant textual divergence (manuscript B and A, or the Old Greek and Theodotion). Each file has one or more categories (e.g. pentateuch and writings).

Since there is no punctuation in the source text, the files are laid out with one verse per line. A better arrangement from an NLP perspective would be one line per sentence (thereby preserving the semantic structure). Maybe someday we'll have a freely-licensed LXX text which will include sentence breaks.

Each word is accompanied by the morphological tag in the word/tag format (NLTK will automatically split word and tag on the slash). The part of speech tag is separated from the parsing information with a hyphen, which enables the use of the simplify tags function in NLTK.
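
To make the format concrete, here is a small sketch in plain Python of how a word/tag token can be split and its tag simplified. It only mimics the behavior described above and is not NLTK's code. The function names and the sample tags are hypothetical and may not match the actual tag set in the corpus.

```python
def split_token(token, sep="/"):
    """Split 'word/tag' on the last separator; tag is None if absent."""
    word, found, tag = token.rpartition(sep)
    if not found:
        return token, None
    return word, tag

def simplify(tag):
    """Keep only the part of speech, i.e. the text before the first hyphen."""
    return tag.split("-")[0] if tag else tag

# Hypothetical line in word/tag format; the tags are illustrative only.
line = "ἐν/P ἀρχῇ/N1-DSF"
tagged = [split_token(tok) for tok in line.split()]
simplified = [(word, simplify(tag)) for word, tag in tagged]
print(simplified)
```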

Here follows an example of how to load this corpus into NLTK:

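from nltk.corpus.reader import CategorizedTaggedCorpusReader

def simplify_tag(tag):
    # Keep only the part of speech: the text before the first hyphen.
    try:
        return tag.split('-')[0]
    except AttributeError:
        # Not a string (e.g. None): return it unchanged.
        return tag

# File names begin with two digits and a period, hence the pattern.
# Note: tag_mapping_function is the keyword this example was written
# against; it may differ in other NLTK releases.
lxx = CategorizedTaggedCorpusReader('lxxmorph-corpus/',
    r'\d{2}\..*', encoding=u'utf8',
    tag_mapping_function=simplify_tag,
    cat_file='cats.txt')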

Now through the lxx object you have access to tagged words - lxx.tagged_words(), simplified tags - lxx.tagged_words(simplify_tags=True), tagged sentences - lxx.tagged_sents(), and textual categories - lxx.words(categories='former-prophets').

This is a derivative work of the original CATSS LXXMorph text, and so your use of it is subject to the terms of that license. See the README file for more details.

Readings for April 2014

Published: 2014-05-26 07:53:00
Category: books Tags: readings Michael Crichton Orson Scott Card

Reading continues.

Micro by Michael Crichton

There was a time in life when I was devouring everything by Michael Crichton I could get my hands on. As time went by, my tastes changed. And with the passing of Crichton, there were no more opportunities to read his work anyway. Except that his estate arranged for posthumous releases of works in progress. The first was Pirate Latitudes, which I found to be decidedly half-baked.

This new offering, Micro, is co-authored by Richard Preston, so it has a more finished feel to it. And it is a vintage Crichton story-line: corporate use of bleeding-edge technology leads to mayhem. I will warn the reader that the premise of this book is essentially "Honey, I Shrunk the Kids" in an action/adventure format. I would recommend this for any die-hard Crichton fans out there.

Speaker for the Dead by Orson Scott Card

I really loved Ender's Game, so I've had its sequel Speaker for the Dead queued up for quite some time. I also studiously avoided seeing the film adaptation of the former. Orson Scott Card once again came through with a very thought-provoking tale, well-executed in the science fiction genre.

Card's fiction always seems to address religion, but in this novel it is a major theme. You have the Catholic colony on a lonely planet reacting to the intrusion of the Speaker, who is a sort of "priest" for a new "humanist religion." The Speaker is of course Ender, who through relativistic spaceflight is still running around thousands of years after his xenocide. Ender gets a chance at redemption, as it were, because for the first time since the buggers, humanity has discovered a new sentient species.

I have only read three Card novels, but they have all stuck with me. He is an excellent storyteller, and he does not let his genre get in the way. Rather he uses science fiction to create the alternate realities in which tough questions can be addressed. In other words, he is very much like Le Guin, and I love him for it. Recommended.

Periodicals

  • Harper's April 2014
  • Tin House 58
  • Harper's May 2014

Spigot 2.2

Published: 2014-04-29 20:50:00
Category: κτλ Tags: technology

I have released Spigot 2.2. The primary purpose of this release is to support the use of any arbitrary field from the incoming feed in the format of the outgoing message. Previously Spigot limited you to the title or link, but now you have more options, including author, etc.
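
To illustrate the general idea (and only the idea), here is a minimal sketch of filling a message template from arbitrary feed fields. It is not Spigot's code: the placeholder syntax is simply Python's own string formatting, and the entry fields, values, and the render_message helper are invented for the example.

```python
# Hypothetical feed entry; the field names and values are invented.
entry = {
    "title": "Spigot 2.2",
    "link": "http://example.com/spigot-2-2",
    "author": "Example Author",
}

def render_message(template, entry):
    """Fill a message template with fields from a feed entry."""
    return template.format_map(entry)

print(render_message("{title} by {author}: {link}", entry))
```

In this sketch a template that names a field missing from the entry raises a KeyError, so a real tool would want to validate the format up front.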

This update requires a database schema change as well as an update to your configuration file. The new version will prompt you to upgrade these if necessary. I have provided an upgrade script in the git repo to handle this upgrade for you. New users have nothing to worry about.

Fabricius quips again

Published: 2014-04-13 20:52:00
Category: links

Kim Fabricius' "Doodlings", a semi-regular feature on the Faith & Theology blog, continues to warm my heart with humor. This most recent batch produced multiple fits of audible laughter. This one was chief in my estimation:

There are two major legal grounds for divorce in the UK: adultery and “unreasonable behaviour”. Interestingly, these are the same theological grounds on which evangelicals and liberals “divorce” each other – accusations of syncretism on the one hand and irrationality on the other.

So please, help yourself, and subscribe over there.

Readings for March 2014

Published: 2014-04-04 06:00:00
Category: books Tags: J.K. Rowling William Gass readings

While I try to stay sharp with "literary" fiction, I cannot lay off the popcorn fare. There is, of course, nothing wrong with this. Sometimes the literature smarties need to relax and read a page-turner.

Harry Potter and the Deathly Hallows by J.K. Rowling

Here ends the Harry Potter series as well as my re-read. I had actually forgotten a lot of details from the final book in the series - likely due to the film abridging much in service of the film format. What I rediscovered I liked, particularly the narrative of how Dumbledore's youthful pursuit of power shipwrecked his family life.

The series as a whole is of course recommended. It has become an important part of our culture, and it is good reading.

Middle C by William Gass

Gass tells us the story of Joseph Skizzen, the very average man. Skizzen's upbringing was the product of his father's deceptions and ultimate abandonment. Joseph, along with his mother and sister, ends up in America, where they must learn their own ways to navigate American life. In spite of being undocumented and uneducated, Joseph becomes Professor Skizzen, on the music department faculty of a small midwestern university. There he begins cultivation of his private "Inhumanity Museum" and his attempt to express an idea, a single sentence, in its perfect form.

I can honestly say this is the best novel I have read in quite some time. The character Skizzen and his neighbors are a delight to read. Recommended.

Periodicals

  • Harper's March 2014
  • Scientific American September 2013 - I learned a lot from this food-centric issue. One of the most interesting factoids was that humans really need to cook food in order to survive.

The τελος of Greek natural language processing

Published: 2014-03-25 21:11:00
Category: language Tags: Greek NLP

I dream that someday we'll have a full stack of Greek natural language processing tools to facilitate research. These tools will range from transcribing the text to advanced NLP tasks like text classification or sentiment analysis. These tools will of course be open source.

Here is an overview of the components I have imagined (with notes where the tools are already in development):

  • Optical Character Recognition to transcribe the text to a digital form (Rigaudon Polytonic Greek OCR)
  • A user interface for editing the output of the OCR system (a "collaborative corpus linguistics" suite could be used for this and other editing tasks)
  • Collation of related texts for textual criticism
  • Morphological analysis of the text (Tauber's greek-inflection is a start)
  • Tagging of the text based on above morphological analysis
  • Indexing the text
  • Use of a context-free grammar or other means to produce syntactical analysis of the text (e.g. syntax trees)
  • A database to store all of this information
  • An API to make this information accessible (towards which Open Scriptures has worked)

We're actually pretty close. And once the full stack is in place, it will greatly increase the speed at which new texts enter the research corpus. This influx of data will improve the results of research and lead to new applications.

Am I missing anything?

Pelican Implementation

Published: 2014-03-21 07:45:00
Category: meta Tags: Pelican

In this post I'll share some of the implementation details for converting this blog from WordPress to Pelican. The process was not difficult, but it did require a bit of figuring to get everything right. Luckily Pelican has good documentation.

File import

Pelican comes with an importer for WordPress XML, so that made things nice and easy. I simply exported from my site and re-imported it into Pelican, converting to Markdown. One thing about the Markdown conversion that did not go well was that the "alt" text of images did not come through correctly. I think this was due to an alternate syntax for links being used by the converter.

Post URL format

I used the /year/month/day/slug format for my WordPress posts (e.g. "http://example.com/2014/03/21/pelican-implementation"). By default Pelican saves the output HTML in a flat structure. If you care about preserving links, this won't do. I used the settings ARTICLE_URL and ARTICLE_SAVE_AS to get Pelican to match the old URLs. There are two ways you can go with the SAVE_AS setting. Either you can put an index.html file inside the folder path (e.g. /2014/03/21/pelican-implementation/index.html), or you can use some sort of rewrite rule in your web server to point the clean path to the HTML in question. I went with the fool-proof index file method. Here are my versions of these settings:

ARTICLE_URL = '{date:%Y}/{date:%m}/{date:%d}/{slug}/'
ARTICLE_SAVE_AS = '{date:%Y}/{date:%m}/{date:%d}/{slug}/index.html'
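
If you want to check what a pattern will produce before regenerating the whole site, you can expand it by hand. The sketch below uses Python's ordinary str.format with a sample date and slug. It assumes Pelican fills these placeholders in essentially the same way, which is worth verifying against your own output.

```python
from datetime import datetime

ARTICLE_URL = '{date:%Y}/{date:%m}/{date:%d}/{slug}/'
ARTICLE_SAVE_AS = '{date:%Y}/{date:%m}/{date:%d}/{slug}/index.html'

# Sample metadata matching the example URL mentioned earlier.
metadata = {"date": datetime(2014, 3, 21, 7, 45), "slug": "pelican-implementation"}

print(ARTICLE_URL.format(**metadata))      # 2014/03/21/pelican-implementation/
print(ARTICLE_SAVE_AS.format(**metadata))  # 2014/03/21/pelican-implementation/index.html
```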

Media

WordPress sticks its uploads in the wp-content directory. You'll probably want to put those contents inside your "images" folder in your Pelican project, and then edit your imported posts to re-point the paths. I used the following to edit the files in place:

for f in *.md; do sed -i "s/wp-content\/uploads/images/g" "$f"; done

Voila!

Etc.

You can use git or another VCS to keep track of both your inputs and outputs. I'd use the following setting in your pelicanconf.py file to let Pelican know not to mess with git in your output directory:

OUTPUT_RETENTION = (".git",)
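
A side note on Python syntax, since it is easy to trip over here: parentheses alone do not make a tuple, the trailing comma does. The sketch below shows the difference. How much it matters in practice depends on how Pelican reads the setting, which I have not verified, so take the membership test as an illustration of the Python behavior only.

```python
as_string = (".git")   # parentheses only group: this is a str
as_tuple = (".git",)   # the trailing comma makes a one-element tuple

print(type(as_string).__name__, type(as_tuple).__name__)

# Membership tests behave differently for the two types.
print("git" in as_string)  # substring match on a str
print("git" in as_tuple)   # exact-element match on a tuple
```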

The End

The majority of the time I spent on this project was on creating my own custom theme. Theming templates was easy with Jinja. I am just bad at CSS. I tried to avoid JavaScript for the function of the page. It is only used in my analytics tracking code (Piwik). I also link to Google for some web fonts. But browsers will fall back if the visitor chooses to block remote @font-face calls.

That's it. Let me know if you have any questions.

Greek WOTD - ὑπέρογκος

Published: 2014-03-19 06:03:00
Category: language Tags: Greek WOTD

ὑπέρογκος

Meaning "extremely large" or "rather difficult." Spotted in Lamentations 1:9:

ἀκαθαρσία αὐτῆς πρὸς ποδῶν αὐτῆς οὐκ ἐμνήσθη ἔσχατα αὐτῆς καὶ κατεβίβασεν ὑπέρογκα οὐκ ἔστιν ὁ παρακαλῶν αὐτήν ἰδέ κύριε τὴν ταπείνωσίν μου ὅτι ἐμεγαλύνθη ἐχθρός

Updated Platform

Published: 2014-03-17 06:52:00
Category: meta

For quite some time I have been interested in the prospect of converting this blog to a static format. This has been for various reasons, ease of maintenance and security concerns being foremost. I tried various static blog generators, but found little to love in them.

But then some time in the past year I discovered Pelican and knew it was the platform for me. It's based on Python after all! So I had some aborted attempts at a conversion. In case you are wondering, converting a mature WordPress blog to another format is not always easy. Thankfully there is an import tool with Pelican, and a number of configurable options to help match the new environment to the old as much as possible.

I have endeavored to preserve links where prudent. So this includes links to posts, pages, and the syndicated feed. Links to categories, tags, and particular index pages may be broken.

The hardest part of this conversion was the decision to remove comments from posts. This blog will not be using a public commenting system in the future. Pelican offers Disqus, but that is not a solution I would prefer. If you would like to comment on a post, please email me, and I may add it to the site. I will see if I can develop a way to add existing comments back to their respective posts.

So that's it. Onward and upward.

Early Christian Writings

Published: 2014-03-10 07:47:00
Category: links Tags: Greek

Early Christian Writings is an index of pre-Nicene Christian texts. It includes links to texts and translations (where available), as well as commentaries, and all of the works are tied up in a chronology.

Readings for February 2014

Published: 2014-03-05 06:23:00
Category: books Tags: J.K. Rowling readings Robert Jordan

This month: pops!

The Shadow Rising by Robert Jordan

I completed the fourth of umpteen novels in the Wheel of Time series. This installment was a bit longer than the preceding, but hopefully the length will not continue to increase on a linear scale. Jordan does a decent job presenting some core conflicts in this novel, which give some immediacy to the larger conflict that I know will not be resolved for ten more installments. I do not feel weary on the journey thus far, so I will continue reading these one every few months. Recommended.

Harry Potter and the Half-Blood Prince by J.K. Rowling

I started a re-read of the Harry Potter novels in September of 2008. Let's just say that it has been a fairly slow burn. But having completed the sixth installment, I rush right on to the ultimate, as you, dear reader, will see in my subsequent readings post.

These novels are really enjoyable. In this read-through I am particularly enjoying the themes of Harry Potter. Rowling wrote some novels which are interesting to adults not just in a popcorn fashion, but because they appeal to some complex emotions. Which is good! Also: Snape kills Dumbledore. I figure after all this time, I should not need to give a spoiler warning for that. Recommended.

Periodicals

  • Scientific American August 2013 - This issue covers several angles of MOOCs - massive open online courses. I took one on natural language processing through Coursera and liked it quite a bit. The numbers are not great in terms of students completing the course, but even the few who progress all the way represent a large group learning from a single course.

Readings for January 2014

Published: 2014-03-05 06:11:00
Category: books Tags: John Howard Yoder readings Robert Stone

In which I eventually get to documenting my reading adventures.

The Politics of Jesus by John Howard Yoder

I was gripped by a desire to re-read this classic sometime back. I first read this back in 2011, and it never left me. Yoder's power in this work is in the unmasking of what is plainly in the gospel texts: Jesus did have a politics which was particular to his time and place, but his politics can nonetheless be applied in the here and now. Now why would this message need to be unmasked? It is due to our culture and our theology and our fixation on our own politics, I think. First-century Palestine was such a remote time and place, it seems to many modern readers that Jesus must have been a mystic on a quest to bring others to enlightenment.

Part of what I appreciate about The Politics of Jesus is that Yoder so convincingly draws the pacifist line between state aggression on the one hand and Christian anarchism on the other. He made space for law and order and taxes and civil government while at the same time rejecting the nationalist fervor which leads to consumptive war. My reading of Yoder's vision saved me in a way, for at the time I was torn between rejecting the brutality of the state on the one hand, but being unable to commit to anarchism on the other.

This is a seminal work of Christian thought in the twentieth century, so it is, of course, recommended.

Children of Light by Robert Stone

I love checking out new authors by way of the library, but alas my local libraries are not very accessible given my commute. So I registered as a patron of the Multnomah County library, and now use their Central branch downtown for getting books close to work. Robert Stone's Children of Light was my first acquisition with my new card.

The novel follows a few days in the lives of two Hollywood personalities, a writer and an actor, erstwhile lovers, as they set out on a crash course towards each other. The characters are soaked with alcohol and drugs, personal demons, horrible friends and bad luck. It is quite like reading about a train wreck in slow motion, with some good nods to classic American literature as signposts on the track. To my astonishment, Stone develops the characters in such a way that they remain pitiful without being sentimentalized. I grew fond of them, was rooting for them. But be warned, fair reader: this is a tragedy. Recommended.

Periodicals

  • Harper's January 2014 - John P. Davidson's account of a modern school for butlers is both amusing and perplexing - it seems that the ultra-rich cannot help but descend into self-parody.

SBL's Online Critical Pseudepigrapha

Published: 2014-02-09 21:16:00
Category: Christianity Tags: apocrypha

The SBL publishes the Online Critical Pseudepigrapha - an online critical edition of various "Old Testament" pseudepigrapha. It includes the Greek text of nearly thirty works, and there is a critical apparatus for four of them.

The website has been live for quite some time, but has apparently been inactive for a while (the latest blog post is from 2009). Still, this provides an excellent opportunity to get one's feet wet with this collection.

© Nathan D. Smith
This work is distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.