Tag Archives: technology

Polytonic Greek in Dvorak layout for Linux

I type with the Dvorak keymap. I also type polytonic Greek, which in every operating system I use is based on the Qwerty keymap. So it gets very confusing and annoying to switch back and forth.

So I modified the Linux Xorg Greek keymap to correspond to the English Dvorak layout. And I’d like to share.

  1. Download the modified keymap.
  2. For steps 3-5 you’ll need root privileges, so use sudo or su to get them.
  3. Find your xkb symbols directory. On Debian-based systems it is /usr/share/X11/xkb/symbols, but it has also been placed in /etc/X11/…
  4. Back up your existing Greek layout by copying “gr” from that folder to a safe place.
  5. Remove the “.txt” extension from the modified keymap you downloaded and place the file in your xkb symbols directory.
  6. From a terminal issue this command to “reset” your keymap to normal:
    setxkbmap -layout us -variant dvorak
    If you are already on English Dvorak this changes nothing, but you’ll want it in your terminal history so you can get back to regular English Dvorak.
  7. Issue the following command to make the keymap active:
    setxkbmap -layout gr -variant dvpoly
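
Steps 4 and 5 can also be scripted. Here is a minimal Python sketch, not a tested installer: the function name, the backup file name "gr.orig", and the idea that the download is called "gr.txt" are all assumptions for illustration. It needs the same root privileges as the manual steps when pointed at the real symbols directory.

```python
import shutil
from pathlib import Path


def install_keymap(downloaded, symbols_dir, backup_dir):
    """Back up the existing "gr" symbols file, then install the modified one.

    All three paths are supplied by the caller; nothing here is tied to
    a particular distribution. "gr.orig" is an arbitrary backup name.
    """
    downloaded = Path(downloaded)
    symbols_dir = Path(symbols_dir)
    backup_dir = Path(backup_dir)

    target = symbols_dir / "gr"

    # Step 4: keep a copy of the stock layout so it can be restored later.
    # An existing backup is never overwritten.
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = backup_dir / "gr.orig"
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)

    # Step 5: install the download under the name "gr" (no ".txt").
    shutil.copy2(downloaded, target)
    return backup, target
```

On a Debian-based system the call might look like install_keymap("gr.txt", "/usr/share/X11/xkb/symbols", "/root/xkb-backup"), with the first and last paths adjusted to wherever you saved the download and want the backup.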

Now you are typing polytonic Greek with a Dvorak layout. Now, as others have noted, this does not have the spirit of the Dvorak keymap. It is not constructed based on actual usage of Greek, and it will not be any more efficient than the Qwerty-based layouts (and given the placement of the accent keys in my layout, it may actually be worse). The purpose of this keymap is purely to make it easier for English Dvorak typists to transition to polytonic Greek.

The primary departure I made from the English Dvorak paradigm was moving the semi-colon so that the Greek accent dead keys could be closer to one another. In this keymap, the “semi-colon” (actually Greek question mark) can be typed using the “Q” key on the Qwerty layout, and the acute and grave accents can be typed using the “Z” and “X” keys respectively (which are “;” and “q” in the Dvorak layout). Here is a screenshot of the layout for a more explicit reference.

The xkb keymaps are under the MIT license. I used this non-polytonic layout as a guide for my work. Please post with comments, questions, bugs, etc. For more info on how to type polytonic Greek in Linux, see this excellent post on B-Greek. It references the Qwerty-based layout, but the same principles apply.

Random Genesis

I have just begun working through Natural Language Processing with Python. One of the first features highlighted in the first chapter is the ability of nltk (the Natural Language Toolkit, a Python module) to generate random text from a corpus.

Without further ado, here is what my system generated based on the book of Genesis in the KJV:

In the selfsame day , neither do thou any thing that creepeth upon the bank of the east wind , sprung up after th And I will send thee a covering of the Philistines unto Gerar . And he commanded the steward of my master greatly ; and she bare unto Jacob , went forth to go down and buy thee fo But if thou be in the second , and fall upon Adam , in the land is good : and his two womenservants , and begat sons and his eleven sons , and put every man ‘ s

Sound realistic? ;-)
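
The general idea behind this kind of random text can be sketched without nltk at all. What follows is a minimal bigram sketch in plain Python, not nltk's own implementation: each next word is picked at random from the words that followed the current word somewhere in the corpus. The function names and the one-verse sample are made up for illustration.

```python
import random
from collections import defaultdict


def build_bigram_table(tokens):
    """Map each token to the list of tokens that follow it in the corpus."""
    table = defaultdict(list)
    for current, following in zip(tokens, tokens[1:]):
        table[current].append(following)
    return table


def generate(tokens, length=20, seed=None):
    """Produce `length` tokens by repeatedly sampling an observed successor."""
    rng = random.Random(seed)
    table = build_bigram_table(tokens)
    word = rng.choice(tokens)
    output = [word]
    while len(output) < length:
        followers = table.get(word)
        # The last token of the corpus may have no successor; in that
        # case restart from a random token.
        word = rng.choice(followers) if followers else rng.choice(tokens)
        output.append(word)
    return " ".join(output)


# Tiny stand-in corpus (Genesis 1:1, KJV), tokenized on whitespace.
sample = "In the beginning God created the heaven and the earth .".split()
print(generate(sample, length=12, seed=1))
```

With a corpus this small the output mostly loops around "the"; a whole book gives the sampler enough choices to produce the kind of plausible nonsense shown above.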

Natural Language Processing with Python

I was browsing through a local bookshop’s computer section recently and saw a title which instantly grabbed my attention: Natural Language Processing with Python. It was a bit more expensive than I wanted to pay at that moment, but I thought I may save up.

As it happily turns out, the entire book is available online under a Creative Commons license (BY-NC-ND). This is the sort of thing which makes me really happy. I am going to be checking it out, and if it is useful enough, I may buy the paper copy to thank the authors and O’Reilly for publishing such a great book.

The book is focused mostly on the Natural Language Toolkit (nltk) Python module, which is available under an Apache license. I had never used it before, but it looks fairly capable. I must admit I was somewhat surprised that Google finds relatively few pertinent results when searching for “nltk new testament greek” or “nltk biblical studies.” The library seems quite suited to the field, so I am surprised it is not more popular among Bible scholars. If nltk is any good, I intend to change that.

The Best Laid Plans (BibleTech 2011)

I had planned to attend the BibleTech conference this weekend. However, Elias came down with his first-ever illness, so I thought it best to cancel my trip and help take care of him. As it happens, Weston’s trip to the conference was also cancelled due to family exigencies, and so he was not able to give his talk about Open Scriptures.

I am pretty sad, because I have really been looking forward to going to this conference for a few years now. Oh well, better luck next year!

Memory Verse Rank update

I’ve updated the Memory Verse Rank site to what will probably be its final form. Here are the basic changes I made:

  • Changed the color scheme and fonts, and added an icon.
  • Changed some instances of “rank” to “rate” to make a semantic distinction between the act of rating a verse and the overall rankings.
  • Now drawing from among all the Bible books, instead of just Ephesians.
  • Instead of trying to choose a random Bible verse, I will be using the ESV API’s built-in “random” verse function. It returns a passage from a pre-picked set. This has two advantages for the site: 1) it solves some technical difficulties in implementing random verse selection from such a large dataset, and 2) it will keep the pool of memorable verses from becoming too diluted.

Someday I may add support for user accounts, but not for now. I’ll just let the site run and see what happens. Tell your friends.

Memory Verse Rank

I have just launched Memory Verse Rank. The premise is simple: I show you a Bible verse, you tell me if it is a memory verse. What’s a memory verse? Use whatever criteria you like, but start with whether or not you’ve ever actually memorized it. I am not entirely sure where the data will lead, but it will be interesting to analyze the properties of memorable and non-memorable verses. I also hope to have all kinds of stats like “most memorable book,” etc.

Right now I am testing with just Ephesians. I still need to work out a better way to pull random verses from the ESV API, which I use for the scripture backend. Please keep an eye out for bugs, and send me any improvements you would like to see.

Huck Finn: Python edition

Someone proposed a Kickstarter project to replace the “n-word” with “robot” in Huck Finn. The project is in the vein of other recent humorous edits of literature in the public domain, though they claim to have an altruistic goal – to get the redacted version of the story back into the hands of kids everywhere. You can even get your name added as a minor character to the book if you donate enough to the project! Let’s call it “benevolent censorship”. Or maybe “the rape of the public domain.” Actually, best not to describe it at all.

The scope of the project includes commissioning an introduction, altering illustrations, and editing the text. Well, I can help with one part of that. The full text of The Adventures of Huckleberry Finn is available on Project Gutenberg. I’ve written a short Python script to replace every instance of the n-word with “robot.”
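
The script itself is not reproduced in this copy of the post, so here is a minimal sketch of the idea rather than the original. The function names are placeholders, and the word to replace is passed in as an argument instead of being hard-coded. The source and destination file names are likewise up to you.

```python
import re


def replace_word(text, target, replacement):
    """Replace whole-word occurrences of `target`, ignoring case.

    A capitalised match yields a capitalised replacement, and a
    trailing plural "s" is carried over.
    """
    pattern = re.compile(r"\b" + re.escape(target) + r"(s?)\b", re.IGNORECASE)

    def swap(match):
        word = replacement + match.group(1).lower()
        if match.group(0)[0].isupper():
            word = word[:1].upper() + word[1:]
        return word

    return pattern.sub(swap, text)


def convert_file(source_path, dest_path, target, replacement="robot"):
    """Read a plain-text book, replace the word, and write a new file."""
    with open(source_path, encoding="utf-8") as src:
        text = src.read()
    with open(dest_path, "w", encoding="utf-8") as dst:
        dst.write(replace_word(text, target, replacement))


# Demonstration with a harmless stand-in word.
print(replace_word("The widget sat by two Widgets.", "widget", "robot"))
# prints: The robot sat by two Robots.
```

The word-boundary anchors keep the pattern from touching longer words that merely contain the target.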

Just run it and then you’ll have a nice txt file of the robot edition. There, now the editing part is done. Can I get my name added as a townsperson?

Arduino hacking

[Image: An Arduino board with attached electronic components]

On a lark I bought the most recent issue of Make magazine. This issue was all about the Arduino, an open source prototyping microcontroller. So I ordered one, and so far I am having a ton of fun. I even started a git repo to store my “sketches” (the C code which programs the microcontroller). So far the best I’ve done is make a music box on which I can control the pitch and the tempo with two different types of potentiometers. The sky is the limit, and I keep coming up with new ideas to try.

In praise of dead trees

The Amazon Kindle and other electronic book readers solve two problems:

  1. For readers, they provide a convenient device on which any number of books can be stored, or even downloaded on the road, and provide a decent reading experience.
  2. For publishers, they solve the pesky problem of libraries and lending among friends. In other words, e-readers make it so that people have to buy every book they read.

As far as #1 goes, more power to the consumers. I’d love to be able to have a huge stack of books with me at any time in a small package. But #2 is where the problem lies for me. Nobody should be surprised by this coming from a blog with “library” in the title.

First of all, I am not going to argue from the aesthetic experience of reading books. Some people like the feel, weight, and smell of genuine books. Other people prefer to have a sleek and lightweight gadget on which to read books, papercut-free. As a public transit commuter, I admit that I would much prefer to carry an e-reader. Big bulky books bring bad looks on the bus. On the other hand, I do appreciate the variety of book smells (with the notable exception of mildew), and there is something appealing in the act of physically turning pages, especially near the end of a book. But I do not think that aesthetic considerations are most important here.

The important aspects of e-reader devices come from the restrictions which digital text places on the reader. Because of the digital restrictions management put on e-book files, you cannot share books with your friends. You cannot borrow them from the library. You cannot make a copy in a different format. That is exactly what the publishers and proprietors of e-readers want. Content providers want each consumer to be in a silo. Every good or work a consumer wants would be purchased directly, and sharing would not be possible, since every purchase would be bound to the original consumer.

Clearly these restrictions are disadvantageous for us (the “consumers”). Libraries provide an important function in our society: allowing knowledge and culture to be shared for free amongst everyone. And there is something to be said in favor of loaning a favorite book to a friend. I currently have several books loaned out to friends, and in turn I have some of their books. Books are also good from an environmental standpoint, because they are durable goods which do not require any additional material or energy after their initial manufacture.

I like e-reader devices. Perhaps some day contemporary works will be published in a digital format without restrictions. If that happens, I’ll be first in line to buy an e-reader. Until then, I’m content with the tried and true form of dead-tree books. And I think that paper books will hold their own in the market, because of their intrinsic merits.

XML and the Bible

While working on an importer to bring the SBL Greek New Testament into Open Scriptures, I noticed some interesting features of the SBLGNT XML file. (I promised that I would try to exclude posts of a technical nature from this blog, but I am breaking that promise, because I think this technical discussion is interesting and applicable to Biblical studies.)

The SBLGNT’s XML representation of the Biblical text makes an interesting distinction between tags which have child elements and childless tags. That is, normal XML tags encompass the actual Greek text and its structures (such as paragraphs and books), while childless tags represent insertions which are not original to the text. Here is a truncated Matthew 1:1 in the SBLGNT XML as an example:

<book id="Mt">

  <title>ΚΑΤΑ ΜΑΘΘΑΙΟΝ</title>

  <p>

    <verse-number id="Matthew 1:1">1:1</verse-number>

    <w>Βίβλος</w>

    <suffix> </suffix>

    ...

    <w>Ἀβραάμ</w>

    <suffix>. </suffix>

  </p>

Notice how there is no “verse” tag which encompasses all of the included text. Instead “verse-number” is a tag which is inserted wherever the verse breaks are located. This is opposed to the “p” (paragraph) tag, which encompasses all of the child “w” (word) and “suffix” (spaces and punctuation) tags. Paragraphs are of course present in the original biblical text.
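
To make the consequence for an importer concrete: because “verse-number” wraps no text, verse contents have to be collected by walking the children of each “p” and starting a new verse whenever a marker appears. Here is a minimal sketch using Python's standard xml.etree.ElementTree. The SAMPLE string is a hand-made, abridged fragment in the same shape as the excerpt above (not the real file), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-made, abridged fragment; the verses are cut short on purpose.
SAMPLE = """\
<book id="Mt">
  <title>ΚΑΤΑ ΜΑΘΘΑΙΟΝ</title>
  <p>
    <verse-number id="Matthew 1:1">1:1</verse-number>
    <w>Βίβλος</w>
    <suffix> </suffix>
    <w>Ἀβραάμ</w>
    <suffix>. </suffix>
    <verse-number id="Matthew 1:2">1:2</verse-number>
    <w>Ἀβραὰμ</w>
    <suffix> </suffix>
    <w>ἐγέννησεν</w>
    <suffix> </suffix>
  </p>
</book>
"""


def verses(paragraph):
    """Group the text of a <p> element by its <verse-number> markers."""
    result = {}
    current = None
    for child in paragraph:
        if child.tag == "verse-number":
            # A marker: it says where a verse starts but wraps no text.
            current = child.get("id")
            result[current] = ""
        elif child.tag in ("w", "suffix") and current is not None:
            result[current] += child.text or ""
    return result


book = ET.fromstring(SAMPLE)
for p in book.iter("p"):
    for ref, text in verses(p).items():
        print(ref, "->", text.strip())
```

This sketch ignores any words that come before the first marker in a paragraph, so a verse that continues across a paragraph break would need the current reference carried over from the previous “p”.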

One thing I might have done to take this principle even further would be to put the Book titles where they appear in the Greek manuscripts. In SBLGNT XML, the title is always the first child element of the “book” tag. However, that is not always where the title was in the manuscripts. Sometimes it was printed at the end of the book.

I like the distinction between textual forms and externally imposed structures as reflected in this XML document. I can’t be sure what Logos’ exact thinking was behind these design choices, but I think this distinction is the principle at work.