Python Frequency Analysis for Ciphers

Frequency Analysis is the study of the frequency of letters or groups of letters in a cipher text.

Using Python, we can extract the counts of letters, bigrams, and trigrams. Let’s have a look:

$ ./frequency.py --help
usage: frequency.py [-h] [--letters] [--bigrams] [--trigrams] msg

positional arguments:
  msg             Message to count letters in

optional arguments:
  -h, --help      show this help message and exit
  --letters, -l   Frequency of letters
  --bigrams, -b   Frequency of bigrams
  --trigrams, -t  Frequency of trigrams

Let’s go ahead and enter a simple sentence and do some testing:

$ ./frequency.py 'all work no play makes jack a dull boy' -l
===== Letters =====
('a', 5)
('l', 5)
('k', 3)
('o', 3)
('y', 2)
('c', 1)
('b', 1)
('e', 1)
('d', 1)
('j', 1)
('m', 1)
('n', 1)
('p', 1)
('s', 1)
('r', 1)
('u', 1)
('w', 1)

How about bigrams:

$ ./frequency.py 'all work no play makes jack a dull boy' -b
===== Bigrams =====
('ll', 2)
('ck', 1)
('ac', 1)
('bo', 1)
('ma', 1)
('ke', 1)
('no', 1)
('wo', 1)
('la', 1)
('al', 1)
('ak', 1)
('ja', 1)
('ul', 1)
('es', 1)
('oy', 1)
('ay', 1)
('du', 1)
('or', 1)
('pl', 1)
('rk', 1)

And lastly, trigrams:

$ ./frequency.py 'all work no play makes jack a dull boy' -t
===== Trigrams =====
('boy', 1)
('all', 1)
('dul', 1)
('ull', 1)
('ack', 1)
('wor', 1)
('lay', 1)
('pla', 1)
('mak', 1)
('kes', 1)
('jac', 1)
('ake', 1)
('ork', 1)

The underlying code for this tool is pretty horrendous, but it’s just a small tool for performing a simple task:

#!/usr/bin/env python
import argparse
from string import ascii_letters
from operator import itemgetter

# Build my Parser with help for user input
parser = argparse.ArgumentParser()
parser.add_argument('msg', help='Message to count letters in')
parser.add_argument('--letters', '-l',  help='Frequency of letters',
            action='store_true',dest='letters', default=None)
parser.add_argument('--bigrams', '-b',  help='Frequency of bigrams',
            action='store_true',dest='bigrams', default=None)
parser.add_argument('--trigrams', '-t',  help='Frequency of trigrams',
            action='store_true',dest='trigrams', default=None)
args = parser.parse_args()

if args.letters:
    letter_dict = {}
    for letter in args.msg:
        if letter in ascii_letters:
            try:
                letter_dict[letter] += 1
            except KeyError:
                letter_dict[letter] = 1

    print "="*5, 'Letters', "="*5
    for letter in sorted(letter_dict.items(), key=itemgetter(1), reverse=True):
        print letter

if args.bigrams:
    bigram_dict = {}
    bigram_holder = []
    for letter in args.msg:
        if letter not in ascii_letters:
            bigram_holder = []
            continue
        else:
            bigram_holder.append(letter)

        if len(bigram_holder) == 2:
            bigram = bigram_holder[0] + bigram_holder[1]
            try:
                bigram_dict[bigram] += 1
            except KeyError:
                bigram_dict[bigram] = 1

            last = bigram_holder.pop()
            bigram_holder = []
            bigram_holder.append(last)

    print "="*5, 'Bigrams', "="*5
    for bigram in sorted(bigram_dict.items(), key=itemgetter(1), reverse=True):
        print bigram

if args.trigrams:
    trigram_dict = {}
    trigram_holder = []
    for letter in args.msg:
        if letter not in ascii_letters:
            trigram_holder = []
            continue
        else:
            trigram_holder.append(letter)

        if len(trigram_holder) == 3:
            trigram = trigram_holder[0] + trigram_holder[1] + trigram_holder[2]
            try:
                trigram_dict[trigram] += 1
            except KeyError:
                trigram_dict[trigram] = 1

            l1 = trigram_holder.pop()
            l2 = trigram_holder.pop()
            trigram_holder = []
            trigram_holder.append(l2)
            trigram_holder.append(l1)

    print "="*5, 'Trigrams', "="*5
    for trigram in sorted(trigram_dict.items(), key=itemgetter(1), reverse=True):
        print trigram
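
The three near-identical blocks above could be folded into a single helper. What follows is only a minimal sketch, written for Python 3 rather than the Python 2 used in the tool; the name ngram_counts is made up for illustration, and it leans on collections.Counter and a regular expression instead of the holder lists. Like the original, it never lets an n-gram span a space or any other non-letter:

```python
import re
from collections import Counter

def ngram_counts(msg, n):
    """Count n-grams of ASCII letters, never spanning a non-letter."""
    counts = Counter()
    # Each run of letters is handled on its own, which plays the same
    # role as resetting the holder list in the original tool
    for word in re.findall(r'[A-Za-z]+', msg):
        # Slide a window of width n across the run of letters
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts

msg = 'all work no play makes jack a dull boy'
for gram, count in ngram_counts(msg, 2).most_common(3):
    print(gram, count)
```

Calling it with n set to 1, 2, or 3 covers letters, bigrams, and trigrams, and most_common() takes care of sorting by count.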

Wacky Python Image Creation

The other night I had a wacky idea of extracting each pixel from an image in order to save it as a plain text ASCII file.

Of course this is not ideal and can take a bit of time, but like most things I do with Python, it’s just for the fun of it.

I figured the easiest way to achieve this would be to use the Python Imaging Library (PIL) and save the output to a serialized pickle text file.

I began to write a simple piece of code to extract each pixel from an image.

First off we need to import a few external libraries to assist:

#!/usr/bin/env python
import Image
import argparse
import pickle

In order to give us some useful --help output, I used argparse to take input:

parser = argparse.ArgumentParser()
parser.add_argument('filename', help='Image to extract')
parser.add_argument('output', help='Output file')
args = parser.parse_args()

Once we have our file name we can load the image using PIL and grab the size of image:

im = Image.open(args.filename)
width, height = im.size

Here we need some way to keep track of our position on the Y and X axes of pixels; we can do this with a couple of integers:

y_cursor = 0
x_cursor = 0

We will need something to hold the data we collect, so why not a list:

image = []

This block is a bit more complex, but all we are doing is stepping through each pixel, much like a typewriter moves across a page one line at a time. We use the getpixel method from PIL to extract each pixel’s RGB information:

print 'Processing %s...' % args.filename
while y_cursor < height:
    row = []
    while x_cursor < width:
        row.append(im.getpixel((x_cursor, y_cursor)))
        x_cursor += 1
    x_cursor = 0
    y_cursor += 1
    image.append(row)
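
The same typewriter walk can also be written without manual cursors. This is just a sketch in Python 3; extract_rows is a hypothetical helper, and it takes the getpixel callable as a parameter so that it can be tried without PIL installed. With a real image it would presumably be called as extract_rows(im.getpixel, width, height):

```python
def extract_rows(getpixel, width, height):
    """Return one list of pixels per row, top to bottom."""
    # The outer loop walks down the rows (y), the inner loop walks
    # across each row (x), the same order as the while loops
    return [[getpixel((x, y)) for x in range(width)]
            for y in range(height)]

def fake_getpixel(xy):
    # Stand-in for im.getpixel, for illustration only:
    # returns an "RGB" tuple derived from the coordinates
    x, y = xy
    return (x, y, 0)

rows = extract_rows(fake_getpixel, 3, 2)
print(len(rows), len(rows[0]))
```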

And lastly we will save all the captured data to a pickle file:

f = open(args.output, 'wb')
pickle.dump(image, f)
# close the file so all data is flushed to disk
f.close()
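
One thing worth knowing if you try this on Python 3: the default pickle protocol there is binary, while Python 2 defaulted to the text-based protocol 0. The sketch below asks for protocol 0 explicitly and uses a with block so the file gets closed. The save_rows helper, the tiny stand-in image, and the temporary path are made up for illustration:

```python
import os
import pickle
import tempfile

# Tiny stand-in for the list of pixel rows built earlier
image = [[(40, 43, 34), (39, 42, 33)],
         [(42, 45, 36), (41, 44, 35)]]

def save_rows(rows, path):
    # protocol=0 is pickle's original text-based format
    with open(path, 'wb') as f:
        pickle.dump(rows, f, protocol=0)

path = os.path.join(tempfile.mkdtemp(), 'output')
save_rows(image, path)

# Read it back to confirm the round trip
with open(path, 'rb') as f:
    restored = pickle.load(f)
print(restored == image)
```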

Once this small script was written, I tried serializing a JPEG image, mox.jpg. The details of the image are:

$ ls -lh mox.jpg
-rw-r--r--@ 1 jeffreyness  staff   870K Nov 10 08:56 mox.jpg

And after running the above Python script I got an ASCII file of:

$ ls -lh output
-rw-r--r--  1 jeffreyness  staff    28M Nov 10 08:56 output

Yes, this made me chuckle: going from an 870K binary JPEG to 28M of ASCII text. But as mentioned, this is just a proof of concept.

The serialized data ended up being quite a few lines, 4916161 to be exact:

$ wc -l output
 4916161 output

Let’s go ahead and take a peek at the un-serialized data:

>>> import pickle
>>> f = open('output', 'rb')
>>> image = pickle.load(f)
>>>
>>> len(image)
960
>>>
>>> len(image[0])
1280
>>>
>>> image[0][0]
(40, 43, 34)
>>>
>>> image[0]
[(40, 43, 34), (39, 42, 33), (42, 45, 36), (41, 44, 35), (40, 43, 34), (38, 41, 32), .....
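
The 960 by 1280 dimensions also seem to account for the line count wc reported earlier. Assuming protocol 0 writes one newline-terminated entry per colour value, one when closing each pixel tuple, one when opening each row list, and one for the outer list, the arithmetic works out exactly. The sketch below (Python 3; protocol0_lines is a hypothetical name) checks that assumption against a tiny image built from distinct tuple objects:

```python
import pickle

def protocol0_lines(width, height):
    # Assumed layout: 3 channel values + 1 tuple close per pixel,
    # 1 list header per row, and 1 header for the outer list
    return 4 * width * height + height + 1

print(protocol0_lines(1280, 960))  # 4916161

# Cross-check on a tiny image; each tuple is built at runtime so
# pickle does not replace repeats with a reference to an earlier one
w, h = 3, 2
tiny = [[tuple([x, y, 0]) for x in range(w)] for y in range(h)]
data = pickle.dumps(tiny, protocol=0)
print(data.count(b'\n'), protocol0_lines(w, h))
```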

Now all that was left was to write another small script to take this ASCII content and turn it back into an image.

Like before we will import our libraries and use argparse to take input:

#!/usr/bin/env python
import Image
import argparse
import pickle

parser = argparse.ArgumentParser()
parser.add_argument('filename', help='Extracted Pickle File')
parser.add_argument('output', help='Output Filename')
args = parser.parse_args()

Now that we have the filename from argparse, we can open the file and load it through pickle:

f = open(args.filename, 'rb')
image = pickle.load(f)

Since there is one list per row, I can determine the width and height of the original image via len:

height = len(image)
width = len(image[0])
size = (width, height)
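
Taking the width from the first row quietly assumes every row has the same length. If the pickle file might have been edited by hand, a small check could catch ragged rows before they turn into an IndexError later. This is only a sketch in Python 3, and image_size is a made-up helper name:

```python
def image_size(rows):
    """Return (width, height), checking every row has the same length."""
    if not rows:
        raise ValueError('no rows in image data')
    width = len(rows[0])
    for y, row in enumerate(rows):
        if len(row) != width:
            raise ValueError('row %d has %d pixels, expected %d'
                             % (y, len(row), width))
    return (width, len(rows))

# Two rows of three pixels each
print(image_size([[(0, 0, 0)] * 3, [(0, 0, 0)] * 3]))
```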

Now we are ready to create the image container and reset our cursors to the start position:

im = Image.new('RGB', size)

# define our cursors to
# walk over the image pixel by pixel
y_cursor = 0
x_cursor = 0

Just like before, we use a typewriter motion to visit each pixel; this time we use the PIL method putpixel to set it:

print 'Building %s...' % args.output
while y_cursor < height:
    while x_cursor < width:
        pixel = image[y_cursor][x_cursor]
        im.putpixel((x_cursor, y_cursor), pixel)
        x_cursor += 1
    x_cursor = 0
    y_cursor += 1
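
An alternative to setting pixels one at a time is to flatten the rows into a single sequence first. The sketch below (Python 3, standard library only; flatten_rows is a hypothetical helper) shows just the flattening step. My assumption is that the result could then be handed to PIL in one call, something like im.putdata(flatten_rows(image)), instead of the nested loops:

```python
from itertools import chain

def flatten_rows(rows):
    """Turn a list of pixel rows into one flat list, row after row."""
    return list(chain.from_iterable(rows))

rows = [[(40, 43, 34), (39, 42, 33)],
        [(42, 45, 36), (41, 44, 35)]]
flat = flatten_rows(rows)
print(len(flat))
```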

And lastly, save the image to our output filename:

im.save(args.output)

And Abracadabra!

$ ls -lh new_mox.png
-rw-r--r--  1 jeffreyness  staff   1.7M Nov 10 09:05 new_mox.png

Creating QR Code with Google

So today I thought it would be neat to show how to quickly create QR codes using Google’s Chart Tools.

Google gives us an extremely easy way to create a QR code by sending a GET request with the data in the URL:

https://chart.googleapis.com/chart?cht=qr&chs=300x300&chl=nessy

That is nice and easy, but I figured I would wrap this up in a small Python script just because:

#!/usr/bin/env python
from urllib2 import quote, urlopen, Request
from poster.encode import multipart_encode
from poster.streaminghttp import register_openers
import sys

def create(width, height, data):
    '''Builds a URL for Google to create us a QR code'''
    # build and make URL request
    google_chart = 'https://chart.googleapis.com'
    url = '%s/chart?cht=qr&chs=%sx%s&chl=%s' % (google_chart, width, height, quote(data))
    request = urlopen(url)
    response = request.read()

    # write QR code to file (binary mode, since PNG data is not text)
    f = open('qr.png', 'wb')
    f.write(response)
    f.close()
    return 'Wrote qr.png..'

def read(qr_file):
    '''Reads a QR code by URL'''
    # post image to decode page
    register_openers()
    datagen, headers = multipart_encode({'f': open(qr_file, 'rb')})
    request = Request('http://zxing.org/w/decode', datagen, headers)
    response = urlopen(request).read()
    return response

if sys.argv[1] == 'create':
    print create(300, 300, sys.argv[2])

if sys.argv[1] == 'read':
    print read(sys.argv[2])
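
The URL in create is put together by hand with string formatting and quote. On Python 3 the same query string could be built with urlencode, which escapes each value for you. This is only a sketch: qr_url is a made-up helper, it makes no network request, and the parameter names are simply the ones from the URL shown earlier:

```python
from urllib.parse import urlencode

def qr_url(data, width=300, height=300):
    """Build the chart URL for a QR code without fetching it."""
    params = {
        'cht': 'qr',                          # chart type
        'chs': '%dx%d' % (width, height),     # size as WIDTHxHEIGHT
        'chl': data,                          # the text to encode
    }
    return 'https://chart.googleapis.com/chart?' + urlencode(params)

print(qr_url('nessy'))
```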

Basic usage works like so:

$ ./qr.py create "My name is Nessy, Hello"
Wrote qr.png..

Running the create option will write a file named qr.png to your current directory:

$ file qr.png
qr.png: PNG image, 300 x 300, 8-bit/color RGB, non-interlaced

And reading this QR code works just like this:

$ ./qr.py read qr.png
My name is Nessy, Hello
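
Since the script indexes sys.argv directly, running it with no arguments ends in an IndexError. In keeping with the argparse usage in the earlier scripts, the two commands could be declared as subcommands instead. This is a sketch for Python 3 only; build_parser and the argument names are my own choices and not part of the original script:

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description='Create or read QR codes')
    # One subcommand per action; required=True makes a missing
    # command produce a usage message instead of a traceback
    sub = parser.add_subparsers(dest='command', required=True)

    create = sub.add_parser('create', help='Write a QR code to qr.png')
    create.add_argument('data', help='Text to encode')

    read = sub.add_parser('read', help='Decode a QR code image')
    read.add_argument('qr_file', help='Path to the QR image')
    return parser

# Parse an explicit argument list here so the sketch runs on its own
args = build_parser().parse_args(['create', 'My name is Nessy, Hello'])
print(args.command, args.data)
```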