Android Open Trivia Database Application

This month I went from owning an iPhone to an Android, I also decided to freshen up on the new hotness in the Android development realm.

I learned Kotlin is now an officially supported Language, Android Studio works much better than the old days of Eclipse and plugins, and OkHttp + Gson make interacting with Rest APIs pretty darn easy.

So what did I manage to create?

Using a list of public-apis found on toddmotto’s Github I came across the Open Trivia Database.

And there you have it I decided to develope a simple Trivia application:

The source code for this project can be found on my Github, Keep in mind I merely hacked this project together with no former Kotlin, and very little Android development experience.

Magic the Gathering Card Recognition

This weekend I took a bit of time to read up on OpenCV (Open Source Computer Vision Library), I wanted to capture images of Magic the Gathering cards, then identify them using a Python library called ImageHash.

Below is a demonstration of what I was able to accomplish in about 2 days of research and hacking:

I’ll try and break down the steps and image manipulation functions I used to achieve this.

1. Capturing Stream from Webcam

Capturing the video stream, then displaying it on screen, is quite simple.  What I did was create an endless loop where I capture a single frame, then display that frame. This loop will complete when the q key is pressed.

import cv2

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()

    cv2.imshow('frame', frame)

    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

For demonstration I’m going to show you a snapshot of the stream through this process:

View post on imgur.com

2. Convert Frame to Grayscale

It is common to convert your image to grayscale, this helps us threshold and identify our contours in later steps:

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

View post on imgur.com

3. Convert Frame to Grayscale

A Threshold allows us to convert our images to absolute black, or absolute white. Here, I’m converting any pixel at color 130 or higher to color 255:

_, thresh = cv2.threshold(gray, 130, 255, cv2.THRESH_BINARY)

View post on imgur.com

4. Find all Contours of Threshold Image

Using my threshold frame, we can find all contours:

_, contours, _ = cv2.findContours(thresh,cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

sorted_contours = sorted([ (cv2.contourArea(i), i) for i in contours ], key=lambda a:a[0], reverse=True)

for _, contour in sorted_contours:
    cv2.drawContours(frame, [contour], -1, (0, 255, 0), 2)

This ends up looking something like this:

View post on imgur.com

5. Find card contour

Here is where I got lazy and decided to just hack it together.

If you notice in the above image, there is a contour for the entire frame and the next largest contour is our card’s edges. I decided for the time being to hard code our card contour as the 2nd largest contour:

_, card_contour = sorted_contours[1]
cv2.drawContours(frame, [card_contour], -1, (0, 255, 0), 2)

View post on imgur.com

6. Drawing Card Corner Points

Next I pass our card_contour into opencv’s minAreaRect function, this gives us 4 points, one for each corner:

rect = cv2.minAreaRect(card_contour)
points = cv2.boxPoints(rect)
points = np.int0(points)

for point in points:
    cv2.circle(frame, tuple(point), 10, (0,255,0), -1)

View post on imgur.com

7. Identify Corners and Warp Image

This part was a little challenging, but I found an excellent explanation by Adrian over at pyimagesearch.com.

Basically, he takes advantage of numpy and how opencv provides box points in row / columns.

# create a min area rectangle from our contour
_rect = cv2.minAreaRect(card_contour)
box = cv2.boxPoints(_rect)
box = np.int0(box)

# create empty initialized rectangle
rect = np.zeros((4, 2), dtype = "float32")

# get top left and bottom right points
s = box.sum(axis = 1)
rect[0] = box[np.argmin(s)]
rect[2] = box[np.argmax(s)]

# get top right and bottom left points
diff = np.diff(box, axis = 1)
rect[1] = box[np.argmin(diff)]
rect[3] = box[np.argmax(diff)]

(tl, tr, br, bl) = rect

widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
maxWidth = max(int(widthA), int(widthB))

heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
maxHeight = max(int(heightA), int(heightB))

dst = np.array([
    [0, 0],
    [maxWidth - 1, 0],
    [maxWidth - 1, maxHeight - 1],
    [0, maxHeight - 1]], dtype = "float32")

M = cv2.getPerspectiveTransform(rect, dst)
warped = cv2.warpPerspective(frame, M, (maxWidth, maxHeight))

The warped frame came out looking as follows:

View post on imgur.com

7. Hash Image and Compare

First off, this solution will not scale! Be forewarned.

I grabbed 4 Magic cards from my collection, then grabbed an official image from the Gatherer website and stuck it in a directory.

After that, I hashed my warped image, then iterated over each official image comparing hashes:

from PIL import Image, ImageFilter
from glob import glob
import imagehash

# hash our warped image
hash = imagehash.average_hash(Image.fromarray(warped))

# loop over all official images
for orig in glob('orig/*.jpeg'):

    # grayscale, resize, and blur original image
    orig_image = Image.open(orig).convert('LA')
    orig_image.resize((maxWidth, maxHeight))
    orig_image.filter(ImageFilter.BLUR)

    # hash original and get hash
    orig_hash = imagehash.average_hash(orig_image)
    score = hash - orig_hash

    print('Comparing image to {}, score {}'.format(
        orig, score
    ))
print('-' * 50)

And that is it, the Goldenglow Moth card resulted in the following print:

Comparing image to orig/doom_blade.jpeg, score 25
Comparing image to orig/forst_breath.jpeg, score 26
Comparing image to orig/goldenglow_moth.jpeg, score 7
Comparing image to orig/kessig_wolf.jpeg, score 13
--------------------------------------------------

MySQL IN Operator Not Using Index

After being alerted for elevated cpu on a staging mysql instance, I logged into the mysql console and ran a SHOW PROCESS LIST:

SHOW PROCESSLIST\G
*************************** 1. row ***************************
     Id: 1
   User: my_user
   Host: x.x.x.x:xxx
     db: my_db
Command: Sleep
   Time: 2
  State:
   Info: NULL
*************************** 2. row ***************************
     Id: 3
   User: my_user
   Host: x.x.x.x:xxx
     db: my_db
Command: Query
   Time: 786
  State: Sending data
   Info: SELECT file_id, fs_file_id
FROM my_data
WHERE file_id IN ('123456', 123457);

A query running for 786 seconds quickly caught my eye, my next course of action was to run an EXPLAIN:

EXPLAIN SELECT file_id, fs_file_id FROM my_data WHERE file_id IN ('123456', 123457)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: my_data
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 131144420
        Extra: Using where
1 row in set (0.00 sec)

The EXPLAIN confirmed my suspicions, this query was not using an index, and ultimately caused a full table scan.

I was expecting to not find an index for the file_id field, but as you can see this field has an index by being a PRIMARY KEY:

SHOW CREATE TABLE my_data\G
*************************** 1. row ***************************
Table: my_data
Create Table: CREATE TABLE `my_data` (
`file_id` int(10) unsigned NOT NULL,
`fs_file_id` binary(16) DEFAULT NULL,
PRIMARY KEY (`file_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
1 row in set (0.00 sec)

Something I didn’t immediately identify earlier was the mixed types in my IN operator‘s list, they contained both integers and strings:

file_id IN ('123456', 123457)

Once I noticed the mixed type I began debugging, I finally learned the IN operator works great with sets of integers:

EXPLAIN SELECT file_id, fs_file_id FROM my_data WHERE file_id IN (123456, 123457)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: my_data
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 2
        Extra: Using where
1 row in set (0.00 sec)

It even works with sets of strings:

EXPLAIN SELECT file_id, fs_file_id FROM my_data WHERE file_id IN ('123456', '123457')\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: my_data
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 2
        Extra: Using where
1 row in set (0.00 sec)

But as we observed from the begining, the IN operator doesn’t act as expected with mixed strings and integers.