This week I've been working on the Facial Keypoint Detection competition hosted by Kaggle. The objective of the competition is to predict keypoint positions on face images.
We start out with three files: the training set training.csv, the test set test.csv, and a list of 27214 keypoints to predict in IdLookupTable.csv.
Let's first load these files into Python:
    import numpy as np
    import pandas as pd

    df = pd.read_csv('training.csv')
    testdf = pd.read_csv('test.csv')
    lookup = pd.read_csv('IdLookupTable.csv')
Inspecting our training set (for example with df.info()) gives us some useful
information. We see that there are 31 columns, 30 of which are keypoint
coordinates. The last column is a space-separated string of pixel values,
ordered row by row. The next important thing is that quite a few keypoints are
missing; we'll have to do something about that.
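To get a feel for how widespread the gaps are, one option is to count the missing values per column. The sketch below runs on a tiny made-up frame rather than the real training.csv: the column names only mimic the competition's layout and the values are invented.

```python
import numpy as np
import pandas as pd

# Toy stand-in for training.csv: two keypoint columns plus the Image column.
# All values are invented for illustration only.
toy = pd.DataFrame({
    'left_eye_center_x': [66.0, np.nan, 65.1, 64.3],
    'left_eye_center_y': [39.0, 37.4, np.nan, np.nan],
    'Image': ['238 236 237', '219 215 204', '144 142 159', '193 192 193'],
})

# Number of missing entries per column
missing_counts = toy.isnull().sum()
# Fraction of rows with a missing entry, per column
missing_share = toy.isnull().mean()

print(missing_counts)
print(missing_share)
```

Running the same two lines on df instead of toy should show which keypoints are affected most.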
Given this info, we can do some more preprocessing on our data:
    # Convert the Image column from a space-separated string to a numpy array
    for d in [df, testdf]:
        d['Image'] = d['Image'].apply(lambda im: np.array(im.split(), dtype=float))

    # Fill missing keypoints with the mean of the respective column
    df = df.fillna(df.mean(numeric_only=True))

    # Create training samples/labels for model fitting
    X = np.vstack(df['Image'].values).astype(float)
    y = df.drop('Image', axis=1).values
Voila, we can now use X and y to train a model on our data. We have several
options here: neural networks could potentially be very accurate. However, I'm
not very proficient with those yet, so I chose to use a relatively simple
regression model, k-nearest neighbors. This model predicts a value for a sample
by finding its k nearest neighbors in the training set and returning the mean
of their values as the prediction.
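To make the idea concrete, here is a minimal from-scratch sketch of that prediction step in plain numpy. The function name knn_predict and the toy data are made up for illustration. It uses Euclidean distance and an unweighted mean, which is the behaviour described above, and makes no attempt to be fast.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=5):
    """Predict targets for one query point as the mean of its k nearest neighbors."""
    # Euclidean distance from the query to every training sample
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training samples
    nearest = np.argsort(distances)[:k]
    # Average the neighbors' targets (works for multi-output targets too)
    return y_train[nearest].mean(axis=0)

# Toy data: 1-D inputs with 2-D targets, invented for illustration
X_train = np.array([[0.0], [1.0], [2.0], [10.0]])
y_train = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [10.0, 20.0]])

# The two nearest neighbors of 0.9 are the samples at 1.0 and 0.0
prediction = knn_predict(X_train, y_train, np.array([0.9]), k=2)
print(prediction)
```

scikit-learn ships an optimized version of the same idea, which is what I used: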
    from sklearn.neighbors import KNeighborsRegressor as KNR

    estim = KNR()
    estim.fit(X, y)
This trains a k-nearest neighbors regression model with the default values
scikit-learn provides. Using this model, we can do our first submission
to the competition! We need to predict the keypoints specified in the lookup
data. The relevant fields are ImageId (points to a row in the test set) and
FeatureName (specifies the feature we need to predict).
I simply predicted every keypoint for the whole test set and then molded the
results into the submission format:
    # stack all test images into one numpy array
    Y = np.vstack(testdf['Image'].values).astype(float)

    # predict all keypoints for the images in Y
    predictions = estim.predict(Y)

    # now create the result data and write to csv
    preddf = pd.DataFrame(predictions, columns=df.columns[:-1])
    rows = []
    for i in range(lookup.shape[0]):
        d = lookup.iloc[i]
        # ImageId is 1-based, the rows of preddf are 0-based
        rows.append((d['RowId'], preddf.loc[d['ImageId'] - 1, d['FeatureName']]))
    results = pd.DataFrame(rows, columns=['RowId', 'Location'])
    results['RowId'] = results['RowId'].astype(int)
    results.to_csv('predictions.csv', index=False)
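Looping over 27214 lookup rows works but is slow. A possible alternative, sketched here on invented toy data, is to reshape the predictions into long form and join them against the lookup table. The frames below only imitate the layout of preddf and lookup, and this is not what I submitted.

```python
import pandas as pd

# Toy predictions for two test images and two features (invented values)
preddf = pd.DataFrame({
    'left_eye_center_x': [66.0, 65.0],
    'left_eye_center_y': [37.0, 38.0],
})

# Toy lookup table with the fields used above
lookup = pd.DataFrame({
    'RowId': [1, 2, 3],
    'ImageId': [1, 1, 2],
    'FeatureName': ['left_eye_center_x', 'left_eye_center_y', 'left_eye_center_x'],
})

# Reshape predictions to long form: one row per (ImageId, FeatureName) pair
long_preds = preddf.copy()
long_preds['ImageId'] = long_preds.index + 1  # ImageId is 1-based
long_preds = long_preds.melt(id_vars='ImageId', var_name='FeatureName',
                             value_name='Location')

# Keep only the requested rows, in the order of the lookup table
results = lookup.merge(long_preds, on=['ImageId', 'FeatureName'], how='left')
results = results[['RowId', 'Location']]
print(results)
```

If the real lookup table already carries an empty Location column, it would have to be dropped before the merge to avoid a name clash.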
And we're done! Submitting this placed me halfway up the leaderboard.
Some ways to increase accuracy include experimenting with the number of
neighbors k (the default value is 5), and being smarter about data imputation
(we only filled in empty fields with the column mean; manually labeling the
images or otherwise getting more precise values would be far more accurate).
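One way to experiment with k without burning submissions is cross-validation on the training set. The sketch below uses scikit-learn's GridSearchCV on synthetic data. The candidate values, the 5 folds and the mean squared error scoring are arbitrary choices for illustration, not tuned settings for this competition.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data: 200 samples, 10 features, 2 targets
rng = np.random.RandomState(0)
X = rng.rand(200, 10)
y = np.column_stack([X[:, 0] + X[:, 1], X[:, 2] - X[:, 3]])

# Try several values of k with 5-fold cross-validation
search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={'n_neighbors': [1, 3, 5, 10, 20]},
    scoring='neg_mean_squared_error',
    cv=5,
)
search.fit(X, y)

# Best k and its cross-validated mean squared error
best_k = search.best_params_['n_neighbors']
print(best_k, -search.best_score_)
```

With the real X and y from above in place of the synthetic arrays, search.best_estimator_ could then be used in place of estim for the predictions.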