Kenneth Heafield
<r at kheafield.com>
Language Technologies Institute
http://kheafield.com
Carnegie Mellon University
5000 Forbes Ave NSH 4502
Pittsburgh, PA 15213
Interests
Natural language processing, machine learning, theoretical computer science, distributed systems
Education
PhD program, Carnegie Mellon
August 2008-
Language Technologies Institute in the School of Computer Science
Bachelor of Science, Caltech
September 2003-March 2007
Double major in Mathematics and Computer Science
3.8/4.0 GPA, graduation with honors
Skills
Languages
Written extensively in C++, C, Ruby, SQL, UNIX Shell, LaTeX, and HTML
Software
Linux, Hadoop, PostgreSQL, MySQL, Apache, Octave, Matlab, Gnuplot, and GTK
Hobbies
Volunteer system administrator for a 1,900-user Linux cluster
Experience
Google
March 2007-August 2008
As a Software Engineer with Google Book Search, I worked on a team that uses machine learning to compile card catalogs from multiple sources into a single coherent catalog of books. Previously, I created the scoring system behind a search function in Picasa Web Albums. To share Google's approach to distributed systems, I lectured at MIT on the Hadoop MapReduce framework.
Infosys Technologies
July-September 2006
I traveled to Bangalore, India to intern with the research division of Infosys, India's second-largest software outsourcing company. The goal was to automatically organize source code files into a meaningful directory structure. I investigated a technique based on semantic information from the names of functions, local variables, and files. To derive topics from this information, I elected to use Latent Dirichlet Allocation and adapted it to the domain of source code. As an example, it was able to identify both SSL and logging topics in Apache and correctly label a file covering both topics. Our results were presented at the 2008 India Software Engineering Conference.
Reference: Dr. Girish Rama <Girish_Rama at infosys.com>
Fastsoft
January-April 2006
Netlab spun off a startup and I worked for them as a part-time contractor. Using FAST TCP, the Netlab algorithm responsible for breaking Internet speed records, their Aria product accelerates connections passing through it. This allows senders to use high performance networks more efficiently without custom operating systems. I set up experiments and worked on the performance monitoring and configuration interface.
Reference: Prof. Steven Low <slow at caltech.edu>
Netlab
June 2005-June 2006
As a Richard and Dena Krown Summer Undergraduate Research Fellow, I developed an error model for kernel Principal Component Analysis (kPCA). Professor Low hired me to continue with implementation during the school year. I applied it to identify possible attacks in network traffic, which appear as points with unusually high distance from the manifold learned by kPCA.
Reference: Prof. Steven Low <slow at caltech.edu>
Galaxy Evolution Explorer
June 2004-March 2007
I started working for the Galaxy Evolution Explorer (GALEX) project as a Summer Undergraduate Research Fellow. My goal was finding variable stars and asteroids in observations made by their satellite. To do so, I created a database of all 193 million source measurements and used it to find and analyze over ninety variable objects. The findings were reported in two posters and one journal article. After the summer, they hired me to continue working on the database and to help scientists find interesting data.
References: Dr. Mark Seibert <mseibert at srl.caltech.edu> and Prof. Chris Martin <cmartin at srl.caltech.edu>
Awards
National Science Foundation Graduate Research Fellowship
2008-
$121,500 in stipend and tuition over three years
International Collegiate Programming Contest Regional
2006-07
Placed third of fifty, competing as a team of two instead of three
Carnation Scholarship
2005-06
Caltech full tuition academic merit scholarship, 38 awarded per year
Richard and Dena Krown Summer Undergraduate Research Fellowship
2005
$5,000 for ten weeks of summer research
Summer Undergraduate Research Fellowship
2004
$5,000 for ten weeks of summer research
Publications
Publications and unofficial transcript are available at http://kheafield.com/professional/.