Department of Mathematics and Statistics
University of Maryland Baltimore County
1000 Hilltop Circle, Baltimore, MD 21250
Office: SOND 401
Email: jon4@umbc.edu
Short bio
I am a fourth-year PhD student applying my strong background in analysis,
optimization, linear algebra, and mathematical statistics to a serious study of
the mathematics of machine learning and data analysis. I enjoy solving problems
by using a judicious combination of math, computers, and common sense. In
addition to math, I am interested in machine learning, econometrics,
bioinformatics, biometrics, and education metrics.
Research Projects
Bird Detection
In summer 2013, I participated in a Kaggle (a data science community)
competition that involved detecting birds from audio recordings.
Interestingly, the audio was captured in a “Long-Term Experimental Research
Forest” in the Cascade mountain range of Oregon. My simple statistical models
(multinomial logistic regression) and machine learning algorithms (random
forests) did not perform well on the noisy variables that we extracted from
the data. Still, I learned a lot about setting up a Python + MySQL
development environment on Windows and Linux and about leading and working
with a team of highly competent developers on a fast-paced, deadline-driven
research project. I also got hands-on experience writing SQL and using Python
machine learning packages such as scikit-learn.
Bovine Lameness Detection
This 2012 research project involved classifying cows as lame or sound by
using 3D time-series data from a scale that cows walk across. Despite the
difficulties of noisy high-dimensional data and misbehaving cows/equipment,
we were able to report success in the Phase I USDA trial as of August 2012.
My development of an efficient data handling system and innovative heuristic
classification algorithms was critical to this success.
The project was funded by a USDA grant to Dr. Uri Tasch of UMBC's Mechanical
Engineering Department, who contracted my time through CIRC.
I presented this research at the 2013 CS&E conference in Boston.
Fraud Detection
In summer 2011, I took an internship at the United States Financial
Industry Regulatory Authority (FINRA). My task was to study the suitability of
automated fraud detection techniques, specifically Benford's Law, for use at
FINRA. It turned out that Benford's Law could not be used directly because
the data did not satisfy certain assumptions. Undeterred, I
developed a novel fraud detection technique in the spirit of Benford's
Law that was applicable to FINRA data. This technique successfully discovered
many data anomalies that warrant further investigation.
Further, I studied Bernie Madoff's $65,000,000,000 Ponzi scheme and compiled
a set of techniques capable of detecting a future fraud of that type. I did
not have access to the IT system containing production data, so some lucky
future researcher will get to apply my techniques to catch bad guys in real
time.
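For readers unfamiliar with it, the classical first-digit form of Benford's
Law says that in many naturally occurring data sets the leading digit d
appears with probability log10(1 + 1/d). The sketch below illustrates only
that textbook test. It is not the technique developed at FINRA, and the
function names and sample data are invented for the example.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading nonzero digit of a nonzero number."""
    # Scientific notation puts the leading digit first, e.g. '4.567...e+03'.
    return int(f"{abs(x):.17e}"[0])


def benford_chi_square(values):
    """Pearson chi-square statistic of observed first digits vs. Benford."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


# Hypothetical sample data: powers of 2 are known to track Benford's Law
# closely, while the integers 100..999 have uniform first digits.
powers = [2 ** k for k in range(1, 201)]
uniform = list(range(100, 1000))

# With 8 degrees of freedom, the 5% critical value is about 15.51.
print("powers of 2:", round(benford_chi_square(powers), 2))
print("uniform:    ", round(benford_chi_square(uniform), 2))
```

A statistic well above the critical value suggests the data do not follow
the Benford distribution. Whether that points to an anomaly or simply to data
that never satisfied the law's assumptions has to be judged case by case.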
Quotes
“Whereas Computer Science has focused primarily on how to manually program
computers, Machine Learning focuses on the question of how to get computers
to program themselves”
–Tom M. Mitchell, CMU, July 2006
“... Out of such utilitarian concerns will emerge general principles,
including mathematical ones. A typical and generic problem is to describe a
manifold and its inherent and possibly low-dimensional geometry, when it is
presented through noisy data embedded in a high-dimensional space. If we
have had four centuries of physically based and motivated mathematics, it does
not seem a stretch of the imagination to assume that we will have one or more
centuries of mathematics based on the organization of data and the
intelligence to be derived from it, perhaps to be named the mathematics of
knowledge and intelligence. Mathematics and pure mathematicians have a long
tradition of exploring the issues of data, intelligence, noise and meaning.
The classical works of Kolmogorov and of Shannon illustrate this point. The
future is bright for an expansion of this type of inquiry.”
–J. Glimm, Bulletin of the AMS, Jan. 2010
Support
Teaching Assistantship (MATH 251: Multivariable Calculus) through the Department of Mathematics and Statistics.
Education
Doctor of Philosophy: Applied Mathematics [in progress], UMBC
Master of Science: Applied Mathematics, UMBC
Bachelor of Science: Mathematics, UMBC
Bachelor of Science: Physics, UMBC
“The formulation of a problem is often more essential than its solution, which
may be merely a matter of mathematical or experimental skill.” –Albert Einstein