Machine Learning in Finance: From Theory to Practice
Instructor’s Manual
Matthew F. Dixon, Igor Halperin and Paul Bilokon
Matthew Dixon
Department of Applied Math, Illinois Institute of Technology
Igor Halperin
NYU Tandon School of Engineering and Fidelity Investments
Paul Bilokon
Thalesians Ltd, London
Introduction
Machine learning in finance sits at the intersection of a number of emergent and es-
tablished disciplines including pattern recognition, financial econometrics, statistical
computing, probabilistic programming, and dynamic programming. With the trend
towards increasing computational resources and larger datasets, machine learning
has grown into a central computational engineering field, with an emphasis placed
on plug-and-play algorithms made available through open-source machine learning
toolkits. Algorithm-focused areas of finance, such as algorithmic trading, have been
the primary adopters of this technology. But outside of engineering-based research
groups and business activities, much of the field remains a mystery.
A key barrier to understanding machine learning for non-engineering students and
practitioners is the absence of the well-established theories and concepts that finan-
cial time series analysis equips us with. These serve as the basis for the development
of financial modeling intuition and scientific reasoning. Moreover, machine learning
is heavily entrenched in engineering ontology, which makes developments in the
field somewhat intellectually inaccessible for students, academics, and finance prac-
titioners from the quantitative disciplines such as mathematics, statistics, physics,
and economics. Consequently, there is a great deal of misconception and limited un-
derstanding of the capacity of this field. While machine learning techniques are often
effective, they remain poorly understood and are often mathematically indefensible.
How do we place key concepts in the field of machine learning in the context of more
foundational theory in time series analysis, econometrics, and mathematical statis-
tics? Under which simplifying conditions are advanced machine learning techniques
such as deep neural networks mathematically equivalent to well-known statistical
models such as linear regression? How should we reason about the perceived bene-
fits of using advanced machine learning methods over more traditional econometrics
methods, for different financial applications? What theory supports the application
of machine learning to problems in financial modeling? How does reinforcement
learning provide a model-free approach to the Black–Scholes–Merton model for
derivative pricing? How does Q-learning generalize discrete-time stochastic control
problems in finance?
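The simplest instance of the second question above can be checked numerically. The sketch below assumes a network with no hidden layer, an identity output activation, and a mean squared error loss; under those assumptions the fitted weights and bias should coincide with the ordinary least squares coefficients. The synthetic data, variable names, learning rate, and iteration count are illustrative choices, not taken from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (hypothetical, for illustration only).
n, p = 200, 3
X = rng.normal(size=(n, p))
beta_true = np.array([0.5, -1.0, 2.0])
y = X @ beta_true + 0.3 + 0.1 * rng.normal(size=n)

# Ordinary least squares with an intercept column.
X1 = np.column_stack([np.ones(n), X])
beta_ols, *_ = np.linalg.lstsq(X1, y, rcond=None)

# A "network" with no hidden layer and identity activation, f(x) = w.x + b,
# trained by full-batch gradient descent on the mean squared error.
w = np.zeros(p)
b = 0.0
lr = 0.1  # assumed step size, small enough for this well-conditioned problem
for _ in range(2000):
    resid = X @ w + b - y
    w -= lr * (2.0 / n) * (X.T @ resid)
    b -= lr * (2.0 / n) * resid.sum()

# Stack bias and weights in the same order as the OLS coefficients.
beta_net = np.concatenate([[b], w])
max_gap = np.max(np.abs(beta_net - beta_ols))
print("largest coefficient difference:", max_gap)
```

Both procedures minimize the same convex quadratic objective, so any difference reflects only the finite number of gradient steps. Adding a hidden layer with a nonlinear activation breaks this equivalence.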
Advantage of the Book
This book is written for advanced graduate students and academics in the mathe-
matical sciences, in addition to quants and data scientists in the field of finance.
Readers will find it useful as a bridge from these well-established foundational top-
ics to applications of machine learning in finance. Machine learning is presented as
a non-parametric extension of financial econometrics, with an emphasis on novel
algorithmic representations of data, regularization and model averaging to improve
out-of-sample forecasting. The key distinguishing feature from classical financial
econometrics is the absence of an assumption on the data generation process. This
has important implications for modeling and performance assessment which are
emphasized with examples throughout the book. Some of the main contributions of
the book are as follows:
• The textbook market is saturated with excellent books on machine learning.
However, few present the topic from the perspective of financial econometrics
and cast fundamental concepts in machine learning into canonical modeling and
decision frameworks already well-established in finance such as financial time
series analysis, investment science, and financial risk management. Only through
the integration of these disciplines can we develop an intuition into how machine
learning theory informs the practice of financial modeling.
• Machine learning is entrenched in engineering ontology, which makes develop-
ments in the field somewhat intellectually inaccessible for students, academics
and finance practitioners from quantitative disciplines such as mathematics, statis-
tics, physics, and economics. Moreover, financial econometrics has not kept pace
with this transformative field and there is a need to reconcile various modeling
concepts between these disciplines. This textbook is built around powerful math-
ematical ideas that shall serve as the basis for a graduate course for students with
prior training in probability and advanced statistics, linear algebra, time series
analysis, and Python programming.
• This book provides a compact, financial-market-motivated theoretical treatment
of financial modeling with machine learning for the benefit of regulators, wealth
managers, federal research agencies, and professionals in other heavily regulated
business functions in finance who seek a more theoretical exposition to allay
concerns about the “black-box” nature of machine learning.
• Reinforcement learning is presented as a model-free framework for stochastic
control problems in finance, covering portfolio optimization, derivative pricing,
and wealth management applications without assuming a data generation process.
We also provide a model-free approach to problems in market microstructure,
such as optimal execution, with Q-learning. Furthermore, our book is the first to
present methods of Inverse Reinforcement Learning.
• Multiple-choice questions, numerical examples, and approximately 80 end-of-chapter
exercises are used throughout the book to reinforce the main technical concepts.
• This book provides Python codes demonstrating the application of machine learn-
ing to algorithmic trading and financial modeling in risk management and equity
research. These codes make use of powerful open-source software toolkits such
as Google’s TensorFlow, and Pandas, a data processing environment for Python.
The codes have been provided so that they can either be presented as laboratory session
material or used as a programming assignment.
Recommended Course Syllabus
This book has been written as an introductory textbook for a graduate course in
machine learning in finance for students with strong mathematical preparation in