Data Science and Machine Learning in Finance
Addressing real-world economic and financial problems using the information embedded in data is an active area of academic and professional interest. This course contributes towards this goal in two ways. First, the course provides a foundation for methodically structuring large-dimensional datasets and summarising information into interpretable outcomes.
The second part of the course examines how large datasets, combined with statistical learning and artificial intelligence techniques, are helping practitioners to make more efficient economic and financial decisions. The course is delivered through a balanced combination of (1) descriptive content required to formulate financial and economic problems as quantifiable objects of interest, (2) analytical derivations and statistical techniques and (3) software programming.
The course is structured around four pillars. All pillars are equally weighted in terms of course contents and examinations, and in terms of their importance for developing a foundation for future careers in data science and machine learning in finance, within or outside academia:
- Methodological Frameworks
- Software Implementations
- Finance Theory and Applications
- Data Science: Theories and Implementations
- Regression Analysis: Theory
- Regression Analysis: Implementations and Applications
- Machine Learning: Theory
- Machine Learning: Implementations and Applications
There is no formal prerequisite; however, a prior background in calculus, statistics and regression is favourable. The course assumes familiarity with estimation and inference in the least squares framework, as covered in the semester-one courses.
The course is delivered via weekly sessions and four tutorial workshops. There are three practice problem sets with solutions to further illustrate theories and implementations, followed by three assessment assignments outlined in the semester timetable below. The timetable below is subject to change; please review it on a weekly basis:
Friday 9-10 am (Gilbert Scott Building)
Course Tutorials and GTA Support
You are expected to have covered the material ahead of the tutorials. There are two weekly tutorial classes delivered by the following course GTAs, starting in week 3. The schedule will be posted on MyGlasgow.
The classes are arranged to practise analytical problem sets. The first two weeks provide a brief summary of matrix calculus and statistical inference:
- Hadi Movaghari
- (i) Mondays 9-10, (ii) Thursdays 4-5, (iii) Fridays 5-6
- TA Office Hours
The classes are arranged to build up computational foundations to work with data and methodological frameworks:
- Tongtong Wang
- (i) Mondays 4-5, (ii) Mondays 5-6, (iii) Thursdays 5-6
- TA Office Hours Fridays 1-2pm
Financial Datasets and Empirical Exercises
The course contents, practice problem sets and assessment components are based on real-world financial data. It is a requirement that all class participants set up their accounts with the data platforms described below:
- Register for an account with Financial Analysis Made Easy (FAME) via the university library and, additionally, with Wharton Research Data Services (WRDS) directly on its platform, using your university email address.
- This registration is then activated by the business database administration within one week. Please initiate the registration in the first week of the course before we progress towards further course contents and assignments.
- Key statistics and learning outcomes arising from the data-related activities will be part of the exam. Treat the empirical exercises as an essential part of the learning experience.
- As a financial analyst or a research financial economist, you will work with the very same data providers repeatedly. Developing an understanding of the empirical counterparts of theories will be an important takeaway for future careers in finance.
Software Packages and Implementations
Computational and methodological frameworks are implemented in Matlab. A spreadsheet tool is also needed for supplementary data transformation and visual inspection, e.g. LibreOffice, AWK or Excel (with the Analysis ToolPak and Solver Add-in packages enabled). Please make sure you have set up both packages during the first week of the course so that you can practise exercises, replicate examples and complete assignments.
All course material and exercises are designed so that the learning outcomes can be achieved on any computer. However, you may also wish to explore the following options to enhance computational capacity and to familiarise yourselves further with professional computing systems:
- University HPC: Access to HPC machines is provided for research and education purposes. You will be able to access these resources depending on your computational requirements.
- Google Cloud: The machine template is KX8765D. You will need to set up a new machine following this template ID, which provides a limited free service for the purpose of the class exercises.
The course summative assessment comprises the following four components:
- Quiz (15%) will be made available during week 4 via Moodle. The quiz can be started within a 24-hour window and, once started, must be completed within 60 minutes. This is an individual assessment and only one attempt is allowed. The quiz comprises multiple-choice questions covering course contents from the first four weeks, including methodological learning outcomes and key facts and statistics arising from the numerical and empirical exercises.
- Group Assignment (25%) includes a problem sheet requiring methodological derivations and numerical computations, followed by interpretation of results. The problem sheet will be posted in mid-February.
- Degree exam in April/May (60%): The final exam will be an individual assessment covering all course contents during the semester including key facts and statistics arising from empirical exercises, class reports and commentaries, methodological derivations and computations. Information regarding the final examination will be released towards the end of the semester.
- Grading is based on meeting the course intended learning outcomes examined in each assignment and follows the University's Schedule A. Grades are awarded based on both the input and the output presented in each part, so the intermediate steps building up to an overall answer are required and graded.
- Problem sets and assignments require accessing real-world financial data from the professional platforms; class participants are therefore required to register and activate their accounts with the data providers by following the information provided.
Answers to the assignments will be provided in the week after the deadline, once all submissions have been received. Aside from the assessed assignments indicated above, the course includes two practice problem sets with solutions. These are distributed to practise theories and implementations during the semester. Students are expected to attend the office hours and tutorial workshops to review specific queries.
Past exam papers are available via the university portal. These can serve as a basis for preparation; note, however, that the exam and course contents are subject to change on an annual basis.
Textbook and Reading List
- Applied Data Science: Lessons Learned for the Data-Driven Business, by Braschler, Stadelmann, Stockinger, Online version available via the university library
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Hastie, Tibshirani, Friedman, Online version available via the university library
- Machine Learning in Business: An Introduction to the World of Data Science, by John Hull
- Software Handout
- MATLAB: A Practical Introduction to Programming and Problem Solving, by Stormy Attaway, Online version available via the university library
Further to the textbooks, journal articles will be cited throughout the course. Articles indicated as 'required reading' should be studied in conjunction with the textbook reading and form part of the assessments: Reading List.