Instructor: Luís Pedro Coelho <lpc@cmu.edu>
Class Number: 98-111
Meeting Times: 6.30PM SH 220
The class is divided into three modules: Programming and Software Carpentry; Scientific Programming; and Applied Topics. The class is also split into two sessions: a lecture and a lab session. The lecture covers general-interest material. The lab session also follows a lecture format, but covers specific technologies and methods.
There is no official textbook for the course, but Python Programming: An Introduction to Computer Science by John Zelle is a good Python introduction. Also, you can check out Dive Into Python, which is available for free online or (in printed form) in the CMU library. Another quality online resource is A Byte of Python.
Grades will depend on the homeworks and the final project. For a passing grade, you need 7 (out of 10) on at least 7 homeworks (there will be at least 9 homeworks) and a 7 on the final project. Homeworks will be assigned on Tuesdays and are due the following Tuesday at midnight. Late penalties: up to 24 hours late, 20% penalty; from then until the beginning of the lab session, 30% penalty. Thursday sessions will often include discussion of the homeworks. Therefore, once the lab session has begun, you can no longer turn in homeworks for grading.
Homework will normally consist of a multiple-choice/short-answer section plus a programming question. Homeworks are to be turned in by email. Answers to the questions should either be typed directly in the body of the email (text-only emails, please) or attached as a text (.txt) or PDF file; the programming assignment should be attached as a .py file. In particular, doc, docx, and odt files will not be accepted.
This is what I expect that students will have learned by the end of the course:
- Software carpentry: source control, unit testing, profilers. Students should know how to use Subversion, nosetest, and the Python profiler as well as understand the concepts behind these tools (which will enable them to use a different implementation of the same ideas).
- Modern programming paradigms: object-oriented programming. Students should know how polymorphism works and understand when and why it can be useful.
- Students should know how floating-point numbers are represented and their limitations. Detailed knowledge of specific formats is not required.
- Technologies: Python, including numpy. Students should be able to comfortably write medium-sized programs (a few thousand lines of code) using these technologies in an effective way.
Most importantly, I expect students to be (and consider themselves) more efficient programmers. The overall goal is a twenty percent improvement in productivity.
This outline reflects my current thinking on the course. The general topics are fixed (except for the last module, which is flexible by design), but the content of each lecture might still be tweaked.
Overview of the course, course policies. Principles of readable code.
This part is only for people who do not yet know how to program; it teaches the basics.
What is Python? Basic types and iterations.
Introduction to object-oriented code.
More Python.
Exception handling. Defensive programming (the assert statement).
Odds & ends of Python 2. In-class exercises on Python.
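As a rough illustration of the distinction drawn in this lecture (a sketch only, not course material; the function names `mean` and `parse_number` are hypothetical): an assert guards against programming errors, while exception handling deals with failures that can legitimately happen at run time, such as malformed input.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # Defensive check: calling this with no values is a bug in the caller,
    # so we fail loudly instead of returning a meaningless result.
    assert len(values) > 0, "mean() requires at least one value"
    return sum(values) / float(len(values))


def parse_number(text):
    """Convert text to a float; return None if it is not a valid number."""
    try:
        return float(text)
    except ValueError:
        # Bad input data is an expected situation, not a programming error,
        # so it is handled rather than asserted against.
        return None


readings = [parse_number(t) for t in ["1.5", "2.5", "n/a"]]
valid = [r for r in readings if r is not None]
print(mean(valid))
```

Note that assert statements are skipped when Python runs with the `-O` flag, so they should check internal assumptions rather than validate user input.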
Using the shell.
Version control: Subversion.
Defensive Programming & Unit testing (nosetest).
Debugger & Profiler.
We cover both basic issues of scientific computing (floating-point representation) and basic algorithms for scientific tasks (in particular, numerical optimisation).
Binary representation. How integers are represented in memory (positive and negative numbers), integer sizes. Discussion of fixed-point vs. floating-point. How floating-point numbers are represented internally. IEEE numbers, NaNs, Infs.
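The following sketch shows how these representations can be inspected from Python. It assumes that Python floats are IEEE 754 double-precision numbers (true on essentially all current platforms); the helper names `twos_complement` and `float_bits` are hypothetical.

```python
import math
import struct


def twos_complement(value, width):
    """Return the two's-complement bit pattern of an integer as a string."""
    mask = (1 << width) - 1
    return format(value & mask, "0{}b".format(width))


def float_bits(x):
    """Split a double into its (sign, exponent, fraction) bit strings."""
    (as_int,) = struct.unpack(">Q", struct.pack(">d", x))
    bits = format(as_int, "064b")
    return bits[0], bits[1:12], bits[12:]


# Negative integers: -1 in 8 bits has every bit set.
print(twos_complement(-1, 8))

# 1.0 is stored with sign 0, biased exponent 1023 and a zero fraction.
print(float_bits(1.0))

# Most decimal fractions are not exactly representable in binary,
# so compare with a tolerance instead of with ==.
print(0.1 + 0.2 == 0.3)
print(abs((0.1 + 0.2) - 0.3) < 1e-9)

# Special values: NaN is not equal to anything, including itself.
nan = float("nan")
inf = float("inf")
print(nan == nan, math.isnan(nan), inf > 1e308)
```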
Using arrays to write code, reductions and broadcasting.
Feb 24
Reformulating your problems as an optimisation problem. Limitations of this approach. Newton’s method. Gradient descent.
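To make the two methods concrete, here is a minimal one-dimensional sketch. The test function f(x) = x^2 + e^x, the step size, and the tolerances are illustrative assumptions, not taken from the lecture. Gradient descent uses only the first derivative; Newton's method, applied here to minimisation, also uses the second derivative.

```python
import math


def df(x):
    """First derivative of the example function f(x) = x**2 + exp(x)."""
    return 2.0 * x + math.exp(x)


def d2f(x):
    """Second derivative of f; always positive, so f is convex."""
    return 2.0 + math.exp(x)


def gradient_descent(df, x0, step=0.1, tol=1e-10, max_iter=10000):
    """Repeatedly move against the gradient by a fixed step size."""
    x = x0
    for _ in range(max_iter):
        g = df(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x


def newton(df, d2f, x0, tol=1e-10, max_iter=100):
    """Newton's method for minimisation: solve df(x) = 0 using curvature."""
    x = x0
    for _ in range(max_iter):
        g = df(x)
        if abs(g) < tol:
            break
        x -= g / d2f(x)
    return x


x_gd = gradient_descent(df, 1.0)
x_newton = newton(df, d2f, 1.0)
print(x_gd, x_newton)
```

Both methods should agree on the minimiser for this convex function; Newton's method typically needs far fewer iterations, at the cost of requiring second derivatives.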
Feb 26
Thinking about memory allocation and temporaries. Projects
Mar 3
Pseudo-random numbers. Issues with stochasticity. Metropolis-Hastings Algorithm
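A minimal sketch of the idea, under stated assumptions: the target is an unnormalised standard normal density, and the proposal is a symmetric uniform random walk, so the Hastings correction cancels and this is the plain Metropolis special case. The function names, step size, and seed are hypothetical choices for illustration.

```python
import math
import random


def log_target(x):
    """Log of an unnormalised standard normal density."""
    return -0.5 * x * x


def metropolis_hastings(log_p, n_samples, x0=0.0, step=1.0, seed=0):
    """Draw correlated samples from the density exp(log_p)."""
    # A fixed seed makes the pseudo-random run reproducible.
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Symmetric proposal: a uniform random-walk step around x.
        proposal = x + rng.uniform(-step, step)
        log_ratio = log_p(proposal) - log_p(x)
        # Accept with probability min(1, p(proposal) / p(x)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        # On rejection the current state is recorded again.
        samples.append(x)
    return samples


# Discard an initial burn-in period before computing statistics.
samples = metropolis_hastings(log_target, 50000)[1000:]
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)
```

Because consecutive samples are correlated, the estimates converge more slowly than they would with independent draws; this is one of the issues with stochastic methods that the lecture title alludes to.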
Mar 5
Introduction to OpenOpt. Discussion of possible projects.
Homework: Students should submit a project proposal (or choose from the instructor proposed projects).
Mar 17
Discussion of file formats, encodings. Basic syntax of regular expressions.
Mar 19
Mar 24
Publishing code is often part of the publication process, with benefits for both the community and the author. In this lecture, we focus on what makes a good, reusable software package. We also go over open-source licenses. Reproducible research.
Mar 31
This final section consists of more advanced topics. No homeworks will be assigned, as students should be working on their projects. The topics covered in this module are open to change based on student interests.
Simple design principles behind an effective graphical user interface.
Tools for building a user interface: PyQt.
Organising large quantities of data using a relational database.
How to build a database.
Buffer time for overflow from other lectures.
Tools for interfacing Python with C/C++/Fortran/R/...