This course teaches a suite of algorithms and concepts to a diverse set of participants interested in fitting data to models. It starts with mostly simple linear algebra and computational methods and introduces more difficult mathematical concepts towards the end. This progression is designed to fit our format of morning lectures and afternoon practice on personal computers. The combined teaching system provides ample hands-on learning, and participants leave the course with practical knowledge of the basic algorithms.
The course is very broad and is primarily intended to cover the fundamentals of each technique we address. Consequently, the major gain is that we can cover many different approaches. Think of it this way: we cover the first chapter or two of a specialized "book" on a given method. We therefore get you through the many fundamentals, which then allow you to dig further through the book on your own. Another way of thinking of our approach is the analogy of a carpenter’s tools—the goal is for participants to understand the utility of each tool and not to become specialists in any one method. In that sense the course is introductory and general.
The course taps into material from a very wide selection of literature in many disciplines involving computation, including but not limited to: statistics and applied mathematics, science, engineering, medicine and biomedicine, computer science, geosciences, system engineering, economics, insurance, finance, business, and aerospace engineering. More specific areas in which you might come across relevant books are regression, non-linear regression, linear and non-linear parameter estimation, inversion, system identification, econometrics, and biometrics. The diversity of past participants and their fields has always provided many perspectives on our common interest in data and models. Please note that we do not specifically cover non-parametric statistics, principal component analysis, or Big Data.
You will be able to take the afternoon lab exercises along with you as executables so you can practice the course material at a later time. These algorithms are not intended as a stand-alone package to be used later in regression applications; they are simply given to participants to aid in the course instruction.
Laptops for which you have administrative privileges are required for this course. PCs are recommended. Tablets will not be sufficient for the computing activities performed in this course.
Participants are encouraged to study a basic text prior to attendance. Two suggestions are:
- Data Reduction and Error Analysis for the Physical Sciences, P. R. Bevington and D. K. Robinson, McGraw-Hill, Inc., 2nd ed., 1992.
- Applied Regression Analysis, N. R. Draper and H. Smith, John Wiley and Sons, Inc., 2nd ed., 1981.
This course was previously titled "Data and Models in Engineering, Science, and Business."
We highly recommend applying at least 6-8 weeks before the start date to guarantee that space is available; later applicants may be placed on a waitlist. Courses with insufficient enrollment may be cancelled up to 4 weeks before the start date. If you are able to access the online application form, then registration for that particular course is still open.
Takeaways from this course include:
- Examining how to fit data to models
- Defining linear least squares, non-linear least squares, singular value decomposition, sensitivity analysis, experiment design, and parameter error estimation
- Appreciating grid search, random search, simulated annealing, genetic algorithms, neural networks, and large inverse systems
- Investigating principles leading to rapid application of methods
- Evaluating the results of pre-programmed computer exercises
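As a rough illustration of the first two takeaways (this is not the course software), the sketch below fits a straight line by linear least squares using a singular value decomposition and then estimates the parameter errors. It assumes Python with NumPy; the function name `fit_line` and the sample data are hypothetical and chosen only for this example.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares straight line y = a + b*x with 1-sigma parameter errors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: a column of ones (intercept a) and a column of x (slope b).
    G = np.column_stack([np.ones_like(x), x])
    # Solve through the SVD: G = U S V^T, so m = V S^-1 U^T y.
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    m = Vt.T @ ((U.T @ y) / s)
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    resid = y - G @ m
    sigma2 = resid @ resid / (len(x) - 2)
    # Parameter covariance: sigma^2 (G^T G)^-1 = sigma^2 V S^-2 V^T.
    cov = sigma2 * (Vt.T / s**2) @ Vt
    return m, np.sqrt(np.diag(cov))

# Hypothetical noisy data scattered around a straight line.
x = np.arange(5.0)
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])
params, errors = fit_line(x, y)
print("intercept, slope:", params)
print("1-sigma errors:  ", errors)
```

The error estimate assumes independent data errors of equal, unknown variance, which is inferred here from the residuals of the fit.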
Who Should Attend:
This course is ideal for anyone who fits data to models. This course is truly broad-based and participants from vastly differing fields are envisioned and encouraged to attend. Some of these fields are engineering, business, natural sciences, geoscience, medicine, statistics, and economics.
Familiarity with computing and statistics is desirable. A fair background in linear algebra is highly recommended. The course is a condensed version of a regular MIT class with the same title, taught by Professor Morgan. The course has also been given at NASA, the University of the West Indies in Barbados, Sakarya University in Turkey, Stanford University, University of Science and Technology of China, the Cyprus Institute, and Texas A&M University.
Recent and past participants in this course have come from: Air Force Office of Scientific Research (AFOSR), Amgen Inc., AT&T, BAE Systems, Bank of America, Boeing, Boehringer Ingelheim Pharmaceuticals, BP America, Cisco Systems, Cox Communications, Delphi, Draper Laboratory, Dupont, EMC, Environmental Protection Agency, ExxonMobil Chemical, General Motors, Hitachi (Japan), Intel, Johnson & Johnson, Korea Power Co., Kraft Foods, Los Alamos Labs, Mathworks, Mayo Clinic, Merck & Co Inc, Merrill Lynch, Motorola, Naval Research Laboratory, New York University, NTT (Japan), Nokia Research Center, Phillips Exeter Academy, Philips North America, Pioneer Investments, Polaroid Corporation, Salesforce, Sandia National Labs, Saudi Arabian Monetary Agency, Toshiba Corporation, University of Pennsylvania, University of the West Indies, the U.S. Air Force, and Verizon Wireless.
The format of each day is generally the same: mornings are devoted to lectures while participants spend the afternoons running pre-programmed software based on the morning lectures. During the afternoons, we stop the class often to have a discussion of progress and to give helpful tips and suggestions. Participants can work singly or in pairs at the computer.
Individual lectures will address the following topics:
- Philosophy of Data and Models
- Straight Line Data Analysis
- Least Squares
- Levenberg-Marquardt and Ridge Regression Algorithms
- Damped Least Squares Comparison
- Stochastic Inverse
- Singular Value Decomposition
- Random and Grid-Search Methods
- Simulated Annealing and Genetic Algorithms
- Neural Networks
- Parameter Error Estimates
- Large Inverse Problems
- Experimental Design
Note that the order of the lectures can vary from that given above. A bound copy (and an electronic version) of all PowerPoint lecture notes is given to each participant, to follow lectures and make notes.
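To give a flavor of the damped least squares and ridge regression topics listed above, here is a minimal sketch, not taken from the course materials. It solves the damped normal equations with NumPy; the function name `damped_least_squares`, the damping values, and the small test system are assumptions made for illustration.

```python
import numpy as np

def damped_least_squares(G, d, damping):
    """Minimize ||G m - d||^2 + damping^2 * ||m||^2 (ridge regression)."""
    G = np.asarray(G, dtype=float)
    d = np.asarray(d, dtype=float)
    n_params = G.shape[1]
    # Damped normal equations: (G^T G + damping^2 I) m = G^T d.
    A = G.T @ G + damping**2 * np.eye(n_params)
    return np.linalg.solve(A, G.T @ d)

# Hypothetical system: three observations, two model parameters.
G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
d = np.array([1.0, 2.0, 3.0])

for damping in (0.0, 0.5, 1.0):
    m = damped_least_squares(G, d, damping)
    print(f"damping={damping}: m={m}, |m|={np.linalg.norm(m):.3f}")
```

With zero damping this reduces to ordinary least squares; increasing the damping shrinks the model norm at the cost of a larger data misfit, which is the trade-off that damped methods manage.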
Class runs 9:00 am to 5:00 pm Monday-Friday.
9:00 am - 12:00 pm - Lecture
12:00 pm - 1:00 pm - Lunch Break
1:00 pm - 5:00 pm - Lab Exercises
DEPUTY CHIEF SCIENTIST, AIR FORCE OFFICE OF SCIENTIFIC RESEARCH
“The course efficiently provided a broad understanding of a wide variety of methods to a very varied and interesting group of students.”
ASSOCIATE PROFESSOR, UNIVERSITY OF THE PACIFIC
“Course was well designed. Lab work was very helpful. Application to real-world problems was well illustrated.”
ELECTRICAL & CONTROLS ENGINEER, BP AMERICA
"I enjoyed the courses taken at MIT this summer. They combined a large amount of theory with lab work in an accelerated fashion. These courses have been the best post-bachelor's courses I have taken thus far."
POSTDOCTORAL RESEARCH FELLOW, BRIGHAM AND WOMEN'S HOSPITAL
“I found it to be a very stimulating and exciting environment. I felt that the instructors were very knowledgeable in the area and were willing to discuss issues related to applications beyond the classroom. Overall, I would attend courses at MIT Professional Education - Short Programs in the future and would recommend the program to colleagues.”
SENIOR MECHANICAL ENGINEER, BAE SYSTEMS
“The lab portions of the class were thoughtfully planned and very instructive.”
PROGRAM MANAGER, UNIVERSITY OF ARKANSAS FOR MEDICAL SCIENCES
“The instructors were excellent, and the in-lab reviews with other participants were enlightening.”
ENGINEERING SPECIALIST, BAXTER HEALTHCARE
“To remain competitive, we need to implement process improvements across the organization to greatly improve efficiency while simultaneously increasing the robustness and efficacy of our products. This knowledge provides additional tools to accomplish this.”
DIRECTOR OF RESEARCH AND DEVELOPMENT, PRESCIENT RIDGE MANAGEMENT
“Definitely a good balance between the lectures in the morning which gave the theory and the labs in the afternoon which allowed time to work on the practical application.”
SIMULATION CONSULTANT, DEMATIC
“The knowledge will help me to better analyze customer data and build more sophisticated simulation models.”
Frank Dale Morgan obtained his BSc (Math/Physics, 1970) and his MSc (Theoretical Solid State Physics, 1972) from the University of the West Indies, Trinidad, where he was a Lecturer in Physics from 1970 to 1975. From 1975 to 1981, he completed a PhD in Geophysics at MIT. He returned to the University of the West Indies, Trinidad, as a Research Fellow in the Seismic Research Unit. From 1983 to 1985 he was a Research Associate in the Geophysics Department at Stanford University. In 1985, he joined the faculty of the Geophysics Department at Texas A&M University. He is now a Professor of Geophysics at the Massachusetts Institute of Technology in the Department of Earth, Atmospheric, and Planetary Sciences and is associated with the Earth Resources Laboratory. His current interests are in rock physics, geoelectromagnetism, applied seismology, inverse theory, environmental and engineering geophysics, electrochemistry, and electronic instrumentation. He teaches courses on the physics and chemistry of rocks, environmental and engineering geophysics, alternative energy, and inverse theory. He is the organizer and principal instructor for the course.
Darrell Coles obtained his BA in Pure Mathematics from the University of Rochester (1994), and his MSc in Geosystems (1998) and PhD in Geophysics (2008) from the Massachusetts Institute of Technology. He worked from 2008 to 2010 in Scotland as a research geophysicist in a joint posting between Total Exploration & Production and the GeoSciences School of the University of Edinburgh on cutting-edge theory and application of the design of industrial seismic experiments and 4D seismic data analysis (leading to an international patent and several peer-reviewed articles). Since 2010, he has worked as a senior research geophysicist at Schlumberger in Cambridge, MA and Houston, TX. His research efforts have been in optimal experimental design for industrial-scale geoscientific applications, inverse and optimization theory, and uncertainty characterization and control, all in the context of seismic data acquisition and analysis. He has obtained several additional patents and written several peer-reviewed publications since joining Schlumberger and is currently branching into commercial software development and data science.
Rama Rao is currently Senior Director and Head of Risk Analytics at PayPal. He leads a team of data analysts who monitor business performance and perform the analytics that go into creating PayPal’s risk policies around the world: the boundaries within which users can transact and experience PayPal. Rama has held various analytics roles within PayPal over the last five years, has led several innovations in business analysis, and has helped build out PayPal’s risk analytics function in India. Prior to PayPal, Rama was at MIT for nine years, where he led a research program, funded by an international consortium of oil majors and service companies, working on innovative uses of acoustic measurements to image and locate hydrocarbons. During this time, Rama taught a fall graduate course in data analytics along with Prof. Dale Morgan. Rama also spent a year at McKinsey, where he worked on client initiatives aimed at creating new businesses that leverage existing assets and innovations. Rama continues to visit MIT every summer to teach this course. Rama completed his undergraduate studies at the Indian Institute of Technology, Madras, followed by dual Masters and a PhD at MIT.
This course takes place on the MIT campus in Cambridge, Massachusetts. We can also offer this course for groups of employees at your location. Please complete the Custom Programs request form for further details.
|Fundamentals: Core concepts, understandings, and tools|75%|
|Latest Developments: Recent advances and future trends|25%|
|Lecture: Delivery of material in a lecture format|40%|
|Discussion: Guided discussion reinforcing lectures and computer lab work|15%|
|Labs: Demonstrations, experiments, simulations|45%|
|Introductory: Appropriate for a general audience|30%|
|Specialized: Assumes experience in practice area or field|50%|
|Advanced: In-depth explorations at the graduate level|20%|