Machine Learning for Big Data and Text Processing: Foundations

Machine learning is a rapidly expanding area with a diverse collection of tools and approaches. Successfully applying such methods to real tasks may seem to require expertise that many do not possess. However, all these methods share the same basic concepts and use the same building blocks.

Understanding these basics, their formulations, and when they are appropriate is key to using machine learning techniques successfully in practice. This foundational course covers the essential concepts and methods in machine learning, providing participants with the entry-level expertise they need to get started and quickly move ahead.

This course was previously titled Machine Learning for Big Data and Text Analysis.


Machine Learning for Big Data and Text Processing: Foundations may be taken individually or as a core course for the Professional Certificate Program in Machine Learning and Artificial Intelligence.

Lead Instructor(s): 

Regina Barzilay
Tommi Jaakkola
Stefanie Jegelka


Jun 8, 2020 - Jun 9, 2020

Course Length: 

2 Days

Course Fee: 





  • Open

It is highly recommended that you apply for a course at least 6-8 weeks before the start date to guarantee that space will be available. After that date, you may be placed on a waitlist. Courses with low enrollment may be cancelled up to 4 weeks before the start date. If you are able to access the online application form, then registration for that particular course is still open.

Registration for this program will close by May 8.

Participant Takeaways: 

  • Understand the basic machine learning concepts and methods, including neural networks
  • Learn how to formulate/set up problems as machine learning tasks
  • Assess which types of methods are likely to be useful for a given class of problems
  • Understand the strengths and weaknesses of learning algorithms

Who Should Attend: 

This course is appropriate for anyone seeking a better understanding of machine learning basics. It is most suitable for those with an undergraduate degree in computer science or another related technical area. A high-level understanding of programming (thinking in terms of programs) is helpful.

The foundational course covers key concepts, formulations, algorithms, and practical knowledge for people who are getting started or need to brush up on machine learning, and provides participants with the core knowledge to succeed in the advanced-level course.

Computer Requirements:

Laptops are required for this course. Tablets will not be sufficient for the computing activities performed in this course.

Program Outline: 

Mon: (5.5h)
[10:00am] Introduction to ML (1h)
[11:00am] Formulation of ML problems (1h)
[  noon ] Joint lunch, Stata Center
[ 1:30pm] Linear classification/regression (1h)
[ 2:30pm] coffee break
[ 2:45pm] Loss, regularization, gradient algorithms (1.5h)
[ 4:15pm] Tutorial on using ML packages (1h)

Tue: (6h)
[ 9:00am] Features, missing data (1h)
[10:00am] Non-linear classification (1h)
[11:00am] coffee break
[11:15am] Feed-forward neural networks: representation (1h)
[12:15pm] lunch
[ 1:45pm] Neural networks: algorithms (1h)
[ 2:45pm] coffee break
[ 3:00pm] Convolutional networks (images) (1h)
[ 4:00pm] Tutorial on using DNN packages (1h)


Course Schedule: 

See the outline above for the detailed schedule.

Class runs 10:00 am - 5:00 pm on Monday and 9:00 am - 5:00 pm on Tuesday.



This course takes place on the MIT campus in Cambridge, Massachusetts. We can also offer this course for groups of employees at your location. Please complete the Custom Programs request form for further details.