This is the beginning of a collection of useful how-to information and resources for Berkeley faculty interested in offering all or part of a course using the EdX platform and technology. Contact if you'd like to contribute to this Wiki and are not in the EECS Dept. (people with EECS logins already have edit access).


Adapting an existing on-campus course for BerkeleyX (i.e. into an edX MOOC or SPOC) is a condensed HOWTO with pointers to more details in each area of course creation and delivery.

Hacking OpenEdX, and Tools to Help Work With edX Courses

This information is for developers who want to create custom functionality for a particular course or research project (see below) based on the edX platform.

  • The edx-dev group is for people across campus hacking edX directly. We meet weekly and sometimes there's free food. The one requirement is that you sign up to present something useful at a meeting.
  • The developer-facing wiki on GitHub has info on installing the platform and documentation on its evolving APIs.
  • The ReadTheDocs documentation covers both how to create a course and how to get and work with learner data; you can browse it online, download it as a PDF, or download it as an ebook.
  • Sef Kloninger's Five Ways to Extend edX gives a good overview of the different ways to think about customizing the platform.
  • All About Autograding has some specific info on building an external “custom grader” (one of the Five Ways described in Sef's overview, sometimes referred to as an “external grader” or “external checker” in the official edX developer documentation).
  • RuQL is a Ruby-embedded domain-specific language for creating quiz questions that can be exported to edX, to a printable quiz, or to AutoQCM, an open-source tool that lets you create scantron sheets and grade scanned filled-in scantron sheets.

Doing Research on MOOC Data

Generally speaking, to do research on MOOC data you must:

  • Be a Berkeley Principal Investigator, or a Berkeley student working with a Berkeley PI, or a Visiting Scholar or Visiting Industrial Fellow/Researcher working with a Berkeley PI
  • Have completed Human Subjects training. Berkeley-affiliated researchers can do this online: follow the instructions for "Online Training" on this page.

In general, an edX MOOC produces three different data sources:

  • MySQL databases store info about students, their grades, and earned certificates.
  • The clickstream log (or event log) records an event for every server-side or client-side (typically JavaScript) action. Each event is a JSON object. There is a folder for each server and, under it, a file for each calendar day, so collecting all events for a given day means reading that day's file in every server folder. The number of events is very large, so most applications pre-filter the log with a script or a simple tool such as grep.
  • The MongoDB databases store the edX forum data (discussion boards): the full text of posts, replies, and so on. Importing them into Mongo is straightforward using mongoimport.
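
As a rough illustration of the pre-filtering step for the clickstream log, here is a minimal Python sketch. It makes several assumptions that are not confirmed by this page: that each log file holds one JSON object per line, that the file name for a day contains the date string (for example 2018-02-28), that files are uncompressed, and that each event carries an "event_type" field. The function name events_for_day and the paths are hypothetical; adjust them to the layout of the data you actually receive.

```python
import json
from pathlib import Path
from typing import Iterator


def events_for_day(root: Path, day: str, event_type: str) -> Iterator[dict]:
    """Yield events of one type for one calendar day across all server folders.

    Assumed layout (hypothetical): <root>/<server>/<file containing the day string>,
    with one JSON event per line.
    """
    for server_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Assumption: the day's file can be found by matching the date in its name.
        for log_file in sorted(server_dir.glob(f"*{day}*")):
            with log_file.open(encoding="utf-8") as fh:
                for line in fh:
                    # Cheap grep-style substring check before paying for JSON parsing.
                    if event_type not in line:
                        continue
                    try:
                        event = json.loads(line)
                    except json.JSONDecodeError:
                        continue  # skip truncated or malformed lines
                    # Assumption: the event's type lives in an "event_type" field.
                    if isinstance(event, dict) and event.get("event_type") == event_type:
                        yield event


if __name__ == "__main__":
    # Hypothetical root folder and event type, for illustration only.
    for ev in events_for_day(Path("tracking_logs"), "2018-02-28", "play_video"):
        print(ev.get("username"), ev.get("time"))
```

The substring check plays the role of grep: it discards most lines without parsing them, and the exact comparison afterwards removes lines that merely mention the event type somewhere else in the JSON.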

To learn more about the data:

There's also a variety of user-contributed tools (both for data management and other stuff) on the edX Tools wiki.




start.txt · Last modified: 2018/02/28 17:02 (external edit)