Below is a preliminary list of technical workshops that will be offered during the GC Digital Humanities Research Institute. Workshops build on each other such that successive workshops use skills developed in earlier ones. All participants attend workshops on core skills, then choose which skills they wish to develop further through advanced workshops. Workshop descriptions are subject to small changes before June.

Installation instructions
Command Line
Intro to git/GitHub
Python for Humanists
Databases & Tidy Data
Introduction to NLTK with Python
HTML/CSS
Machine Learning
Mapping
Ethics
Twitterbots & APIs
Project Lab

Command Line

The command line is a powerful, text-based way to interact with your computer. You can automate tasks such as creating, copying, and converting files, set up your programming environment, run programs, control other computers remotely, and access programs and utilities that do not have graphical equivalents. In this introduction, we will learn common commands to explore and manipulate a simple data set. By the end of the session, we'll be able to navigate our computers, create and manipulate files, and transform text-based data using only the command line. Stepping away from a point-and-click workflow, we move into an environment where we have finer control over each task we'd like the computer to perform. In addition to being a useful tool in itself, the command line gives us access to a second set of programs and utilities and is a complement to learning programming.

Intro to git/GitHub

Git is a tool for managing changes to a set of files. It allows users to recover earlier versions of a project and to collaborate with other contributors. GitHub is a web-based platform that provides access to open source repositories and facilitates collaboration on files, code, or datasets. This session will introduce participants to version control and collaboration using Git and GitHub, and demonstrate their use in digital projects.

Python for Humanists

Python is a general-purpose programming language that is suitable for a wide variety of core tasks in the digital humanities. Learning Python fundamentals is a gateway to analyzing data, creating visualizations, composing interactive websites, scraping the internet, and engaging in distant reading of texts. This session in Python fundamentals also introduces essential computing concepts such as data types, iteration, input/output, control structures, and importing libraries. This session will serve as a basis for later sessions in databases, natural language processing, and working with APIs.
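As a minimal illustration of the concepts named above (data types, iteration, control structures, output, and importing libraries), the sketch below sorts the words of a short sentence by length and counts repeats. The sample sentence and all variable names are made up for illustration and are not taken from the workshop materials.

```python
# Importing a library: Counter comes from Python's standard library.
from collections import Counter

# Data types: a string, which split() turns into a list of strings.
sentence = "the quick brown fox jumps over the lazy dog"  # made-up sample text
words = sentence.split()

# Iteration and control structures: loop over the list and branch with if/else.
long_words = []
short_words = []
for word in words:
    if len(word) > 4:
        long_words.append(word)
    else:
        short_words.append(word)

# A Counter maps each distinct word to the number of times it appears.
word_counts = Counter(words)

# Output: print the results to the screen.
print("Long words:", long_words)
print("Short words:", short_words)
print("Most common word:", word_counts.most_common(1))
```

The same pattern of reading text, looping over it, and tallying results recurs in the later sessions on databases, text analysis, and APIs.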

Databases & Tidy Data

Databases are invaluable tools for organization, and are better than a spreadsheet for working with multiple data sets, asking questions, and adding structure to your data. This workshop will introduce you to the basics of interacting with databases using Python, and will include hands-on practice creating databases and tables, importing data, and querying the database. We will also discuss cleaning data, and what a good data set might look like.
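The steps described above (creating a database and a table, importing data, and querying) can be sketched with Python's built-in sqlite3 module. This is only one possible setup: the description does not say which database system the workshop uses, and the table name, columns, and sample rows below are assumptions chosen for illustration.

```python
import sqlite3

# An in-memory database keeps the example self-contained;
# passing a file path instead would save the database to disk.
conn = sqlite3.connect(":memory:")

# Create a table. This schema is hypothetical.
conn.execute("CREATE TABLE books (title TEXT, author TEXT, year INTEGER)")

# Import data: each tuple is one row, with exactly one value per column.
rows = [
    ("Emma", "Jane Austen", 1815),
    ("Frankenstein", "Mary Shelley", 1818),
    ("Middlemarch", "George Eliot", 1871),
]
conn.executemany("INSERT INTO books VALUES (?, ?, ?)", rows)
conn.commit()

# Query: titles published before 1850, oldest first.
# The ? placeholder passes the value safely instead of pasting it into the SQL.
cursor = conn.execute(
    "SELECT title, year FROM books WHERE year < ? ORDER BY year", (1850,)
)
early_books = cursor.fetchall()
print(early_books)

conn.close()
```

Each row describes one book and each column holds one kind of value, which is the sort of layout usually meant by "tidy" data.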

Introduction to NLTK with Python

Digital technologies have made vast amounts of text available to researchers, and this same technological moment has provided us with the capacity to analyze that text. The first step in that analysis is to transform texts designed for human consumption into a form a computer can analyze as well. Using Python and the Natural Language Toolkit package (commonly called NLTK), this workshop introduces strategies to turn qualitative texts into quantitative objects. Through that process, we will present a variety of strategies for simple analysis of text-based data.
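To make the idea of turning a qualitative text into a quantitative object concrete, here is a minimal sketch that splits a sentence into tokens and counts them. It deliberately uses only Python's standard library as a stand-in; NLTK provides its own tokenizers and frequency tools, which are not shown here. The sample sentence and the simple regular-expression tokenizer are assumptions made for illustration.

```python
import re
from collections import Counter

# Sample text: the opening words of Dickens's "A Tale of Two Cities".
text = "It was the best of times, it was the worst of times."

# Tokenize: lowercase the text, then pull out runs of letters and apostrophes.
# This drops punctuation and is much cruder than a real tokenizer.
tokens = re.findall(r"[a-z']+", text.lower())

# Count how often each distinct token occurs.
frequencies = Counter(tokens)

# Type-token ratio: number of distinct words divided by total words,
# a rough measure of how varied the vocabulary is.
type_token_ratio = len(frequencies) / len(tokens)

print(tokens)
print(frequencies.most_common(3))
print(round(type_token_ratio, 2))
```

Once a text is a list of tokens, questions about it (which words are most frequent, how varied the vocabulary is) become simple counting problems.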


HTML/CSS

This session will introduce attendees to two languages: HTML, a markup language, and CSS, a style sheet language. These are two of the most commonly used languages for rendering information on the web today. In addition to learning the basics of each language and the differences between them, attendees will use a simple text editor and their local computer to begin creating a web site. Beyond learning HTML and CSS, users will leave the workshop with a clearer understanding of how the internet works. This workshop is geared toward beginners; no prior experience with either language or website-building is necessary.

Machine Learning

This session will introduce participants to the core concepts of supervised and unsupervised machine learning. We will first work through a hands-on example of supervised machine learning, a text classification task, to discuss topics such as exploratory data visualization, data preprocessing, feature representation, and training and testing a machine learning algorithm. We will then work through an unsupervised learning task where we will look for groups in our data using topic modeling. For this session, we will be using the Pandas data analysis library, the scikit-learn machine learning library, and the Matplotlib visualization library. This session is aimed at researchers who want to find patterns in their data or use their data to predict a phenomenon.


Mapping

This workshop will offer an approachable introduction to Geographic Information Systems (GIS), digital tools that allow users to create maps and analyze data in a geospatial context. GIS software can be intimidating, but with this workshop, participants will be able to understand the interface of QGIS, an open-source and versatile GIS solution, and use it for basic yet practical operations. First, we will look at the basic terminology of GIS. Then, we will work through a step-by-step practice scenario that will allow participants to learn many of the tools available in QGIS. By the end of this workshop, participants will be able to use QGIS to: add and create vector and raster layers; view and edit fields and features in the attributes table; perform basic geoprocessing operations on vector layers; and create basic visualizations to facilitate geospatial data analysis.


Ethics

This discussion-based workshop will address an array of ethical questions and concerns for folks doing digital projects or research, with an emphasis on consent, personhood, confidentiality, political economy, the politics of knowledge production, and accessibility. In addressing these issues, this workshop will first provide a general overview of ethics for institutional research compliance, including the Belmont Report and Institutional Review Board, and then delve into an array of ethical issues that extend beyond institutional purview.

The approach of this workshop is premised on the understanding that there is no simple roadmap for practicing 'good ethics' and, indeed, what constitutes 'good' or 'ethical' for one individual may vary from the next and is often reflective of a scholar's political commitments and personal background. Nonetheless, this workshop will foreground key ethical questions to ask (and keep asking!) when designing and doing digital projects or digital research, and key concepts to draw upon when thinking through these questions.

Twitterbots & APIs

APIs (Application Programming Interfaces) are a structured way for programs to communicate with other programs. Knowledge of APIs allows your programs to communicate with major services such as The New York Times and Twitter and collect data from organizations such as the Library of Congress. In this session, we'll discuss API fundamentals while using the Twitter API to create a Twitterbot, an automated account. We'll also discuss the ethical use of APIs and how tools such as APIs have shaped the modern internet.
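As a rough sketch of what "structured communication" means in practice, many web APIs accept a URL with query parameters and return data as JSON. The example below builds such a URL and parses a response using only Python's standard library. It sends no network request: the endpoint, parameters, and response body are all hypothetical, and this is not the Twitter API, which additionally requires credentials.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and query parameters, for illustration only.
base_url = "https://api.example.org/v1/items"
params = {"q": "digital humanities", "limit": 2}

# urlencode turns the dictionary into a query string, escaping spaces as "+".
request_url = base_url + "?" + urlencode(params)

# A made-up response body of the kind a JSON API might return.
response_body = (
    '{"results": ['
    '{"id": 1, "title": "First item"}, '
    '{"id": 2, "title": "Second item"}'
    "]}"
)

# json.loads converts the JSON text into Python dictionaries and lists.
data = json.loads(response_body)
titles = [item["title"] for item in data["results"]]

print(request_url)
print(titles)
```

In a real program, the response body would come from sending the request to the service, subject to its terms of use and rate limits.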

Project Lab

What separates projects that turn into something from those that stall out and go nowhere is the formulation (and constant revision and adaptation) of a reasonable, informed, and purposeful project plan. During this session, participants will be introduced to sound project development and management practices. Starting with an end goal in mind and articulating the needs and opportunities, audience, resources, and work plan, participants will draft a one- to three-page proposal for their DH projects.