Workshops


Below is a preliminary list of technical workshops that will be offered during the GC Digital Humanities Research Institute. Workshops build on each other such that successive workshops use skills developed in earlier ones. All participants attend workshops on core skills, then choose which skills they wish to develop further through advanced workshops. Workshop descriptions are subject to small changes before June.

Command Line
Git
Python
Databases
Text Analysis
Mapping
Quantitative Analysis
HTML and CSS
Twitter API
Digital Ethics

Command Line

Introduction to the UNIX command line. Topics covered will include navigating the filesystem, manipulating the environment, executing useful commands, and using pipes to communicate between programs. This session will teach you how to communicate directly with your computer’s operating system using a text-based interface and is a useful first step in learning many other technical skills.

Git

Git is a tool for managing changes to a set of files. It allows users to access open source repositories, recover earlier versions of a project, and collaborate with other contributors. This session will be beneficial to anyone working with data, code, or text.

Python

Python is a programming language that can be used for a wide range of tasks, including collecting and analyzing data in a variety of formats, building web applications, and much more. It is widely used by academic researchers because of its flexibility and adaptability.

Databases

Databases are invaluable tools for organization and are often better suited than a spreadsheet to working with multiple data sets, asking questions, and adding structure to your data. SQL is a query language for working with relational databases. This workshop will introduce you to the basics of SQL and will include hands-on practice creating databases and tables, importing data, and querying the database.
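As a rough illustration of the kind of practice described above, the following sketch uses Python's built-in sqlite3 module to create a table, load a few rows, and query them. The table name, column names, and sample rows are invented for this example and are not taken from the workshop materials.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding one hypothetical table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: a small catalogue of novels.
    conn.execute(
        "CREATE TABLE books ("
        "id INTEGER PRIMARY KEY, "
        "title TEXT NOT NULL, "
        "author TEXT NOT NULL, "
        "year INTEGER)"
    )
    rows = [
        ("Frankenstein", "Mary Shelley", 1818),
        ("Dracula", "Bram Stoker", 1897),
        ("Middlemarch", "George Eliot", 1871),
    ]
    # Parameter placeholders (?) keep the data separate from the SQL text.
    conn.executemany(
        "INSERT INTO books (title, author, year) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def titles_before(conn, year):
    """Return titles published before the given year, oldest first."""
    cur = conn.execute(
        "SELECT title FROM books WHERE year < ? ORDER BY year", (year,)
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = build_demo_db()
    print(titles_before(conn, 1880))
    conn.close()
```

Using an in-memory database keeps the example free of files on disk; passing a filename to sqlite3.connect instead would persist the data.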

Text Analysis

This session will introduce text analysis and text classification in Python using the Natural Language Toolkit (NLTK) library and scikit-learn. In this session, you will learn how to use Python to analyze large amounts of text (e.g., literary works or social media corpora) to find word frequencies and collocations, and you will learn the basics of text classification with machine learning. This session is designed for researchers who work with various forms of text-based data.
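To give a sense of what counting word frequencies and collocations involves, here is a minimal sketch that uses only Python's standard library rather than NLTK. It treats a collocation simply as a pair of adjacent words and ranks pairs by raw count, which is a simplification of the association measures a library such as NLTK provides. The function names and the sample sentence are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def bigram_counts(tokens):
    """Count adjacent word pairs, a crude stand-in for collocations."""
    return Counter(zip(tokens, tokens[1:]))


if __name__ == "__main__":
    sample = "The whale swam. The whale dived. The ship followed the whale."
    tokens = tokenize(sample)
    print(word_frequencies(tokens).most_common(2))
    print(bigram_counts(tokens).most_common(1))
```

On a real corpus, very common words such as "the" dominate raw counts, which is one reason dedicated libraries offer stopword lists and statistical scoring.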

Mapping

This session introduces simple yet powerful ways of displaying spatial information through CartoDB and QGIS. It will be of particular interest both to researchers working with spatial information and to anyone interested in storytelling with maps.

Quantitative Analysis

This session will introduce data aggregation and preprocessing, dimension reduction, and supervised and unsupervised machine learning using the Python libraries NumPy and scikit-learn. This session is aimed at researchers who want to find patterns in their data or use their data to predict a phenomenon.
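As one possible illustration of dimension reduction, the sketch below implements a basic principal component analysis (PCA) with NumPy's singular value decomposition. PCA is chosen here only as a common example of the technique; the description above does not say which methods the session covers. The function name and the synthetic data are assumptions made for this example.

```python
import numpy as np


def pca(data, n_components):
    """Project rows of `data` onto their top principal components."""
    # Center each column so the components describe variance, not the mean.
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


if __name__ == "__main__":
    # Synthetic data: the second column is nearly a multiple of the first.
    rng = np.random.default_rng(0)
    x = rng.normal(size=100)
    data = np.column_stack([
        x,
        2 * x + rng.normal(scale=0.1, size=100),
        rng.normal(scale=0.1, size=100),
    ])
    reduced = pca(data, 2)
    print(reduced.shape)
```

scikit-learn offers a ready-made PCA class; the hand-written version is shown only to make the underlying steps visible.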

HTML and CSS

Modern web pages are created using HTML to control content, CSS to control appearance, and JavaScript to dictate behavior. This session will be helpful for anyone who wants to build on the web.

Twitter API

This session will cover the basics of accessing data via the Twitter API, including specific challenges that arise when working with large, text-based data sets. This session will be beneficial for anyone who wants to collect data from Twitter or other social networks.

Digital Ethics

A discussion of digital ethics with an emphasis on social justice, transparency, and accessibility.