This weekend’s hackathon is qualitatively different from typical hackathons.  For us as digital humanists, it is about maintaining focus on the problem domain rather than untangling the technology.  As a technical facilitator, my biggest job is to abstract away the low-level complexity through judicious tool selection and workflows, freeing you to devote as much mental energy as possible to the more interesting and creative high-level problems.  As such, we’ll be focusing on the forest more than the trees.

On the other hand, one of the biggest weaknesses of Digital Humanities is that it is often too disconnected from technology, which often leads to weaker research.  This is not a problem for the leaders in the field, who have begun to apply sophisticated techniques to Digital Humanities research, such as Machine Learning and Data Visualizations borrowed from Big Data genomic research.  However, the field as a whole still spends far too much time deep in the woods of XML Schema, at times to the detriment of more ambitious goals and of student interest.

Here is an overview of the value we can extract from our experience tomorrow.  Given competing demands from other classes and the general background of our team, I’ve trimmed the goals to a more realistic subset rather than dilute the core objectives.  As I’ve mentioned in our run-up meetings, it is my hope that this experience will give you insights beyond the classroom that will prove invaluable in grad school, your career and beyond.  Oh, and let’s have fun with this tomorrow.


Domain Experts/Analysts

  • How to read and think about Data
  • Survey similar Digital Humanities Research
  • Formulate Good Research Questions
  • Effectively Present a Narrative with Data
  • Communicating with Technical Teams
  • Go beyond Theory to Create a Project of Genuine Interest

Analysts/Programmers ( * beyond the resources of this event )

  • Understanding Data Acquisition:  OCR, Formats, etc.
  • * ( XML Markup Language including XPath Syntax )
  • XML/HTML Parsing Engines (BeautifulSoup4, lxml)
  • Regular Expression Syntax (RegEx)
  • * ( Python, Pandas, and other Python Libraries )
  • Jupyter Notebooks for Data Exploration
  • * ( Simpler Visual ML/Natural Language Processing with RapidMiner )
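
To give a feel for how the parsing and RegEx items above fit together, here is a minimal sketch, not the workflow we will necessarily use tomorrow.  It relies only on Python’s standard library (xml.etree.ElementTree, which supports a limited subset of XPath, rather than BeautifulSoup4 or lxml).  The sample XML, its element names, and the function word_counts_by_year are all hypothetical and invented for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: the <letters>/<letter>/<body> structure is an
# assumption made for this sketch, not a format taken from our corpus.
SAMPLE_XML = """
<letters>
  <letter year="1862"><body>The war drags on, and the war is cruel.</body></letter>
  <letter year="1863"><body>Peace seems far away.</body></letter>
</letters>
"""


def word_counts_by_year(xml_text):
    """Return {year: Counter of lowercase words} for each <letter> element."""
    root = ET.fromstring(xml_text.strip())
    counts = {}
    # ".//letter" is an XPath-style query: every <letter> below the root.
    for letter in root.findall(".//letter"):
        year = letter.get("year")
        body = letter.findtext("body", default="")
        # A simple regular expression: runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", body.lower())
        counts[year] = Counter(words)
    return counts


counts = word_counts_by_year(SAMPLE_XML)
print(counts["1862"].most_common(2))
```

The same three steps (locate elements, pull out text, tokenize with a pattern) carry over to larger files; a table of such counts is the kind of output an analyst could then hand to the visualization team.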

Programmers/Data Visualizers

  • Visualization Guidelines/Types
  • * ( Visualizations with Jupyter Notebooks )
  • Visualizations with Tableau Public
  • Create Static and Interactive Visualizations
  • Story and Visual Narrative Presentation in Data Science

All Team Members

  • Be able to Think in a more Data-Driven Manner
  • Improve visual storytelling
  • Work and communicate across team specialties
  • Understand the complete Data Science / Analytic Pipeline Process
  • Learn higher-level abstractions that don’t require coding