Political Campaign Commercial Collection Reprocessing

Project Title: Political Campaign Commercial Collection Reprocessing

Duration

2021 – Ongoing

Institution

Carl Albert Congressional Research and Studies Center Archives
(In collaboration with Harvard University and the University of Iowa)

Project Overview

The Carl Albert Center Archives, along with Harvard University and the University of Iowa, was awarded a collaborative research grant from the National Science Foundation (NSF) for the project:
“Understanding the Evolution of Political Campaign Advertisements over the Last Century.”

The project focused on three major aims:

Make a large underutilized collection of over 120,000 political ads (1912–2016) suitable for academic and public research.
Understand the evolution of political advertising, particularly regarding issue advocacy and gender/minority representations before 1996.
Promote interdisciplinary education in audiovisual analysis for graduate and undergraduate researchers.

Our task at the Carl Albert Center focused on aim #1, delivering a cleaned, structured, and accessible dataset for collaborators. This page documents our process and the innovative solutions developed during the reprocessing effort.

Key Research Components

Addressed challenges of insufficient documentation, undefined workflows, and limited funding.
Developed scalable workflows for managing large AV collections.
Ensured long-term access and usability of digital ad materials.
Shared practical solutions for academic archival environments.

Workflow Documentation & Resources

Case Study & Collection Background

Origin, acquisition, and growth timeline of the ad collection
Collection complexity and disorder: the entropy effect

Infrastructure & System Challenges

Backlogs due to the “internet effect” and constant digital growth
Issues with format normalization, file duplication, and legacy media
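
One way to approach the file duplication issue listed above is to group files by a content checksum. The following is a minimal sketch, not the project's actual tooling: the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large video files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content hash; return groups of 2+ files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_sha256(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

A checksum only catches byte-identical copies. The same ad re-encoded in a different format produces a different hash, so format normalization issues would need a separate check.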

Initial Tools & Methods

Python templates for batch renaming, classification, metadata mapping
Control methods: Unified Component ID (P_COPY-OID) system
Audio/video data cleansing, aggregation, forensic analysis
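
As an illustration of what a batch renaming template might look like, here is a minimal sketch. It does not reproduce the P_COPY-OID scheme, whose format is not described on this page; the `AD_000001`-style naming pattern, the function names, and the parameters are all hypothetical.

```python
from pathlib import Path


def plan_renames(filenames, prefix="AD", start=1, width=6):
    """Return (old_name, new_name) pairs without touching the disk.

    The naming pattern here is hypothetical: prefix, underscore,
    zero-padded sequence number, lowercased original extension.
    """
    plan = []
    for offset, name in enumerate(sorted(filenames)):
        suffix = Path(name).suffix.lower()
        new_name = f"{prefix}_{start + offset:0{width}d}{suffix}"
        plan.append((name, new_name))
    return plan


def apply_renames(directory, plan):
    """Rename files in directory according to plan, refusing to overwrite."""
    directory = Path(directory)
    for old_name, new_name in plan:
        target = directory / new_name
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        (directory / old_name).rename(target)
```

Separating the plan from its application lets the mapping be reviewed, or saved alongside the metadata, before any file is actually renamed.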

Implementation & Phase 1 Workflow

Digitization and file intake
Batch tagging, error detection, AI-assisted classification
Manual QC and correction protocols

Sample Student Access Workflows

Group A–B: Basic metadata cleanup
Group C: Enhanced tagging & AV sync corrections
Group QA-Rover-1: Advanced metadata validation & exception reporting

Research Tools & Appendices

Appendix 1

Building a Python environment (step-by-step setup guide)

Appendix 2

Common workflows for AV transcription

2.1: Creating transcripts & summaries (free tools)
2.2: Accuracy testing – Whisper AI vs. Gensim + NLTK
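
This page does not state which metric was used for accuracy testing. One common measure for transcripts is word error rate (WER): the word-level edit distance between a reference transcript and a machine transcript, divided by the number of reference words. The sketch below is an assumption about how such a comparison could be scored, with a hypothetical function name, and is not the project's method.

```python
def word_error_rate(reference, hypothesis):
    """Return the word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # word missing from the hypothesis
                current[j - 1] + 1,      # extra word in the hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)
```

A lower value means a closer match. The score can exceed 1.0 when the machine transcript contains many extra words.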

Temporary Video Viewing Platform

Political Ads Interface (hosted on the CAC Digital Archives)