codebase
computational social science tools for students and scholars
Hello! These scripts are open to everyone, but they're designed for people who want to interpret their (social science) data and are interested in engaging with algorithms and programming (in R). Everything I do here you can do as well, for free, on a reasonably modern computer. I'll try to illustrate my code with interesting real data. You can choose to read or code along.
Recommended guides to complement the content on this page:
Google Colab Notebooks
These notebooks are interactive chunks of annotated R code along with their output. You can just read them, or modify or run them in your web browser of choice.
The notebooks are cloud-based, so you don't need to install anything on your computer or mobile device. You just need a browser, an internet connection, and a Google account. The Gemini AI assistant (available in the top-right corner of each notebook) can explain bits of the code you'd like to know more about, and help you adapt it to your data and needs.
Identify large communities in a co-author network. Use network methods to obtain a purposive sample of papers.
Conduct a basic content analysis of a text corpus. Find most frequent words, compare across categories and over time.
Make sense of a large corpus of texts using network methods and an LDA topic model: Using algorithms to sample, summarize, and relate.
Create a co-author network from a SCOPUS database export.
Import and export data in different formats, and apply some basic, useful transformations.
Use Excel as a CAQDAS by converting text from Word document(s) to an Excel file with one document/paragraph/sentence per row.
Randomly assign order of presentations or anonymize student names. Curve grades based on target mean and median.
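To give a flavor of the last item, here is a minimal base R sketch. It is not the notebook's actual code: the roster, the grades, and the helper `curve_to_mean()` are all made up for illustration, and the curve is a simple constant shift toward a target mean, which is only one of several ways to curve grades.

```r
# Hypothetical roster and raw grades, invented for illustration only.
set.seed(42)  # fix the seed so the shuffle is reproducible
students <- c("Ana", "Ben", "Chloe", "Dev", "Elif")
grades   <- c(62, 71, 55, 80, 68)

# Random presentation order: sample() without replacement permutes the roster.
presentation_order <- sample(students)

# Anonymize: give each student a randomly assigned ID and keep the key separate.
ids <- sprintf("S%02d", sample(seq_along(students)))
key <- data.frame(student = students, id = ids)

# Curve to a target mean with a constant shift, capped at max_grade.
# Assumed method: if the cap kicks in, the final mean falls below the target.
# Matching a target median instead would use median() in place of mean().
curve_to_mean <- function(x, target_mean, max_grade = 100) {
  pmin(x + (target_mean - mean(x)), max_grade)
}
curved <- curve_to_mean(grades, target_mean = 75)

print(presentation_order)
print(data.frame(id = ids, raw = grades, curved = curved))
```

Share only the `id` column with graders and keep `key` somewhere private, so the names can be matched back once grading is done.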
Rmarkdown Notebooks
These notebooks are static chunks of annotated R code along with their output. These programs require a bit more juice to run, so they're best run on your laptop or desktop. To code along, please install R and RStudio on your machine, in that order.
Requirements: An internet connection and a Twitter developer account with an approved Academic Track project.
Applications: Analyzing and representing changes in texts over time
Applications: Inductively identifying key themes and the "fingerprints" of authors or publications in large longitudinal text data. Purposive sampling of papers based on theme rather than specific keywords. Measuring and representing similarity, dissimilarity, and change over time (a good starting point for literature reviews).
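As a rough idea of what "measuring similarity" between texts can look like, here is a small base R sketch that compares documents by the cosine similarity of their word counts. It is an illustration under my own assumptions, not the method used in the notebooks: the three toy documents and the `tokenize()` helper are hypothetical, and a real analysis would typically add stop-word removal and weighting.

```r
# Hypothetical mini-corpus, invented for illustration only.
docs <- c(
  a = "networks shape how knowledge is produced",
  b = "knowledge is produced in networks of scholars",
  c = "grades were curved to a target mean"
)

# Lowercase, split on anything that is not a letter, and drop empty tokens.
tokenize <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]
  tokens[nzchar(tokens)]
}

tokens <- lapply(docs, tokenize)
vocab  <- sort(unique(unlist(tokens)))

# Document-term matrix: one row per document, one column per vocabulary word.
count_words <- function(tok) tabulate(match(tok, vocab), nbins = length(vocab))
dtm <- t(vapply(tokens, count_words, integer(length(vocab))))
colnames(dtm) <- vocab

# Cosine similarity: dot products of the rows divided by the product of their norms.
norms      <- sqrt(rowSums(dtm^2))
similarity <- (dtm %*% t(dtm)) / (norms %o% norms)

print(round(similarity, 2))
```

Documents `a` and `b` share several words and so score well above zero, while `a` and `c` share none and score zero. Computing the same matrix for texts grouped by year is one simple way to represent change over time.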
Coming soon...
Comparing the performance of LDA and BERT topic models
Creating a co-citation network from data exported from Web of Science/SCOPUS
About
Codebase is primarily a by-product of one of my research projects, sometimes referred to as JRNLS, where I explore how social processes are entangled with knowledge production at the level of the community.
Codebase was recently used to conduct a machine-assisted literature analysis featured in Faraj, S., & Leonardi, P. M. (2022). Strategic organization in the digital age: Rethinking the concept of technology. Strategic Organization, 20(4), 771-785.
To cite this page for code or research methodology, please use:
Bhardwaj, A. (2025). Codebase: Computational social science tools for students and scholars. https://www.anandb.net/code