How to Get Pull Request Data Using GitHub API | by Soner Yıldırım | Sep, 2024



Getting the diff between any two commits

Towards Data Science
Photo by Bofu Shaw on Unsplash

GitHub is the Wikipedia of code. Not everything on GitHub can be taken at face value, but it contains the essence and history of how some of the best software tools were created.

It’d be a shame not to have an API for accessing such a valuable resource. Thankfully, we do have one, and it’s called, surprisingly, the GitHub API.

Let me first mention what this article is not about. We won’t be talking about git commands or how to use git in software development.

This article is about using the GitHub API for analytical purposes. The first requirement for analytics is data, and GitHub has lots of it.

The amount and variety of information we can get from the GitHub API is remarkable. It’s also a well-maintained and well-documented API, so we won’t have a hard time getting the information we need.

We can get lots of data from the GitHub API, such as:

  • Commits per pull request
  • Folder and file structure of a repository
  • Average number of files edited per commit
  • Developer-based data such as who pushed the most commits in the last month
  • File-based data such as…
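
To make the first item in the list concrete, here is a minimal sketch of how the commits on a single pull request could be fetched and counted per author. It assumes the public REST endpoint `/repos/{owner}/{repo}/pulls/{pull_number}/commits` and uses only the Python standard library. The function names (`pr_commits_url`, `fetch_json`, `commits_per_author`) are illustrative choices of mine, not part of any GitHub library, and the owner, repo and pull request number are placeholders you would supply yourself.

```python
import json
import urllib.request
from collections import Counter
from typing import Optional

API_ROOT = "https://api.github.com"


def pr_commits_url(owner: str, repo: str, pull_number: int) -> str:
    """Build the REST URL that lists the commits on one pull request."""
    return f"{API_ROOT}/repos/{owner}/{repo}/pulls/{pull_number}/commits"


def fetch_json(url: str, token: Optional[str] = None):
    """GET a GitHub API URL and decode the JSON body (makes a network call)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Assumption: a personal access token is available. Public
        # repositories can be read without one, at a lower rate limit.
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def commits_per_author(commits: list) -> Counter:
    """Count commits by the author name recorded in each commit object."""
    counts = Counter()
    for item in commits:
        # Each item carries the git metadata under "commit" -> "author".
        author = (item.get("commit") or {}).get("author") or {}
        counts[author.get("name", "unknown")] += 1
    return counts
```

With a real repository, something like `commits_per_author(fetch_json(pr_commits_url(owner, repo, number)))` would give the per-author counts for that pull request. The API returns results in pages, so a pull request with many commits would need more than one request.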