Automating Code Reviews Using OpenAI and GitHub

The State of Code Reviews in Today’s Development Landscape

In today’s fast-moving world of software development, AI has made remarkable progress. It can write code, debug errors, and even help design architectures. But let’s be honest, we’re not quite at a point where AI can take over the entire development process. Human developers are still essential, not just for their coding skills, but because they bring something AI can’t replicate (yet): context, intuition, and a deep understanding of the business problem they’re solving.

That said, even experienced developers make mistakes. Under tight deadlines and with growing code complexity, things slip through: logic bugs, performance issues, or even security gaps. This is why code reviews are so critical. They act as a second pair of eyes before any code is merged. Typically, platforms like GitHub are used for this: a developer raises a Pull Request (PR), and a teammate reviews the changes before they’re approved. Below is how the current process works.

Current process of code review

But code reviews themselves aren’t perfect. Reviewers might be overloaded with tasks, unfamiliar with the specific part of the codebase, or just miss something. In teams handling dozens of PRs daily, giving every one of them enough attention is tough. And that’s where AI can lend a hand.

Where Generative AI Fits In

Generative AI, like OpenAI’s models, can serve as a helpful assistant during the code review process: not as a replacement for human reviewers, but as an extra layer of insight. Imagine an AI that instantly looks at your PR, summarizes the changes, points out issues, and suggests better approaches, all within seconds of opening the PR.

Here’s how that helps:

  • Every PR gets at least a baseline review.
  • Reviewers can focus on complex logic or business rules, rather than spotting typos or missed edge cases.
  • Developers get quicker feedback, which means faster iterations and fewer bugs.

In this blog, I’ll show you how to build an automated code review pipeline using:

  • GitHub Actions (to trigger reviews when a PR is raised)
  • Python (to extract code changes and communicate with OpenAI)
  • OpenAI (for generating the review)
  • GitHub PR Comments (to post the AI’s feedback right where the developer needs it)

Architecture Overview

Here’s a simple architecture diagram showing how everything connects:

Automated code review workflow

Getting Started – What You’ll Need

A GitHub Repository

You can use any existing GitHub repo or create a new one for this. The AI will review future PRs raised in this repository.

An OpenAI Account

If you don’t have one yet, create one on the OpenAI Platform. Then:

  • Check your available free credits on the OpenAI billing page.
  • If your credits are exhausted, add a payment method (minimum $5).
  • Generate an API key and save it in your GitHub repo under:
    Settings → Secrets and variables → Actions → New repository secret → OPENAI_API_KEY


How It Works — Step by Step

1. Developer Raises a PR

This is your usual development flow: a feature or bugfix branch is pushed, and a PR is created against the main branch.

2. GitHub Action Is Triggered

GitHub lets you run automated workflows on events like PR creation. You can set this up by adding a YAML file at .github/workflows/ai-review.yml.

The workflow does the following:

  • Checks out the code
  • Installs dependencies (like the OpenAI Python client)
  • Runs a Python script that triggers the review
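
Here’s a minimal sketch of what that workflow file might look like. The action versions, Python version, and the script name review.py are assumptions; adapt them to your repo:

# .github/workflows/ai-review.yml: a minimal sketch, adjust names and versions as needed
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write   # lets GITHUB_TOKEN post comments on the PR

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # fetch full history so we can diff against origin/main

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - run: pip install openai requests

      - run: python review.py   # the script described in steps 3-5 (name assumed)
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}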

3. Get the Git Diff

The Python script compares the current branch with the target (usually main) to find out what changed:

import subprocess
diff = subprocess.check_output(["git", "diff", "origin/main...HEAD"], text=True)

This gives us the exact changes the developer made.

4. Send the Diff to OpenAI

We send this diff to OpenAI using the Chat Completions API, like this:

from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "..."},  # the review prompt (sample shown below)
        {"role": "user", "content": diff}      # the git diff from step 3
    ]
)
review = response.choices[0].message.content

We use the system role to tell the AI what kind of response we expect: in this case, a detailed code review. Here’s a sample prompt, which is what the "..." placeholder above stands for:

"You are a senior software engineer and an expert code reviewer. "
"When provided with code diffs, you will perform a detailed and structured review. "
"Break your feedback into the following sections:"
"1. Summary of Code Changes – Describe in simple terms what the changes are trying to do."
"2. Code Quality Issues – Point out bugs, code smells, or inefficiencies."
"3. Suggestions for Improvement – Offer clear, better alternatives (with code snippets) for problematic parts."
"4. Overall Assessment – Summarize how good or bad the changes are and if they meet clean code standards."
"Be constructive, concise, and professional."

The user message is the actual code diff.

Why GPT-3.5 Turbo?
It’s fast, affordable, and surprisingly good at spotting bugs and recommending better code. If you want even deeper insights, you could upgrade to GPT-4 or GPT-4o — but for basic reviews, 3.5 is great.

5. Post Review Back to the PR

Finally, we take OpenAI’s response and post it as a comment in the PR using GitHub’s REST API and the GITHUB_TOKEN secret.
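
Here’s a minimal sketch of that step using the requests library. GITHUB_REPOSITORY is set automatically by GitHub Actions; PR_NUMBER is an assumed variable passed in by the workflow (e.g. from github.event.pull_request.number), and review is the text returned by OpenAI in step 4:

import os
import requests

# PRs are issues in GitHub's REST API, so the issue-comments
# endpoint also posts comments on pull requests.
repo = os.environ["GITHUB_REPOSITORY"]   # "owner/name", provided by GitHub Actions
pr_number = os.environ["PR_NUMBER"]      # assumed: passed in by the workflow
url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments"

resp = requests.post(
    url,
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    json={"body": review},  # the AI-generated review from step 4
)
resp.raise_for_status()     # fail the workflow step if the comment wasn't posted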

Example: What an AI-Powered Code Review Looks Like

To see this in action, here’s a real-world example of an automated code review performed by OpenAI.

We’ve created a sample GitHub repository that demonstrates how this setup works: 🔗 View the repo on GitHub

In that repo, we include a deliberately flawed SQL snippet to test how OpenAI responds during a pull request:

-- Not-so-good SQL code for review
SELECT * 
FROM customers 
WHERE status = 'active'
AND register_date > '2022-01-01'
OR status = 'inactive'
ORDER BY customer_name;

When this code is committed and a PR is raised, OpenAI automatically analyzes the changes and posts a review comment directly in the pull request, like this:

Summary of code changes

As shown in the comment above, the AI reviewer provides structured and valuable insights, including:

  • Clear summary of changes: It recognized the addition of the AI review script, GitHub workflow, and a sample SQL file.
  • Code quality feedback: It pointed out security gaps, missing error handling, and suggested more descriptive variable names.
  • Precise analysis of the SQL query: It correctly identified the logical flaw: the condition
WHERE status = 'active' 
AND register_date > '2022-01-01' 
OR status = 'inactive'

was flagged due to missing parentheses. The AI understood that this could lead to incorrect filtering, a common SQL mistake where the precedence of AND and OR isn’t properly controlled. This shows its ability to reason through the syntax and logic of SQL, not just surface-level issues; a corrected version of the query appears after this list.

  • Actionable suggestions: From improving variable names to correcting the SQL logic, it offered practical, ready-to-use fixes.
  • Professional assessment: The AI provided a balanced review, highlighting the innovation while recommending improvements to make the solution more robust.
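
For reference, here’s one plausible corrected form of that query, with parentheses making the AND/OR grouping explicit (a sketch of the kind of fix the AI suggests, not its verbatim output):

-- Parentheses make the intended precedence explicit
SELECT *
FROM customers
WHERE (status = 'active' AND register_date > '2022-01-01')
   OR status = 'inactive'
ORDER BY customer_name;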

This kind of feedback not only improves the quality of the code but also helps developers learn better practices over time without waiting for human reviewers to step in.

Tailoring the Response to Your Needs

To focus solely on reviewing SQL code, we can simply revise the prompt sent to OpenAI. By tailoring the instructions to limit the review scope to SQL logic and best practices, we ensure that the feedback remains targeted and relevant.

Below is the revised prompt used:

"You are a senior SQL expert and code reviewer. "
"When given a SQL query, provide a detailed and structured review focusing only on the SQL logic. "
"Break your feedback into the following sections:"
"1. SQL Issues Identified - List any logic errors, performance bottlenecks, poor formatting, or bad practices."
"2. Suggested Fixes - Provide the corrected SQL query, with improvements for readability, logic correctness, or efficiency."
"3. Brief Explanation - Explain why the original query needed changes and what the new version improves."
"Be concise, clear, and assume the reviewer has working SQL knowledge."

And here is the review comment generated by OpenAI in response:

SQL Issues

This demonstrates how easily we can tailor prompts to fit specific review requirements like SQL, Python, or even documentation.

Wrapping Up

This setup doesn’t replace human reviewers; it helps them. It ensures that every PR, no matter how small or rushed, gets a consistent, automated review that adds real value. Reviewers can focus on what matters most. Developers get feedback instantly. Teams reduce errors and build confidence in their codebase.
