My current AI Code Review Workflow
In academia, my coding projects were typically solo affairs, and my GitHub repos reflect this. Outside of academia, however, we work in teams, which means that future me will not be the only victim of code my past self has written. Thus, code reviews are necessary.
While code review services most likely already exist, I prefer building my own script that leverages LLMs, to avoid unnecessary costs. Why hire a middleman when you can do it yourself? I’ll pay for a middleman for help outside my domain, but not for scripting!
So I set out to make myself a script, with the help of AI, that uses LLMs to generate code reviews. That way, most of my embarrassing code mistakes stay local, away from the eyes of my colleagues.
Thankfully, one of my current colleagues, eek, had already been using Claude 3.7 Sonnet for code reviews and suggested it to me. The workflow: get the diff, add a prompt, and paste both into the web interface so that it stays free.
Radu (aka eek) was one of my colleagues at Jellysmack and was on the first team of my very first job. People on that team were very collaborative and generally nice to work with. They’re very much aware that I came from academia and are quite patient about teaching me the unwritten tips of the industry.
So anyway, without further ado, here’s the journey to get to that script.
Simple One-Liner Script
The diff between my branch and master in my local repo can be obtained with `git diff master...HEAD`. Append `| pbcopy` (on Mac) to that command and the result goes straight to the clipboard.
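Put together, the whole “script” is just this one-liner (assuming macOS, where `pbcopy` is available):

```bash
# Copy the diff between master and the current branch to the clipboard (macOS).
git diff master...HEAD | pbcopy
```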
The next step is to go to the Claude web interface, paste it and add the prompt which eek generously gave me:
Check the changes and compare with the style of the repo, are standards respected, is the code DRY, are there any obvious bugs? Any potential improvements?
From there, Claude would critique the code that I wrote. And it’s been quite useful!
One-Stop Script
While using this simple one-liner, I found that I sometimes needed to compare against a commit that isn’t pointed to by master. Plus, copy-pasting the prompt and then the diff into the web interface took a bunch of clicks that I knew could be avoided.
OpenRouter to the Rescue
Hence, I decided I needed one code review script to rule them all. For simplicity, I went with APIs instead of writing a bot that would copy-paste stuff into Claude. The question was: which service should I use?
Since I prefer my API credits prepaid, to remove any possibility of a bug emptying my bank account, I decided to use my OpenRouter API key for this. Not only does OpenRouter let you prepay for credits so that you never go over, it also lets you choose the model.
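The script further down reads the key from an environment variable, so the only setup needed (a minimal sketch, assuming a typical shell profile) is to export it once:

```bash
# The code review script below expects the key in OPENROUTER_API_KEY.
export OPENROUTER_API_KEY="your-key-here"
```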
Money Matters
Claude 3.7 Sonnet currently costs me around $0.01 per code review. That’s not much if you’re only doing one review per branch, but for a cheapskate like me, it feels like it could add up. So I decided that my one-stop script should also let me choose the model.
I settled on a choice between Claude’s Sonnet and Google’s Gemini Flash. The latter is way cheaper and apparently not that bad at programming tasks, so I put it in as an option. I asked Gemini Flash for a code review and got it for $0.0001. Or 0.01 cents.
Finally, I decided that sometimes even saving that $0.0001 by pasting into the free Claude web interface would be worth it. So I also wanted the script to be able to simply put the prompt and the diff together on the clipboard.
Final Script
I put the script in my scripts folder, put that folder on my PATH, and now I can run it as `code-review`.
- Using `code-review` copies the prompt and the diff, ready to be pasted into the Claude web interface.
- Using `code-review --ai` asks Gemini Flash 2.0 (by default) for a code review. Once received, it copies the review onto my clipboard and also prints it to the terminal.
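For completeness, here’s roughly the shell setup that makes that work. The folder name and the `#!/usr/bin/env python3` shebang are assumptions about my layout, not something the script itself dictates:

```bash
# Hypothetical setup: the script saved as ~/scripts/code-review,
# with "#!/usr/bin/env python3" as its first line.
chmod +x ~/scripts/code-review       # make the script executable
export PATH="$HOME/scripts:$PATH"    # put the scripts folder on PATH (e.g. in ~/.zshrc)
```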
I invite you to read the code and modify it as you wish! It was generated by me giving an LLM the proper prompts. Once the LLM gave me code to start with, I got my hands dirty and modified most of it myself, since I figured that would be easier than a continuous back-and-forth.
I’m still working out the best way to incorporate LLMs and AI into my workflow, but I imagine this is a good start. Here is the script.
```python
import requests
import json
import subprocess
import argparse
import os

PROMPT = "Check the changes and compare with the style of the repo, are standards respected, is the code DRY, are there any obvious bugs? Any potential improvements? Be brief."


def get_git_diff(commit="master"):
    """
    Retrieves the git diff between the specified commit and HEAD.

    Args:
        commit (str, optional): The commit ID to compare against. Defaults to "master".

    Returns:
        str: The git diff output, or None if an error occurred.
    """
    try:
        process = subprocess.run(
            ["git", "diff", f"{commit}...HEAD"],
            capture_output=True,
            text=True,
            check=True,  # Raise an exception if the command fails
        )
        return process.stdout
    except subprocess.CalledProcessError as e:
        print(f"Error getting git diff: {e}")
        print(f"Stderr: {e.stderr}")  # Print the actual error message from git
        return None
    except FileNotFoundError:
        print("Error: Git not found. Make sure Git is installed and in your PATH.")
        return None


def query_openrouter(diff_content, model=None):
    """
    Queries the OpenRouter API with the review prompt and the provided diff.

    Args:
        diff_content (str): The git diff content.
        model (str, optional): "gemini" or "claude". Defaults to "gemini".

    Returns:
        str: The content of the response from OpenRouter, or None on error.
    """
    try:
        api_key = os.environ.get("OPENROUTER_API_KEY")
        if not api_key:
            print("Error: OPENROUTER_API_KEY environment variable not set.")
            return None

        if model is None:
            model = "gemini"
        if model == "gemini":
            model_name = "google/gemini-2.0-flash-001"
        elif model == "claude":
            model_name = "anthropic/claude-3.7-sonnet:beta"
        else:
            print(f"Error: Invalid model: {model}. Please use 'gemini' or 'claude'.")
            return None

        response = requests.post(
            url="https://openrouter.ai/api/v1/chat/completions",
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
            data=json.dumps({
                "model": model_name,
                "messages": [
                    {
                        "role": "user",
                        "content": [
                            {
                                "type": "text",
                                "text": f"{PROMPT} {diff_content}",
                            }
                        ],
                    }
                ],
            }),
        )
        response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
        return response.json()["choices"][0]["message"]["content"]
    except requests.exceptions.RequestException as e:
        print(f"Error connecting to OpenRouter: {e}")
        return None
    except json.JSONDecodeError:
        print("Error decoding JSON response.")
        return None
    except KeyError:
        print("Error: Unexpected JSON response format from OpenRouter.")
        print(response.text)  # Print the raw response to help debug
        return None


def copy_to_clipboard(content):
    """
    Copies the given content to the clipboard using pbcopy (macOS).

    Args:
        content (str): The content to copy.

    Returns:
        None
    """
    try:
        subprocess.run(["pbcopy"], input=content.encode())
    except FileNotFoundError:
        print("Error: pbcopy not found. Please install it if needed.")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Get git diff and query OpenRouter.")
    parser.add_argument("--commit", type=str, default="master", help="The commit ID to compare against.")
    parser.add_argument("--model", type=str, help="The model to use: 'gemini' or 'claude'. Only relevant with --ai.")
    parser.add_argument("--ai", action="store_true", help="Allow this script to use the OpenRouter API, which is paid.")
    args = parser.parse_args()

    diff = get_git_diff(args.commit)
    if not diff:
        print("Failed to get git diff.")
    elif not args.ai:
        # Free path: put the prompt and the diff on the clipboard, ready for the web interface.
        copy_to_clipboard(PROMPT + " " + diff)
        print("Prompt and diff copied to clipboard.")
    else:
        # Paid path: query_openrouter prepends the prompt itself, so pass only the diff.
        response_content = query_openrouter(diff, model=args.model)
        if response_content:
            copy_to_clipboard(response_content)
            print(response_content)
        else:
            print("Failed to get response from OpenRouter.")
```
Conclusion
I feel very lucky to be working with more or less the same team again here in IJW. My teammates are collaborative and are always open and understanding.
Unfortunately, that isn’t the culture of every programming team. Some people won’t put any weight on your experience as a researcher. Some go further and lambast you for not knowing something instead of patiently teaching you. And some keep their strategies to themselves so that only they look good. Instead of letting you in on some use cases for AI, they’d use it to leave AI-generated comments on your PRs that border on pedantry. I hope not to end up like those types of people.
This is why I chose to work with my current team: collaborative, understanding, and patient. I feel like I learn something new and useful every day. So much, in fact, that I want to share it all here. Time and energy are the only limits!