Gmail API Fundamentals
Gmail Cleanup — Practical Scripting
Course 1 · Chapter 3 · Gmail API Fundamentals
Now that you have credentials and configuration set up, it's time to connect to Gmail and start querying emails. This chapter covers how the Gmail API is structured, how to authenticate, how to query emails, and how to parse the responses.
🔌 Connecting to Gmail
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build


def get_gmail_service():
    """Authenticate and return Gmail service."""
    SCOPES = ['https://www.googleapis.com/auth/gmail.modify']
    creds = None
    # Load existing token if available
    try:
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    except (FileNotFoundError, ValueError):
        # No cached token yet, or the file is malformed: fall through to OAuth
        pass
    # If no valid credentials, refresh them or perform the OAuth flow
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES
            )
            creds = flow.run_local_server(port=0)
        # Save token for next time (only when it was refreshed or newly created)
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    return build('gmail', 'v1', credentials=creds)
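A quick way to confirm that the service object works is to request the account profile. The sketch below assumes `service` is the object returned by `get_gmail_service()`; the helper name `check_connection` is hypothetical and not part of the course code. It relies on the `users().getProfile` method, whose response includes `emailAddress` and `messagesTotal`.

```python
def check_connection(service) -> dict:
    """Fetch the authenticated user's profile as a connectivity check.

    `check_connection` is an illustrative helper, not a Gmail API function.
    """
    profile = service.users().getProfile(userId="me").execute()
    return {
        "email": profile.get("emailAddress"),
        "total_messages": profile.get("messagesTotal", 0),
    }
```

Calling `check_connection(get_gmail_service())` should return the mailbox address and message count if authentication succeeded; an exception here usually points to a credentials or scope problem.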
🔍 Querying Emails
Gmail Search Syntax
| Query | Meaning |
|---|---|
| before:2020/01/01 | Emails dated before January 1, 2020 |
| from:sender@example.com | Emails from a specific sender |
| filename:pdf | Emails with PDF attachments |
| larger:10M | Emails larger than 10 MB |
| has:attachment | Emails with attachments |
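Search operators can be combined in one query string; terms separated by spaces must all match. The following is a minimal sketch of a query builder. The function name `build_query` and its parameters are assumptions made for illustration, and it uses Gmail's `larger:` operator for the size filter.

```python
from datetime import date
from typing import Optional


def build_query(
    before: Optional[date] = None,
    sender: Optional[str] = None,
    has_attachment: bool = False,
    larger_than_mb: Optional[int] = None,
) -> str:
    """Combine search operators into one Gmail query string (terms are ANDed)."""
    terms = []
    if before is not None:
        # Gmail expects dates as YYYY/MM/DD
        terms.append(f"before:{before.strftime('%Y/%m/%d')}")
    if sender:
        terms.append(f"from:{sender}")
    if has_attachment:
        terms.append("has:attachment")
    if larger_than_mb is not None:
        terms.append(f"larger:{larger_than_mb}M")
    return " ".join(terms)


query = build_query(
    before=date(2020, 1, 1),
    sender="sender@example.com",
    has_attachment=True,
    larger_than_mb=10,
)
print(query)
# → before:2020/01/01 from:sender@example.com has:attachment larger:10M
```

The resulting string can be passed as the `q` parameter of `messages().list`.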
Query Example
from datetime import datetime, timedelta


def find_old_emails(service, days_old: int) -> list:
    """Find emails older than N days (first page of results only)."""
    cutoff_date = (datetime.now() - timedelta(days=days_old)).strftime('%Y/%m/%d')
    query = f"before:{cutoff_date}"
    try:
        results = service.users().messages().list(
            userId='me',
            q=query,
            maxResults=10  # return at most 10 messages
        ).execute()
        # Each entry holds only 'id' and 'threadId'; the key is absent when nothing matches
        messages = results.get('messages', [])
        print(f"✅ Found {len(messages)} emails older than {days_old} days")
        return messages
    except Exception as e:
        print(f"❌ Error: {e}")
        return []
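A single `list` call returns only one page of results. When more messages match, the response carries a `nextPageToken` that can be sent back as `pageToken` to fetch the next page. Here is a sketch of that loop, assuming `service` is an authenticated Gmail service; the name `list_message_ids` and the `limit` parameter are hypothetical.

```python
from typing import List


def list_message_ids(service, query: str, limit: int = 500) -> List[str]:
    """Collect up to `limit` message IDs for a query, following page tokens."""
    ids: List[str] = []
    page_token = None
    while len(ids) < limit:
        kwargs = {
            "userId": "me",
            "q": query,
            # The API allows at most 500 results per page
            "maxResults": min(500, limit - len(ids)),
        }
        if page_token:
            kwargs["pageToken"] = page_token
        response = service.users().messages().list(**kwargs).execute()
        ids.extend(m["id"] for m in response.get("messages", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            break  # no further pages
    return ids[:limit]
```

Capping the total with `limit` keeps an overly broad query from walking the entire mailbox.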
📨 Parsing Email Details
from typing import Optional


def get_email_details(service, message_id: str) -> Optional[dict]:
    """Get details about a single email, or None if the request fails."""
    try:
        message = service.users().messages().get(
            userId='me',
            id=message_id
        ).execute()
        # Map lowercased header names to values; header names are case-insensitive
        headers = {h['name'].lower(): h['value'] for h in message['payload']['headers']}
        # Use fallbacks so a missing header (e.g. no Subject) does not raise
        subject = headers.get('subject', '(no subject)')
        sender = headers.get('from', '(unknown sender)')
        date_str = headers.get('date', '')
        size = int(message.get('sizeEstimate', 0))
        return {
            'id': message_id,
            'subject': subject,
            'from': sender,
            'date': date_str,
            'size_bytes': size,
            'size_mb': size / 1024 / 1024
        }
    except Exception as e:
        print(f"❌ Error: {e}")
        return None
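The `Date` header arrives as a string, which makes sorting and comparing emails awkward. The sketch below shows one way to look up a header case-insensitively and convert the date with the standard library's `email.utils.parsedate_to_datetime`. The helper names `header_value` and `parse_date` and the `sample` message are illustrative assumptions, shaped like a `messages().get` response.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


def header_value(message: dict, name: str, default: str = "") -> str:
    """Return the first header matching `name` (case-insensitive), else `default`."""
    headers = message.get("payload", {}).get("headers", [])
    for header in headers:
        if header.get("name", "").lower() == name.lower():
            return header.get("value", default)
    return default


def parse_date(message: dict) -> Optional[datetime]:
    """Parse the Date header into a datetime, or None if missing or malformed."""
    raw = header_value(message, "Date")
    if not raw:
        return None
    try:
        return parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError
        return None


# Hypothetical message shaped like a Gmail API response
sample = {
    "id": "abc123",
    "sizeEstimate": 2048,
    "payload": {
        "headers": [
            {"name": "From", "value": "sender@example.com"},
            {"name": "Date", "value": "Wed, 01 Jan 2020 10:00:00 +0000"},
        ]
    },
}

print(header_value(sample, "subject", "(no subject)"))  # → (no subject)
print(parse_date(sample).year)  # → 2020
```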
💻 Coding Challenges
Challenge 1: Connect to Gmail API
Create gmail_utils.py with:
- Write a get_gmail_service() function
- Handle OAuth authentication and token caching
- Return authenticated Gmail service object
Goal: Master Gmail API authentication.
Challenge 2: Query Emails
Create functions to find:
- Emails older than N days
- Emails from specific sender
- Emails with attachments larger than N MB
Goal: Practice Gmail search queries.
Challenge 3: Parse and Report
Create a script that:
- Finds 10 old emails
- Gets details for each (subject, sender, date, size)
- Displays them in a table
- Calculates total storage
Goal: End-to-end email discovery and reporting.
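As a starting point for the reporting step, here is one possible way to lay out a fixed-width table and sum the sizes. It assumes each entry is a dict with the `subject`, `from`, and `size_bytes` keys returned by `get_email_details`. The function name `format_report` and the column widths are arbitrary choices, and the date column is left out for brevity.

```python
from typing import Dict, List


def format_report(emails: List[Dict]) -> str:
    """Render email details as a fixed-width table with a storage total."""
    lines = [f"{'Subject':<30} {'From':<25} {'Size (MB)':>10}"]
    lines.append("-" * 67)  # 30 + 25 + 10 columns plus two separating spaces
    for email in emails:
        size_mb = email["size_bytes"] / 1024 / 1024
        # Truncate long fields so the columns stay aligned
        lines.append(
            f"{email['subject'][:30]:<30} {email['from'][:25]:<25} {size_mb:>10.2f}"
        )
    total_mb = sum(e["size_bytes"] for e in emails) / 1024 / 1024
    lines.append(f"Total: {len(emails)} emails, {total_mb:.2f} MB")
    return "\n".join(lines)


# Hypothetical sample data
report = format_report([
    {"subject": "Quarterly report", "from": "a@example.com", "size_bytes": 1048576},
    {"subject": "Photos", "from": "b@example.com", "size_bytes": 524288},
])
print(report)
```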