
From 48 Hours to 2 - How We Compressed Insurance Quote Generation Using Python + BigQuery

Usama Navid
Last updated: September 14, 2025

Two months ago, a commercial insurance brokerage came to us with a critical problem. Generating a single insurance quote took their team 48 hours of manual work.

For a complex commercial property quote, underwriters had to collect client data, research the property, assess the risk, shop the market, and assemble the proposal entirely by hand (the hour-by-hour breakdown is below).

By the time they delivered a quote, prospects had often already chosen a competitor.

Today, the same process takes 2 minutes. Fully automated. Zero manual data entry. 97% accuracy.

Here’s exactly how we built it.

The Insurance Quote Problem

Insurance quoting is uniquely complex:

Data Intensity

For a single commercial property, the inputs span the building itself, its location and hazards, claims history, and the requested coverage.

Total data points: 200+ per quote

Multiple External Data Sources

Complex Calculations

Premium isn’t just:

Coverage Amount × Rate = Premium

It’s actually:

Base Rate
× Location Modifier (from 0.75 to 3.5x)
× Property Type Modifier (from 0.8 to 4.2x)
× Construction Modifier (from 0.9 to 2.8x)
× Protection Class (from 0.7 to 2.1x)
× Occupancy Factor (from 0.85 to 3.9x)
× Claims History Modifier (from 0.6 to 5.0x)
× Coverage Limit Factor
× Deductible Credit
+ Territory Surcharge
+ Catastrophe Surcharge
+ Various Fees
- Available Discounts
= Final Premium

And that’s just for property coverage. Add liability, business interruption, equipment breakdown, cyber coverage—each with its own formulas.
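To make the modifier chain above concrete, here is a minimal pure-Python sketch (not the production engine; the example values are illustrative) that multiplies the base rate through each modifier, applies the rate per $100 of coverage, then adds the flat surcharges and fees and subtracts discounts:

```python
from functools import reduce

def calculate_property_premium(base_rate, coverage_amount, modifiers,
                               surcharges, fees, discounts):
    """Sketch of the multiplicative rating chain described above.

    `modifiers` is a dict of multiplicative factors (location, construction,
    protection class, ...); surcharges, fees, and discounts are flat dollars.
    """
    # Multiply the base rate through every modifier in the chain
    adjusted_rate = reduce(lambda rate, m: rate * m, modifiers.values(), base_rate)
    # Apply the rate per $100 of coverage (a common rating convention)
    premium = adjusted_rate * coverage_amount / 100
    # Flat adjustments come after the multiplicative chain
    return premium + sum(surcharges) + sum(fees) - sum(discounts)

quote = calculate_property_premium(
    base_rate=0.45,
    coverage_amount=2_000_000,
    modifiers={'location': 1.2, 'construction': 1.1, 'protection': 0.9,
               'occupancy': 1.0, 'claims_history': 0.8},
    surcharges=[500],   # territory + catastrophe, combined for the example
    fees=[150],
    discounts=[300],
)
```

The point of the chain is that modifiers compound: a 1.2x location on top of a 1.1x construction factor is a 32% uplift, not 30%.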

Carrier Complexity

Each insurance carrier maintains its own rate tables, rating factors, underwriting guidelines, and appetite rules for which risks it will write.

Finding the best carrier for a specific risk requires understanding all of these nuances.

Time Sensitivity

Insurance quotes expire. Market conditions change. The longer it takes to quote, the less likely you are to win the business.

Industry standard: 3-5 days for commercial quotes
Fast brokers: 24-48 hours
Our system: 2 minutes

The Manual Process (48 Hours)

Let’s break down what the team was doing:

Hour 0-4: Initial Data Collection

Underwriter tasks:

  1. Send detailed questionnaire to client (47 questions)

  2. Request additional documents:

    • Property appraisals
    • Building plans
    • Financial statements
    • Loss runs (claims history)
    • Current policies
  3. Follow up when client is slow to respond

  4. Review submissions for completeness

Problems:

Hour 4-16: Property Research

Underwriter tasks:

  1. Look up property details in databases
  2. Verify square footage, year built, construction type
  3. Research location risk factors
  4. Check flood zone status
  5. Look up fire protection class
  6. Review crime statistics in area
  7. Check for proximity to hazards (coastline, earthquake zones)

Problems:

Hour 16-24: Risk Assessment

Underwriter tasks:

  1. Analyze claims history
  2. Calculate loss ratios
  3. Assess risk category
  4. Determine underwriting tier
  5. Identify potential red flags
  6. Document risk assessment rationale

Problems:

Hour 24-36: Market Shopping

Underwriter tasks:

  1. Identify appropriate carriers for this risk
  2. Pull current rate tables (often in PDF format)
  3. Calculate premiums for each carrier
  4. Consider carrier appetite and likelihood to quote
  5. Account for market conditions
  6. Factor in special programs or discounts

Problems:

Hour 36-44: Quote Preparation

Underwriter tasks:

  1. Select best 3-5 carrier options
  2. Format quotes in proposal template
  3. Write explanatory notes
  4. Double-check all numbers
  5. Add recommendations
  6. Get senior underwriter review

Problems:

Hour 44-48: QA and Delivery

Underwriter tasks:

  1. Senior review of calculations
  2. Fix any identified errors
  3. Final formatting
  4. Convert to PDF
  5. Email to client
  6. Log in CRM

By this time: 48 hours have passed

Often, the prospect has already received quotes from competitors and made a decision.

The Automated Solution

We built a system that compresses 48 hours into 2 minutes.

Architecture Overview

Client Submission (web form)
    ↓
Data Enrichment Layer (APIs + Web Scraping)
    ↓
Risk Assessment Engine (Python + ML)
    ↓
Premium Calculation Engine (BigQuery)
    ↓
Carrier Matching Algorithm
    ↓
Quote Generation (Automated)
    ↓
Delivered to Client & Broker

Component 1: Smart Data Collection

The Problem: Traditional questionnaires ask for 47 data points upfront.

The Solution: Progressive disclosure + intelligent defaults.

New Web Form:

// Stage 1: Minimum Viable Data (5 questions)
1. Property address
2. Building type (dropdown)
3. Approximate square footage
4. Desired coverage amount
5. Target bind date
// Everything else auto-populated or calculated
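One way to implement the "intelligent defaults" half of this design is sketched below. `build_submission` and the field names are hypothetical, but the rule is the one described: anything the client actually answers wins, and the enrichment layer fills in the rest.

```python
def build_submission(user_answers, enriched_defaults):
    """Merge the few user-supplied answers over auto-populated defaults.

    Hypothetical sketch: a field the client answered always overrides the
    enriched value; unanswered fields (None) fall back to enrichment.
    """
    submission = dict(enriched_defaults)
    # User-provided answers always override enriched/defaulted values
    submission.update({k: v for k, v in user_answers.items() if v is not None})
    return submission

submission = build_submission(
    user_answers={'square_footage': 12000, 'construction_type': None},
    enriched_defaults={'square_footage': 11500, 'construction_type': 'masonry',
                       'year_built': 1995},
)
# square_footage comes from the client; construction_type falls back to enrichment
```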

Enrichment Process:

def enrich_property_data(address):
    # Step 1: Geocode address
    geocode = google_maps_api.geocode(address)
    lat, lon = geocode['lat'], geocode['lon']

    # Step 2: Property data from CoreLogic
    property_data = corelogic_api.get_property_details(address)
    # Returns: sq ft, year built, construction type, etc.

    # Step 3: Risk factors
    flood_zone = fema_api.get_flood_zone(lat, lon)
    fire_protection = iso_api.get_protection_class(lat, lon)
    crime_score = fbi_api.get_crime_index(lat, lon)

    # Step 4: Comparable properties
    comparables = corelogic_api.get_comparable_sales(address, radius=1)

    # Step 5: Market data
    market_rates = verisk_api.get_market_rates(
        zip_code=geocode['zip'],
        property_type=property_data['type']
    )

    return {
        'property': property_data,
        'location_risk': {
            'flood': flood_zone,
            'fire_protection': fire_protection,
            'crime': crime_score
        },
        'market': market_rates,
        'comparables': comparables
    }

Result:

Component 2: Risk Assessment Engine

The Problem: Risk assessment is subjective and inconsistent.

The Solution: Machine learning model trained on 10,000+ historical quotes.

Training Data:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Historical quotes with outcomes
training_data = pd.DataFrame({
    'property_age': [...],
    'construction_type': [...],
    'location_risk_score': [...],
    'claims_last_5_years': [...],
    'coverage_to_value_ratio': [...],
    # ... 50+ features
    'outcome': ['approved', 'declined', 'approved', ...]
})

# Train model on every column except the label
features = [col for col in training_data.columns if col != 'outcome']
model = RandomForestClassifier(n_estimators=200)
model.fit(training_data[features], training_data['outcome'])

Real-Time Scoring:

def assess_risk(property_data):
    # Extract a flat feature vector in the same order used for training
    features = extract_features(property_data)

    # Predict risk tier (predict expects a 2D array: one row per sample)
    risk_tier = model.predict([features])[0]
    confidence = model.predict_proba([features])[0]

    # Generate human-readable risk factors
    risk_factors = []
    if property_data['age'] > 50:
        risk_factors.append({
            'factor': 'Building Age',
            'severity': 'Medium',
            'impact': '+15% premium',
            'mitigation': 'Recent roof replacement reduces impact'
        })
    if property_data['flood_zone'] in ['A', 'V']:
        risk_factors.append({
            'factor': 'Flood Zone',
            'severity': 'High',
            'impact': '+50% premium or declination',
            'mitigation': 'Elevation certificate may improve rating'
        })

    # Calculate expected loss ratio
    expected_loss_ratio = calculate_loss_ratio(property_data, risk_tier)

    return {
        'tier': risk_tier,
        'confidence': confidence,
        'risk_factors': risk_factors,
        'expected_loss_ratio': expected_loss_ratio
    }

Accuracy:

Component 3: Premium Calculation at Scale

The Problem: Calculating premiums for 8 carriers with complex formulas is time-consuming and error-prone.

The Solution: BigQuery-based calculation engine.

Rate Tables in BigQuery:

-- Table: carrier_rates
CREATE TABLE carrier_rates (
    carrier_id STRING,
    carrier_name STRING,
    state STRING,
    property_type STRING,
    construction_class STRING,
    protection_class INT64,
    coverage_tier STRING,
    base_rate FLOAT64,
    effective_date DATE,
    expiration_date DATE
);

-- Table: rating_factors
CREATE TABLE rating_factors (
    carrier_id STRING,
    factor_type STRING,  -- 'age', 'claims', 'location', etc.
    factor_value STRING,
    modifier FLOAT64,
    description STRING
);

-- Table: carrier_appetite
CREATE TABLE carrier_appetite (
    carrier_id STRING,
    property_type STRING,
    min_coverage INT64,
    max_coverage INT64,
    excluded_states ARRAY<STRING>,
    risk_tier_max STRING,
    active BOOLEAN
);
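For sanity-checking appetite rules outside BigQuery, here is a pure-Python mirror of the `carrier_appetite` WHERE clause used below (the dict-based row format and example values are assumptions for illustration):

```python
def is_eligible(appetite, property_type, coverage_amount, state, risk_tier):
    """Mirror of the carrier_appetite filter, for one table row as a dict.

    Assumes risk tiers sort lexicographically ('A' best), so
    risk_tier_max >= risk_tier means the tier is within appetite.
    """
    return (
        appetite['active']
        and appetite['property_type'] == property_type
        and appetite['min_coverage'] <= coverage_amount <= appetite['max_coverage']
        and state not in appetite['excluded_states']
        and appetite['risk_tier_max'] >= risk_tier
    )

# Hypothetical appetite row for one carrier
office_appetite = {
    'carrier_id': 'CARRIER_01',
    'property_type': 'office',
    'min_coverage': 500_000,
    'max_coverage': 10_000_000,
    'excluded_states': ['FL', 'LA'],
    'risk_tier_max': 'C',
    'active': True,
}
```

Under these assumptions, a $2M office risk in TX at tier B is in appetite, while the same risk in FL (excluded state) or at $100K coverage (below minimum) is not.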

Premium Calculation Query:

WITH base_rates AS (
    SELECT
        cr.carrier_id,
        cr.carrier_name,
        cr.base_rate,
        -- Apply the modifier chain: location, then construction, then
        -- protection class (extend the chain for the remaining factors)
        cr.base_rate
            * loc.modifier      -- location modifier
            * const.modifier    -- construction modifier
            * prot.modifier     -- protection class modifier
            AS fully_adjusted_rate
    FROM carrier_rates cr
    JOIN rating_factors loc
        ON cr.carrier_id = loc.carrier_id
        AND loc.factor_type = 'location'
        AND loc.factor_value = @location_zone
    JOIN rating_factors const
        ON cr.carrier_id = const.carrier_id
        AND const.factor_type = 'construction'
        AND const.factor_value = @construction_type
    JOIN rating_factors prot
        ON cr.carrier_id = prot.carrier_id
        AND prot.factor_type = 'protection_class'
        AND prot.factor_value = @protection_class
    -- ... more joins for the remaining factors
    WHERE cr.state = @property_state
        AND cr.property_type = @property_type
        AND cr.effective_date <= CURRENT_DATE()
        AND cr.expiration_date >= CURRENT_DATE()
),
carrier_eligibility AS (
    SELECT carrier_id
    FROM carrier_appetite
    WHERE property_type = @property_type
        AND @coverage_amount BETWEEN min_coverage AND max_coverage
        AND NOT (@property_state IN UNNEST(excluded_states))
        AND risk_tier_max >= @risk_tier
        AND active = TRUE
),
final_premiums AS (
    SELECT
        br.carrier_id,
        br.carrier_name,
        br.fully_adjusted_rate * @coverage_amount / 100 AS annual_premium,
        ROUND(br.fully_adjusted_rate * @coverage_amount / 100 / 12, 2) AS monthly_premium,
        -- Calculate with different deductibles
        ARRAY_AGG(STRUCT(
            deductible,
            (br.fully_adjusted_rate * dc.credit_factor) * @coverage_amount / 100
                AS premium_with_deductible
        )) AS deductible_options
    FROM base_rates br
    JOIN carrier_eligibility ce
        ON br.carrier_id = ce.carrier_id
    CROSS JOIN UNNEST([1000, 2500, 5000, 10000]) AS deductible
    LEFT JOIN deductible_credits dc
        ON br.carrier_id = dc.carrier_id
        AND dc.deductible_amount = deductible
    GROUP BY 1, 2, 3, 4
)
SELECT *
FROM final_premiums
ORDER BY annual_premium ASC
LIMIT 5;  -- Top 5 competitive quotes
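The deductible `ARRAY_AGG` is easy to reason about in plain Python. This illustrative sketch (the `credit_table` values are made up) produces the same per-deductible premiums: the rate is scaled by a credit factor, then applied per $100 of coverage.

```python
def deductible_options(fully_adjusted_rate, coverage_amount, credit_table,
                       deductibles=(1000, 2500, 5000, 10000)):
    """Pure-Python version of the deductible aggregation above (illustrative).

    `credit_table` maps deductible amount -> rate credit factor; a factor
    below 1.0 means a lower rate for accepting a higher deductible.
    """
    return [
        {
            'deductible': d,
            # Rate applies per $100 of coverage, matching the SQL
            'premium': round(fully_adjusted_rate * credit_table.get(d, 1.0)
                             * coverage_amount / 100, 2),
        }
        for d in deductibles
    ]

options = deductible_options(
    fully_adjusted_rate=0.50,
    coverage_amount=1_000_000,
    credit_table={1000: 1.0, 2500: 0.97, 5000: 0.93, 10000: 0.88},
)
```

With these assumed credits, moving from a $1,000 to a $10,000 deductible cuts the annual premium from $5,000 to $4,400.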

Execution Time:

Component 4: Intelligent Carrier Matching

The Problem: Not all carriers will accept all risks. Sending to the wrong carrier wastes time.

The Solution: ML-based carrier matching algorithm.

def match_carriers(property_data, risk_assessment):
    """Score and rank carriers; every component score is on a 0-1 scale
    before weighting, so the weights are directly comparable."""
    # Get all active carriers
    carriers = get_active_carriers()
    scored_carriers = []
    for carrier in carriers:
        # Appetite match (strict rules): skip entirely if outside appetite
        if not meets_carrier_appetite(carrier, property_data):
            continue

        score = 0

        # Historical acceptance rate (0-1)
        similar_risks = get_similar_historical_submissions(
            carrier_id=carrier.id,
            property_data=property_data
        )
        acceptance_rate = calculate_acceptance_rate(similar_risks)
        score += acceptance_rate * 40  # 40% weight

        # Competitiveness (0-1): do they typically offer good rates for this risk?
        competitiveness = calculate_carrier_competitiveness(
            carrier,
            property_data,
            risk_assessment
        )
        score += competitiveness * 30  # 30% weight

        # Response time: normalize to 0-1 (5+ days scores zero)
        avg_response_time = carrier.metrics['avg_response_days']
        timeliness_score = max(0, 1 - avg_response_time / 5)
        score += timeliness_score * 20  # 20% weight

        # Relationship strength (0-1): do we do a lot of business with them?
        volume_score = calculate_volume_score(carrier)
        score += volume_score * 10  # 10% weight

        scored_carriers.append({
            'carrier': carrier,
            'score': score,
            'acceptance_probability': acceptance_rate,
            'expected_premium': estimate_premium(carrier, property_data)
        })

    # Sort by score, return top 8
    scored_carriers.sort(key=lambda x: x['score'], reverse=True)
    return scored_carriers[:8]

Accuracy:

Component 5: Automated Quote Generation

The Problem: Creating professional proposals takes 2-4 hours of formatting and writing.

The Solution: Template-based generation with AI-enhanced explanations.

def generate_quote_document(property_data, premiums, risk_assessment):
    # Base template
    template = load_template('commercial_property_quote.html')

    # Populate basic data
    template.set('client_name', property_data['client_name'])
    template.set('property_address', property_data['address'])
    template.set('coverage_amount', format_currency(property_data['coverage']))

    # Add carrier options
    for quote in premiums:
        template.add_quote_option(
            carrier=quote['carrier_name'],
            annual_premium=format_currency(quote['annual_premium']),
            monthly_premium=format_currency(quote['monthly_premium']),
            deductible_options=quote['deductible_options'],
            coverage_details=quote['coverage_details']
        )

    # Generate AI explanation
    explanation = generate_explanation_with_gpt(
        property_data,
        risk_assessment,
        premiums
    )
    template.set('underwriting_notes', explanation)

    # Add recommendations
    recommendation = generate_recommendation(premiums, risk_assessment)
    template.set('broker_recommendation', recommendation)

    # Generate PDF
    pdf = template.render_to_pdf()
    return pdf

AI-Generated Explanation Example:

Underwriting Assessment:
This property presents a moderate risk profile. The building's
age (built 1995) is within acceptable parameters, and recent
renovations (roof replacement in 2022) significantly improve
the risk quality.

Key Risk Factors:
- Location: Coastal zone increases windstorm exposure (+15% premium)
- Mitigation: Hurricane shutters and impact windows reduce risk
- Protection Class: 3 (excellent fire protection) provides rate credit

Claims History:
No losses reported in the past 5 years. This clean history
qualifies for a preferred tier rating and claims-free discount.

Market Conditions:
Current coastal property rates are up 12% year-over-year due to
increased catastrophe frequency. However, this property's strong
risk profile makes it attractive to multiple carriers.

Recommendation:
Carrier B offers the best combination of coverage, price, and
financial strength. Their coastal property program includes
valuable additional coverages at no extra charge.

Generation Time:

Complete Workflow Execution

import time

def generate_insurance_quote(client_submission):
    """
    Complete end-to-end quote generation
    Target time: Under 2 minutes
    """
    start = time.monotonic()

    # Step 1: Enrich property data (30 seconds)
    print("Enriching property data...")
    property_data = enrich_property_data(client_submission['address'])

    # Step 2: Assess risk (5 seconds)
    print("Assessing risk...")
    risk_assessment = assess_risk(property_data)

    # Step 3: Match carriers (10 seconds)
    print("Matching carriers...")
    target_carriers = match_carriers(property_data, risk_assessment)

    # Step 4: Calculate premiums (15 seconds)
    print("Calculating premiums...")
    premiums = calculate_all_premiums(
        property_data,
        risk_assessment,
        target_carriers
    )

    # Step 5: Generate quote document (15 seconds)
    print("Generating quote document...")
    quote_pdf = generate_quote_document(
        property_data,
        premiums,
        risk_assessment
    )

    # Step 6: Deliver quote (5 seconds)
    print("Delivering quote...")
    send_quote_to_client(client_submission['email'], quote_pdf)
    log_to_crm(client_submission, quote_pdf, premiums)

    elapsed_time = round(time.monotonic() - start, 1)
    print(f"Quote generated in {elapsed_time} seconds")
    return quote_pdf

Average Execution Time: 85 seconds

The Results

Time Savings

Before:

After:

10x increase in quote capacity

Win Rate Improvement

Before:

After:

78% improvement in win rate

Revenue Impact

Before automation:

After automation:

Revenue increase: $2.4M/year (375% growth)

Quality Improvements

Error Rate:

Client Satisfaction:

Underwriter Satisfaction:

Implementation Costs

Development

Initial Build:

Infrastructure

Monthly Costs:

Total First Year: $73,800

Return:

Advanced Features

1. Dynamic Pricing Optimization

def optimize_pricing(property_data, carrier_quotes):
    """
    Recommend pricing adjustments based on market conditions
    """
    # Get market benchmarks
    market_avg = get_market_average_premium(property_data)
    optimized_quotes = []
    for quote in carrier_quotes:
        competitiveness = quote['premium'] / market_avg
        if competitiveness < 0.85:
            # Significantly below market - leave as is
            recommendation = "Strong competitive position"
        elif competitiveness < 1.0:
            # Below market - consider positioning
            recommendation = "Competitive pricing"
        elif competitiveness > 1.15:
            # Above market - suggest negotiation
            recommendation = "Consider carrier negotiation or market alternatives"
            # Find alternative carriers
            alternatives = find_alternative_carriers(property_data)
            recommendation += f"\nConsider: {', '.join(alternatives)}"
        else:
            # Roughly at market
            recommendation = "In line with market"
        optimized_quotes.append({
            'original': quote,
            'market_position': competitiveness,
            'recommendation': recommendation
        })
    return optimized_quotes

2. Renewal Automation

def automate_renewal_quotes(expiring_policies):
    """
    Generate renewal quotes 60 days before expiration
    """
    for policy in expiring_policies:
        # Pull current policy data
        current_data = get_policy_data(policy.id)

        # Re-enrich to check for material changes
        updated_data = enrich_property_data(current_data['address'])

        # Flag if major changes detected
        if has_material_changes(current_data, updated_data):
            changes = summarize_changes(current_data, updated_data)
            notify_underwriter(policy, updated_data, changes)
            continue  # Requires manual review

        # Auto-generate renewal quote
        renewal_quote = generate_insurance_quote(updated_data)

        # Apply renewal discount (if applicable)
        if policy.claims_free:
            renewal_quote = apply_discount(renewal_quote, 'claims_free', 0.05)

        # Send to client
        send_renewal_quote(policy.client_email, renewal_quote)

        # Log in CRM
        log_renewal_quote(policy, renewal_quote)
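`apply_discount` is referenced above but not shown. A minimal sketch of what such a helper might look like (our assumption, not the production implementation) applies a flat percentage off the quoted premium and records the reason for the proposal's audit trail:

```python
def apply_discount(quote, reason, rate):
    """Hypothetical sketch of the apply_discount helper used above.

    Returns a new quote dict with the premium reduced by `rate` and
    the discount recorded as a line item.
    """
    discounted = dict(quote)
    discounted['annual_premium'] = round(quote['annual_premium'] * (1 - rate), 2)
    # Record the discount so the proposal can show it as a line item
    discounted['discounts'] = quote.get('discounts', []) + [
        {'reason': reason, 'rate': rate}
    ]
    return discounted

renewal = apply_discount({'annual_premium': 8400.0}, 'claims_free', 0.05)
# 5% claims-free credit off an $8,400 renewal premium
```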

3. Competitive Intelligence

def track_competitive_quotes(lost_deals):
    """
    Analyze why we lost deals to improve future quotes
    """
    for deal in lost_deals:
        if deal.has_competitive_quote():
            competitive_data = {
                'competitor_name': deal.competitor,
                'competitor_premium': deal.competitor_premium,
                'our_premium': deal.our_premium,
                'price_difference': deal.our_premium - deal.competitor_premium,
                'property_type': deal.property_type,
                'coverage_amount': deal.coverage_amount,
                'date': deal.lost_date
            }

            # Store in BigQuery
            store_competitive_data(competitive_data)

            # Analyze patterns
            if competitive_data['price_difference'] > 1000:
                alert_management(
                    f"Consistently losing to {deal.competitor} "
                    f"on {deal.property_type} properties"
                )

Lessons Learned

Lesson 1: Data Quality is Everything

Garbage in, garbage out. We spent 40% of development time on data validation and enrichment.

What worked:

Lesson 2: Start with High-Volume, Low-Complexity

Our Rollout:

  1. Small commercial properties (most volume, least complexity)
  2. Medium commercial properties
  3. Large commercial properties
  4. Specialty risks (still manual)

This approach built confidence before tackling edge cases.

Lesson 3: Underwriters are Your Customers

Key insight: The system needed to make underwriters’ jobs better, not replace them.

What we did:

Result: Strong adoption and enthusiasm from the team.

Lesson 4: Explainability Matters

Problem: Black box AI recommendations aren’t trusted.

Solution: Every risk assessment includes:

Underwriters understand why the system recommends what it does.

Lesson 5: Edge Cases Will Surprise You

Reality: 10% of properties have weird quirks that break automation.

Solution:

Future Enhancements

1. Instant Binding

Quote accepted by client
→ Automated underwriting approval (for qualified risks)
→ Policy documents generated
→ Payment processed
→ Coverage bound
→ Certificates issued
All in minutes

2. Dynamic Risk Monitoring

Monitor insured properties continuously:
- Weather alerts (hurricanes, floods)
- Crime trends in area
- Property value changes
- Business changes (revenue, employees)
Proactively adjust coverage or recommend improvements

3. Predictive Loss Modeling

ML model predicts:
- Likelihood of claims in next 12 months
- Expected severity
- Recommended risk improvements
- Optimal coverage limits

4. Automated Underwriting

For simple, low-risk properties:

Straight-through processing:
Application submitted
→ Risk assessed
→ Quote generated
→ Approved
→ Bound
Zero human touch (for qualified risks)

The Bottom Line

Reducing quote generation from 48 hours to 2 minutes isn’t just about speed. It’s about:

Competitive Advantage:

Scale:

Quality:

Profitability:

Total Investment:

Total Value:

The insurance brokers that will dominate in 2025 aren’t the ones with the most underwriters. They’re the ones with the smartest systems.

When will you automate your quote generation?