
Gmail vs Outlook Inbox Placement: A 12-Month Deliverability Study

We monitored 400 Gmail and 400 Outlook mailboxes over 12 months, measuring inbox placement rates at months 1, 3, 6, and 12. Gmail averaged 93.2% inbox placement at month 12; Outlook averaged 89.7%. But the size of the gap depended on mailbox age and sending pattern.

By Sarah Mitchell • March 8, 2026 • 16 min read

Study Overview

This study tracked inbox placement rates (IPR) for 800 mailboxes — 400 on Google Workspace and 400 on Microsoft 365 — over a continuous 12-month period from March 2025 through February 2026. The goal was to produce a longitudinal comparison of deliverability performance between the two dominant business email providers, controlling for warmup procedure, sending volume, and authentication configuration.

Existing comparisons between Gmail and Outlook deliverability tend to be snapshot measurements: single-point-in-time tests that cannot capture how reputation develops over months of sustained sending. This study was designed to address that gap by measuring the same mailboxes at regular intervals throughout their first year of active use.

Methodology

Sample Design

800 mailboxes were provisioned in March 2025:

  • Gmail group: 400 mailboxes across 100 Google Workspace accounts (4 mailboxes per workspace). Domains aged 6–18 months at study start, purchased from a mix of registrars (Namecheap, Cloudflare, Google Domains).
  • Outlook group: 400 mailboxes across 100 Microsoft 365 tenants (4 mailboxes per tenant). Domain age distribution matched the Gmail group (6–18 months, median 11 months).

All 800 domains had SPF, DKIM, and DMARC (p=none initially, upgraded to p=quarantine at month 3) configured before first send. No domain had prior email sending history.
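
The DMARC schedule described above can be sketched as a small helper. This is a minimal illustration, not the study's tooling: the function names and the month-to-policy mapping are assumptions for illustration, and the record shown is the shortest valid form (a real deployment would usually add reporting tags such as `rua`). The value is published as a TXT record at `_dmarc.<domain>`.

```python
# Policy in force from a given study month onward (month -> policy).
# Month 0 stands for "before first send"; this mapping is an assumption
# that mirrors the schedule described in the text.
STUDY_POLICY_BY_MONTH = {0: "none", 3: "quarantine"}

ALLOWED_POLICIES = {"none", "quarantine", "reject"}


def dmarc_record(policy: str) -> str:
    """Build a minimal DMARC TXT record value for the given policy."""
    if policy not in ALLOWED_POLICIES:
        raise ValueError(f"policy must be one of {sorted(ALLOWED_POLICIES)}")
    return f"v=DMARC1; p={policy}"


def policy_for_month(month: int) -> str:
    """Return the policy in force at a given study month."""
    if month < 0:
        raise ValueError("month must be non-negative")
    active_since = max(m for m in STUDY_POLICY_BY_MONTH if m <= month)
    return STUDY_POLICY_BY_MONTH[active_since]


print(dmarc_record(policy_for_month(1)))  # v=DMARC1; p=none
print(dmarc_record(policy_for_month(3)))  # v=DMARC1; p=quarantine
```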

Sending Protocol

All mailboxes followed an identical lifecycle:

  • Months 1–1.5: Automated warmup only (peer-to-peer warmup emails, ramping from 5 to 25/day)
  • Months 1.5–12: Mixed sending — 15 warmup emails/day + 20 cold outreach emails/day to real B2B prospects
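
As a rough sketch, the lifecycle above can be written as a per-day volume plan. The study does not state the shape of the warmup ramp or the exact day count, so the linear ramp and the 45-day warmup-only phase below are assumptions, as is the function name.

```python
WARMUP_ONLY_DAYS = 45  # assumption: "1.5 months" taken as 45 days


def daily_send_plan(day: int) -> dict:
    """Return planned warmup and cold-outreach volume for a 1-based study day."""
    if day < 1:
        raise ValueError("day must be >= 1")
    if day <= WARMUP_ONLY_DAYS:
        # Assumed linear ramp from 5/day on day 1 to 25/day on the last warmup-only day.
        fraction = (day - 1) / (WARMUP_ONLY_DAYS - 1)
        return {"warmup": round(5 + fraction * 20), "cold": 0}
    # Mixed sending phase: fixed volumes from the protocol.
    return {"warmup": 15, "cold": 20}


print(daily_send_plan(1))   # {'warmup': 5, 'cold': 0}
print(daily_send_plan(45))  # {'warmup': 25, 'cold': 0}
print(daily_send_plan(46))  # {'warmup': 15, 'cold': 20}
```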

Cold outreach emails were personalized (first name, company name, industry-specific pain points) and sent to verified email addresses with less than 2% expected bounce rate. Subject lines and body content varied but followed a consistent template structure across both provider groups.

Measurement

Inbox placement was measured using two methods:

  • Seed-list testing: Each mailbox sent 5 daily test emails to a 40-address seed panel (20 Gmail, 12 Outlook, 8 Yahoo/other). Placement classified as inbox, spam, or missing.
  • Pixel tracking: Open rates on cold outreach emails served as a proxy signal for inbox placement (emails in spam are rarely opened).

IPR measurements were aggregated at four time points: month 1, month 3, month 6, and month 12.
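
For readers who want to reproduce the seed-list metric, here is a minimal sketch of how an IPR could be computed from classified placements. It assumes that "missing" emails count in the denominator; the study does not say how missing emails were treated, and the function name and sample data are hypothetical.

```python
from collections import Counter


def inbox_placement_rate(placements):
    """Percentage of seed emails classified as 'inbox'.

    placements: iterable of 'inbox', 'spam', or 'missing' labels.
    Assumption: 'missing' emails count against the rate.
    """
    counts = Counter(placements)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no placements to score")
    return 100.0 * counts["inbox"] / total


# Hypothetical result of one test email across the 40-address seed panel.
sample = ["inbox"] * 36 + ["spam"] * 3 + ["missing"] * 1
print(inbox_placement_rate(sample))  # 90.0
```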

Results

Inbox Placement Rate by Provider and Mailbox Age

Time Point   Gmail IPR (mean)   Gmail IPR (median)   Outlook IPR (mean)   Outlook IPR (median)   Difference (mean)
Month 1      78.4%              80.1%                73.6%                74.8%                  +4.8 pp
Month 3      88.7%              90.3%                83.9%                85.1%                  +4.8 pp
Month 6      92.1%              93.4%                87.8%                88.6%                  +4.3 pp
Month 12     93.2%              94.6%                89.7%                90.9%                  +3.5 pp

Gmail maintained a consistent advantage of 3.5–4.8 percentage points across all time points. The gap narrowed slightly over time (from 4.8 pp at month 1 to 3.5 pp at month 12), suggesting that Outlook's reputation system rewards long-term consistent sending behavior but builds reputation more slowly during the initial months.

Inbox Placement Rate: Warm vs. Cold Sending

We measured IPR separately for warmup emails (peer-to-peer, high engagement) and cold outreach emails (to real prospects, lower engagement):

Email Type              Gmail IPR (month 12)   Outlook IPR (month 12)
Warmup emails only      97.1%                  94.3%
Cold outreach emails    89.8%                  85.4%
Combined (all email)    93.2%                  89.7%

Cold outreach emails had lower inbox placement than warmup emails on both providers, which is expected given lower recipient engagement rates. The drop from warm to cold was 7.3 percentage points on Gmail and 8.9 percentage points on Outlook, suggesting that Outlook penalizes low-engagement sending more aggressively.
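
The warm-to-cold drops quoted above follow directly from the month 12 table. A short worked check, using only the published figures (the variable and function names are illustrative):

```python
# Month 12 IPR values from the table, in percent.
MONTH_12_IPR = {
    "Gmail": {"warmup": 97.1, "cold": 89.8},
    "Outlook": {"warmup": 94.3, "cold": 85.4},
}


def warm_to_cold_drop(ipr):
    """Drop in percentage points from warmup IPR to cold-outreach IPR."""
    return round(ipr["warmup"] - ipr["cold"], 1)


for provider, ipr in MONTH_12_IPR.items():
    print(provider, warm_to_cold_drop(ipr))
# Gmail 7.3
# Outlook 8.9
```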

Variance and Distribution

At month 12, the standard deviation for Gmail IPR was 4.2 percentage points versus 6.8 for Outlook. This indicates that Gmail produces more consistent deliverability outcomes, while Outlook shows greater mailbox-to-mailbox variation.

Distribution at month 12:

  • Gmail: 87.3% of mailboxes above 90% IPR; 4.5% below 80% IPR
  • Outlook: 71.8% of mailboxes above 90% IPR; 9.3% below 80% IPR

The 9.3% of Outlook mailboxes below 80% IPR at month 12 were disproportionately concentrated among domains that had experienced at least one deliverability incident (temporary spike in bounces or complaints) during the study period. Gmail showed greater resilience to temporary incidents, with faster reputation recovery (median 6 days vs. 13 days on Outlook).
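
The spread statistics in this section can be computed from per-mailbox IPR values along the following lines. This is a sketch under assumptions: the study does not say whether a sample or population standard deviation was used (the sample form is shown here), and the ten values are made up for illustration.

```python
import statistics


def summarize_ipr(iprs):
    """Summarize per-mailbox IPR values (in percent) for one provider."""
    if len(iprs) < 2:
        raise ValueError("need at least two mailboxes")
    n = len(iprs)
    return {
        "mean": statistics.mean(iprs),
        "stdev": statistics.stdev(iprs),  # assumption: sample standard deviation
        "share_above_90": 100.0 * sum(x > 90 for x in iprs) / n,
        "share_below_80": 100.0 * sum(x < 80 for x in iprs) / n,
    }


# Hypothetical per-mailbox IPRs, not study data.
sample = [95.0, 93.5, 91.0, 88.0, 78.5, 96.5, 92.0, 94.0, 90.5, 79.0]
summary = summarize_ipr(sample)
print(summary["share_above_90"])  # 70.0
print(summary["share_below_80"])  # 20.0
```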

DMARC Policy Impact

At month 3, all domains were upgraded from DMARC p=none to p=quarantine. We observed the following IPR changes in the 30 days following the upgrade:

  • Gmail: +1.3 pp average improvement (from 88.7% to 90.0%)
  • Outlook: +2.1 pp average improvement (from 83.9% to 86.0%)

Outlook appeared to reward stricter DMARC policies more than Gmail, though the effect was modest in both cases. A subset of 50 domains that moved directly to p=reject at month 6 showed an additional +0.8 pp improvement on both providers.

Sending Volume Sensitivity

At month 6, we tested the effect of volume increases on a subset of 100 mailboxes (50 Gmail, 50 Outlook). These mailboxes increased cold outreach from 20 to 40 emails/day over a two-week ramp.

  • Gmail: IPR dropped 2.1 pp during the ramp period, then recovered to prior levels within 8 days of stabilizing at the new volume.
  • Outlook: IPR dropped 4.7 pp during the ramp and required 19 days to recover. 3 of 50 Outlook mailboxes (6%) experienced sustained IPR below 80% for 30+ days after the volume increase.

This suggests Gmail tolerates volume increases better than Outlook (a 2.1 pp dip and 8-day recovery versus a 4.7 pp dip and 19-day recovery), and that Outlook requires more gradual scaling.

Open Rate Proxy Analysis

As a secondary validation, we compared open rates on cold outreach emails between the two providers. Open rates serve as an imperfect proxy for inbox placement: emails that reach the inbox are opened more frequently than those in spam.

Time Point   Gmail Open Rate   Outlook Open Rate
Month 3      47.2%             41.8%
Month 6      49.6%             44.3%
Month 12     51.3%             46.1%

Open rate differences (5.2–5.4 pp) were consistent with, but slightly larger than, the 3.5–4.8 pp IPR differences measured via seed testing. A somewhat larger gap is plausible because placement determines whether a recipient has the opportunity to open a message at all, so placement differences can compound in open rates.

Reputation Recovery After Deliverability Incidents

During the 12-month study, 143 mailboxes (78 Gmail, 65 Outlook) experienced at least one "deliverability incident" — defined as a drop of 15+ percentage points in IPR within a 7-day period. These incidents were typically triggered by a temporary spike in bounce rates (from a bad list segment), a complaint rate increase, or sending pattern anomalies (weekend vs. weekday volume shifts).
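
The incident definition above (a drop of 15+ pp within a 7-day period) can be turned into a simple detector over a daily IPR series. This is one possible reading of the definition, not the study's actual implementation: it compares each day against the highest IPR in the same 7-day window, and the function name and sample series are hypothetical.

```python
def find_incident_days(daily_ipr, drop_pp=15.0, window_days=7):
    """Return indices of days whose IPR is at least drop_pp below the
    highest IPR earlier in the same window_days-long window."""
    if window_days < 2:
        raise ValueError("window_days must be at least 2")
    incident_days = []
    for i in range(1, len(daily_ipr)):
        # The window covers window_days days including day i itself.
        start = max(0, i - (window_days - 1))
        recent_peak = max(daily_ipr[start:i])
        if recent_peak - daily_ipr[i] >= drop_pp:
            incident_days.append(i)
    return incident_days


# Hypothetical daily IPR values (percent) for one mailbox.
series = [92, 93, 91, 90, 84, 76, 75, 80, 86, 91, 92]
print(find_incident_days(series))  # [5, 6]
```

Consecutive flagged days would normally be merged into a single incident before counting mailboxes.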

Recovery timelines differed substantially by provider:

Recovery Metric                         Gmail (n=78)   Outlook (n=65)
Median days to recover prior IPR        6 days         13 days
25th percentile recovery                3 days         7 days
75th percentile recovery                11 days        22 days
Mailboxes that never fully recovered    4 (5.1%)       9 (13.8%)
Mean IPR drop at incident peak          -19.3 pp       -23.7 pp

Gmail's faster recovery (median 6 vs. 13 days) and lower permanent damage rate (5.1% vs. 13.8%) suggest a more forgiving reputation system. Outlook's longer memory for negative signals means that deliverability incidents carry greater long-term cost on Microsoft infrastructure. Teams using Outlook should invest more heavily in preventive measures (list verification, volume consistency) rather than relying on post-incident recovery.
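
A minimal sketch of how the recovery metrics in the table could be derived from per-mailbox recovery times. The study does not state which percentile method it used, so the default method of `statistics.quantiles` is an assumption here, as are the function name and the sample data (`None` marks a mailbox that never recovered).

```python
import statistics


def recovery_summary(recovery_days):
    """Summarize recovery times; None entries are mailboxes that never recovered."""
    recovered = [d for d in recovery_days if d is not None]
    never = len(recovery_days) - len(recovered)
    # quantiles(n=4) returns the three quartile cut points.
    p25, median, p75 = statistics.quantiles(recovered, n=4)
    return {
        "p25": p25,
        "median": median,
        "p75": p75,
        "never_recovered_pct": round(100.0 * never / len(recovery_days), 1),
    }


# Hypothetical recovery times in days, not study data.
sample = [3, 4, 6, 6, 9, 11, 14, None]
print(recovery_summary(sample))
# {'p25': 4.0, 'median': 6.0, 'p75': 11.0, 'never_recovered_pct': 12.5}

# The never-recovered shares in the table follow from the reported counts.
print(round(100.0 * 4 / 78, 1), round(100.0 * 9 / 65, 1))  # 5.1 13.8
```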

Spam Folder vs. Missing (Blocked) Analysis

When emails did not reach the inbox, the failure mode differed by provider. Among non-inbox placements at month 6:

  • Gmail: 82.3% went to spam folder, 17.7% were missing (silently dropped or blocked). Gmail prefers to deliver to spam rather than block outright.
  • Outlook: 64.1% went to Junk folder, 35.9% were missing. Outlook blocks (returns 550 errors or silently drops) more aggressively than Gmail.

This distinction matters for diagnosis: Gmail senders can often detect deliverability problems through seed testing (emails appear in spam), while Outlook senders are more likely to see emails vanish, sometimes without a clear error signal, making problems harder to identify and resolve.

Limitations

  • Sample homogeneity: All domains were US-registered with US-based sending infrastructure. International domains and sending IPs may perform differently on each provider.
  • Confounding variables: While we controlled for warmup protocol, domain age, and authentication, we could not control for recipient behavior differences. Gmail recipients may interact with email differently than Outlook recipients, affecting engagement-based reputation signals.
  • DMARC baseline: Starting with p=none and upgrading to p=quarantine at month 3 means early-period measurements reflect a weaker authentication posture. Teams deploying with p=reject from day one may see different initial trajectories.
  • Warmup dependency: All mailboxes used the same automated warmup system. Results reflect the interaction between this specific warmup method and each provider's reputation algorithms. Different warmup approaches could yield different provider comparisons.
  • Pixel tracking limitations: Open rate measurements rely on pixel loading, which is affected by email client image-loading policies. Apple Mail Privacy Protection and Outlook's external content blocking may inflate or deflate open rate proxies unevenly.
  • Survivorship bias: 11 mailboxes (7 Outlook, 4 Gmail) were excluded from month 12 analysis due to account suspension or domain expiration. This may slightly inflate the month 12 averages for both groups.

Conclusions

Gmail outperformed Outlook on inbox placement by 3.5–4.8 percentage points at all four measurement points of this 12-month study. The advantage was most pronounced during the first three months and narrowed modestly over time. Gmail also demonstrated faster reputation recovery from deliverability incidents (6 vs. 13 days median) and greater tolerance for volume increases.

However, Outlook showed greater sensitivity to DMARC policy strictness, rewarding p=quarantine and p=reject adoption with larger IPR improvements than Gmail. This suggests that teams sending primarily to Outlook recipients should prioritize strict DMARC policies early.

For cold email teams choosing infrastructure, these results favor Gmail/Google Workspace — particularly for the first 3–6 months of a new mailbox's life. Teams committed to Outlook should plan for longer warmup periods, more gradual volume scaling, and stricter DMARC policies to narrow the deliverability gap.

Study conducted by the WarmySender Research Team. Data collected March 2025 – February 2026. Methodology reviewed by two external deliverability consultants. For dataset access, contact research@warmysender.com.
