propublica/tx-pregnancy-loss-transfusions-analysis

Identifying blood transfusions in first-trimester pregnancy-loss emergency department encounters

This is the code for an analysis ProPublica published as part of our July 2025 story "A 'Striking' Trend: After Texas Banned Abortion, More Women Nearly Bled to Death During Miscarriage." We found that after Texas made performing abortions a felony in August 2022, the number of blood transfusions during emergency room visits for first-trimester miscarriage shot up by 54%. We also found that the number of emergency room visits for early miscarriage rose by 25% compared with the three years before the COVID-19 pandemic. That is a sign that women who didn’t initially receive a dilation and curettage procedure, which protects against a life-threatening hemorrhage but is also used to end pregnancies, may be returning to hospitals in worse condition.

For this analysis, we purchased seven years (2017-2023) of Texas inpatient and outpatient hospital discharge data from the Texas Department of State Health Services. The structure and contents are outlined in the inpatient and outpatient data dictionaries. Those wishing to re-create our Texas analysis will need to obtain the data from the department. Hospital discharge data is available for many states and is often structured similarly.

To fully understand our analysis, please read our methodology alongside this repository: “Miscarriage Is Increasingly Dangerous for Women in Texas, Our Analysis Shows. Here’s How We Did It.”

This methodology also builds on our prior work, which identified sepsis in second-trimester pregnancy-loss hospitalizations.

Questions about the code in this repo? Email andrea.suozzo@propublica.org.

Step 0: Data processing and setup

We used Poetry to install Python libraries.

Before we could work with the data, we had to combine and transform the tab-delimited files we received. This repository does not include the code we used for this step, because this data transformation is bespoke to the format of the raw data.

We converted the files into Parquet format, creating a main file called discharge, partitioned by year for faster queries. Each hospital stay had a unique record_id that identified that visit.

Each row had up to 36 ICD-10-CM diagnosis codes, including an admitting diagnosis, a primary diagnosis, up to 24 other diagnoses and up to 10 external cause codes. We removed those variables from the main table and transformed them from wide to long format, putting those rows into a file called discharge_diagnosis. Each row had a record_id field that we could use to link it back to its corresponding record in the discharge file.
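The wide-to-long step described above can be sketched in plain Python. This is a minimal illustration, not the code we ran: the column names (`admitting_diagnosis`, `other_diagnosis_1` and so on) are hypothetical stand-ins for the names in the state data dictionaries, and the two records are synthetic.

```python
from typing import Iterable, Iterator

# Hypothetical wide column names: 1 admitting + 1 principal + 24 other
# + 10 external cause = 36 diagnosis slots per discharge row.
DIAGNOSIS_COLUMNS = (
    ["admitting_diagnosis", "principal_diagnosis"]
    + [f"other_diagnosis_{i}" for i in range(1, 25)]
    + [f"external_cause_{i}" for i in range(1, 11)]
)


def diagnoses_to_long(rows: Iterable[dict]) -> Iterator[dict]:
    """Yield one long-format row per non-empty diagnosis slot."""
    for row in rows:
        for column in DIAGNOSIS_COLUMNS:
            code = (row.get(column) or "").strip()
            if code:
                yield {
                    "record_id": row["record_id"],
                    "source_column": column,
                    "code": code,
                }


# Synthetic example rows, for illustration only.
wide_rows = [
    {
        "record_id": "A1",
        "admitting_diagnosis": "O039",
        "principal_diagnosis": "O039",
        "other_diagnosis_1": "Z3A08",
    },
    {"record_id": "B2", "principal_diagnosis": "O0090", "other_diagnosis_1": ""},
]
long_rows = list(diagnoses_to_long(wide_rows))
```

Keeping `record_id` on every long row is what lets each diagnosis be joined back to its encounter in the discharge file.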

Similarly, each row had up to 25 ICD-10-PCS procedure codes. We removed those variables from the main table and transformed them to a long format in a table called discharge_procedure, again linked to discharge by the record_id field.

We used the Python package DuckDB to run queries on these files. The files are registered for import in scripts/utils.py; before running the notebooks, replace the [PATH_TO_HOSPITAL_DATA] strings in that file with the path to your data.

Step 1: Generate a coded file

In the file scripts/1_code_pregnancy_ends.ipynb, we generated a coded file with hospitalizations we were interested in and any relevant data for each hospitalization.

We followed a methodology maternal health researchers use to identify all encounters, across both the inpatient and outpatient data, with "abortive outcomes" — instances of pregnancy loss at less than 20 weeks, which includes diagnoses like ectopic pregnancy and miscarriage.

We then identified whether an encounter originated in the emergency department, had a code indicating that it was an ectopic or molar pregnancy, or had a procedure code for a blood transfusion.
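As a rough sketch of this flagging step, the function below derives the three flags per encounter from the long diagnosis and procedure tables. It is written in plain Python rather than the DuckDB queries used in the notebook. The `from_ed` field is a hypothetical placeholder for however the data marks emergency department origin, and treating ICD-10-CM prefixes O00 (ectopic pregnancy) and O01 (hydatidiform mole) as "ectopic or molar" is an assumption. The transfusion code set shown is illustrative; the real list is in icd_transfusion_codes.csv.

```python
from collections import defaultdict

# Assumption: O00 = ectopic pregnancy, O01 = hydatidiform mole.
ECTOPIC_OR_MOLAR_PREFIXES = ("O00", "O01")


def flag_encounters(encounters, diagnoses, procedures, transfusion_codes):
    """Return one dict of boolean flags per encounter."""
    dx_by_record = defaultdict(set)
    for row in diagnoses:
        # Normalize so "O00.90" and "O0090" compare equal.
        dx_by_record[row["record_id"]].add(row["code"].replace(".", "").upper())

    px_by_record = defaultdict(set)
    for row in procedures:
        px_by_record[row["record_id"]].add(row["code"].upper())

    flagged = []
    for enc in encounters:
        rid = enc["record_id"]
        flagged.append(
            {
                "record_id": rid,
                "from_ed": bool(enc.get("from_ed")),  # hypothetical field
                "ectopic_or_molar": any(
                    code.startswith(ECTOPIC_OR_MOLAR_PREFIXES)
                    for code in dx_by_record[rid]
                ),
                "transfusion": bool(px_by_record[rid] & transfusion_codes),
            }
        )
    return flagged


# Synthetic inputs, for illustration only.
encounters = [
    {"record_id": "A1", "from_ed": True},
    {"record_id": "B2", "from_ed": False},
]
diagnoses = [
    {"record_id": "A1", "code": "O039"},
    {"record_id": "B2", "code": "O00.90"},
]
procedures = [{"record_id": "A1", "code": "30233N1"}]
flags = flag_encounters(encounters, diagnoses, procedures, {"30233N1"})
```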

We also identified gestational week codes that indicated how far along a pregnancy was. Not all encounters had a gestational week code, but experts told us that an emergency department doctor would likely be able to establish a gestational age over the course of treatment in a second-trimester pregnancy. To focus on first-trimester encounters, we excluded any visit with a gestational age of 13 weeks or more and included any visit that did not have a gestational age code.
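The first-trimester rule above can be sketched as follows, assuming gestational age is recorded with ICD-10-CM Z3A codes (for example Z3A.12 for 12 weeks). The helper names are hypothetical, and a few choices are assumptions for illustration: Z3A.00 (weeks not specified) is treated as no gestational age, Z3A.01 (less than 8 weeks) is mapped to 7, and Z3A.49 (greater than 42 weeks) is mapped to 43.

```python
import re
from typing import Iterable, Optional

# Matches Z3A codes with or without the decimal point, e.g. "Z3A12" or "Z3A.12".
Z3A_PATTERN = re.compile(r"^Z3A\.?(\d{2})$")


def gestational_weeks(code: str) -> Optional[int]:
    """Return the weeks encoded by a Z3A code, or None if not applicable."""
    match = Z3A_PATTERN.match(code.strip().upper())
    if not match:
        return None
    value = int(match.group(1))
    if value == 0:  # Z3A.00: weeks of gestation not specified
        return None
    if value == 1:  # Z3A.01: less than 8 weeks
        return 7
    if value == 49:  # Z3A.49: greater than 42 weeks
        return 43
    return value


def is_first_trimester(codes: Iterable[str]) -> bool:
    """Keep a visit unless some gestational code says 13 weeks or more.

    Visits with no usable gestational code are kept, because all() of an
    empty list is True.
    """
    weeks = [w for w in map(gestational_weeks, codes) if w is not None]
    return all(w < 13 for w in weeks)
```

For example, `is_first_trimester(["O039"])` keeps a visit with no gestational code, while `is_first_trimester(["O039", "Z3A.13"])` excludes one coded at 13 weeks.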

Lastly, we filtered our visits to ones where the patient was female and between the ages of 10 and 54, to exclude rows with potential errors.

We then wrote this file out to a CSV.

Step 2: Analyze first-trimester pregnancy loss

In the file scripts/2_analyze_outcomes.ipynb, we used the coded file to explore first-trimester pregnancy-loss emergency department encounters and the subset of those visits that included a transfusion.
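The kind of comparison this step produces can be sketched as counting visits per year and computing a percent change between two periods. This is only an illustration of the arithmetic. The function names are hypothetical, the rows are synthetic toy data rather than our findings, and the notebook's actual grouping of years into periods is not reproduced here.

```python
from collections import Counter


def yearly_counts(rows, transfusions_only=False):
    """Count visits per year, optionally restricted to transfusion visits."""
    return Counter(
        row["year"]
        for row in rows
        if row["transfusion"] or not transfusions_only
    )


def percent_change(baseline, comparison):
    """Percent change from a baseline count to a comparison count."""
    return (comparison - baseline) * 100 / baseline


# Synthetic toy rows, for illustration only.
rows = (
    [{"year": 2018, "transfusion": False}] * 8
    + [{"year": 2018, "transfusion": True}] * 2
    + [{"year": 2023, "transfusion": False}] * 9
    + [{"year": 2023, "transfusion": True}] * 3
)

all_visits = yearly_counts(rows)
transfusion_visits = yearly_counts(rows, transfusions_only=True)

visit_change = percent_change(all_visits[2018], all_visits[2023])
transfusion_change = percent_change(
    transfusion_visits[2018], transfusion_visits[2023]
)
```

Computing the change separately for all visits and for transfusion visits shows whether transfusions grew faster than visits overall.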

Additional data sources

File: icd_transfusion_codes.csv

ICD-10-PCS procedure codes from the Centers for Disease Control and Prevention definition of codes corresponding to severe maternal morbidity: https://www.cdc.gov/maternal-infant-health/php/severe-maternal-morbidity/icd.html
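A code list like this can be loaded into a set for fast membership checks. The sketch below uses Python's standard `csv` module. The column name `code` and the sample contents are assumptions for illustration, so check the header of icd_transfusion_codes.csv before reusing it.

```python
import csv
import io


def load_codes(handle, column="code"):
    """Read one column of a CSV into a set of normalized, non-empty codes."""
    reader = csv.DictReader(handle)
    return {
        row[column].strip().upper()
        for row in reader
        if (row.get(column) or "").strip()
    }


# Illustrative stand-in for the real file; in practice, pass
# open("icd_transfusion_codes.csv", newline="") instead.
sample = io.StringIO("code,note\n30233N1,example row\n 30233n1 ,duplicate\n,blank\n")
transfusion_codes = load_codes(sample)
```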
