Package 'card'

Title: Cardiovascular Applications in Research Data
Description: A collection of cardiovascular research datasets and analytical tools, including methods for cardiovascular procedural data, such as electrocardiography, echocardiography, and catheterization data. Additional methods exist for analysis of procedural billing codes.
Authors: Anish S. Shah [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-9729-1558>), Reese Fuller [ctb]
Maintainer: Anish S. Shah <[email protected]>
License: MIT + file LICENSE
Version: 0.1.1
Built: 2026-05-17 20:22:28 UTC
Source: https://github.com/shah-in-boots/card

Help Index


Augment data with information from a cosinor object

Description

Augment accepts a cosinor model object and adds information about each observation in the dataset. This includes the predicted values in the .fitted column and the residuals in the .resid column. New columns always begin with a . prefix to avoid overwriting columns in the original dataset.

Usage

## S3 method for class 'cosinor'
augment(x, ...)

Arguments

x

A cosinor object created by cosinor()

...

For extensibility

Value

a tibble object

See Also

Other cosinor: cosinor(), ggcosinor()


Complication Definitions for Cardiac Procedure Adverse Event Adjudication

Description

A flat list of complication categories for adjudicating cardiac electrophysiology procedure adverse event narratives. Each entry represents a single complication type with a clinical definition and subcategory schema. The built-in definitions are written for catheter-based cardiac electrophysiology procedures and are most directly informed by AF ablation literature. Users can subset or replace categories to fit other cardiac procedures and can also supply their own complication lists in the same format.

Usage

complication_definitions

Format

A named list of complication categories. Each element is a list with three components:

  • title: Character. The full clinical name of the complication.

  • definition: Character. A clinical definition written for use by an LLM or human adjudicator to identify the complication in adverse event narrative text.

  • classification: A named list or named character vector. Names are short subcategory codes; values are scalar character definitions of each subcategory. Codes may reflect subtype, acuity, intervention, or outcome. Within a selected category, subcategories are not necessarily mutually exclusive. Categories may also be co-assigned when more than one mechanism or outcome is plausible from the narrative. Every category includes an "insufficient_info" level for narratives that lack sufficient detail to classify more specifically.

Details

Derived from:

  • 2024 EHRA/HRS/APHRS/LAHRS Expert Consensus Statement on Catheter and Surgical Ablation of AF (Tzeis et al., Europace 2024;26:euae043)

  • Procedure-Related Complications of Catheter Ablation for AF (Tzeis et al., JACC 2023;82:1524-1536)

  • 2023 ACC/AHA/ACCP/HRS Guideline for Diagnosis and Management of AF (Joglar et al., JACC 2024;83:109-279)

  • MANIFEST-17K: Multinational Survey on Safety of Postapproval Clinical Use of Pulsed Field Ablation (Ekanem et al., Circulation 2024)

  • Considerations Regarding Safety with PFA for AF (Heart Rhythm O2, 2024;5:e01169)

Source

Tzeis S, Gerstenfeld EP, Kalman J, et al. 2024 European Heart Rhythm Association/Heart Rhythm Society/Asia Pacific Heart Rhythm Society/Latin American Heart Rhythm Society expert consensus statement on catheter and surgical ablation of atrial fibrillation. Europace. 2024;26(4):euae043.

Examples

# Access a single complication
complication_definitions$pericardial$title
complication_definitions$pericardial$definition
names(complication_definitions$pericardial$classification)

# List all complication category names
names(complication_definitions)

# Get all titles
vapply(complication_definitions, \(x) x$title, character(1))

# Select a subset relevant to AF ablation
af_ablation_complications <- complication_definitions[c(
  "pericardial", "stroke", "vascular", "pv_stenosis", "esophageal",
  "phrenic", "arrhythmia", "coronary", "hemolysis", "respiratory",
  "infection", "death", "device_malfunction", "no_harm", "other"
)]

# Define a custom complication list in the same format
my_complications <- list(
  lead_dislodgement = list(
    title = "Lead Dislodgement",
    definition = "Displacement of a pacemaker or ICD lead from its
      implant site, resulting in loss of capture, sensing failure, or
      change in pacing threshold. The narrative may describe lead
      repositioning, revision surgery, or new pacing parameters.",
    classification = list(
      reprogrammed = "Lead dislodgement managed by device reprogramming
        without surgical intervention.",
      repositioned = "Lead surgically repositioned or replaced.",
      insufficient_info = "Lead dislodgement suspected but insufficient
        detail to determine management."
    )
  )
)

Fit a cosinor

Description

cosinor() fits a regression model of a time variable to a continuous outcome using trigonometric features. This approach uses the linearization of the parameters to assess their statistics and distribution.
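The linearization mentioned above can be sketched in base R. This is only an illustration on simulated data, not the package's implementation: the variable names (hour, b_cos, b_sin) and the sign convention for the acrophase are assumptions made for the example. The identity used is A cos(wt + phi) = (A cos phi) cos(wt) - (A sin phi) sin(wt), which turns the cosine wave into an ordinary linear model.

```r
# Simulate five days of hourly measurements from a known rhythm
set.seed(1)
tau <- 24
hour <- rep(0:23, times = 5)
mesor <- 10
amp <- 3
phi <- -pi / 3
y <- mesor + amp * cos(2 * pi * hour / tau + phi) +
  rnorm(length(hour), sd = 0.5)

# Linearized form: y = M + b_cos * cos(wt) + b_sin * sin(wt)
x <- cos(2 * pi * hour / tau)
z <- sin(2 * pi * hour / tau)
fit <- lm(y ~ x + z)

# Recover the rhythm parameters from the linear coefficients
b_cos <- coef(fit)[["x"]]           # estimates A * cos(phi)
b_sin <- coef(fit)[["z"]]           # estimates -A * sin(phi)
est_mesor <- coef(fit)[["(Intercept)"]]
est_amp <- sqrt(b_cos^2 + b_sin^2)
est_phi <- atan2(-b_sin, b_cos)
```

Because the model is linear in b_cos and b_sin, the usual least-squares standard errors apply to those coefficients, and statistics for the amplitude and acrophase can then be derived from them.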

Usage

cosinor(t, ...)

## Default S3 method:
cosinor(t, ...)

## S3 method for class 'data.frame'
cosinor(t, y, tau, population = NULL, ...)

## S3 method for class 'matrix'
cosinor(t, y, tau, population = NULL, ...)

## S3 method for class 'formula'
cosinor(formula, data, tau, population = NULL, ...)

## S3 method for class 'recipe'
cosinor(t, data, tau, population = NULL, ...)

Arguments

t

Represents the ordered time indices that provide the positions for the cosine wave. Depending on the context:

  • A data frame of a time-based predictor/index.

  • A matrix of time-based predictor/index.

  • A recipe specifying a set of preprocessing steps created from recipes::recipe().

...

Not currently used, but required for extensibility.

y

When t is a data frame or matrix, y is the outcome specified as:

  • A data frame with 1 numeric column.

  • A matrix with 1 numeric column.

  • A numeric vector.

tau

A vector that determines the periodicity of the time index. The number of elements in the vector determines the number of components (e.g. single versus multiple cosinor).

  • A vector with a single element = single-component cosinor, e.g. tau = c(24)

  • A vector with multiple elements = multiple-component cosinor, e.g. tau = c(24, 12)

population

Represents the population to be analyzed with a population-mean cosinor. Defaults to NULL, assuming individual cosinors are being generated. When a recipe or formula is used, population is specified as:

  • A character name of the column contained in data that contains identifiers for each subject. Every row will have a subject name which should be duplicated for each time index given.

When a data frame or matrix is used, population is specified as:

  • A vector of the same length as t, with values representing each subject at the correct indices.

formula

A formula specifying the outcome terms on the left-hand side, and the predictor terms on the right-hand side.

data

When a recipe or formula is used, data is specified as:

  • A data frame containing both the predictors and the outcome.

Value

A cosinor object.

See Also

Other cosinor: augment.cosinor(), ggcosinor()

Examples

# Data setup
data("twins")

# Formula interface
model <- cosinor(rDYX ~ hour, twins, tau = 24)

Area of Ellipse

Description

Formulas for calculating the area of the confidence ellipse, which is used to identify confidence intervals and direction, and for graphing purposes.

Usage

cosinor_area(object, level = 0.95, ...)

Arguments

object

Model of class cosinor

level

Confidence level requested

...

Not currently used, but required for extensibility.

Value

Area of potential cosinor for graphical analysis as matrix stored in a list.


Multiple Component Cosinor Features

Description

Extract the special/global features of a multiple component cosinor. In a multiple component model, there are specific parameters that are not within the model itself, but must be extracted from the model fit. When extracted, they can be used to improve the plot of a multiple component cosinor. However, this is only possible if the cosinor is harmonic (see details). For single-component models, the orthophase is the same as the acrophase and the global amplitude is the same as the amplitude.

  • Global Amplitude (Ag) = the overall amplitude is defined as half the difference between the peak and trough values

  • Orthophase (Po) = the lag until the peak time

  • Bathyphase (Pb) = the lag until the trough time
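The three features listed above can be illustrated numerically in base R. The sketch below is not how cosinor_features() is implemented. It evaluates a hypothetical two-component curve (the MESOR, amplitudes, and acrophases are made-up values) on a fine grid over one full cycle and reads off the peak and trough.

```r
# Hypothetical two-component harmonic cosinor (periods of 24 and 12 hours)
tau <- c(24, 12)
mesor <- 10
amp <- c(3, 1)
phi <- c(-pi / 3, -pi / 2)  # component acrophases in radians

# Evaluate the fitted curve over one cycle of the longest period
grid <- seq(0, max(tau), length.out = 2401)
curve_y <- mesor +
  amp[1] * cos(2 * pi * grid / tau[1] + phi[1]) +
  amp[2] * cos(2 * pi * grid / tau[2] + phi[2])

# Global amplitude: half the difference between peak and trough
global_amplitude <- (max(curve_y) - min(curve_y)) / 2

# Orthophase and bathyphase, expressed here as lag times in hours
orthophase_time <- grid[which.max(curve_y)]
bathyphase_time <- grid[which.min(curve_y)]
```

The grid resolution limits the precision of the two lag times, so a finer grid or a root-finding step would be needed for exact values.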

Usage

cosinor_features(object, population = TRUE, ...)

Arguments

object

Model of class cosinor with multiple periods

population

If the object is a population cosinor, determines whether the features are calculated for the population cosinor or for each individual cosinor. Default is TRUE. This has no effect on "Individual" cosinor objects.

  • If TRUE, then will calculate features for entire population.

  • If FALSE, then will calculate features for every individual cosinor in the population.

...

For extensibility

Details

These calculations can only occur if the periods of the cosinor are harmonic - as in, the longest period is an integer multiple of the smallest period (known as the fundamental frequency). Otherwise, these statistics are not accurate or interpretable.
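A quick way to check the harmonic condition before requesting features is sketched below. The helper is_harmonic() is hypothetical and not part of the package. It applies the rule exactly as stated above, comparing only the longest and smallest periods.

```r
# Hypothetical helper: TRUE when the longest period is an integer
# multiple of the smallest period (within a numeric tolerance)
is_harmonic <- function(tau, tol = 1e-8) {
  ratio <- max(tau) / min(tau)
  abs(ratio - round(ratio)) < tol
}

is_harmonic(c(24, 12))  # TRUE
is_harmonic(c(24, 8))   # TRUE
is_harmonic(c(24, 9))   # FALSE
```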

Value

When returning the cosinor features for a single model, will return an object of class list. When returning the cosinor features for every individual in a population cosinor, will return an object of class tibble.

Examples

data(twins)
model <- cosinor(rDYX ~ hour, twins, c(24, 8), "patid")
results <- cosinor_features(model, population = FALSE)
head(results)

Goodness of Fit of Cosinor

Description

Goodness of fit of a cosinor from data that has multiple collections at different timepoints or from multiple cycles. The RSS is partitioned into pure error (SSPE) and lack of fit (SSLOF). An F-test compares the SSPE and SSLOF to detect appropriateness of model.

SSLOF = RSS - SSPE

SSPE = \sum_{i} \sum_{l} ( Y_{il} - \overline{Y}_{i} )^2

The fitted values for each time point are:

\overline{Y}_{i} = \frac{ \sum_{l} Y_{il} }{ n_{i} }
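The partition described above can be computed by hand in base R. The sketch below uses simulated data with four replicates at each of eight time points. The degrees of freedom follow the standard lack-of-fit F-test (m - p and N - m, where m is the number of distinct time points and p the number of fitted parameters), which is an assumption about how the package forms its statistic.

```r
# Simulated data: 8 distinct time points, 4 replicates at each
set.seed(2)
hour <- rep(seq(0, 21, by = 3), each = 4)
y <- 5 + 2 * cos(2 * pi * hour / 24 - 1) + rnorm(length(hour), sd = 0.4)

# Single-component cosinor fit via its linearized form
fit <- lm(y ~ cos(2 * pi * hour / 24) + sin(2 * pi * hour / 24))

# Partition the residual sum of squares
rss <- sum(residuals(fit)^2)
ybar_i <- ave(y, hour)           # mean of the replicates at each time point
sspe <- sum((y - ybar_i)^2)      # pure error
sslof <- rss - sspe              # lack of fit

# Lack-of-fit F-test
m <- length(unique(hour))        # distinct time points
p <- length(coef(fit))           # fitted parameters
n <- length(y)                   # total observations
f_stat <- (sslof / (m - p)) / (sspe / (n - m))
p_value <- pf(f_stat, m - p, n - m, lower.tail = FALSE)
```

A large p-value here suggests no evidence of lack of fit, meaning the cosinor curve describes the time-point means adequately.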

Usage

cosinor_goodness_of_fit(object, level = 0.95, ...)

Arguments

object

requires cosinor model generated with cosinor to calculate statistics.

level

confidence level desired

...

additional parameters may be needed for extensibility

Value

f-statistic as result of goodness of fit


General Interface for Cosinor Regression Models

Description

cosinor_reg() is a parsnip-friendly method for specifying a cosinor regression model before fitting.

Usage

cosinor_reg(mode = "regression", period = NULL)

## S3 method for class 'cosinor_reg'
update(object, period = NULL, fresh = FALSE, ...)

## S3 method for class 'cosinor_reg'
print(x, ...)

Arguments

mode

A character string that describes the type of model. In this case, it only supports type of "regression".

period

A non-negative number or vector of numbers that represent the expected periodicity of the data to be analyzed.

object

Cosinor model specification

fresh

A logical for whether the arguments should be modified in place or replaced altogether

...

Extensible

x

Cosinor model specification

Examples

library(parsnip)
cosinor_reg(period = c(24, 8)) |>
	parsnip::set_engine("card") |>
	parsnip::set_mode("regression")

Zero Amplitude Test

Description

The zero amplitude test assesses how well the circadian pattern fits the data, essentially detecting the presence of a rhythm in the data.
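One common way to frame a zero amplitude test is as an F-test of the cosinor model against an intercept-only model. The base R sketch below shows that framing on simulated data. It is an assumption for illustration and may differ from the statistic that cosinor_zero_amplitude() reports.

```r
# Simulated rhythm with noise
set.seed(3)
hour <- rep(0:23, times = 3)
y <- 8 + 1.5 * cos(2 * pi * hour / 24 - 0.5) + rnorm(length(hour), sd = 1)

x <- cos(2 * pi * hour / 24)
z <- sin(2 * pi * hour / 24)

# Null model (amplitude = 0) versus single-component cosinor
null_fit <- lm(y ~ 1)
cosinor_fit <- lm(y ~ x + z)

# F-test that both trigonometric coefficients are zero
test <- anova(null_fit, cosinor_fit)
f_stat <- test$F[2]
p_value <- test$`Pr(>F)`[2]
rhythm_detected <- p_value < 0.05
```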

Usage

cosinor_zero_amplitude(object, level = 0.95)

Arguments

object

model of class cosinor

level

confidence level

Value

Returns a list of test statistics, and also prints a report of the analysis.


Extract Echocardiogram Measurements

Description

A set of functions to extract common echocardiogram measurements from free text reports. These functions use regular expressions to identify and extract both qualitative descriptions and quantitative measurements.

The following measurements can be extracted:

  • Left atrial (LA) size (qualitative description)

  • Left atrial diameter (quantitative measurement in cm)

  • Left ventricular ejection fraction (LVEF, percentage)

  • Left ventricular internal diameter in diastole (LVIDd, in cm)

  • A compact set of clinically important findings from a full report

Usage

extract_la_size(text)

extract_lvef(text)

extract_lvidd(text)

extract_la_diameter(text, min_val = 1, max_val = 10)

extract_echo_findings(text)

Arguments

text

Character string containing the echo report text

min_val

Minimum plausible value for measurements (default varies by function)

max_val

Maximum plausible value for measurements (default varies by function)

Details

These functions use regular expressions to parse unstructured text from echo reports. They handle common variations in terminology and units. Measurements outside plausible ranges are returned as NA.
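The general approach (a regular expression followed by a plausibility check) can be sketched in base R. The function sketch_lvef(), its pattern, and its default bounds are hypothetical and are not the package's actual implementation of extract_lvef().

```r
# Hypothetical sketch: pull an LVEF percentage out of free text and
# return NA when nothing is found or the value is implausible
sketch_lvef <- function(text, min_val = 5, max_val = 90) {
  pattern <- paste0(
    "(?i)(?:LVEF|ejection fraction)",   # label, case-insensitive
    "[^0-9]{0,30}",                     # short run of non-digit filler
    "([0-9]{1,3}(?:\\.[0-9]+)?)\\s*%"   # the number, followed by %
  )
  m <- regmatches(text, regexec(pattern, text, perl = TRUE))[[1]]
  if (length(m) < 2) {
    return(NA_real_)
  }
  value <- as.numeric(m[2])
  if (value < min_val || value > max_val) NA_real_ else value
}

sketch_lvef("The left atrium is mildly dilated. LVEF is 55%.")  # 55
sketch_lvef("LVEF is 550%.")                                    # NA
sketch_lvef("No ejection data reported.")                       # NA
```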

Value

  • extract_la_size(): Character string describing LA size ("normal", "mild", etc.)

  • extract_la_diameter(): Numeric LA diameter in cm

  • extract_lvef(): Numeric LVEF percentage

  • extract_lvidd(): Numeric LVIDd in cm

  • extract_echo_findings(): Named list of key structure/function findings

Examples

report <- "The left atrium is mildly dilated. LVEF is 55%."
extract_la_size(report)      # Returns "mild"
extract_lvef(report)         # Returns 55

GEH parameters in a large clinical cohort

Description

Used in the model-building examples for repeat testing.

Usage

geh

Format

A tibble


ggplot of cosinor model

Description

ggplot of cosinor model that can visualize a variety of cosinor model subtypes, including single-component, multiple-component, individual, and population cosinor models, built using cosinor. For single component cosinor, the following values are plotted:

  • M = midline estimating statistic of rhythm

  • A = amplitude

  • P = phi or acrophase (shift from 0 to peak)

If using a multiple-component cosinor, the terms are different. If the periods or frequencies resonate or are harmonic, then the following are calculated. If the periods are not harmonic, the values are just descriptors of the curve.

  • M = midline estimating statistic of rhythm

  • Ag = global amplitude, which is half the distance between peak and trough (this is the same value as the amplitude from single component)

  • Po = orthophase (the equivalent of the acrophase in a single component), the lag time to peak value

  • Pb = bathyphase, the lag time to trough value

Usage

ggcosinor(object, labels = TRUE, ...)

Arguments

object

Model of class cosinor. To plot multiple models at once, provide a list of cosinor models instead of a single model. Plotting multiple models simultaneously is preferred if the outcome variable is similar in scale.

labels

Logical value indicating whether annotations should be placed on the plot, default = TRUE. The labels depend on the type of plot. The function attempts to place the labels "smartly" using ggrepel::geom_label_repel().

...

For extensibility. This function will use different implementations based on the type of model (single or multiple component). Attributes of the object will be passed down, or calculated on the fly.

Value

Object of class ggplot that can be layered

See Also

Other cosinor: augment.cosinor(), cosinor()

Examples

data(twins)
m1 <- cosinor(rDYX ~ hour, twins, tau = 24)
m2 <- cosinor(rDYX ~ hour, twins, tau = c(24, 12))
ggcosinor(m1, labels = FALSE)
ggcosinor(m2)
ggcosinor(list(single = m1, multiple = m2))

Graphical Assessment of Amplitude and Acrophase

Description

This is a ggplot-styled graphical representation of the ellipse region generated by the cosinor analysis. It requires a model fit with cosinor(), from which the plotted statistics are taken. This includes the amplitude and acrophase.

Usage

ggellipse(object, level = 0.95, ...)

Arguments

object

Requires a cosinor model to extract the correct statistics to generate the plot.

level

Confidence level for ellipse

...

Additional parameters may be needed for extensibility

Value

Object of class ggplot to help identify confidence intervals

Examples

data("twins")
m <- cosinor(rDYX ~ hour, twins, tau = 24)
ggellipse(m)

Output from MATLAB HRV Toolbox

Description

Output from the HRV Toolbox for a single patient. It contains granular HRV data calculated in 5-second sliding windows.

Usage

hrv

Format

A tibble data frame


Load FDA MAUDE Coding Resources by Annex

Description

Load Medical Device Report (MDR) adverse event coding tables published by the FDA. The annex argument selects which FDA annex to load. The interface is designed to support additional annexes over time as more code tables are added to the package data.

Usage

load_maude_codes(annex)

Arguments

annex

A single character identifying the FDA annex to load: "A", "E", or "F". Case-sensitive.

Details

The FDA publishes MDR adverse event codes as annexed code tables. This function returns the annex-specific table bundled with the package. Supported annexes:

  • A: Device problem codes

  • E: Clinical signs, symptoms, or conditions

  • F: Health impact codes

Value

A tbl_df of codes and related metadata for the requested annex. All returned annexes share the same core columns: annex, imdrf_code, fda_code, ncit_code, term, level_1, level_2, level_3, and definition.

References

FDA MDR Adverse Event Codes: Coding Resources for Medical Device Reports https://www.fda.gov/medical-devices/mdr-adverse-event-codes/coding-resources-medical-device-reports

Examples

# Load Annex E health effects codes
annex_e <- load_maude_codes("E")

Adjudicate MAUDE adverse events with a structured ellmer chat

Description

maude_adjudicate() uses reported MAUDE problem terms to narrow the candidate complication families, then sends each selected family to an ellmer chat object one at a time for structured adjudication against the supplied event narrative.

Usage

maude_adjudicate(
  terms,
  delimiter = ";",
  event_narrative,
  chat_object,
  definitions = complication_definitions,
  index = maude_complication_index
)

Arguments

terms

A character vector (not a list) of MAUDE problem terms or a single delimiter-separated string of MAUDE terms. Terms are normalized for matching against the bundled complication index.

delimiter

The character that separates MAUDE terms when terms is supplied as a single concatenated string. Defaults to ";".

event_narrative

A single adverse-event narrative, usually the event_narrative or comparable free-text field returned by maude_query(). This is the clinical text that the LLM adjudicates.

chat_object

An ellmer chat object. The object is cloned and reset before each complication-family request, so prior turns are not reused. The function uses $clone(), $set_turns(), $set_system_prompt(), and $chat_structured() on this object. See ellmer::chat() for further details.

definitions

Named list of complication definitions. Names should be complication identifiers. Each element must contain a definition entry and a named classification entry. Defaults to complication_definitions.

index

Named list mapping complication identifiers to normalized MAUDE problem terms. Names must be a subset of the names in definitions, with optional "not_indexed" allowed as a residual bucket. Defaults to maude_complication_index.

Details

Supply chat_object as an ellmer chat object created with a provider-specific constructor such as ellmer::chat_openai(), ellmer::chat_anthropic(), or another ellmer::chat_*() backend. Users are responsible for supplying their own provider credentials or API key configuration when creating that chat object. maude_adjudicate() does not accept API keys directly; secrets should stay in the provider configuration layer, typically via environment variables such as OPENAI_API_KEY, before constructing the chat object.

The function uses a fixed internal system prompt. In summary, the prompt tells the LLM to act as an expert clinical adjudicator with experienced physician-level judgment, treat the MAUDE terms and narrative as untrusted text, ignore instructions embedded in the narrative, use only the supplied complication family definition and classification definitions, default all classifications to FALSE unless the narrative directly supports them, and return only the structured response.

For each selected complication family, the function builds an ellmer::type_object() schema with one optional ellmer::type_boolean() field per classification branch. $chat_structured() then returns an R list that matches that schema, which this function converts into explicit 0/1 flags in the complication-family output shape.
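The final conversion step, from a structured response to explicit 0/1 flags, can be sketched in base R without calling any LLM. The helper flags_from_response() and the subcategory names are hypothetical. The sketch treats any field that is missing or not TRUE as 0, mirroring the default-to-FALSE behavior described in the system prompt summary.

```r
# Hypothetical helper: turn a structured response (a named list of
# optional logicals) into integer 0/1 flags for every subcategory
flags_from_response <- function(response, classification_names) {
  vapply(
    classification_names,
    function(nm) as.integer(isTRUE(response[[nm]])),
    integer(1)
  )
}

# A response that omits one optional field entirely
response <- list(tamponade = TRUE, effusion = FALSE)
flags <- flags_from_response(
  response,
  c("tamponade", "effusion", "insufficient_info")
)
flags
```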

Each complication family is adjudicated in a fresh cloned chat with prior turns removed and tools cleared when supported, so clinical text from one request is not retained in the next request.

Allowed chat objects are ellmer Chat objects created by ellmer::chat() or provider-specific constructors such as ellmer::chat_openai() and ellmer::chat_anthropic(). In practice, the object should support $chat_structured(), $clone(), $set_turns(), and $set_system_prompt().

Value

A named list of complication families. Each family contains named integer flags (0/1) for its adjudication classifications.

Examples

## Not run: 
# Assumes OPENAI_API_KEY is set in ~/.Renviron or the current environment.
chat_object <- ellmer::chat_openai(
  model = "gpt-4.1-mini"
)

maude_adjudicate(
  terms = "Pericardial Effusion; Low blood pressure / hypotension",
  event_narrative = paste(
    "Small pericardial effusion noted at case end without hemodynamic",
    "compromise. Observed overnight without drainage."
  ),
  chat_object = chat_object
)

## End(Not run)

MAUDE Complication Index for Cardiovascular Procedure Adverse Events

Description

A named list of normalized MAUDE problem terms organized into the complication categories defined in complication_definitions, plus a residual not_indexed bucket. The list is maintained explicitly in data-raw/complications.R as hand-written term vectors. Overlap between categories is expected.

Usage

maude_complication_index

Format

An object of class list of length 16.

Examples

maude_complication_index$pericardial
maude_complication_index$stroke
maude_complication_index$device_malfunction
maude_complication_index$not_indexed

Query the FDA MAUDE Database

Description

maude_query() queries the Manufacturer and User Facility Device Experience (MAUDE) database using the openFDA API. This is the recommended interface for most users, providing automatic pagination, date range handling, and input validation.

maude_fda_api_call() is the lower-level function that makes direct API calls. Use this for advanced scenarios requiring manual pagination control or pre-constructed query strings.

Usage

maude_query(
  search = NULL,
  device_generic_name = NULL,
  device_brand_name = NULL,
  manufacturer_name = NULL,
  event_type = NULL,
  report_number = NULL,
  device_problem = NULL,
  patient_problem = NULL,
  ...,
  limit = 100,
  date_start = NULL,
  date_end = NULL,
  api_key = NULL,
  descriptions_from_web = FALSE,
  verbose = interactive()
)

maude_fda_api_call(
  search_query,
  limit,
  skip,
  api_key,
  sort = NULL,
  max_retries = 3,
  verbose = FALSE
)

Arguments

search

Character string specifying the search query or NULL. For maude_query(), this can be a simple term (e.g., "pacemaker") or a field-specific query (e.g., "device.generic_name:pacemaker"). Use the other maude_query() arguments to add additional filters.

device_generic_name

Optional character vector of device generic names to filter on (device.generic_name).

device_brand_name

Optional character vector of device brand names to filter on (device.brand_name).

manufacturer_name

Optional character vector of manufacturer names to filter on (device.manufacturer_d_name).

event_type

Optional character vector of event types to filter on (event_type).

report_number

Optional character vector of MDR report numbers to filter on (report_number).

device_problem

Optional character vector of device problem codes to filter on (device.device_problem_codes).

patient_problem

Optional character vector of patient problems to filter on (patient.patient_problems).

...

Additional named search fields and values to include in the query. Names should match the openFDA searchable fields list.

limit

Integer specifying the maximum number of records to return. For maude_query(), defaults to 100 and requests exceeding 1000 are automatically paginated. Due to openFDA skip limits, maude_query() currently supports up to 26,000 records per call. For maude_fda_api_call(), maximum per request is 1000 per openFDA limits.

date_start

Optional start date for filtering by date_received. Prefer a Date object, such as as.Date("2026-01-01"). POSIXt, "YYYYMMDD", and "YYYY-MM-DD" values are also accepted and converted to Date. Only used by maude_query().

date_end

Optional end date for filtering by date_received. Prefer a Date object, such as as.Date("2026-01-31"). POSIXt, "YYYYMMDD", and "YYYY-MM-DD" values are also accepted and converted to Date. Only used by maude_query().

api_key

Optional character string containing your openFDA API key. Not required, but recommended for heavy usage to avoid rate limiting. Register at: https://open.fda.gov/apis/authentication/

descriptions_from_web

Logical. If TRUE, after the API call maude_query() backfills missing event_description values in two passes: first from FDA's bulk MAUDE narrative archives (foitext{YYYY}.zip, foitextadd.zip, foitextchange.zip, foitextthru1995.zip), then by scraping the FDA MAUDE detail page for any rows still missing. Disabled by default because the web pass adds one HTTP request per remaining missing report. The bulk archives are downloaded once and cached locally. Requires the rvest and xml2 packages for the web pass.

verbose

Logical. If TRUE, prints progress messages for pagination, retries, and total records retrieved. Defaults to interactive().

search_query

Fully constructed query string to send to the openFDA API (advanced use). This should include any date or field filters that you want applied exactly as written.

skip

Integer specifying the number of records to skip for pagination. Only used by maude_fda_api_call(). Combined with limit, allows fetching records in pages (e.g., skip=0 gets records 1-1000, skip=1000 gets 1001-2000).

sort

Optional sort specification for the openFDA API (e.g., "date_received:desc"). Only used by maude_fda_api_call().

max_retries

Integer giving the number of retry attempts for transient API failures in maude_fda_api_call() (HTTP 429/5xx). Defaults to 3.

Details

Database Coverage: The openFDA device adverse event endpoint contains reports from mandatory reporters (manufacturers, importers, and device user facilities) and voluntary reporters (healthcare professionals, patients, and consumers). Data covers publicly releasable records from approximately 1992 to present and is updated weekly.

Rate Limits: The openFDA API allows approximately 240 requests per minute (4 per second) with or without an API key. The benefit of a key is a higher daily quota: openFDA documents 1,000 requests per day without a key and 120,000 requests per day with one. Large queries are automatically paginated in batches of up to 1000 records.
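The pagination arithmetic can be sketched in base R. The helper plan_pages() is hypothetical and makes no API calls. It only lists the skip and limit pair for each request, assuming a page size of 1000 and a maximum skip of 25,000 (the value implied by the 26,000-record ceiling noted for limit).

```r
# Hypothetical helper: plan the skip/limit pair for each page
plan_pages <- function(limit, page_size = 1000, max_skip = 25000) {
  stopifnot(limit >= 1, limit <= max_skip + page_size)
  skip <- seq(0, limit - 1, by = page_size)
  data.frame(skip = skip, limit = pmin(page_size, limit - skip))
}

# 2500 records need three requests: 1000, 1000, and 500
pages <- plan_pages(2500)
pages
```

Each row could then be passed to a lower-level call such as maude_fda_api_call(), with a pause between requests if the per-minute limit is a concern.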

Result Order: maude_query() requests results sorted in reverse chronological order by date_received (date_received:desc). This provides deterministic pagination for large requests.

Search Syntax: The search parameter uses Elasticsearch query syntax. Common patterns include:

  • Simple term: "pacemaker"

  • Field-specific: "device.generic_name:pacemaker"

  • Multiple terms: "device.generic_name:pacemaker AND event_type:malfunction"

  • Date range: "date_received:[20200101 TO 20201231]"

  • Exact phrase: "device.brand_name:\"Medtronic\""

Building Queries in maude_query(): maude_query() is designed to help you build a valid search string without writing the full query yourself. Use the common field arguments (e.g., device_generic_name, event_type) to add structured filters, and pass any additional fields through .... The ... names should match openFDA searchable fields, documented at: https://open.fda.gov/apis/device/event/searchable-fields/.
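To make the query-building idea concrete, the base R sketch below assembles a search string from named fields and an optional date range. The helper build_search() is hypothetical. Quoting every value, joining multiple values with OR, and joining fields with AND are assumptions made for illustration, not a description of what maude_query() generates internally.

```r
# Hypothetical helper: combine named field filters into one query string
build_search <- function(..., date_start = NULL, date_end = NULL) {
  fields <- list(...)
  clauses <- vapply(names(fields), function(nm) {
    values <- sprintf('%s:"%s"', nm, fields[[nm]])
    if (length(values) > 1) {
      paste0("(", paste(values, collapse = " OR "), ")")
    } else {
      values
    }
  }, character(1))
  if (!is.null(date_start) && !is.null(date_end)) {
    clauses <- c(clauses, sprintf(
      "date_received:[%s TO %s]",
      format(date_start, "%Y%m%d"),
      format(date_end, "%Y%m%d")
    ))
  }
  paste(clauses, collapse = " AND ")
}

query <- build_search(
  device.generic_name = "pacemaker",
  event_type = "malfunction",
  date_start = as.Date("2020-01-01"),
  date_end = as.Date("2020-12-31")
)
query
```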

API Response Handling: The openFDA API returns HTTP 404 for queries with no results (rather than an empty array). Both functions handle this by returning an empty tibble instead of throwing an error.

When to use maude_fda_api_call(): Most users should use maude_query(). The lower-level maude_fda_api_call() is useful when you need:

  • Direct control over skip for custom pagination strategies

  • Pre-constructed query strings with complex Elasticsearch syntax

  • Integration into custom retry/error-handling logic

  • Full control over search_query, sort, and other openFDA parameters

Value

A tbl_df containing device adverse event reports with columns:

report_number

MDR report number (unique identifier)

event_type

Type of event (e.g., "Malfunction", "Injury", "Death")

date_received

Date the report was received by FDA

device_generic_name

Generic name of the device

device_brand_name

Brand name of the device

manufacturer_name

Name of the device manufacturer

event_description

Narrative description of the adverse event

patient_problem

Reported problems affecting the patient

device_problem

Reported problems with the device

Returns an empty tibble if no results are found.

References

openFDA Device Adverse Event API: https://open.fda.gov/apis/device/event/

MAUDE Database Overview: https://open.fda.gov/data/maude/

openFDA API Query Parameters: https://open.fda.gov/apis/query-parameters/

Examples

## Not run: 
# Search for pacemaker-related adverse events
pacemaker_events <- maude_query("pacemaker", limit = 10)

# Search by device generic name
results <- maude_query(device_generic_name = "defibrillator", limit = 50)

# Filter by received-date range using Date objects
results <- maude_query(
  search = "pacemaker",
  date_start = as.Date("2026-01-01"),
  date_end = as.Date("2026-01-31"),
  limit = 50
)

# Add an extra searchable field via ...
results <- maude_query(
  device_generic_name = "infusion pump",
  device.product_code = "LVP",
  limit = 50
)

## End(Not run)

Matching MAUDE terms to complications

Description

maude_term_to_complication() groups MAUDE problem terms into the complication categories defined by a user-supplied matching index. Input terms are normalized before matching, so differences in capitalization, punctuation, and spacing do not affect the result.

A term may appear in more than one complication category if the underlying MAUDE index overlaps across categories. Returned values preserve the original user-supplied terms rather than the normalized versions used for matching. Terms that do not match any index entry are returned under "not_indexed".
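The behavior described above (normalize, match, return the original terms, and collect leftovers under "not_indexed") can be sketched in base R. The helpers normalize_term() and group_terms() and the two-entry toy_index are hypothetical. The package's own normalization rules and index are more extensive.

```r
# Hypothetical normalizer: lower-case, strip punctuation, squeeze spaces
normalize_term <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]+", " ", x)
  trimws(gsub("\\s+", " ", x))
}

# Toy index for illustration only
toy_index <- list(
  pericardial = c("cardiac tamponade", "pericardial effusion"),
  vascular = c("hematoma")
)

# Group the original terms by category; drop empty groups
group_terms <- function(term, index) {
  key <- normalize_term(term)
  out <- lapply(index, function(entries) term[key %in% entries])
  out$not_indexed <- term[!(key %in% unlist(index))]
  Filter(length, out)
}

grouped <- group_terms(
  c("Cardiac  Tamponade", "PERICARDIAL-EFFUSION", "Rash"),
  toy_index
)
grouped
```

Matching on the normalized key while subsetting the original vector is what keeps the returned values identical to the user's input.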

Usage

maude_term_to_complication(
  term,
  definitions = complication_definitions,
  index = maude_complication_index
)

Arguments

term

Character vector of MAUDE problem terms to group into complications.

definitions

Named list of complication definitions. Defaults to complication_definitions.

index

Named list mapping complication identifiers to normalized MAUDE problem terms. Defaults to maude_complication_index.

Value

A named list. Each list name is a complication identifier, or "not_indexed" for unmatched terms, and each element is a character vector of the original input terms that matched that complication. Empty complication groups are omitted.

Examples

maude_term_to_complication(c(
  "Cardiac Tamponade",
  "Low blood pressure / hypotension",
  "No Health Consequences or Impact"
),
definitions = complication_definitions,
index = maude_complication_index)

Predict from a cosinor

Description

Generate predictions from a fitted cosinor model for new data.

Usage

## S3 method for class 'cosinor'
predict(object, new_data, type = "numeric", ...)

Arguments

object

A cosinor object.

new_data

A data frame or matrix of new predictors.

type

A single character. The type of predictions to generate. Valid options are:

  • "numeric" for numeric predictions.

...

Additional arguments passed to the prediction function

Value

A tibble of predictions. The number of rows in the tibble is guaranteed to be the same as the number of rows in new_data.


Centers for Medicare & Medicaid Services (CMS) Procedure Codes

Description

This generator function returns CMS procedure codes as a dataset that maps each procedure code to its description, which helps in interpreting the interventions performed. The supported versions are listed in the details.

The following procedure codes are currently supported:

  • ICD9 procedure codes, most recently updated on 2014-10-01

  • ICD10 procedure codes, most recently updated on 2023-01-11

  • HCPCS procedure codes, most recently updated on 2023-11-29

  • CPT procedure codes, most recently updated on 2023-11-29

Usage

procedure_codes(format, version)

Arguments

format

The format of the procedure codes, written as a character. Currently supported formats are: c("icd9", "icd10", "hcpcs", "cpt") (case-insensitive).

version

The version of the procedure codes, which are generally written as a year. Currently supported: c("2014", "2023")

Details

CMS usually releases updated versions of these codes annually. Each supported dataset is identified by the year it was published (the date it became publicly available, not the go-live date). The versions included in the package are listed below.

  • ICD9: 2014

  • ICD10: 2023

  • HCPCS: 2023

  • CPT: 2023

Value

A tbl_df with two columns: code and description. The code refers to the procedure code, while the description refers to the description of the procedure.

Examples

# Procedure codes from the 2014 version of ICD-9
icd9 <- procedure_codes(format = "icd9", version = 2014)
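
A two-column code/description table like the one returned here can be used as a lookup. The sketch below uses base R only; the codes, descriptions, and the claims table are made-up placeholders (assumptions for illustration), not real CMS entries.

```r
# Placeholder lookup with the same two columns as the returned table
code_lookup <- data.frame(
  code = c("X001", "X002"),
  description = c("Placeholder procedure A", "Placeholder procedure B"),
  stringsAsFactors = FALSE
)

# Hypothetical procedure records keyed by code
claims <- data.frame(
  patient = c(1, 2, 3),
  code = c("X002", "X001", "X999"),
  stringsAsFactors = FALSE
)

# Attach descriptions by code; codes absent from the lookup become NA
claims$description <- code_lookup$description[match(claims$code, code_lookup$code)]
```

Using match() keeps the row order of the records and leaves unmapped codes visible as NA rather than dropping them.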

Summarize Genetic Variants by Gene

Description

Takes variant-level results from query_genetic_variants() and aggregates them to unique genes with summary statistics about associated variants.

Usage

query_genes_by_phenotype(
  phenotype,
  database = "clinvar",
  api_key = NULL,
  max_results = 500,
  genes = NULL,
  clean_gene_symbols = TRUE
)

Arguments

phenotype

Character string specifying the phenotype or disease condition to search for (same as query_genetic_variants()).

database

Character string specifying which database to query.

api_key

Optional character string containing your NCBI API key.

max_results

Integer specifying maximum number of variants to query before aggregation. Default is 500. Large queries will automatically paginate in 500-record batches and may take several minutes due to API rate limits. There is no hard maximum.

genes

Optional character vector of gene symbols to filter results.

clean_gene_symbols

Logical indicating whether to clean gene symbols. Default is TRUE.

Value

A tibble with one row per unique gene, containing:

gene_symbol

Gene symbol

n_variants

Total number of variants for this gene

n_pathogenic

Number of pathogenic/likely pathogenic variants

n_benign

Number of benign/likely benign variants

n_vus

Number of variants of uncertain significance

phenotypes

Unique phenotypes associated with this gene (collapsed)

chromosomes

Chromosome(s) where gene is located

database

Source database
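
The aggregation from variant-level rows to one row per gene can be sketched in base R as follows. This is an illustration only: summarize_by_gene() is a hypothetical helper, the input rows are invented placeholders, and the exact significance labels the package counts in each category are an assumption.

```r
# Placeholder variant-level rows (not real ClinVar records)
variants <- data.frame(
  gene_symbol = c("GENE1", "GENE1", "GENE1", "GENE2", "GENE2"),
  clinical_significance = c("Pathogenic", "Likely pathogenic",
                            "Uncertain significance", "Benign", "Likely benign"),
  phenotypes = c("condition A", "condition A", "condition B",
                 "condition A", "condition A"),
  chromosome = c("1", "1", "1", "2", "2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse variants to one summary row per gene
summarize_by_gene <- function(variants) {
  sig <- tolower(variants$clinical_significance)
  groups <- split(seq_len(nrow(variants)), variants$gene_symbol)
  rows <- lapply(names(groups), function(g) {
    i <- groups[[g]]
    data.frame(
      gene_symbol = g,
      n_variants = length(i),
      n_pathogenic = sum(sig[i] %in% c("pathogenic", "likely pathogenic",
                                       "pathogenic/likely pathogenic")),
      n_benign = sum(sig[i] %in% c("benign", "likely benign",
                                   "benign/likely benign")),
      n_vus = sum(sig[i] == "uncertain significance"),
      phenotypes = paste(sort(unique(variants$phenotypes[i])), collapse = "; "),
      chromosomes = paste(sort(unique(variants$chromosome[i])), collapse = "; "),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

gene_summary <- summarize_by_gene(variants)
```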


Query Genetic Variant Databases by Phenotype

Description

Query online genetic variant databases (ClinVar and others) to retrieve gene-disease associations based on a phenotype or clinical condition.

Usage

query_genetic_variants(
  phenotype,
  database = "clinvar",
  api_key = NULL,
  max_results = 100,
  genes = NULL,
  clean_gene_symbols = TRUE
)

Arguments

phenotype

Character string specifying the phenotype or disease condition to search for (e.g., "atrial fibrillation", "hypertrophic cardiomyopathy").

database

Character string specifying which database to query. Currently supports "clinvar" (default). Additional databases may be added in future versions.

api_key

Optional character string containing your NCBI API key. Providing an API key increases the rate limit from 3 to 10 requests per second. Get a key at: https://www.ncbi.nlm.nih.gov/account/

max_results

Integer specifying maximum number of results to return. Default is 100. Large queries will automatically paginate in 500-record batches and may take several minutes due to API rate limits. There is no hard maximum, but consider the total available (shown in the message) when requesting large numbers.

genes

Optional character vector of gene symbols to filter results. If provided, only variants in these genes will be returned. Default is NULL (return all genes).

clean_gene_symbols

Logical indicating whether to clean/normalize gene symbol values (remove/replace pseudogenes such as LOC*, LINC*, MIR*). Default is TRUE.

Details

The function queries the specified genetic variant database and returns a standardized table of results. For ClinVar, it uses the NCBI E-utilities API with automatic rate limiting to comply with NCBI's usage policies (3 requests/second without API key, 10 requests/second with API key).

The search is performed using disease/phenotype field matching and may return variants for the exact condition as well as related conditions (e.g., searching for "autism" may return "autism spectrum disorder").
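
The rate limits and 500-record batching described above imply some simple arithmetic, sketched here in base R. The helper names request_interval() and batch_starts() are hypothetical and are not part of the package; the zero-based offsets assume paging in the style of the E-utilities retstart parameter.

```r
# Minimum delay between requests implied by the documented limits
# (3 requests/second without an API key, 10 with one)
request_interval <- function(api_key = NULL) {
  requests_per_second <- if (is.null(api_key)) 3 else 10
  1 / requests_per_second
}

# Zero-based offset of each batch needed to cover max_results records
batch_starts <- function(max_results, batch_size = 500) {
  seq(0, max_results - 1, by = batch_size)
}

request_interval()         # seconds to wait without a key
request_interval("my-key") # seconds to wait with a (placeholder) key
batch_starts(1200)         # offsets of the three batches for 1200 records
```

A caller could pause with Sys.sleep(request_interval(api_key)) between successive batch requests to stay under the limit.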

Value

A tibble with the following columns:

gene_symbol

Gene symbol (e.g., "TTN")

variant_id

Database-specific variant identifier

variant_name

Human-readable variant name (HGVS notation when available)

chromosome

Chromosome location

position

Genomic position

clinical_significance

Clinical interpretation (e.g., "Pathogenic", "Benign")

review_status

Level of expert review

phenotypes

Associated phenotypes/conditions

molecular_consequence

Effect on protein/transcript

database

Source database

Examples

## Not run: 
af_variants <- query_genetic_variants("atrial fibrillation")

## End(Not run)

Read VEP Output File

Description

Read a VEP output file into a tidy tibble.

Usage

read_vep_data(
  file,
  format = c("tab", "vcf"),
  columns = NULL,
  parse_extra = TRUE
)

Arguments

file

Path to the VEP output file. Can be uncompressed or gzipped.

format

Character string specifying the output format. One of:

"tab"

Default VEP tab-delimited output format (default)

"vcf"

VCF format output (produced with --vcf flag)

columns

Character vector of column names to include in the output. Default is NULL (returns all columns). Can reference both fixed VEP columns (e.g., "Gene", "Consequence") and Extra/CSQ fields (e.g., "SYMBOL", "IMPACT", "SIFT", "LoF").

parse_extra

Logical. Whether to parse the Extra column (tab) or CSQ field (VCF) into individual columns. Default is TRUE.

Details

read_vep_data() parses tab-delimited and VCF-style VEP outputs and returns one row per annotated consequence record. By default it expands Extra (tab) or CSQ (VCF) into separate columns for easier downstream filtering.

This function uses read_vep_header to infer structure and then reads the data body in a single pass.

Tab-delimited format:

  • 14 columns: 13 fixed fields plus "Extra" column

  • Extra contains semicolon-delimited key=value pairs

  • Standalone flags are converted to YES (for example, CANONICAL)

VCF format:

  • Standard VCF columns with CSQ in INFO field

  • CSQ contains pipe-delimited values ordered by the CSQ header definition

The function automatically:

  • Detects and handles gzipped files

  • Skips metadata + header lines

  • Replaces VEP dash placeholders ("-") with NA

  • Preserves metadata for provenance tracking
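
The handling of the Extra column described above (semicolon-delimited key=value pairs, standalone flags converted to YES, dashes replaced with NA) can be sketched in base R. parse_extra_string() is a hypothetical helper written for illustration and is not the package's parser; the input string is an invented example.

```r
# Hypothetical helper: parse one Extra string into a named character vector
parse_extra_string <- function(extra) {
  if (is.na(extra) || extra == "-" || extra == "") {
    return(character(0))
  }
  parts <- strsplit(extra, ";", fixed = TRUE)[[1]]
  has_value <- grepl("=", parts, fixed = TRUE)
  keys <- ifelse(has_value, sub("=.*$", "", parts), parts)
  # Standalone flags carry no value, so record them as "YES"
  values <- ifelse(has_value, sub("^[^=]*=", "", parts), "YES")
  # A dash is a placeholder for a missing value
  values[values == "-"] <- NA_character_
  stats::setNames(values, keys)
}

parsed <- parse_extra_string("IMPACT=MODIFIER;SYMBOL=GENE1;CANONICAL;SIFT=-")
```

Applying such a function row by row and binding the results would give one column per key, which is the shape described for parse_extra = TRUE.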

Value

A tibble with one row per variant-transcript annotation. If parse_extra is TRUE, the Extra/CSQ column is replaced by its parsed key-value pairs as individual columns. VEP dashes ("-") are replaced with NA. The returned tibble has the following attributes:

vep_header

The complete header structure from read_vep_header()

source_file

Normalized path to the source VEP file

See Also

read_vep_header for reading just the header information

Examples

## Not run: 
# Read all columns from tab-delimited VEP output
df <- read_vep_data("sample.filtered")

# Read only specific columns of interest
df <- read_vep_data(
  "sample.filtered",
  columns = c("Uploaded_variation", "Gene", "Consequence",
              "SYMBOL", "IMPACT", "SIFT", "PolyPhen", "LoF")
)

# Read without parsing Extra (faster, returns raw Extra string)
df <- read_vep_data("sample.filtered", parse_extra = FALSE)

# Access metadata
attr(df, "vep_header")$meta$vep_version
attr(df, "vep_header")$meta$assembly

# Read gzipped VCF format
df <- read_vep_data("sample.vep.vcf.gz", format = "vcf")

## End(Not run)

Read VEP File Header

Description

Read only the metadata section of a VEP output file.

Usage

read_vep_header(file, format = c("tab", "vcf"), n_max = 1000)

Arguments

file

Path to the VEP output file. Can be uncompressed or gzipped.

format

Character string specifying the output format. One of:

"tab"

Default VEP tab-delimited output format

"vcf"

VCF format output (produced with --vcf flag)

n_max

Maximum number of lines to scan for header content. Default is 1000, which should be sufficient for most VEP files.

Details

This is useful when you want to inspect VEP version, assembly, parsed field definitions, or column layout before loading the full annotation table.

VEP can output results in different formats depending on command-line flags. This function supports the two most common modes:

Tab-delimited format (default or --tab):

  • Metadata lines begin with ##

  • Column header line begins with a single #

  • Annotation fields are described under "Extra column keys"

VCF format (--vcf):

  • Standard VCF headers with ## metadata

  • Consequence annotations are stored in INFO/CSQ

  • CSQ field order is declared in the ##INFO=<ID=CSQ,...> header line

The returned structure is consistent across formats. If no annotation fields are found, annotations is returned as NULL.
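
For the VCF case, the relationship between the CSQ header line and a pipe-delimited CSQ entry can be sketched in base R. The helpers csq_fields() and split_csq() are hypothetical, and the header line and entry are shortened, invented examples; this is not the package's parser.

```r
# Shortened example of a CSQ declaration in a VEP VCF header
header_line <- paste0(
  '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations ',
  'from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL">'
)

# Hypothetical helper: field names in the order declared after "Format: "
csq_fields <- function(line) {
  fmt <- sub('^.*Format: ([^"]*)".*$', "\\1", line)
  strsplit(fmt, "|", fixed = TRUE)[[1]]
}

# Hypothetical helper: label one CSQ entry with the declared field names
split_csq <- function(entry, fields) {
  values <- strsplit(entry, "|", fixed = TRUE)[[1]]
  length(values) <- length(fields)  # pad trailing empty fields with NA
  values[!is.na(values) & values == ""] <- NA_character_
  stats::setNames(values, fields)
}

fields <- csq_fields(header_line)
record <- split_csq("A|missense_variant|MODERATE|", fields)
```

Because the values carry no keys of their own, the field order taken from the header is the only thing that gives each value its meaning.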

Value

A named list containing:

meta

List of file metadata including format, VEP version, assembly, command line, and raw metadata lines

columns

Named character vector where names are column names and values are descriptions (if available)

annotations

Named list where names are annotation field names and values are their definitions from the header. For tab format these are Extra column fields; for VCF format these are CSQ fields (with NA values since VCF doesn't include descriptions). NULL if no annotations.

See Also

https://www.ensembl.org/info/docs/tools/vep/vep_formats.html for VEP output format documentation

Examples

## Not run: 
# Read header from tab-delimited VEP output
header <- read_vep_header("variants.vep.txt", format = "tab")
names(header$columns)
names(header$annotations)

# Get description for a specific annotation
header$annotations[["CLIN_SIG"]]

# Read header from VCF format VEP output
header <- read_vep_header("variants.vep.vcf", format = "vcf")

## End(Not run)

Recurrent event sample data

Description

Data is from an outcomes study on cardiovascular outcomes. It contains the first visit date, the last known date, and the times of various events that occurred. Death at right censoring is documented as well. The events are non-ordered.

Usage

stress

Format

A tibble data frame


Tidy a cosinor object

Description

Tidy summarizes information about the components of a cosinor model.

Usage

## S3 method for class 'cosinor'
tidy(x, conf.int = FALSE, conf.level = 0.95, ...)

Arguments

x

A cosinor object created by cosinor()

conf.int

Logical indicating whether or not to include a confidence interval in the tidied output

conf.level

The confidence level to use if conf.int = TRUE. Must be between 0 and 1, with a default of 0.95 (the 95% confidence interval).

...

For extensibility

Details

cosinor objects do not necessarily have a t-statistic, because the standard error is not based on a mean value; instead, the estimates form a joint confidence interval. The standard error is generated using a Taylor series expansion, as the model is a type of harmonic regression.
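
To make the harmonic-regression and Taylor-series remarks concrete, the sketch below fits a single-component cosinor with lm() and applies a first-order Taylor (delta method) approximation to the amplitude. It is an illustration under assumptions (simulated data, a 24-hour period, the parametrization y = M + A cos(omega t + phi)) and is not necessarily how cosinor() computes its standard errors.

```r
set.seed(1)
tau   <- 24                    # assumed period, in hours
hours <- 0:47                  # two days of hourly samples
omega <- 2 * pi / tau

# Simulated signal: MESOR 10, amplitude 3, acrophase -pi/3, plus noise
y <- 10 + 3 * cos(omega * hours - pi / 3) + rnorm(length(hours), sd = 0.5)

# Harmonic regression: y = M + beta * cos(omega t) + gamma * sin(omega t)
fit <- lm(y ~ cos(omega * hours) + sin(omega * hours))
b <- unname(coef(fit)[2])      # beta  =  A * cos(phi)
g <- unname(coef(fit)[3])      # gamma = -A * sin(phi)
V <- vcov(fit)[2:3, 2:3]       # covariance of (beta, gamma)

amplitude <- sqrt(b^2 + g^2)
acrophase <- atan2(-g, b)

# Delta method: gradient of A with respect to (beta, gamma)
grad <- c(b, g) / amplitude
se_amplitude <- sqrt(drop(t(grad) %*% V %*% grad))
```

The amplitude and acrophase are nonlinear functions of the two regression coefficients, which is why their uncertainty is approximated from the joint covariance of those coefficients rather than read off a single coefficient's standard error.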

Value

a tibble object


Hourly time series data with clinical covariates

Description

Clinical data is also available for visualization and comparison. Other HRV measures are used here for comparison and testing out functions.

Usage

triplets

Format

A tbl_df


Hourly time series data with clinical covariates

Description

Data is from an algorithm that generates a summary HRV measure using the Poincare phase-space plot, generated from kurtoses of the x and y axes. Clinical data is also available for visualization and comparison. There are repeat rows for each hour that Dyx was taken.

Usage

twins

Format

A tibble data frame


Validate that a list is a named list where each element is named

Description

Validate that a list is a named list where each element is named

Usage

validate_named_list(x)
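
One way to express this kind of check in base R is sketched below. is_fully_named() is a hypothetical stand-in written for illustration; how validate_named_list() itself reports a failure (for example, returning FALSE versus raising an error) and how it treats an empty list are not documented here and are assumptions.

```r
# Hypothetical check: TRUE only for a list whose elements all have
# non-missing, non-empty names
is_fully_named <- function(x) {
  nms <- names(x)
  is.list(x) && !is.null(nms) && !anyNA(nms) && all(nzchar(nms))
}

is_fully_named(list(a = 1, b = 2))  # every element named
is_fully_named(list(a = 1, 2))      # second element has an empty name
is_fully_named(list(1, 2))          # no names at all
```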

Zipcodes with Associated Latitude and Longitude

Description

This is a dataset from the archived/orphaned zipcode package.

Usage

zipcode

Format

A data frame with character vector zipcodes and latitude/longitude