daiR: Interface with Google Cloud Document AI API

An R interface to the Google Cloud Services 'Document AI API' <https://cloud.google.com/document-ai/>, with additional tools for parsing output files and reconstructing text. 'Document AI' is a powerful server-based OCR processor that extracts text and tables from images and PDF files with high accuracy. 'daiR' gives R users programmatic access to this processor, along with tools to handle and visualize its output. See the package website <https://dair.info/> for more information and examples.

Version: 0.9.9
Depends: R (≥ 4.2.0)
Imports: base64enc, beepr, data.table, fs, gargle, glue, googleCloudStorageR, graphics, grDevices, httr, jsonlite, magick, pdftools, purrr, readtext, stats, stringr, utils, xml2
Suggests: knitr, ngram, rmarkdown, testthat (≥ 3.1.10)
Published: 2023-09-07
Author: Thomas Hegghammer [aut, cre]
Maintainer: Thomas Hegghammer <hegghammer at gmail.com>
BugReports: https://github.com/Hegghammer/daiR/issues
License: MIT + file LICENSE
URL: https://github.com/Hegghammer/daiR, https://dair.info
NeedsCompilation: no
Materials: README NEWS
CRAN checks: daiR results

Documentation:

Reference manual: daiR.pdf
Vignettes: Basic processing, Complex file and folder management, Extracting tables, Correcting text output from Google Document AI, Setting up a Google Storage bucket, Using Google Document AI with R

Downloads:

Package source: daiR_0.9.9.tar.gz
Windows binaries: r-devel: daiR_0.9.9.zip, r-release: daiR_0.9.9.zip, r-oldrel: daiR_0.9.9.zip
macOS binaries: r-release (arm64): daiR_0.9.9.tgz, r-oldrel (arm64): daiR_0.9.9.tgz, r-release (x86_64): daiR_0.9.9.tgz, r-oldrel (x86_64): daiR_0.9.9.tgz
Old sources: daiR archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=daiR to link to this page.