Type: Package
Package: bdpar
Title: Big Data Preprocessing Architecture
Version: 3.1.0
Authors@R: c(
    person(given = "Miguel", family = "Ferreiro-Díaz", role = c("aut","cre"), email = "miguel.ferreiro.diaz@gmail.com"),
    person(given = "David", family = "Ruano-Ordás", role = c("aut","ctr"), email = "drordas@uvigo.es"),
    person(given = "Tomás R.", family = "Cotos-Yañez", role = c("aut","ctr"), email = "cotos@uvigo.es"),
    person(given = "José Ramón", family = "Méndez Reboredo", role = c("aut","ctr"), email = "moncho.mendez@uvigo.es"),
    person(given = "University of Vigo", role = c("cph")))
Description: Provides a tool to easily build customized data flows to
    pre-process large volumes of information from different sources. To this
    end, 'bdpar' allows users to (i) easily use and create new functionalities
    and (ii) develop new data source extractors according to their needs.
    Additionally, the package provides a predefined default data flow to
    extract and pre-process the most relevant information (tokens, dates, ...)
    from some textual sources (SMS, Email, YouTube comments).
Date: 2023-12-11
License: GPL-3
URL: https://github.com/miferreiro/bdpar
BugReports: https://github.com/miferreiro/bdpar/issues
Depends: R (>= 3.5.0)
Imports: digest, parallel, R6, rlist, tools, utils
Suggests: cld2, knitr, rex, rjson, rmarkdown, stringi, stringr,
    testthat (>= 2.3.1), tuber
VignetteBuilder: knitr
RoxygenNote: 7.2.3
SystemRequirements: Python (>= 2.7 or >= 3.6)
Encoding: UTF-8
NeedsCompilation: no
Collate:
    'AbbreviationPipe.R'
    'bdpar.log.R'
    'wrapper.R'
    'Bdpar.R'
    'BdparOptions.R'
    'Connections.R'
    'ContractionPipe.R'
    'DefaultPipeline.R'
    'DynamicPipeline.R'
    'ExtractorEml.R'
    'ExtractorFactory.R'
    'ExtractorSms.R'
    'ExtractorYtbid.R'
    'File2Pipe.R'
    'FindEmojiPipe.R'
    'FindEmoticonPipe.R'
    'FindHashtagPipe.R'
    'FindUrlPipe.R'
    'FindUserNamePipe.R'
    'GenericPipe.R'
    'GenericPipeline.R'
    'GuessDatePipe.R'
    'GuessLanguagePipe.R'
    'Instance.R'
    'InterjectionPipe.R'
    'MeasureLengthPipe.R'
    'ResourceHandler.R'
    'SlangPipe.R'
    'StopWordPipe.R'
    'StoreFileExtPipe.R'
    'TargetAssigningPipe.R'
    'TeeCSVPipe.R'
    'ToLowerCasePipe.R'
    'bdpar.Options.R'
    'bdparData.R'
    'eml.R'
    'emojisData.R'
    'operator-pipe.R'
    'runPipeline.R'
    'zzz.R'
Config/pak/sysreqs: libxml2-dev python3
Repository: https://miferreiro.r-universe.dev
Date/Publication: 2023-12-12 17:29:06 UTC
RemoteUrl: https://github.com/miferreiro/bdpar
RemoteRef: HEAD
RemoteSha: e92df857b09e83ee4f197e68577b8f486dfebf8c
Packaged: 2026-06-23 06:09:44 UTC; root
Author: Miguel Ferreiro-Díaz [aut, cre],
    David Ruano-Ordás [aut, ctr],
    Tomás R. Cotos-Yañez [aut, ctr],
    José Ramón Méndez Reboredo [aut, ctr],
    University of Vigo [cph]
Maintainer: Miguel Ferreiro-Díaz