Help for package ProjectTemplate

Type: Package

Title: Automates the Creation of New Statistical Analysis Projects

Version: 0.11.1

Date: 2025-08-30

Description: Provides functions to automatically build a directory structure for a new R project. Using this structure, 'ProjectTemplate' automates data loading, preprocessing, library importing and unit testing.

License: GPL-3 | file LICENSE

Language: en-US

LazyLoad: yes

Encoding: UTF-8

Depends: R (≥ 2.7), digest, tibble

Imports: methods

Suggests: foreign, feather, reshape, plyr, formatR, qs, stringr, ggplot2, lubridate, log4r (≥ 0.1-5), DBI, RMySQL, RSQLite, gdata, RODBC, RJDBC, readxl, xlsx, tuneR, pixmap, data.table, RPostgreSQL, GetoptLong, whisker, testthat (≥ 3.0.0), reticulate

URL: http://projecttemplate.net

BugReports: https://github.com/KentonWhite/ProjectTemplate/issues

Collate: 'ProjectTemplate-package.R' 'add.config.R' 'preinstalled.readers.R' 'add.extension.R' 'addins.R' 'arff.reader.R' 'get.project.R' 'cache.R' 'cache.name.R' 'cache.project.R' 'clean.variable.name.R' 'clear.R' 'clear.cache.R' 'translate.dcf.R' 'config.R' 'create.project.R' 'create.project.rstudio.R' 'create.template.R' 'csv.reader.R' 'csv2.reader.R' 'db.reader.R' 'dbf.reader.R' 'epiinfo.reader.R' 'feather.reader.R' 'file.reader.R' 'list.data.R' 'load.project.R' 'migrate.project.R' 'migrate.template.R' 'mp3.reader.R' 'mtp.reader.R' 'octave.reader.R' 'ppm.reader.R' 'project.config.R' 'r.reader.R' 'rdata.reader.R' 'rds.reader.R' 'reload.project.R' 'require.package.R' 'run.project.R' 'show.project.R' 'spss.reader.R' 'sql.reader.R' 'stata.reader.R' 'stopifnotproject.R' 'stub.tests.R' 'systat.reader.R' 'test.project.R' 'tsv.reader.R' 'url.reader.R' 'wsv.reader.R' 'xls.reader.R' 'xlsx.reader.R' 'xport.reader.R'

RoxygenNote: 7.3.1

Config/testthat/edition: 3

NeedsCompilation: no

Packaged: 2025-08-30 19:56:58 UTC; kwhite

Author: Aleksandar Blagotic [ctb], Diego Valle-Jones [ctb], Jeffrey Breen [ctb], Joakim Lundborg [ctb], John Myles White [aut, cph], Josh Bode [ctb], Kenton White [ctb, cre], Kirill Mueller [ctb], Matteo Redaelli [ctb], Noah Lorang [ctb], Patrick Schalk [ctb], Dominik Schneider [ctb], Gerold Hepp [ctb], Zunaira Jamil [ctb], Glen Falk [ctb]

Maintainer: Kenton White <jkentonwhite@gmail.com>

Repository: CRAN

Date/Publication: 2025-08-30 20:10:02 UTC

ProjectTemplate: Automates the Creation of New Statistical Analysis Projects

Description

Provides functions to automatically build a directory structure for a new R project. Using this structure, 'ProjectTemplate' automates data loading, preprocessing, library importing and unit testing.

Author(s)

Maintainer: Kenton White jkentonwhite@gmail.com [contributor]

Authors:

John Myles White [copyright holder]

Other contributors:

Aleksandar Blagotic [contributor]
Diego Valle-Jones [contributor]
Jeffrey Breen [contributor]
Joakim Lundborg [contributor]
Josh Bode [contributor]
Kirill Mueller [contributor]
Matteo Redaelli [contributor]
Noah Lorang [contributor]
Patrick Schalk [contributor]
Dominik Schneider [contributor]
Gerold Hepp [contributor]
Zunaira Jamil [contributor]
Glen Falk [contributor]

Associate a reader function with an extension.

Description

This function will associate an extension with a custom reader function.

Usage

.add.extension(extension, reader)

Arguments

extension

The extension of the new data file.

reader

The function to use when reading the data file. It should accept three arguments: data.file, filename and variable.name (in that order). The function should read the contents of the file filename, and save it into the workspace under the name variable.name. The data.file argument is just a relative file name and can be ignored.

Value

No value is returned; this function is called for its side effects.

Warning

This interface should not be considered as stable and is likely to be replaced by a different mechanism in a forthcoming version of this package.

Examples

## Not run: .add.extension('foo', foo.reader)
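
A reader that satisfies the contract described under Arguments might look like the following minimal sketch. The name foo.reader, the 'foo' file format and the use of readLines are hypothetical. The sketch also assigns into the global environment, which is an assumption about where project variables are stored.

```r
# Hypothetical reader for plain-text files with a 'foo' extension.
# Arguments follow the documented order: data.file, filename, variable.name.
foo.reader <- function(data.file, filename, variable.name) {
  # data.file is only a relative file name and is ignored here.
  contents <- readLines(filename)
  # Assumption: variables are stored in the global environment.
  assign(variable.name, contents, envir = globalenv())
  invisible(NULL)
}

# Usage with a temporary file standing in for a data file.
tmp <- tempfile(fileext = ".foo")
writeLines(c("first line", "second line"), tmp)
foo.reader(basename(tmp), tmp, "foo.example")
print(get("foo.example", envir = globalenv()))
```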

Attach a package or add a namespace

Description

Internal method to attach a package or only add the namespace.

Usage

.attach.or.add.namespace(package.name, attach)

Arguments

package.name

name of the package to load, as a character vector

attach

boolean indicating whether to attach the package in the global namespace

Value

Boolean indicating whether the package was successfully loaded
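
The difference between attaching a package and only loading its namespace can be sketched with base R alone. The function name below is hypothetical and this is not the package's own implementation; it only illustrates the two code paths selected by attach.

```r
# Sketch: attach a package or only load its namespace.
attach_or_add_namespace <- function(package.name, attach) {
  if (attach) {
    # require() attaches the package and returns FALSE if it is unavailable.
    suppressWarnings(
      require(package.name, character.only = TRUE, quietly = TRUE)
    )
  } else {
    # requireNamespace() loads the namespace without attaching it.
    requireNamespace(package.name, quietly = TRUE)
  }
}

ok <- attach_or_add_namespace("stats", attach = FALSE)                # 'stats' ships with R
missing <- attach_or_add_namespace("no.such.package", attach = FALSE) # not installed
print(c(ok, missing))  # TRUE FALSE
```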

Construct the file names for the cache and hash

Description

Construct the file names for the cache and hash

Usage

.cache.filename(variable, cache_format)

Arguments

variable

Variable name for which to construct file names

cache_format

expression as returned by .cache.format

Details

The returned object is a list with two fields:

data: The path to the file in which the variable contents will be saved;
hash: The path to the file in which the cache metadata will be stored.

Value

A list with file names
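
The shape of the returned list can be illustrated with a small sketch. The function name is hypothetical, and the '.hash' extension for the metadata file is an assumption; the cache directory and the .RData format are taken from the description of cache elsewhere in this manual.

```r
# Sketch: build the data and hash paths for a cached variable.
# Assumption: metadata is stored next to the data with a '.hash' extension.
cache_filenames <- function(variable, extension = "RData", cache_dir = "cache") {
  list(
    data = file.path(cache_dir, paste0(variable, ".", extension)),
    hash = file.path(cache_dir, paste0(variable, ".hash"))
  )
}

files <- cache_filenames("dataset1")
print(files$data)  # "cache/dataset1.RData"
print(files$hash)  # "cache/dataset1.hash"
```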

Get configured cache file format strategy

Description

Get configured cache file format strategy

Usage

.cache.format()

Value

A named object of class expression.

Calculate the hash of the data stored in a variable

Description

Calculate the hash of the data stored in a variable

Usage

.cache.hash(variables, env = .TargetEnv)

Arguments

variables

character vector of variable names

env

environment from which to load the variable

Details

The hashes are calculated using the digest::digest function.

Value

data.frame with the variable names and the corresponding hashes

Print the current cache status

Description

Print the current cache status

Usage

.cache.status()

Value

No value is returned; this function is called for its side effects.

List all cached variables

Description

List all variables for which files are available in the cache. The info is purely based on the files in the cache directory. There is no guarantee the variable can actually be loaded from the cache.

Usage

.cached.variables()

Value

Character vector of cached variables

Compare the project version with the current ProjectTemplate version

Description

Compare the project version with the current ProjectTemplate version

Usage

.check.version(config, warn.migrate = TRUE)

Arguments

config

Project configuration

warn.migrate

Logical indicating whether a warning should be raised if the project version is older than the installed version of ProjectTemplate.

Value

0 if the numbers are equal, -1 if b is later and 1 if a is later (analogous to the C function strcmp).

Gives an R error on malformed inputs.
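
The return convention is the same as that of utils::compareVersion(a, b) in base R. Assuming the two versions are compared in that style (the text above does not say whether the project version or the installed version plays the role of a), the three outcomes look like this; the version strings are illustrative only.

```r
# utils::compareVersion() is part of base R.
a_newer <- utils::compareVersion("0.11.1", "0.10.0")  # 1: a is later
b_newer <- utils::compareVersion("0.9.4", "0.11.1")   # -1: b is later
equal   <- utils::compareVersion("0.11.1", "0.11.1")  # 0: equal
print(c(a_newer, b_newer, equal))
```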

Convert one or more data sets to data.tables

Description

Converts all base::data.frames referred to in the input to data.tables. The resulting data set is stored in the .TargetEnv.

Usage

.convert.to.data.table(data.sets)

Arguments

data.sets

A character vector of variable names.

Value

No value is returned; this function is called for its side effects.

Convert one or more data sets to tibbles

Description

Converts all base::data.frames referred to in the input to tibbles. The resulting data set is stored in the .TargetEnv.

Usage

.convert.to.tibble(data.sets)

Arguments

data.sets

A character vector of variable names.

Value

No value is returned; this function is called for its side effects.

Create a data.frame with the cache metadata

Description

Create a data.frame with the cache metadata

Usage

.create.cache.hash(variable, depends, CODE)

Arguments

variable

Name of the variable to be cached

depends

Vector of variable names of dependencies for the variable to be cached, optional.

CODE

Code block to generate variable, registered as a dependency, optional.

Details

The hashes for the various objects are calculated using the .cache.hash function.

Value

data.frame containing the variable name and its dependencies, with the corresponding hashes appended.

Create a project structure

Description

.create.project.existing creates a project directory structure inside an existing directory with the default files from a given template.

.create.project.new first creates a new directory and then passes further control to .create.project.existing. In case the project creation fails, the newly created directory is cleaned up.

Usage

.create.project.existing(
 project.name,
 merge.strategy,
 template,
 rstudio.project
)
.create.project.new(project.name, template, rstudio.project)

Arguments

project.name

Character vector with the name of the project directory

merge.strategy

Character vector determining whether the directory should be empty or is allowed to contain non-conflicting files

template

Name of the template from which the project should be created

rstudio.project

Logical indicating whether an .Rproj file should be created

Value

No value is returned; this function is called for its side effects.

Check if a directory is empty

Description

Checks if the directory listing by .list.files.and.dirs is empty.

Usage

.dir.empty(path)

Arguments

path

Character vector containing the path to the directory to check.

Value

Logical indicating whether the passed directory was empty.

Run code and assign the results to variable

Description

Run code and assign the results to variable

Usage

.evaluate.code(variable, CODE)

Arguments

variable

variable name in which to store the result of CODE

CODE

code block that returns a result which can be stored in a variable

Details

No error handling is done on the executed code, nor is the

Get the location of a template from its name

Description

Checks the configured option('ProjectTemplate.templatedir') for the template. If no matching template was found the system templates are checked, and finally the current directory is checked. If no template was found with the given name an error is raised.

Usage

.get.template(template)

Arguments

template

Character vector containing the name of the template

Value

Character vector containing the location of the template. If no template was found by the given name an error is raised.

Check if the project was loaded

Description

Currently does a very basic check to see if the variable project.info exists in the .TargetEnv. No check is performed on the contents of the variable.

Usage

.has.project()

Value

Logical indicating whether the project was loaded.

Initialize the logger for the project

Description

Creates a log4r::logger and provides a default log file log/project.log.

Usage

.init.logger(config, my.project.info)

Arguments

config

Named list containing the project configuration

my.project.info

Named list containing the project information

Value

Returns my.project.info amended with the new information.

Test whether a given path is a ProjectTemplate project

Description

Test whether a given path is a ProjectTemplate project

Usage

.is.ProjectTemplate(path = getwd())

Arguments

path

Directory to check, defaults to the current working directory.

Value

Logical indicating whether the given path is a valid project.

Check whether the cache is empty

Description

Check whether the cache is empty

Usage

.is.cache.empty()

Value

Logical indicating whether the cache is empty

Check whether variables are cached

Description

Check whether variables are cached

Usage

.is.cached(varnames)

Arguments

varnames

Character vector of variable names

Value

Logical vector indicating whether the variable is in the cache.

Check if path is an existing directory

Description

Checks if a given path exists, and if so if it is a directory.

Usage

.is.dir(path)

Arguments

path

Character vector containing the path to the directory to check.

Value

Logical indicating a valid directory was passed.

Build the list of data available for loading into memory

Description

This function produces a data.frame of all data files in the project, with meta data on if and how the file will be loaded by load.project.

Usage

.list.data(config)

Arguments

config

List containing the configuration to use.

Details

The returned data.frame contains the following variables, with one observation per file in data/:

filename Character variable containing the filename relative to data/ directory.

varname Character variable containing the name of the variable into which the file will be imported. *

is_ignored Logical variable that indicates whether the file is ignored through the data_ignore option in the configuration.

is_directory Logical variable that indicates whether the file is a directory.

is_cached Logical variable that indicates whether the file is already available in the cache/ directory.

cached_only Logical variable that indicates whether the variable is only available in the cache/ directory. This occurs when calling the cache function with a code fragment in a munge script.

reader Character variable containing the name of the reader function that will be used to load the data. Contains a character(0) if no suitable reader was found.

* Note that some readers return more than one variable, usually with the listed variable name as a prefix. This is true, for example, for the xls.reader and xlsx.reader.

Value

A data.frame listing the available data, with relevant meta data

List all files and directories, excluding .. and .

Description

Creates a directory listing of a given path, including hidden files and subdirectories, but excluding the .. and . aliases.

Usage

.list.files.and.dirs(path)

Arguments

path

Character vector indicating the path to the parent folder of which the contents should be listed.

Value

Directory listing of path
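
A listing of this kind, and the emptiness check that .dir.empty builds on it, can be sketched with base R. The function names are hypothetical and this is not necessarily how the package implements them.

```r
# Sketch: list everything in a directory except '.' and '..'.
list_files_and_dirs <- function(path) {
  list.files(path, all.files = TRUE, include.dirs = TRUE, no.. = TRUE)
}

# A directory is empty when that listing has no entries.
dir_empty <- function(path) {
  length(list_files_and_dirs(path)) == 0
}

d <- tempfile("listing-demo")
dir.create(d)
empty_before <- dir_empty(d)
invisible(file.create(file.path(d, ".hidden")))  # hidden file
dir.create(file.path(d, "sub"))                  # subdirectory
entries <- list_files_and_dirs(d)
print(empty_before)   # TRUE
print(sort(entries))  # ".hidden" "sub"
print(dir_empty(d))   # FALSE
```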

Load the data from the cache and data directories

Description

Gets the list of available variables in cache/ and data/ and loads the data in memory. Data from the cache is loaded first; the remaining data is then loaded in alphabetical order.

Usage

.load.data(config, my.project.info)

Arguments

config

Named list containing the project configuration

my.project.info

Named list containing the project information

Value

Returns my.project.info amended with the new information.

Load the helper functions

Description

Sources all helper scripts in lib. If lib/globals.R exists, it is loaded first; all other scripts are then sourced in alphabetical order.

Usage

.load.helpers(config, my.project.info)

Arguments

config

Named list containing the project configuration

my.project.info

Named list containing the project information

Value

Returns my.project.info amended with the new information.
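
The sourcing order described above can be sketched as a small ordering function. The function name and the script names are hypothetical; the sketch only computes the order and does not source anything.

```r
# Sketch: order helper scripts so that globals.R comes first.
helper_order <- function(files) {
  files <- sort(files)
  is_globals <- basename(files) == "globals.R"
  c(files[is_globals], files[!is_globals])
}

ordered <- helper_order(c("plots.R", "globals.R", "cleaning.R"))
print(ordered)  # "globals.R" "cleaning.R" "plots.R"
```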

Load the libraries listed in the configuration into memory

Description

Load the libraries listed in the libraries entry in global.dcf and add the library names to the project.info.

Usage

.load.libraries(config, my.project.info)

Arguments

config

Named list containing the project configuration

my.project.info

Named list containing the project information

Value

Returns my.project.info amended with the new information.

Source all munge scripts

Description

Sources all munge scripts in the munge directory in alphabetical order.

Usage

.munge.data(config, my.project.info)

Arguments

config

Named list containing the project configuration

my.project.info

Named list containing the project information

Value

Returns my.project.info amended with the new information.

Get the current ProjectTemplate version

Description

Reads the installed version of ProjectTemplate from the DESCRIPTION file.

Usage

.package.version()

Value

Version as a character vector.

Match readers to the extensions of the data files

Description

Match readers to the extensions of the data files

Usage

.parse.extensions(data.files, config)

Arguments

data.files

a vector of paths to data files

Value

A list of readers and varnames

Prepare a regular expression for matching files to be ignored

Description

Constructs a single regular expression for matching file names in data that should not be imported. It can detect literal names, globs with wildcards and regular expressions.

Usage

.prepare.data.ignore.regex(ignore_files)

Arguments

ignore_files

A comma separated character vector that lists all patterns to be matched for ignoring

Value

A chained regular expression that matches all patterns in the ignore_files variable.
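
One way to chain literal names, globs and regular expressions into a single pattern is sketched below. The function name is hypothetical, and the convention that regular expressions are written between slashes (for example /^tmp_/) is an assumption, not something stated above. Literal names and globs are both converted with utils::glob2rx. A pattern that itself contains a comma would be split incorrectly by this sketch.

```r
# Sketch: combine ignore patterns into one regular expression.
# Assumption: regular expressions are written between slashes, e.g. /^tmp_/;
# everything else is treated as a literal name or glob.
prepare_ignore_regex <- function(ignore_files) {
  patterns <- trimws(strsplit(ignore_files, ",", fixed = TRUE)[[1]])
  patterns <- patterns[nzchar(patterns)]
  if (length(patterns) == 0) return("^$")  # matches only the empty string
  to_regex <- function(p) {
    if (grepl("^/.*/$", p)) {
      substr(p, 2, nchar(p) - 1)  # strip the enclosing slashes
    } else {
      utils::glob2rx(p)           # literal names and globs
    }
  }
  parts <- vapply(patterns, to_regex, character(1), USE.NAMES = FALSE)
  paste0("(", parts, ")", collapse = "|")
}

rx <- prepare_ignore_regex("Thumbs.db, *.bak, /^tmp_/")
matches <- grepl(rx, c("Thumbs.db", "old.bak", "tmp_1.csv", "sales.csv"))
print(matches)  # TRUE TRUE TRUE FALSE
```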

Make sure a required directory exists before usage

Description

Checks if the requested directory exists, and if not creates the directory. In the latter case a warning is raised.

Usage

.provide.directory(name)

Arguments

name

Character vector containing the name of the required directory.

Value

No value is returned; this function is called for its side effects.
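
The check-then-create behaviour can be sketched as follows. The function name and the wording of the warning are hypothetical.

```r
# Sketch: make sure a directory exists, warning when it has to be created.
provide_directory <- function(name) {
  if (!dir.exists(name)) {
    warning("Creating missing directory: ", name)
    dir.create(name, recursive = TRUE)
  }
  invisible(NULL)
}

target <- file.path(tempfile("provide-demo"), "logs")
warned <- FALSE
withCallingHandlers(
  provide_directory(target),
  warning = function(w) {
    warned <<- TRUE                 # record that a warning was raised
    invokeRestart("muffleWarning")  # continue so the directory is created
  }
)
print(warned)              # TRUE
print(dir.exists(target))  # TRUE
```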

Stop silently

Description

Temporarily disable option(show.error.messages) and stop execution.

Usage

.quietstop()

Value

No value is returned; this function is called for its side effects.
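
A common base R idiom for this behaviour is sketched below; it is an illustration under that assumption rather than the package's confirmed implementation, and the function name is hypothetical.

```r
# Sketch: stop execution without printing an error message.
quiet_stop <- function() {
  old <- options(show.error.messages = FALSE)
  on.exit(options(old))  # restore the previous setting on exit
  stop()
}

before <- getOption("show.error.messages")
result <- tryCatch(quiet_stop(), error = function(e) "stopped")
restored <- identical(getOption("show.error.messages"), before)
print(result)    # "stopped"
print(restored)  # TRUE
```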

Read metadata for a variable in the cache

Description

Read metadata for a variable in the cache

Usage

.read.cache.info(variable)

Arguments

variable

Variable name for which to look up the metadata

Details

The returned object is a list with two fields:

in.cache: Logical indicating whether the requested variable was found in the cache
hash: A data.frame as was created by .create.cache.hash

Value

list with metadata, see Details for more info.

Remove variables to keep from a list of candidates for removal

Description

Remove variables to keep from a list of candidates for removal

Usage

.remove.sticky.vars(names, keep)

Arguments

names

character vector of variable names that are candidate for removal

keep

character vector of variable names that should not be removed

Details

If the sticky_variables option is part of the config variable the config variable itself is added to the list of variables to keep. Also all variables listed in config$sticky_variables in a comma separated list are added to keep.

Value

A character vector containing the variables to remove.
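
The filtering described in Details amounts to a set difference. In the sketch below the function name is hypothetical and the sticky argument stands in for config$sticky_variables; the variable names are illustrative.

```r
# Sketch: drop protected names from a list of removal candidates.
# 'sticky' stands in for config$sticky_variables (a comma separated string).
remove_sticky_vars <- function(names, keep, sticky = NULL) {
  if (!is.null(sticky)) {
    sticky_names <- trimws(strsplit(sticky, ",", fixed = TRUE)[[1]])
    # The config variable itself is protected together with the sticky names.
    keep <- c(keep, "config", sticky_names)
  }
  setdiff(names, keep)
}

candidates <- c("config", "raw", "model", "lookup", "tmp")
to_remove <- remove_sticky_vars(candidates, keep = "model", sticky = "raw, lookup")
print(to_remove)  # "tmp"
```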

Require internal package

Description

Internal method to require a package that is necessary for the internal functioning of ProjectTemplate. Never attaches the package unless configured to do so in global.dcf (which throws a warning).

Usage

.require.package(package.name)

Arguments

package.name

name of the package to load, as a character vector

Value

No value is returned; this function is called for its side effects.

Return an RStudio project file as character vector

Description

Return an RStudio project file as character vector

Usage

.rstudioprojectfile()

Value

Character vector with the contents of an empty RStudio project file

Raise an error if given path is not a valid project

Description

Stops processing if the path is not a ProjectTemplate project; returns the project name if it is a ProjectTemplate directory.

Usage

.stopifnotproject(additional_message = "", path = getwd())

Arguments

additional_message

Optional message to show if the given path is not a valid project

path

Path to check if it is a valid project

Value

Project name if it is a valid Project.

Raise an error if given path is a valid project

Description

Function to stop processing if the path is a Project Template.

Usage

.stopifproject(additional_message = "", path = getwd())

Arguments

additional_message

Optional message to show if the given path is a valid project

path

Path to check if it is a valid project

Value

No value is returned; this function is called for its side effects.

Unload the project variables keeping the data

Description

Removes the config, logger and project.info variables from memory, leaving all data variables in place.

Usage

.unload.project()

Value

No value is returned; this function is called for its side effects.

Compare sets of variable names

Description

Compare the variables (excluding functions) in the global env with a passed in string of names and return the set difference.

Usage

.var.diff.from(given.var.list = "", env = .TargetEnv)

Arguments

given.var.list

Character vector of variable names

env

Environment in which to compare the sets of variables
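
The comparison can be sketched with base R. The function name is hypothetical, the default environment is an assumption (the usage above defaults to .TargetEnv), and the direction of the difference (variables in the environment that are not in the given list) is also an assumption.

```r
# Sketch: names of non-function objects in 'env' that are not in 'given.var.list'.
var_diff_from <- function(given.var.list = "", env = globalenv()) {
  all_names <- ls(envir = env)
  is_function <- vapply(
    all_names,
    function(n) is.function(get(n, envir = env)),
    logical(1)
  )
  setdiff(all_names[!is_function], given.var.list)
}

demo_env <- new.env()
assign("x", 1, envir = demo_env)
assign("y", "a", envir = demo_env)
assign("helper", function() NULL, envir = demo_env)  # functions are excluded
new_vars <- var_diff_from("x", env = demo_env)
print(new_vars)  # "y"
```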

Write a variable and its metadata to cache

Description

Write a variable and its metadata to cache

Usage

.write.cache(cache.hash, ...)

Arguments

cache.hash

a data.frame with metadata about the variable, see details for more information.

...

extra parameters passed to save.

Details

cache.hash is a data frame with two columns: variable and hash.
Row name VAR is the name of the variable to save.
Row name CODE is the hash value of the code to compute variable.
Row name DEPENDS.* are the dependent variables that CODE depends on.
The helper function .create.cache.hash creates a suitable data frame.

Value

No value is returned; this function is called for its side effects.

Add project specific config to the global config

Description

Enables project specific configuration to be added to the global config object. The allowable format is key value pairs which are appended to the end of the config object, which is accessible from the global environment.

Usage

add.config(..., apply.override = FALSE)

Arguments

...

A series of key-value pairs containing the configuration. The key is the name that gets added to the config object. These can be overridden at load time through the ... argument to load.project.

apply.override

A boolean indicating whether overrides should be applied. This can be used to add a setting disregarding arguments to load.project.

Details

Once defined, the value can be accessed from any ProjectTemplate script by referencing config$my_project_var.

Examples

library('ProjectTemplate')
## Not run: 
add.config(
 keep_bigdata=TRUE, # Whether to keep the big data file in memory
 parse=7 # number of fields to parse
)
if (config$keep_bigdata) ...
## End(Not run)

Cache a data set for faster loading.

Description

This function will store a copy of the named data set in the cache directory. This cached copy of the data set will then be given precedence at load time when calling load.project. Cached data sets are stored as .RData or optionally as .qs files.

Usage

cache(variable = NULL, CODE = NULL, depends = NULL, tidyCODE = TRUE, ...)

Arguments

variable

A character string containing the name of the variable to be saved. If the CODE parameter is defined, it is evaluated and saved, otherwise the variable with that name in the global environment is used.

CODE

A sequence of R statements enclosed in {..} which produce the object to be cached. Requires suggested package formatR.

depends

A character vector of other global environment objects that the CODE depends upon. Caching will be forced if those objects have changed since last caching.

tidyCODE

A logical scalar specifying whether the CODE should be tidied with the help of tidy_source. Changes such as whitespace do not alter the meaning of the code and should therefore not invalidate the cache, so tidying is usually desirable. However, if the CODE contains, for example, complex SQL statements, tidying might fail, in which case skipping this step is preferable.

...

Additional arguments passed on to save or optionally to qsave. See project.config for further information.

Details

Usually you will want to cache datasets during munging. This can be the raw data just loaded, or it can be the result of further processing during munge. Either way, it can take a while to cache large variables, so cache will only cache when it needs to. The clear.cache("variable") command can be run to flush individual items from the cache.

Calling cache() with no arguments returns the current status of the cache.

Value

No value is returned; this function is called for its side effects.

Examples

library('ProjectTemplate')
## Not run:
create.project('tmp-project')
setwd('tmp-project')
dataset1 <- 1:5
cache('dataset1')
setwd('..')
unlink('tmp-project', recursive = TRUE)
## End(Not run)

Translate a variable name into a file name for caching.

Description

This function will translate a variable name into a form that is suitable as a filename on most operating systems.

Usage

cache.name(data.filename)

Arguments

data.filename

The variable name to be translated into a filename.

Value

A translated variable name.

Examples

library('ProjectTemplate')
## Not run: cache.name('example.1')

Cache a project's data sets in binary format.

Description

This function will cache all of the data sets that were loaded by the load.project function in a binary format that is easier to load quickly. This is particularly useful for data sets that you've modified during a slow munging process that does not need to be repeated.

Usage

cache.project()

Value

No value is returned; this function is called for its side effects.

Examples

library('ProjectTemplate')
## Not run:
load.project()
cache.project()
## End(Not run)

Translate a file name into a valid R variable name.

Description

This function will translate a file name into a name that is a valid variable name in R. Non-alphabetic characters on the boundaries of the file name will be stripped; non-alphabetic characters inside of the file name will be replaced with dots.

Usage

clean.variable.name(variable.name, config = .load.config())

Arguments

variable.name

A character vector containing a variable's proposed name that should be standardized.

config

A list of configuration variables. Defaults to those loaded by load.project

Value

A translated variable name.

Examples

library('ProjectTemplate')
## Not run: clean.variable.name('example_1')
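
The translation rule in the Description can be approximated with two regular-expression substitutions. This is a sketch with a hypothetical function name, not the package's implementation, and it assumes that digits are kept in the same way as letters.

```r
# Sketch of the documented rule. Assumption: digits are kept like letters.
clean_name <- function(variable.name) {
  # Strip unwanted characters from both ends of the name.
  stripped <- gsub("^[^A-Za-z0-9]+|[^A-Za-z0-9]+$", "", variable.name)
  # Replace each inner run of unwanted characters with a single dot.
  gsub("[^A-Za-z0-9]+", ".", stripped)
}

print(clean_name("example_1"))       # "example.1"
print(clean_name("_my-data file_"))  # "my.data.file"
```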

Clear objects from the global environment

Description

This function removes specific (or all by default) named objects from the global environment. If used within a ProjectTemplate project, then any variables defined in the config$sticky_variables will remain.

Usage

clear(..., keep = c(), force = FALSE)

Arguments

...

A sequence of character strings of the objects to be removed from the global environment. If none given, then all items except those in keep will be deleted. This includes items whose names begin with a dot (.).

keep

A character vector of variables that should remain in the global environment

force

If TRUE, then variables will be deleted even if specified in keep or config$sticky_variables

Value

The variables kept and removed are reported

Examples

library('ProjectTemplate')
## Not run: 
clear("x", "y", "z")
clear(keep="a")
clear()
## End(Not run)

Clear data sets from the cache

Description

This function removes specific (or all by default) named data sets from the cache directory. This will force that data to be read in from the data directory next time load.project is called.

Usage

clear.cache(...)

Arguments

...

A sequence of character strings of the variables to be removed from the cache. If none given, then all items in the cache will be removed.

Value

Success or failure is reported

Examples

library('ProjectTemplate')
## Not run: 
clear.cache("x", "y", "z")
## End(Not run)

Create a new project.

Description

This function will create all of the scaffolding for a new project. It will set up all of the relevant directories and their initial contents. For those who only want the minimal functionality, the template argument can be set to minimal to create a subset of ProjectTemplate's default directories. For those who want to dump all of ProjectTemplate's functionality into a directory for extensive customization, the dump argument can be set to TRUE.

Usage

create.project(
 project.name = "new-project",
 template = "full",
 dump = FALSE,
 merge.strategy = c("require.empty", "allow.non.conflict"),
 rstudio.project = FALSE
)

Arguments

project.name

A character vector containing the name for this new project. Must be a valid directory name for your file system.

template

A character vector containing the name of the template to use for this project. By default a full and minimal template are provided, but custom templates can be created using create.template.

dump

A boolean value indicating whether the entire functionality of ProjectTemplate should be written out to flat files in the current project.

merge.strategy

What should happen if the target directory exists and is not empty? If "require.empty", the target directory must be empty; if "allow.non.conflict", the method succeeds if no files or directories with the same name exist in the target directory.

rstudio.project

A boolean value indicating whether the project should also be an 'RStudio Project'. Defaults to FALSE. If TRUE, then a 'projectname.Rproj' with usable defaults is added to the ProjectTemplate directory.

Details

If the target directory does not exist, it is created. Otherwise, it can only contain files and directories allowed by the merge strategy.
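
The two merge strategies listed under Usage can be sketched as a simple predicate over directory entries. The function name and the entry names are hypothetical; the sketch does not touch the file system.

```r
# Sketch: decide whether a project may be created in a target directory.
# 'existing' and 'template' are the entry names found in the target
# directory and in the template, respectively.
can_create_project <- function(existing, template,
                               merge.strategy = c("require.empty",
                                                  "allow.non.conflict")) {
  merge.strategy <- match.arg(merge.strategy)
  if (merge.strategy == "require.empty") {
    length(existing) == 0
  } else {
    length(intersect(existing, template)) == 0
  }
}

template_entries <- c("config", "data", "munge", "README.md")
strict   <- can_create_project("notes.txt", template_entries)
lenient  <- can_create_project("notes.txt", template_entries, "allow.non.conflict")
conflict <- can_create_project("data", template_entries, "allow.non.conflict")
print(c(strict, lenient, conflict))  # FALSE TRUE FALSE
```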

Value

No value is returned; this function is called for its side effects.

Examples

library('ProjectTemplate')
## Not run: create.project('MyProject')

Create a new template

Description

This function writes a skeleton directory structure for creating your own custom templates.

Usage

create.template(target, source = "minimal")

Arguments

target

Name of the new template. It is created under the directory specified by options('ProjectTemplate.templatedir'), or, when missing, in the current directory.

source

Name of an existing template to copy, defaults to the built in 'minimal' template.

Show information about the current project.

Description

This function will return all of the information that ProjectTemplate has about the current project. This information is gathered when load.project is called. At present, ProjectTemplate keeps a record of the project's configuration settings, all packages that were loaded automatically and all of the data sets that were loaded automatically. The information about autoloaded data sets is used by the cache.project function.

Usage

get.project()

Details

In previous releases this information has been available through the global variable project.info. Using this variable is now deprecated and will result in a warning.

Value

A named list.

Examples

library('ProjectTemplate')
## Not run:
load.project()
get.project()
## End(Not run)

Listing the data for the current project

Description

This function produces a data.frame of all data files in the project, with meta data on if and how the file will be loaded by load.project.

Usage

list.data(...)

Arguments

...

Named arguments to override configuration from config/global.dcf and lib/globals.R.