The following code is a Node-based Netlify plugin to generate Content-Security-Policy
rules based on inline styles and scripts.
Inputs
buildDir
- a string for setting the build directory.exclude
- an array of file paths to exclude from processing. (defaults[]
)policies
- an object of policies in camel case, for exampledefaultSrc
.setAllPolicies
- a boolean as to whether to set policies where none exists. (defaultsfalse
)
Code
const fs = require('fs')
const globby = require('globby')
const sha256 = require('js-sha256').sha256
const { JSDOM } = require('jsdom')
module.exports = {
onPostBuild: async ({ inputs }) => {
const { buildDir, exclude, policies, setAllPolicies } = inputs
const mergedPolicies = mergeWithDefaultPolicies(policies)
const htmlFiles = `${buildDir}/**/**.html`
const excludeFiles = (exclude || []).map((filePath) => `!${filePath.replace(/^!/, '')}`)
console.info(`Excluding ${excludeFiles.length} ${excludeFiles.length === 1 ? 'file' : 'files'}`)
const lookup = [htmlFiles].concat(excludeFiles)
const paths = await globby(lookup)
console.info(`Found ${paths.length} HTML ${paths.length === 1 ? 'file' : 'files'}`)
const file = paths.reduce((final, path) => {
const file = fs.readFileSync(path, 'utf-8')
const dom = new JSDOM(file)
const scripts = generateHashesFromElement(dom, 'script')
const styles = generateHashesFromElement(dom, 'style')
const inlineStyles = generateHashesFromStyle(dom, '[style]')
const webPath = path.replace(new RegExp(`^${buildDir}(.*)index\\.html$`), '1ドル')
const cspString = buildCSPArray(mergedPolicies, setAllPolicies, {
scriptSrc: scripts,
styleSrc: [...styles, ...inlineStyles],
}).join(' ')
final += `${webPath}\n Content-Security-Policy: ${cspString}\n`
return final
}, [])
fs.appendFileSync(`${buildDir}/_headers`, file)
console.info(`Done. Saved at ${buildDir}/_headers.`)
},
}
function mergeWithDefaultPolicies (policies) {
const defaultPolicies = {
defaultSrc: '',
childSrc: '',
connectSrc: '',
fontSrc: '',
frameSrc: '',
imgSrc: '',
manifestSrc: '',
mediaSrc: '',
objectSrc: '',
prefetchSrc: '',
scriptSrc: '',
scriptSrcElem: '',
scriptSrcAttr: '',
styleSrc: '',
styleSrcElem: '',
styleSrcAttr: '',
workerSrc: '',
}
return {...defaultPolicies, ...policies}
}
function generateHashesFromElement (dom, selector) {
const hashes = new Set()
for (const matchedElement of dom.window.document.querySelectorAll(selector)) {
if (matchedElement.innerHTML.length) {
const hash = sha256.arrayBuffer(matchedElement.innerHTML)
const base64hash = Buffer.from(hash).toString('base64')
hashes.add(`'sha256-${base64hash}'`)
}
}
return Array.from(hashes)
}
function generateHashesFromStyle (dom, selector) {
const hashes = new Set()
for (const matchedElement of dom.window.document.querySelectorAll(selector)) {
if (matchedElement.getAttribute('style').length) {
const hash = sha256.arrayBuffer(matchedElement.getAttribute('style'))
const base64hash = Buffer.from(hash).toString('base64')
hashes.add(`'sha256-${base64hash}'`)
}
}
return Array.from(hashes)
}
function buildCSPArray (allPolicies, setAllPolicies, hashes) {
const toKebabCase = (string) => string.replace(/([a-z0-9])([A-Z])/g, '1ドル-2ドル').toLowerCase()
return Object.keys(allPolicies).reduce((csp, key) => {
if (hashes[key] || allPolicies[key] || setAllPolicies) {
csp.push(
hashes[key]
? `${toKebabCase(key)} ${hashes[key].join(' ')} ${allPolicies[key]};`
: `${toKebabCase(key)} ${allPolicies[key]};`
)
}
return csp
}, [])
}
Queries
- The main query is the two
generateHashesFromElement
andgenerateHashesFromElement
functions, which are identical apart from one uses.innerHTML
and the other uses.getAttribute('style')
. If possible, I'd like to refactor these to be more DRY. - Should
generateHashesFromElement
be a closure? I initially built it as one, but the above issue forced me to split it up. - Is this code readable, or should I add comments?
- Is the code efficient, or could I make it faster? Run times are a thing in Netlify, so I need to optimise this if possible.
1 Answer 1
The main query is the two
generateHashesFromElement
andgenerateHashesFromElement
functions, which are identical apart from one uses.innerHTML
and the other uses.getAttribute('style')
. If possible, I'd like to refactor these to be more DRY.
There are 2 differences: the if
condition that gets checked, and the generation of the hash inside the if
block. You can make a general function by passing it one callback, one which maps the element to the value you want to check on it:
function generateHashes (dom, selector, getPropertyValue) {
const hashes = new Set()
for (const matchedElement of dom.window.document.querySelectorAll(selector)) {
const value = getPropertyValue(matchedElement)
if (value.length) {
const hash = sha256.arrayBuffer(value)
const base64hash = Buffer.from(hash).toString('base64')
hashes.add(`'sha256-${base64hash}'`)
}
}
return Array.from(hashes)
}
Then use it with
generateHashes(dom, 'script', element => element.innerHTML)
generateHashes(dom, '[style]', element => element.getAttribute('style'))
Is this code readable, or should I add comments?
It looks pretty good. The console
s are useful in that they not only inform the user what's currently going on, but they also show the reader of the code the intent of that section. My only significant question while reading was: what's the function signature of onPostBuild
? Maybe Netlify script-writers know at a glance, but I'm not one of them, and it wasn't immediately obvious what the type of the inputs
object's values are. You tell us what they are in the question text, but not in the code. (But it's probably fine if you document them elsewhere and have a comment pointing to it)
const { buildDir, exclude, policies, setAllPolicies } =
A bit of JSDoc describing parameters and a section of example output might help to make things clearer, for onPostBuild
and for buildCSPArray
.
Is the code efficient?
Not so much. The main culprit is:
const file = paths.reduce((final, path) => {
const file = fs.readFileSync(path, 'utf-8')
// do stuff with file
return final
}, [])
readFileSync
is blocking; the script waits for files to be read one-by-one, and once one is read, it has to finish processing it completely before starting to read the next file. If there are lots of files to be read, this could take a moderate amount of time. Consider processing files in parallel rather than in serial. To do this, use fs.promises
and Promise.all
to wait for each file to finish, probably something like:
const processFile = path => (file) => {
const dom = new JSDOM(file)
const scripts = generateHashes(dom, 'script', element => element.innerHTML)
const styles = generateHashes(dom, 'style', element => element.innerHTML)
const inlineStyles = generateHashes(dom, '[style]'element => element.getAttribute('style'))
const webPath = path.replace(new RegExp(`^${buildDir}(.*)index\\.html$`), '1ドル')
const cspString = buildCSPArray(mergedPolicies, setAllPolicies, {
scriptSrc: scripts,
styleSrc: [...styles, ...inlineStyles],
}).join(' ')
return { webPath, cspString };
}
const pathsAndCSPs = await Promise.all(
paths.map(
path => fs.promises.readFile(path).then(processFile(path))
)
);
const combinedCSP = pathsAndCSPs
.map(
({ webPath, cspString }) => `${webPath}\n Content-Security-Policy: ${cspString}`
)
.join('\n');
Other nitpicks:
const sha256 = require('js-sha256').sha256
simplifies toconst { sha256 } = require('js-sha256')
- Depends on the OS, but make sure that
buildDir
doesn't have any backslashes. Or, if it does have backslashes, make sure that they get double-escaped, else there will be problems when passed tonew RegExp
. - The code works, but
toKebabCase
requires an input of something incamelCase
, maybe call itcamelCaseToKebabCase
to be explicit.
buildCSPArray: I don't think reduce
is appropriate when the accumulator being passed through each iteration is the same object. Here's a discussion by Chrome developers on the subject. I'd prefer declaring an array outside, then iterating with for..of
. Or, even better, given what's going on inside the loop here, you could use filter
and map
- also, use Object.entries
to retrieve both the key and value at once, instead of using Object.keys
followed by obj[key]
lookup. In addition, since the left and right side of the string being pushed are the same, and because optional chaining exists, you can construct the string in a much more concise manner:
return Object.entries(allPolicies)
.filter(([key, defaultPolicy]) => hashes[key] || defaultPolicy || setAllPolicies)
.map(([key, defaultPolicy]) => {
const policy = `${hashes[key]?.join(' ') || ''} ${defaultPolicy};`;
return `${camelCaseToKebabCase(key)} ${policy};`
})
-
\$\begingroup\$ Great review! The function idea for
generateHashesFrom*
is perfect. That initial reduce with[]
is because I was pushing it to an array and then reducing it down (had some queries about whether I needed to parse existing files). What's the policy on accepting answers? I assume wait a bit. \$\endgroup\$marcellothearcane– marcellothearcane2020年09月22日 15:12:19 +00:00Commented Sep 22, 2020 at 15:12 -
\$\begingroup\$ Oops, I misread the indentation, you're right. (I like 4-space indentation for longer scripts, it makes those sorts of mistakes much harder to make) Feel free to wait a while to see if anyone else has ideas \$\endgroup\$CertainPerformance– CertainPerformance2020年09月22日 15:18:44 +00:00Commented Sep 22, 2020 at 15:18
-
\$\begingroup\$ Question about your
processFile
function. It needs to accesspath
(seepath.replace(blah)
) but I can't see where it gets passed in. Wouldfs.promises.readFile()
also return the path into the promise? \$\endgroup\$marcellothearcane– marcellothearcane2020年09月22日 16:04:49 +00:00Commented Sep 22, 2020 at 16:04 -
\$\begingroup\$ Mind blank -
path
is there in thatmap
, so I've made a closure. \$\endgroup\$marcellothearcane– marcellothearcane2020年09月22日 16:11:07 +00:00Commented Sep 22, 2020 at 16:11 -
\$\begingroup\$ No, you're right,
path
needs to be passed along. I wanted to putprocessFile
in a separate function to clearly separate it from thePromise.all
and.map
sections, but it needs an extra variable, so I madeprocessFile
a HOF instead so it can see thepath
\$\endgroup\$CertainPerformance– CertainPerformance2020年09月22日 16:16:02 +00:00Commented Sep 22, 2020 at 16:16