4
\$\begingroup\$

The following code is a Node-based Netlify plugin to generate Content-Security-Policy rules based on inline styles and scripts.

Inputs

  • buildDir - a string for setting the build directory.
  • exclude - an array of file paths to exclude from processing. (defaults [])
  • policies - an object of policies in camel case, for example defaultSrc.
  • setAllPolicies - a boolean as to whether to set policies where none exists. (defaults false)

Code

const fs = require('fs')
const globby = require('globby')
const sha256 = require('js-sha256').sha256
const { JSDOM } = require('jsdom')
module.exports = {
 onPostBuild: async ({ inputs }) => {
 const { buildDir, exclude, policies, setAllPolicies } = inputs
 const mergedPolicies = mergeWithDefaultPolicies(policies)
 const htmlFiles = `${buildDir}/**/**.html`
 const excludeFiles = (exclude || []).map((filePath) => `!${filePath.replace(/^!/, '')}`)
 console.info(`Excluding ${excludeFiles.length} ${excludeFiles.length === 1 ? 'file' : 'files'}`)
 const lookup = [htmlFiles].concat(excludeFiles)
 const paths = await globby(lookup)
 console.info(`Found ${paths.length} HTML ${paths.length === 1 ? 'file' : 'files'}`)
 const file = paths.reduce((final, path) => {
 const file = fs.readFileSync(path, 'utf-8')
 const dom = new JSDOM(file)
 const scripts = generateHashesFromElement(dom, 'script')
 const styles = generateHashesFromElement(dom, 'style')
 const inlineStyles = generateHashesFromStyle(dom, '[style]')
 
 const webPath = path.replace(new RegExp(`^${buildDir}(.*)index\\.html$`), '1ドル')
 const cspString = buildCSPArray(mergedPolicies, setAllPolicies, {
 scriptSrc: scripts,
 styleSrc: [...styles, ...inlineStyles],
 }).join(' ')
 
 final += `${webPath}\n Content-Security-Policy: ${cspString}\n`
 return final
 }, [])
 fs.appendFileSync(`${buildDir}/_headers`, file)
 console.info(`Done. Saved at ${buildDir}/_headers.`)
 },
}
function mergeWithDefaultPolicies (policies) {
 const defaultPolicies = {
 defaultSrc: '',
 childSrc: '',
 connectSrc: '',
 fontSrc: '',
 frameSrc: '',
 imgSrc: '',
 manifestSrc: '',
 mediaSrc: '',
 objectSrc: '',
 prefetchSrc: '',
 scriptSrc: '',
 scriptSrcElem: '',
 scriptSrcAttr: '',
 styleSrc: '',
 styleSrcElem: '',
 styleSrcAttr: '',
 workerSrc: '',
 }
 return {...defaultPolicies, ...policies}
}
function generateHashesFromElement (dom, selector) {
 const hashes = new Set()
 for (const matchedElement of dom.window.document.querySelectorAll(selector)) {
 if (matchedElement.innerHTML.length) {
 const hash = sha256.arrayBuffer(matchedElement.innerHTML)
 const base64hash = Buffer.from(hash).toString('base64')
 hashes.add(`'sha256-${base64hash}'`)
 }
 }
 return Array.from(hashes)
}
function generateHashesFromStyle (dom, selector) {
 const hashes = new Set()
 for (const matchedElement of dom.window.document.querySelectorAll(selector)) {
 if (matchedElement.getAttribute('style').length) {
 const hash = sha256.arrayBuffer(matchedElement.getAttribute('style'))
 const base64hash = Buffer.from(hash).toString('base64')
 hashes.add(`'sha256-${base64hash}'`)
 }
 }
 return Array.from(hashes)
}
function buildCSPArray (allPolicies, setAllPolicies, hashes) {
 const toKebabCase = (string) => string.replace(/([a-z0-9])([A-Z])/g, '1ドル-2ドル').toLowerCase()
 return Object.keys(allPolicies).reduce((csp, key) => {
 if (hashes[key] || allPolicies[key] || setAllPolicies) {
 csp.push(
 hashes[key]
 ? `${toKebabCase(key)} ${hashes[key].join(' ')} ${allPolicies[key]};`
 : `${toKebabCase(key)} ${allPolicies[key]};`
 )
 }
 return csp
 }, [])
}

Queries

  • The main query is the two generateHashesFromElement and generateHashesFromElement functions, which are identical apart from one uses .innerHTML and the other uses .getAttribute('style'). If possible, I'd like to refactor these to be more DRY.
  • Should generateHashesFromElement be a closure? I initially built it as one, but the above issue forced me to split it up.
  • Is this code readable, or should I add comments?
  • Is the code efficient, or could I make it faster? Run times are a thing in Netlify, so I need to optimise this if possible.
asked Sep 22, 2020 at 12:11
\$\endgroup\$

1 Answer 1

3
\$\begingroup\$

The main query is the two generateHashesFromElement and generateHashesFromElement functions, which are identical apart from one uses .innerHTML and the other uses .getAttribute('style'). If possible, I'd like to refactor these to be more DRY.

There are 2 differences: the if condition that gets checked, and the generation of the hash inside the if block. You can make a general function by passing it one callback, one which maps the element to the value you want to check on it:

function generateHashes (dom, selector, getPropertyValue) {
 const hashes = new Set() 
 for (const matchedElement of dom.window.document.querySelectorAll(selector)) { 
 const value = getPropertyValue(matchedElement)
 if (value.length) { 
 const hash = sha256.arrayBuffer(value)
 const base64hash = Buffer.from(hash).toString('base64') 
 hashes.add(`'sha256-${base64hash}'`) 
 } 
 } 
 return Array.from(hashes) 
}

Then use it with

generateHashes(dom, 'script', element => element.innerHTML)
generateHashes(dom, '[style]', element => element.getAttribute('style'))

Is this code readable, or should I add comments?

It looks pretty good. The consoles are useful in that they not only inform the user what's currently going on, but they also show the reader of the code the intent of that section. My only significant question while reading was: what's the function signature of onPostBuild? Maybe Netlify script-writers know at a glance, but I'm not one of them, and it wasn't immediately obvious what the type of the inputs object's values are. You tell us what they are in the question text, but not in the code. (But it's probably fine if you document them elsewhere and have a comment pointing to it)

 const { buildDir, exclude, policies, setAllPolicies } = 

A bit of JSDoc describing parameters and a section of example output might help to make things clearer, for onPostBuild and for buildCSPArray.

Is the code efficient?

Not so much. The main culprit is:

const file = paths.reduce((final, path) => {
 const file = fs.readFileSync(path, 'utf-8')
 // do stuff with file
 return final
}, [])

readFileSync is blocking; the script waits for files to be read one-by-one, and once one is read, it has to finish processing it completely before starting to read the next file. If there are lots of files to be read, this could take a moderate amount of time. Consider processing files in parallel rather than in serial. To do this, use fs.promises and Promise.all to wait for each file to finish, probably something like:

const processFile = path => (file) => {
 const dom = new JSDOM(file)
 const scripts = generateHashes(dom, 'script', element => element.innerHTML)
 const styles = generateHashes(dom, 'style', element => element.innerHTML)
 const inlineStyles = generateHashes(dom, '[style]'element => element.getAttribute('style'))
 const webPath = path.replace(new RegExp(`^${buildDir}(.*)index\\.html$`), '1ドル')
 const cspString = buildCSPArray(mergedPolicies, setAllPolicies, {
 scriptSrc: scripts,
 styleSrc: [...styles, ...inlineStyles],
 }).join(' ')
 return { webPath, cspString };
}
const pathsAndCSPs = await Promise.all(
 paths.map(
 path => fs.promises.readFile(path).then(processFile(path))
 )
);
const combinedCSP = pathsAndCSPs
 .map(
 ({ webPath, cspString }) => `${webPath}\n Content-Security-Policy: ${cspString}`
 )
 .join('\n');

Other nitpicks:

  • const sha256 = require('js-sha256').sha256 simplifies to const { sha256 } = require('js-sha256')
  • Depends on the OS, but make sure that buildDir doesn't have any backslashes. Or, if it does have backslashes, make sure that they get double-escaped, else there will be problems when passed to new RegExp.
  • The code works, but toKebabCase requires an input of something in camelCase, maybe call it camelCaseToKebabCase to be explicit.

buildCSPArray: I don't think reduce is appropriate when the accumulator being passed through each iteration is the same object. Here's a discussion by Chrome developers on the subject. I'd prefer declaring an array outside, then iterating with for..of. Or, even better, given what's going on inside the loop here, you could use filter and map - also, use Object.entries to retrieve both the key and value at once, instead of using Object.keys followed by obj[key] lookup. In addition, since the left and right side of the string being pushed are the same, and because optional chaining exists, you can construct the string in a much more concise manner:

return Object.entries(allPolicies)
 .filter(([key, defaultPolicy]) => hashes[key] || defaultPolicy || setAllPolicies)
 .map(([key, defaultPolicy]) => {
 const policy = `${hashes[key]?.join(' ') || ''} ${defaultPolicy};`;
 return `${camelCaseToKebabCase(key)} ${policy};`
 })
answered Sep 22, 2020 at 14:52
\$\endgroup\$
7
  • \$\begingroup\$ Great review! The function idea for generateHashesFrom* is perfect. That initial reduce with [] is because I was pushing it to an array and then reducing it down (had some queries about whether I needed to parse existing files). What's the policy on accepting answers? I assume wait a bit. \$\endgroup\$ Commented Sep 22, 2020 at 15:12
  • \$\begingroup\$ Oops, I misread the indentation, you're right. (I like 4-space indentation for longer scripts, it makes those sorts of mistakes much harder to make) Feel free to wait a while to see if anyone else has ideas \$\endgroup\$ Commented Sep 22, 2020 at 15:18
  • \$\begingroup\$ Question about your processFile function. It needs to access path (see path.replace(blah)) but I can't see where it gets passed in. Would fs.promises.readFile() also return the path into the promise? \$\endgroup\$ Commented Sep 22, 2020 at 16:04
  • \$\begingroup\$ Mind blank - path is there in that map, so I've made a closure. \$\endgroup\$ Commented Sep 22, 2020 at 16:11
  • \$\begingroup\$ No, you're right, path needs to be passed along. I wanted to put processFile in a separate function to clearly separate it from the Promise.all and .map sections, but it needs an extra variable, so I made processFile a HOF instead so it can see the path \$\endgroup\$ Commented Sep 22, 2020 at 16:16

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.