Overview
I have created a bash script (triggered via GitHub Actions) that does the following:
- Parse a list of YouTube channel IDs and nicknames.
- Fetch their metadata via YouTube's Channel API.
- Build up Markdown tables using this metadata.
- Load a Markdown template, and replace a placeholder with the generated Markdown.
Additional functionality implemented is:
- Display optional arbitrary emoji next to specific channel names (${ARRAY_LINE[2]}).
- Format numbers to be human readable (e.g. 1200 -> 1.2K).
- Log channels being processed.
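As a quick illustration of the number formatting, here is a minimal sketch. It assumes GNU coreutils' numfmt is installed, and the helper name format_count is hypothetical (it is not part of the script, which pipes the jq output into numfmt directly).

```shell
#!/bin/bash
# Hypothetical helper: format a raw count with SI suffixes,
# the same way the script does via `numfmt --to=si`.
format_count() {
    numfmt --to=si "$1"
}

format_count 1200    # prints 1.2K, as in the "1200 -> 1.2K" example
```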
Whilst I've run the code through ShellCheck and made other improvements, I suspect there are weaknesses around:
- Parsing output.json 5x, fetching a different field each time.
- Replacing the placeholder text.
Code
The script itself (youtube-update.sh):
#!/bin/bash
HEADER_PREFIX="#### "
PLACEHOLDER_TEXT="dynamic-channel-data"
OUTPUT=""
# Convert list of channels into Markdown tables
while read -r LINE; do
    if [[ ${LINE} == ${HEADER_PREFIX}* ]]; then
        echo "Adding header ${LINE}"
        OUTPUT="${OUTPUT}\n${LINE}\n\n"
        OUTPUT="${OUTPUT}| Channel | # Videos | Subscribers | Views |\n| --- | --- | --- | --- |\n"
    else
        IFS=';' read -r -a ARRAY_LINE <<< "${LINE}" # Split line by semi-colon
        echo "Adding channel ${ARRAY_LINE[1]} (${ARRAY_LINE[0]})"
        curl "https://youtube.googleapis.com/youtube/v3/channels?part=statistics,snippet&id=${ARRAY_LINE[0]}&key=${API_KEY}" \
            --header 'Accept: application/json' \
            -fsSL -o output.json
        # Pull channel data out of response if possible
        if [[ $(jq -r '.pageInfo.totalResults' output.json) == 1 ]]; then
            TITLE=$(jq -r '.items[0].snippet.title' output.json)
            URL=$(jq -r '.items[0].snippet.customUrl' output.json)
            VIDEO_COUNT=$(jq -r '.items[0].statistics.videoCount' output.json | numfmt --to=si)
            SUBSCRIBER_COUNT=$(jq -r '.items[0].statistics.subscriberCount' output.json | numfmt --to=si)
            VIEW_COUNT=$(jq -r '.items[0].statistics.viewCount' output.json | numfmt --to=si)
            echo "Added ${TITLE}: ${VIDEO_COUNT} videos (${VIEW_COUNT} views)"
            OUTPUT="${OUTPUT}| ${ARRAY_LINE[2]}[${TITLE}](https://youtube.com/${URL}) | ${VIDEO_COUNT} | ${SUBSCRIBER_COUNT} | ${VIEW_COUNT} |\n"
        else
            echo "Failed! Bad response received: $(<output.json)"
            exit 1
        fi
    fi
done < "${WORKSPACE}/automation/channels.txt"
# Replace placeholder in template with output, updating the README
TEMPLATE_CONTENTS=$(<"${WORKSPACE}/automation/template.md")
echo -e "${TEMPLATE_CONTENTS//${PLACEHOLDER_TEXT}/${OUTPUT}}" > "${WORKSPACE}/README.md"
# Debug
cat "${WORKSPACE}/README.md"
For additional context, this script is triggered via a GitHub Actions workflow (metadata-update.yml):
name: Update YouTube stats
on:
  schedule:
    - cron: '0 8 * * *'
  workflow_dispatch:
jobs:
  metadata-update:
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - name: Checkout channel config file
        uses: actions/[email protected]
        with:
          sparse-checkout: |
            automation/*
            README.md
          sparse-checkout-cone-mode: false
      - name: Update YouTube data
        run: |
          chmod +x ./automation/youtube-update.sh
          ./automation/youtube-update.sh
        env:
          API_KEY: ${{ secrets.API_KEY }}
          WORKSPACE: ${{ github.workspace }}
      - name: Save changes
        uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: Updated YouTube statistics
          commit_author: GitHub Actions <[email protected]>
          file_pattern: 'README.md'
The YouTube API response (truncated to relevant fields) looks like:
{
  "pageInfo": {
    "totalResults": 1
  },
  "items": [
    {
      "snippet": {
        "title": "Google for Developers",
        "customUrl": "@googledevelopers"
      },
      "statistics": {
        "viewCount": "234466180",
        "subscriberCount": "2300000",
        "videoCount": "5807"
      }
    }
  ]
}
Examples
A typical run might convert:
#### Stream Archives
UC2oWuUSd3t3t5O3Vxp4lgAA;2018-streams;🐶
UC4ik7iSQI1DZVqL18t-Tffw;2016-2018streams
UCjyrSUk-1AGjALTcWneRaeA;2016-2017streams
into:
#### Stream Archives
| Channel | # Videos | Subscribers | Views |
| --- | --- | --- | --- |
| 🐶[Jerma Stream Archive](https://youtube.com/@jermastreamarchive) | 770 | 274K | 88M |
| [Ster/Jerma Stream Archive](https://youtube.com/@sterjermastreamarchive) | 972 | 47K | 20M |
| [starkiller201096x](https://youtube.com/@starkiller201096x) | 79 | 2.9K | 1.5M |
1 Answer
Nice script!
Read multiple values by name rather than into an array
Instead of:
IFS=';' read -r -a ARRAY_LINE <<< "${LINE}" # Split line by semi-colon
You could get the values directly into variables with descriptive names:
IFS=';' read -r channel_id channel_name emoji <<< "${line}"
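To show how this behaves on the sample input from the question, here is a minimal sketch. The function name parse_channel_line is hypothetical, and the input is fed from a here-document instead of channels.txt so the snippet runs on its own. Note that when a line has no third field, emoji simply ends up empty.

```shell
#!/bin/bash
header_prefix="#### "

# Hypothetical helper: split one "id;nickname;emoji" line into named variables.
# The IFS assignment only applies to this single read command.
parse_channel_line() {
    IFS=';' read -r channel_id channel_name emoji <<< "$1"
}

while IFS= read -r line; do
    if [[ ${line} == "${header_prefix}"* ]]; then
        echo "header: ${line}"
    else
        parse_channel_line "${line}"
        echo "id=${channel_id} name=${channel_name} emoji=${emoji:-<none>}"
    fi
done <<'EOF'
#### Stream Archives
UC2oWuUSd3t3t5O3Vxp4lgAA;2018-streams;🐶
UC4ik7iSQI1DZVqL18t-Tffw;2016-2018streams
EOF
```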
Read multiple values from a single jq call
You could read multiple values with a single call by making jq print all the relevant fields, and using read, for example:
{
    read -r title
    read -r url
} < <(jq -r '.items[0].snippet.title, .items[0].snippet.customUrl' < output.json)
To avoid the line becoming too long, I would put the fields into an array like this:
jq_fields=(
    '.items[0].snippet.title'
    '.items[0].snippet.customUrl'
    '.items[0].statistics.videoCount'
    '.items[0].statistics.subscriberCount'
    '.items[0].statistics.viewCount'
)
{
    read -r title
    read -r url
    read -r video_count
    read -r subscriber_count
    read -r view_count
} < <(IFS=','; jq -r "${jq_fields[*]}" < output.json)
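If you want to try this out without calling the API, here is a self-contained sketch that runs the same idea against the truncated response from the question. The function name parse_response and the temporary file are assumptions for illustration only. It relies on jq printing one value per line, so a field that is missing from the response would be read as the literal text null.

```shell
#!/bin/bash
# Sample response, copied from the truncated example in the question.
response_file="$(mktemp)"
cat > "${response_file}" <<'JSON'
{
  "pageInfo": { "totalResults": 1 },
  "items": [
    {
      "snippet": { "title": "Google for Developers", "customUrl": "@googledevelopers" },
      "statistics": { "viewCount": "234466180", "subscriberCount": "2300000", "videoCount": "5807" }
    }
  ]
}
JSON

jq_fields=(
    '.items[0].snippet.title'
    '.items[0].snippet.customUrl'
    '.items[0].statistics.videoCount'
    '.items[0].statistics.subscriberCount'
    '.items[0].statistics.viewCount'
)

# Hypothetical helper: one jq call, one value per output line, read in order.
parse_response() {
    local filter
    filter="$(IFS=','; echo "${jq_fields[*]}")"   # join the filters with commas
    {
        read -r title
        read -r url
        read -r video_count
        read -r subscriber_count
        read -r view_count
    } < <(jq -r "${filter}" "$1")
}

if command -v jq >/dev/null; then
    parse_response "${response_file}"
    echo "${title} (${url}): ${video_count} videos, ${subscriber_count} subscribers, ${view_count} views"
fi
rm -f "${response_file}"
```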
Accumulate lines in an array
The way you accumulated the lines in the string value OUTPUT is OK.
I prefer to use arrays in situations like this; it would look something like this:
output=()
# ...
output+=("${line}")
output+=("")
# One array element per line: plain echo (without -e) would print "\n" literally
output+=("| Channel | # Videos | Subscribers | Views |")
output+=("| --- | --- | --- | --- |")
# ...
output+=("| ${emoji}[${title}](https://youtube.com/${url}) | ${video_count} | ${subscriber_count} | ${view_count} |")
# ...
(
    IFS=$'\n'
    echo "${template_content//${placeholder_text}/${output[*]}}"
) >"${WORKSPACE}/README.md"
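For a version you can run in isolation, here is a minimal sketch of the array approach. The template text and the table row are made-up sample data, and joining with printf -v is my own variation on the IFS subshell (an assumption, not something the original script does). The result is printed to standard output rather than written to README.md.

```shell
#!/bin/bash
placeholder_text="dynamic-channel-data"
# Made-up template, standing in for automation/template.md.
template_content=$'# Channels\n\ndynamic-channel-data\n\nFooter'

output=()
output+=("#### Stream Archives")
output+=("")
output+=("| Channel | # Videos | Subscribers | Views |")
output+=("| --- | --- | --- | --- |")
output+=("| [Example](https://youtube.com/@example) | 79 | 2.9K | 1.5M |")

# Join the elements with newlines, then drop the trailing newline printf adds.
printf -v table '%s\n' "${output[@]}"
table="${table%$'\n'}"

rendered="${template_content//${placeholder_text}/${table}}"
printf '%s\n' "${rendered}"
```

Because no element contains a backslash escape, nothing here depends on echo -e, so backslashes in the template or in channel titles pass through untouched.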
Do not use ALL_CAPS names for your variables
To avoid conflict and confusion with system environment variables, it's recommended to not use ALL_CAPS names in script variables.
By making the script's own variables lowercase, it becomes clear what is expected to be present in the environment and what belongs to the script, which improves readability.
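One way to make that split visible is to check the expected environment variables up front. This is only a sketch: the helper name require_env is hypothetical, and failing early on a missing variable is my assumption about what you would want.

```shell
#!/bin/bash
# Hypothetical helper: report the first named environment variable
# that is unset or empty, and return non-zero in that case.
require_env() {
    local name
    for name in "$@"; do
        if [[ -z "${!name:-}" ]]; then
            echo "Missing required environment variable: ${name}" >&2
            return 1
        fi
    done
}

# Upper-case names come from the environment (the workflow's env: block);
# lower-case names belong to the script.
if require_env API_KEY WORKSPACE; then
    channels_file="${WORKSPACE}/automation/channels.txt"
    echo "Reading channels from ${channels_file}"
fi
```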
Put repeatedly used constant values into variables
output.json is referenced in multiple places.
To leave open the option to use a different name or a different path, I would put this into a variable.
This way code editors could also help you avoid typos.
Define important constants in variables early in the file
The paths "${WORKSPACE}/automation/channels.txt" and "${WORKSPACE}/README.md" are key pieces in the behavior of the script.
To make them easy to see (and adjust), I would put these values in variables, defined near the top of the file, right alongside header_prefix and placeholder_text.
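The top of the script could then look something like the sketch below. The variable names channels_file, template_file, readme_file and response_file are my suggestions, and the "." fallback for WORKSPACE is an assumption added only so the snippet runs outside GitHub Actions.

```shell
#!/bin/bash
# WORKSPACE is provided by the workflow; "." is a fallback for local runs
# (an assumption for this sketch, not part of the original script).
workspace="${WORKSPACE:-.}"

header_prefix="#### "
placeholder_text="dynamic-channel-data"
channels_file="${workspace}/automation/channels.txt"
template_file="${workspace}/automation/template.md"
readme_file="${workspace}/README.md"
response_file="output.json"

printf 'channels: %s\ntemplate: %s\nreadme:   %s\nresponse: %s\n' \
    "${channels_file}" "${template_file}" "${readme_file}" "${response_file}"
```

The rest of the script would then refer only to these names, so changing a path means editing one line.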
- This is absolutely excellent feedback, thank you! Really appreciate the effort, will work on making the suggested changes, having the explanation for each point helps a lot too. – Jake Lee, Sep 8, 2023 at 21:57