A Python script to scrape opensourcealternative.to and search the data locally
Python 100%
Last commit: 2024-01-24 12:35:53 -05:00
components: move non-entrypoint files to a submodule (2023-11-07 14:12:57 -05:00)
.gitignore: ignore index file (2023-08-05 13:49:17 -04:00)
cli.py: add default path for index file (2023-11-07 14:13:22 -05:00)
LICENSE.md: GPL (2023-08-05 13:48:59 -04:00)
Pipfile: create an index of all the open source alternatives (2023-08-05 11:57:48 -04:00)
Pipfile.lock: create an index of all the open source alternatives (2023-08-05 11:57:48 -04:00)
README.md: fix README typo in filename (2023-11-07 14:13:10 -05:00)
scrape.py: add some prints during the process to inform the user (2024-01-24 12:35:53 -05:00)

This is a scraper that may eventually be used to combine data from many different "open source alternatives to proprietary software" lists into one open dataset.

Currently supports pulling in data from:

  • opensourcealternative.to

Running

Both scripts use argparse and are documented through their --help flags.

Running the scraper

pipenv install
pipenv run python3 ./scrape.py

This might take some time to run: fairly substantial delays are added when downloading individual project data, so as not to annoy the webmasters of the services being scraped.
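The delay between downloads could look roughly like the sketch below. It is not the code in scrape.py: the function names `fetch_url` and `fetch_politely` and the 5 second default are assumptions made for illustration. The `fetch` and `sleep` parameters are injectable so the pacing logic can be exercised without touching the network.

```python
import time
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def fetch_url(url: str) -> str:
    """Download one page and return its body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_politely(
    urls: Iterable[str],
    delay_seconds: float = 5.0,  # hypothetical value, not taken from scrape.py
    fetch: Callable[[str], str] = fetch_url,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Fetch each URL in turn, pausing between consecutive requests."""
    pages: Dict[str, str] = {}
    for position, url in enumerate(urls):
        if position > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        pages[url] = fetch(url)
    return pages
```

With N project pages the total wait is (N - 1) times the delay, which is why a full scrape can take a while.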

Running the search tool

Requires the index to be generated by the scraper step above first.

pipenv install
pipenv run python3 ./cli.py --index <path to index.json> --search "<search term>"
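For orientation, a search tool with these two flags might be structured as in the sketch below. This is an illustration, not the contents of cli.py: the index layout (a JSON list of objects), the default path `index.json`, and the helper names `search_index`, `build_parser` and `main` are all assumptions.

```python
import argparse
import json
from pathlib import Path
from typing import Any, Dict, List, Optional


def search_index(entries: List[Dict[str, Any]], term: str) -> List[Dict[str, Any]]:
    """Return entries where any field contains the term, ignoring case."""
    needle = term.casefold()
    return [
        entry
        for entry in entries
        if any(needle in str(value).casefold() for value in entry.values())
    ]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Search a local index file.")
    # Assumed default path; the real default in cli.py may differ.
    parser.add_argument("--index", type=Path, default=Path("index.json"))
    parser.add_argument("--search", required=True, help="term to look for")
    return parser


def main(argv: Optional[List[str]] = None) -> List[Dict[str, Any]]:
    args = build_parser().parse_args(argv)
    # Assumes the index is a JSON list of objects (hypothetical layout).
    entries = json.loads(args.index.read_text(encoding="utf-8"))
    matches = search_index(entries, args.search)
    for match in matches:
        print(json.dumps(match, ensure_ascii=False))
    return matches
```

Calling `main()` at the bottom of a script would wire this to the command line. Matching on every field keeps the sketch independent of whatever keys the real index uses.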