Code Review
Performing this task in parallel would help greatly, although PHP is not the best language for it.

I would do so by spawning PHP processes: POST asynchronously to other PHP scripts, passing each one the links it should check. You would need to store whether each link is available in a database; I would use SQLite in this case, since it doesn't need a server to be set up.
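
As a minimal sketch of the shared SQLite store, one table with one row per link would be enough. The table and column names here are my own invention, not anything prescribed above:

```php
<?php
// Open (or create) the shared SQLite database and make sure the results
// table exists. SQLite needs no server; the database is just a file.
function openResultsDb(string $path): PDO
{
    $db = new PDO('sqlite:' . $path);
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    // One row per link: the URL plus a 0/1 flag for whether the string was found.
    $db->exec('CREATE TABLE IF NOT EXISTS results (
        url   TEXT PRIMARY KEY,
        found INTEGER NOT NULL
    )');
    return $db;
}
```

Both the master and the children would open the same database file; passing `':memory:'` instead of a file path is handy for trying it out.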

So a possible setup could look something like:

Master Process: Splits the main link file into n parts, then POSTs each part to a child page. It needs to know when the child processes are finished; one way is a polling loop that checks whether the number of rows in the database equals the number of links. Make sure you put a sleep call in the loop so it doesn't poll too often. Once the children are done and the polling loop exits, you can take the data in the database and convert it to CSV.
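
The master's three steps above could be sketched like this. The worker path, the `links` POST field, and the function names are illustrative assumptions, and the fire-and-forget POST trick (write the request, close without reading the response) is one common way to get asynchronous behaviour out of plain PHP:

```php
<?php
// Split the links into at most $parts groups of roughly equal size.
function splitLinks(array $links, int $parts): array
{
    $size = (int) max(1, ceil(count($links) / max(1, $parts)));
    return array_chunk($links, $size);
}

// Fire-and-forget POST: write the request and close immediately, so the
// child keeps running while the master moves on to the next part.
function postAsync(string $host, string $path, string $body): void
{
    $fp = @fsockopen($host, 80, $errno, $errstr, 5);
    if ($fp === false) {
        return;
    }
    fwrite($fp, "POST $path HTTP/1.1\r\n"
        . "Host: $host\r\n"
        . "Content-Type: application/x-www-form-urlencoded\r\n"
        . 'Content-Length: ' . strlen($body) . "\r\n"
        . "Connection: Close\r\n\r\n"
        . $body);
    fclose($fp);
}

// Poll the row count until every link has a result; the sleep keeps the
// loop from hammering the database.
function waitForResults(PDO $db, int $expected, int $delaySeconds = 2): void
{
    while ((int) $db->query('SELECT COUNT(*) FROM results')->fetchColumn() < $expected) {
        sleep($delaySeconds);
    }
}
```

Usage would look something like: read the link file with `file('links.txt', FILE_IGNORE_NEW_LINES)`, `splitLinks()` it, `postAsync()` each part to the child script, then `waitForResults()` before writing the CSV.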

Child Processes: Each child page receives a set of links via POST, then goes through them one by one, checking whether each page contains the string and recording the result in the database.
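
The child's loop could be sketched as below. The fetcher is injected as a callable so the core logic can be exercised without network access; the `links` POST field and the search string are assumptions for illustration:

```php
<?php
// Check each link for the search string and record a 0/1 result in the
// shared results table. $fetch takes a URL and returns the page body,
// or false on failure.
function checkLinks(PDO $db, array $links, string $needle, callable $fetch): void
{
    $stmt = $db->prepare('INSERT OR REPLACE INTO results (url, found) VALUES (?, ?)');
    foreach ($links as $url) {
        $html = $fetch($url);
        // An unreachable page is simply recorded as "not found".
        $found = is_string($html) && strpos($html, $needle) !== false;
        $stmt->execute([$url, $found ? 1 : 0]);
    }
}
```

In the child script itself, the call might then be `checkLinks($db, array_filter(explode("\n", $_POST['links'] ?? '')), 'string to find', fn ($url) => @file_get_contents($url));`, with `@` suppressing the warning that `file_get_contents` emits for unreachable hosts.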

I would not use this for production code: PHP was not made for this, and there are lots of things that could go wrong. If it were possible, I would do this sort of thing in a language with built-in parallelism, such as Go or Clojure.

Jessie
