This is a wrapper for an API. For now all it does is get a certain document or search for relevant documents from a document store -- although I plan to build out more features later. I'm new to PHP and want to both improve my PHP skills and develop "idiomatic" PHP style. It depends on HTTPFUL
, which, for now, is being loaded from code that uses the class.
<?php
class PHPDocumentCloud
{
// property declaration
public $var = 'a default value';
public function get($id){
$url = "http://www.documentcloud.org/api/documents/" . $id . ".json";
$answer = \Httpful\Request::get($url)->send()->raw_body;
$answer = (json_decode($answer)->document);
return $answer;
}
private function get_search_page($query, $page, $perpage){
$url = "http://www.documentcloud.org/api/search.json?q=" . $query . '&page=' . $page . '&per_page=' . $perpage. '&mentions=3';
$answer = \Httpful\Request::get($url)->send();
$answer = (json_decode($answer));
$docs = $answer->documents;
return $docs;
}
public function search($query){
//check for &page=2
$page = 1;
$arr = array();
$document_list = array();
$loop = True;
while ($loop) {
$results = $this->get_search_page($query, $page, 1000);
if (!empty($results)){
$document_list = array_merge($results, $document_list);
$page++;
}else{
$loop = False;
}
}
return $document_list;
}
}
?>
2 Answers 2
Ok, the general critiques first:
- Please, subscribe to the coding standards. It may seem irrelevant to you now. But believe me when I say that coding style is as important as unit tests once you start collaborating on a project.
- Add
use
statements, to see all of the class' dependencies at a glance - Be consistent, both in method/property names as in the naming of local variables.
- Don't use a closing
?>
tag, if a script contains nothing but PHP code. The official docs explain why - Sanitize method arguments, don't simply string them together.
- Use type-hints and inject dependencies whenever you can.
- Doc-blocks save lives. Add them to your code ASAP. Use a decent IDE, and you'll soon find you can't do without these comments.
Now what does this all mean, applied to your code? Quite simply that you have to use that most valuable trait of all: common sense. A quick re-write of your code, applying most of these critiques.
<?php
use Httpful\Request;//add use statement
/**
* @package DocumentCloudAPI
* @author <your name>
*/
class PHPDocumentCloud
{
//immutable part of url is just that: a constant value
const BASE_URL = 'http://www.documentcloud.org/api/';
/**
* @var string
*/
public $var = 'a default value';
/**
* @var Request
*/
protected $requester = null;
/**
* Constructor, allows users to pass child class of
* dependency as parameter, assign to property
* @param Request $requester
*/
public function __construct(Request $requester)
{
$this->requester = $request;
}
/**
* @param int $id
* @return \stdClass
* @throws \InvalidArgumentException
* @throws \RuntimeException
*/
public function getById($id)
{//next line, after more descriptive method name
if (!is_numeric($id) || $id != (int) $id)//check numeric && not float
throw new \InvalidArgumentException(
sprintf(
'%s expects argument to be an int, instead saw %s',
__METHOD__,
gettype($id)
)
);
$url = sprintf(
'%sdocuments/%d.json',
self::BASE_URL,
(int) $id
);//construct request url
$response = $this->requester::get($url)->send();
//no response, or no raw_body property => exception
if (!$response || !property_exists($response, 'raw_body'))
throw new \RuntimeException(
sprintf(
'Request failed: %s',
json_encode($response)//force string
)
);
//check for json errors
$response = json_decode($response->raw_body);
$err = json_last_error();
if ($err === \JSON_ERROR_NONE)//No errors?
return $response->document;
//error, throw exception
throw new \RuntimeException(
sprintf(
'%d - %s',
$err,
json_last_error_msg()
)
);
}
/**
* Accepts assoc array, with keys q, page, per_page and mentions
* @param array $params
* @return array|\stdClass (don't know what type documents is
* @throws \InvalidArgumentException
* @throws \RuntimeExcpetion
*/
private function getSearchPage(array $params)
{
$url = self::BASE_URL.'search.json?';
//TODO: move this to a custom, more generic filterValidKeys method
//along the lines of: filterValid(array $params, array $keys)
//keys array could be assoc, where values are names of validation callbacks
$keys = array('q', 'page', 'per_page', 'mentions');
foreach ($keys as $key)
{
if (isset($params[$key]))
{
if ($key === 'q')
//in real life, check if the string isn't already url-encoded
$params[$key] = urlencode($params[$key]);
else//given that other params are numeric
$params[$key] = (int) $params[$key]
$url .= $key.'='.$params[$key].'&';//add & here
}
}
if (substr($url, -1) === '?')//if last char of url is ?, no params were set
throw new \InvalidArgumentException(
sprintf(
'%s requires at least one of %s keys to be set',
__METHOD__,
implode(',',$keys)
);
$response = $this->requester::get($url)->send();
if (!$response)
throw new \RuntimeException(
sprintf(
'Request to %s failed',
$url
)
);
$response = json_decode($answer);
$err = json_last_error();
if ($err !== \JSON_ERROR_NONE || !property_exists($response, 'documents'))
{
if ($err === \JSON_ERROR_NONE)
$msg = sprintf('%d - %s', $err, json_last_error_msg());
else
$msg = sprintf('documents not found in %s', json_encode($response));
throw new \RuntimeException($msg);
}
if (empty($response->documents))
return null;//return null if we reached the end
return $response->documents;
}
/**
* Public alias for getSearchPage, with loop
* @param string $query
* @return array
*/
public function search($query)
{
$params = array(
'q' => $query,
'page' => 1,
'per_page' => 1000,
'mentions' => 3
);
$list = array();
while ($results = $this->getSearchPage($params))
{
$list = array_merge($list, $results);
++$params['page'];//get next page
}
return $list;
}
}
Disclaimer:
I wrote this off the top of my head, so there might be some syntax errors and oversights. This code is not meant to be copy-pastable at all. It's just an example of how to use doc-blocks, type-hinting, data sanitation, exceptions and all coding standards.
Onwards
Now, at first glance, you may think this code over-complicates things hideously. And truth be told, it does add some more complexity to the mix. What do you get in return?
Again at first glance: a class that throws exceptions whenever it encounters something that isn't quite right. But that's how a class should behave. A class is meant to perform one (and just one) task. If the data that is provided does not fit the criteria, that class's job is not to make do. The code that is calling the class needs to be fixed. An exception represents an emergency.
The calling code should handle the exception, or should be fixed. You can't ignore exceptions.
In the long run, though, this class offers a safer API, it's quite clear-cut. As long as you (the person calling the code) provide the correct data, you know what to expect. You can inject your own implementation of this Httpful\Request
class to work with, and adding simple methods to perform certain calls are very easy to add. You just have to, basically, follow the same pattern:
- Build url, starting with
self::BASE_URL
- use
$this->requester::get($url)->send();
to get the response - check the response, and the result of
json_encode
. Return, or throw exceptions accordingly.
If you don't quite understand all of the functions used here, then don't worry: php.net/<function here> will tell you all you need to know.
Just one more thing:
In going over your code, and rewriting it, I couldn't help myself, but code like this:
$loop = True;
while ($loop) {
$results = $this->get_search_page($query, $page, 1000);
if (!empty($results)){
$document_list = array_merge($results, $document_list);
$page++;
}else{
$loop = False;
}
}
Is just all shades of wrong. A while(true)
loop (in PHP it's true
, as opposed to python's True
) is to be avoided as much as possible.
However, if you want to break an infinite loop, don't do that by setting a variable to another value, do so by using the cleverly named break;
keyword:
while(true)
{
if (empty($foo))
break;//ends loop
$foo = $this->doStuff();
}
But in most cases, ie: fetching rows from a result-set, it's best to perform a loop while you're fetching the data, and write it accordingly:
while($row = $stmt->fetch())
{
//do stuff with $row
}
This works, simply because an assignment is an expression, and like any expression, it can be evaluated to a singular value. PHP assignment expressions evaluate to the new value of the left operand. In other words:
if ($x = false)
{//$x is left operand, being assigned false, expression evaluates to false
echo 'This will never echo';
}
I am not a PHP developer so I can't give you an expect implementation but I do notice some things that will make handling the API easier.
PHPDocumentCloud
does not need to contain 'PHP' since that's a given for any class you create. Unless that is the actual name of the service, name it DocumentCloud
.
I don't know what $var
is and its content is also unclear. By using $
I know it's a variable; try to convey its purpose instead.
Keep this also in mind for your variables inside the search
function: $arr
doesn't tell me what the array is about and I already know that $loop
is used inside the loop.
Each request starts with the same baseurl; factor this out to a private field and use that private field in every method. This will give you an easier screening of what the exact API path being used is.
The correct terminology is request and response, not "answer".
json_decode()
calls? Their website says thesend()
call will return the json decoded response on its own. \$\endgroup\$