-
Notifications
You must be signed in to change notification settings - Fork 266
PHPLIB-80: Add IndexInfo helper methods for special index type options #521
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ea84ff3 to
0892019
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
I have a few comments, questions, and suggestions. For some of the API decisions, we probably need to chat to @jmikola, as I am not fully aware with his decision considerations were, and I don't think I have a PR merged in the PHP library.
cheers,
Derick
src/Model/IndexInfo.php
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The [0] seems incorrect. I had to have a look at how this argument is even used — our server documentation isn't very clear on that. Through https://docs.mongodb.com/manual/core/wiredtiger/index.html#compression and https://docs.mongodb.com/manual/reference/method/db.createCollection/#create-collection-storage-engine-options I found out the syntax to create this:
> db.foo.createIndex( { test: '2dsphere' }, { storageEngine: { 'wiredTiger' : { 'configString': '' } }} );
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
The result of getIndexes() is:
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "test.foo"
},
{
"v" : 2,
"key" : {
"test" : "2dsphere"
},
"name" : "test_2dsphere",
"ns" : "test.foo",
"storageEngine" : {
"wiredTiger" : {
"configString" : ""
}
},
"2dsphereIndexVersion" : 3
}
]
Which implies that the storageEngine info is not a simple array, but again a document. I think you should add a functional test that hits the server for this.
src/Model/IndexInfo.php
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In line 200 (just below), the getVersion() method does a cast to an (integer) for the version number. Perhaps you want to do that here, as well as for get2dsphereIndexVersion too?
src/Model/IndexInfo.php
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self::DEFAULT_WEIGHT is not declared as a document (array) in the class definition, but as a simple number 1. I think it should return null if weights is empty, but happy to hear with @jmikola has to say on this too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The next comment shows how weights looks like, when a text index exists:
src/Model/IndexInfo.php
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you can set the background option to false, this line is not right, as it would return true even if background is set to false, as in:
> db.foo1.createIndex( { test2: 'text' }, { background: false } );
{
"createdCollectionAutomatically" : true,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
> db.foo1.getIndexes();
[
...
{
"v" : 2,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "test2_text",
"ns" : "test.foo1",
"background" : false,
"weights" : {
"test2" : 1
},
"default_language" : "english",
"language_override" : "language",
"textIndexVersion" : 3
}
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the documentation, empty() will return true if its argument is false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Teaching me PHP now huh ;-) You're absolutely right. My mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good that you test for default values in case elements in the IndexInfo struct are missing, but I also think the tests should cover the cases where the elements are present. In other comments I've shown a few possible cases (for simple text, and 2dsphere indexes), and I think it would be good to add tests cases for these as well.
src/Model/IndexInfo.php
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm quite quite sure whether we should return the defaults self::DEFAULT_LANGUAGE_OVERRIDE and self::DEFAULT_LANGUAGE that you have defined, or just null, as all non-text indexes will not have this information, and returning english/language might not be expected here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also questioning that. I don't think there's a good way to check whether the index is text or not so I will just have this and getLanguageOverride() return null if default_language/language_override is not set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM. I leave it up to you to decide whether you want @jmikola to have a look too, as he knows this code a lot better than I do.
src/Model/IndexInfo.php
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Teaching me PHP now huh ;-) You're absolutely right. My mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably add one more test here that actually sets up the weights for different text fields, as is described here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments below about adding is<type>() methods for 2d, 2dsphere, geoHaystack, and text. It's been nearly three years since I first created PHPLIB-80 but I think we may come to regret introducing helpers for all of these specific options. Apologies for not giving this more thought before assigning the ticket.
I'm on the fence about partialFilterExpression (applicable to all types) and collation (applicable only to normal indexes and possibly "2dsphere" if I'm reading the note in the docs correctly). If we do add a getCollation() accessor, I think we may want to consider doing that in a new 1.4.0 ticket and include it for CollectionInfo as well. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any new methods should include .. versionadded:: 1.4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this example differs from the approach in #523 where you created explicit IndexInfo objects. I have no qualms about that, but I'd like to see the example fully executable (sans require_once 'vendor/autoload.php';) if we go with this form. Borrowing from the createIndex() docs:
$collection = (new MongoDB\Client)->selectCollection('test', 'places'); $collection->createIndex(['pos' => '2dsphere']);
I think using "pos" for the field name would improve the context. That's what the server manual uses for geospatial indexes. We can do something similar for full text index examples, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is necessary to skip past the implicit _id index, correct? I imagine this might confuse readers, so I'd suggest this approach instead:
$collection = (new MongoDB\Client)->selectCollection('test', 'places'); $indexName = $collection->createIndex(['pos' => '2dsphere']); foreach ($collection->listIndexes() as $index) { if ($index->getName() === $indexName) { var_dump($index->get2dSphereIndexVersion()); } }
That should still output only a single value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we change this documentation to demonstrate is2dSphere():
$collection = (new MongoDB\Client)->selectCollection('test', 'places'); $collection->createIndex(['pos' => '2dsphere']); foreach ($collection->listIndexes() as $index) { if ($index->is2dSphere()) { printf("%s has 2dsphereIndexVersion: %d\n", $index->getName(), $index['2dsphereIndexVersion']); } }
This should output:
pos_2dsphere has 2dsphereIndexVersion: 3
Side benefit here would be that we're also demonstrating how users can rely on offsetGet() to access type-specific index fields on their own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like https://docs.mongodb.com/manual/reference/command/listIndexes/ no longer contains any information about the documents returned from the server. It only discusses a cursor envelope and points to the shell's createIndex() documentation. I would suggest replacing all such "See Also" links for IndexModel methods with a link to:
:phpmethod:`MongoDB\\Collection::listIndexes()`
Note: this isn't an issue for CollectionInfo and DatabaseInfo:
- https://docs.mongodb.com/manual/reference/command/listCollections/#output
- https://docs.mongodb.com/manual/reference/command/listDatabases/#output
"See Also" links for those model classes can remain as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we change this documentation to demonstrate isGeoHaystack():
$collection = (new MongoDB\Client)->selectCollection('test', 'places'); $collection->createIndex(['pos' => 'geoHaystack', 'x' => 1], ['bucketSize' => 5]); foreach ($collection->listIndexes() as $index) { if ($index->isGeoHaystack()) { printf("%s has bucketSize: %d\n", $index->getName(), $index['bucketSize']); } }
This should output:
pos_geoHaystack_x_1 has bucketSize: 5
src/Model/IndexInfo.php
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the methods introduced in this PR, I think it'd be more worthwhile to add "is" methods for the various index types. That would be both more maintainable and easier to document.
is2d()is2dSphere()isGeoHaystack()isText()
Consider the following shell session:
> db.foo.createIndex({title:"text",description:"text"})
{
"createdCollectionAutomatically" : true,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
> db.foo.createIndex({pos1:"2d"})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 2,
"numIndexesAfter" : 3,
"ok" : 1
}
> db.foo.createIndex({pos2:"2dsphere"})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 3,
"numIndexesAfter" : 4,
"ok" : 1
}
> db.foo.createIndex({pos3:"geoHaystack",x:1},{bucketSize:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 4,
"numIndexesAfter" : 5,
"ok" : 1
}
> db.foo.getIndexSpecs()
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "test.foo"
},
{
"v" : 2,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "title_text_description_text",
"ns" : "test.foo",
"weights" : {
"description" : 1,
"title" : 1
},
"default_language" : "english",
"language_override" : "language",
"textIndexVersion" : 3
},
{
"v" : 2,
"key" : {
"pos1" : "2d"
},
"name" : "pos1_2d",
"ns" : "test.foo"
},
{
"v" : 2,
"key" : {
"pos2" : "2dsphere"
},
"name" : "pos2_2dsphere",
"ns" : "test.foo",
"2dsphereIndexVersion" : 3
},
{
"v" : 2,
"key" : {
"pos3" : "geoHaystack",
"x" : 1
},
"name" : "pos3_geoHaystack_x_1",
"ns" : "test.foo",
"bucketSize" : 1
}
]
>
A robust method to implement is<type>() for each of these four types might be:
return array_search('typeName', $this->getKey(), true) !== false;
This relies on getKey() to cast the "key" field to an array, since it is technically a document and could be a stdClass, and then uses array_search() to check if any field (i.e. key) has the type we're looking for. Roughly equivalent to:
foreach ($this->info['key'] as $type) { if ($type === 'typeName') { return true; } } return false;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, RST emits a warning if the character length of headings and lines are not equal. Three characters should be removed on this and the first line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Newline after <?php for consistency with other examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation is a bit off here. Should be:
foreach ($collection->listIndexes() as $index) {
if ($index->is2dSphere()) {
printf("%s has 2dsphereIndexVersion: %d\n", $index->getName(), $index['2dsphereIndexVersion']);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove "or not" to be consistent with other documentation for "is" methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd propose deleting this method. I'm not sure it's relevant to users once the index has been created, and it's trivial to determine this via offsetGet().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider:
$expectedVersion = version_compare($this->getServerVersion(), '3.2.0', '<') ? 2 : 3;
$this->assertEquals($expectedVersion, $index['2dsphereIndexVersion']);
This ensures that if more assertions are later appended to this test, we limit the chances of a bug missing these for < 3.2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same idea as 2dsphere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as 2dsphere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as 2dsphere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the example specs I shared in #521 (comment), can you create tests based on the testTtlIndex() (and friends) for truthy return values? I think you can then also just add $this->assertFalse() for the new "is" methods to existing tests (see: $this->assertFalse($info->isSparse()); in testTtlIndex().
Since you already created functional tests, I think it's worth keeping those since we didn't have functional tests for creating all of these index types previously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These operations aren't used, so we can remove these use statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not keen on using a base test class from another sibling namespace. That said, I changed this locally to extend MongoDB\Tests\FunctionalTestCase instead and it passed without issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're creating a collection in each test, let's drop it upon setup() and tearDown(). Consider MongoDB\Tests\Collection\FunctionalTestCase for prior art. In particular, we can leave the collection as-is in the event of a failure.
I see no reason we can't use the Collection methods, so $this->collection->drop(); is fine and we can skip the DropCollection operation class.
9b95c0a to
4b54760
Compare
https://jira.mongodb.org/browse/PHPLIB-80