-
Notifications
You must be signed in to change notification settings - Fork 49
github actions: run integration test on cassandra #339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4a5dea2
to
7aeeef4
Compare
we need to skip the serverless tests, cause they are scylla specific
7aeeef4
to
ac140e1
Compare
@mykaul
mykaul
Jul 4, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to enable sasi_indexes_enabled ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a clue, it's just same as the driver test was doing before for configuring cassandra, but in a new syntax.
here's the failing tests:
2024年07月04日T11:52:49.6219047Z FAILED tests/integration/standard/test_custom_protocol_handler.py::CustomProtocolHandlerTest::test_protocol_divergence_v5_fail_by_continuous_paging - Failed: Unexpected success
2024年07月04日T11:52:49.6219804Z FAILED tests/integration/standard/test_custom_protocol_handler.py::CustomProtocolHandlerTest::test_protocol_divergence_v5_fail_by_flag_uses_int - Failed: Unexpected success
2024年07月04日T11:52:49.6220556Z FAILED tests/integration/standard/test_custom_protocol_handler.py::CustomProtocolHandlerTest::test_protocol_v5_uses_flag_int - Failed: Unexpected success
2024年07月04日T11:52:49.6221541Z FAILED tests/integration/standard/test_ip_change.py::TestIpAddressChange::test_change_address_during_live_session - ccmlib.node.TimeoutError: 04 Jul 2024 11:34:19 [node3] Missing: ['Starting listening for CQL clients']:
2024年07月04日T11:52:49.6221750Z INFO [main] 2024年07月04日 11:24:20,202 YamlConfigura.....
2024年07月04日T11:52:49.6221851Z See system.log for remainder
2024年07月04日T11:52:49.6222250Z FAILED tests/integration/standard/test_metadata.py::TestCodeCoverage::test_case_sensitivity
2024年07月04日T11:52:49.6222548Z FAILED tests/integration/standard/test_metadata.py::TestCodeCoverage::test_replicas
2024年07月04日T11:52:49.6223767Z FAILED tests/integration/standard/test_rate_limit_exceeded.py::TestRateLimitExceededException::test_rate_limit_exceeded - cassandra.protocol.SyntaxException: <Error from server: code=2000 [Syntax error in CQL query] message="Unknown property 'per_partition_rate_limit'">
2024年07月04日T11:52:49.6226136Z FAILED tests/integration/standard/test_types.py::TypeTests::test_insert_collection_with_null_fails - cassandra.cluster.NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 127.0.0.2:9042 dc1>: <Error from server: code=0000 [Server error] message="java.lang.NullPointerException">, <Host: 127.0.0.3:9042 dc1>: <Error from server: code=0000 [Server error] message="java.lang.NullPointerException">, <Host: 127.0.0.1:9042 dc1>: <Error from server: code=0000 [Server error] message="java.lang.NullPointerException">})
- test_rate_limit_exceeded : should be scylla only
- test_custom_protocol_handler: seems like they are now passing (guessing we are missing updates from upstream about that)
- test_ip_change: using scylla specific command line arguments
- test_metadata.py::TestCodeCoverage::test_replicas :test_case_sensitivity : not clear what's the failure maybe xfail ?
fe5d19e
to
83dcd61
Compare
6343573
to
c68f40c
Compare
last failures on cassandra runs:
FAILED tests/integration/standard/test_cluster.py::ClusterTests::test_refresh_schema_no_wait - cassandra.DriverException: Schema metadata was not refreshed. See log for details.
FAILED tests/integration/standard/test_metadata.py::MetadataTimeoutTest::test_timeout - AssertionError: query `SELECT * FROM system.peers_v2` does not contain `USING TIMEOUT 2000ms`
99a37c9
to
ab2b233
Compare
asyncio:
2024年11月13日T17:20:52.1674462Z FAILED tests/integration/standard/test_cluster.py::TestAddressTranslation::test_address_translator_basic - cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers', {'127.0.0.1:9042': ConnectionShutdown('CRC mismatch on header 8cd102. Received 8556", computed cc0464.')})
2024年11月13日T17:20:13.9750646Z FAILED tests/integration/standard/test_cluster.py::ClusterTests::test_refresh_schema - cassandra.DriverException: Schema metadata was not refreshed. See log for details.
2024年11月13日T17:20:13.9751429Z FAILED tests/integration/standard/test_cluster.py::HostStateTest::test_down_event_with_active_connection - AssertionError: False is not true : Host was not up on iteration 0
202
libev and asyncore:
2024年11月13日T17:19:27.9248641Z > start_cluster_wait_for_up(ccm_cluster)
2024年11月13日T17:19:27.9248893Z
2024年11月13日T17:19:27.9249081Z tests/integration/standard/test_authentication.py:44:
2024年11月13日T17:19:27.9249561Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2024年11月13日T17:19:27.9250073Z tests/integration/__init__.py:501: in start_cluster_wait_for_up
2024年11月13日T17:19:27.9250608Z cluster.start(wait_for_binary_proto=True, wait_other_notice=True)
2024年11月13日T17:19:27.9251233Z .test-venv/lib/python3.10/site-packages/ccmlib/cluster.py:489: in start
2024年11月13日T17:19:27.9252054Z p = node.start(update_pid=False, jvm_args=jvm_args, profile_options=profile_options, verbose=verbose, quiet_start=quiet_start)
2024年11月13日T17:19:27.9253011Z .test-venv/lib/python3.10/site-packages/ccmlib/node.py:590: in start
2024年11月13日T17:19:27.9253470Z common.check_socket_available(itf)
2024年11月13日T17:19:27.9254063Z .test-venv/lib/python3.10/site-packages/ccmlib/common.py:774: in check_socket_available
2024年11月13日T17:19:27.9254606Z pid = pid_listening_on(addr, port)
2024年11月13日T17:19:27.9255003Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2024年11月13日T17:19:27.9255325Z
2024年11月13日T17:19:27.9255453Z addr = '127.0.0.1', port = 7000
2024年11月13日T17:19:27.9255651Z
2024年11月13日T17:19:27.9255784Z def pid_listening_on(addr: str, port: int):
2024年11月13日T17:19:27.9256259Z '''get pid of the process listening on the given port'''
2024年11月13日T17:19:27.9256645Z pid = -1
2024年11月13日T17:19:27.9256921Z if lsof := shutil.which('lsof'):
2024年11月13日T17:19:27.9257415Z result = subprocess.run([lsof, '-t', '-i', f'tcp@{addr}:{port}'],
2024年11月13日T17:19:27.9257890Z stdout=subprocess.PIPE,
2024年11月13日T17:19:27.9258264Z check=True)
2024年11月13日T17:19:27.9258584Z if result.stdout:
2024年11月13日T17:19:27.9258866Z > pid = int(result.stdout)
2024年11月13日T17:19:27.9259420Z E ValueError: invalid literal for int() with base 10: b'33926\n33950\n33975\n'
2024年11月13日T17:19:27.9259804Z
2024年11月13日T17:19:27.9260108Z .test-venv/lib/python3.10/site-packages/ccmlib/common.py:741: ValueError
2024年11月13日T17:19:27.9260724Z _________ ERROR at setup of AuthenticationTests.test_connect_wrong_pwd _________
the last one seems like leftover from the first run session
0a46a34
to
4536002
Compare
multiple issues identified
one addressed in 4536002
and one in ccm (a by product of the first issue):
scylladb/scylla-ccm#626
now we have 1 failing across the board:
FAILED tests/integration/standard/test_query.py::LightweightTransactionTests::test_no_connection_refused_on_timeout - AssertionError: Unexpected exception ErrorMessage: CAS operation result is unknown - proposal accepted by 1 but not a quorum.
and those in some:
FAILED tests/integration/standard/test_rack_aware_policy.py::RackAwareRoundRobinPolicyTests::test_rack_aware - AssertionError: <cassandra.cluster.ResultSet object at 0x7f6fac5d2950> != [(0, 0, 0)]
FAILED tests/integration/standard/test_udts.py::UDTTests::test_can_insert_prepared_registered_udts - cassandra.cluster.UserTypeDoesNotExist: User type user does not exist in keyspace udttests
FAILED tests/integration/standard/test_cluster.py::HostStateTest::test_down_event_with_active_connection - AssertionError: False is not true : Host was not up on iteration 0
dd65e7a
to
374e0c7
Compare
left with asyncio, not being too stable:
FAILED tests/integration/standard/test_cluster.py::HostStateTest::test_down_event_with_active_connection - AssertionError: False is not true : Host was not up on iteration 0
FAILED tests/integration/standard/test_metadata.py::SchemaMetadataTests::test_refresh_schema_metadata - cassandra.DriverException: Schema metadata was not refreshed. See log for details.
FAILED tests/integration/standard/test_connection.py::HeartbeatTest::test_heart_beat_timeout - AssertionError: None is not true
FAILED tests/integration/standard/test_metadata.py::SchemaMetadataTests::test_refresh_schema_metadata - cassandra.DriverException: Schema metadata was not refreshed. See log for details.
FAILED tests/integration/standard/test_metadata.py::SchemaMetadataTests::test_table_extensions - KeyError: 'test_table_extensionsview'
FAILED tests/integration/standard/test_metadata.py::SchemaMetadataTests::test_metadata_pagination_keyspaces - cassandra.DriverException: Schema metadata was not refreshed. See log for details.
FAILED tests/integration/standard/test_metadata.py::SchemaMetadataTests::test_refresh_schema_metadata - cassandra.DriverException: Schema metadata was not refreshed. See log for details.
seems like the only lasting issue is with asyncio, maybe we leave it out from test with cassandra ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change it to 6.2.0
, it is way faster then 6.0.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is unrelated to this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference, why did you replace it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as mentioned in the commit message, unittest.skip doesn't skip the setup/teardown code when using pytest as test runner, hence creating the cluster even for skipped tests.
for some of those test it was an issue, since the cassandra couldn't create the cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have some string or comment that will describe reason why it is only supported by scylla ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like test is setting api_address
which isn't legal for cassandra
I'll try fixing the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just removing it, works fine of both scylla and cassnadra
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please create an issue of that and link it in a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create and issue (in the driver repo) for that and reference it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you already adding it in other PR
https://github.com/scylladb/python-driver/pull/363/files#diff-10e4dc66833987dff353dedf3a30e98f8050a7e36b8cc6744d08c0f2552cbe11R1211
it was a scylla issue, that was fixed in 5.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefere to create an issue and investigate what is going on here, why on scylla driver get's different rack name ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's ccm code
scylla default to GossipingPropertyFileSnitch
and the logic there uses RAC1
cassandra default to PropertyFileSnitch
and the logic default to r1
we can align it on ccm, if you insist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have explanation why it is ran only on scylla ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a comment, basically seems this feature didn't cover cassandra specific queries (i.e. peers_v2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have explanation why to run it only on scylla ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RateLimitExceededException is a scylla only feature... it's error only scylla generate, and cassandra doesn't
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why it is None on scylla, is it expected ? If you don't want to look into it, can you please create an issue for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is an issue for it on the comment, you raise it yourself :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only on scylla ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test was introduce as part of fixing this issue:
#201
the behavior of scylla and cassandra isn't the same for what covered in this test
@Lorak-mmk might remember the details, as he introduced this test
374e0c7
to
ccb5e03
Compare
7aab14c
to
48666ad
Compare
@fruch, is it ready to be reviewed ?
@fruch, is it ready to be reviewed ?
yes it's ready
and now all tests are passing (but some group of test that was excluded failing with asyncio eventloop)
@mykaul
mykaul
Dec 26, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not 4.1.7 (latest 4.1.x) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cause the the version that was available when I started this effort
@mykaul
mykaul
Dec 26, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we ever test earlier than 2.2 versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so far we didn't test cassandra at all, for sure we haven't test anything else but 4.1.5
we are testing now.
to make sure we don't break compitability completly with cassandra introducing tests with one version of cassandra as well
those test can't be run ontop of cassandra and we should mark them in a why we can skip them
cause some of them are missing, we are getting to situations we fail to the the next test session on the github action builders
cassandra 4.1 and up, removed ms/min and so from name of config options, and it need to be specific in the value itself we have few tests using it, and need to be adapted
...n_timeout` seems like this test was fixed in upsteam to ignore `ErrorMessage` type I don't know the exact reason, but since it take us quite some time to sync with upstream, taking this change on it's own it happens only with cassandra (tested with 4.1) Ref: datastax@1b335d4
when we use `unittest.skip` on a test function and run via pytest, we still collect the test and run the setup/teardown even the test function would be skipped. when running with cassandra we get into situations where create cluster is also failing for all kind of reasons, which we don't need that cluster.
couldn't identified the root cause of the failures in those tests and they seem to be failing just on asyncio (which is still exprimatal for cassandra), so they are marked as xfail for now
48666ad
to
4b47f4d
Compare
to make sure we don't break compitability completly with cassandra introducing tests with one version of cassandra as well