1

We have an application with around 300k active users. We have a dedicated database server which has the following configurations. The problem we are facing is that randomly we get alot of load on our server and it gets hanged and we have to restart mysql service in order to continue. We have monyog installed on the database server to monitor which queries are taking much time but whenever the crash occurs it doesnt seems to show any unusual activity just that the queries start to stack up alot and even the normal ones start taking time more than 2 minutes. We have optimized our database in every way possible in our knowledge like adding index to the tables optimizing queries minimizing number of queries.

Below are our configurations

server.cnf
[mysqld]
 innodb_force_recovery = 0
 port = 3306
 socket = /var/lib/mysql/mysql.sock
 datadir=/var/lib/mysql
 skip-external-locking
 performance_schema=ON
 query_cache_limit=1M
 query_cache_size=16M
 query_cache_type=1
 max_user_connections=32000
 max_connections=32000
 interactive_timeout=180
 wait_timeout=180
 connect_timeout=10
 thread_cache_size=200
 max_connect_errors=999999
 max_allowed_packet=512M
 table_open_cache=90000
 innodb_file_per_table=1
 myisam_sort_buffer_size=64M
 tmp_table_size=32M
 max_heap_table_size=32M
 default-storage-engine = innodb
[mysqldump]
 quick
 max_allowed_packet = 16M
[mysql]
no-auto-rehash
safe-updates
[isamchk]
key_buffer = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M
[myisamchk]
key_buffer = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M
[mysqlhotcopy]
interactive-timeout

limits.conf

* hard nofile 199680
* soft nofile 16364

Application Architechture

language: Php5

Mysql Version

innodb_version 5.6.36-82.2
protocol_version 10
slave_type_conversions 
version 10.1.28-MariaDB
version_comment MariaDB Server
version_compile_machine x86_64
version_compile_os Linux
version_malloc_library system jemalloc
version_ssl_library OpenSSL 1.0.1e-fips 11 Feb 2013
wsrep_patch_version wsrep_25.20

its MariaDB

OS: CentOS

cat /proc/meminfo

MemTotal: 131729252 kB
 MemFree: 74397484 kB
 MemAvailable: 97158916 kB
 Buffers: 4884 kB
 Cached: 23181052 kB
 SwapCached: 0 kB
 Active: 53580240 kB
 Inactive: 1336380 kB
 Active(anon): 32095004 kB
 Inactive(anon): 279912 kB
 Active(file): 21485236 kB
 Inactive(file): 1056468 kB
 Unevictable: 0 kB
 Mlocked: 0 kB
 SwapTotal: 1048572 kB
 SwapFree: 1048572 kB
 Dirty: 76 kB
 Writeback: 0 kB
 AnonPages: 31730916 kB
 Mapped: 139488 kB
 Shmem: 644244 kB
 Slab: 1190592 kB
 SReclaimable: 897028 kB
 SUnreclaim: 293564 kB
 KernelStack: 15536 kB
 PageTables: 84744 kB
 NFS_Unstable: 0 kB
 Bounce: 0 kB
 WritebackTmp: 0 kB
 CommitLimit: 66913196 kB
 Committed_AS: 89544808 kB
 VmallocTotal: 34359738367 kB
 VmallocUsed: 550556 kB
 VmallocChunk: 34258290684 kB
 HardwareCorrupted: 0 kB
 AnonHugePages: 18024448 kB
 HugePages_Total: 0
 HugePages_Free: 0
 HugePages_Rsvd: 0
 HugePages_Surp: 0
 Hugepagesize: 2048 kB
 DirectMap4k: 281112 kB
 DirectMap2M: 14270464 kB
 DirectMap1G: 121634816 kB

the query to show status like '%abort%';

Aborted_clients 3098
Aborted_connects 141795
wsrep_local_bf_aborts 0

Below is my mysql_error.log but its only showing up the things after i restarted the mysql service

2017年10月26日 23:36:56 140677528500480 [Note] Plugin 'FEEDBACK' is disabled.
2017年10月26日 23:36:56 140677528500480 [Note] Server socket created on IP: '::'.
2017年10月26日 23:36:56 140677528500480 [Warning] 'user' entry '[email protected]' ignored in --skip-name-resolve mode.
2017年10月26日 23:36:56 140677528500480 [Warning] 'user' entry '@db.dbsuer.com' ignored in --skip-name-resolve mode.
2017年10月26日 23:36:56 140677528500480 [Warning] 'user' entry '[email protected]' ignored in --skip-name-resolve mode.
2017年10月26日 23:36:56 140677528500480 [Warning] 'proxies_priv' entry '@% [email protected]' ignored in --skip-name-resolve mode.
2017年10月26日 23:36:56 140677528058624 [Warning] Failed to load slave replication state from table mysql.gtid_slave_pos: 1146: Table 'mysql.gtid_slave_pos' doesn't exist
2017年10月26日 23:36:56 140677528500480 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.1.28-MariaDB' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
2017年10月26日 23:36:56 7ff205748b00 InnoDB: Error: Table "mysql"."innodb_table_stats" not found.
2017年10月27日 2:12:44 140588045855488 [Note] /usr/sbin/mysqld: Normal shutdown
2017年10月27日 2:13:04 140588045855488 [Warning] /usr/sbin/mysqld: Forcing close of thread 26950259 user: 'dbuser'
2017年10月27日 2:13:04 140588045855488 [Warning] /usr/sbin/mysqld: Forcing close of thread 26950237 user: 'dbuser'
2017年10月27日 2:13:04 140588045855488 [Warning] /usr/sbin/mysqld: Forcing close of thread 26908783 user: 'dbuser'
2017年10月27日 2:13:04 140588045855488 [ERROR] mysqld: Got an error writing communication packets
2017年10月27日 2:13:04 140588045855488 [Warning] /usr/sbin/mysqld: Forcing close of thread 6885924 user: 'root'
2017年10月27日 2:13:04 140588045855488 [Warning] /usr/sbin/mysqld: Forcing close of thread 2139 user: 'root'
2017年10月27日 2:13:14 140588619900672 [Note] InnoDB: FTS optimize thread exiting.
2017年10月27日 2:13:14 140588045855488 [Note] InnoDB: Starting shutdown...
2017年10月27日 02:14:54 7fafc7e6b900 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
2017年10月27日 2:14:54 140392949790976 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2017年10月27日 2:14:54 140392949790976 [Note] InnoDB: The InnoDB memory heap is disabled
2017年10月27日 2:14:54 140392949790976 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017年10月27日 2:14:54 140392949790976 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2
SHOW GLOBAL STATUS;
'Aborted_clients', '108'
'Aborted_connects', '5565'
'Access_denied_errors', '0'
'Acl_column_grants', '0'
'Acl_database_grants', '3'
'Acl_function_grants', '0'
'Acl_procedure_grants', '0'
'Acl_proxy_users', '1'
'Acl_role_grants', '0'
'Acl_roles', '0'
'Acl_table_grants', '0'
'Acl_users', '21'
'Aria_pagecache_blocks_not_flushed', '0'
'Aria_pagecache_blocks_unused', '15706'
'Aria_pagecache_blocks_used', '28'
'Aria_pagecache_read_requests', '553397'
'Aria_pagecache_reads', '6463'
'Aria_pagecache_write_requests', '71168'
'Aria_pagecache_writes', '0'
'Aria_transaction_log_syncs', '0'
'Binlog_bytes_written', '0'
'Binlog_cache_disk_use', '0'
'Binlog_cache_use', '0'
'Binlog_commits', '0'
'Binlog_group_commit_trigger_count', '0'
'Binlog_group_commit_trigger_lock_wait', '0'
'Binlog_group_commit_trigger_timeout', '0'
'Binlog_group_commits', '0'
'Binlog_snapshot_file', ''
'Binlog_snapshot_position', '0'
'Binlog_stmt_cache_disk_use', '0'
'Binlog_stmt_cache_use', '0'
'Busy_time', '0.000000'
'Bytes_received', '779126531'
'Bytes_sent', '7848821242'
'Com_admin_commands', '193376'
'Com_alter_db', '0'
'Com_alter_db_upgrade', '0'
'Com_alter_event', '0'
'Com_alter_function', '0'
'Com_alter_procedure', '0'
'Com_alter_server', '0'
'Com_alter_table', '0'
'Com_alter_tablespace', '0'
'Com_analyze', '0'
'Com_assign_to_keycache', '0'
'Com_begin', '0'
'Com_binlog', '0'
'Com_call_procedure', '25'
'Com_change_db', '2186510'
'Com_change_master', '0'
'Com_check', '0'
'Com_checksum', '0'
'Com_commit', '0'
'Com_compound_sql', '0'
'Com_create_db', '0'
'Com_create_event', '0'
'Com_create_function', '0'
'Com_create_index', '0'
'Com_create_procedure', '0'
'Com_create_role', '0'
'Com_create_server', '0'
'Com_create_table', '0'
'Com_create_temporary_table', '2'
'Com_create_trigger', '0'
'Com_create_udf', '0'
'Com_create_user', '0'
'Com_create_view', '0'
'Com_dealloc_sql', '0'
'Com_delete', '164'
'Com_delete_multi', '0'
'Com_do', '0'
'Com_drop_db', '0'
'Com_drop_event', '0'
'Com_drop_function', '0'
'Com_drop_index', '0'
'Com_drop_procedure', '0'
'Com_drop_role', '0'
'Com_drop_server', '0'
'Com_drop_table', '0'
'Com_drop_temporary_table', '4'
'Com_drop_trigger', '0'
'Com_drop_user', '0'
'Com_drop_view', '0'
'Com_empty_query', '0'
'Com_execute_sql', '0'
'Com_flush', '0'
'Com_get_diagnostics', '0'
'Com_grant', '0'
'Com_grant_role', '0'
'Com_ha_close', '0'
'Com_ha_open', '0'
'Com_ha_read', '0'
'Com_help', '0'
'Com_insert', '7530'
'Com_insert_select', '5'
'Com_install_plugin', '0'
'Com_kill', '1'
'Com_load', '0'
'Com_lock_tables', '0'
'Com_optimize', '0'
'Com_preload_keys', '0'
'Com_prepare_sql', '0'
'Com_purge', '0'
'Com_purge_before_date', '0'
'Com_release_savepoint', '0'
'Com_rename_table', '0'
'Com_rename_user', '0'
'Com_repair', '0'
'Com_replace', '0'
'Com_replace_select', '23'
'Com_reset', '0'
'Com_resignal', '0'
'Com_revoke', '0'
'Com_revoke_all', '0'
'Com_revoke_role', '0'
'Com_rollback', '0'
'Com_rollback_to_savepoint', '0'
'Com_savepoint', '0'
'Com_select', '2769447'
'Com_set_option', '1685912'
'Com_show_authors', '0'
'Com_show_binlog_events', '0'
'Com_show_binlogs', '0'
'Com_show_charsets', '0'
'Com_show_collations', '19'
'Com_show_contributors', '0'
'Com_show_create_db', '0'
'Com_show_create_event', '0'
'Com_show_create_func', '0'
'Com_show_create_proc', '0'
'Com_show_create_table', '1'
'Com_show_create_trigger', '0'
'Com_show_databases', '19'
'Com_show_engine_logs', '0'
'Com_show_engine_mutex', '0'
'Com_show_engine_status', '19'
'Com_show_errors', '0'
'Com_show_events', '0'
'Com_show_explain', '0'
'Com_show_fields', '2'
'Com_show_function_status', '0'
'Com_show_generic', '0'
'Com_show_grants', '0'
'Com_show_keys', '2'
'Com_show_master_status', '20'
'Com_show_open_tables', '1912'
'Com_show_plugins', '0'
'Com_show_privileges', '0'
'Com_show_procedure_status', '0'
'Com_show_processlist', '329'
'Com_show_profile', '0'
'Com_show_profiles', '0'
'Com_show_relaylog_events', '0'
'Com_show_slave_hosts', '0'
'Com_show_slave_status', '2'
'Com_show_status', '429'
'Com_show_storage_engines', '0'
'Com_show_table_status', '0'
'Com_show_tables', '5'
'Com_show_triggers', '0'
'Com_show_variables', '50'
'Com_show_warnings', '5'
'Com_shutdown', '0'
'Com_signal', '0'
'Com_start_all_slaves', '0'
'Com_start_slave', '0'
'Com_stmt_close', '1703'
'Com_stmt_execute', '2110'
'Com_stmt_fetch', '0'
'Com_stmt_prepare', '2110'
'Com_stmt_reprepare', '0'
'Com_stmt_reset', '0'
'Com_stmt_send_long_data', '0'
'Com_stop_all_slaves', '0'
'Com_stop_slave', '0'
'Com_truncate', '0'
'Com_uninstall_plugin', '0'
'Com_unlock_tables', '6263'
'Com_update', '23649'
'Com_update_multi', '0'
'Com_xa_commit', '0'
'Com_xa_end', '0'
'Com_xa_prepare', '0'
'Com_xa_recover', '0'
'Com_xa_rollback', '0'
'Com_xa_start', '0'
'Compression', 'OFF'
'Connection_errors_accept', '0'
'Connection_errors_internal', '0'
'Connection_errors_max_connections', '0'
'Connection_errors_peer_address', '0'
'Connection_errors_select', '0'
'Connection_errors_tcpwrap', '0'
'Connections', '2102550'
'Cpu_time', '0.000000'
'Created_tmp_disk_tables', '6776'
'Created_tmp_files', '6'
'Created_tmp_tables', '24053'
'Delayed_errors', '0'
'Delayed_insert_threads', '0'
'Delayed_writes', '0'
'Delete_scan', '26'
'Empty_queries', '971314'
'Executed_events', '0'
'Executed_triggers', '0'
'Feature_delay_key_write', '0'
'Feature_dynamic_columns', '0'
'Feature_fulltext', '0'
'Feature_gis', '0'
'Feature_locale', '0'
'Feature_subquery', '4177'
'Feature_timezone', '0'
'Feature_trigger', '0'
'Feature_xml', '0'
'Flush_commands', '1'
'Handler_commit', '1561805'
'Handler_delete', '132151'
'Handler_discover', '3'
'Handler_external_lock', '0'
'Handler_icp_attempts', '258439030'
'Handler_icp_match', '254844311'
'Handler_mrr_init', '0'
'Handler_mrr_key_refills', '0'
'Handler_mrr_rowid_refills', '0'
'Handler_prepare', '0'
'Handler_read_first', '280'
'Handler_read_key', '35395424'
'Handler_read_last', '16'
'Handler_read_next', '9410864982'
'Handler_read_prev', '81650'
'Handler_read_retry', '0'
'Handler_read_rnd', '28068980'
'Handler_read_rnd_deleted', '0'
'Handler_read_rnd_next', '59836719'
'Handler_rollback', '2'
'Handler_savepoint', '0'
'Handler_savepoint_rollback', '0'
'Handler_tmp_update', '694783'
'Handler_tmp_write', '5656429'
'Handler_update', '79769'
'Handler_write', '109219'
'Innodb_available_undo_logs', '128'
'Innodb_background_log_sync', '1115'
'Innodb_buffer_pool_bytes_data', '6600916992'
'Innodb_buffer_pool_bytes_dirty', '267239424'
'Innodb_buffer_pool_dump_status', 'Dumping buffer pool(s) not yet started'
'Innodb_buffer_pool_load_status', 'Loading buffer pool(s) not yet started'
'Innodb_buffer_pool_pages_data', '402888'
'Innodb_buffer_pool_pages_dirty', '16311'
'Innodb_buffer_pool_pages_flushed', '1'
'Innodb_buffer_pool_pages_free', '4154055'
'Innodb_buffer_pool_pages_lru_flushed', '0'
'Innodb_buffer_pool_pages_made_not_young', '0'
'Innodb_buffer_pool_pages_made_young', '320'
'Innodb_buffer_pool_pages_misc', '30513'
'Innodb_buffer_pool_pages_old', '149969'
'Innodb_buffer_pool_pages_total', '4587456'
'Innodb_buffer_pool_read_ahead', '63121'
'Innodb_buffer_pool_read_ahead_evicted', '0'
'Innodb_buffer_pool_read_ahead_rnd', '0'
'Innodb_buffer_pool_read_requests', '2157996469'
'Innodb_buffer_pool_reads', '337111'
'Innodb_buffer_pool_wait_free', '0'
'Innodb_buffer_pool_write_requests', '4816856'
'Innodb_checkpoint_age', '128276390'
'Innodb_checkpoint_max_age', '13914982934'
'Innodb_data_fsyncs', '1213'
'Innodb_data_pending_fsyncs', '0'
'Innodb_data_pending_reads', '0'
'Innodb_data_pending_writes', '0'
'Innodb_data_read', '6583128576'
'Innodb_data_reads', '402049'
'Innodb_data_writes', '32494'
'Innodb_data_written', '144898048'
'Innodb_dblwr_pages_written', '1'
'Innodb_dblwr_writes', '1'
'Innodb_deadlocks', '1'
'Innodb_defragment_compression_failures', '0'
'Innodb_defragment_count', '0'
'Innodb_defragment_failures', '0'
'Innodb_encryption_key_rotation_list_length', '0'
'Innodb_encryption_n_merge_blocks_decrypted', '0'
'Innodb_encryption_n_merge_blocks_encrypted', '0'
'Innodb_encryption_n_rowlog_blocks_decrypted', '0'
'Innodb_encryption_n_rowlog_blocks_encrypted', '0'
'Innodb_encryption_num_key_requests', '0'
'Innodb_encryption_rotation_estimated_iops', '0'
'Innodb_encryption_rotation_pages_flushed', '0'
'Innodb_encryption_rotation_pages_modified', '0'
'Innodb_encryption_rotation_pages_read_from_cache', '0'
'Innodb_encryption_rotation_pages_read_from_disk', '0'
'Innodb_have_atomic_builtins', 'ON'
'Innodb_have_bzip2', 'OFF'
'Innodb_have_lz4', 'OFF'
'Innodb_have_lzma', 'ON'
'Innodb_have_lzo', 'OFF'
'Innodb_have_snappy', 'OFF'
'Innodb_history_list_length', '130'
'Innodb_ibuf_discarded_delete_marks', '0'
'Innodb_ibuf_discarded_deletes', '0'
'Innodb_ibuf_discarded_inserts', '0'
'Innodb_ibuf_free_list', '64803'
'Innodb_ibuf_merged_delete_marks', '428176'
'Innodb_ibuf_merged_deletes', '123181'
'Innodb_ibuf_merged_inserts', '1178'
'Innodb_ibuf_merges', '4385'
'Innodb_ibuf_segment_size', '64805'
'Innodb_ibuf_size', '1'
'Innodb_log_waits', '0'
'Innodb_log_write_requests', '339325'
'Innodb_log_writes', '32398'
'Innodb_lsn_current', '614688828236'
'Innodb_lsn_flushed', '614688813752'
'Innodb_lsn_last_checkpoint', '614560551846'
'Innodb_master_thread_active_loops', '1115'
'Innodb_master_thread_idle_loops', '0'
'Innodb_max_trx_id', '8368506565'
'Innodb_mem_adaptive_hash', '1689900176'
'Innodb_mem_dictionary', '299754541'
'Innodb_mem_total', '78721843200'
'Innodb_mutex_os_waits', '6963'
'Innodb_mutex_spin_rounds', '1490291'
'Innodb_mutex_spin_waits', '1332026'
'Innodb_num_index_pages_written', '0'
'Innodb_num_non_index_pages_written', '32402'
'Innodb_num_page_compressed_trim_op', '0'
'Innodb_num_page_compressed_trim_op_saved', '0'
'Innodb_num_pages_decrypted', '0'
'Innodb_num_pages_encrypted', '0'
'Innodb_num_pages_page_compressed', '0'
'Innodb_num_pages_page_compression_error', '0'
'Innodb_num_pages_page_decompressed', '0'
'Innodb_oldest_view_low_limit_trx_id', '8368503775'
'Innodb_onlineddl_pct_progress', '0'
'Innodb_onlineddl_rowlog_pct_used', '0'
'Innodb_onlineddl_rowlog_rows', '0'
'Innodb_os_log_fsyncs', '1122'
'Innodb_os_log_pending_fsyncs', '0'
'Innodb_os_log_pending_writes', '0'
'Innodb_os_log_written', '144864256'
'Innodb_page_compression_saved', '0'
'Innodb_page_compression_trim_sect1024', '0'
'Innodb_page_compression_trim_sect16384', '0'
'Innodb_page_compression_trim_sect2048', '0'
'Innodb_page_compression_trim_sect32768', '0'
'Innodb_page_compression_trim_sect4096', '0'
'Innodb_page_compression_trim_sect512', '0'
'Innodb_page_compression_trim_sect8192', '0'
'Innodb_page_size', '16384'
'Innodb_pages0_read', '237'
'Innodb_pages_created', '1079'
'Innodb_pages_read', '401809'
'Innodb_pages_written', '1'
'Innodb_purge_trx_id', '8368504055'
'Innodb_purge_undo_no', '0'
'Innodb_read_views_memory', '2600'
'Innodb_row_lock_current_waits', '0'
'Innodb_row_lock_time', '950387'
'Innodb_row_lock_time_avg', '4658'
'Innodb_row_lock_time_max', '41902'
'Innodb_row_lock_waits', '204'
'Innodb_rows_deleted', '132151'
'Innodb_rows_inserted', '56069'
'Innodb_rows_read', '9463155799'
'Innodb_rows_updated', '79769'
'Innodb_s_lock_os_waits', '44208'
'Innodb_s_lock_spin_rounds', '3808201'
'Innodb_s_lock_spin_waits', '738548'
'Innodb_scrub_background_page_reorganizations', '0'
'Innodb_scrub_background_page_split_failures_missing_index', '0'
'Innodb_scrub_background_page_split_failures_out_of_filespace', '0'
'Innodb_scrub_background_page_split_failures_underflow', '0'
'Innodb_scrub_background_page_split_failures_unknown', '0'
'Innodb_scrub_background_page_splits', '0'
'Innodb_secondary_index_triggered_cluster_reads', '306821502'
'Innodb_secondary_index_triggered_cluster_reads_avoided', '0'
'Innodb_system_rows_deleted', '0'
'Innodb_system_rows_inserted', '0'
'Innodb_system_rows_read', '0'
'Innodb_system_rows_updated', '0'
'Innodb_truncated_status_writes', '0'
'Innodb_x_lock_os_waits', '23001'
'Innodb_x_lock_spin_rounds', '7705254'
'Innodb_x_lock_spin_waits', '94569'
'Key_blocks_not_flushed', '0'
'Key_blocks_unused', '214330'
'Key_blocks_used', '4'
'Key_blocks_warm', '0'
'Key_read_requests', '84'
'Key_reads', '4'
'Key_write_requests', '0'
'Key_writes', '0'
'Last_query_cost', '0.000000'
'Master_gtid_wait_count', '0'
'Master_gtid_wait_time', '0'
'Master_gtid_wait_timeouts', '0'
'Max_statement_time_exceeded', '0'
'Max_used_connections', '300'
'Memory_used', '445027144'
'Not_flushed_delayed_rows', '0'
'Open_files', '12'
'Open_streams', '0'
'Open_table_definitions', '263'
'Open_tables', '400'
'Opened_files', '27460'
'Opened_plugin_libraries', '0'
'Opened_table_definitions', '267'
'Opened_tables', '525'
'Opened_views', '0'
'Performance_schema_accounts_lost', '0'
'Performance_schema_cond_classes_lost', '0'
'Performance_schema_cond_instances_lost', '0'
'Performance_schema_digest_lost', '0'
'Performance_schema_file_classes_lost', '0'
'Performance_schema_file_handles_lost', '0'
'Performance_schema_file_instances_lost', '0'
'Performance_schema_hosts_lost', '0'
'Performance_schema_locker_lost', '0'
'Performance_schema_mutex_classes_lost', '0'
'Performance_schema_mutex_instances_lost', '0'
'Performance_schema_rwlock_classes_lost', '0'
'Performance_schema_rwlock_instances_lost', '0'
'Performance_schema_session_connect_attrs_lost', '0'
'Performance_schema_socket_classes_lost', '0'
'Performance_schema_socket_instances_lost', '0'
'Performance_schema_stage_classes_lost', '0'
'Performance_schema_statement_classes_lost', '0'
'Performance_schema_table_handles_lost', '0'
'Performance_schema_table_instances_lost', '0'
'Performance_schema_thread_classes_lost', '0'
'Performance_schema_thread_instances_lost', '0'
'Performance_schema_users_lost', '0'
'Prepared_stmt_count', '0'
'Qcache_free_blocks', '1918'
'Qcache_free_memory', '5902288'
'Qcache_hits', '1209722'
'Qcache_inserts', '1347881'
'Qcache_lowmem_prunes', '943761'
'Qcache_not_cached', '211725'
'Qcache_queries_in_cache', '5647'
'Qcache_total_blocks', '13458'
'Queries', '8782988'
'Questions', '8778832'
'Rows_read', '3758678584'
'Rows_sent', '9669002'
'Rows_tmp_read', '6550750'
'Rpl_status', 'AUTH_MASTER'
'Select_full_join', '9637'
'Select_full_range_join', '3'
'Select_range', '188883'
'Select_range_check', '0'
'Select_scan', '6116'
'Slave_connections', '0'
'Slave_heartbeat_period', '0.000'
'Slave_open_temp_tables', '0'
'Slave_received_heartbeats', '0'
'Slave_retried_transactions', '0'
'Slave_running', 'OFF'
'Slave_skipped_errors', '0'
'Slaves_connected', '0'
'Slaves_running', '0'
'Slow_launch_threads', '0'
'Slow_queries', '24'
'Sort_merge_passes', '0'
'Sort_priority_queue_sorts', '16808'
'Sort_range', '17758'
'Sort_rows', '376035'
'Sort_scan', '10237'
'Ssl_accept_renegotiates', '0'
'Ssl_accepts', '0'
'Ssl_callback_cache_hits', '0'
'Ssl_cipher', ''
'Ssl_cipher_list', ''
'Ssl_client_connects', '0'
'Ssl_connect_renegotiates', '0'
'Ssl_ctx_verify_depth', '0'
'Ssl_ctx_verify_mode', '0'
'Ssl_default_timeout', '0'
'Ssl_finished_accepts', '0'
'Ssl_finished_connects', '0'
'Ssl_server_not_after', ''
'Ssl_server_not_before', ''
'Ssl_session_cache_hits', '0'
'Ssl_session_cache_misses', '0'
'Ssl_session_cache_mode', 'NONE'
'Ssl_session_cache_overflows', '0'
'Ssl_session_cache_size', '0'
'Ssl_session_cache_timeouts', '0'
'Ssl_sessions_reused', '0'
'Ssl_used_session_cache_entries', '0'
'Ssl_verify_depth', '0'
'Ssl_verify_mode', '0'
'Ssl_version', ''
'Subquery_cache_hit', '0'
'Subquery_cache_miss', '5000'
'Syncs', '2'
'Table_locks_immediate', '1581465'
'Table_locks_waited', '0'
'Tc_log_max_pages_used', '0'
'Tc_log_page_size', '4096'
'Tc_log_page_waits', '0'
'Threadpool_idle_threads', '0'
'Threadpool_threads', '0'
'Threads_cached', '118'
'Threads_connected', '118'
'Threads_created', '309'
'Threads_running', '6'
'Update_scan', '5'
'Uptime', '1124'
'Uptime_since_flush_status', '1124'
'wsrep_cluster_conf_id', '18446744073709551615'
'wsrep_cluster_size', '0'
'wsrep_cluster_state_uuid', ''
'wsrep_cluster_status', 'Disconnected'
'wsrep_connected', 'OFF'
'wsrep_local_bf_aborts', '0'
'wsrep_local_index', '18446744073709551615'
'wsrep_provider_name', ''
'wsrep_provider_vendor', ''
'wsrep_provider_version', ''
'wsrep_ready', 'OFF'
'wsrep_thread_count', '0'

We have 5 databases with almost 200 million rows altogether. To my knowledge we can do some stuff but that would come under optimizing and reducing the load. Apart from any bad queries what can be done to prevent the crashes? We used to get alof of errors like

too many open files
too many connections 
cannot connect right now

But by researching and configuring right parameters we have overcome those but the crashing has been there for a while now and its become more frequent. There is nothing much is the mysql_error.log as well Any help would be great!

asked Oct 27, 2017 at 9:01
12
  • 1
    What does the /var/log/mysql/error.log display? Could you add that information to your question? Then you could have a look at the show status like '%abort%' in your mysql and trigger it to see which of the counters is increasing. This can give you a hint of what is happening: See B.5.2.11 Communication Errors and Aborted Connections Commented Oct 27, 2017 at 9:08
  • Why are you allowing any hacker 999999 login attempts to break in to your server with your max_connect_errors tolerance? How much RAM does your server have? Please post your php.ini and text results of the following A. SHOW GLOBAL STATUS; B. SHOW GLOBAL VARIABLES; C. SHOW ENGINE INNODB STATUS; for possible advice. Commented Oct 27, 2017 at 16:02
  • added the show global status result, RAM is listed 131729252 i.e. 128GB. What about max_connect_errors ? what should that be? Commented Oct 28, 2017 at 4:51
  • Anyone's tolerance for hackers should be limited to no more than 10 attempts. IMHO. You have 128GB RAM. I see your GLOBAL STATUS under error.log content. Today, can not see ini, B or C. Please post your php.ini and text results of the following B. SHOW GLOBAL VARIABLES; C. SHOW ENGINE INNODB STATUS; for possible advice. Commented Oct 28, 2017 at 19:34
  • My error earlier, this number indicates you have 128MB RAM. Please post whatever you are looking at as text. Commented Oct 28, 2017 at 20:14

1 Answer 1

2

The following suggestions need your research before implementing ONLY one item per day. Some may be applied dynamically. Suggested cfg/ini values follow, could be modify, add or remove.

innodb_print_all_deadlocks=1 # from OFF for daily review in error log
log_warnings=2 # for additional info on connect failures for research
these first two are DYNAMIC and could/should be done NOW, add to cfg/ini
review your PREPARE, EXECUTE, CLOSE code. STATUS indicates ~ 400 not closed
status threads_connected of 118 = resources not released, when client done
thread_cache_size=100 # from 200 to prevent OOM per V8
max_connections=1000 # from 32000, until stabilized, 300 max_used since start
table_open_cache=10000 # from 90000, 525 were opened since start
open_files_limit=30000 # from default for ratio of toc * 3

ensure ulimit will support requested volumes

Consider installing MySQLTuner.com and posting report in Question.

When GLOBAL VARIABLES become available, there will be additional opportunities. When appropriate, keep us informed, accept if it helps, please.

---- 2017 11 02 additional observation, with aborted_connections of 5555 and threads_connected, does your application have a convenient 'logoff' available? If not, please consider adding to allow releasing of resources and use of CLOSE functions.

---- 2017 11 05 how is it possible to have 'Connections', '2102550' in uptime of 1124 seconds? About ~1800 per second per posted status. With a master/slave this may be reasonable.

answered Oct 29, 2017 at 11:44
13
  • These tips are a good start. More can be had from VARIABLES and GLOBAL STATUS after running for a day. (MySQLTuner is usually not very useful.) Commented Oct 29, 2017 at 15:52
  • @rickjames MySQLTuner is quite helpful with version, storage engine statistics (table space used, and number of tables per engine), RAM, connections, threads, tables/files and other significant details in the Metrics groups, some with limits and counts that are very helpful. Recommendations must be carefully scrutinized. Sometime alarms are raised unnecessarily, i.e. 'Beware that open_files_limit (1024) variable should be greater than table_open_cache (400)' It is, so why not just add text of 'OK at this time.' to clarify that they know we are OK. Commented Oct 29, 2017 at 18:37
  • It almost always recommends OPTIMIZE TABLE, which is rarely useful. Commented Oct 30, 2017 at 1:53
  • @RickJames i cant post the whole thing here due to characters limit, would you mind giving me your email id or something? I need a mysql expert on this. Thanks Commented Oct 30, 2017 at 2:15
  • You can reach me at mysql at rjweb dot org . Commented Oct 30, 2017 at 12:20

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.