Monday, February 18, 2019
Calculating Sexy Primes, Prime Triplets and Sexy Prime Triplets in SQL Server
The other day I was reading something on Hacker News and someone posted a link to the Wikipedia article on sexy primes. I looked at it and then decided to do this in SQL Server because... why not?
From that wikipedia link: https://en.wikipedia.org/wiki/Sexy_prime
In mathematics, sexy primes are prime numbers that differ from each other by six. For example, the numbers 5 and 11 are both sexy primes, because 11 minus 5 is 6.
The term "sexy prime" is a pun stemming from the Latin word for six: sex.
If p + 2 or p + 4 (where p is the lower prime) is also prime, then the sexy prime is part of a prime triplet.
Ok I did a couple of versions of this over the weekend, I also did a PostgreSQL version: Calculating Sexy Primes, Prime Triplets and Sexy Prime Triplets in PostgreSQL
So first we need a table that will just have the prime numbers
I decided to populate a table with the numbers from 2 to 500 and then use the sieve of Eratosthenes to delete the non-primes
This will look like this
CREATE TABLE #PrimeNumbers(n int)

INSERT INTO #PrimeNumbers
SELECT number
FROM master..spt_values
WHERE type = 'P'
AND number BETWEEN 2 AND 500

--Sieve method
DECLARE @i INT
SET @i = 2

WHILE @i <= SQRT(500)
BEGIN
    DELETE FROM #PrimeNumbers
    WHERE n % @i = 0
    AND n > @i

    SET @i = @i + 1
END

SELECT * FROM #PrimeNumbers
Thinking about it a little more, I decided to do it with a CTE instead of a loop with DELETE statements. If your tables are big, the delete method is probably better... it's for you to test that out :-)
What we are doing is a NOT EXISTS query against the same CTE: we keep a number only if there is no smaller number in the CTE that divides it evenly
IF OBJECT_ID('tempdb..#PrimeNumbers') IS NOT NULL
DROP TABLE #PrimeNumbers

CREATE TABLE #PrimeNumbers(n int)

;WITH cte AS (
    SELECT number AS n
    FROM master..spt_values
    WHERE type = 'P'
    AND number BETWEEN 2 AND 500
),
PrimeNumbers AS (
    SELECT n
    FROM cte
    WHERE NOT EXISTS (
        SELECT n
        FROM cte AS cte2
        WHERE cte.n > cte2.n
        AND cte.n % cte2.n = 0)
)
INSERT #PrimeNumbers
SELECT * FROM PrimeNumbers

SELECT * FROM #PrimeNumbers
If we run that last select statement, we should have 95 rows
2
3
5
7
.....
.....
463
467
479
487
491
499
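If you want to sanity-check that count outside SQL Server, here is the same sieve logic as a quick Python sketch (just an illustration of the math; the T-SQL above is the real solution):

```python
# Sieve of Eratosthenes over 2..500, mirroring the DELETE loop above
limit = 500
nums = set(range(2, limit + 1))
i = 2
while i * i <= limit:
    # remove every multiple of i that is larger than i itself
    nums -= {n for n in nums if n > i and n % i == 0}
    i += 1
primes = sorted(nums)
print(len(primes), primes[:4], primes[-1])  # 95 [2, 3, 5, 7] 499
```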
Now that we have our table filled with the prime numbers up to 500, it's time to run the queries
Sexy prime pairs
The sexy primes (sequences A023201 and A046117 in the OEIS) below 500 are:
(5,11), (7,13), (11,17), (13,19), (17,23), (23,29), (31,37), (37,43), (41,47), (47,53), (53,59), (61,67), (67,73), (73,79), (83,89), (97,103), (101,107), (103,109), (107,113), (131,137), (151,157), (157,163), (167,173), (173,179), (191,197), (193,199), (223,229), (227,233), (233,239), (251,257), (257,263), (263,269), (271,277), (277,283), (307,313), (311,317), (331,337), (347,353), (353,359), (367,373), (373,379), (383,389), (433,439), (443,449), (457,463), (461,467).
Here is that query for the sexy prime pairs
-- 46 rows.. sexy primes
SELECT t1.n, t2.n
FROM #PrimeNumbers t1
JOIN #PrimeNumbers t2 ON t2.n - t1.n = 6
ORDER BY 1

It's very simple... a self join that returns rows where the number from one table alias and the number from the other table alias differ by 6
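As a quick cross-check of the query's math, here is the same pair logic in Python (an illustration only, not part of the SQL solution):

```python
# cross-check: sexy prime pairs below 500 (both p and p + 6 prime)
limit = 500
sieve = [True] * (limit + 1)
sieve[0] = sieve[1] = False
for i in range(2, int(limit ** 0.5) + 1):
    if sieve[i]:
        for m in range(i * i, limit + 1, i):
            sieve[m] = False
primes = [n for n in range(2, limit + 1) if sieve[n]]
sexy_pairs = [(p, p + 6) for p in primes if p + 6 <= limit and sieve[p + 6]]
print(len(sexy_pairs), sexy_pairs[0], sexy_pairs[-1])  # 46 (5, 11) (461, 467)
```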
Prime triplets
The first prime triplets below 500 (sequence A098420 in the OEIS) are
(5, 7, 11), (7, 11, 13), (11, 13, 17), (13, 17, 19), (17, 19, 23), (37, 41, 43), (41, 43, 47), (67, 71, 73), (97, 101, 103), (101, 103, 107), (103, 107, 109), (107, 109, 113), (191, 193, 197), (193, 197, 199), (223, 227, 229), (227, 229, 233), (277, 281, 283), (307, 311, 313), (311, 313, 317), (347, 349, 353), (457, 461, 463), (461, 463, 467)
A prime triplet contains a pair of twin primes (p and p + 2, or p + 4 and p + 6), a pair of cousin primes (p and p + 4, or p + 2 and p + 6), and a pair of sexy primes (p and p + 6).
So we need to check that the 1st and 3rd numbers have a difference of 6, and that the difference between numbers 1 and 2 is 2 or 4. That query looks like this
-- 22 rows.. Prime Triplets
SELECT t1.n AS N1, t2.n AS N2, t3.n AS N3
FROM #PrimeNumbers t1
JOIN #PrimeNumbers t2 ON t2.n > t1.n
JOIN #PrimeNumbers t3 ON t3.n - t1.n = 6
    AND t3.n > t2.n
    AND t2.n - t1.n IN (2, 4)
ORDER BY 1
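The same triplet rule can be cross-checked in Python if you want to verify the 22 rows (again, just a sketch of the math, not the SQL itself):

```python
# cross-check: prime triplets below 500 -- (p, p+2, p+6) or (p, p+4, p+6)
limit = 500
is_prime = [n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))
            for n in range(limit + 1)]
primes = [n for n, p in enumerate(is_prime) if p]
triplets = [(p, q, p + 6) for p in primes for q in (p + 2, p + 4)
            if p + 6 <= limit and is_prime[q] and is_prime[p + 6]]
print(len(triplets), triplets[0], triplets[-1])  # 22 (5, 7, 11) (461, 463, 467)
```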
Sexy prime triplets
Triplets of primes (p, p + 6, p + 12) such that p + 18 is composite are called sexy prime triplets. In other words, p, p + 6 and p + 12 are all prime, but p + 18 is not
Those below 500 (sequence A046118 in the OEIS) are:
(7,13,19), (17,23,29), (31,37,43), (47,53,59), (67,73,79), (97,103,109), (101,107,113), (151,157,163), (167,173,179), (227,233,239), (257,263,269), (271,277,283), (347,353,359), (367,373,379)
The query looks like this... instead of a self join, we do a triple self join; we also check that p + 18 is not a prime number, in the NOT EXISTS right before the order by
-- 14 rows.. Sexy prime triplets
SELECT t1.n AS N1, t2.n AS N2, t3.n AS N3
FROM #PrimeNumbers t1
JOIN #PrimeNumbers t2 ON t2.n - t1.n = 6
JOIN #PrimeNumbers t3 ON t3.n - t1.n = 12
    AND t3.n > t2.n
AND NOT EXISTS (
    SELECT null
    FROM #PrimeNumbers p
    WHERE p.n = t1.n + 18)
ORDER BY 1
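And one more Python cross-check for the sexy prime triplet rule, including the p + 18 test (illustration only):

```python
# cross-check: sexy prime triplets -- p, p+6, p+12 prime, p+18 composite
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

limit = 500
sexy_triplets = [(p, p + 6, p + 12)
                 for p in range(2, limit + 1)
                 if p + 12 <= limit
                 and is_prime(p) and is_prime(p + 6) and is_prime(p + 12)
                 and not is_prime(p + 18)]
print(len(sexy_triplets), sexy_triplets[0], sexy_triplets[-1])  # 14 (7, 13, 19) (367, 373, 379)
```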
And that's it for this post.
Thursday, February 14, 2019
Finding rows where the column starts or ends with a 'bad' character
A coworker came to me asking for some help: he had some issues trying to convert data in a staging table to numeric. I asked him to show me the data in SSMS, and at first glance it looked good to me. Then I asked where the data came from; he said it came from Excel.
Aha... I have plenty of war stories involving Excel, so I said it's probably some non-printable character in the column... either a tab (char(9)) or a non-breaking space (char(160))... especially if the value was copied from the internet
He said ISNUMERIC was returning 0 for rows that looked valid, so I told him to run this query on those rows
SELECT ASCII(LEFT(SomeColumn,1)), ASCII(RIGHT(SomeColumn,1)),* FROM StagingData s
That returns the ASCII value of the first and last character in the column. For example, a tab is 9, a line feed is 10, and so on.
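If you want to see those code points outside SQL Server, Python's ord() shows the same values (a quick illustration; '\xa0' is the non-breaking space):

```python
# the usual invisible suspects when data comes from Excel or a web page
tab, linefeed, nbsp = ord('\t'), ord('\n'), ord('\xa0')
print(tab, linefeed, nbsp)  # 9 10 160
```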
Here is a chart for the characters between 0 and 32
Binary | Oct | Dec | Hex | Abbreviation (1963 / 1965 / 1967) | Symbol | Caret | Escape | Name (1967)
---|---|---|---|---|---|---|---|---
000 0000 | 000 | 0 | 00 | NULL | NUL | ␀ | ^@ | 0円 | Null | |
000 0001 | 001 | 1 | 01 | SOM | SOH | ␁ | ^A | Start of Heading | ||
000 0010 | 002 | 2 | 02 | EOA | STX | ␂ | ^B | Start of Text | ||
000 0011 | 003 | 3 | 03 | EOM | ETX | ␃ | ^C | End of Text | ||
000 0100 | 004 | 4 | 04 | EOT | ␄ | ^D | End of Transmission | |||
000 0101 | 005 | 5 | 05 | WRU | ENQ | ␅ | ^E | Enquiry | ||
000 0110 | 006 | 6 | 06 | RU | ACK | ␆ | ^F | Acknowledgement | ||
000 0111 | 007 | 7 | 07 | BELL | BEL | ␇ | ^G | \a | Bell | |
000 1000 | 010 | 8 | 08 | FE0 | BS | ␈ | ^H | \b | Backspace [e] [f] | |
000 1001 | 011 | 9 | 09 | HT/SK | HT | ␉ | ^I | \t | Horizontal Tab [g] | |
000 1010 | 012 | 10 | 0A | LF | ␊ | ^J | \n | Line Feed | ||
000 1011 | 013 | 11 | 0B | VTAB | VT | ␋ | ^K | \v | Vertical Tab | |
000 1100 | 014 | 12 | 0C | FF | ␌ | ^L | \f | Form Feed | ||
000 1101 | 015 | 13 | 0D | CR | ␍ | ^M | \r | Carriage Return [h] | ||
000 1110 | 016 | 14 | 0E | SO | ␎ | ^N | Shift Out | |||
000 1111 | 017 | 15 | 0F | SI | ␏ | ^O | Shift In | |||
001 0000 | 020 | 16 | 10 | DC0 | DLE | ␐ | ^P | Data Link Escape | ||
001 0001 | 021 | 17 | 11 | DC1 | ␑ | ^Q | Device Control 1 (often XON) | |||
001 0010 | 022 | 18 | 12 | DC2 | ␒ | ^R | Device Control 2 | |||
001 0011 | 023 | 19 | 13 | DC3 | ␓ | ^S | Device Control 3 (often XOFF) | |||
001 0100 | 024 | 20 | 14 | DC4 | ␔ | ^T | Device Control 4 | |||
001 0101 | 025 | 21 | 15 | ERR | NAK | ␕ | ^U | Negative Acknowledgement | ||
001 0110 | 026 | 22 | 16 | SYNC | SYN | ␖ | ^V | Synchronous Idle | ||
001 0111 | 027 | 23 | 17 | LEM | ETB | ␗ | ^W | End of Transmission Block | ||
001 1000 | 030 | 24 | 18 | S0 | CAN | ␘ | ^X | Cancel | ||
001 1001 | 031 | 25 | 19 | S1 | EM | ␙ | ^Y | End of Medium | ||
001 1010 | 032 | 26 | 1A | S2 | SS | SUB | ␚ | ^Z | Substitute | |
001 1011 | 033 | 27 | 1B | S3 | ESC | ␛ | ^[ | \e [i] | Escape [j] | |
001 1100 | 034 | 28 | 1C | S4 | FS | ␜ | ^\ | File Separator | ||
001 1101 | 035 | 29 | 1D | S5 | GS | ␝ | ^] | Group Separator | ||
001 1110 | 036 | 30 | 1E | S6 | RS | ␞ | ^^ [k] | Record Separator | ||
001 1111 | 037 | 31 | 1F | S7 | US | ␟ | ^_ | Unit Separator |
Source: https://en.wikipedia.org/wiki/ASCII
He then ran the following to grab all the rows that ended or started with tabs
SELECT * FROM StagingData s WHERE LEFT(SomeColumn,1) = char(9) OR RIGHT(SomeColumn,1) = char(9)
So let's take another look at this to see how we can make this a little better
Let's create a table that will hold these bad characters that we don't want: in my case ASCII values 1 through 32, plus 160 (the non-breaking space)
Here is what we will do to create and populate the table
CREATE TABLE BadCharacters(
    BadChar char(1) NOT NULL,
    ASCIINumber int NOT NULL,
    CONSTRAINT pk_BadCharacters
        PRIMARY KEY CLUSTERED( BadChar )
)
GO

INSERT BadCharacters
SELECT char(number), number
FROM master..spt_values
WHERE type = 'P'
AND (number BETWEEN 1 AND 32 OR number = 160) -- parentheses so 160 is only picked up for type 'P'

A quick look at the data looks like this
SELECT * FROM BadCharacters
Now let's create our staging table and insert some data so that we can do some tests
CREATE TABLE StagingData (SomeColumn varchar(255))

INSERT StagingData
SELECT CONVERT(VARCHAR(10), s1.number) + '.' + CONVERT(VARCHAR(10), s2.number)
FROM master..spt_values s1
CROSS JOIN master..spt_values s2
WHERE s1.type = 'P'
AND s2.type = 'P'
That inserted 4194304 rows on my machine
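That number is easy to verify: on recent SQL Server versions, master..spt_values with type = 'P' holds the numbers 0 through 2047, so the cross join produces 2048 squared rows. A one-liner to confirm the arithmetic:

```python
# spt_values WHERE type = 'P' has 2048 rows (0..2047) on recent versions,
# so the self cross join yields 2048 * 2048 rows
spt_p_rows = 2048
total_rows = spt_p_rows * spt_p_rows
print(total_rows)  # 4194304
```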
Time to insert some of that bad data
Here is what some of the data inserted will look like
2.1
2.2
2.3
2.4
2.5
2.6
2.7
2.8
And this is the query to generate and insert those bad rows; on my machine it generated 1089 such rows
;WITH cte AS (
    SELECT CONVERT(VARCHAR(10), number) AS num
    FROM master..spt_values
    WHERE type = 'P'
    AND number BETWEEN 1 AND 1000
)
INSERT StagingData
SELECT b.BadChar + c1.num + '.' + c2.num + b2.BadChar
FROM cte c1
CROSS JOIN cte c2
JOIN BadCharacters b ON c1.num = b.ASCIINumber
JOIN BadCharacters b2 ON c2.num = b2.ASCIINumber

The query creates each value from a bad character, a number, a dot, a number and another bad character; you can see what those values look like above.
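As a cross-check of that 1089 figure: there are 32 control characters plus the non-breaking space, and the joins pair every bad value with every other bad value. A small Python sketch of the same generation logic (illustration only):

```python
# 32 control characters (ASCII 1..32) plus the non-breaking space (160)
bad_vals = list(range(1, 33)) + [160]
# the joins pair each surviving number with its own character, so every
# combination of two bad values produces one row: bad + n1 + '.' + n2 + bad
bad_rows = [chr(n1) + f"{n1}.{n2}" + chr(n2)
            for n1 in bad_vals for n2 in bad_vals]
print(len(bad_rows))  # 1089
```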
Now it's time to find these bad rows, but before we do that, let's add an index
CREATE INDEX ix_StagingData on StagingData(SomeColumn)
OK, we are ready...
Of course I hear you saying, why don't we just do this
SELECT * FROM StagingData WHERE TRY_CONVERT(numeric(20,10),SomeColumn) IS NULL
Well, yes, that gives me everything that can't be converted to numeric, but I want to see what those characters actually are. Before we start, let's set statistics io on so that we can look at some performance numbers
SET STATISTICS IO ON GO
Here are the queries to find the bad characters at the start
SELECT * FROM StagingData s
JOIN BadCharacters b ON b.BadChar = LEFT(s.SomeColumn, 1)

SELECT * FROM StagingData s
JOIN BadCharacters b ON s.SomeColumn LIKE b.BadChar + '%'
Here is what the reads look like
(1089 row(s) affected)
Table 'BadCharacters'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'StagingData'. Scan count 9, logical reads 10851, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1089 row(s) affected)
Table 'BadCharacters'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'StagingData'. Scan count 33, logical reads 135, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
As you can see from the stats, the top query is non-SARGable and generates a lot more reads; the bottom query can use the index. Always make sure to write your queries in a way that lets SQL Server use an index
What about the last character, how can we find those rows?
SELECT * FROM StagingData s
JOIN BadCharacters b ON b.BadChar = RIGHT(s.SomeColumn, 1)

SELECT * FROM StagingData s
JOIN BadCharacters b ON s.SomeColumn LIKE '%' + b.BadChar
Here are the stats again
(1089 row(s) affected)
Table 'BadCharacters'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'StagingData'. Scan count 9, logical reads 10851, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1089 row(s) affected)
Table 'BadCharacters'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'StagingData'. Scan count 33, logical reads 445863, physical reads 0, read-ahead reads 13, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
So both of these queries suck the life out of your SQL Server instance, so what can be done?
One thing we can do is add a computed column to the table that will hold just the last character of the column, then we can index the computed column
Here are the commands to do that
ALTER TABLE StagingData ADD RightChar AS RIGHT(SomeColumn, 1)
GO

CREATE INDEX ix_RightChar ON StagingData(RightChar)
GO
And now we can just run the same queries again
SELECT * FROM StagingData s
JOIN BadCharacters b ON b.BadChar = RIGHT(s.SomeColumn, 1)

SELECT * FROM StagingData s
JOIN BadCharacters b ON s.SomeColumn LIKE '%' + b.BadChar
Here are the stats
(1089 row(s) affected)
Table 'StagingData'. Scan count 33, logical reads 1223, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'BadCharacters'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1089 row(s) affected)
Table 'StagingData'. Scan count 33, logical reads 1223, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'BadCharacters'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Did you expect to get the same exact reads for both queries?
So what is going on? Well, let's take a look
In both cases, the optimizer was smart enough to use the index on the computed column
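The idea behind the computed column can be sketched outside SQL Server too: precompute the last character once, key the rows by it, and probing the bad characters then only touches matching rows. A rough Python analogy (the sample row values here are made up for the illustration):

```python
from collections import defaultdict

bad_vals = list(range(1, 33)) + [160]
rows = [f"{i}.{j}" for i in range(100) for j in range(100)]                # clean rows
rows += [chr(b1) + "1.1" + chr(b2) for b1 in bad_vals for b2 in bad_vals]  # bad rows

# the "computed column": key each row by its last character,
# the way the new index stores RightChar
by_last_char = defaultdict(list)
for r in rows:
    by_last_char[r[-1]].append(r)

# probing the 33 bad characters now touches only the matching rows
hits = [r for b in bad_vals for r in by_last_char[chr(b)]]
print(len(hits))  # 1089
```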
Hopefully this will make someone's life easier; you can expand the table to add other characters you consider bad. You can also add a constraint to reject such values, or add triggers and move the bad rows to a bad-rows table
Finally, if you need to worry about Unicode, you might want to change the column to nvarchar
Enjoy importing that data... we all know it's only getting bigger and bigger
Tuesday, February 12, 2019
Using SonarQube, SonarQube Scanner and the sonar-tsql-plugin to run static code analysis
In the previous post, Scripting out procs and user defined functions with PowerShell, we scripted out some procs and functions from the AdventureWorks database so that we could run some static code analysis. Today we will install SonarQube, SonarQube Scanner and the sonar-tsql-plugin. The first thing we need is Java, if you don't have it installed. I know, I know... I had uninstalled it as well... sigh
Anyway... after you have Java installed, you can grab SonarQube here: https://www.sonarqube.org/downloads/
Create a folder named sonarqube-7.6 on the C drive, download SonarQube and extract it in C:\sonarqube-7.6\
Next we need SonarQube Scanner; you can download it here: https://docs.sonarqube.org/display/SCAN/Analyzing+with+SonarQube+Scanner
Create a folder named C:\sonar-scanner-cli-3.3.0.1492-windows and extract the SonarQube Scanner file there
Finally we need the sonar-tsql-plugin, you can download that here https://github.com/gretard/sonar-tsql-plugin/releases
Grab the file named: sonar-tsqlopen-plugin-0.9.0.jar and download it
Place the jar file in the folder C:\sonarqube-7.6\sonarqube-7.6\extensions\plugins\
Now it's time to create some environment variables. In an Explorer window, paste this into the address bar
Control Panel\System and Security\System
Click on Advanced System Settings, click on Environment Variables, then click on New
In the variable name add SONAR_RUNNER_HOME
In the variable value add C:\sonar-scanner-cli-3.3.0.1492-windows
It will look like this
There is one more thing to do, we need to add something to the path
On older versions of Windows, add the line below at the end of the Path variable; on newer versions, just click on New and paste the line below
;%SONAR_RUNNER_HOME%\bin;
Ok time to run (and fail) SonarQube finally
Go to the folder C:\sonarqube-7.6\sonarqube-7.6\bin\windows-x86-32 and kick off the script StartSonar.bat
If you get an error about 32 or 64 bit, then run the script from the windows-x86-64 folder
When you run the script, if you are lucky, you won't get an error... but if you do, is it this one?
jvm 1 | Error: missing `server' JVM at `C:\Program Files (x86)\Java\jre1.8.0_201\bin\server\jvm.dll'.
jvm 1 | Please install or use the JRE or JDK that contains these missing components.
C:\sonarqube-7.6\sonarqube-7.6\bin\windows-x86-32>StartSonar.bat
wrapper | ERROR: Another instance of the SonarQube application is already running.
Press any key to continue . . .
C:\sonarqube-7.6\sonarqube-7.6\bin\windows-x86-32>StartSonar.bat
wrapper | --> Wrapper Started as Console
wrapper | Launching a JVM...
jvm 1 | Wrapper (Version 3.2.3) http://wrapper.tanukisoftware.org
jvm 1 | Copyright 1999-2006 Tanuki Software, Inc. All Rights Reserved.
jvm 1 |
jvm 1 | 2019年02月12日 13:22:02 INFO app[][o.s.a.AppFileSystem] Cleaning or creating temp directory C:\sonarqube-7.6\sonarqube-7.6\temp
jvm 1 | 2019年02月12日 13:22:02 INFO app[][o.s.a.es.EsSettings] Elasticsearch listening on /127.0.0.1:9001
jvm 1 | 2019年02月12日 13:22:02 INFO app[][o.s.a.p.ProcessLauncherImpl] Launch process[[key='es', ipcIndex=1, logFilenamePrefix=es]] from [C:\sonarqube-7.6\sonarqube-7.6\elasticsearch]: C:\Program Files (x86)\Java\jre1.8.0_201\bin\java -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -server -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -Djdk.io.permissionsUseCanonicalPath=true -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j.skipJansi=true -Xms512m -Xmx512m -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.path.home=C:\sonarqube-7.6\sonarqube-7.6\elasticsearch -cp lib/* org.elasticsearch.bootstrap.Elasticsearch -Epath.conf=C:\sonarqube-7.6\sonarqube-7.6\temp\conf\es
jvm 1 | 2019年02月12日 13:22:02 INFO app[][o.s.a.SchedulerImpl] Waiting for Elasticsearch to be up and running
jvm 1 | Error: missing `server' JVM at `C:\Program Files (x86)\Java\jre1.8.0_201\bin\server\jvm.dll'.
jvm 1 | Please install or use the JRE or JDK that contains these missing components.
To quickly fix this, create a server folder in the Java bin location from the error message
Now grab the files from the client folder and copy them to the server folder
Ok, we are ready to run the file again, rerun the command and the last 2 lines should be something like this
jvm 1 | 2019年02月12日 13:25:53 INFO app[][o.s.a.SchedulerImpl] Process[ce] is up
jvm 1 | 2019年02月12日 13:25:53 INFO app[][o.s.a.SchedulerImpl] SonarQube is up
Navigate to http://localhost:9000/ login with admin for username and password
Now we need to do one more thing and we are ready: open Notepad or your favorite text editor and paste the following
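The original file contents are not reproduced in this copy of the post; based on the project key, project name and forced language that show up in the scanner log further down, a minimal sonar-project.properties would look something like this (treat it as a reconstruction, not the author's exact file):

```properties
# reconstructed minimal sonar-project.properties -- values inferred
# from the scanner log output shown later in this post
sonar.projectKey=StaticCodeAnalysis.project
sonar.projectName=Static Code Analysis project
sonar.projectVersion=1.0
sonar.sources=.
sonar.language=tsql
```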
Save that as sonar-project.properties in the folder where your code is located, in our case it is in C:\temp
Alright.. it's time to run the static code analysis...
Open a command window, cd to the C:\temp folder, and paste the following
C:\sonar-scanner-cli-3.3.0.1492-windows\sonar-scanner-3.3.0.1492-windows\bin\sonar-scanner.bat
You should see something like the following
C:\temp>C:\sonar-scanner-cli-3.3.0.1492-windows\sonar-scanner-3.3.0.1492-windows\bin\sonar-scanner.bat
INFO: Scanner configuration file: C:\sonar-scanner-cli-3.3.0.1492-windows\sonar-scanner-3.3.0.1492-windows\bin\..\conf\sonar-scanner.properties
INFO: Project root configuration file: C:\temp\sonar-project.properties
INFO: SonarQube Scanner 3.3.0.1492
INFO: Java 1.8.0_121 Oracle Corporation (64-bit)
INFO: Windows 10 10.0 amd64
INFO: User cache: C:\Users\denis\.sonar\cache
INFO: SonarQube server 7.6.0
INFO: Default locale: "en_US", source code encoding: "windows-1252" (analysis is platform dependent)
INFO: Load global settings
INFO: Load global settings (done) | time=78ms
INFO: Server id: BF41A1F2-AWji9AZ8kkLV5J16bA1i
INFO: User cache: C:\Users\denis\.sonar\cache
INFO: Load/download plugins
INFO: Load plugins index
INFO: Load plugins index (done) | time=31ms
INFO: Load/download plugins (done) | time=47ms
INFO: Process project properties
INFO: Execute project builders
INFO: Execute project builders (done) | time=0ms
INFO: Project key: StaticCodeAnalysis.project
INFO: Base dir: C:\temp
INFO: Working dir: C:\temp\.scannerwork
INFO: Load project settings
INFO: Load project settings (done) | time=16ms
INFO: Load project repositories
INFO: Load project repositories (done) | time=47ms
INFO: Load quality profiles
INFO: Load quality profiles (done) | time=63ms
INFO: Load active rules
INFO: Load active rules (done) | time=1922ms
INFO: Load metrics repository
INFO: Load metrics repository (done) | time=32ms
WARN: SCM provider autodetection failed. Please use "sonar.scm.provider" to define SCM of your project, or disable the SCM Sensor in the project settings.
INFO: Language is forced to tsql
INFO: Indexing files...
INFO: Project configuration:
INFO: 23 files indexed
INFO: Quality profile for tsql: Sonar Way
INFO: ------------- Run sensors on module Static Code Analysis project
INFO: Sensor JaCoCo XML Report Importer [jacoco]
INFO: Sensor JaCoCo XML Report Importer [jacoco] (done) | time=0ms
INFO: Sensor MsIssuesLoaderSensor [tsqlopen]
INFO: Found 0 issues
INFO: Sensor MsIssuesLoaderSensor [tsqlopen] (done) | time=15ms
INFO: Sensor CodeGuardIssuesLoaderSensor [tsqlopen]
INFO: SQL Code guard path is empty, trying to search directories instead
INFO: Found 0 issues
INFO: Sensor CodeGuardIssuesLoaderSensor [tsqlopen] (done) | time=0ms
INFO: Sensor CustomChecksSensor [tsqlopen]
WARN: Property 'sonar.tsql.customrules.paths' is not declared as multi-values/property set but was read using 'getStringArray' method. The SonarQube plugin declaring this property should be updated.
INFO: Total 1 custom rules repositories with total 15 checks
INFO: Sensor CustomChecksSensor [tsqlopen] (done) | time=21548ms
INFO: Sensor CoverageSensor [tsqlopen]
INFO: Sensor CoverageSensor [tsqlopen] (done) | time=16ms
INFO: Sensor JavaXmlSensor [java]
INFO: Sensor JavaXmlSensor [java] (done) | time=0ms
INFO: Sensor HTML [web]
INFO: Sensor HTML [web] (done) | time=15ms
INFO: Sensor Zero Coverage Sensor
INFO: Sensor Zero Coverage Sensor (done) | time=16ms
INFO: ------------- Run sensors on project
INFO: No SCM system was detected. You can use the 'sonar.scm.provider' property to explicitly specify it.
INFO: 21 files had no CPD blocks
INFO: Calculating CPD for 2 files
INFO: CPD calculation finished
INFO: Analysis report generated in 250ms, dir size=127 KB
INFO: Analysis report compressed in 51ms, zip size=40 KB
INFO: Analysis report uploaded in 47ms
INFO: ANALYSIS SUCCESSFUL, you can browse http://localhost:9000/dashboard?id=StaticCodeAnalysis.project
INFO: Note that you will be able to access the updated dashboard once the server has processed the submitted analysis report
INFO: More about the report processing at http://localhost:9000/api/ce/task?id=AWjjFKvAkkLV5J16bDDx
INFO: Analysis total time: 26.228 s
INFO: ------------------------------------------------------------------------
INFO: EXECUTION SUCCESS
INFO: ------------------------------------------------------------------------
INFO: Total time: 28.017s
INFO: Final Memory: 36M/1173M
INFO: ------------------------------------------------------------------------
C:\temp>
When you get the prompt back, it's time to go to the http://localhost:9000/projects URL
You should have 1 project there, the name matches what we had in our properties file
When you click on the project, you will get some information about bugs, code smells and duplication
Clicking on code smells brings back the following.. you can then act on those or not
I added 2 of my own bad procs to see what it would flag
I also added this one Unique_Login_IPs, you can grab it here https://social.msdn.microsoft.com/Forums/en-US/55cfe1b0-402a-4468-bf7a-cc0966d4a487/faster-way-to-do-this
As you can see we got some warnings for those procs
SELECT *.. No need to comment on this one
No ASC/DESC in the order by... this defaults to ASC anyway but I guess for clarity it's better to specify ASC
Positional reference is used... I do this all the time with ad-hoc queries but I don't do it with procs
Non-sargeable argument found - column referenced in a function.
That is this line
Ideally you would rewrite this by first creating a variable @startdate and setting it to March 1st of @year, in other words '20190301'
Then the WHERE clause would be something like that
That is all for this post. Of course you can do some of this with other tools and even with Policy-Based Management, but if you use SQL Server alongside many other languages, you can do static code analysis for all of them from one tool
Anyway.. after you have Java installed, you can grab SonarQube here: https://www.sonarqube.org/downloads/
Create a folder named sonarqube-7.6 on the C drive, download SonarQube and extract it in C:\sonarqube-7.6\
Next we need the SonarQube Scanner. You can download it here: https://docs.sonarqube.org/display/SCAN/Analyzing+with+SonarQube+Scanner
Create a folder named C:\sonar-scanner-cli-3.3.0.1492-windows and extract the SonarQube Scanner file there
Finally we need the sonar-tsql-plugin, you can download that here https://github.com/gretard/sonar-tsql-plugin/releases
Download the file named sonar-tsqlopen-plugin-0.9.0.jar
Place the jar file in the folder C:\sonarqube-7.6\sonarqube-7.6\extensions\plugins\
Now it's time to create some environment variables. In an Explorer window, paste this into the address bar
Control Panel\System and Security\System
Click on Advanced System Settings, click on Environment Variables, then click on New
In the variable name add SONAR_RUNNER_HOME
In the variable value add C:\sonar-scanner-cli-3.3.0.1492-windows
It will look like this
There is one more thing to do, we need to add something to the path
On older versions of Windows, add the line below at the end of the path variable; on newer versions, just click on New and paste the line below
;%SONAR_RUNNER_HOME%\bin;
Ok time to run (and fail) SonarQube finally
Go to the folder C:\sonarqube-7.6\sonarqube-7.6\bin\windows-x86-32 and kick off the script StartSonar.bat
If you get an error about 32 or 64 bit, then run the script from the windows-x86-64 folder
When you run the script, if you are lucky you won't get an error, but if you do, is it this one?
jvm 1 | Error: missing `server' JVM at `C:\Program Files (x86)\Java\jre1.8.0_201\bin\server\jvm.dll'.
jvm 1 | Please install or use the JRE or JDK that contains these missing components.
C:\sonarqube-7.6\sonarqube-7.6\bin\windows-x86-32>StartSonar.bat
wrapper | ERROR: Another instance of the SonarQube application is already running.
Press any key to continue . . .
C:\sonarqube-7.6\sonarqube-7.6\bin\windows-x86-32>StartSonar.bat
wrapper | --> Wrapper Started as Console
wrapper | Launching a JVM...
jvm 1 | Wrapper (Version 3.2.3) http://wrapper.tanukisoftware.org
jvm 1 | Copyright 1999-2006 Tanuki Software, Inc. All Rights Reserved.
jvm 1 |
jvm 1 | 2019.02.12 13:22:02 INFO app[][o.s.a.AppFileSystem] Cleaning or creating temp directory C:\sonarqube-7.6\sonarqube-7.6\temp
jvm 1 | 2019.02.12 13:22:02 INFO app[][o.s.a.es.EsSettings] Elasticsearch listening on /127.0.0.1:9001
jvm 1 | 2019.02.12 13:22:02 INFO app[][o.s.a.p.ProcessLauncherImpl] Launch process[[key='es', ipcIndex=1, logFilenamePrefix=es]] from [C:\sonarqube-7.6\sonarqube-7.6\elasticsearch]: C:\Program Files (x86)\Java\jre1.8.0_201\bin\java -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -server -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -Djdk.io.permissionsUseCanonicalPath=true -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j.skipJansi=true -Xms512m -Xmx512m -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.path.home=C:\sonarqube-7.6\sonarqube-7.6\elasticsearch -cp lib/* org.elasticsearch.bootstrap.Elasticsearch -Epath.conf=C:\sonarqube-7.6\sonarqube-7.6\temp\conf\es
jvm 1 | 2019.02.12 13:22:02 INFO app[][o.s.a.SchedulerImpl] Waiting for Elasticsearch to be up and running
jvm 1 | Error: missing `server' JVM at `C:\Program Files (x86)\Java\jre1.8.0_201\bin\server\jvm.dll'.
jvm 1 | Please install or use the JRE or JDK that contains these missing components.
To quickly fix this, create a server folder in the Java bin location from the error message
Then copy the files from the client folder into the new server folder
jvm 1 | 2019.02.12 13:25:53 INFO app[][o.s.a.SchedulerImpl] Process[ce] is up
jvm 1 | 2019.02.12 13:25:53 INFO app[][o.s.a.SchedulerImpl] SonarQube is up
Navigate to http://localhost:9000/ and log in with admin as both the username and password
Now we need to do one more thing and we are ready: open Notepad or your favorite text editor and paste the following
# Required metadata
sonar.projectKey=StaticCodeAnalysis.project
sonar.projectName=Static Code Analysis project
sonar.projectVersion=1.0
sonar.sources=StoredProcedures,UserDefinedFunctions
sonar.host.url=http://localhost:9000
#sonar.exclusions=**/bin/**/*.*,**/obj/**/*.*,**/*.sqlproj

# Comma-separated paths to directories of source codes to be analyzed.
# Path is relative to the sonar-project.properties file.
# Replace "\" by "/" on Windows.
# Since SonarQube 4.2, this property is optional.
# If not set, SonarQube starts looking for source code
# from the directory containing the sonar-project.properties file.

# Language
sonar.language=tsql

#Encoding of the source code
#sonar.sourceEncoding=UTF-8
Save that as sonar-project.properties in the folder where your code is located, in our case it is in C:\temp
Open a command window, cd to the C:\temp folder, and paste the following
C:\sonar-scanner-cli-3.3.0.1492-windows\sonar-scanner-3.3.0.1492-windows\bin\sonar-scanner.bat
You should see something like the following
C:\temp>C:\sonar-scanner-cli-3.3.0.1492-windows\sonar-scanner-3.3.0.1492-windows\bin\sonar-scanner.bat
INFO: Scanner configuration file: C:\sonar-scanner-cli-3.3.0.1492-windows\sonar-scanner-3.3.0.1492-windows\bin\..\conf\sonar-scanner.properties
INFO: Project root configuration file: C:\temp\sonar-project.properties
INFO: SonarQube Scanner 3.3.0.1492
INFO: Java 1.8.0_121 Oracle Corporation (64-bit)
INFO: Windows 10 10.0 amd64
INFO: User cache: C:\Users\denis\.sonar\cache
INFO: SonarQube server 7.6.0
INFO: Default locale: "en_US", source code encoding: "windows-1252" (analysis is platform dependent)
INFO: Load global settings
INFO: Load global settings (done) | time=78ms
INFO: Server id: BF41A1F2-AWji9AZ8kkLV5J16bA1i
INFO: User cache: C:\Users\denis\.sonar\cache
INFO: Load/download plugins
INFO: Load plugins index
INFO: Load plugins index (done) | time=31ms
INFO: Load/download plugins (done) | time=47ms
INFO: Process project properties
INFO: Execute project builders
INFO: Execute project builders (done) | time=0ms
INFO: Project key: StaticCodeAnalysis.project
INFO: Base dir: C:\temp
INFO: Working dir: C:\temp\.scannerwork
INFO: Load project settings
INFO: Load project settings (done) | time=16ms
INFO: Load project repositories
INFO: Load project repositories (done) | time=47ms
INFO: Load quality profiles
INFO: Load quality profiles (done) | time=63ms
INFO: Load active rules
INFO: Load active rules (done) | time=1922ms
INFO: Load metrics repository
INFO: Load metrics repository (done) | time=32ms
WARN: SCM provider autodetection failed. Please use "sonar.scm.provider" to define SCM of your project, or disable the SCM Sensor in the project settings.
INFO: Language is forced to tsql
INFO: Indexing files...
INFO: Project configuration:
INFO: 23 files indexed
INFO: Quality profile for tsql: Sonar Way
INFO: ------------- Run sensors on module Static Code Analysis project
INFO: Sensor JaCoCo XML Report Importer [jacoco]
INFO: Sensor JaCoCo XML Report Importer [jacoco] (done) | time=0ms
INFO: Sensor MsIssuesLoaderSensor [tsqlopen]
INFO: Found 0 issues
INFO: Sensor MsIssuesLoaderSensor [tsqlopen] (done) | time=15ms
INFO: Sensor CodeGuardIssuesLoaderSensor [tsqlopen]
INFO: SQL Code guard path is empty, trying to search directories instead
INFO: Found 0 issues
INFO: Sensor CodeGuardIssuesLoaderSensor [tsqlopen] (done) | time=0ms
INFO: Sensor CustomChecksSensor [tsqlopen]
WARN: Property 'sonar.tsql.customrules.paths' is not declared as multi-values/property set but was read using 'getStringArray' method. The SonarQube plugin declaring this property should be updated.
INFO: Total 1 custom rules repositories with total 15 checks
INFO: Sensor CustomChecksSensor [tsqlopen] (done) | time=21548ms
INFO: Sensor CoverageSensor [tsqlopen]
INFO: Sensor CoverageSensor [tsqlopen] (done) | time=16ms
INFO: Sensor JavaXmlSensor [java]
INFO: Sensor JavaXmlSensor [java] (done) | time=0ms
INFO: Sensor HTML [web]
INFO: Sensor HTML [web] (done) | time=15ms
INFO: Sensor Zero Coverage Sensor
INFO: Sensor Zero Coverage Sensor (done) | time=16ms
INFO: ------------- Run sensors on project
INFO: No SCM system was detected. You can use the 'sonar.scm.provider' property to explicitly specify it.
INFO: 21 files had no CPD blocks
INFO: Calculating CPD for 2 files
INFO: CPD calculation finished
INFO: Analysis report generated in 250ms, dir size=127 KB
INFO: Analysis report compressed in 51ms, zip size=40 KB
INFO: Analysis report uploaded in 47ms
INFO: ANALYSIS SUCCESSFUL, you can browse http://localhost:9000/dashboard?id=StaticCodeAnalysis.project
INFO: Note that you will be able to access the updated dashboard once the server has processed the submitted analysis report
INFO: More about the report processing at http://localhost:9000/api/ce/task?id=AWjjFKvAkkLV5J16bDDx
INFO: Analysis total time: 26.228 s
INFO: ------------------------------------------------------------------------
INFO: EXECUTION SUCCESS
INFO: ------------------------------------------------------------------------
INFO: Total time: 28.017s
INFO: Final Memory: 36M/1173M
INFO: ------------------------------------------------------------------------
C:\temp>
When you get the prompt back, it's time to go to the http://localhost:9000/projects URL
You should have 1 project there, the name matches what we had in our properties file
sonar.projectName=Static Code Analysis project
When you click on the project, you will get some information about bugs, code smells and duplication
Clicking on code smells brings back the following.. you can then act on those or not
I added 2 of my own bad procs to see what it would flag
create proc BadProc as select * from Person.Address order by 2,3
I also added this one Unique_Login_IPs, you can grab it here https://social.msdn.microsoft.com/Forums/en-US/55cfe1b0-402a-4468-bf7a-cc0966d4a487/faster-way-to-do-this
As you can see we got some warnings for those procs
SELECT *.. No need to comment on this one
No ASC/DESC in the order by... this defaults to ASC anyway but I guess for clarity it's better to specify ASC
Positional reference is used... I do this all the time with ad-hoc queries but I don't do it with procs
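To illustrate, here is a sketch of how BadProc could be rewritten to clear those first three warnings. The column choices below are an assumption based on AdventureWorks' Person.Address table, and BadProcFixed is just a name I made up for the example:

```sql
-- Sketch: BadProc rewritten to avoid the first three warnings.
-- Column names assume AdventureWorks' Person.Address table.
CREATE PROC BadProcFixed
AS
SELECT AddressID,
       AddressLine1,
       City                  -- explicit column list instead of SELECT *
FROM Person.Address
ORDER BY AddressLine1 ASC,   -- named columns instead of positional 2,3
         City ASC;           -- explicit ASC for clarity
```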
Non-sargeable argument found - column referenced in a function.
That is this line
WHERE (method = 'LOGIN') AND (YEAR(logged) = @year1) AND (MONTH(logged) = 3)) as tmpy1_3
Ideally you would rewrite this by first creating a variable @startdate and setting it to March 1 of the year in @year1, in other words '20190301'
The WHERE clause would then look something like this
WHERE (method = 'LOGIN') AND logged >= @startdate and logged < dateadd(mm,1,@startdate)
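Putting it together, the sargable version might look like the sketch below. The table name LoginHistory is a placeholder (the original forum post may use a different one), and DATEFROMPARTS is just one way to build the date:

```sql
-- Sketch of the sargable rewrite; LoginHistory is a hypothetical table name.
DECLARE @year1 int = 2019;
DECLARE @startdate date = DATEFROMPARTS(@year1, 3, 1);  -- '20190301'

SELECT COUNT(*) AS LoginCount
FROM LoginHistory
WHERE method = 'LOGIN'
  AND logged >= @startdate                    -- open-ended range on the raw
  AND logged < DATEADD(mm, 1, @startdate);    -- column lets an index be used
```

Because logged is no longer wrapped in YEAR() or MONTH(), an index on that column can now be used to seek the range.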
That is all for this post. Of course you can do some of this stuff with other tools and even with policy management. But if you use SQL Server alongside many other languages, you can do static code analysis for all of them from one tool.