Matching TSQL comment statements with regular expressions

Time:2022-1-2

Let’s look at some examples:

--Get the count information of the table
select count(*) from T with(nolock)

--Get count information for a specific value
select count(*) from T with(nolock)
where v = '--value'

--Get count information for table 't'
select count(*) from T with(nolock)

Select * from T -- get table t
Where P

Let's start with a simple pattern:

\-\-[^\r\n]*$

You will find that it also matches the --value' text inside the string literal of the second SQL statement, which is wrong. It seems that we should exclude single quotes ('), so let's revise the pattern:

\-\-[^\'\r\n]{0,}$

This is still wrong. The string literal in the second statement no longer matches, but now the comment in the third statement, which contains quotes, does not match either.
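
To make the behavior of these two attempts concrete, here is a minimal Python sketch. It assumes Python's re module with the MULTILINE flag so that $ matches at the end of each line; the variable names are illustrative, and other regex engines may differ in details.

```python
import re

# The four sample statements from the top of the article.
SQL = """--Get the count information of the table
select count(*) from T with(nolock)

--Get count information for a specific value
select count(*) from T with(nolock)
where v = '--value'

--Get count information for table 't'
select count(*) from T with(nolock)

Select * from T -- get table t
Where P
"""

# First attempt: everything from -- to the end of the line.
first_try = re.compile(r"\-\-[^\r\n]*$", re.MULTILINE)
# Second attempt: same, but no single quote may follow the --.
second_try = re.compile(r"\-\-[^\'\r\n]{0,}$", re.MULTILINE)

first_matches = first_try.findall(SQL)
second_matches = second_try.findall(SQL)

print("first attempt:")
for text in first_matches:
    print("   ", text)

print("second attempt:")
for text in second_matches:
    print("   ", text)
```

Under these assumptions, the first attempt also reports --value' from inside the string literal, while the second attempt drops both that false match and the genuine comment that mentions 't'.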

So how can we really match all SQL comments?

First, let's summarize some features of SQL comments:

1. A comment starts with --

2. The comment marker must not lie inside a pair of single quotes ('')

3. A comment appears only at the end of a line; optional statement text may precede it

With these rules collected, we can write down the final pattern for SQL comments:

\-\-([^\'\r\n]{0,}(\'[^\'\r\n]{0,}\'){0,1}[^\'\r\n]{0,}){0,}$

Now all four SQL comments match, and the string literal in the second statement does not. Regular expressions really are powerful.
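
The final pattern can be checked against the same four statements with a short Python sketch. As before, the use of Python's re module with the MULTILINE flag is an assumption made for illustration, not part of the original pattern.

```python
import re

# The four sample statements from the top of the article.
SQL = """--Get the count information of the table
select count(*) from T with(nolock)

--Get count information for a specific value
select count(*) from T with(nolock)
where v = '--value'

--Get count information for table 't'
select count(*) from T with(nolock)

Select * from T -- get table t
Where P
"""

# Final pattern: after --, allow plain text and complete '...' pairs only.
final = re.compile(
    r"\-\-([^\'\r\n]{0,}(\'[^\'\r\n]{0,}\'){0,1}[^\'\r\n]{0,}){0,}$",
    re.MULTILINE,
)

# finditer is used because findall would return the capture groups
# instead of the whole matched comment.
comments = [m.group(0) for m in final.finditer(SQL)]

for text in comments:
    print(text)
```

The --value' fragment is rejected because only one quote follows the --, so it cannot be consumed as a complete '...' pair before the end of the line.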

One small limitation of this pattern is that the text after -- must not contain an unpaired single quote; otherwise the comment will not match. (Because such delimiters normally appear in pairs, this limitation can usually be ignored.)
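
The limitation is easy to reproduce. The sketch below again assumes Python's re module, and the two sample statements are made up for illustration only.

```python
import re

final = re.compile(
    r"\-\-([^\'\r\n]{0,}(\'[^\'\r\n]{0,}\'){0,1}[^\'\r\n]{0,}){0,}$",
    re.MULTILINE,
)

# Hypothetical statements: one comment with paired quotes, one with an apostrophe.
paired = "select 1 -- the 'paired' case"
unpaired = "select 1 -- don't drop this"

paired_match = final.search(paired)
unpaired_match = final.search(unpaired)

print("paired:  ", paired_match.group(0) if paired_match else None)
print("unpaired:", unpaired_match.group(0) if unpaired_match else None)
```

The comment with the apostrophe is not matched at all, so a tool built on this pattern would silently skip such comments.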