I am developing a project that converts a SQL Server file to a MySQL file based on some assumptions. I have written a function that takes a string source as input and extracts the database name. Is there any way to improve the code?
public String getDatabaseNameFromMySqlServerFile(String source) {
    String sourceFile = null;
    String databaseName = "";
    Path file = Paths.get(source);
    try (InputStream in = Files.newInputStream(file);
            BufferedReader reader
                    = new BufferedReader(new InputStreamReader(in))) {
        String line = null;
        while ((line = reader.readLine()) != null) {
            if (line.toLowerCase().contains("create database")) {
                String[] Arrline = line.split(" ");
                for (String s : Arrline) {
                    if (!s.toLowerCase().equals("create") && !s.toLowerCase().equals("database")) {
                        databaseName = s;
                        return s;
                    }
                }
            }
            sourceFile = sourceFile + line;
        }
    } catch (IOException x) {
        System.err.println(x);
    }
    return null;
}
3 Answers
databaseName is a pointless local variable: you can simply delete all lines using it. The same goes for sourceFile.

Arrline and s are poorly named variables, and s is lowercased twice:

String[] Arrline = line.split(" ");
for (String s : Arrline) {
    if (!s.toLowerCase().equals("create") && !s.toLowerCase().equals("database")) {
        return s;
    }
}

Cleaning these up, the code becomes:
String[] segments = line.split(" ");
for (String origSegment : segments) {
    String segment = origSegment.toLowerCase();
    if (!segment.equals("create") && !segment.equals("database")) {
        return origSegment;
    }
}
The current method of extracting the database name is quite awkward and inefficient. Look at these steps:
- lowercase the line
- check if it contains a string
- split the line into tokens
- for each token:
  - if the lowercased token is neither "create" nor "database", return it
You can do much better using regular expressions:
Pattern DBNAME_PATTERN = Pattern.compile("create +database +(\\w+)", Pattern.CASE_INSENSITIVE);

private String extractDatabaseName(String input) {
    Matcher matcher = DBNAME_PATTERN.matcher(input);
    if (matcher.find()) {
        return matcher.group(1);
    }
    return null;
}
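As a hedged sketch of how this helper might slot into the question's read loop (assuming a BufferedReader named reader as in the original method, otherwise unchanged):

String line;
while ((line = reader.readLine()) != null) {
    // delegate the token juggling to the regex helper
    String name = extractDatabaseName(line);
    if (name != null) {
        return name;
    }
}
return null;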
The initialization of line is unnecessary, because it gets assigned on the next line anyway:

String line = null;
So a simple declaration is enough:
String line;
- Very informative answer. I wasn't aware of the matcher class. – lostForAWHile, Sep 19, 2015 at 17:53
First of all, you really need to space that code; it is hard to read when there's no spacing. Also, because of the try block, the indentation is messing with my eyes. Try to keep it on one line; it would be easier to read, even if the line is a little long.
Appending strings together is never a good idea; it is heavy in terms of performance. You should use a StringBuilder. In fact, why do you keep the sourceFile string? It doesn't seem to be used anywhere. You should remove this line, as it might be expensive on large files.
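If the accumulated file contents were actually needed somewhere (an assumption about the question's intent), a minimal StringBuilder sketch could look like this:

StringBuilder contents = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
    // append() reuses the builder's internal buffer instead of creating a new String each iteration
    contents.append(line).append(System.lineSeparator());
}
String sourceFile = contents.toString();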
Also, I don't think you should catch the IOException. How do I know what happened if null is returned? Was the file locked, non-existent, or was there no create database statement in the script? Let the exception be, so callers can catch it and do whatever they want with it. Catching it here seems irrelevant to me.
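For illustration, a sketch of the signature change this implies, adapted from the question's method (Files.newBufferedReader is used here for brevity, and the extraction logic is elided):

// No catch block: an IOException now reaches the caller, who knows exactly what went wrong.
public String getDatabaseNameFromMySqlServerFile(String source) throws IOException {
    try (BufferedReader reader = Files.newBufferedReader(Paths.get(source))) {
        String line;
        while ((line = reader.readLine()) != null) {
            // ... extraction logic unchanged ...
        }
        return null;
    }
}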
- I think your second paragraph is a bit restricting. I guess I know what you mean, but your statement as is would criticize the (perfectly fine) implementation of Object.toString(). – xehpuk, Sep 16, 2015 at 23:15
- It's not exactly the same thing, though you get the point: string concatenation using += is worse than using a StringBuilder. – IEatBagels, Sep 17, 2015 at 2:01
Using the native features of Java is important. Consider the new-to-Java-8 feature of a streaming file-by-line reader:

Stream<String> stream = Files.lines(Paths.get(filepath));

That will process each line, as a line, in the stream. In Java 8 this would almost always be superior to a BufferedReader and other wrappers on your file. You can then chain up the filters you want on it. I suggest using regular expressions; they are good for this. A pattern that looks like:
private static final Pattern DATABASE = Pattern.compile(
        ".*?\\bcreate\\s+database\\s+(\\S+).*",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
That pattern will find create database XXX in the line, and remember whatever the XXX is in group 1.
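As a quick, made-up illustration of what group 1 captures:

// Hypothetical input line; matches() requires the pattern to cover the whole string.
Matcher demo = DATABASE.matcher("CREATE DATABASE shop_db");
if (demo.matches()) {
    System.out.println(demo.group(1)); // prints "shop_db"
}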
You can now convert each line into a matcher, filter the lines which match, and recall the database.
Additionally, the Streaming API is able to terminate when the first match is found. The code would all look like:
Files.lines(Paths.get(filepath))
     .map(line -> DATABASE.matcher(line))
     .filter(Matcher::matches)
     .map(mat -> mat.group(1))
     .findFirst()
     .orElse(null);
That would need to be wrapped up into a method that handles the IOException and cleanly closes the file:
public static String locateDatabase(String filepath) {
    try (Stream<String> stream = Files.lines(Paths.get(filepath))) {
        return stream
                .map(line -> DATABASE.matcher(line))
                .filter(Matcher::matches)
                .map(mat -> mat.group(1))
                .findAny()
                .orElse(null);
    } catch (IOException ioe) {
        ioe.printStackTrace();
        return null;
    }
}
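A hypothetical call site (the file name here is made up) would then be as simple as:

String databaseName = locateDatabase("dump.sql");
if (databaseName == null) {
    // either no create database statement was found, or the file could not be read
    System.err.println("No database name located in dump.sql");
}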